51c Vision~CV~Collection 11

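1. Combining C++ and Python for Efficient Computer Vision Applications

Computer vision applications need both fast image processing and the flexibility of modern machine learning frameworks. In this guide, we implement a modular computer vision pipeline that covers the following:

Image processing: apply a fast Gaussian blur in C++ using OpenCV and expose it to Python with pybind11.

Modeling and classification: a Python module that extracts simple features, trains a machine learning classifier (using scikit-learn), and classifies new images.

This architecture lets you use Python's rich ecosystem for model building and experimentation, and C++'s speed where it is needed.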

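Prerequisites and Environment Setup

Before you begin, make sure the following are installed:

Python 3.10.11
C++ compiler: C++11 compatible (or newer)
OpenCV: for image processing in C++ and Python
pybind11: for creating Python bindings to C++ code
NumPy: for numerical operations in Python
scikit-learn: for building and evaluating classifiers

You can install the required Python packages with pip3:

pip3 install numpy opencv-python-headless scikit-learn pybind11

The pip package above provides only OpenCV's Python bindings, so install the OpenCV C++ development libraries (and pybind11, if you did not install it with pip) by following the guide for your platform. You can build the module with CMake or by invoking the compiler directly; a typical direct compile command looks like this: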

c++ -O3 -Wall -shared -std=c++11 -fPIC \
    `python3 -m pybind11 --includes` fast_blur.cpp -o fast_blur`python3-config --extension-suffix` \
    $(pkg-config --cflags --libs opencv4)

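Part 1: Image Processing with C++ and Python Integration

In performance-critical image processing, C++ can speed up the pipeline considerably. We will build a basic C++ module around OpenCV's Gaussian blur. The code takes an image as a NumPy array, applies a Gaussian blur, and returns the resulting image.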
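To make the two blur parameters concrete, here is a minimal pure-Python sketch of what a Gaussian blur computes along one row of pixels. It is an illustration only, not OpenCV's implementation: the helper names gaussian_kernel_1d and blur_row are hypothetical, and the sketch replicates edge pixels at the borders, whereas OpenCV's default border mode is a reflection.

```python
import math


def gaussian_kernel_1d(kernel_size, sigma):
    """Return normalized 1D Gaussian weights of length kernel_size."""
    if kernel_size <= 0 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    if sigma <= 0:
        raise ValueError("sigma must be positive in this sketch")
    center = kernel_size // 2
    weights = [
        math.exp(-((i - center) ** 2) / (2.0 * sigma ** 2))
        for i in range(kernel_size)
    ]
    total = sum(weights)
    # Normalize so the weights sum to 1 and overall brightness is preserved.
    return [w / total for w in weights]


def blur_row(row, kernel):
    """Blur one row of pixel values, replicating the edge pixels."""
    center = len(kernel) // 2
    last = len(row) - 1
    out = []
    for x in range(len(row)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = min(max(x + k - center, 0), last)  # clamp at the borders
            acc += w * row[idx]
        out.append(acc)
    return out


if __name__ == "__main__":
    kernel = gaussian_kernel_1d(5, 1.0)
    print([round(w, 4) for w in kernel])
    print([round(v, 1) for v in blur_row([0, 0, 255, 0, 0], kernel)])
```

A 2D Gaussian blur is separable, so the same kernel can be applied along rows and then along columns. Doing this for every pixel in interpreted Python is slow, which is the motivation for moving the operation into C++.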

C++ code: fast_blur.cpp

#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>
#include <opencv2/opencv.hpp>
#include <stdexcept>
#include <cstring>

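namespace py = pybind11;

// Applies a Gaussian blur to a 3-channel, 8-bit image.
// The input is expected to be a NumPy array of shape (height, width, 3).
// c_style | forcecast asks pybind11 for a C-contiguous uint8 buffer,
// converting or copying the input if necessary.
py::array_t<unsigned char> gaussian_blur(
    py::array_t<unsigned char, py::array::c_style | py::array::forcecast> input,
    int kernel_size, double sigma) {
    // Request a buffer descriptor from Python
    auto buf = input.request();
    // The cv::Mat below is created as CV_8UC3, so reject anything else.
    if (buf.ndim != 3 || buf.shape[2] != 3)
        throw std::runtime_error("Input image must have shape (height, width, 3)");
    // cv::GaussianBlur expects an odd kernel size; we also reject non-positive values here.
    if (kernel_size <= 0 || kernel_size % 2 == 0)
        throw std::runtime_error("kernel_size must be a positive odd integer");
    int height = static_cast<int>(buf.shape[0]);
    int width  = static_cast<int>(buf.shape[1]);
    int channels = static_cast<int>(buf.shape[2]);
    // Wrap the raw buffer as a cv::Mat (no copy; the Mat does not own the data).
    cv::Mat img(height, width, CV_8UC3, static_cast<unsigned char*>(buf.ptr));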

    // Apply Gaussian Blur using OpenCV.
    cv::Mat blurred;
    cv::GaussianBlur(img, blurred, cv::Size(kernel_size, kernel_size), sigma);
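    // Construct a NumPy array to hold the result.
    // Note: no base object is passed, so pybind11 copies the data into a new
    // array owned by Python; it stays valid after `blurred` goes out of scope.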
    return py::array_t<unsigned char>(
        // shape
        {height, width, channels},  
        {width * channels * sizeof(unsigned char), channels * sizeof(unsigned char), sizeof(unsigned char)}, // strides
        // pointer to data
        blurred.data  
    );
}
PYBIND11_MODULE(fast_blur, m) {
    m.doc() = "Module for fast image blur using C++ and OpenCV";
    m.def("gaussian_blur", &gaussian_blur, "Apply Gaussian Blur to an image",
          py::arg("input"), py::arg("kernel_size"), py::arg("sigma"));
}

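Explanation

Buffer protocol: we use pybind11's buffer interface to work directly with NumPy arrays.
OpenCV integration: the input buffer is wrapped in a cv::Mat without copying, and after processing the result is returned as a new NumPy array.
Compilation: to compile this module, make sure OpenCV and pybind11 are set up correctly on your system.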
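The strides passed to py::array_t in the C++ code are the byte steps between neighboring elements along each axis. As a small illustration, the following sketch computes them for a C-contiguous array in pure Python. The helper names c_contiguous_strides and byte_offset are hypothetical and exist only for this example.

```python
def c_contiguous_strides(shape, itemsize=1):
    """Byte strides of a C-contiguous array with the given shape."""
    strides = []
    step = itemsize
    # The last axis is the fastest-varying one, so walk the shape backwards.
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def byte_offset(index, strides):
    """Byte offset of the element at `index` from the start of the buffer."""
    return sum(i * s for i, s in zip(index, strides))


if __name__ == "__main__":
    height, width, channels = 480, 640, 3
    strides = c_contiguous_strides((height, width, channels), itemsize=1)
    print(strides)                          # (width * channels, channels, 1)
    print(byte_offset((1, 2, 0), strides))  # row 1, column 2, first channel
```

For a uint8 image of shape (height, width, channels) this gives (width * channels, channels, 1), which matches the expression in the C++ return statement.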

Once compiled, you can import the module in Python as fast_blur.
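Before running the rest of the pipeline, it can help to confirm that Python can actually find the compiled extension. The sketch below uses only the standard library. The helper name module_available is hypothetical, and the check assumes the compiled file sits in the current directory or elsewhere on sys.path.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module called `name` can be located."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if module_available("fast_blur"):
        print("fast_blur is importable")
    else:
        print("fast_blur not found; build it or add its folder to sys.path")
```

If the module is not found, a common cause is that the extension was built in a different directory or for a different Python interpreter than the one running the script.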

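Part 2: Modular Python Code for Image Processing and Classification

With the C++ module ready, you can call it from Python modules that handle the remaining modeling and computation. We split the pipeline into two parts: one for image processing and one for modeling and classification.

Python module: image_processing.py

This module uses our C++ module for the fast Gaussian blur and handles image I/O and processing.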

import cv2
import numpy as np
import fast_blur

def load_image(path):
    image = cv2.imread(path)
    if image is None:
        raise IOError("Unable to load image at " + path)
    return image


def process_image(image, kernel_size=5, sigma=1.0):
    # The C++ module works on contiguous uint8 data, so convert if needed
    image = np.ascontiguousarray(image, dtype=np.uint8)
    blurred = fast_blur.gaussian_blur(image, kernel_size, sigma)
    return blurred


if __name__ == "__main__":
    # Example usage: process an image and save the result
    image = load_image("sample.jpg")  # Make sure 'sample.jpg' exists
    blurred_image = process_image(image)
    cv2.imwrite("blurred_sample.jpg", blurred_image)
    print("Blurred image saved as 'blurred_sample.jpg'")
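OpenCV's Gaussian blur expects an odd kernel size, so it can be useful to check the parameters in Python before calling into C++. The following is a minimal sketch of such a guard. The function name validate_blur_params is hypothetical, and rejecting negative sigma values is a choice made for this example rather than a rule taken from OpenCV.

```python
def validate_blur_params(kernel_size, sigma):
    """Check blur parameters and return them as (int, float)."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(kernel_size, bool) or not isinstance(kernel_size, int):
        raise TypeError("kernel_size must be an integer")
    if kernel_size <= 0 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    if sigma < 0:
        raise ValueError("sigma must not be negative")
    return kernel_size, float(sigma)


if __name__ == "__main__":
    print(validate_blur_params(7, 2))
    try:
        validate_blur_params(4, 1.0)
    except ValueError as exc:
        print("rejected:", exc)
```

Calling a guard like this at the top of process_image would turn a low-level error from the extension into a clear Python exception.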

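Python module: classification.py

This module demonstrates a simple classification pipeline. Here we:

Extract features from images (using a color histogram)
Prepare a dataset from a folder structure (each subfolder is treated as a class)
Train an SVM classifier using scikit-learn
Use the classifier to predict the class of new images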

import cv2
import numpy as np
import os
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def extract_feature(image):
    # Calculate a 3D color histogram
    hist = cv2.calcHist([image], [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
    cv2.normalize(hist, hist)
    return hist.flatten()

def prepare_dataset(folder_path):
    features = []
    labels = []
    for label in os.listdir(folder_path):
        class_folder = os.path.join(folder_path, label)
        if os.path.isdir(class_folder):
            for filename in os.listdir(class_folder):
                img_path = os.path.join(class_folder, filename)
                image = cv2.imread(img_path)
                if image is not None:
                    feat = extract_feature(image)
                    features.append(feat)
                    labels.append(label)
    return np.array(features), np.array(labels)

def train_classifier(features, labels):
    clf = svm.SVC(kernel='linear')
    clf.fit(features, labels)
    return clf

def classify_image(image, classifier):
    feature = extract_feature(image)
    prediction = classifier.predict([feature])
    return prediction[0]

if __name__ == "__main__":
    # Make sure to organize your dataset accordingly.
    dataset_path = "dataset"  
    X, y = prepare_dataset(dataset_path)

    # Split the dataset into training and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

    # Train the classifier and evaluate its performance
    classifier = train_classifier(X_train, y_train)
    predictions = classifier.predict(X_test)
    print("Classification Accuracy:", accuracy_score(y_test, predictions))

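Explanation

Feature extraction: for simplicity, we use a 3D color histogram (8 bins per channel, 512 values in total) as the feature vector.
Dataset preparation: the function reads images from a directory organized by class label.
Classifier: we use a linear support vector machine (SVM) from scikit-learn.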
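To show what the 512-value feature vector contains, here is a pure-Python sketch of an 8x8x8 color histogram. It is an illustration under assumptions, not OpenCV's code: the function name color_histogram is hypothetical, the bin ordering (first channel varies slowest) is assumed to match the flattened calcHist output, and the L2 normalization is assumed to correspond to cv2.normalize called with its default arguments.

```python
import math

BINS = 8
BIN_WIDTH = 256 // BINS  # 32 intensity levels fall into each bin


def color_histogram(pixels):
    """Build a normalized 8x8x8 histogram from (b, g, r) tuples in 0..255."""
    hist = [0.0] * (BINS ** 3)
    for b, g, r in pixels:
        # Map each channel value to its bin, then flatten the 3D index.
        index = (
            (b // BIN_WIDTH) * BINS * BINS
            + (g // BIN_WIDTH) * BINS
            + (r // BIN_WIDTH)
        )
        hist[index] += 1.0
    # Scale to unit L2 norm so images of different sizes are comparable.
    norm = math.sqrt(sum(v * v for v in hist))
    if norm > 0:
        hist = [v / norm for v in hist]
    return hist


if __name__ == "__main__":
    pixels = [(0, 0, 0), (255, 255, 255), (255, 255, 255)]
    feature = color_histogram(pixels)
    print(len(feature))
    print([(i, round(v, 4)) for i, v in enumerate(feature) if v > 0])
```

Because the histogram discards where each color appears in the image, two images with similar color distributions but different content produce similar features. This is one reason the text describes the feature as a simple one.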

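Part 3: Structuring a Modular Project

A modular project structure makes your code easier to manage, test, and extend. One way to organize the files is as follows:

computer_vision_project/
├── CMakeLists.txt             # If using CMake to build the C++ module
├── fast_blur.cpp              # C++ code for fast image processing
├── image_processing.py        # Python module for image processing
├── classification.py          # Python module for feature extraction and classification
├── main.py                    # Driver script that ties the modules together
└── dataset/                   # Example dataset with one subdirectory per class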
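Since the dataset folder uses one subdirectory per class, a quick check of how many files each class contains can catch layout mistakes before training. The sketch below uses only the standard library. The function name summarize_dataset is hypothetical, and it counts every regular file without checking that it is a readable image.

```python
import os


def summarize_dataset(root):
    """Return {class_name: file_count} for each subdirectory of `root`."""
    counts = {}
    for label in sorted(os.listdir(root)):
        class_dir = os.path.join(root, label)
        if os.path.isdir(class_dir):
            counts[label] = sum(
                1
                for name in os.listdir(class_dir)
                if os.path.isfile(os.path.join(class_dir, name))
            )
    return counts


if __name__ == "__main__":
    if os.path.isdir("dataset"):
        for label, count in summarize_dataset("dataset").items():
            print(label, count)
    else:
        print("No 'dataset' folder found in the current directory")
```

An empty class folder, or a dataset with only one class, is worth fixing before training: scikit-learn's SVC needs samples from at least two classes to fit.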

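main.py example

This file ties everything together: it preprocesses an image with the image processing module and predicts its class with the classification module.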

from image_processing import load_image, process_image
from classification import classify_image, prepare_dataset, train_classifier
import cv2

def main():
    # Step 1: Preprocess an image using the C++ Gaussian blur module.
    image_path = "sample.jpg" 
    image = load_image(image_path)
    processed_image = process_image(image, kernel_size=7, sigma=2.0)
    cv2.imwrite("processed_sample.jpg", processed_image)
    print("Processed image saved as 'processed_sample.jpg'.")
    # Step 2: Train a classifier using a pre-organized dataset.
    dataset_path = "dataset"  
    features, labels = prepare_dataset(dataset_path)
    classifier = train_classifier(features, labels)

    # Step 3: Classify the processed image.
    predicted_class = classify_image(processed_image, classifier)
    print("Predicted class for the processed image:", predicted_class)

if __name__ == "__main__":
    main()

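Conclusion

In this article we showed how to build a practical computer vision pipeline that combines the following:

C++ and Python integration: pybind11 exposes a C++ module so that OpenCV can be used for fast image processing.
Modular Python code: image processing and machine learning classification live in separate modules, which makes the project easier to maintain and extend.

This modular approach keeps the performance-critical work in C++ while making it easy to experiment with different algorithms and models. You can extend the project by adding more advanced feature extractors, integrating a deep learning framework, or moving other performance-critical parts into C++.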
