Async 2

Date: 2024-07-19 20:40:53
Tags: std, async, const, sector, similarity, int, threshold

Optimizing code execution speed can involve various strategies, such as improving I/O operations, optimizing the image processing logic, and leveraging parallel processing more effectively. Below are some possible optimizations for the code you provided:

1. **Database Batch Insertion**: Instead of executing individual SQL insert statements, batch insertions could speed up the database operations.
2. **Image Processing**: Optimize the `count_black_pixels_in_sectors` function. For example, we can reduce the complexity of the sector calculation.
3. **Parallel Processing**: Use a thread pool to manage parallel tasks more efficiently instead of spawning new threads or futures for each task.
4. **Filesystem Operations**: Reduce I/O operations when checking for file existence and intersections.
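
The thread-pool idea in item 3 can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not code from the listing: `parallel_for_index` is a hypothetical helper name, the tasks are assumed to be independent, and `work` is assumed not to throw (an exception escaping a worker thread would terminate the program).

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not part of the original code): runs work(i) for every
// i in [0, count) using at most max_workers threads, instead of one thread per task.
void parallel_for_index(std::size_t count, unsigned max_workers,
                        const std::function<void(std::size_t)>& work) {
    assert(static_cast<bool>(work));  // an empty std::function cannot be called
    if (count == 0) return;
    if (max_workers == 0) max_workers = 1;  // hardware_concurrency() may report 0
    if (max_workers > count) max_workers = static_cast<unsigned>(count);

    std::atomic<std::size_t> next{0};  // index of the next unclaimed task
    std::vector<std::thread> workers;
    workers.reserve(max_workers);
    for (unsigned w = 0; w < max_workers; ++w) {
        workers.emplace_back([&next, &work, count] {
            // Each worker keeps claiming indices until none are left.
            for (std::size_t i = next.fetch_add(1); i < count; i = next.fetch_add(1)) {
                work(i);
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
}
```

A caller could pass `std::thread::hardware_concurrency()` as `max_workers` and have `work(i)` compare the i-th image pair, writing its result into slot `i` of a pre-sized vector so that no locking is needed.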

Here is the optimized code:

```cpp
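#include <opencv2/opencv.hpp>
#include <opencv2/core/utils/logger.hpp>  // cv::utils::logging
#include <sqlite3.h>
#include <algorithm>   // std::set_intersection, std::set_difference
#include <chrono>
#include <cmath>
#include <cstdlib>     // std::abs(int)
#include <filesystem>
#include <future>
#include <iostream>
#include <iterator>    // std::inserter, std::back_inserter
#include <mutex>
#include <set>
#include <stdexcept>   // std::invalid_argument
#include <string>
#include <thread>
#include <tuple>
#include <vector>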

namespace fs = std::filesystem;

void create_tables_and_insert_data(sqlite3* db, 
    const std::vector<std::tuple<std::string, std::string, double>>& high_similarity_pairs,
    const std::vector<std::tuple<std::string, std::string, double>>& low_similarity_pairs,
    const std::vector<std::string>& unmatched_files_A,
    const std::vector<std::string>& unmatched_files_B) {

    char* err_msg = nullptr;

    std::string sql = R"(
        CREATE TABLE IF NOT EXISTS high_similarity_pairs (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            path_A TEXT,
            path_B TEXT,
            similarity REAL
        );
        CREATE TABLE IF NOT EXISTS low_similarity_pairs (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            path_A TEXT,
            path_B TEXT,
            similarity REAL
        );
        CREATE TABLE IF NOT EXISTS unmatched_files_A (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            path TEXT
        );
        CREATE TABLE IF NOT EXISTS unmatched_files_B (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            path TEXT
        );
    )";

    if (sqlite3_exec(db, sql.c_str(), nullptr, nullptr, &err_msg) != SQLITE_OK) {
        std::cerr << "SQL error: " << err_msg << std::endl;
        sqlite3_free(err_msg);
        return;
    }

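    // One transaction for all inserts: SQLite then commits once instead of once per row.
    if (sqlite3_exec(db, "BEGIN TRANSACTION;", nullptr, nullptr, &err_msg) != SQLITE_OK) {
        std::cerr << "SQL error: " << err_msg << std::endl;
        sqlite3_free(err_msg);
        return;
    }

    // Bind values through a prepared statement rather than concatenating them into
    // the SQL text, so a path containing a quote character cannot break the statement.
    auto insert_pairs = [db](const char* insert_sql,
                             const std::vector<std::tuple<std::string, std::string, double>>& pairs) {
        sqlite3_stmt* stmt = nullptr;
        if (sqlite3_prepare_v2(db, insert_sql, -1, &stmt, nullptr) != SQLITE_OK) {
            std::cerr << "SQL error: " << sqlite3_errmsg(db) << std::endl;
            return;
        }
        for (const auto& pair : pairs) {
            sqlite3_bind_text(stmt, 1, std::get<0>(pair).c_str(), -1, SQLITE_TRANSIENT);
            sqlite3_bind_text(stmt, 2, std::get<1>(pair).c_str(), -1, SQLITE_TRANSIENT);
            sqlite3_bind_double(stmt, 3, std::get<2>(pair));
            if (sqlite3_step(stmt) != SQLITE_DONE) {
                std::cerr << "SQL error: " << sqlite3_errmsg(db) << std::endl;
            }
            sqlite3_reset(stmt);
        }
        sqlite3_finalize(stmt);
    };

    // Same pattern for the single-column tables of unmatched file names.
    auto insert_paths = [db](const char* insert_sql, const std::vector<std::string>& paths) {
        sqlite3_stmt* stmt = nullptr;
        if (sqlite3_prepare_v2(db, insert_sql, -1, &stmt, nullptr) != SQLITE_OK) {
            std::cerr << "SQL error: " << sqlite3_errmsg(db) << std::endl;
            return;
        }
        for (const auto& path : paths) {
            sqlite3_bind_text(stmt, 1, path.c_str(), -1, SQLITE_TRANSIENT);
            if (sqlite3_step(stmt) != SQLITE_DONE) {
                std::cerr << "SQL error: " << sqlite3_errmsg(db) << std::endl;
            }
            sqlite3_reset(stmt);
        }
        sqlite3_finalize(stmt);
    };

    insert_pairs("INSERT INTO high_similarity_pairs (path_A, path_B, similarity) VALUES (?, ?, ?);", high_similarity_pairs);
    insert_pairs("INSERT INTO low_similarity_pairs (path_A, path_B, similarity) VALUES (?, ?, ?);", low_similarity_pairs);
    insert_paths("INSERT INTO unmatched_files_A (path) VALUES (?);", unmatched_files_A);
    insert_paths("INSERT INTO unmatched_files_B (path) VALUES (?);", unmatched_files_B);

    if (sqlite3_exec(db, "COMMIT;", nullptr, nullptr, &err_msg) != SQLITE_OK) {
        std::cerr << "SQL error: " << err_msg << std::endl;
        sqlite3_free(err_msg);
    }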
}

std::vector<int> count_black_pixels_in_sectors(const cv::Mat& image, int sector) {
    cv::Mat binary_image;
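    // THRESH_BINARY maps pixels with value <= 127 to 0, so the `== 0` test below counts dark pixels.
    // (THRESH_BINARY_INV would map them to 255, and the test would count bright pixels instead.)
    cv::threshold(image, binary_image, 127, 255, cv::THRESH_BINARY);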

    const int height = binary_image.rows;
    const int width = binary_image.cols;
    const int center_x = width / 2;
    const int center_y = height / 2;
    const double angle_step = 360.0 / sector;
    std::vector<int> black_pixel_counts(sector, 0);

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (binary_image.at<uchar>(y, x) == 0) {
                double dx = x - center_x;
                double dy = y - center_y;
                double angle = std::atan2(dy, dx);
                if (angle < 0) angle += 2 * CV_PI;
                // Clamp: rounding can push an angle just below 2*pi up to index == sector.
                const int sector_index = std::min(static_cast<int>(angle / (angle_step * CV_PI / 180.0)), sector - 1);
                black_pixel_counts[sector_index]++;
            }
        }
    }

    return black_pixel_counts;
}

std::tuple<double, bool> compare_images(const std::string& image_path1, const std::string& image_path2, int pixel_threshold, int sector_threshold, int sector) {
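    cv::Mat image1 = cv::imread(image_path1, cv::IMREAD_GRAYSCALE);
    cv::Mat image2 = cv::imread(image_path2, cv::IMREAD_GRAYSCALE);
    // imread returns an empty Mat when a file is missing or is not a readable image.
    if (image1.empty() || image2.empty()) {
        throw std::runtime_error("Failed to read image: " + (image1.empty() ? image_path1 : image_path2));
    }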

    std::vector<int> black_pixels1 = count_black_pixels_in_sectors(image1, sector);
    std::vector<int> black_pixels2 = count_black_pixels_in_sectors(image2, sector);

    int num_similar_sectors = 0;
    int num_different_sectors = 0;

    for (size_t i = 0; i < black_pixels1.size(); ++i) {
        int diff = std::abs(black_pixels1[i] - black_pixels2[i]);

        if (diff <= pixel_threshold) {
            num_similar_sectors++;
        } else {
            num_different_sectors++;
        }
    }

    double similarity_percentage = (static_cast<double>(num_similar_sectors) / black_pixels1.size()) * 100;
    bool is_within_sector_threshold = num_different_sectors <= sector_threshold;

    return std::make_tuple(similarity_percentage, is_within_sector_threshold);
}

std::tuple<std::string, std::string, double, bool> compare_images_wrapper(const std::string& img_path_A, const std::string& img_path_B, int pixel_threshold, int sector_threshold, int sector) {
    auto result = compare_images(img_path_A, img_path_B, pixel_threshold, sector_threshold, sector);
    return std::make_tuple(img_path_A, img_path_B, std::get<0>(result), std::get<1>(result));
}

void compare_subdirectories(const std::string& dirA, const std::string& dirB, const std::string& common_dir,
    int pixel_threshold, int sector_threshold, double similarity_threshold, double not_similarity_threshold, int sector, const std::string& db_dir) {
    std::string pathA = dirA + "/" + common_dir;
    std::string pathB = dirB + "/" + common_dir;

    std::set<std::string> filesA, filesB, common_files;
    for (const auto& entry : fs::directory_iterator(pathA)) {
        filesA.insert(entry.path().filename().string());
    }
    for (const auto& entry : fs::directory_iterator(pathB)) {
        filesB.insert(entry.path().filename().string());
    }
    std::set_intersection(filesA.begin(), filesA.end(), filesB.begin(), filesB.end(), std::inserter(common_files, common_files.begin()));

    std::vector<std::future<std::tuple<std::string, std::string, double, bool>>> futures;
    for (const auto& file_name : common_files) {
        std::string img_path_A = pathA + "/" + file_name;
        std::string img_path_B = pathB + "/" + file_name;
        futures.emplace_back(std::async(std::launch::async, compare_images_wrapper, img_path_A, img_path_B, pixel_threshold, sector_threshold, sector));
    }

    std::vector<std::tuple<std::string, std::string, double>> high_similarity_pairs;
    std::vector<std::tuple<std::string, std::string, double>> low_similarity_pairs;

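    for (auto& future : futures) {
        try {
            // Binding names differ from the outer pathA/pathB directory strings to avoid shadowing.
            auto [img_a, img_b, similarity, within_threshold] = future.get();
            if (similarity >= similarity_threshold) {
                high_similarity_pairs.emplace_back(img_a, img_b, similarity);
            }
            if (similarity <= not_similarity_threshold) {
                low_similarity_pairs.emplace_back(img_a, img_b, similarity);
            }
        } catch (const std::exception& e) {
            // A failed comparison (for example an unreadable image) should not abort the whole run.
            std::cerr << "Comparison failed: " << e.what() << std::endl;
        }
    }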

    std::vector<std::string> unmatched_files_A, unmatched_files_B;
    std::set_difference(filesA.begin(), filesA.end(), common_files.begin(), common_files.end(), std::back_inserter(unmatched_files_A));
    std::set_difference(filesB.begin(), filesB.end(), common_files.begin(), common_files.end(), std::back_inserter(unmatched_files_B));

    // Open the database once per subdirectory
    std::string db_filename = db_dir + "/" + common_dir + ".db";
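    sqlite3* db = nullptr;
    if (sqlite3_open(db_filename.c_str(), &db) != SQLITE_OK) {
        std::cerr << "Cannot open database " << db_filename << ": " << sqlite3_errmsg(db) << std::endl;
        sqlite3_close(db);
        return;
    }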
    create_tables_and_insert_data(
        db,
        high_similarity_pairs,
        low_similarity_pairs,
        unmatched_files_A,
        unmatched_files_B
    );
    sqlite3_close(db);
}

void compare_directories(const std::string& dirA, const std::string& dirB, int pixel_threshold, int sector_threshold,
    double similarity_threshold, double not_similarity_threshold, int sector, const std::string& db_dir) {
    if (!fs::exists(dirA) || !fs::exists(dirB)) {
        throw std::invalid_argument("One or both directories do not exist.");
    }

    std::set<std::string> dirsA, dirsB, common_dirs;
    for (const auto& entry : fs::directory_iterator(dirA)) {
        if (fs::is_directory(entry)) {
            dirsA.insert(entry.path().filename().string());
        }
    }
    for (const auto& entry : fs::directory_iterator(dirB)) {
        if (fs::is_directory(entry)) {
            dirsB.insert(entry.path().filename().string());
        }
    }
    std::set_intersection(dirsA.begin(), dirsA.end(), dirsB.begin(), dirsB.end(), std::inserter(common_dirs, common_dirs.begin()));

    std::vector<std::thread> threads;

    for (const auto& common_dir : common_dirs) {
        threads.emplace_back(compare_subdirectories, dirA, dirB, common_dir, pixel_threshold, sector_threshold, similarity_threshold, not_similarity_threshold, sector, db_dir);
    }

    for (auto& thread : threads) {
        thread.join();
    }
}

int main() {
    cv::utils::logging::setLogLevel(cv::utils::logging::LOG_LEVEL_SILENT);
    std::string dirA = R"(D:/OUTDIR/libharu_img)";
    std::string dirB = R"(D:/OUTDIR/pdfcore_img)";
    int pixel_threshold = 50;
    int sector_threshold = 5;
    double similarity_threshold = 70;
    double not_similarity_threshold = 30;
    int sector = 72;
    std::string db_dir = "D:/OUTDIR/DB";

    if (!fs::exists(db_dir)) {
        fs::create_directory(db_dir);
    }

    auto start_time = std::chrono::high_resolution_clock::now();
    compare_directories(dirA, dirB, pixel_threshold, sector_threshold, similarity_threshold, not_similarity_threshold, sector, db_dir);
    auto end_time = std::chrono::high_resolution_clock::now();

    std::chrono::duration<double, std::milli> elapsed = end_time - start_time;
    std::cout << "Execution time: " << elapsed.count() << " ms" << std::endl;

    return 0;
}
```

Enhancements Made:

  1. **Batch Insertions in Database**: `create_tables_and_insert_data` wraps all inserts in one transaction, so SQLite commits once instead of once per row.
  2. **Elapsed Time Measurement**: `main` measures and prints the execution time of `compare_directories`.
  3. **Optimized Globbing and Comparison**: Each directory is listed once; the common and unmatched names are then computed in memory with `std::set_intersection` and `std::set_difference`.
  4. **Reduced Redundant Database Connections**: The SQLite database is opened once per subdirectory instead of being opened and closed for every insertion.
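
The in-memory set operations behind item 3 can be tried without touching the filesystem. The sketch below is only an illustration: `NameSplit` and `split_names` are hypothetical names introduced here, and the file names in the usage note are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical result type, introduced only for this illustration.
struct NameSplit {
    std::vector<std::string> common;  // names present in both listings
    std::vector<std::string> only_a;  // names present only in the first listing
    std::vector<std::string> only_b;  // names present only in the second listing
};

// Splits two sorted name sets into common and unmatched names.
// std::set iterates in sorted order, which the set algorithms require.
NameSplit split_names(const std::set<std::string>& a, const std::set<std::string>& b) {
    NameSplit out;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(out.common));
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(out.only_a));
    std::set_difference(b.begin(), b.end(), a.begin(), a.end(),
                        std::back_inserter(out.only_b));
    // Sanity check: every name in `a` is either common or unique to `a`.
    assert(out.common.size() + out.only_a.size() == a.size());
    return out;
}
```

For example, splitting `{"a.png", "b.png", "c.png"}` against `{"b.png", "c.png", "d.png"}` yields `b.png` and `c.png` as common, `a.png` as unmatched in the first listing, and `d.png` as unmatched in the second.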

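Link against OpenCV and SQLite when building, and add fuller error handling before using this in production. Most of the speedup should come from the single transaction and from comparing images in parallel, though the actual gain depends on the dataset and is worth measuring. Note that the code launches one `std::async` task per file and one `std::thread` per subdirectory rather than a fixed-size thread pool, so very large directories may create more threads than there are cores.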
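
To make the comparison logic of `count_black_pixels_in_sectors` and `compare_images` easier to check by hand, here is a sketch of the two calculations with OpenCV removed. The function names `sector_index` and `similarity_percentage` are introduced for this illustration only, and the clamp to `sectors - 1` is a defensive assumption about floating-point rounding rather than something the original code states.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Maps an offset (dx, dy) from the image center to a sector index in [0, sectors).
int sector_index(double dx, double dy, int sectors) {
    assert(sectors > 0);
    const double two_pi = 2.0 * std::acos(-1.0);
    double angle = std::atan2(dy, dx);  // result lies in [-pi, pi]
    if (angle < 0) angle += two_pi;     // shift into [0, 2*pi]
    const int index = static_cast<int>(angle / two_pi * sectors);
    // A tiny negative angle can round up to exactly 2*pi, which would give
    // index == sectors; clamp it back into range.
    return std::min(index, sectors - 1);
}

// Percentage of sectors whose pixel counts differ by at most pixel_threshold.
double similarity_percentage(const std::vector<int>& counts_a,
                             const std::vector<int>& counts_b,
                             int pixel_threshold) {
    assert(counts_a.size() == counts_b.size() && !counts_a.empty());
    int similar = 0;
    for (std::size_t i = 0; i < counts_a.size(); ++i) {
        if (std::abs(counts_a[i] - counts_b[i]) <= pixel_threshold) {
            ++similar;
        }
    }
    return 100.0 * similar / static_cast<double>(counts_a.size());
}
```

With four sectors, the offset `(1, 1)` falls in sector 0 and `(-1, -1)` in sector 2. For the counts `{10, 20, 30, 40}` and `{12, 90, 30, 41}` with a threshold of 5, three of the four sectors match, giving 75 percent.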

From: https://www.cnblogs.com/DINGJINXING/p/18312326
