首页 > 编程问答 >C++ 函数返回极其缓慢,远慢于功能等效的 python 代码

C++ 函数返回极其缓慢,远慢于功能等效的 python 代码

时间:2024-07-31 04:58:11浏览次数:11  
标签:python c++ algorithm optimization string-matching

我有一个在我编写的脚本中使用的函数,用于从列表中删除多余的阻塞关键字。基本上,输入(以任何顺序):

{"apple", "bapple", "banana", "cherry", "bananaman", "sweetherrypie", "sweet", "b"}

它应该输出一个缩小的字符串数组(以任何顺序):

{"apple", "cherry", "sweet", "b"}

这样做是为了减少匹配算法需要传递的关键字数量,如一个个人学习项目(如果一个字符串包含“banana”,它也会包含“b”,所以香蕉是不必要的,并且可以被丢弃

它需要具有极高的性能,因为它必须在超过 10,000,000 个字符串的列表上运行.

最初,我编写了一些 Python 代码来执行此操作:

def deduplicate_keywords(keywords):
    # Remove duplicates and sort the keywords by length in ascending order
    keywords = sorted(set(keywords), key=len)
    unique_keywords = []

    for keyword in keywords:
        not_redundant = True

        for unique_keyword in unique_keywords:
            # Check if the current keyword includes any of the unique keywords
            if unique_keyword in keyword:
                not_redundant = False
                break
        # If the keyword does not include any of the shorter keywords, add it to unique_keywords
        if not_redundant:
            unique_keywords.append(keyword)

    return unique_keywords

以及一个性能基准来衡量其性能:

class TestDeduplicateKeywords(unittest.TestCase):
    def test_performance_intensive_case(self):
        keywords = [
            "abcdef",
            "def",
            "ghidef",
            "jkl",
            "mnop",
            "qrst",
            "uvwxyz",
            "ghijkl",
            "abcdefgh",
            "ijklmnop",
            "mnopqrstuv",
            "rstuvwx",
            "mnopqrstuvwx",
            "uvwx",
        ]

        # Extending the list to make it very long and complex
        base_keywords = keywords[:]
        for i in range(200000):
            base_keywords.extend([f"{k}{i}" for k in keywords])
            base_keywords.extend([f"{i}{k}" for k in keywords])
            base_keywords.extend([f"{i}{k}{i}" for k in keywords])

        keywords = base_keywords

        old_function_time = 0
        new_function_time = 0

        for repeat in range(1): 
            # This loop (not present in the cpp code) is used to loop it multiple times to find an average
            start_time = time.time()
            result = deduplicate_keywords(keywords)
            end_time = time.time()
            old_function_time = (
                repeat * old_function_time + (end_time - start_time)
            ) / (repeat + 1)
            start_time = time.time()
            result = deduplicate_keywords_new(keywords) 
            # Above is a function that can be edited to check whether changes speed up the code
            end_time = time.time()
            new_function_time = (
                repeat * new_function_time + (end_time - start_time)
            ) / (repeat + 1)

        # As the list is extended with numbers, the result will be unique non-redundant keywords
        expected_result = ["def", "jkl", "mnop", "qrst", "uvwx"]

        self.assertEqual(set(result), set(expected_result))
        print(
            f"Test performance_intensive_case for old function took {old_function_time} seconds\n"
            f"Test performance_intensive_case for new function took {new_function_time} seconds\n"
        )

        if old_function_time < new_function_time:
            print(f"Old is faster than new")
        else:
            print(f"New is faster than old")


if __name__ == "__main__":
    unittest.main()

这对于我的需求来说太慢了 12 seconds 在我的弱笔记本电脑上。

因此,我尝试在 cpp 中编写相同的内容,看看是否会获得性能增益 如此 慢得多,因此我向函数本身添加了一些计时代码,以查看速度减慢的位置:|| |这将输出到终端(缩短,删除当前空的 deduplicate_keywords_new 函数的日志):

#include <iostream>
#include <vector>
#include <string>
#include <set>
#include <algorithm>
#include <chrono>
#include <cassert>

std::vector<std::string> deduplicate_keywords(const std::vector<std::string>& keywords) {
    auto t_start = std::chrono::high_resolution_clock::now();

    // Remove duplicates using a set
    auto t1 = std::chrono::high_resolution_clock::now();
    std::cout << "started old function!" << std::endl;
    std::set<std::string> unique_set(keywords.begin(), keywords.end());
    auto t2 = std::chrono::high_resolution_clock::now();
    std::cout << "removed duplicates in " << std::chrono::duration<double>(t2 - t1).count() << " seconds" << std::endl;

    // Convert set back to vector and sort by length
    auto t3 = std::chrono::high_resolution_clock::now();
    std::vector<std::string> sorted_keywords(unique_set.begin(), unique_set.end());
    std::sort(sorted_keywords.begin(), sorted_keywords.end(), [](const std::string& a, const std::string& b) {
        return a.length() < b.length();
    });
    auto t4 = std::chrono::high_resolution_clock::now();
    std::cout << "Sorted by length in " << std::chrono::duration<double>(t4 - t3).count() << " seconds" << std::endl;

    // Filter out redundant keywords
    auto t5 = std::chrono::high_resolution_clock::now();
    std::vector<std::string> unique_keywords;
    for (const auto& keyword : sorted_keywords) {

        for (const auto& unique_keyword : unique_keywords) {
            // Check if the current keyword includes any of the unique keywords
            if (keyword.find(unique_keyword) != std::string::npos) {
                goto next_iteration;
            }
        }

        // If the keyword does not include any of the shorter keywords, add it to unique_keywords
        unique_keywords.push_back(keyword);
        next_iteration:
            continue;
    }
    auto t6 = std::chrono::high_resolution_clock::now();
    std::cout << "Filtered in " << std::chrono::duration<double>(t6 - t5).count() << " seconds" << std::endl;

    auto t7 = std::chrono::high_resolution_clock::now();
    sorted_keywords.clear();
    sorted_keywords.shrink_to_fit();
    unique_keywords.shrink_to_fit();
    auto t8 = std::chrono::high_resolution_clock::now();
    std::cout << "Destructed in " << std::chrono::duration<double>(t8 - t7).count() << " seconds" << std::endl;

    auto t9 = std::chrono::high_resolution_clock::now();
    auto total_time = std::chrono::duration<double>(t9 - t_start).count();
    std::cout << "Total function time: " << total_time << " seconds" << std::endl;

    return unique_keywords;
}

void test_performance_intensive_case() {
    std::vector<std::string> keywords = {
        "abcdef", "def", "ghidef", "jkl", "mnop", "qrst", "uvwxyz",
        "ghijkl", "abcdefgh", "ijklmnop", "mnopqrstuv", "rstuvwx",
        "mnopqrstuvwx", "uvwx"
    };

    // Extending the list to make it very long and complex
    std::vector<std::string> base_keywords = keywords;
    for (int i = 0; i < 20000; ++i) {
        for (const auto& k : keywords) {
            base_keywords.push_back(k + std::to_string(i));
            base_keywords.push_back(std::to_string(i) + k);
            base_keywords.push_back(std::to_string(i) + k + std::to_string(i));
        }
    }

    auto keywords_extended = base_keywords;
    std::set<std::string> expected_result = {"def", "jkl", "mnop", "qrst", "uvwx"};

    // there is no looping to calculate an average because it is so much slower
    auto start_time_total = std::chrono::high_resolution_clock::now();
    auto start_time = std::chrono::high_resolution_clock::now();
    std::vector<std::string> result_old = deduplicate_keywords(keywords_extended);
    auto end_time = std::chrono::high_resolution_clock::now();
    double old_function_time = std::chrono::duration<double>(end_time - start_time).count();
    auto end_time_total = std::chrono::high_resolution_clock::now();
    double total_test_time = std::chrono::duration<double>(end_time_total - start_time_total).count();
    assert(std::set<std::string>(result_old.begin(), result_old.end()) == expected_result);

    std::cout << "old function time: " << old_function_time << " seconds" << std::endl;
    std::cout << "Total test case time: " << total_test_time << " seconds" << std::endl;

    auto start_time_total_new = std::chrono::high_resolution_clock::now();
    auto start_time_new = std::chrono::high_resolution_clock::now();
    std::vector<std::string>result_new = deduplicate_keywords_new(keywords_extended); // same as in the python code, useful for comparison
    auto end_time_new = std::chrono::high_resolution_clock::now();
    double new_function_time = std::chrono::duration<double>(end_time - start_time).count();
    auto end_time_total_new = std::chrono::high_resolution_clock::now();
    double total_test_time_new = std::chrono::duration<double>(end_time_total_new - start_time_total_new).count();
    assert(std::set<std::string>(result_new.begin(), result_new.end()) == expected_result);


    std::cout << "new function time: " << new_function_time << " seconds" << std::endl;
    std::cout << "Total test case time: " << total_test_time_new << " seconds" << std::endl;

    std::cout << "Test performance_intensive_case for old function took " << old_function_time
              << " seconds but new function took " << new_function_time << " seconds"  << std::endl;

    if (old_function_time < new_function_time) {
        std::cout << "Old is faster than new" << std::endl;
    } else {
        std::cout << "New is faster than old" << std::endl;
    }
}

int main() {
    test_performance_intensive_case();
    return 0;
}

除了总体速度降低之外,日志似乎表明返回函数(

started old function!
removed duplicates in 9.64173 seconds
Sorted by length in 3.36138 seconds
Filtered in 0.366933 seconds
Destructed in 7.63208 seconds
Total function time: 21.0037 seconds
old function time: 405.213 seconds
Total test case time: 405.213 seconds

)似乎需要非常长的时间。为什么是这样?我能做些什么来阻止这种情况呢?我对 cpp 编码和开发相当陌生,因此如果您的解释详细而透彻,那就太好了。谢谢! return unique keywords ) seems to take extremely long. Why is this? And what can I do to stop this? I am fairly new to cpp coding and development, so it would be great if your explainations were detailed and thorough. Thanks!


遇到的 C++ 代码性能问题与返回语句本身无关,而是与向量去分配和复制的方式有关。问题出在 test_performance_intensive_case 函数中的这一行:

std::vector<std::string> result_old = deduplicate_keywords(keywords_extended);

这里发生的是, deduplicate_keywords 函数返回的 unique_keywords 向量 按值复制 result_old 向量。这意味着将复制整个向量内容(包括所有字符串),这是一项代价高昂的操作,尤其是在向量很大时。

C++ 代码的执行时间比 Python 代码长得多的原因是,Python 代码可能在内部优化了此复制操作,或者根本没有进行复制(例如,通过使用某种形式的写时复制)。

为了解决这个问题,可以通过 引用 返回向量:

const std::vector<std::string>& deduplicate_keywords(const std::vector<std::string>& keywords) {
    // ... [函数的其余部分保持不变] ...

    return unique_keywords;
}

并在 test_performance_intensive_case 函数中按引用接收它:

const std::vector<std::string>& result_old = deduplicate_keywords(keywords_extended);

通过按引用返回向量,避免了复制整个向量及其内容,这将显着提高代码的性能。

除了这个更改之外,以下是一些可以进一步优化 C++ 代码的额外提示:

  • 使用 reserve() 预分配向量容量: unique_keywords 向量中添加元素之前,使用 unique_keywords.reserve(sorted_keywords.size()) 预分配其容量。这将最大程度地减少向量在添加元素时需要调整大小的次数。
  • 考虑使用 std::unordered_set 来代替 std::set 如果不需要对关键字进行排序,则可以使用 std::unordered_set 来实现更快的重复项删除。
  • 使用分析器来识别瓶颈: 分析器可以帮助确定代码中占用最多时间的特定部分,这使可以专注于优化这些区域。

通过应用这些更改并进一步分析的代码,可以显着提高其性能并使其更快地执行。

标签:python,c++,algorithm,optimization,string-matching
From: 78812814

相关文章

  • 在预定时间从 python telegram bot 发起对话
    对于没有提供代码,我提前表示歉意。我明天会尝试添加它,但我现在还没有接近它,思考如何解决这个问题让我一直在思考。我已经为一个机器人创建了一个程序,该程序的数据帧充满了之前请求的用户添加到机器人的消息列表中。现在,机器人使用job_queue在一天中的随机时间向每个用户发......
  • python函数:带星号的参数一个接一个
    我想知道为什么定义以下函数被认为是好的。deffoo(*x,y):pass据我所知,现在有调用此函数的方法,因为它总是缺少y的值。(如果我错了,请纠正我。)这有什么我不知道的用处吗?你说得对,在Python中定义一个像deffoo(*x,y):...这样的函数,在试图以常规方......
  • 基于遗传(GA)、粒子群(PSO)、模拟退火(SA)、禁忌搜索(ST)、蚁群算法(ACO)、自自组织神
        ......
  • 10个append()函数在Python程序开发中的创新应用
    文末赠免费精品编程资料~~在Python编程的世界里,append()函数是列表操作中最常见的方法之一。它允许我们在列表的末尾添加一个元素,这一简单的功能却能激发无限的创造力。今天,我们将探讨append()函数在Python程序开发中的10种创新应用,从基本用法到高级技巧,逐步深入。1.构......
  • 全网最适合入门的面向对象编程教程:28 类和对象的Python实现-Python编程原则、哲学和规
    全网最适合入门的面向对象编程教程:28类和对象的Python实现-Python编程原则、哲学和规范大汇总摘要:本文主要介绍了在使用Python进行面向对象编程时,Python异常处理的原则-“请求谅解,而非许可”,以及软件设计和Python的编程原则,同时介绍了PEP8规范。原文链接:FreakStud......
  • python生成器
    一前言环境:python3.10win10二生成器1关于生成器先看一个例子    定义了一个函数,当我们运行该函数时,并未像普通函数那样执行函数体内的代码    从其中的英文可知,执行函数得到了一个生成器对象,这个生成器对象也叫做generatoriterator(生成器迭代器),generatorit......
  • C++初阶大总结
    目录一.命名空间1.命名空间定义2.命名空间使用二.C++输入&输出三.缺省参数四.函数重载五.引用1.常引用2.传值、传引用效率比较3.引用和指针的区别4.引用和指针的不同点:小知识点:六.内联函数七.auto关键字(C++11)1.auto的使用细则八.基于范围的for循环(C++11)......
  • 【C++】入门基础
     1.命名空间1.1namespace的价值在C/C++中,变量、函数和后⾯要学到的类都是⼤量存在的,这些变量、函数和类的名称将都存在于全局作⽤域中,可能会导致很多冲突。使⽤命名空间的⽬的是对标识符的名称进⾏本地化,以避免命名冲突或名字污染,namespace关键字的出现就是针对这种问题的......
  • 生成MySQL-oracle-SQL server数据字典(附Python代码)
    生成数据字典,早年写的,请注意新的版本变化。(1)MySQL元数据SQLUSEinformation_schema;#取出库和表。select  TABLE_SCHEMAAS'数据库名称',  TABLE_NAMEAS'表名',  TABLE_TYPEAS'表类型',  ROW_FORMATAS'行格式',  ENGINEAS'数据库引擎',  TABL......
  • Python - Method Resolution Order (MRO)
    TheorderinwhichPythonsearchesforattributesinbaseclassesiscalledmethodresolutionorder(MRO).Itgivesalinearizedpathforaninheritancestructure.PythoncomputesanMROforeveryclassinthehierarchy;thisMROiscomputedusingthe‘C3......