
Debugging C++ deadlocks with gdb and pstack

Date: 2024-12-19 14:20:04

  pstack

  pstack is a Linux command-line tool that prints the stack trace of every thread in a running process.

  The following TCP server is written to deadlock on purpose:

  

#include <iostream>
#include <thread>
#include <mutex>
#include <chrono>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#include <cstring>

// Two global mutexes
std::mutex mutex1;
std::mutex mutex2;

// Handle one client connection
void handle_client(int client_socket, int client_id) {
    std::cout << "Client " << client_id << ": Connected" << std::endl;

    // Simulate request handling
    if (client_id == 1) {
        // Client 1: lock mutex1 first, then mutex2
        std::cout << "Client " << client_id << ": Trying to lock mutex1..." << std::endl;
        std::lock_guard<std::mutex> lock1(mutex1);
        std::this_thread::sleep_for(std::chrono::seconds(5));  // hold mutex1 for 5 seconds
        std::cout << "Client " << client_id << ": Locked mutex1, now trying to lock mutex2..." << std::endl;

        // Try to lock mutex2
        std::lock_guard<std::mutex> lock2(mutex2);  // deadlock point
        std::this_thread::sleep_for(std::chrono::seconds(5));  // simulate more work
        std::cout << "Client " << client_id << ": Locked both mutex1 and mutex2" << std::endl;
    } else if (client_id == 2) {
        // Client 2: lock mutex2 first, then mutex1
        std::cout << "Client " << client_id << ": Trying to lock mutex2..." << std::endl;
        std::lock_guard<std::mutex> lock2(mutex2);
        std::this_thread::sleep_for(std::chrono::seconds(5));  // hold mutex2 for 5 seconds
        std::cout << "Client " << client_id << ": Locked mutex2, now trying to lock mutex1..." << std::endl;

        // Try to lock mutex1
        std::lock_guard<std::mutex> lock1(mutex1);  // deadlock point
        std::this_thread::sleep_for(std::chrono::seconds(5));  // simulate more work
        std::cout << "Client " << client_id << ": Locked both mutex1 and mutex2" << std::endl;
    }

    // Close the client connection
    close(client_socket);
    std::cout << "Client " << client_id << ": Disconnected" << std::endl;
}

// TCP server main loop
void start_server(int port) {
    int server_fd, new_socket;
    struct sockaddr_in address;
    int opt = 1;
    int addrlen = sizeof(address);

    // Create the listening socket; socket() returns -1 on failure
    if ((server_fd = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        perror("socket failed");
        exit(EXIT_FAILURE);
    }

    // Set SO_REUSEADDR (option names are not bit flags and cannot be OR-ed together)
    if (setsockopt(server_fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt))) {
        perror("setsockopt failed");
        exit(EXIT_FAILURE);
    }

    // Bind the socket to the given port
    address.sin_family = AF_INET;
    address.sin_addr.s_addr = INADDR_ANY;
    address.sin_port = htons(port);

    if (bind(server_fd, (struct sockaddr *)&address, sizeof(address)) < 0) {
        perror("bind failed");
        exit(EXIT_FAILURE);
    }

    // Listen for connections
    if (listen(server_fd, 3) < 0) {
        perror("listen failed");
        exit(EXIT_FAILURE);
    }

    std::cout << "Server started on port " << port << ". Waiting for connections..." << std::endl;

    int client_id = 1;  // distinguishes the clients

    while (true) {
        // Accept a new client connection
        if ((new_socket = accept(server_fd, (struct sockaddr *)&address, (socklen_t*)&addrlen)) < 0) {
            perror("accept failed");
            continue;
        }

        // Start one thread per client
        std::thread client_thread(handle_client, new_socket, client_id++);
        client_thread.detach();  // detach so the thread runs independently
    }
}

int main() {
    int port = 8080;
    start_server(port);
    return 0;
}
tcp_deadlock_server.cpp

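  Compile: g++ -std=c++11 -pthread -g -o tcp_deadlock_server tcp_deadlock_server.cpp

Connect twice with telnet (telnet 127.1 8080), the second time within about 5 seconds of the first so that each client still holds its first mutex, and the server deadlocks. The server prints: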

Server started on port 8080. Waiting for connections...
Client 1: Connected
Client 1: Trying to lock mutex1...
Client 2: Connected
Client 2: Trying to lock mutex2...
Client 1: Locked mutex1, now trying to lock mutex2...
Client 2: Locked mutex2, now trying to lock mutex1...

  Debugging the deadlock with pstack

  Find the process ID with ps, then run pstack on it. Redirecting the output to a file makes it easier to read: pstack 915 > pstack_out

  

Thread 3 (LWP 919):
#0  0x0000fffcc23821dc in ?? () from /lib64/libpthread.so.0
#1  0x0000fffcc237b060 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2  0x00000000004012c4 in __gthread_mutex_lock (__mutex=0x420240 <mutex1>) at /usr/include/c++/7.3.0/aarch64-linux-gnu/bits/gthr-default.h:748
#3  0x0000000000401a88 in std::mutex::lock (this=0x420240 <mutex1>) at /usr/include/c++/7.3.0/bits/std_mutex.h:103
#4  0x0000000000401b34 in std::lock_guard<std::mutex>::lock_guard (this=0xfffcc19ce810, __m=...) at /usr/include/c++/7.3.0/bits/std_mutex.h:162
#5  0x00000000004015ac in handle_client (client_socket=5, client_id=2) at tcp_deadlock_server.cpp:38
#6  0x0000000000402308 in std::__invoke_impl<void, void (*)(int, int), int, int> (__f=@0x248c23e0: 0x401310 <handle_client(int, int)>, __args#0=@0x248c23dc: 5, __args#1=@0x248c23d8: 2) at /usr/include/c++/7.3.0/bits/invoke.h:60
#7  0x0000000000401e18 in std::__invoke<void (*)(int, int), int, int> (__fn=@0x248c23e0: 0x401310 <handle_client(int, int)>, __args#0=@0x248c23dc: 5, __args#1=@0x248c23d8: 2) at /usr/include/c++/7.3.0/bits/invoke.h:95
#8  0x00000000004029cc in std::thread::_Invoker<std::tuple<void (*)(int, int), int, int> >::_M_invoke<0ul, 1ul, 2ul> (this=0x248c23d8) at /usr/include/c++/7.3.0/thread:234
#9  0x0000000000402970 in std::thread::_Invoker<std::tuple<void (*)(int, int), int, int> >::operator() (this=0x248c23d8) at /usr/include/c++/7.3.0/thread:243
#10 0x0000000000402950 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(int, int), int, int> > >::_M_run (this=0x248c23d0) at /usr/include/c++/7.3.0/thread:186
#11 0x0000fffcc257e134 in ?? () from /lib64/libstdc++.so.6
#12 0x0000fffcc23788cc in ?? () from /lib64/libpthread.so.0
#13 0x0000fffcc22ba1ec in ?? () from /lib64/libc.so.6
Thread 2 (LWP 917):
#0  0x0000fffcc23821dc in ?? () from /lib64/libpthread.so.0
#1  0x0000fffcc237b060 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2  0x00000000004012c4 in __gthread_mutex_lock (__mutex=0x420270 <mutex2>) at /usr/include/c++/7.3.0/aarch64-linux-gnu/bits/gthr-default.h:748
#3  0x0000000000401a88 in std::mutex::lock (this=0x420270 <mutex2>) at /usr/include/c++/7.3.0/bits/std_mutex.h:103
#4  0x0000000000401b34 in std::lock_guard<std::mutex>::lock_guard (this=0xfffcc21de820, __m=...) at /usr/include/c++/7.3.0/bits/std_mutex.h:162
#5  0x0000000000401450 in handle_client (client_socket=4, client_id=1) at tcp_deadlock_server.cpp:27
#6  0x0000000000402308 in std::__invoke_impl<void, void (*)(int, int), int, int> (__f=@0x248c2290: 0x401310 <handle_client(int, int)>, __args#0=@0x248c228c: 4, __args#1=@0x248c2288: 1) at /usr/include/c++/7.3.0/bits/invoke.h:60
#7  0x0000000000401e18 in std::__invoke<void (*)(int, int), int, int> (__fn=@0x248c2290: 0x401310 <handle_client(int, int)>, __args#0=@0x248c228c: 4, __args#1=@0x248c2288: 1) at /usr/include/c++/7.3.0/bits/invoke.h:95
#8  0x00000000004029cc in std::thread::_Invoker<std::tuple<void (*)(int, int), int, int> >::_M_invoke<0ul, 1ul, 2ul> (this=0x248c2288) at /usr/include/c++/7.3.0/thread:234
#9  0x0000000000402970 in std::thread::_Invoker<std::tuple<void (*)(int, int), int, int> >::operator() (this=0x248c2288) at /usr/include/c++/7.3.0/thread:243
#10 0x0000000000402950 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(int, int), int, int> > >::_M_run (this=0x248c2280) at /usr/include/c++/7.3.0/thread:186
#11 0x0000fffcc257e134 in ?? () from /lib64/libstdc++.so.6
#12 0x0000fffcc23788cc in ?? () from /lib64/libpthread.so.0
#13 0x0000fffcc22ba1ec in ?? () from /lib64/libc.so.6
Thread 1 (LWP 915):
#0  0x0000fffcc23827c4 in accept () from /lib64/libpthread.so.0
#1  0x0000000000401868 in start_server (port=8080) at tcp_deadlock_server.cpp:89
#2  0x00000000004018f8 in main () at tcp_deadlock_server.cpp:102
pstack output

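  The output shows three threads. Thread 3 (LWP 919) is blocked in pthread_mutex_lock on mutex1: #5  0x00000000004015ac in handle_client (client_socket=5, client_id=2) at tcp_deadlock_server.cpp:38

  Thread 2 (LWP 917) is blocked in pthread_mutex_lock on mutex2: #5  0x0000000000401450 in handle_client (client_socket=4, client_id=1) at tcp_deadlock_server.cpp:27

  Frame #2 names the mutex each thread is waiting for, and frame #5 gives the source line of the blocked lock call. Client 2 waits for mutex1 at line 38 while client 1 waits for mutex2 at line 27, and each holds the mutex the other wants. That pinpoints the deadlock.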
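
  Once the two lock sites are known, one common way to remove this kind of lock-order inversion is to acquire both mutexes in a single step with std::lock, which the C++11 standard specifies as deadlock-free regardless of argument order. The sketch below is not part of the original server: with_both_locks, run_two_clients and shared_counter are hypothetical names used only to illustrate the idea, with two threads naming the mutexes in opposite order as clients 1 and 2 do.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

std::mutex mutex1;
std::mutex mutex2;
int shared_counter = 0;  // hypothetical shared state guarded by both mutexes

// Acquire both mutexes together. std::lock uses a deadlock-avoidance
// algorithm, so the order in which callers name the mutexes does not matter.
void with_both_locks(std::mutex& first, std::mutex& second) {
    std::lock(first, second);
    // adopt_lock: the guards take over the already-held locks and release them.
    std::lock_guard<std::mutex> guard_first(first, std::adopt_lock);
    std::lock_guard<std::mutex> guard_second(second, std::adopt_lock);
    ++shared_counter;
}

// Two threads that name the mutexes in opposite order, like clients 1 and 2.
int run_two_clients(int iterations) {
    assert(iterations >= 0);
    shared_counter = 0;
    std::thread client1([iterations] {
        for (int i = 0; i < iterations; ++i) with_both_locks(mutex1, mutex2);
    });
    std::thread client2([iterations] {
        for (int i = 0; i < iterations; ++i) with_both_locks(mutex2, mutex1);
    });
    client1.join();
    client2.join();
    return shared_counter;
}
```

  Calling run_two_clients(1000) returns once both threads have finished. The alternative is to keep separate lock_guard objects and make every thread take mutex1 before mutex2.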

  gdb

  A second version of the program, written without C++11 (plain pthreads):

  

#include <iostream>
#include <cstring>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <pthread.h>
#include <errno.h>
#include <cstdlib>
#include <fcntl.h>
#include <stdio.h>

// Two global mutexes
pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t mutex2 = PTHREAD_MUTEX_INITIALIZER;

// Handle one client connection
void* handle_client(void* arg) {
    int client_socket = *(static_cast<int*>(arg));
    delete static_cast<int*>(arg);  // allocated with new in start_server, so release with delete

    std::cout << "Client connected with socket: " << client_socket << std::endl;

    // Simulate request handling; this relies on the first two accepted sockets being fds 4 and 5
    if (client_socket == 4) {
        // Client 1: lock mutex1 first, then mutex2
        std::cout << "Client " << client_socket << ": Trying to lock mutex1..." << std::endl;
        pthread_mutex_lock(&mutex1);
        usleep(5000000);  // sleep for 5 seconds
        std::cout << "Client " << client_socket << ": Locked mutex1, now trying to lock mutex2..." << std::endl;

        // Try to lock mutex2
        pthread_mutex_lock(&mutex2);  // deadlock point
        usleep(5000000);  // sleep for 5 seconds
        std::cout << "Client " << client_socket << ": Locked both mutex1 and mutex2" << std::endl;

        // Release the mutexes
        pthread_mutex_unlock(&mutex2);
        pthread_mutex_unlock(&mutex1);
    } else if (client_socket == 5) {
        // Client 2: lock mutex2 first, then mutex1
        std::cout << "Client " << client_socket << ": Trying to lock mutex2..." << std::endl;
        pthread_mutex_lock(&mutex2);
        usleep(5000000);  // sleep for 5 seconds
        std::cout << "Client " << client_socket << ": Locked mutex2, now trying to lock mutex1..." << std::endl;

        // Try to lock mutex1
        pthread_mutex_lock(&mutex1);  // deadlock point
        usleep(5000000);  // sleep for 5 seconds
        std::cout << "Client " << client_socket << ": Locked both mutex1 and mutex2" << std::endl;

        // Release the mutexes
        pthread_mutex_unlock(&mutex1);
        pthread_mutex_unlock(&mutex2);
    }

    // Close the client connection
    close(client_socket);
    std::cout << "Client disconnected with socket: " << client_socket << std::endl;

    pthread_exit(NULL);
}

// TCP server main loop
void start_server(int port) {
    int server_fd, new_socket;
    struct sockaddr_in address;
    int opt = 1;
    int addrlen = sizeof(address);

    // Create the listening socket; socket() returns -1 on failure
    if ((server_fd = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
        perror("socket failed");
        exit(EXIT_FAILURE);
    }

    // Set the SO_REUSEADDR option
    if (setsockopt(server_fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt))) {
        perror("setsockopt failed");
        exit(EXIT_FAILURE);
    }

    // Bind the socket to the given port
    address.sin_family = AF_INET;
    address.sin_addr.s_addr = INADDR_ANY;
    address.sin_port = htons(port);

    if (bind(server_fd, (struct sockaddr *)&address, sizeof(address)) < 0) {
        perror("bind failed");
        exit(EXIT_FAILURE);
    }

    // Listen for connections
    if (listen(server_fd, 3) < 0) {
        perror("listen failed");
        exit(EXIT_FAILURE);
    }

    std::cout << "Server started on port " << port << ". Waiting for connections..." << std::endl;

    int client_id = 1;  // counts accepted clients

    while (true) {
        // Accept a new client connection
        if ((new_socket = accept(server_fd, (struct sockaddr *)&address, (socklen_t*)&addrlen)) < 0) {
            perror("accept failed");
            continue;
        }

        // To test the deadlock, serve only the first two connections.
        // The check runs before a thread is created so that the socket
        // is never closed both here and in handle_client.
        if (client_id >= 3) {
            close(new_socket);  // close any extra connection
            continue;
        }
        client_id++;

        // Start one thread per client
        pthread_t thread;
        int* client_socket_ptr = new int(new_socket);  // heap copy of the descriptor, owned by the thread
        if (pthread_create(&thread, NULL, handle_client, static_cast<void*>(client_socket_ptr)) != 0) {
            perror("pthread_create failed");
            delete client_socket_ptr;  // thread creation failed, release the memory
            close(new_socket);
            continue;
        }

        // Detach so the thread runs independently
        pthread_detach(thread);
    }
}

int main() {
    int port = 8080;
    start_server(port);
    return 0;
}
tcp_deadlock_server_c++0x.cpp

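  Compile it, run it, and test it with telnet the same way as above.

  Debugging the deadlock with gdb

  Find the process ID with ps, then attach gdb to the process: gdb -p 11560

  List all threads with info threads, switch to one with thread 2, and print its stack with bt; then repeat for the other thread. Both worker threads turn out to be blocked in pthread_mutex_lock, one at line 48 and the other at line 33 of the source. The full session: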

  

(gdb) info threads
  3 Thread 0x7fb6c0115700 (LWP 11562)  0x0000003b5200dff4 in __lll_lock_wait () from /lib64/libpthread.so.0
  2 Thread 0x7fb6bf714700 (LWP 11564)  0x0000003b5200dff4 in __lll_lock_wait () from /lib64/libpthread.so.0
* 1 Thread 0x7fb6c0117720 (LWP 11560)  0x0000003b5200e7ed in accept () from /lib64/libpthread.so.0
(gdb) thread 2
[Switching to thread 2 (Thread 0x7fb6bf714700 (LWP 11564))]#0  0x0000003b5200dff4 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0  0x0000003b5200dff4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003b52009328 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x0000003b520091f7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000400f9f in handle_client (arg=0x13a8010) at tcp_deadlock_server.cpp:48
#4  0x0000003b520077f1 in start_thread () from /lib64/libpthread.so.0
#5  0x0000003b51ce570d in clone () from /lib64/libc.so.6
(gdb) thread 3
[Switching to thread 3 (Thread 0x7fb6c0115700 (LWP 11562))]#0  0x0000003b5200dff4 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) bt
#0  0x0000003b5200dff4 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x0000003b52009328 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x0000003b520091f7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000400eb2 in handle_client (arg=0x13a8010) at tcp_deadlock_server.cpp:33
#4  0x0000003b520077f1 in start_thread () from /lib64/libpthread.so.0
#5  0x0000003b51ce570d in clone () from /lib64/libc.so.6
gdb deadlock debugging session
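
  The backtraces show where each thread is waiting, but not which thread holds the mutex it is waiting for. On Linux with glibc, a pthread_mutex_t records the LWP of its owner in an internal field, so in the attached gdb session an expression such as print mutex1.__data.__owner can be compared against the LWP numbers listed by info threads. The sketch below reads the same field from C++ to show what it contains. It assumes glibc: __data.__owner is an undocumented implementation detail that may change, and current_lwp, mutex_owner_lwp and owner_is_recorded are hypothetical helper names.

```cpp
#include <cassert>
#include <pthread.h>
#include <sys/syscall.h>
#include <unistd.h>

// Kernel thread ID of the caller: the LWP number shown by gdb and pstack.
long current_lwp() {
    return static_cast<long>(syscall(SYS_gettid));
}

// Assumption: glibc on Linux. __data.__owner is an internal, undocumented
// field holding the LWP of the thread that owns the mutex (0 when unlocked).
long mutex_owner_lwp(const pthread_mutex_t& m) {
    return static_cast<long>(m.__data.__owner);
}

// Hypothetical self-check: lock a default mutex on the calling thread and
// report whether glibc recorded this thread as the owner.
bool owner_is_recorded() {
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    assert(mutex_owner_lwp(m) == 0);  // nobody holds it yet
    pthread_mutex_lock(&m);
    bool recorded = (mutex_owner_lwp(m) == current_lwp());
    pthread_mutex_unlock(&m);
    return recorded;
}
```

  In a deadlock like the one above, the owner of the mutex that one thread is waiting for should be the LWP of the other blocked thread, which confirms the cycle.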

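  Summary

  Both pstack and gdb attach to the target process with the ptrace() system call. ptrace() lets GDB pause the target process, read and modify its memory and registers, and intercept its system calls.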
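
  To make the attach step concrete, here is a minimal sketch of a tracer that attaches to a process, waits for it to stop, reads one word of its memory, and detaches. It is only an illustration of the ptrace() calls involved, not how gdb or pstack are implemented. peek_word, demo_attach_and_peek and g_marker are hypothetical names, the target is a forked child so that the address of g_marker is the same in both processes, and the sketch assumes the system permits ptrace on a child process (security settings or sandboxes may forbid it).

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical marker value; after fork() it sits at the same address in the child.
static long g_marker = 0x1234;

// Attach to pid, read one word at addr from its memory, then detach.
// Returns false if any step fails (for example, when ptrace is not permitted).
bool peek_word(pid_t pid, void* addr, long* out) {
    assert(out != nullptr);
    if (ptrace(PTRACE_ATTACH, pid, nullptr, nullptr) == -1) return false;
    int status = 0;
    // PTRACE_ATTACH stops the target; wait until that stop is reported.
    if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status)) {
        ptrace(PTRACE_DETACH, pid, nullptr, nullptr);
        return false;
    }
    errno = 0;  // PEEKDATA returns the word itself, so errors show up only in errno
    long word = ptrace(PTRACE_PEEKDATA, pid, addr, nullptr);
    bool ok = (errno == 0);
    ptrace(PTRACE_DETACH, pid, nullptr, nullptr);  // let the target run again
    if (ok) *out = word;
    return ok;
}

// Fork an idle child, read g_marker out of it, and clean up.
// Returns the value read, or -1 on failure.
long demo_attach_and_peek() {
    pid_t child = fork();
    if (child < 0) return -1;
    if (child == 0) {
        for (;;) pause();  // child: idle until killed
    }
    long value = -1;
    bool ok = peek_word(child, &g_marker, &value);
    kill(child, SIGKILL);
    waitpid(child, nullptr, 0);
    return ok ? value : -1;
}
```

  The target is stopped between the attach and the detach, which is why a process being inspected with gdb or pstack does not make progress during that time.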

From: https://www.cnblogs.com/liudw-0215/p/18617173
