ContextSwitch: learning and usage

Date: 2023-04-15 09:34:19

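Overview

GitHub hosts a small tool, contextswitch, that benchmarks the overhead of system calls and context switches.
After downloading it, running make is enough to run a simple test.

Note that some ARM environments do not accept the
-mno-avx
flag (it is an x86-only compiler option), so it has to be removed there.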
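
Removing the flag can be sketched as follows. This is a minimal sketch that assumes the flag appears literally in the repository's Makefile (the post does not say which file sets it); strip_avx_flag is a hypothetical helper name, not part of the tool.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of the given Makefile with the standalone
# -mno-avx flag removed. ASSUMPTION: the flag is set in that file.
# Other flags such as -mno-avx2 are left untouched.
strip_avx_flag() {
  sed -E 's/(^|[[:space:]])-mno-avx([[:space:]]|$)/\2/g' "$1"
}

# Usage sketch, run inside the contextswitch checkout on an ARM host:
#   strip_avx_flag Makefile > Makefile.arm && make -f Makefile.arm
```

Writing to a separate file keeps the original Makefile intact, so the same checkout can still be built unchanged on x86.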

Official documentation and description

Little micro-benchmarks to assess the performance overhead of context
switching.

timesyscall: Benchmarks the overhead of a system call.
timectxsw:   Benchmarks the overhead of context switching between 2 processes.
timetctxsw:  Benchmarks the overhead of context switching between 2 threads.
timectxswws: Benchmarks the overhead of context switching between 2 processes
             using a working set of the size specified in argument.
timetctxsw2: Benchmarks the overhead of context switching between 2 threads,
             by using a sched_yield() method.
             If you do taskset -a 1, all threads should be scheduled on the
             same processor, so you are really doing thread context switch.
             Then to be sure that you are really doing it, just do:
               strace -ff -tt -v taskset -a 1 ./timetctxsw2
             Now why sched_yield() is enough for testing ? Because, it place
             the current thread at the end of the ready queue. So the next
             ready thread will be scheduled.
             I also added sched_setscheduler(SCHED_FIFO) to get the best
             performances.
From: https://github.com/tsuna/contextswitch       

Script description

runbench() {
  $* ./timesyscall
  $* ./timectxsw
  $* ./timetctxsw
  $* ./timetctxsw2
}
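Each test group runs the following measurements, in order:

1. The time of a system call.
2. The context switch time between 2 processes.
3. The switch time between two threads in the same process.
4. The thread switch time using the sched_yield() method (I do not understand this one well).

There are three groups in total:
The first group sets no CPU affinity.
The second group binds to CPUs, but across two cores.
The third group binds everything to the same CPU core.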
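
The way runbench applies a prefix to each benchmark can be illustrated with a dry-run sketch that only prints the command lines instead of running them. The name runbench_dry is hypothetical, and the taskset arguments for the second group are an assumption chosen for illustration, since the post does not show them; the third group reuses the `taskset -a 1` form quoted in the README.

```shell
#!/bin/sh
# Dry-run variant of runbench (hypothetical name): prints each command line
# instead of executing it, so the effect of the "$*" prefix is visible
# without needing the compiled binaries.
runbench_dry() {
  for bench in timesyscall timectxsw timetctxsw timetctxsw2; do
    echo $* ./$bench
  done
}

echo '-- No CPU affinity --'
runbench_dry                  # no prefix: the scheduler places tasks freely
echo '-- With CPU affinity --'
runbench_dry taskset -c 0,1   # ASSUMPTION: illustrative CPU list, not from the post
echo '-- With CPU affinity to CPU 0 --'
runbench_dry taskset -a 1     # mask 1 selects CPU 0, as in the README example
```

Each group therefore runs the same four binaries; only the affinity prefix in front of them changes.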

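Notes on the test results

Across all of my test environments:
1. The AMD 9T34 is indisputably in first place.
2. The same hardware differs considerably between operating systems, so comparisons must be made on the same operating system.
3. Among the domestic CPUs, the ranking exactly matches the SPECJVM and SPECCPU results: Phytium < Hygon < Kunpeng < Alibaba Yitian.
   Alibaba Yitian is the indisputable leader.
4. CPUs from ten years ago really are slower than current ones. Upgrading gives better performance and higher speed.
5. CPU pinning is very useful and is worth tuning.
6. Coroutines and lightweight threads are the future. Only with them will performance be good.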

[Figure: result chart 1]

[Figure: result chart 2]


E5-2620 2.0GHz

2 physical CPUs, 6 cores/CPU, 2 hardware threads/core = 24 hw threads total
-- No CPU affinity --
10000000 system calls in 11841646290ns (1184.2ns/syscall)
2000000 process context switches in 6039748545ns (3019.9ns/ctxsw)
2000000  thread context switches in 6745297188ns (3372.6ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 755823488ns (377.9ns/ctxsw)
-- With CPU affinity --
10000000 system calls in 14343751134ns (1434.4ns/syscall)
2000000 process context switches in 16353343542ns (8176.7ns/ctxsw)
2000000  thread context switches in 13617487377ns (6808.7ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 2363107269ns (1181.6ns/ctxsw)
-- With CPU affinity to CPU 0 --
10000000 system calls in 11929472188ns (1192.9ns/syscall)
2000000 process context switches in 6915983386ns (3458.0ns/ctxsw)
2000000  thread context switches in 6837489882ns (3418.7ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 795652256ns (397.8ns/ctxsw)
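
Each result line reports an operation count, the total elapsed time, and in parentheses the total divided by the count (for example 6039748545 ns / 2000000 = 3019.9 ns per switch). To compare machines from saved logs, the per-operation figure can be recomputed with a small filter. This is a sketch; ns_per_op is a hypothetical helper name and assumes the line layout shown in the logs here.

```shell
#!/bin/sh
# Hypothetical helper: read benchmark output on stdin and print, for each
# result line, the operation count, the total time in ns, and the
# recomputed time per operation. Other lines (headers, errors) are skipped.
ns_per_op() {
  awk '/ in [0-9]+ns \(/ {
    total = $(NF - 1)          # e.g. "6039748545ns"
    sub(/ns$/, "", total)      # strip the unit suffix
    printf "%s %s %.1f\n", $1, total, total / $1
  }'
}

# Usage sketch: ./cpubench.sh | ns_per_op   (script name as in the repository)
```

Lines such as "sched_setscheduler(): Operation not permitted" pass through the filter silently, so the presence of that error should be checked separately in the raw log.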

Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz, Yunhai OS (云海OS) virtual machine

1 physical CPUs, 8 cores/CPU, 1 hardware threads/core = 8 hw threads total
-- No CPU affinity --
10000000 system calls in 2841917410ns (284.2ns/syscall)
2000000 process context switches in 7404178178ns (3702.1ns/ctxsw)
2000000  thread context switches in 7502081647ns (3751.0ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 222130514ns (111.1ns/ctxsw)
-- With CPU affinity --
10000000 system calls in 2835862084ns (283.6ns/syscall)
2000000 process context switches in 4990890087ns (2495.4ns/ctxsw)
2000000  thread context switches in 4311646652ns (2155.8ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 870608240ns (435.3ns/ctxsw)
-- With CPU affinity to CPU 0 --
10000000 system calls in 2844931708ns (284.5ns/syscall)
2000000 process context switches in 7601947691ns (3801.0ns/ctxsw)
2000000  thread context switches in 7914561498ns (3957.3ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 247057805ns (123.5ns/ctxsw)

Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz, Yunhai OS (云海OS) physical machine

2 physical CPUs, 12 cores/CPU, 2 hardware threads/core = 48 hw threads total
-- No CPU affinity --
10000000 system calls in 5769760409ns (577.0ns/syscall)
2000000 process context switches in 7245677219ns (3622.8ns/ctxsw)
2000000  thread context switches in 7069213271ns (3534.6ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 475086926ns (237.5ns/ctxsw)
-- With CPU affinity --
10000000 system calls in 5762431985ns (576.2ns/syscall)
2000000 process context switches in 8692364627ns (4346.2ns/ctxsw)
2000000  thread context switches in 6572286258ns (3286.1ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 1304249661ns (652.1ns/ctxsw)
-- With CPU affinity to CPU 0 --
10000000 system calls in 5774310295ns (577.4ns/syscall)
2000000 process context switches in 6869635514ns (3434.8ns/ctxsw)
2000000  thread context switches in 6927117249ns (3463.6ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 473255745ns (236.6ns/ctxsw)

Phytium S2500, physical machine, NFSV3

2 physical CPUs, 128 cores/CPU, 1 hardware threads/core = 256 hw threads total
-- No CPU affinity --
10000000 system calls in 3838470070ns (383.8ns/syscall)
2000000 process context switches in 10913991269ns (5457.0ns/ctxsw)
2000000  thread context switches in 10987973614ns (5494.0ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 354962539ns (177.5ns/ctxsw)
-- With CPU affinity --
10000000 system calls in 3851009222ns (385.1ns/syscall)
2000000 process context switches in 10500204985ns (5250.1ns/ctxsw)
2000000  thread context switches in 8605107251ns (4302.6ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 1694906366ns (847.5ns/ctxsw)
-- With CPU affinity to CPU 0 --
10000000 system calls in 3871134715ns (387.1ns/syscall)
2000000 process context switches in 8211223439ns (4105.6ns/ctxsw)
2000000  thread context switches in 8915611368ns (4457.8ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 362941497ns (181.5ns/ctxsw)

Phytium S2500, physical machine, Kylin V10 (银河麒麟V10)

model name : HUAWEI,Kunpeng 920
2 physical CPUs, 128 cores/CPU, 1 hardware threads/core = 256 hw threads total
-- No CPU affinity --
10000000 system calls in 1104251960ns (110.4ns/syscall)
2000000 process context switches in 5502095280ns (2751.0ns/ctxsw)
2000000  thread context switches in 5057680610ns (2528.8ns/ctxsw)
2000000  thread context switches in 159336010ns (79.7ns/ctxsw)
-- With CPU affinity --
10000000 system calls in 1104213220ns (110.4ns/syscall)
2000000 process context switches in 3157105260ns (1578.6ns/ctxsw)
2000000  thread context switches in 2749304460ns (1374.7ns/ctxsw)
2000000  thread context switches in 520588690ns (260.3ns/ctxsw)
-- With CPU affinity to CPU 0 --
10000000 system calls in 1104361790ns (110.4ns/syscall)
2000000 process context switches in 2554260900ns (1277.1ns/ctxsw)
2000000  thread context switches in 2501093900ns (1250.5ns/ctxsw)
2000000  thread context switches in 159835540ns (79.9ns/ctxsw)

Phytium S2500, KVM virtual machine

10000000 system calls in 2016128780ns (201.6ns/syscall)
2000000 process context switches in 20813179318ns (10406.6ns/ctxsw)
2000000  thread context switches in 21270077053ns (10635.0ns/ctxsw)
2000000  thread context switches in 283497350ns (141.7ns/ctxsw)
-- With CPU affinity --
10000000 system calls in 2003773606ns (200.4ns/syscall)
2000000 process context switches in 7149973534ns (3575.0ns/ctxsw)
2000000  thread context switches in 6041671015ns (3020.8ns/ctxsw)
2000000  thread context switches in 1184706267ns (592.4ns/ctxsw)
-- With CPU affinity to CPU 0 --
10000000 system calls in 1996452026ns (199.6ns/syscall)
2000000 process context switches in 20093433102ns (10046.7ns/ctxsw)
2000000  thread context switches in 20838253803ns (10419.1ns/ctxsw)
2000000  thread context switches in 284723964ns (142.4ns/ctxsw)

Hygon machine

model name : Hygon C86 7285 32-core Processor
pgrep: cannot allocate 4611686018427387903 bytes
2 physical CPUs, 32 cores/CPU, 2 hardware threads/core = 128 hw threads total
-- No CPU affinity --
10000000 system calls in 1188373575ns (118.8ns/syscall)
2000000 process context switches in 7182741168ns (3591.4ns/ctxsw)
2000000  thread context switches in 5057264353ns (2528.6ns/ctxsw)
2000000  thread context switches in 218741918ns (109.4ns/ctxsw)
-- With CPU affinity --
10000000 system calls in 1199538092ns (120.0ns/syscall)
2000000 process context switches in 4926579090ns (2463.3ns/ctxsw)
2000000  thread context switches in 4116607893ns (2058.3ns/ctxsw)
2000000  thread context switches in 877003690ns (438.5ns/ctxsw)
-- With CPU affinity to CPU 0 --
10000000 system calls in 1207213049ns (120.7ns/syscall)
2000000 process context switches in 4803238321ns (2401.6ns/ctxsw)
2000000  thread context switches in 5033478360ns (2516.7ns/ctxsw)
2000000  thread context switches in 218102516ns (109.1ns/ctxsw)

Kunpeng machine

2 physical CPUs, 128 cores/CPU, 1 hardware threads/core = 256 hw threads total
-- No CPU affinity --
10000000 system calls in 1628256836ns (162.8ns/syscall)
2000000 process context switches in 3567828849ns (1783.9ns/ctxsw)
2000000  thread context switches in 3366796751ns (1683.4ns/ctxsw)
2000000  thread context switches in 208056729ns (104.0ns/ctxsw)
-- With CPU affinity --
10000000 system calls in 3957162873ns (395.7ns/syscall)
2000000 process context switches in 66176473553ns (33088.2ns/ctxsw)
2000000  thread context switches in 64858764678ns (32429.4ns/ctxsw)
2000000  thread context switches in 9224336984ns (4612.2ns/ctxsw)
-- With CPU affinity to CPU 0 --
10000000 system calls in 1658580824ns (165.9ns/syscall)
2000000 process context switches in 4162672768ns (2081.3ns/ctxsw)
2000000  thread context switches in 3930988507ns (1965.5ns/ctxsw)
2000000  thread context switches in 206905930ns (103.5ns/ctxsw)

Intel 8369HB 3.3GHz

10000000 system calls in 2039800553ns (204.0ns/syscall)
2000000 process context switches in 3484116193ns (1742.1ns/ctxsw)
2000000  thread context switches in 3504345370ns (1752.2ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 163336302ns (81.7ns/ctxsw)
-- With CPU affinity --
10000000 system calls in 2042749498ns (204.3ns/syscall)
2000000 process context switches in 3512477901ns (1756.2ns/ctxsw)
2000000  thread context switches in 3037479215ns (1518.7ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 589604636ns (294.8ns/ctxsw)
-- With CPU affinity to CPU 0 --
10000000 system calls in 2037861063ns (203.8ns/syscall)
2000000 process context switches in 3543912186ns (1772.0ns/ctxsw)
2000000  thread context switches in 3575216872ns (1787.6ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 164079529ns (82.0ns/ctxsw)

Alibaba Yitian 710

1 physical CPUs, 8 cores/CPU, 1 hardware threads/core = 8 hw threads total
-- No CPU affinity --
10000000 system calls in 672626352ns (67.3ns/syscall)
2000000 process context switches in 3586487130ns (1793.2ns/ctxsw)
2000000  thread context switches in 3228362627ns (1614.2ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 102817391ns (51.4ns/ctxsw)
-- With CPU affinity --
10000000 system calls in 672290182ns (67.2ns/syscall)
2000000 process context switches in 1990312435ns (995.2ns/ctxsw)
2000000  thread context switches in 1682598464ns (841.3ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 328222163ns (164.1ns/ctxsw)
-- With CPU affinity to CPU 0 --
10000000 system calls in 672409838ns (67.2ns/syscall)
2000000 process context switches in 3347526340ns (1673.8ns/ctxsw)
2000000  thread context switches in 3100110717ns (1550.1ns/ctxsw)
sched_setscheduler(): Operation not permitted
2000000  thread context switches in 102631615ns (51.3ns/ctxsw)

AMD 9T34

model name : AMD EPYC 9T34 64-Core Processor
1 physical CPUs, 8 cores/CPU, 2 hardware threads/core = 16 hw threads total
-- No CPU affinity --
10000000 system calls in 553414290ns (55.3ns/syscall)
2000000 process context switches in 1963917388ns (982.0ns/ctxsw)
2000000  thread context switches in 2131473467ns (1065.7ns/ctxsw)
2000000  thread context switches in 115396178ns (57.7ns/ctxsw)
-- With CPU affinity --
10000000 system calls in 554322086ns (55.4ns/syscall)
2000000 process context switches in 2730693871ns (1365.3ns/ctxsw)
2000000  thread context switches in 2559121196ns (1279.6ns/ctxsw)
2000000  thread context switches in 550724648ns (275.4ns/ctxsw)
-- With CPU affinity to CPU 0 --
10000000 system calls in 553295602ns (55.3ns/syscall)
2000000 process context switches in 2011838005ns (1005.9ns/ctxsw)
2000000  thread context switches in 2027328701ns (1013.7ns/ctxsw)
2000000  thread context switches in 114914625ns (57.5ns/ctxsw)

From: https://www.cnblogs.com/jinanxiaolaohu/p/17318069.html
