Repost: linux: the meaning of iowait

Date: 2024-04-25 14:01:13  Views: 30
Tags: there, iowait, linux, will, program, time, repost, CPU

Source: https://blog.pregos.info/wp-content/uploads/2010/09/iowait.txt

Source: https://www.kawabangga.com/posts/5903

 

Original text:

What exactly is "iowait"?

To summarize it in one sentence, 'iowait' is the percentage
of time the CPU is idle AND there is at least one I/O
in progress.

Each CPU can be in one of four states: user, sys, idle, iowait.
Performance tools such as vmstat, iostat, sar, etc. print
out these four states as a percentage.  The sar tool can
print out the states on a per CPU basis (-P flag) but most
other tools print out the average values across all the CPUs.
Since these are percentage values, the four state values
should add up to 100%.

The tools print out the statistics using counters that the
kernel updates periodically. On AIX, these CPU state counters
are incremented at every clock interrupt (these occur
at 10 millisecond intervals).
When the clock interrupt occurs on a CPU, the kernel
checks the CPU to see if it is idle or not. If it's not
idle, the kernel then determines if the instruction being
executed at that point is in user space or in kernel space.
If user, then it increments the 'user' counter by one. If
the instruction is in kernel space, then the 'sys' counter
is incremented by one.

If the CPU is idle, the kernel then determines if there is
at least one I/O currently in progress to either a local disk
or a remotely mounted disk (NFS) which had been initiated
from that CPU. If there is, then the 'iowait' counter is
incremented by one. If there is no I/O in progress that was
initiated from that CPU, the 'idle' counter is incremented
by one.
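
The per-tick decision described above can be sketched in a few lines of Python. This is only an illustration of the logic as the text describes it, not kernel code; the function name classify_tick and its three boolean arguments are hypothetical.

```python
def classify_tick(running, in_kernel, io_pending):
    """Return the counter to increment for one clock tick on one CPU.

    running    -- True if the CPU is executing a thread (not idle)
    in_kernel  -- True if the interrupted instruction is in kernel space
    io_pending -- True if an I/O initiated from this CPU is in progress
    """
    if running:
        return "sys" if in_kernel else "user"
    # The CPU is idle: a pending I/O turns idle time into iowait time.
    return "iowait" if io_pending else "idle"


counters = {"user": 0, "sys": 0, "idle": 0, "iowait": 0}

# Hypothetical samples: (running, in_kernel, io_pending) at four ticks.
ticks = [
    (True, False, False),   # running user code
    (True, True, True),     # running kernel code; pending I/O is ignored
    (False, False, True),   # idle with an I/O in progress
    (False, False, False),  # idle with no I/O
]
for tick in ticks:
    counters[classify_tick(*tick)] += 1

print(counters)
```

Note that a pending I/O only matters when the CPU is idle; a busy CPU is always counted as user or sys.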

When a performance tool such as vmstat is invoked, it reads
the current values of these four counters. Then it sleeps
for the number of seconds the user specified as the interval
time and then reads the counters again. Then vmstat will
subtract the previous values from the current values to
get the delta value for this sampling period. Since vmstat
knows that the counters are incremented at each clock
tick (every 10ms), it then divides the delta value of
each counter by the number of clock ticks in the sampling
period. For example, if you run 'vmstat 2', this makes
vmstat sample the counters every 2 seconds. Since the
clock ticks at 10ms intervals, then there are 100 ticks
per second or 200 ticks per vmstat interval (if the interval
value is 2 seconds).   The delta values of each counter
are divided by the total ticks in the interval and
multiplied by 100 to get the percentage value in that
interval.
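
The delta calculation can be sketched as follows. This is a simplified model for a single CPU, assuming the 10ms tick stated above; the function cpu_percentages and the counter values are made up for illustration and do not reflect how any particular tool stores its data.

```python
def cpu_percentages(prev, curr, interval_s, tick_ms=10):
    """Convert two counter snapshots into percentages for one interval."""
    ticks_in_interval = interval_s * 1000 // tick_ms
    return {
        state: 100.0 * (curr[state] - prev[state]) / ticks_in_interval
        for state in curr
    }


# Hypothetical counter values read 2 seconds apart (as with 'vmstat 2').
prev = {"user": 1000, "sys": 200, "idle": 5000, "iowait": 300}
curr = {"user": 1050, "sys": 210, "idle": 5100, "iowait": 340}

print(cpu_percentages(prev, curr, interval_s=2))
# {'user': 25.0, 'sys': 5.0, 'idle': 50.0, 'iowait': 20.0}
```

The deltas (50, 10, 100, 40) add up to the 200 ticks in the interval, so the four percentages add up to 100.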

iowait can in some cases be an indicator of a limiting factor
to transaction throughput whereas in other cases, iowait may
be completely meaningless.
Some examples here will help to explain this. The first
example is one where high iowait is a direct cause
of a performance issue.

Example 1:
Let's say that a program needs to perform transactions on behalf of
a batch job. For each transaction, the program will perform some
computations which takes 10 milliseconds and then does a synchronous
write of the results to disk. Since the file it is writing to was
opened synchronously, the write does not return until the I/O has
made it all the way to the disk. Let's say the disk subsystem does
not have a cache and that each physical write I/O takes 20ms.
This means that the program completes a transaction every 30ms.
Over a period of 1 second (1000ms), the program can do 33
transactions (33 tps).  If this program is the only one running
on a 1-CPU system, then the CPU would be busy 1/3 of the
time and waiting on I/O the rest of the time - so about 67% iowait
and 33% CPU busy.

If the I/O subsystem is improved (let's say a disk cache is
added) such that a write I/O takes only 1ms, then it takes
11ms to complete a transaction, and the program can
now do around 90-91 transactions a second. Here the iowait time
would be around 9% (1ms out of every 11ms). Notice that a lower
iowait time directly affects the throughput of the program.
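
The arithmetic in Example 1 can be checked with a short sketch. It assumes, as the example does, a single program on a 1-CPU system that strictly alternates between computing and waiting on a synchronous write; the function name transaction_profile is hypothetical.

```python
def transaction_profile(compute_ms, io_ms):
    """Return (transactions per second, iowait percentage)."""
    cycle_ms = compute_ms + io_ms          # one transaction = compute + write
    tps = 1000.0 / cycle_ms
    iowait_pct = 100.0 * io_ms / cycle_ms  # CPU is idle during the write
    return tps, iowait_pct


for io_ms in (20, 1):
    tps, iowait = transaction_profile(compute_ms=10, io_ms=io_ms)
    print(f"write={io_ms}ms: {tps:.1f} tps, {iowait:.1f}% iowait")
# write=20ms: 33.3 tps, 66.7% iowait
# write=1ms: 90.9 tps, 9.1% iowait
```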

Example 2:

Let's say that there is one program running on the system - let's assume
that this is the 'dd' program, and it is reading from the disk 4KB at
a time. Let's say that the subroutine in 'dd' is called main() and it
invokes read() to do a read. Both main() and read() are user space
subroutines. read() is a libc.a subroutine which will then invoke
the kread() system call at which point it enters kernel space.
kread() will then initiate a physical I/O to the device and the 'dd'
program is then put to sleep until the physical I/O completes.
The time to execute the code in main, read, and kread is very small -
probably around 50 microseconds at most. The time it takes for
the disk to complete the I/O request will probably be around 2-20
milliseconds depending on how far the disk arm had to seek. This
means that when the clock interrupt occurs, the chances are that
the 'dd' program is asleep and that the I/O is in progress. Therefore,
the 'iowait' counter is incremented. If the I/O completes in
2 milliseconds, then the 'dd' program runs again to do another read.
But since 50 microseconds is so small compared to 2ms (2000 microseconds),
the chances are that when the clock interrupt occurs, the CPU will
again be idle with an I/O in progress.  So again, 'iowait' is
incremented.  If 'sar -P <cpunumber>' is run to show the CPU
utilization for this CPU, it will most likely show 97-98% iowait.
If each I/O takes 20ms, then the iowait would be 99-100%.
Even though the I/O wait is extremely high in either case,
the throughput is 10 times better in one case.
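
A rough calculation shows why the two cases look almost identical in iowait terms. The sketch below assumes the figures from the example (50 microseconds of CPU per read, 2ms or 20ms per physical I/O) and treats the tick-sampled iowait as the fraction of each cycle spent waiting; read_loop_profile is a hypothetical helper.

```python
def read_loop_profile(cpu_us, io_us):
    """Return (reads per second, approximate iowait percentage)."""
    cycle_us = cpu_us + io_us
    reads_per_sec = 1_000_000 / cycle_us
    iowait_pct = 100.0 * io_us / cycle_us
    return reads_per_sec, iowait_pct


fast_rate, fast_iowait = read_loop_profile(cpu_us=50, io_us=2_000)
slow_rate, slow_iowait = read_loop_profile(cpu_us=50, io_us=20_000)

print(f"2ms I/O:  {fast_iowait:.1f}% iowait")
print(f"20ms I/O: {slow_iowait:.1f}% iowait")
print(f"throughput ratio: {fast_rate / slow_rate:.1f}x")
# 2ms I/O:  97.6% iowait
# 20ms I/O: 99.8% iowait
# throughput ratio: 9.8x
```

The iowait figures differ by about two percentage points while the read rate differs by nearly a factor of ten.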



Example 3:

Let's say that there are two programs running on a CPU. One is a 'dd'
program reading from the disk. The other is a program that does no
I/O but is spending 100% of its time doing computational work.
Now assume that there is a problem with the I/O subsystem and that
physical I/Os are taking over a second to complete. Whenever the
'dd' program is asleep while waiting for its I/Os to complete,
the other program is able to run on that CPU. When the clock
interrupt occurs, there will always be a program running in
either user mode or system mode. Therefore, the %idle and %iowait
values will be 0. Even though iowait is 0 now, that does not
mean there is NOT an I/O problem because there obviously is one
if physical I/Os are taking over a second to complete.
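
A toy model of the tick sampling makes the effect concrete. It assumes that 'dd' is asleep on its I/O at every tick (its CPU time is negligible next to a one-second I/O) and that the compute-bound program, when present, is always runnable; the function sample_states is hypothetical.

```python
def sample_states(ticks, dd_waiting_on_io, compute_job_runnable):
    """Return percentages of ticks counted as busy, iowait, and idle."""
    counts = {"busy": 0, "iowait": 0, "idle": 0}
    for _ in range(ticks):
        if compute_job_runnable:
            counts["busy"] += 1     # CPU is never idle, so never iowait
        elif dd_waiting_on_io:
            counts["iowait"] += 1   # idle with an I/O in progress
        else:
            counts["idle"] += 1
    return {state: 100.0 * n / ticks for state, n in counts.items()}


# Same slow I/O in both runs; only the compute-bound neighbor differs.
print(sample_states(100, dd_waiting_on_io=True, compute_job_runnable=True))
# {'busy': 100.0, 'iowait': 0.0, 'idle': 0.0}
print(sample_states(100, dd_waiting_on_io=True, compute_job_runnable=False))
# {'busy': 0.0, 'iowait': 100.0, 'idle': 0.0}
```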



Example 4:

Let's say that there is a 4-CPU system where there are 6 programs
running. Let's assume that four of the programs spend 70% of their
time waiting on physical read I/Os and 30% actually using CPU time.
Since these four programs do have to enter kernel space to execute the
kread system calls, they will spend a percentage of their time in
the kernel; let's assume that 25% of the time is in user mode,
and 5% of the time in kernel mode.
Let's also assume that the other two programs spend 100% of their
time in user code doing computations and no I/O so that two CPUs
will always be 100% busy. Since the other four programs are busy
only 30% of the time, they can share the two CPUs that are not busy.

If we run 'sar -P ALL 1 10' to run 'sar' at 1-second intervals
for 10 intervals, then we'd expect to see this for each interval:

         cpu    %usr    %sys    %wio   %idle
          0       50      10      40       0
          1       50      10      40       0
          2      100       0       0       0
          3      100       0       0       0
          -       75       5      20       0

Notice that the average CPU utilization will be 75% user, 5% sys,
and 20% iowait. The values one sees with 'vmstat' or 'iostat' or
most tools are the average across all CPUs.

Now let's say we take this exact same workload (same 6 programs
with same behavior) to another machine that has 6 CPUs (same
CPU speeds and same I/O subsystem).  Now each program can be
running on its own CPU. Therefore, the CPU usage breakdown
would be as follows:

         cpu    %usr    %sys    %wio   %idle
          0       25       5      70       0
          1       25       5      70       0
          2       25       5      70       0
          3       25       5      70       0
          4      100       0       0       0
          5      100       0       0       0
          -       50       3      47       0

So now the average CPU utilization will be 50% user, 3% sys,
and 47% iowait.  Notice that the same workload on another
machine has more than double the iowait value.
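
The averages in both tables can be reproduced with a short sketch. The per-CPU rows are copied from the tables above in the order (%usr, %sys, %wio, %idle); the helper average_states is hypothetical and simply rounds the column means to whole percentages.

```python
def average_states(per_cpu):
    """Average each column of (%usr, %sys, %wio, %idle) across CPUs."""
    n = len(per_cpu)
    return tuple(round(sum(column) / n) for column in zip(*per_cpu))


four_cpu = [(50, 10, 40, 0), (50, 10, 40, 0), (100, 0, 0, 0), (100, 0, 0, 0)]
six_cpu = [(25, 5, 70, 0)] * 4 + [(100, 0, 0, 0)] * 2

print(average_states(four_cpu))  # (75, 5, 20, 0)
print(average_states(six_cpu))   # (50, 3, 47, 0)
```

The workload is identical in both cases; only the number of CPUs over which the idle-with-I/O time is spread has changed.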



Conclusion:

The iowait statistic may or may not be a useful indicator of
I/O performance - but it does tell us that the system can
handle more computational work. Just because a CPU is in
iowait state does not mean that it can't run other threads
on that CPU; that is, iowait is simply a form of idle time.
 

 

From: https://www.cnblogs.com/jinziguang/p/18157593
