Linux server maintenance: load average, wa, and kswapd0 I/O are too high, and how to troubleshoot the cause with the iostat and iotop commands

1. The server load average is very high, wa is close to 100%, and kswapd0 I/O is close to 100%.

    I installed jenkins on the server yesterday, but every time I started it, jenkins died on its own, which was very puzzling. I later found that the server's load average was already very high, basically around 1 (single-core CPU), and that each time jenkins started, the load average climbed all the way to about 4. In the CPU statistics, wa, the percentage of CPU time spent waiting for input/output, was basically 98%. As shown below:

[Screenshot: server load average is high, wa is close to 100%]

    The figure shows that CPU usage is not high, so the CPU is not the bottleneck. The large wa value means that a lot of data is waiting to be transferred between memory and the hard disk, so I/O is presumably very heavy. Checking the I/O situation with iotop gives the following:

[Screenshot: iotop output, kswapd0 I/O is close to 100%]

    The I/O view shows that kswapd0 accounts for almost 100%. What is kswapd0?

Linux uses kswapd for virtual memory management such that pages that have been recently accessed are kept in memory and less active pages are paged out to disk. (What is a page?) … Linux manages memory in units called pages. So, the kswapd process regularly decreases the ages of unreferenced pages … and at the end they are paged out (moved out) to disk.

    kswapd0 is the process responsible for page replacement in virtual memory management. The operating system wakes kswapd up at regular intervals to check whether memory is tight; if it is not, kswapd goes back to sleep. kswapd has 2 thresholds, pages_high and pages_low. When the number of free memory pages falls below pages_low, kswapd scans memory and frees 32 pages at a time until the number of free pages reaches pages_high.
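    The thresholds described above can be inspected on a running system. The following is a minimal sketch, assuming that pages_low and pages_high correspond to the low and high lines that /proc/zoneinfo reports for each zone; zone_watermarks is a hypothetical helper name, not a standard command.

```shell
# Print free pages and the min/low/high watermarks of each memory zone.
# Reads text in /proc/zoneinfo format from stdin.
# Usage: zone_watermarks < /proc/zoneinfo
zone_watermarks() {
    awk '
        # Zone header, e.g. "Node 0, zone   Normal"
        $1 == "Node" && $3 == "zone" { node = $2; sub(",", "", node); zone = $4 }
        # Free page count, e.g. "  pages free     3000"
        $1 == "pages" && $2 == "free" { free = $3 }
        $1 == "min" { min = $2 }
        $1 == "low" { low = $2 }
        # "high" is the last watermark line; per-CPU lines use "high:" and are skipped
        $1 == "high" && NF == 2 {
            printf "node %s zone %s free=%s min=%s low=%s high=%s\n", node, zone, free, min, low, $2
        }
    '
}
```

    If free stays below low for a zone, kswapd can be expected to keep running, which matches the behaviour seen in this case.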

    kswapd0 being this busy points to a memory problem, yet top showed several hundred MB of memory still free, so this should not have been happening. Then I remembered that two days earlier I had been testing the server's memory threshold. The threshold was probably set too high: kswapd0 treats the memory inside the threshold as unusable, so it keeps waking up to scan and free memory, which consumes a large amount of I/O. I had previously changed the server's memory threshold to about 500 MB. Checking with the command:

    cat /proc/sys/vm/min_free_kbytes   shows 524288, so I immediately ran:

[root@kermit ~]# echo 51200 > /proc/sys/vm/min_free_kbytes    
[root@kermit ~]# cat /proc/sys/vm/min_free_kbytes         
51200

    Then I watched load average and CPU wa, and sure enough both started to fall. This shows that kswapd0 kicks in when available memory is insufficient, and that available memory here means free memory minus the value configured in /proc/sys/vm/min_free_kbytes. It also shows that /proc/sys/vm/min_free_kbytes must not be set too high. Setting it to a few hundred MB is enough (depending on the size of physical memory).
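    To make "too high" concrete, the reserve can be expressed as a percentage of total memory. This is only a sketch: min_free_pct and check_min_free are hypothetical helpers, and the 10% default limit is an arbitrary assumption for illustration, not a kernel recommendation.

```shell
# Percentage of total memory reserved by vm.min_free_kbytes (integer, rounded down).
# $1 = min_free_kbytes value, $2 = MemTotal in kB
min_free_pct() {
    echo $(( $1 * 100 / $2 ))
}

# Warn when the reserve exceeds a chosen percentage.
# $3 = limit in percent (assumed default: 10, chosen only for illustration)
check_min_free() {
    pct=$(min_free_pct "$1" "$2")
    limit=${3:-10}
    if [ "$pct" -gt "$limit" ]; then
        echo "WARN: min_free_kbytes reserves ${pct}% of RAM (limit ${limit}%)"
    else
        echo "OK: min_free_kbytes reserves ${pct}% of RAM"
    fi
}

# Usage on a live system:
#   check_min_free "$(cat /proc/sys/vm/min_free_kbytes)" \
#                  "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
```

    With the values from this case, 524288 on a 1 GB machine is 50% of memory, while 51200 is about 4%.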

Published: Monday, July 18, 2016

2. Use the iostat and iotop commands to find the cause when wa is too high on Linux.

    On Linux, a high wa value means that the share of CPU time spent waiting for I/O is very high; wa is the wait time. An earlier article covered this, but it did not mention iostat. First, create a scenario with very high wa: set the memory threshold high, for example 50% of memory (this of course depends on your server's situation). Command: echo 524200 > /proc/sys/vm/min_free_kbytes. After running it, my server's I/O went straight up to 91 (memory: 1 GB). Screenshot below:

    When faced with a server running this hot, the first thing to check is whether the machine is using a lot of swap space. A hard disk is far slower than RAM, so once system memory is exhausted and the system starts using swap, performance suffers badly. If plenty of memory is still available, you need to find out which process is responsible for most of the I/O operations.
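    The swap check can be sketched as follows, using the SwapTotal and SwapFree fields of /proc/meminfo. swap_used_kb is a hypothetical helper name introduced here for illustration.

```shell
# Print swap in use (kB), computed as SwapTotal - SwapFree.
# Reads text in /proc/meminfo format from stdin.
# Usage: swap_used_kb < /proc/meminfo
swap_used_kb() {
    awk '
        $1 == "SwapTotal:" { total = $2 }
        $1 == "SwapFree:"  { free = $2 }
        END { print total - free }
    '
}
```

    A large or steadily growing value suggests the machine is swapping; the si and so columns of vmstat 1 show the same thing as a rate.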

    First: use iostat to find the partition with heavy I/O operations

    Installing iostat: yum install iostat does not work, because the package is not called iostat but sysstat. Its description: sysstat.x86_64 : The sar and iostat system monitoring commands. After installing it with yum install sysstat, use iostat as follows:

[root@kermit ~]# iostat
Linux 2.6.32-431.23.3.el6.x86_64 (kermit)       10/28/2016      _x86_64_        (1 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.54    0.00    0.18    0.34    0.00   98.94

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
xvda              2.70        18.20        37.59 1064816412 2199369704

    Because the server has only one partition, only that partition is shown. If there are several partitions and this command shows that one of them has especially high I/O, you can then use df to see what is mounted on that partition and check where its files are being operated on, for example where logs are constantly being written. A numeric argument after the iostat command sets the refresh interval in seconds. The important fields in the output above mean:

avg-cpu:
%user: percentage of CPU time spent running at the user level
%nice: percentage of CPU time spent on nice operations
%system: percentage of CPU time spent running at the kernel level
%iowait: percentage of CPU time spent waiting for hardware I/O
%idle: percentage of time the CPU was idle
Device:
sda (xvda in the output above): device name
tps: number of I/O requests sent to the device per second
Blk_read/s: amount of data read per second
Blk_wrtn/s: amount of data written per second
Blk_read: total amount of data read
Blk_wrtn: total amount of data written
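    When there are several devices, picking out the busiest one can be scripted. The following is a rough sketch that assumes the column layout of the iostat report shown above (device name, tps, read rate, write rate); busiest_device is a hypothetical helper name.

```shell
# Print the device with the highest combined read+write rate.
# Reads a default `iostat` report from stdin and assumes that
# column 3 is the read rate and column 4 is the write rate.
# Usage: iostat | busiest_device
busiest_device() {
    awk '
        NF == 0   { in_dev = 0; next }   # a blank line ends the device table
        /^Device/ { in_dev = 1; next }   # device rows start after this header
        in_dev && NF >= 4 {
            rate = $3 + $4
            if (dev == "" || rate > max) { max = rate; dev = $1 }
        }
        END { if (dev != "") printf "%s %.2f\n", dev, max }
    '
}
```

    The device name it prints can then be matched against the df output to find the mount point.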

    Second: use iotop to see which process has the highest I/O.

    iotop can likewise be found and installed through yum: yum search iotop. I had installed iotop before, but this time it failed on startup with: No module named iotop.ui To run an uninstalled copy of iotop, launch iotop.py in the top directory. This error is caused by a Python version problem. My CentOS originally came with Python 2.6, and I later upgraded to Python 2.7. iotop had been installed under Python 2.6 (libraries installed by yum go under python2.6), the python symlink had since been updated to the new version, and the first line of the command file /usr/sbin/iotop is #!/usr/bin/python, which directly causes this error. It is easy to fix: change the first line of /usr/sbin/iotop so that it points to the Python 2.6 interpreter.
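    Before editing anything, it helps to confirm which interpreter the command file actually names. A minimal sketch, where script_interpreter is a hypothetical helper name:

```shell
# Print the interpreter named on the first line of a script,
# or nothing if the file does not start with "#!".
# Usage: script_interpreter /usr/sbin/iotop
script_interpreter() {
    head -n 1 "$1" | sed -n 's/^#![[:space:]]*//p'
}
```

    Running readlink -f on the printed path then shows which Python binary the symlink currently resolves to.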

    With that done, iotop can be used. iotop is somewhat similar to the top command: both take the -d parameter to set the refresh interval in seconds. Running iotop -d 5 sorts the result by I/O by default, which is what we want. The result is as shown below:

    At this point we can see that kswapd0 has the highest I/O. iotop supports keyboard shortcuts: the left and right arrow keys change the sort column (the default is to sort by I/O), the r key reverses the sort order, the o key shows only processes that are doing I/O, and, as in top, the q key quits.

This article is reposted from https://blog.csdn.net/weixin_47792780/article/details/139207065. If there is any infringement, please get in touch to have it removed.