How to find the cause of server load
My name is Ito and I am an infrastructure engineer.
Last time, I introduced something called "load average" that is checked when the server load increases.
Previous article: "Check the server load with load average?" (Beyond Blog)
Load average is the average number of processes that are running or waiting to be processed (on Linux, this also counts processes waiting on disk I/O).
The higher the number, the more processes are waiting, and the higher the load on the server.
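To make "higher number" a little more concrete, here is a minimal Python sketch that divides the load average by the number of CPU cores. The per-core comparison is a common rule of thumb that I am assuming here, not something from the previous article, and the function names load_per_core and current_load_per_core are my own.

```python
import os


def load_per_core(load_average: float, cores: int) -> float:
    """Return the load average divided by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_average / cores


def current_load_per_core() -> float:
    """Read the 1-minute load average of this machine (Unix only)."""
    one_minute, _five_minutes, _fifteen_minutes = os.getloadavg()
    # os.cpu_count() can return None, so fall back to 1 core (assumption).
    return load_per_core(one_minute, os.cpu_count() or 1)


# Example with made-up numbers: a load average of 8.0 on a 4-core server
# means roughly two processes per core want the CPU at the same time.
print(load_per_core(8.0, 4))
```

A value that stays well above 1.0 per core suggests that processes really are queuing up, but treat the threshold as a rough guide rather than a rule.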
This time, I would like to look at how to find out why the load is high.
There are two main causes
There are basically two main reasons why a high load average occurs.
- CPU processing cannot keep up
- Disk I/O processing cannot keep up
Now, I would like to explain how to find out if each is the cause.
If the cause is the CPU
First, check the CPU usage with the top command. The following two values are worth noting (top's CPU summary line abbreviates them as us and sy).
%user | Percentage of CPU time used by user processes |
%system | Percentage of CPU time used by the system (kernel) |
If an ordinary process is CPU intensive, user-mode (%user) CPU usage will probably be high.
Also, when a large number of processes are running in user mode, the kernel has to switch between them (context switching).
Because this process switching runs in kernel mode (%system),
kernel-mode CPU usage rises for workloads that switch processes frequently.
If user mode CPU usage is high
If kernel CPU usage is high
If disk I/O is the cause
If disk I/O is the suspected cause, check the following in top.
%iowait | Percentage of time the CPU sits idle waiting for disk I/O to complete (shown as wa in top) |
SWAP | Amount of disk space used in place of memory after physical memory runs out |
The iowait value is often high when a large amount of data is being read from or written to disk.
For example, a DB server under heavy database access tends to have a high iowait value.
SWAP is disk space that is used in place of memory once processes have used up all of the physical memory.
For example, when a web server receives a large amount of traffic, it keeps allocating memory until it has to fall back on SWAP.
Using SWAP means using the disk instead of memory, so disk I/O increases and the server slows down.
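As a rough illustration of the SWAP check, this Python sketch works out how much swap is in use from the SwapTotal and SwapFree fields of /proc/meminfo. It assumes a Linux server; the function names and the sample text are hypothetical.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo style text into {field name: value in kB}."""
    info = {}
    for line in text.splitlines():
        key, separator, rest = line.partition(":")
        if not separator:
            continue  # skip lines that are not 'Name: value' pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def swap_used_kb(info: dict) -> int:
    """Swap in use = total swap minus free swap."""
    return info["SwapTotal"] - info["SwapFree"]


def read_meminfo(path: str = "/proc/meminfo") -> dict:
    """Read the live values (Linux only)."""
    with open(path) as meminfo_file:
        return parse_meminfo(meminfo_file.read())


# Made-up sample in the same layout as /proc/meminfo.
SAMPLE_MEMINFO = """MemTotal:        2048000 kB
MemFree:           51200 kB
SwapTotal:       1048572 kB
SwapFree:         524286 kB
"""
print(swap_used_kb(parse_meminfo(SAMPLE_MEMINFO)), "kB of swap in use")
```

On a real server, swap_used_kb(read_meminfo()) gives the current figure. A value that keeps growing while iowait is also high suggests that memory pressure is being turned into disk I/O.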
If iowait is high and load is high due to disk I/O
If you are using SWAP
Investigating further
I think this will help you figure out whether the problem is caused by the CPU or I/O.
Next, use the ps command to find out which processes are using the CPU (or, in the case of SWAP, which processes are using memory).
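Here is one way this ps step could be scripted in Python. It is only a sketch: the column list pid,pcpu,pmem,comm is my choice rather than the article's, the helper names are hypothetical, and the sample output is made up.

```python
import os
import subprocess


def parse_ps(output: str) -> list:
    """Parse 'ps -A -o pid,pcpu,pmem,comm' output into a list of dicts."""
    rows = []
    for line in output.splitlines()[1:]:  # the first line is the header
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue
        pid, pcpu, pmem, command = parts
        rows.append({
            "pid": int(pid),
            "cpu": float(pcpu),
            "mem": float(pmem),
            "command": command.strip(),
        })
    return rows


def top_processes(rows: list, key: str, limit: int = 5) -> list:
    """Return the processes with the highest value for 'cpu' or 'mem'."""
    return sorted(rows, key=lambda row: row[key], reverse=True)[:limit]


def run_ps() -> str:
    """Run ps on this machine; LC_ALL=C keeps '.' as the decimal point."""
    result = subprocess.run(
        ["ps", "-A", "-o", "pid,pcpu,pmem,comm"],
        capture_output=True, text=True, check=True,
        env={**os.environ, "LC_ALL": "C"},
    )
    return result.stdout


# Made-up ps output for illustration.
SAMPLE_PS = """  PID %CPU %MEM COMMAND
    1  0.0  0.1 systemd
  812 92.5  3.0 httpd
  990  4.0 41.2 mysqld
"""
sample_rows = parse_ps(SAMPLE_PS)
print(top_processes(sample_rows, "cpu", 1)[0]["command"])
print(top_processes(sample_rows, "mem", 1)[0]["command"])
```

On a real server, top_processes(parse_ps(run_ps()), "cpu") lists the heaviest CPU users, and passing "mem" instead lists the processes holding the most memory, which is the place to look when SWAP is in use.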
If user-mode CPU usage is high and there is no problem with I/O or the like,
you may need to increase CPU performance or review the program.
it may be necessary to expand the memory or
Also, there may be cases where the load average is low but the processing is slow.
In that case, there may be a problem with the software settings or network processing.
When faced with the problem of "high load,"
the first step to solving the problem is to calm down and identify where the load is being placed.