Check the server load with load average
My name is Ito and I am an infrastructure engineer.
When you operate and maintain servers, a sudden spike in load is always a headache.
For those moments when you think "My service is slow, but I don't know why!",
I'd like to introduce the "load average," which is often the first thing to check.
About load average
When the load is high and a site or game feels sluggish, my first step is to run the top command.
The top command displays the current status of the OS in real time.
It shows so much information that you may not know where to start looking.
Since this article is about load average, let's look at the load average values.
Load average (LA) represents the "queue of processes" on the server.
The three values are, from the left, the average over the last 1 minute, the last 5 minutes, and the last 15 minutes.
They indicate a situation where various processes are asking the CPU to run them, but the server cannot keep up, so they are left waiting.
The higher the load average, the higher the load on the server.
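As a rough illustration of how the three values can be read together, the sketch below compares the 1-minute and 15-minute averages to guess whether the load is rising or falling. The function name `describe_trend` is made up for this example, and the simple comparison is only a rule of thumb, not something top itself reports. It assumes a Unix-like system, where Python's `os.getloadavg()` returns the same three numbers.

```python
import os


def describe_trend(one_min: float, fifteen_min: float) -> str:
    """Guess the direction of the load from the 1-minute and 15-minute averages."""
    if one_min > fifteen_min:
        return "rising"   # recent load is above the longer-term average
    if one_min < fifteen_min:
        return "falling"  # recent load is below the longer-term average
    return "steady"


if __name__ == "__main__":
    # os.getloadavg() is available on Unix-like systems only.
    one, five, fifteen = os.getloadavg()
    print(f"1 min: {one:.2f}, 5 min: {five:.2f}, 15 min: {fifteen:.2f}")
    print("trend:", describe_trend(one, fifteen))
```

If the 1-minute value is far above the 15-minute value, the load has probably just started to climb; the reverse suggests the spike is already passing.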
The number of processes that a server can run at the same time is determined by its number of CPU cores.
For example, a server with 4 cores can run 4 processes at once; any further processes have to wait their turn.
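Because the number of cores sets how many processes can run at once, it can help to divide the load average by the core count. The following is a minimal sketch of that idea. The helper names `load_per_core` and `is_overloaded` are hypothetical, and the threshold of 1.0 per core is a common rule of thumb rather than a hard limit.

```python
import os


def load_per_core(load_average: float, cores: int) -> float:
    """Return the load average divided by the number of CPU cores."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_average / cores


def is_overloaded(load_average: float, cores: int) -> bool:
    """Rule of thumb: more than 1.0 per core means processes are waiting."""
    return load_per_core(load_average, cores) > 1.0


if __name__ == "__main__":
    cores = os.cpu_count() or 1          # cpu_count() may return None
    one_min = os.getloadavg()[0]         # Unix-like systems only
    print(f"{one_min:.2f} on {cores} cores = {load_per_core(one_min, cores):.2f} per core")
    print("overloaded:", is_overloaded(one_min, cores))
```

With this view, a load average of 4 on a 4-core server means the cores are roughly fully used, while the same value on a 1-core server means several processes are queued.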
A little more detail
Do you have a general understanding of load average?
Now let's talk about Linux processes.
Processes also have various states.
State | Description |
---|---|
TASK_RUNNING | The process is runnable: it is either running or waiting to be run. |
TASK_INTERRUPTIBLE | The process is sleeping and can be interrupted. It is waiting for an event such as user input, so there is no telling when it will resume. |
TASK_UNINTERRUPTIBLE | The process is waiting and cannot be interrupted, typically because it is waiting for something like disk I/O to finish. |
TASK_STOPPED | The process has been stopped (suspended). |
TASK_ZOMBIE | A so-called zombie process. |
Reference: Process management 1 - Process descriptor - Pridact information sharing wiki
Reference: Learn how Linux works - Process management and scheduling
Of these, the following three do not contribute to the load.
- TASK_INTERRUPTIBLE: The process is waiting for user input or a similar event, and since there is no telling when it will resume, it is not placed in the queue.
- TASK_STOPPED: The process has stopped.
- TASK_ZOMBIE: The process has become a zombie.
In other words, the remaining two states are queued and counted in the load average, which is the "system load":
a task that is running or waiting to be run (TASK_RUNNING), or a task that is in a wait that cannot be interrupted (TASK_UNINTERRUPTIBLE).
- TASK_RUNNING
- TASK_UNINTERRUPTIBLE
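To see these two states on a live Linux machine, you can take a snapshot of the state letter of each process. The sketch below assumes the usual Linux mapping in which `R` corresponds to TASK_RUNNING and `D` to TASK_UNINTERRUPTIBLE, and reads the state from `/proc/<pid>/stat`. All function names are hypothetical, and this is only an approximation: the kernel computes the load average itself and counts individual threads, so the number here will not match it exactly.

```python
from collections import Counter
from pathlib import Path

# State letters assumed to count toward the load: R (running/runnable)
# and D (uninterruptible sleep).
LOAD_STATES = {"R", "D"}


def state_from_stat(stat_line: str) -> str:
    """Extract the state letter from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the last ')'.
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_load_states(states) -> int:
    """Count how many of the given state letters contribute to the load."""
    return sum(1 for state in states if state in LOAD_STATES)


def read_process_states(proc_root: str = "/proc") -> list:
    """Collect the state letter of every process currently listed in /proc."""
    states = []
    for stat_file in Path(proc_root).glob("[0-9]*/stat"):
        try:
            states.append(state_from_stat(stat_file.read_text()))
        except (OSError, IndexError):
            continue  # the process exited while we were reading it
    return states


if __name__ == "__main__":
    states = read_process_states()
    print(Counter(states))
    print("running or uninterruptible:", count_load_states(states))
```

On an idle server almost every process is in the `S` (interruptible sleep) state, which is why the load average stays near zero even with hundreds of processes.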
Other commands that can check LA
Here are two other commands that can be used to check the load average.
You can use the w command to see which users are logged in; its first line also shows the load average.
[root@test ~]# w
12:49:13 up 4:38, 2 users, load average: 0.00, 0.00, 0.00
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
vagrant pts/0 10.0.2.2 11:43 0.00s 0.00s 0.00s sshd: vagrant [priv]
vagrant pts/1 10.0.2.2 11:55 54:08 2.06s 0.00s sshd: vagrant [priv]
The uptime command shows how long the server has been running.
You can also check the load average here.
[root@test ~]# uptime
12:49:34 up 4:38, 2 users, load average: 0.00, 0.00, 0.00
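If you want to use these numbers in a script, for example to log them or raise an alert, the three values can be pulled out of the uptime line. This is a minimal sketch that assumes the Linux-style output shown above ("load average:" followed by three comma-separated numbers). The name `parse_load_average` is made up for this example.

```python
import re

# Matches the Linux-style "load average: 0.00, 0.00, 0.00" fragment.
LOAD_PATTERN = re.compile(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)")


def parse_load_average(line: str) -> tuple:
    """Return the 1-, 5- and 15-minute load averages found in an uptime line."""
    match = LOAD_PATTERN.search(line)
    if match is None:
        raise ValueError("no load average found in: " + line)
    return tuple(float(value) for value in match.groups())


if __name__ == "__main__":
    sample = "12:49:34 up 4:38, 2 users, load average: 0.00, 0.00, 0.00"
    print(parse_load_average(sample))
```

The same function also works on the first line of the w output, since it uses the same format.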
Summary
So, this time I explained about load average!
- When the load is high, check the load average
- It shows how many processes are waiting because the server cannot keep up
- The higher the load average value, the higher the load.
- Even though we simply say "process," a process can be in various states.
- There are multiple commands to view load averages.
Ideally we would build systems where nobody has to worry about these things, but
these values are still important to know when operating servers, so
make sure you understand them properly!