Explaining how to check server load with load average and Linux processes
I'm Ito, an infrastructure engineer.
In server operation and maintenance, a sudden increase in load is a common problem.
To avoid ending up in the situation of "the service is slow, but I don't know the cause!", this article explains load average and Linux process states.
About load average
When the load is high and a site or game feels slow, the first thing I run is the top command.
The top command displays the current status of the OS in real time.
It displays so much information that you may not know where to start looking.
This article is about load average, so let's look at the load average values.
Load average (LA) represents the length of the "process queue" on that server and is generally expressed as an average over the last 1 minute, 5 minutes, and 15 minutes.
In the diagram above, the three values are, from left to right, "LA over the last 1 minute", "LA over the last 5 minutes", and "LA over the last 15 minutes".
Various processes ask the CPU to run them, and when the server cannot keep up, the processes queue up and wait.
The higher the load average, the higher the load on that server.
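On Linux, the same three values can also be read directly from /proc/loadavg. The following is a minimal sketch, assuming a Linux system with /proc mounted; `show_loadavg` is a hypothetical helper name used only for this example.

```shell
# show_loadavg is a hypothetical helper, not a standard command.
# Input: one line in the /proc/loadavg format, e.g.
#   "0.42 0.35 0.30 1/123 4567"
show_loadavg() {
    # Fields 1-3 are the averages over the last 1, 5, and 15 minutes.
    echo "$1" | awk '{ printf "1min=%s 5min=%s 15min=%s\n", $1, $2, $3 }'
}

# On Linux, /proc/loadavg holds the current values.
if [ -r /proc/loadavg ]; then
    show_loadavg "$(cat /proc/loadavg)"
fi
```

If the 1-minute value is higher than the 15-minute value, the load has been rising recently; if it is lower, the load is settling down.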
The number of processes a server can run at once equals the number of CPU cores on that server.
For example, a four-core server can run four processes at the same time, so a load average above the number of cores generally means that some processes are waiting.
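Because the number of cores sets the baseline, it can help to look at load average divided by the core count. The following is a rough sketch, assuming Linux with `nproc` (GNU coreutils) and /proc/loadavg available; `load_per_core` is a hypothetical helper name.

```shell
# load_per_core is a hypothetical helper.
# $1 = load average (e.g. 6.00), $2 = number of CPU cores (e.g. 4)
load_per_core() {
    # LC_ALL=C keeps "." as the decimal point regardless of locale.
    LC_ALL=C awk -v la="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", la / cores }'
}

# Use the current 1-minute load average and the core count of this machine.
if command -v nproc >/dev/null 2>&1 && [ -r /proc/loadavg ]; then
    la1=$(cut -d ' ' -f 1 /proc/loadavg)
    cores=$(nproc)
    echo "1-minute load per core: $(load_per_core "$la1" "$cores")"
fi
```

For example, a load average of 6.00 on a four-core server gives 1.50 per core, which suggests that more processes want to run than the cores can handle at once.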
About the Linux Process
Do you have a general understanding of load average?
Here we will explain Linux processes. A process can be in one of several states.
State | Description |
---|---|
TASK_RUNNING | The process is runnable: it is either running or waiting to be executed. |
TASK_INTERRUPTIBLE | The process is sleeping and can be interrupted. It is waiting for an event such as user input, so it is not known when it will resume. |
TASK_UNINTERRUPTIBLE | The process is sleeping and cannot be interrupted. It is typically waiting for I/O, such as disk access, to finish. |
TASK_STOPPED | The process has been stopped (suspended). |
TASK_ZOMBIE | The so-called zombie process: it has exited but has not yet been reaped by its parent. |
Reference: Process management 1 - Process descriptor - Pridact information sharing wiki
Reference: Learn how Linux works - Process management and scheduling
Of these, the following three are not related to load.
- TASK_INTERRUPTIBLE: The process is waiting for something such as user input and it is not known when it will resume, so it does not enter the queue.
- TASK_STOPPED: The process has been stopped.
- TASK_ZOMBIE: The process has become a zombie.
In other words, the remaining two states are queued and counted in the load average, which is the "system load":
"the task is running or waiting to be executed (TASK_RUNNING)" or "the task is waiting, typically for I/O, and cannot be interrupted (TASK_UNINTERRUPTIBLE)."
- TASK_RUNNING
- TASK_UNINTERRUPTIBLE
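To see how many processes are currently in these two states, the state codes printed by ps can be counted. In ps output, `R` corresponds to TASK_RUNNING and `D` to TASK_UNINTERRUPTIBLE. The following is a minimal sketch, assuming the procps version of ps found on most Linux distributions; `count_load_states` is a hypothetical helper name.

```shell
# count_load_states is a hypothetical helper. It reads one process state
# code per line (as printed by "ps -eo stat=") and counts the two states
# that contribute to load average.
count_load_states() {
    awk '
        /^R/ { r++ }   # R: running or runnable (TASK_RUNNING)
        /^D/ { d++ }   # D: uninterruptible sleep (TASK_UNINTERRUPTIBLE)
        END  { printf "R=%d D=%d\n", r, d }
    '
}

# Count the states of all processes on this machine.
if command -v ps >/dev/null 2>&1; then
    ps -eo stat= 2>/dev/null | count_load_states
fi
```

This is only a snapshot of one moment, whereas load average is averaged over time, so the two numbers will not match exactly. A large `D` count usually points at slow disk or network I/O rather than a lack of CPU.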
Other commands that can check LA
Here are two other commands that can be used to check the load average.
The w command shows which users are currently logged in, and the first line of its output also shows the load average.
    [root@test ~]# w
     12:49:13 up  4:38,  2 users,  load average: 0.00, 0.00, 0.00
    USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
    vagrant  pts/0    10.0.2.2         11:43    0.00s  0.00s  0.00s sshd: vagrant [priv]
    vagrant  pts/1    10.0.2.2         11:55   54:08   2.06s  0.00s sshd: vagrant [priv]
The uptime command shows how long the server has been running.
It also shows the load average.
    [root@test ~]# uptime
     12:49:34 up  4:38,  2 users,  load average: 0.00, 0.00, 0.00
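When only the load average values are needed, for example in a monitoring script, they can be cut out of the uptime output. The following is a small sketch, assuming the Linux (procps) output format shown above, where the values follow the text "load average:"; `extract_loadavg` is a hypothetical helper name.

```shell
# extract_loadavg is a hypothetical helper. It reads uptime-style output
# on standard input and prints only the part after "load average:".
extract_loadavg() {
    sed -n 's/.*load average: *//p'
}

# Print only the three load average values for this machine.
if command -v uptime >/dev/null 2>&1; then
    uptime | extract_loadavg
fi
```

Other systems may word this line differently, in which case the pattern would need to be adjusted.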
Summary
So, this time I explained about load average!
- When the load is high, check the load average
- Load average tells you how many processes are waiting because the server cannot handle them right away
- The higher the load average value, the higher the load.
- Even though we say "process" in one word, there are various states.
- There are multiple commands to view load averages.
It would be best to build a system that never causes trouble, but knowing these values is quite important when operating a server, so make sure you understand them properly!