My first command is always 'w'. And I always urge young engineers to do the same.
There is no shorter command that shows uptime, load averages (1/5/15 minutes), and logged-in users. Essential for quick system health checks!
It should also be mentioned that Linux load average is a complex beast[1]. However, a general rule of thumb that works for most environments is:
You always want the load average to be less than the total number of CPU cores. If higher, you're likely experiencing a lot of waits and context switching.
[1] https://www.brendangregg.com/blog/2017-08-08/linux-load-aver...
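For anyone who wants to script the rule of thumb above, here is a minimal Python sketch. The helper names (load_per_core, exceeds_rule_of_thumb) are made up for illustration, and it only encodes the "load average vs. number of cores" comparison, nothing smarter.

```python
import os


def load_per_core(load_averages, cpu_count):
    """Divide each load average by the number of CPU cores."""
    return tuple(avg / cpu_count for avg in load_averages)


def exceeds_rule_of_thumb(load_averages, cpu_count):
    """Return True if any of the averages is above the core count."""
    return any(avg > cpu_count for avg in load_averages)


if __name__ == "__main__":
    # os.getloadavg() returns the 1/5/15-minute averages (Unix only).
    loads = os.getloadavg()
    cores = os.cpu_count() or 1  # cpu_count() can return None
    print("load averages:", loads, "cores:", cores)
    print("per core:", load_per_core(loads, cores))
    if exceeds_rule_of_thumb(loads, cores):
        print("load is above the core count, worth a closer look")
```

Treat a hit as a prompt to dig further rather than as proof of a CPU shortage.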
On Linux this is not true. On an I/O-heavy system, with lots of synchronous I/Os done concurrently by many threads, your load average may be well over the number of CPUs without there being a CPU shortage. Say you have 16 CPUs and the load average is 20, but on average only 10 of those 20 threads are in the Runnable (R) state and the other 10 are in Uninterruptible sleep (D). You don't have a CPU shortage in this case.
Note that synchronous I/O completion checks for previously submitted asynchronous I/Os (both with libaio and io_uring) do not contribute to system load as they sleep in the interruptible sleep (S) mode.
That's why I tend to break down the system load (demand) by the sleep type, system call and wchan/kernel stack location when possible. I've written about the techniques and one extreme scenario ("system load in thousands, little CPU usage") here:
https://tanelpoder.com/posts/high-system-load-low-cpu-utiliz...
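To illustrate the R vs. D breakdown described above, here is a rough Python sketch that takes a single snapshot of thread states from /proc. It is not the psn tool and not the method from the linked post, just a simplified stand-in; the function names are hypothetical, it is Linux-only, and one snapshot is much noisier than a sampled average.

```python
import glob
from collections import Counter


def parse_state(stat_line):
    """Extract the one-letter state from a /proc/<pid>/stat line.

    The command name sits in parentheses and may itself contain spaces
    or parentheses, so the state is taken after the LAST ')'.
    """
    return stat_line[stat_line.rindex(")") + 2]


def count_thread_states():
    """Count all threads on the system by scheduler state (one snapshot)."""
    states = Counter()
    for path in glob.glob("/proc/[0-9]*/task/[0-9]*/stat"):
        try:
            with open(path) as f:
                states[parse_state(f.read())] += 1
        except (OSError, ValueError):
            continue  # thread exited or became unreadable mid-scan
    return states


def load_contributors(states):
    """Threads in R or D are the ones that count toward Linux load."""
    return states.get("R", 0) + states.get("D", 0)


if __name__ == "__main__":
    snapshot = count_thread_states()
    print("R (runnable):", snapshot.get("R", 0))
    print("D (uninterruptible):", snapshot.get("D", 0))
    print("R + D:", load_contributors(snapshot))
```

If R stays well under the CPU count while D is high, the load is coming from waits (typically I/O) rather than from a lack of CPU.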
Hey Tanel - I wanted to thank you for that blog post and psn tool - it recently helped me in a tricky performance investigation.
Glad to be helpful! :-)
The proper way is to have an idea of what it normally is before you need to troubleshoot issues.
What counts as a 'good load' depends on the application and how it works. On some servers, something close to 0 is a good thing. On other servers, a load of 10 or lower means something is seriously wrong.
Of course, if you don't know what a 'good' number is, or you are trying to optimize an application and looking for bottlenecks, then it is time to reach for different tools.
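One way to make "know what it normally is" concrete is to compare the current load against samples recorded during normal operation. A minimal Python sketch, assuming you already have such samples; the function name and the 3-standard-deviation threshold are arbitrary choices for illustration, not a recommendation.

```python
import statistics


def is_unusual(current_load, baseline_samples, k=3.0):
    """Flag a load that is far from the recorded baseline.

    Deviations in BOTH directions are flagged, since on some servers
    an unusually low load is the sign of trouble. Needs at least two
    baseline samples.
    """
    mean = statistics.mean(baseline_samples)
    spread = statistics.stdev(baseline_samples)
    return abs(current_load - mean) > k * spread


if __name__ == "__main__":
    # Hypothetical 1-minute load averages recorded on a busy server.
    baseline = [40.0, 42.0, 38.0, 41.0, 39.0]
    print(is_unusual(10.0, baseline))  # far below normal for this box
    print(is_unusual(40.5, baseline))  # within the usual range
```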
Glances is nice. I think it is a clone of HP-UX Glance.
https://nicolargo.github.io/glances/
I have also hacked basic top to add database login details to server processes.
Me too! So much so that I add it to my .bashrc everywhere.