Troubleshoot High Load Low CPU on Linux

Why Is My Linux Server Load High When CPU Usage Is Low?

A high load low CPU system is a common but confusing scenario for Linux administrators. Unlike high CPU usage, where the processor is busy calculating, this state usually involves processes that are stuck waiting for data from the disk. Whether you are managing a personal VPS or a high-traffic dedicated server, understanding and identifying this bottleneck is key to maintaining peak performance.

In this guide from PerLod Hosting, we explore the root causes, especially D-state processes and I/O wait, and show you how to identify and fix the issue.

Why is Linux System Load High When CPU is Idle?

To understand why the load is high while the CPU is low, you must understand how Linux calculates system load. Linux Load Average is not just a measure of CPU usage; it is a count of processes in two specific states:

  • R (Running/Runnable): Processes actively using the CPU or waiting in line for the CPU.
  • D (Uninterruptible Sleep): Processes waiting for a hardware resource, usually disk I/O, to respond.

When your server shows high load but low CPU, it means your queue is full of D-state processes. These processes are not using the processor; they are paused, waiting for the storage system to read or write data. Because the kernel counts them as active, they inflate the load average, even though the CPU itself is 99% idle.
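
You can confirm this yourself by comparing the load average with a count of process states. The commands below are a quick, read-only check; the exact numbers will of course differ on your system:

uptime
ps -eo state= | sort | uniq -c

The uptime command prints the 1, 5, and 15-minute load averages, and the ps pipeline counts how many processes are currently in each state. A noticeable number of D entries alongside a high load average points to an I/O problem rather than a CPU one.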

Common causes include:

  • Storage Latency: The physical disk or network storage is too slow to handle the volume of requests.
  • Saturated Disk Bandwidth: A backup script or database query is consuming 100% of the disk’s read and write capacity.
  • Hardware Locks: Drivers waiting for an external device to respond.

How Does IO Wait Contribute to System Load?

The primary symptom of D-state processes is IO Wait (wa), which represents the percentage of time the CPU sits idle because it is waiting for a disk I/O operation to complete. When a process requests data, the CPU sends the command to the disk and waits. If the disk is fast, the wait is negligible. If the disk is slow or overloaded, the process enters Uninterruptible Sleep (D-state).

Key Metrics to Analyze:

1. %wa (IO Wait): The %wa metric in the top command is the most essential option for this issue. It measures the percentage of time the CPU is sitting idle, specifically because it is waiting for a disk I/O request to finish.

  • Normal Behavior of %wa: In a healthy system, this is usually near 0% because modern CPUs and disks like NVMe SSDs are fast.
  • The Bottleneck: If %wa is consistently high, for example, above 10 to 20%, while user (us) and system (sy) CPU usage are low, your CPU is essentially bored but unable to work because the disk is too slow to provide data. The system is I/O bound, which means the disk speed, not the CPU speed, is limiting performance.

2. Await (Latency): While %wa tells you how much time is wasted, the await metric in the iostat command tells you why. It represents the average time in milliseconds it takes for a single I/O request to be fully served by the disk.

When this latency is high, for example, 50ms, 100ms, or more, processes cannot finish their tasks quickly. They get stuck in the D-state (Uninterruptible Sleep), waiting for their turn. This backlog of waiting processes causes the high load average, even though the CPU itself isn’t doing any heavy calculation.
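
A quick way to see both symptoms together is vmstat, which is part of the procps package on most distributions and usually preinstalled:

vmstat 1 5

Watch the b column (processes blocked in uninterruptible sleep) and the wa column under cpu. If b stays above zero and wa is high while us and sy remain low, the load is being driven by waiting processes rather than by computation; the steps in the next section narrow this down to a specific disk and process.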

In summary, a high load low CPU system is almost always a storage bottleneck.

How To Fix High Load Low CPU Issue?

Now that you understand that disk latency and stuck processes are the root cause, you need a systematic way to find the specific bottleneck. You must trace the high load from the system level down to the specific process or disk partition.

You can follow the steps below to verify the issue, identify the process, and apply a fix.

1. Verify the Load and IO Wait

The first step is to confirm that the high load is caused by I/O wait and not CPU processing. To do this, you can use the top command:

top

You must look for the %Cpu(s) line, specifically the wa value. Example output:

%Cpu(s):  1.2 us,  0.5 sy,  0.0 ni, 48.1 id, 50.2 wa ...
  • us (User): A low value, for example, 1.2%, shows that applications are not doing much computation.
  • id (Idle): A moderate or low value, like 48.1%, means the CPU has free time.
  • wa (IO Wait): A high value like 50.2% confirms the CPU is wasting half its time waiting for the disk.

2. Identify Processes in D-State

At this point, you must find the specific processes that are stuck in Uninterruptible Sleep (D-State). To do this, you can use the command below:

ps -eo state,pid,cmd | grep "^D"

This command lists all processes whose state starts with D. In the output, you will see a list of PIDs (Process IDs) and command names, for example, mysql, apache, or a backup script such as tar. These are the processes stuck waiting for the disk.
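
If you want to see why a particular process is blocked, the kernel exposes its wait channel and kernel stack under /proc. As an illustration, assuming 1234 is one of the PIDs returned above:

cat /proc/1234/wchan; echo
sudo cat /proc/1234/stack

The wchan entry names the kernel function the process is sleeping in, and the stack file (readable only by root, and available on most stock kernels) shows the full kernel call chain, which usually points at a filesystem or block-device wait.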

3. Detect the Disk Bottleneck

Once you identify the D-state processes, you must check if a specific disk is saturated or responding slowly. For this purpose, you can use the following iostat command:

iostat -xz 1

The iostat tool is part of the sysstat package; you can install it with:

sudo apt install sysstat # Ubuntu
sudo dnf install sysstat # RHEL

In your output from the iostat command, you must check the following metrics:

  • %util: If near 100%, the disk is fully saturated.
  • await: The average time (ms) for I/O requests. Values >10ms usually indicate slowness; >100ms indicates severe latency.
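
As an illustration only (not real output), a saturated disk might look like this in iostat -xz, with some columns omitted; column names vary slightly between sysstat versions:

Device   r/s    w/s    r_await  w_await  %util
sda      12.0   850.0  4.8      120.3    99.6

Here writes dominate, each write waits more than 100ms on average, and the device is effectively 100% busy, so any process touching sda will pile up in the D-state.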

4. Find the Process Causing High I/O

While the ps command lists the processes that are stuck waiting, the iotop command reveals the actual cause, which is the specific process that is heavily using the disk.

You can use the iotop command below to find the process causing high I/O:

sudo iotop -oP
  • -o: Only shows processes actively doing I/O.
  • -P: Shows processes instead of threads.

You must look at the DISK WRITE or DISK READ columns to find the top consumer.
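
If iotop is not available, pidstat from the same sysstat package you installed in step 3 gives a similar per-process view of disk activity:

pidstat -d 1

The -d flag reports disk statistics; the kB_rd/s and kB_wr/s columns show which PIDs are actually reading from and writing to the disk each second.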

5. Resolve High Load Issues

Once you have identified the specific process or disk bottleneck using the tools above, you can take action to resolve the high load. The solution depends on whether the workload is temporary, like a backup, or permanent, like a busy database:

Terminate Non-Critical Tasks: If a background job, like a backup or log rotation script, is causing the spike during peak hours, stop it with kill [PID] and reschedule it for a quieter time. Keep in mind that a process currently stuck in D-state will not respond to signals, even kill -9, until its pending I/O completes.
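
When you reschedule the job, you can also lower its I/O priority so it no longer competes with production traffic. As a rough example with a hypothetical backup command and paths:

ionice -c3 nice -n 19 tar -czf /backup/site.tar.gz /var/www

The ionice -c3 option places the process in the idle I/O scheduling class, so it only gets disk time when nothing else needs it. Note that the idle class is fully honored by the BFQ/CFQ schedulers and has little effect with schedulers such as none or mq-deadline.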

Optimize Software Configuration: For database services like MySQL or PostgreSQL, high I/O often signals poor caching. Increase buffer pool sizes to keep more data in RAM, reducing the need for disk reads, or optimize slow SQL queries that trigger full table scans.
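
As a rough sketch, on a dedicated MySQL server with 8GB of RAM you might give roughly half of it to the InnoDB buffer pool and enable the slow query log to find the queries causing the disk reads. The values below are assumptions for that scenario, not universal recommendations, and the config file location varies by distribution (for example /etc/mysql/my.cnf or a file under /etc/my.cnf.d/):

[mysqld]
innodb_buffer_pool_size = 4G
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1

Restart MySQL after changing these settings, then review the slow log for queries that perform full table scans.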

If you are running heavy workloads, this guide on Tuning High-traffic Servers can be helpful.

Upgrade Storage Resources: If your disk utilization (%util) sits at 100% even during normal traffic, your current hardware cannot handle the workload. You must upgrade to faster storage, such as NVMe SSDs, or provision higher IOPS for cloud volumes.
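
Before and after any upgrade, you can measure what the disk actually delivers with fio (available in most distribution repositories). The following is a minimal random-read test against a 1GB temporary file; the file name, size, and duration are arbitrary choices for illustration:

fio --name=randread --filename=/tmp/fiotest --size=1G --rw=randread --bs=4k --direct=1 --ioengine=libaio --runtime=30 --time_based --group_reporting
rm /tmp/fiotest

Compare the reported IOPS and latency with the await values you saw in iostat; if the drive cannot keep up with your workload's request rate even when the server is otherwise idle, faster storage is the only real fix.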

You can also fine-tune your existing drive settings using the NVMe Optimization for Linux Server Guide.

FAQs

Can I kill a process in D-state?

No. D-state processes cannot be killed, even with kill -9, because they are blocked in the kernel waiting for hardware. They usually clear on their own once the pending I/O completes; otherwise, you must fix the underlying storage issue or reboot the server.

What is the difference between Zombie (Z) and D-state processes?

Zombie (Z): A finished process waiting for its parent. It uses zero resources and does not affect performance.
D-State (D): An active process blocked by hardware. It inflates the load average and slows down the system.

What is a good load average?

A healthy load average should be less than your total number of CPU cores.
Load < Cores: System is handling traffic well.
Load > Cores: Processes are queuing up.
High Load with Low CPU (for example, a load of 8 on 4 cores while the CPU is mostly idle): This confirms a severe disk I/O bottleneck, as processes are stuck waiting despite the CPU being idle.
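
A quick way to compare the two numbers on your own server:

nproc
uptime

nproc prints the number of CPU cores, and uptime prints the 1, 5, and 15-minute load averages; as a rough rule of thumb, a sustained load above the core count deserves investigation.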

Is high load always bad if CPU is low?

It is a warning sign. A high load with a low CPU means that applications are running slowly while waiting for data. Your server may not crash, but users will experience lag and timeouts because the CPU is sitting idle instead of working.

Conclusion

A high load low CPU system is usually a storage problem, not a processor problem. It means your CPU is fast, but it is forced to wait for slow disks to finish their work.

By monitoring %wa (IO Wait) and identifying D-state processes, you can detect the exact bottleneck, whether it’s a saturated drive, a heavy backup script, or an unoptimized database. Fixing these I/O issues will unlock your server’s true performance and bring load averages back down to normal levels.

We hope you enjoyed this guide on fixing the high load low CPU issue. Subscribe to our X and Facebook channels to get the latest updates and articles.
