High-Performance Linux Networking: Kernel and sysctl Tuning Tutorial
In this guide, you will learn Linux kernel network tuning to improve throughput, reduce latency, and increase connection scalability on production servers. You can apply this setup to real-world workloads, including API backends, reverse proxies, storage replication, and AI/GPU pipelines, where default Linux networking values often become a bottleneck under sustained traffic.
At Perlod Hosting, these optimizations are most useful on high-performance VPS and dedicated servers, where they help you get faster downloads and uploads, keep connections stable under heavy traffic, and improve performance for latency-sensitive services.
This guide applies to both flexible VPS plans and dedicated servers, especially when you need stable performance under sustained high traffic.
Prerequisites for Linux Kernel Network Tuning
Before applying any Linux kernel tuning, make sure the server meets a few basic requirements and that you can safely roll back if needed.
Required access includes:
- Root or sudo privileges on the Linux server.
- Kernel version 4.9+ for TCP BBR support.
- Backup plan before making changes.
You can back up your sysctl configuration with the command below, so you can restore it if something goes wrong:
sudo cp /etc/sysctl.conf /etc/sysctl.conf.bak
Check your current kernel version:
uname -r
If the version is 4.9+, you’re good to continue; if it’s older, you should upgrade the kernel first.
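If you prefer a single quick check, the snippet below is a small sketch (assuming a bash shell) that prints the kernel version and verifies that the tcp_bbr module is available:
# Print the running kernel and check whether the tcp_bbr module exists
uname -r
if modinfo tcp_bbr >/dev/null 2>&1; then echo "tcp_bbr: available"; else echo "tcp_bbr: NOT found - upgrade the kernel first"; fi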
Step 1: Analyze Current Linux System Limitations
Before applying any Linux kernel tuning, we need to understand what is currently limiting network performance.
You can check current TCP buffer settings with the following commands:
echo "=== TCP Read Memory ===" && sysctl net.ipv4.tcp_rmem
echo "=== TCP Write Memory ===" && sysctl net.ipv4.tcp_wmem
echo "=== Congestion Control ===" && sysctl net.ipv4.tcp_congestion_control
echo "=== QDisc (Queuing Discipline) ===" && sysctl net.core.default_qdisc
Your output should look similar to this:
net.ipv4.tcp_rmem = 4096 131072 6291456
net.ipv4.tcp_wmem = 4096 16384 4194304
net.ipv4.tcp_congestion_control = cubic
net.core.default_qdisc = pfifo_fast
Explanations:
- tcp_rmem: Per-connection read buffer: 4 KB minimum, 128 KB default, 6 MB maximum here.
- tcp_wmem: Per-connection write buffer: 4 KB minimum, 16 KB default, 4 MB maximum here.
- cubic: The default loss-based congestion control; it works well on clean links but backs off sharply on lossy, high-latency WAN paths.
- pfifo_fast: A basic FIFO queue with no fairness or pacing.
Then, you can check the core system limits with the commands below:
echo "=== Core Buffer Maximums ===" && sysctl net.core.rmem_max net.core.wmem_max
echo "=== Network Device Backlog ===" && sysctl net.core.netdev_max_backlog
echo "=== TCP Auto-tuning ===" && sysctl net.ipv4.tcp_moderate_rcvbuf
echo "=== File Descriptors ===" && sysctl fs.file-max
Example output:
net.core.rmem_max = 212992 (208 KB - too small!)
net.core.wmem_max = 212992 (208 KB - too small!)
net.core.netdev_max_backlog = 1000 (will drop packets at scale)
net.ipv4.tcp_moderate_rcvbuf = 1 (good, auto-tuning is on)
fs.file-max = 794974 (OK, but can be increased)
Also, check the network interface speed with the following command:
ethtool eth0 | grep "Speed:"
Example output:
Speed: 10000Mb/s ← 10 Gbps connection
Or for multiple interfaces, you can use:
for iface in eth0 eth1 eth2; do echo "$iface: $(ethtool $iface 2>/dev/null | grep Speed)"; done
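Before moving on, it can help to save a baseline snapshot of these values so you can compare them after tuning. The command below is a sketch; the output file path is just an example:
# Capture a baseline of the key networking parameters to a dated file
sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem \
  net.ipv4.tcp_congestion_control net.core.default_qdisc \
  net.core.rmem_max net.core.wmem_max \
  net.core.netdev_max_backlog fs.file-max | tee ~/sysctl-baseline-$(date +%F).txt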
Step 2: Calculate Required Buffer Size
At this point, you can calculate the required buffer size using the Bandwidth‑Delay Product (BDP). BDP estimates how much data must be in flight on the network path to fully utilize the available bandwidth, so it gives a practical minimum for your TCP buffer sizing.
Use the formula below to compute BDP in bytes:
BDP (bytes) = Bandwidth (bits/sec) × RTT (seconds) / 8
Now you can plug in your bandwidth and measured RTT to get a target buffer size for high-throughput transfers.
Example 1: Long-Distance 100 Gbps Connection
- Bandwidth: 100 Gbps
- Typical RTT: ~60 ms (0.06 seconds)
Calculation:
BDP = (100,000,000,000 bits/sec) × (0.06 sec) / 8
BDP = 6,000,000,000 bits / 8
BDP = 750,000,000 bytes
BDP = 750 MB
Recommended TCP buffer: roughly 750 MB to let a single flow fill the path. In this guide we cap the maximum at 512 MB to keep per-socket memory bounded; multiple parallel streams can still saturate the link.
Example 2: Moderate 10 Gbps Connection
- Bandwidth: 10 Gbps
- Typical RTT: ~80 ms (0.08 seconds)
Calculation:
BDP = (10,000,000,000) × (0.08) / 8
BDP = 100,000,000 bytes
BDP = 100 MB
Recommended TCP buffer: 100 MB minimum.
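Instead of doing the arithmetic by hand, you can let awk compute the BDP for you. The one-liner below is a small sketch: set the gbps and rtt_ms variables to your own link speed (in Gbps) and measured RTT (in milliseconds):
# BDP calculator: 10 Gbps at 80 ms RTT, as in Example 2
awk -v gbps=10 -v rtt_ms=80 'BEGIN { bdp = gbps * 1e9 * (rtt_ms / 1000) / 8; printf "BDP = %.0f bytes (about %.0f MB)\n", bdp, bdp / 1e6 }'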
To measure your actual RTT, ping your target destination. For example, 8.8.8.8 for Google’s DNS:
ping -c 20 8.8.8.8 | tail -1
Example output:
rtt min/avg/max/mdev = 8.5/11.2/15.3/2.1 ms
Or measure RTT to a specific server:
# To a far-away datacenter
ping -c 10 1.1.1.1 | grep avg
# Take 20 samples and show the 5 slowest
for i in {1..20}; do ping -c 1 -W 1 8.8.8.8 2>/dev/null | grep time=; done | awk -F'=' '{print $NF}' | sort -n | tail -5
Step 3: Edit the Sysctl Configuration File
In this step, you can make your Linux kernel tuning persistent by adding the kernel networking settings to /etc/sysctl.conf, so they survive reboots and apply consistently across the server. These parameters control core TCP behavior, which directly impacts throughput and latency on high-speed WAN links like 10G and 100G.
Open the sysctl configuration file with the command below:
sudo nano /etc/sysctl.conf
At the end of the file, add the following settings:
# ============================================================================
# LINUX KERNEL NETWORK OPTIMIZATION - VERIFIED 2024/2025
# Based on: Cloudflare, Google BBR, ESnet 100G Tuning Research
# Last Updated: December 2025
# ============================================================================
# ============================================================================
# 1. TCP BUFFER SIZES - Critical for Long-Distance Networks
# ============================================================================
# tcp_rmem = minimum initial maximum (in bytes)
# For 100G networks, we recommend 512MB max buffer
net.ipv4.tcp_rmem = 4096 131072 536870912
net.ipv4.tcp_wmem = 4096 131072 536870912
# 2. Core Socket Buffer Limits - System-wide maximums
# Set to same as tcp_rmem/tcp_wmem max values
net.core.rmem_max = 536870912
net.core.wmem_max = 536870912
net.core.rmem_default = 131072
net.core.wmem_default = 131072
# ============================================================================
# 3. TCP WINDOW SCALING - Enable larger windows
# ============================================================================
# Required for tcp_rmem/tcp_wmem to be effective
net.ipv4.tcp_window_scaling = 1
# ============================================================================
# 4. CONGESTION CONTROL & QUEUEING DISCIPLINE
# ============================================================================
# BBR (Bottleneck Bandwidth and Round-trip time)
# - Superior to CUBIC for WAN connections (especially intercontinental)
# - More efficient use of buffer, lower latency
# - Maintains high throughput even with packet loss
net.ipv4.tcp_congestion_control = bbr
# FQ (Fair Queuing) - Required for BBR to work effectively
# - Eliminates bufferbloat
# - Fair distribution of bandwidth
# - Essential for low-latency applications
net.core.default_qdisc = fq
# ============================================================================
# 5. SYN FLOOD & CONNECTION QUEUE PROTECTION
# ============================================================================
# For servers handling thousands of concurrent connections
net.ipv4.tcp_max_syn_backlog = 8192
net.core.somaxconn = 8192
net.core.netdev_max_backlog = 8192
# ============================================================================
# 6. TCP RELIABILITY & ACKNOWLEDGMENT FEATURES
# ============================================================================
# SACK (Selective Acknowledgment) - Faster recovery from packet loss
net.ipv4.tcp_sack = 1
# FACK (Forward Acknowledgment) - Better loss detection
# (No longer used by modern kernels, but harmless to keep)
net.ipv4.tcp_fack = 1
# TCP Timestamps - Improves RTT measurement accuracy
net.ipv4.tcp_timestamps = 1
# TCP Fast Open - Reduces connection setup time
# 3 = enable for both client and server
net.ipv4.tcp_fastopen = 3
# ============================================================================
# 7. TCP TIMEOUT & PORT REUSE
# ============================================================================
# Reduce TIME_WAIT duration for high-throughput servers
net.ipv4.tcp_fin_timeout = 30
# Allow reusing TIME_WAIT sockets for new connections
# Safe when tcp_timestamps is enabled
net.ipv4.tcp_tw_reuse = 1
# Expand available ephemeral ports
net.ipv4.ip_local_port_range = 10000 65535
# ============================================================================
# 8. MTU DISCOVERY & FRAGMENTATION
# ============================================================================
# Enable path MTU discovery with ICMP blackhole detection
# Prevents packets being silently dropped
net.ipv4.tcp_mtu_probing = 1
# ============================================================================
# 9. TCP PERFORMANCE TUNING
# ============================================================================
# Don't reset slow-start after idle period
# Allows better recovery on re-transmission
net.ipv4.tcp_slow_start_after_idle = 0
# Auto-tuning of receive buffer (enabled by default)
net.ipv4.tcp_moderate_rcvbuf = 1
# Prevent excessive buffering (bufferbloat mitigation)
net.ipv4.tcp_notsent_lowat = 16384
# ============================================================================
# 10. CONNECTION TIMEOUT & KEEPALIVE
# ============================================================================
# TCP Keep-Alive probe interval (reduces stale connections)
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 15
# ============================================================================
# 11. SYSTEM-WIDE LIMITS
# ============================================================================
# Maximum number of open file descriptors system-wide
# Critical for C10k/C100k problem (thousands of concurrent connections)
fs.file-max = 2097152
# Increase memory for netfilter/conntrack (if using iptables)
net.nf_conntrack_max = 1000000
net.netfilter.nf_conntrack_max = 1000000
Once you are done, save and close the file.
Apply the Linux kernel tuning changes by reloading the configuration:
sudo sysctl -p /etc/sysctl.conf
Each setting that is applied is echoed back, so watch the output for errors. If you see a message like this:
sysctl: cannot stat /proc/sys/net/netfilter/nf_conntrack_max: No such file or directory
that is usually fine: the nf_conntrack module is not loaded (or your kernel does not expose that parameter), and the remaining settings are still applied.
Step 4: Enable TCP BBR Congestion Control for Linux Kernel Tuning
TCP BBR is Google’s modern congestion control algorithm designed to keep throughput high while avoiding the latency spikes you often see on high-bandwidth, high-RTT paths.
You can load the BBR module with the command below:
sudo modprobe tcp_bbr
Verify BBR is loaded:
lsmod | grep tcp_bbr
Example output:
tcp_bbr 24576 0
In the lsmod output, the first number is the module size and the last column is the usage count; either 0 or a higher number is fine.
To make BBR load automatically on boot, create a module load configuration:
echo "tcp_bbr" | sudo tee -a /etc/modules-load.d/bbr.conf
You can verify it with:
cat /etc/modules-load.d/bbr.conf
In the output, you should see:
tcp_bbr
Also, you can enable the FQ (Fair Queuing) queue discipline, which is commonly paired with BBR to reduce bufferbloat under load:
echo "sch_fq" | sudo tee -a /etc/modules-load.d/fq.conf
Step 5: Confirm BBR, FQ, and Buffers Are Active
At this point, you can validate that your kernel and TCP tuning actually took effect, rather than just being added to a config file.
Check Congestion Control with:
sysctl net.ipv4.tcp_congestion_control
Example Output:
net.ipv4.tcp_congestion_control = bbr
Check QDisc with:
sysctl net.core.default_qdisc
Example output:
net.core.default_qdisc = fq
Check TCP Buffers with:
sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem
Example output:
net.ipv4.tcp_rmem = 4096 131072 536870912
net.ipv4.tcp_wmem = 4096 131072 536870912
Verify Window Scaling:
sysctl net.ipv4.tcp_window_scaling
Example output:
net.ipv4.tcp_window_scaling = 1
See All Active Connections and Their Congestion Control:
ss -tin
Example output:
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 0 192.168.1.100:5201 192.168.1.200:60234
ESTAB 0 0 192.168.1.100:5202 192.168.1.201:60235
For a more detailed TCP state, you can use:
ss -tmin
For connections using BBR specifically, you can use:
ss -tin | grep -i bbr
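If you would rather check everything in one pass, the short script below is a sketch that compares the live values against the ones this guide sets; adjust the expected values if you chose different buffer sizes:
# One-pass verification of the key settings from this guide
for check in \
  "net.ipv4.tcp_congestion_control bbr" \
  "net.core.default_qdisc fq" \
  "net.core.rmem_max 536870912" \
  "net.core.wmem_max 536870912" \
  "net.ipv4.tcp_window_scaling 1"; do
  key=${check%% *}; want=${check#* }
  have=$(sysctl -n "$key")
  [ "$have" = "$want" ] && echo "OK   $key = $have" || echo "FAIL $key = $have (expected $want)"
done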
Step 6: Benchmark Throughput with iperf3
This step turns your tuning into measurable results by running controlled throughput tests with iperf3 between two endpoints.
Install iperf3 based on your distro with the commands below:
sudo apt update && sudo apt install iperf3 -y #Ubuntu/Debian
sudo dnf install iperf3 -y #RHEL,AlmaLinux
Then, start the iperf3 server with the command below on the machine that will receive data:
iperf3 -s -p 5201 -D
Now run the iperf3 client on a different machine (or use localhost for a loopback test), pointing it at the server's IP address:
iperf3 -c 192.168.1.100 -p 5201 -t 30 -P 4
Next, you can measure the download or reverse speed:
iperf3 -c 192.168.1.100 -p 5201 -t 30 -P 4 -R
The -R flag reverses the test direction: the server sends and the client receives.
You can stop the server with the command below:
pkill iperf3
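For easier before/after comparisons, iperf3 can also emit JSON with the -J flag, which you can parse for the achieved bitrate. The example below is a sketch that assumes jq is installed and uses the same example server IP as above:
# Run a 30-second test and print the receiver-side throughput in Gbit/s
iperf3 -c 192.168.1.100 -p 5201 -t 30 -P 4 -J | jq '.end.sum_received.bits_per_second / 1e9'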
Step 7: Monitor Network Performance in Real-Time
In this step, you can monitor your network performance in real-time, so you can confirm your tuning is delivering stable throughput without hidden issues like rising latency, queue buildup, or retransmissions.
To monitor bandwidth usage, you can use the bmon tool:
sudo apt install bmon -y
bmon -o ascii
Press h for help, q to quit.
For monitoring connections and bandwidth per process, you can use the nethogs tool:
sudo apt install nethogs -y
sudo nethogs eth0
To monitor TCP Socket Statistics, you can run:
watch -n 1 'ss -s'
To check current latency, you can run:
ping -c 20 192.168.1.100 | tail -1
Monitor TCP retransmissions with the command below:
watch -n 1 'cat /proc/net/snmp | grep Tcp'
Watch the RetransSegs counter in the Tcp line; the slower it grows, the better.
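To put that counter in context, you can compute retransmissions as a percentage of all transmitted segments. The awk one-liner below is a sketch that reads the header and value Tcp lines from /proc/net/snmp:
# Retransmitted segments as a percentage of all transmitted segments
awk '/^Tcp:/ { if (!h) { for (i = 1; i <= NF; i++) col[$i] = i; h = 1 } else printf "RetransSegs: %s of %s OutSegs (%.3f%%)\n", $col["RetransSegs"], $col["OutSegs"], 100 * $col["RetransSegs"] / $col["OutSegs"] }' /proc/net/snmp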
You can compare the benchmark results and see the improvements.
Advanced Linux Kernel Tuning for Extreme High-Performance
These optional tuning parameters focus on eliminating host-side bottlenecks that occur at 100 Gbps and above, particularly CPU power management, NUMA effects, and NIC queue limits.
For 100+ Gbps Networks:
Set CPU Governor to Performance Mode:
# Check current governor
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# Set to performance
sudo apt install linux-cpupower -y   # package name varies by distro (e.g. linux-tools-* on Ubuntu, kernel-tools on RHEL)
sudo cpupower frequency-set -g performance
# Verify
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
Disable CPU Power Saving:
# Disable C-states (power saving states)
sudo nano /etc/default/grub
# Add to GRUB_CMDLINE_LINUX line:
# idle=poll processor.max_cstate=0
sudo update-grub
sudo reboot
Enable NUMA Affinity for Multi-Socket Servers:
# Check NUMA configuration
numactl --hardware
# Bind process to specific NUMA node
numactl --cpunodebind=0 --preferred=0 ./your-application
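As a concrete example, assuming the NIC is attached to NUMA node 0, you could pin the iperf3 client from Step 6 to that node so the benchmark traffic stays on local cores and memory:
# Run the iperf3 client pinned to NUMA node 0 (adjust the node to match your NIC)
numactl --cpunodebind=0 --membind=0 iperf3 -c 192.168.1.100 -p 5201 -t 30 -P 4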
Increase RX/TX Ring Buffer:
# Check current values
ethtool -g eth0
# Increase (requires driver support)
sudo ethtool -G eth0 rx 4096 tx 4096
Disable Swap for Ultra-Low Latency:
# Check swap usage
free -h
# Disable (if not needed)
sudo swapoff -a
# Permanent: comment out swap line in /etc/fstab
sudo nano /etc/fstab
# Comment: #/swapfile none swap sw 0 0
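After disabling swap, a quick check confirms nothing is still in use; swapon prints nothing when no swap device is active:
# Verify swap is fully off
swapon --show
free -h | grep -i swap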
FAQs
How do I confirm BBR is enabled for Linux kernel tuning?
Check the configured congestion control via sysctl, then confirm active connections using ss -tin.
Why set fq as the default queue discipline?
FQ helps reduce latency spikes and improve fairness between flows. It’s also commonly recommended alongside BBR because it enables pacing behavior that enhances performance and stability.
Is it safe to enable tcp_tw_reuse?
Enabling tcp_tw_reuse can help hosts that open many outbound connections (it only affects outgoing connections), but it still needs careful testing in your setup. If the system is mainly a server handling incoming connections, prioritize increasing the connection backlog and tuning application keep-alives first, rather than depending on TIME_WAIT reuse.
Conclusion
Linux kernel network tuning is most effective when it is done methodically: measure a baseline, apply a small set of proven kernel changes, verify they actually took effect, and then re-test under real load.
The best improvements usually come from three things: using the right congestion control (often BBR), pairing it with a matching queue setup (like FQ), and setting TCP buffer sizes based on your actual bandwidth and latency (BDP).
If you’re running high-traffic services, these kernel-level optimizations can remove hidden bottlenecks and make performance more stable.
We hope you enjoy this Linux kernel tuning guide. Subscribe to our X and Facebook channels to get the latest updates and articles.
For further reading:
Setting up remote management with IPMI
Building a High-Availability Cluster with Corosync and Pacemaker