
How to Fix OpenSearch Flood Stage and Disk Watermark Errors

OpenSearch disk watermark errors occur when nodes run critically low on disk space, triggering protective mechanisms that can freeze your cluster and block write operations. Understanding how these watermarks work and how to recover from them is essential for maintaining cluster stability and preventing data loss.

In this guide, we will troubleshoot and resolve OpenSearch disk watermark errors. You will learn exactly why your cluster entered a defensive read-only state and how to use the correct API commands to release these blocks and restore write operations.

Whether you are running a self-managed cluster or high-performance instances on Perlod Hosting, understanding how these disk watermark settings work in OpenSearch is essential.

Understand OpenSearch Watermark Thresholds and Why They Freeze Clusters

OpenSearch uses three disk watermark levels to protect cluster stability as disk space decreases. The three watermark levels are:

  1. The low disk watermark (default 85%) is triggered when a node's disk usage reaches this threshold; OpenSearch then stops allocating new shards to that node.
  2. The high disk watermark (default 90%) activates more aggressive protection by attempting to relocate existing shards away from the affected node to other nodes with available space.
  3. The flood stage watermark (default 95%) represents the final protection level where OpenSearch blocks all write operations to indexes with any shard on the affected node.
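The three levels above can be summarized as a simple classification of a node's disk usage. The sketch below is not part of OpenSearch itself; it just mirrors how the default percentage thresholds map usage to a protection stage:

```python
# Default OpenSearch watermark thresholds, in percent of disk used.
LOW, HIGH, FLOOD = 85.0, 90.0, 95.0

def watermark_stage(used_bytes: int, total_bytes: int) -> str:
    """Return which watermark a node has crossed, if any."""
    used_pct = used_bytes / total_bytes * 100
    if used_pct >= FLOOD:
        return "flood_stage"   # writes blocked on affected indexes
    if used_pct >= HIGH:
        return "high"          # shards relocated away from the node
    if used_pct >= LOW:
        return "low"           # no new shards allocated to the node
    return "ok"

print(watermark_stage(96, 100))  # flood_stage
```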

When the flood stage watermark is reached, OpenSearch enforces a read-only index block, index.blocks.read_only_allow_delete, on every index with shards on the affected node. This automatic protection prevents complete disk exhaustion, which could cause node failures and potential data corruption. Write operations are blocked cluster-wide for affected indexes, though read operations and searches continue to function normally. The cluster remains operational but cannot accept new data until disk space is freed or the watermark thresholds are adjusted.

To verify disk space on each node, you can run the command below:

GET _nodes/stats/fs

This displays detailed filesystem statistics, including available space, total space, and disk usage percentages for every node in your cluster.
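If you want to turn that response into a quick per-node usage report, a hedged sketch is shown below. The field names (`fs.total.total_in_bytes`, `fs.total.available_in_bytes`) match the real `_nodes/stats/fs` response, but the sample payload is abbreviated and its values are made up:

```python
# Abbreviated sample of a GET _nodes/stats/fs response (values invented).
sample = {
    "nodes": {
        "node_a": {
            "name": "data-1",
            "fs": {"total": {"total_in_bytes": 100_000_000_000,
                             "available_in_bytes": 4_000_000_000}},
        }
    }
}

def disk_usage_percent(stats: dict) -> dict:
    """Map node name -> percent of disk used, from _nodes/stats/fs output."""
    out = {}
    for node in stats["nodes"].values():
        total = node["fs"]["total"]["total_in_bytes"]
        free = node["fs"]["total"]["available_in_bytes"]
        out[node["name"]] = round((total - free) / total * 100, 1)
    return out

print(disk_usage_percent(sample))  # {'data-1': 96.0}
```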

To check your current watermark settings and verify if any nodes have triggered watermark alerts, run the command below:

GET _cluster/settings

This displays all cluster-level settings, including the configured disk watermark thresholds and their current values.

Recover from OpenSearch Disk Watermark Errors

If you are seeing read-only errors or your logs are warning about a flood stage, you have triggered a disk watermark. To recover, follow the actions below.

1. Immediate Recovery Actions: The fastest way to restore write operations is to free up disk space by deleting old or unnecessary indexes. You can delete an index using the command below, replacing index-name with your actual index name:

DELETE /index-name

After freeing sufficient disk space, you must manually remove the read-only block that OpenSearch applied during the flood stage.

To remove the block from all affected indexes, you can use:

PUT */_settings?expand_wildcards=all
{
  "index.blocks.read_only_allow_delete": null
}

The expand_wildcards=all parameter ensures the setting applies to all indexes, including closed ones, and setting the value to null removes the block entirely.

2. Adjust Watermark Thresholds: If you need immediate relief and cannot delete data, you can temporarily increase the watermark thresholds.

You can use the command below to modify watermarks using percentage values:

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.low": "90%",
    "cluster.routing.allocation.disk.watermark.high": "95%",
    "cluster.routing.allocation.disk.watermark.flood_stage": "97%",
    "cluster.info.update.interval": "1m"
  }
}

The transient parameter means these settings will reset after a cluster restart, while cluster.info.update.interval controls how frequently OpenSearch checks disk usage.

Alternatively, you can set watermarks using fixed byte values instead of percentages:

PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "25gb",
    "cluster.routing.allocation.disk.watermark.high": "15gb",
    "cluster.routing.allocation.disk.watermark.flood_stage": "10gb"
  }
}

Using persistent instead of transient preserves these settings across cluster restarts, and fixed byte values prevent percentage-based warnings on large disks.
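Note that byte-based watermarks specify the free space remaining, not the space used, so the equivalent used-percentage depends on disk size. The small sketch below illustrates why a fixed byte value behaves differently on a large disk:

```python
# Byte-based watermarks trigger when *free space* drops below the value,
# so the equivalent used-percentage grows with disk size.
GB = 1024**3

def byte_watermark_as_used_percent(free_bytes_required: int, disk_bytes: int) -> float:
    """Used-space percentage at which a byte-based watermark triggers."""
    return round((disk_bytes - free_bytes_required) / disk_bytes * 100, 1)

# A 10gb flood-stage watermark on a 1 TB disk triggers at ~99% used,
# whereas a 95% threshold on the same disk would leave ~51 GB unused.
print(byte_watermark_as_used_percent(10 * GB, 1024 * GB))  # 99.0
```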

3. Verify Cluster Recovery: You can check whether your cluster is rebalancing shards away from full nodes by monitoring cluster health:

GET _cluster/health

This command gives you a quick health check and shows if data is currently moving to safe nodes. If you have space but data isn’t moving, it is usually because:

  • The target nodes are actually too full: They might already be above the 85% safety limit.
  • Rules are blocking the move: Settings like Zone Awareness might prevent data from moving to certain servers.
  • The nodes already have that data: A node cannot hold two copies of the same shard, so if the target node already holds a copy, it refuses the move.

How to Prevent OpenSearch Disk Watermark Errors?

Fixing a frozen cluster is good, but never letting it freeze in the first place is better. The best way to avoid late-night emergencies is to manage your storage before it hits the critical 95% mark. In this section, you can explore key steps to keep your OpenSearch cluster healthy.

1. Long-Term Storage Management: Don’t delete old data manually; let OpenSearch do it for you. You can set up Index State Management (ISM) to automatically delete indexes once they get too old, for example, delete logs after 30 days.

To make this easier, name your indexes with dates like logs-2026-01-26. This allows the system to instantly recognize which files are old and safely remove them based on the rules you set.
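The retention logic that ISM automates can be sketched in a few lines. This is an illustration only: the `logs-YYYY-MM-DD` naming pattern is the convention suggested above, not an OpenSearch requirement, and real deletion should be done by an ISM policy, not a script:

```python
from datetime import date, timedelta

def is_expired(index_name: str, today: date, retention_days: int = 30) -> bool:
    """True if the date embedded in the index name is older than retention."""
    # "logs-2026-01-26" -> "2026-01-26"
    _, datestr = index_name.split("-", 1)
    y, m, d = (int(part) for part in datestr.split("-"))
    return today - date(y, m, d) > timedelta(days=retention_days)

print(is_expired("logs-2026-01-26", today=date(2026, 3, 1)))  # True
```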

2. Capacity Planning Strategies: For heavy workloads, running on a Dedicated Server ensures you have the stable speed and storage space you need to grow. Don’t wait for the disk full error; add new nodes to your cluster before you hit the limit so OpenSearch can spread the data out safely.

You can also use tools like Terraform to automatically add more space the moment your monitoring alerts go off.

While you ensure sufficient storage to avoid watermarks, you must also configure your RAM correctly to prevent OutOfMemory crashes. For a complete configuration strategy, you can check this guide on OpenSearch Heap Sizing Best Practices.

3. Monitoring and Alerting: You need to know you’re running low on space before it becomes a problem. Set up alerts to notify you the moment disk usage hits 85%.

Also, make sure OpenSearch checks disk space frequently, for example, every minute by setting cluster.info.update.interval to 1m. Finally, move old data to snapshots. This lets you safely delete it from your live servers to free up space, knowing you can always restore it later if needed.

4. Optimizing Shard Allocation: When OpenSearch moves data to fix a full disk, it moves shards. You can control how many shards move at once by adjusting the cluster.routing.allocation.node_concurrent_incoming_recoveries and node_concurrent_outgoing_recoveries settings (default is 2 each).

Make sure your Low Watermark leaves enough space for these moves. For example, if your biggest shard is 50GB and you move 2 at a time, you need at least 150GB of free space (3x shard size) to be safe. This gives the system enough buffer room to shuffle data around without immediately running out of space again.
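The buffer rule above is simple arithmetic: with N concurrent moves of your largest shard, keep roughly one extra shard's worth of headroom, i.e. about (N + 1) times the largest shard size free:

```python
def required_free_gb(largest_shard_gb: float, concurrent_recoveries: int = 2) -> float:
    """Free space to keep below the low watermark for safe rebalancing:
    one slot per concurrent shard move, plus one shard of headroom."""
    return largest_shard_gb * (concurrent_recoveries + 1)

print(required_free_gb(50))  # 150.0, matching the 3x rule of thumb above
```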

FAQs

What is the default OpenSearch disk watermark limit?

OpenSearch uses three default limits:
Low (85%): Stops putting new data on the node.
High (90%): Starts moving existing data to other nodes.
Flood Stage (95%): Locks the cluster to read-only to prevent crashing.

Can I disable disk watermarks in OpenSearch?

Yes, but it is dangerous. Without watermarks, your disk can hit 100% and crash the server, which causes data loss. It is safer to adjust the limits or add more storage instead.

Why is my OpenSearch cluster still read-only after I deleted data?

Freeing space doesn’t automatically unlock your data. You must manually run the unlock command (setting read_only_allow_delete to null) to tell OpenSearch it is safe to write again.

Conclusion

Disk watermark errors are frustrating, but they exist to save your data from total corruption. They are the emergency brakes that stop your cluster before it crashes. By understanding the three stages of watermarks (Low, High, and Flood Stage), you can catch these issues early. Remember: do not wait for 95%; set up monitoring alerts at 85%, use automated cleanup policies (ISM), and ensure your infrastructure has room to grow.

If you constantly encounter storage limits, it might be time to upgrade your infrastructure. Whether you need optimized storage configurations or powerful dedicated servers, Perlod provides the high-performance foundation your OpenSearch cluster needs to run smoothly without interruptions.

We hope you enjoy this guide. Subscribe to our X and Facebook channels to get the latest updates and articles.
