Managing Storage: How to Track Directory Sizes Over Time

Written by

in

Managing Storage: How to Track Directory Sizes Over Time Data expansion happens silently. A server running smoothly today can easily run out of disk space tomorrow due to runaway logs, expanding databases, or unmonitored user uploads. Standard tools like du (disk usage) tell you what your storage looks like right now, but they fail to show the rate of growth. To prevent unexpected downtime, you must track directory sizes over time.

Implementing a historical storage tracking system allows you to forecast capacity needs, detect anomalous data spikes, and find the exact root cause of storage issues. The Core Strategy: Capture, Store, and Analyze

Building a tracking system follows a simple three-step pipeline:

[ 1. Capture ] Matrix baseline using CLI tools (du, df) │ ▼ [ 2. Store ] Append time-stamped data to a log, database, or TSDB │ ▼ [ 3. Analyze ] Visualize trends and calculate growth velocity

Capture: Run a scheduled script to measure specific target folders.

Store: Save these snapshots with a timestamp into a flat-file, database, or time-series platform.

Analyze: Compare the current snapshot against historical data to calculate growth velocity. Method 1: The Lightweight Cron and Log Approach

For a single server or a quick setup, you do not need complex software. A simple Bash script combined with a cron job provides immediate visibility. 1. Write the Collection Script

Create a script named track_size.sh. This script records the size of your target directories in Megabytes, paired with an ISO-8601 timestamp.

#!/bin/bash TARGET_DIR=“/var/www/html” LOG_FILE=“/var/log/directory_growth.log” TIMESTAMP=\((date -Iseconds) # Get size in MB DIR_SIZE=\)(du -sm “\(TARGET_DIR" | cut -f1) # Append to log file echo "\)TIMESTAMP, \(TARGET_DIR, \){DIR_SIZE}MB” >> “\(LOG_FILE" </code> Use code with caution. 2. Automate with Cron</p> <p>Schedule this script to run daily. Open your crontab configuration (<code>crontab -e</code>) and add the following line to execute the script every night at midnight: <code>0 0/usr/local/bin/track_size.sh </code> Use code with caution. 3. Analyze the Results Your log file will accumulate clean, parseable entries:</p> <p><code>2026-06-01T00:00:00-04:00, /var/www/html, 1240MB 2026-06-02T00:00:00-04:00, /var/www/html, 1245MB 2026-06-03T00:00:00-04:00, /var/www/html, 1580MB </code> Use code with caution.</p> <p>A sudden jump from June 2nd to June 3rd immediately alerts you to investigate that specific 24-hour window. Method 2: Open-Source CLI Utilities</p> <p>If you prefer a pre-built terminal solution that handles historical tracking natively, several open-source tools excel at this task.</p> <p><strong>ncdu (NCurses Disk Usage):</strong> While primarily used for real-time scanning, <code>ncdu</code> allows you to export your scanning results to a JSON file: <code>ncdu -1xo- / > /var/log/ncdu/scan_\)(date +%F).json Use code with caution.

You can later import and compare these files within ncdu to visually browse what changed between dates.

dirdiff: A specialized utility to compare the structure and contents of two directories over time, helping you identify exactly which files were added or modified. Method 3: Enterprise Monitoring (Prometheus & Grafana)

For production environments managing multiple servers, manual text logs become unmanageable. You need a centralized time-series database.

┌─────────────────┐ ┌────────────┐ ┌───────────┐ │ Linux Server │ ───> │ Prometheus │ ───> │ Grafana │ │ (node_exporter) │ │ (TSDB) │ │(Dashboard)│ └─────────────────┘ └────────────┘ └───────────┘ 1. Deploy Prometheus Node Exporter

The Prometheus node_exporter runs on your servers and natively exposes system-level disk metrics (node_filesystem_size_bytes and node_filesystem_free_bytes). 2. Track Custom Directories

To track specific directories rather than entire disks, use the Textfile Collector feature in node_exporter. Have a cron job write directory sizes directly to a .prom file:

echo “directory_size_bytes{path=”/data”} $(du -sb /data | cut -f1)” > /var/lib/node_exporter/textfile_collector/dir_size.prom Use code with caution. 3. Build a Grafana Dashboard

Once Prometheus scrapes this metric, you can plot it over 7-day, 30-day, or 90-day intervals in Grafana. Use a PromQL query to calculate the exact daily growth velocity: deriv(directory_size_bytes{path=“/data”}[1d]) Use code with caution.

This setup allows you to create alerting rules that trigger an email, Slack, or PagerDuty notification if a directory grows by more than 20% in a single day. Best Practices for Long-Term Storage Management

Exclude Cache and Temporary Folders: Do not waste resources tracking folders like /tmp, .npm, or application cache directories. Use the –exclude flag in your collection scripts.

Monitor Inodes, Not Just Bytes: A directory can run out of space if it contains millions of tiny files, even if the total disk size in Gigabytes is low. Always track inode consumption (df -i).

Enforce Log Rotation: If you use flat-file logging (Method 1), ensure you set up logrotate to compress and archive your historical logs so the tracking system itself doesn’t cause a storage crisis.

To help tailor this approach to your environment, let me know:

What operating system (Linux, Windows, macOS) are your servers running?

How many servers or directories do you need to track simultaneously?

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *