Patterns
35 items
Optimized storage for time-stamped data
Time-series storage is optimized for data points with timestamps: metrics, sensor readings, and financial data. The key insight is that time-series data has distinctive properties: it arrives in time order, is immutable once written, is queried mostly over recent time ranges, and loses value as it ages, so old data can be downsampled or deleted. These properties enable specialized techniques: columnar compression (10-100x), time-based partitioning for fast range queries, downsampling of old data, and retention policies. Systems like InfluxDB, Prometheus, and TimescaleDB achieve 10-100x better compression and up to 100x faster range queries than general-purpose databases.
Unlike general databases where time is just another column, time-series databases treat time as a primary index. Data is organized, partitioned, and compressed by time, enabling fast range queries and automatic retention.
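To illustrate time as the primary index, a minimal time-partitioned store might bucket points by hour, so a range query only touches the partitions that overlap it. This is a sketch, not any real system's API; the `PartitionedStore` name and the one-hour partition size are assumptions.

```python
from collections import defaultdict

PARTITION_SECONDS = 3600  # assumption: one partition per hour

class PartitionedStore:
    def __init__(self):
        # partition key -> list of (timestamp, value), appended in arrival order
        self.partitions = defaultdict(list)

    def write(self, ts, value):
        # Time is the primary index: the partition key is derived from the timestamp.
        self.partitions[ts // PARTITION_SECONDS].append((ts, value))

    def range_query(self, start, end):
        # Only scan partitions overlapping [start, end); all others are skipped.
        out = []
        for key in range(start // PARTITION_SECONDS, end // PARTITION_SECONDS + 1):
            for ts, v in self.partitions.get(key, []):
                if start <= ts < end:
                    out.append((ts, v))
        return out
```

Retention also falls out of this layout: deleting a day of old data is dropping 24 partition keys, not scanning and deleting individual rows.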
Time-series data is immutable - once a metric is recorded at a timestamp, it never changes. This enables aggressive compression and append-only storage without MVCC complexity.
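The immutability property can be sketched as an append-only store that seals full blocks with whole-block compression; sealed blocks are never modified, so no MVCC bookkeeping is needed. The `AppendOnlySeries` name and the 1000-point seal threshold are illustrative assumptions.

```python
import struct
import zlib

class AppendOnlySeries:
    """Append-only storage: points are buffered, then sealed into a
    compressed block that is never modified afterwards."""
    BLOCK_SIZE = 1000  # assumption: seal after this many points

    def __init__(self):
        self.sealed = []   # list of zlib-compressed, immutable blocks
        self.buffer = []   # open block still accepting appends

    def append(self, ts, value):
        # 16 bytes per point: int64 timestamp + float64 value
        self.buffer.append(struct.pack("<qd", ts, value))
        if len(self.buffer) >= self.BLOCK_SIZE:
            self.sealed.append(zlib.compress(b"".join(self.buffer)))
            self.buffer = []

    def scan(self):
        # Read sealed blocks first (oldest data), then the open buffer.
        for block in self.sealed:
            data = zlib.decompress(block)
            for i in range(0, len(data), 16):
                yield struct.unpack("<qd", data[i:i + 16])
        for raw in self.buffer:
            yield struct.unpack("<qd", raw)
```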
Metrics change slowly over time (CPU 45%, 46%, 47%). Storing consecutive values in columnar format enables delta encoding, run-length encoding, and compression ratios of 10-100x.
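A toy version of this pipeline, delta encoding followed by run-length encoding, shows why slowly changing metrics compress so well (function names are illustrative):

```python
def delta_encode(values):
    # Store the first value, then successive differences; slowly changing
    # metrics produce small, highly repetitive deltas.
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def run_length_encode(seq):
    # Collapse runs of identical deltas into [value, count] pairs.
    out = []
    for x in seq:
        if out and out[-1][0] == x:
            out[-1][1] += 1
        else:
            out.append([x, 1])
    return out

# A mostly flat CPU metric: 45, 46, 47, then 47 repeated.
cpu = [45, 46, 47] + [47] * 97
encoded = run_length_encode(delta_encode(cpu))
# 100 samples collapse to 3 run-length pairs: [[45, 1], [1, 2], [0, 97]]
```

Production formats (e.g. Gorilla-style XOR encoding for floats, delta-of-delta for timestamps) are more sophisticated, but the principle is the same: encode what changed, not the full value.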
Consider storing application metrics in PostgreSQL:
Scenario: 1000 servers × 100 metrics per server × 1 sample/second = 100K writes/second
Problems with row-oriented databases at this rate:
- Per-row overhead: PostgreSQL adds roughly 23 bytes of tuple header to every point, several times the size of a (timestamp, value) pair itself.
- Index write amplification: every insert also updates a B-tree index on the timestamp, multiplying I/O at 100K writes/second.
- Poor compression: similar consecutive values are interleaved with other columns on disk, so generic page compression gains little.
Why this fails: the database pays for update and delete machinery (MVCC, vacuuming) that immutable time-series data never uses, while missing the compression and partitioning wins that time ordering makes possible.
Time-series data has unique properties that general-purpose databases don't exploit: it arrives in time order, is immutable, comes at high write rates, is queried via range scans, and loses value as it ages. Storage specialized around these properties can achieve 10-100x better compression and up to 100x faster range queries.
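As a sketch of the "old data is less valuable" property, raw points past a retention cutoff can be replaced by per-bucket averages. The `downsample` helper and the 60-second bucket size are assumptions for illustration, not part of any named system.

```python
from collections import defaultdict

def downsample(points, bucket_seconds=60):
    """Replace raw (timestamp, value) points with one average per
    time bucket, trading precision for storage on aged-out data."""
    buckets = defaultdict(list)
    for ts, v in points:
        buckets[ts - ts % bucket_seconds].append(v)
    return sorted((ts, sum(vs) / len(vs)) for ts, vs in buckets.items())
```

A retention policy then becomes a background job: downsample points older than, say, 30 days, and delete the raw partitions they came from.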