R2 Data Catalog now supports automatic snapshot expiration for Apache Iceberg tables.
In Apache Iceberg, a snapshot is metadata that represents the state of a table at a given point in time. Every mutation creates a new snapshot which enable powerful features like time travel queries and rollback capabilities but will accumulate over time.
Without regular cleanup, these accumulated snapshots can lead to:
- Metadata overhead
- Slower table operations
- Increased storage costs.
Snapshot expiration in R2 Data Catalog automatically removes old table snapshots based on your configured retention policy, improving performance and storage costs.
# Enable catalog-level snapshot expiration# Expire snapshots older than 7 days, always retain at least 10 recent snapshotsnpx wrangler r2 bucket catalog snapshot-expiration enable my-bucket \ --older-than-days 7 \ --retain-last 10Snapshot expiration uses two parameters to determine which snapshots to remove:
--older-than-days: age threshold in days--retain-last: minimum snapshot count to retain
Both conditions must be met before a snapshot is expired, ensuring you always retain recent snapshots even if they exceed the age threshold.
This feature complements automatic compaction, which optimizes query performance by combining small data files into larger ones. Together, these automatic maintenance operations keep your Iceberg tables performant and cost-efficient without manual intervention.
To learn more about snapshot expiration and how to configure it, visit our table maintenance documentation or see how to manage catalogs.
Source: Cloudflare




