Today, we’re launching the new Cloudflare Pipelines: a streaming data platform that ingests events, transforms them with SQL, and writes to R2 as Apache Iceberg tables or Parquet files.
Pipelines can receive events via HTTP endpoints or Worker bindings, transform them with SQL, and deliver them to R2 with exactly-once guarantees. This makes it easy to build analytics-ready warehouses for server logs, mobile application events, IoT telemetry, or clickstream data without managing streaming infrastructure.
For example, here’s a pipeline that ingests clickstream events and filters out bot traffic while extracting domain information:
INSERT INTO events_table
SELECT
  user_id,
  lower(event) AS event_type,
  to_timestamp_micros(ts_us) AS event_time,
  regexp_match(url, '^https?://([^/]+)')[1] AS domain,
  url,
  referrer,
  user_agent
FROM events_json
WHERE event = 'page_view'
  AND NOT regexp_like(user_agent, '(?i)bot|spider');
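To make the effect of that query concrete, here is a rough local sketch in Python that mimics the same logic on a few made-up events. It is not how Pipelines executes SQL; the `transform` function and the sample events are hypothetical, and it assumes that `[1]` selects the first capture group, that `regexp_like` matches anywhere in the string, and that a row with a NULL `user_agent` fails the `WHERE` clause.

```python
import re
from datetime import datetime, timedelta, timezone

# Same patterns as the SQL above; both are searched, not full-matched (assumption).
BOT_PATTERN = re.compile(r"(?i)bot|spider")
DOMAIN_PATTERN = re.compile(r"^https?://([^/]+)")
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def transform(event):
    """Return the output row for one input event, or None if it is filtered out."""
    # WHERE event = 'page_view'
    if event.get("event") != "page_view":
        return None
    # AND NOT regexp_like(user_agent, ...); a missing user_agent is treated as
    # NULL, which is assumed to fail the predicate.
    user_agent = event.get("user_agent")
    if user_agent is None or BOT_PATTERN.search(user_agent):
        return None
    url = event.get("url")
    match = DOMAIN_PATTERN.search(url) if url is not None else None
    return {
        "user_id": event.get("user_id"),
        "event_type": event["event"].lower(),
        # to_timestamp_micros: microseconds since the Unix epoch
        "event_time": EPOCH + timedelta(microseconds=event["ts_us"]),
        "domain": match.group(1) if match else None,
        "url": url,
        "referrer": event.get("referrer"),
        "user_agent": user_agent,
    }


# Hypothetical sample events, for illustration only.
SAMPLE_EVENTS = [
    {"user_id": "u1", "event": "page_view", "ts_us": 1_700_000_000_000_000,
     "url": "https://example.com/pricing", "referrer": None,
     "user_agent": "Mozilla/5.0"},
    {"user_id": "u2", "event": "page_view", "ts_us": 1_700_000_000_500_000,
     "url": "https://example.com/", "referrer": None,
     "user_agent": "Googlebot/2.1"},
    {"user_id": "u3", "event": "click", "ts_us": 1_700_000_001_000_000,
     "url": "https://example.com/signup", "referrer": None,
     "user_agent": "Mozilla/5.0"},
]

rows = [row for row in map(transform, SAMPLE_EVENTS) if row is not None]
```

In this sketch only the first event survives: the second is dropped because its user agent matches the bot pattern, and the third because it is not a `page_view`.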
Get started by creating a pipeline in the dashboard or running a single command in Wrangler:
npx wrangler pipelines setup
Check out our getting started guide to learn how to create a pipeline that delivers events to an Iceberg table you can query with R2 SQL. Read more about today’s announcement in our blog post.
Source: Cloudflare