Flow PHP
Flow is a strongly typed, memory-efficient data processing framework for PHP. It gives you a single fluent API to read, transform, and write data across CSV, JSON, XML, Parquet, REST, RDBMS, Elasticsearch, and more — without per-format boilerplate.
If you build pipelines, ETL jobs, exports, imports, or reporting in PHP and you've outgrown one-off scripts, you're in the right place.
Hello, Flow
Install the core and the adapters you need:
composer require flow-php/etl flow-php/etl-adapter-csv flow-php/etl-adapter-json
Then write your first pipeline:
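<?php

require __DIR__ . '/vendor/autoload.php'; // Composer autoloader; adjust the path to your project layout

use function Flow\ETL\DSL\{data_frame, ref};
use function Flow\ETL\Adapter\CSV\from_csv;
use function Flow\ETL\Adapter\JSON\to_json;

data_frame()
    ->read(from_csv('orders.csv'))          // extract rows from the CSV file
    ->filter(ref('amount')->isNotNull())    // keep rows that have an amount
    ->write(to_json('orders.json'))         // load the result into a JSON file
    ->run();                                // execute the pipeline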
That's a complete program. The pipeline reads top-to-bottom and streams rows in constant memory regardless of input size.
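To make the constant-memory claim concrete, here is a plain-PHP sketch of the same read, filter, write shape built on generators. It is not Flow's implementation, only an illustration of why a row-at-a-time pipeline does not need to hold the whole input. The function names (`read_rows`, `keep_with_amount`, `write_json_lines`) are hypothetical, and the output is one JSON object per line rather than whatever `to_json` produces.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical extractor: yields one associative row at a time from a CSV file.
 * Only the current line is held in memory.
 */
function read_rows(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$path}");
    }

    try {
        $header = fgetcsv($handle, null, ',', '"', '\\');
        if ($header === false) {
            return; // empty file: nothing to yield
        }

        while (($line = fgetcsv($handle, null, ',', '"', '\\')) !== false) {
            if ($line === [null]) {
                continue; // fgetcsv reports a blank line as [null]
            }
            // array_combine throws if a line has a different column count than the header.
            yield array_combine($header, $line);
        }
    } finally {
        fclose($handle);
    }
}

/**
 * Hypothetical transform: passes through rows whose "amount" column is present
 * and non-empty. CSV has no null, so an empty string stands in for "missing".
 */
function keep_with_amount(iterable $rows): Generator
{
    foreach ($rows as $row) {
        if (($row['amount'] ?? '') !== '') {
            yield $row;
        }
    }
}

/**
 * Hypothetical loader: writes each row as one JSON line and returns the row count.
 * Pulling from the generator chain here is what drives the whole pipeline.
 */
function write_json_lines(iterable $rows, string $path): int
{
    $out = fopen($path, 'w');
    if ($out === false) {
        throw new RuntimeException("Cannot open {$path} for writing");
    }

    $count = 0;
    try {
        foreach ($rows as $row) {
            fwrite($out, json_encode($row, JSON_THROW_ON_ERROR) . "\n");
            $count++;
        }
    } finally {
        fclose($out);
    }

    return $count;
}

// Usage: write_json_lines(keep_with_amount(read_rows('orders.csv')), 'orders.jsonl');
```

Nothing is read until the loader starts iterating, and each row is discarded once it has been written. That is the property the Flow pipeline above relies on, independent of how Flow implements it internally.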
Mental model
Every Flow pipeline has three stages:
- Extract: ->read(...) pulls rows from a source (file, API, database).
- Transform: ->filter(), ->withEntry(), ->map(), ->join(), ->groupBy(), ->window(), and friends shape the rows.
- Load: ->write(...) streams the result into a sink; ->run() executes the whole thing.
Data moves through the pipeline as DataFrame → Rows → Row → Entry, with every value strongly typed.
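The three stages can be sketched in one small pipeline. This example assumes that `from_array` and `lit` are available in `Flow\ETL\DSL`, that `greaterThan()` and `multiply()` can be chained on `ref()`, and that `fetch()` returns the resulting rows in memory instead of writing them to a sink. None of these appear on this page, so check the names against the DSL Reference before relying on them. The sample data and the `amount_doubled` entry name are made up for illustration.

```php
<?php

declare(strict_types=1);

require __DIR__ . '/vendor/autoload.php'; // Composer autoloader; adjust the path to your project

use function Flow\ETL\DSL\{data_frame, from_array, lit, ref};

$rows = data_frame()
    // Extract: an in-memory source keeps the sketch free of fixture files.
    ->read(from_array([
        ['id' => 1, 'amount' => 50],
        ['id' => 2, 'amount' => 150],
        ['id' => 3, 'amount' => 200],
    ]))
    // Transform: keep the larger orders, then derive a new entry from an existing one.
    ->filter(ref('amount')->greaterThan(lit(100)))
    ->withEntry('amount_doubled', ref('amount')->multiply(lit(2)))
    // Load: fetch() collects the result in memory; a real job would use ->write(...)->run().
    ->fetch();

// Each element is one row converted to a plain associative array.
print_r($rows->toArray());
```

Collecting rows with `fetch()` is convenient for tests and small results. For large inputs, prefer `->write(...)->run()` so rows keep streaming to the sink.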
Where to go from here
New to Flow
- Installation — Composer setup and runtime tips.
- Quick Start — From empty project to working pipeline in five minutes.
Learn the API
- Data Frame — the core: filter, map, join, group by, window, partition, sort, limit.
- DSL Reference — every function in the DSL, with signatures and examples.
Working with a specific source or sink
- Browse Adapters — CSV, JSON, XML, Parquet, Avro, Excel, HTTP, PostgreSQL, Doctrine, Elasticsearch, ChartJS.
See it running
- Examples — runnable snippets grouped by topic and format. Open any example in the Playground.
Going to production
What Flow isn't
Flow is not a database engine, a query planner, or a distributed compute framework. It runs inside any PHP runtime — CLI, FPM, queue worker — and processes rows in constant memory. If you need cluster-scale joins or SQL planning, reach for Spark or Trino. If you need ergonomic, type-safe data pipelines inside a PHP application, you're in the right place.