UNIFIED DATA PROCESSING FRAMEWORK

composer require flow-php/etl ~0.19.0

Extracts

Read from various data sources.

Transforms

Shape and optimize for your needs.

Loads

Store and secure in one of many available data sinks.

Examples:
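
To make the three stages concrete, here is a minimal sketch in plain PHP (8.0+). It does not use Flow's API: the function names `extractRows`, `transformRows`, and `loadRows`, the sample orders, and the 20% markup are all assumptions made up for illustration. It only shows how an extract, a transform, and a load step can be chained lazily.

```php
<?php

declare(strict_types=1);

/** Extract: yields rows one at a time from an in-memory source (hypothetical helper). */
function extractRows(array $source): \Generator
{
    foreach ($source as $row) {
        yield $row;
    }
}

/** Transform: lazily applies a callable to every row (hypothetical helper). */
function transformRows(iterable $rows, callable $fn): \Generator
{
    foreach ($rows as $row) {
        yield $fn($row);
    }
}

/**
 * Load: writes each row as one JSON line to an open stream
 * and returns the number of rows written (hypothetical helper).
 *
 * @param resource $stream
 */
function loadRows(iterable $rows, $stream): int
{
    $written = 0;
    foreach ($rows as $row) {
        fwrite($stream, json_encode($row, JSON_THROW_ON_ERROR) . "\n");
        $written++;
    }

    return $written;
}

// Sample data, invented for this sketch. Amounts are integer cents.
$orders = [
    ['id' => 1, 'net_cents' => 10000],
    ['id' => 2, 'net_cents' => 25000],
];

// Nothing is read or transformed until the load step starts iterating.
$withGross = transformRows(
    extractRows($orders),
    fn (array $row): array => $row + ['gross_cents' => intdiv($row['net_cents'] * 120, 100)]
);

$sink = fopen('php://memory', 'w+b');
$written = loadRows($withGross, $sink);

rewind($sink);
$output = stream_get_contents($sink);
fclose($sink);
```

Because each stage pulls rows from the previous one only on demand, only one row is in flight at a time, regardless of how large the source is.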

Data Processing Made Easy

One of the most significant challenges in data processing lies in maintaining consistency, particularly in languages as flexible as PHP. Instead of requiring custom code for each dataset or integration, Flow PHP offers a uniform API for all data sources.

Whether you're reading a CSV file or consuming data from a REST API, the code that processes the data looks the same.

It will not only make your codebase more consistent but also ensure that your system processes data in a memory-efficient way out of the box.

Unified, Strongly Typed API

Flow PHP offers a unified, strongly typed API for all data sources, including:

CSV,
JSON,
XML,
Text,
Parquet,
Avro,
REST API,
RDBMS,
Elasticsearch / Meilisearch

Flow not only enables you to process various data sources consistently, but also strives to accurately detect data types and cast them to the appropriate PHP types.

Even when reading schemaless formats like CSV, you can either predefine a schema that Flow uses to cast the data, or let Flow attempt to infer the schema from the data itself.
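
The difference between a predefined and an inferred schema can be sketched in plain PHP (8.0+). This is not Flow's implementation or API: `castValue`, `inferType`, `castRow`, and the sample row are hypothetical, and inference here looks at a single value, whereas a real implementation would likely examine many rows per column.

```php
<?php

declare(strict_types=1);

/** Casts one raw string to the PHP type named by $type (hypothetical helper). */
function castValue(string $value, string $type): int|float|bool|string
{
    return match ($type) {
        'int' => (int) $value,
        'float' => (float) $value,
        'bool' => in_array(strtolower($value), ['1', 'true', 'yes'], true),
        default => $value,
    };
}

/** Guesses a type name from a single raw string (hypothetical helper). */
function inferType(string $value): string
{
    if (filter_var($value, FILTER_VALIDATE_INT) !== false) {
        return 'int';
    }
    if (filter_var($value, FILTER_VALIDATE_FLOAT) !== false) {
        return 'float';
    }
    if (in_array(strtolower($value), ['true', 'false'], true)) {
        return 'bool';
    }

    return 'string';
}

/**
 * Casts every column of a row. Columns listed in $schema use the declared
 * type; all other columns fall back to inference.
 *
 * @param array<string, string> $row
 * @param array<string, string> $schema column name => type name
 */
function castRow(array $row, array $schema = []): array
{
    $typed = [];
    foreach ($row as $column => $value) {
        $type = $schema[$column] ?? inferType($value);
        $typed[$column] = castValue($value, $type);
    }

    return $typed;
}

// A row as it would come out of a CSV reader: every value is a string.
$raw = ['id' => '42', 'price' => '9.99', 'active' => 'true', 'name' => 'Widget'];

$inferred = castRow($raw);                    // all types guessed from the data
$declared = castRow($raw, ['id' => 'string']); // 'id' is forced to stay a string
```

Declaring a schema is the safer choice when a column only looks numeric, such as a ZIP code or an identifier with leading zeros, because inference would otherwise turn it into an integer.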

Consistent Memory Consumption

Processing large datasets is no easy task, especially without dedicated tools. The most common solution is to read datasets in chunks and process them one by one.

Unfortunately, this is harder than it sounds. The most common problems are the lack of a unified API across data sources and the need to manage memory by hand. Flow PHP solves both for you.

All of this is possible thanks to the Flow PHP architecture based on generators and iterators. This approach enables you to process large datasets even on small machines.
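
The generator-based approach can be illustrated with a minimal plain-PHP sketch (8.0+). It is not Flow's internal code: `readCsvInBatches`, the batch size, and the temporary sample file are assumptions for illustration. The point is that only one batch of rows is held in memory at any moment.

```php
<?php

declare(strict_types=1);

/**
 * Lazily yields rows from a CSV file in fixed-size batches.
 * Hypothetical helper for illustration; not part of Flow PHP.
 *
 * @return \Generator<int, list<array<string, string>>>
 */
function readCsvInBatches(string $path, int $batchSize): \Generator
{
    if ($batchSize < 1) {
        throw new \InvalidArgumentException('Batch size must be at least 1.');
    }

    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new \RuntimeException("Cannot open {$path}");
    }

    try {
        $header = fgetcsv($handle, null, ',', '"', '\\');
        if ($header === false || $header === [null]) {
            return; // empty file: nothing to yield
        }

        $batch = [];
        while (($row = fgetcsv($handle, null, ',', '"', '\\')) !== false) {
            if ($row === [null]) {
                continue; // skip blank lines
            }
            if (count($row) !== count($header)) {
                throw new \RuntimeException('Row does not match the header width.');
            }

            $batch[] = array_combine($header, $row);

            if (count($batch) === $batchSize) {
                yield $batch;
                $batch = []; // release the previous batch before reading more
            }
        }

        if ($batch !== []) {
            yield $batch; // final, possibly smaller batch
        }
    } finally {
        fclose($handle);
    }
}

// Sample file, created only for this sketch.
$path = tempnam(sys_get_temp_dir(), 'flow_sketch_');
file_put_contents($path, "id,name\n1,Ada\n2,Grace\n3,Linus\n");

$batchSizes = [];
foreach (readCsvInBatches($path, 2) as $batch) {
    $batchSizes[] = count($batch);
}

unlink($path);
```

Peak memory depends on the batch size rather than on the file size, which is why this pattern works for large datasets on small machines.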
