Unified, typed API
Strongly typed. Memory efficient. One DataFrame API for CSV, JSON, XML, Parquet, Avro, REST, RDBMS, Elasticsearch and Meilisearch. Schemas inferred or declared — your call.
Read from any data source with a single typed API.
Map, join, aggregate, window — composable and lazy.
Stream into any sink without blowing up memory.
Pick a topic and explore runnable snippets in the playground.
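As a sketch of what composable, lazy transformations look like, the pipeline below maps a derived column and aggregates per group. The entry names and the 1.23 multiplier are illustrative, and method names follow Flow's DSL — check them against the version you have installed:

<?php

use function Flow\ETL\DSL\{data_frame, lit, ref, sum};
use function Flow\ETL\Adapter\CSV\{from_csv, to_csv};

// Transformations are composed lazily — nothing executes until ->run().
data_frame()
    ->read(from_csv(__DIR__ . '/orders.csv'))
    ->autoCast()
    ->withEntry('gross', ref('amount')->multiply(lit(1.23))) // map
    ->groupBy(ref('email'))                                  // aggregate
    ->aggregate(sum(ref('gross')))
    ->write(to_csv(__DIR__ . '/totals.csv'))
    ->run();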
In pure PHP, every source and sink has its own functions, error model, and memory tradeoffs. With Flow, the pipeline shape stays constant — swap the source or sink and the rest of the code doesn't budge.
<?php

// ⚠ not memory-safe: $rows accumulates every record
$rows = [];

$file = fopen(__DIR__ . '/orders.csv', 'r');

if ($file === false) {
    throw new \RuntimeException('Cannot open orders.csv');
}

$headers = fgetcsv($file);

if ($headers === false) {
    fclose($file);
    throw new \RuntimeException('Empty CSV');
}

while (($line = fgetcsv($file)) !== false) {
    if (count($line) !== count($headers)) {
        continue; // malformed row, skip
    }

    $row = array_combine($headers, $line);

    // drop rows missing any required field
    if ($row['id'] === '' || $row['amount'] === '' || $row['email'] === '' || $row['created_at'] === '') {
        continue;
    }

    // CSV is all strings — cast every column by hand
    $row['id'] = (int) $row['id'];
    $row['amount'] = (float) $row['amount'];
    $row['created_at'] = (new \DateTimeImmutable($row['created_at']))->format(DATE_ATOM);

    $rows[] = $row;
}

fclose($file);

// ⚠ not memory-safe: serialises the entire dataset at once
$json = json_encode($rows, JSON_THROW_ON_ERROR | JSON_PRETTY_PRINT);

// ⚠ not memory-safe: writes the whole payload in one call
file_put_contents(__DIR__ . '/orders.json', $json);
<?php

use function Flow\ETL\DSL\{data_frame, ref};
use function Flow\ETL\Adapter\CSV\from_csv;
use function Flow\ETL\Adapter\JSON\to_json;

data_frame()
    ->read(from_csv(__DIR__ . '/orders.csv'))
    ->autoCast()
    ->filter(ref('id')->isNotNull())
    ->filter(ref('amount')->isNotNull())
    ->filter(ref('email')->isNotNull())
    ->write(to_json(__DIR__ . '/orders.json'))
    ->run();
Swap to_json for to_xlsx, to_parquet, or to_postgres — same shape, different destination.
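Swapping the sink really is a one-line change. A sketch writing Parquet instead of JSON, assuming the Flow Parquet adapter is installed:

<?php

use function Flow\ETL\DSL\{data_frame, ref};
use function Flow\ETL\Adapter\CSV\from_csv;
use function Flow\ETL\Adapter\Parquet\to_parquet;

// Same pipeline shape as before — only the sink line changed.
data_frame()
    ->read(from_csv(__DIR__ . '/orders.csv'))
    ->autoCast()
    ->filter(ref('id')->isNotNull())
    ->write(to_parquet(__DIR__ . '/orders.parquet'))
    ->run();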
Most pipelines die from inconsistent APIs, mistyped fields, and runaway memory. Flow gives you one cohesive toolset to read, shape, and ship data — without surprises.
Built on generators and iterators. Process gigabytes on small machines without chunking gymnastics or OOMs.
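The underlying idea can be sketched in plain PHP: a generator yields one row at a time, so memory stays flat no matter how large the file gets. This is a minimal illustration of the technique, not Flow's actual internals:

```php
<?php

// Create a small sample CSV to stream over (stand-in for a multi-GB file).
$path = tempnam(sys_get_temp_dir(), 'orders');
file_put_contents($path, "id,amount\n1,10.50\n2,4.25\n3,1.00\n");

/**
 * Yield CSV rows one at a time instead of loading the whole file
 * into an array — memory stays constant regardless of row count.
 */
function rows(string $path): \Generator
{
    $file = fopen($path, 'r');
    $headers = fgetcsv($file);

    while (($line = fgetcsv($file)) !== false) {
        yield array_combine($headers, $line);
    }

    fclose($file);
}

// Rows are pulled lazily — nothing is read until iteration starts.
$total = 0.0;
foreach (rows($path) as $row) {
    $total += (float) $row['amount'];
}
// $total is now 15.75

unlink($path);
```

Flow applies the same principle end to end: extractors, transformations, and loaders all pass rows through lazily instead of materialising the dataset.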
Joins, windows, aggregations, partitioning, filesystem abstractions, telemetry — real tools for real pipelines.
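Partitioning, for instance, splits output by column values as it writes. A sketch, assuming the CSV adapter — the exact directory layout depends on the sink:

<?php

use function Flow\ETL\DSL\{data_frame, ref};
use function Flow\ETL\Adapter\CSV\{from_csv, to_csv};

// One output partition per distinct created_at value.
data_frame()
    ->read(from_csv(__DIR__ . '/orders.csv'))
    ->partitionBy(ref('created_at'))
    ->write(to_csv(__DIR__ . '/partitioned/orders.csv'))
    ->run();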
Drop-in replacement for the official PHP OpenTelemetry SDK: built-in tracing and metrics, fully compatible with the OTLP protocol.
Open the playground in your browser — no install required — and try Flow on your data.