flow php

UNIFIED DATA PROCESSING FRAMEWORK

composer require flow-php/etl ^0.10.0

Extracts

Read from various data sources.

Transforms

Shape and optimize for your needs.

Loads

Store and secure in one of many available data sinks.

Examples:

Description

Read data from a parquet file.

function from_parquet(string|Path $uri);

Additional options:

  • withColumns(array $columns) - default [], list of columns to read; when not set, all columns will be read
  • withOptions(Options $options) - custom Parquet Reader Options
  • withByteOrder(ByteOrder $order) - default ByteOrder::LITTLE_ENDIAN, the byte order of the parquet file
  • withOffset(int $offset) - default null, number of rows to skip from the beginning of the file
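
The options above can be chained on the extractor returned by from_parquet(). The following is a minimal sketch rather than an official example: it assumes the same dataset.parquet file used in the Code section, assumes the with* methods return the extractor so calls can be chained, and uses a Composer autoload path, the column names 'id' and 'name', and the offset value 1 purely for illustration.

```php
<?php

declare(strict_types=1);

use function Flow\ETL\Adapter\Parquet\from_parquet;
use function Flow\ETL\DSL\{data_frame, to_stream};

// Assumption: a standard Composer install; adjust the path to your project layout.
require __DIR__ . '/vendor/autoload.php';

data_frame()
    ->read(
        from_parquet(__DIR__ . '/input/dataset.parquet')
            // Read only the listed columns instead of every column in the file.
            ->withColumns(['id', 'name'])
            // Skip the first row of the file before reading.
            ->withOffset(1)
    )
    ->collect()
    ->write(to_stream(__DIR__ . '/output.txt', truncate: false))
    ->run();
```

Restricting the columns with withColumns() is useful on wide files, since the reader is not asked for columns that are never used.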

Code

<?php

declare(strict_types=1);

use function Flow\ETL\Adapter\Parquet\from_parquet;
use function Flow\ETL\DSL\{data_frame, to_stream};

require __DIR__ . '/../../../autoload.php';

data_frame()
    ->read(from_parquet(
        __DIR__ . '/input/dataset.parquet',
    ))
    ->collect()
    ->write(to_stream(__DIR__ . '/output.txt', truncate: false))
    ->run();

Output

+----+--------+------------------+--------+
| id |   name |            email | active |
+----+--------+------------------+--------+
|  1 |   John |   [email protected] |   true |
|  2 |   Paul |   [email protected] |   true |
|  3 | George | [email protected] |  false |
|  4 |  Ringo |   [email protected] |   true |
+----+--------+------------------+--------+
4 rows
