flow php

Example: Parquet

Topic: Data reading


Description

Read data from a parquet file.

function from_parquet(string|Path $uri);

Additional options:

  • withColumns(array $columns) - default [], list of columns to read; when not set, all columns are read
  • withOptions(Options $options) - custom Parquet Reader Options
  • withByteOrder(ByteOrder $order) - default ByteOrder::LITTLE_ENDIAN, the byte order of the parquet file
  • withOffset(int $offset) - default null, number of rows to skip from the beginning of the file
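
The options listed above are called on the extractor returned by from_parquet(). The following is a minimal sketch rather than a definitive recipe. It assumes the option methods can be chained, that the input file has "id" and "name" columns, and that the autoload path matches your project layout. The offset of 1 is an arbitrary illustrative value.

```php
<?php

declare(strict_types=1);

use function Flow\ETL\Adapter\Parquet\from_parquet;
use function Flow\ETL\DSL\{data_frame, to_stream};

// Assumption: adjust this path to wherever your Composer autoloader lives.
require __DIR__ . '/../../../autoload.php';

data_frame()
    ->read(
        from_parquet(__DIR__ . '/input/dataset.parquet')
            // Read only the listed columns (assumed to exist in the file).
            ->withColumns(['id', 'name'])
            // Skip the first row of the file (illustrative value).
            ->withOffset(1)
    )
    ->collect()
    ->write(to_stream(__DIR__ . '/output.txt', truncate: false))
    ->run();
```

Limiting the column list is mainly useful for wide files, because columns that are not requested do not need to be read.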

Code

<?php

declare(strict_types=1);

use function Flow\ETL\Adapter\Parquet\from_parquet;
use function Flow\ETL\DSL\{data_frame, to_stream};

require __DIR__ . '/../../../autoload.php';

data_frame()
    ->read(from_parquet(
        __DIR__ . '/input/dataset.parquet',
    ))
    ->collect()
    ->write(to_stream(__DIR__ . '/output.txt', truncate: false))
    ->run();

Output

+----+--------+------------------+--------+
| id |   name |            email | active |
+----+--------+------------------+--------+
|  1 |   John |   [email protected] |   true |
|  2 |   Paul |   [email protected] |   true |
|  3 | George | [email protected] |  false |
|  4 |  Ringo |   [email protected] |   true |
+----+--------+------------------+--------+
4 rows
