flow php

UNIFIED DATA PROCESSING FRAMEWORK

composer require flow-php/etl ^0.10.0


Extracts

Read from various data sources.


Transforms

Shape and optimize for your needs.


Loads

Store and secure in one of many available data sinks.

Examples:

Description

Join combines two data frames into one, much like SQL JOIN.
The first data frame is the main (left) side, and the second is joined to it as the right side. Rows are matched on the specified columns.

The following types of joins are supported:

  • inner - only rows with matching keys in both data sources are included in the result
  • left - all rows from the left data source are included; matching rows from the right data source are added, and the joined columns stay empty where there is no match
  • right - all rows from the right data source are included; matching rows from the left data source are added, and the left columns stay empty where there is no match
  • left_anti - only rows from the left data source that do not have a match in the right data source are included
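
To make the four join types concrete, here is a minimal plain-PHP sketch of their semantics over ordinary arrays. It is not Flow's implementation and does not use the Flow API: `naive_join()`, its parameters, and the `$orders` sample data are hypothetical names introduced only for this illustration. The `joined_` prefix mirrors the one used in the example further down the page.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative only: a naive in-memory join over plain arrays (hypothetical helper).
 * Right-side columns are prefixed so they do not collide with left-side columns.
 *
 * @param list<array<string, mixed>> $left
 * @param list<array<string, mixed>> $right
 * @return list<array<string, mixed>>
 */
function naive_join(array $left, array $right, string $key, string $type, string $prefix = 'joined_'): array
{
    // Rename every column of a right-side row to "<prefix><column>".
    $prefixed = static function (array $row) use ($prefix): array {
        $out = [];
        foreach ($row as $column => $value) {
            $out[$prefix . $column] = $value;
        }
        return $out;
    };

    // Null-filled column templates, used when one side has no match.
    $emptyLeft = $left === [] ? [] : array_fill_keys(array_keys($left[0]), null);
    $emptyRight = $right === [] ? [] : $prefixed(array_fill_keys(array_keys($right[0]), null));

    $result = [];
    $matchedRight = [];

    foreach ($left as $l) {
        $found = false;
        foreach ($right as $i => $r) {
            if ($l[$key] === $r[$key]) {
                $found = true;
                $matchedRight[$i] = true;
                // inner, left and right all emit the matched pair; left_anti never does.
                if ($type !== 'left_anti') {
                    $result[] = $l + $prefixed($r);
                }
            }
        }
        if (!$found && $type === 'left') {
            $result[] = $l + $emptyRight; // keep the unmatched left row
        }
        if (!$found && $type === 'left_anti') {
            $result[] = $l; // only unmatched left rows survive
        }
    }

    if ($type === 'right') {
        // Append right rows that no left row matched.
        foreach ($right as $i => $r) {
            if (!isset($matchedRight[$i])) {
                $result[] = $emptyLeft + $prefixed($r);
            }
        }
    }

    return $result;
}

// Hypothetical sample data.
$users = [['id' => 1, 'name' => 'John'], ['id' => 2, 'name' => 'Jane']];
$orders = [['id' => 2, 'total' => 10], ['id' => 5, 'total' => 7]];

foreach (['inner', 'left', 'right', 'left_anti'] as $type) {
    echo $type, ': ', count(naive_join($users, $orders, 'id', $type)), " row(s)\n";
}
```

The nested loop keeps the sketch short; a real implementation would normally index one side by key first. The row order of the `right` result here is an artifact of this sketch and says nothing about how Flow orders its output.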

If the joined (right) data frame is too large to fit into memory, consider using joinEach instead.
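
The general idea behind avoiding a fully loaded right side can be sketched in plain PHP: process the left rows in small batches and, for each batch, fetch only the right rows whose keys appear in it. This is an assumption about the technique, not the signature or behavior of joinEach itself; `batched_left_join()`, the `$fetchCities` loader, and the sample data are all hypothetical. The sketch also assumes at most one right row per key.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative only: a left join that looks up the right side per batch of left rows.
 *
 * @param iterable<array<string, mixed>> $left
 * @param callable $fetchRight hypothetical loader: receives a list of keys, returns matching right rows
 * @param list<string> $rightColumns columns to copy from the right side
 */
function batched_left_join(
    iterable $left,
    callable $fetchRight,
    string $key,
    array $rightColumns,
    int $batchSize = 2,
    string $prefix = 'joined_'
): \Generator {
    $flush = static function (array $batch) use ($fetchRight, $key, $rightColumns, $prefix): \Generator {
        // Load only the right rows needed for this batch and index them by key.
        $byKey = [];
        foreach ($fetchRight(array_column($batch, $key)) as $r) {
            $byKey[$r[$key]] = $r;
        }
        foreach ($batch as $l) {
            $r = $byKey[$l[$key]] ?? [];
            foreach ($rightColumns as $column) {
                // Unmatched left rows get null in every joined column.
                $l[$prefix . $column] = $r[$column] ?? null;
            }
            yield $l;
        }
    };

    $batch = [];
    foreach ($left as $row) {
        $batch[] = $row;
        if (count($batch) === $batchSize) {
            yield from $flush($batch);
            $batch = [];
        }
    }
    if ($batch !== []) {
        yield from $flush($batch); // remaining partial batch
    }
}

// Hypothetical sample data; the loader stands in for a database or API lookup.
$users = [
    ['id' => 1, 'name' => 'John'],
    ['id' => 2, 'name' => 'Jane'],
    ['id' => 3, 'name' => 'Doe'],
];
$fetchCities = static function (array $ids): array {
    $table = [
        2 => ['id' => 2, 'city' => 'Oslo'],
        3 => ['id' => 3, 'city' => 'Lima'],
    ];
    return array_values(array_intersect_key($table, array_flip($ids)));
};

// Pass false so keys repeated by "yield from" do not overwrite each other.
$rows = iterator_to_array(batched_left_join($users, $fetchCities, 'id', ['id', 'city']), false);
```

Memory use is then bounded by the batch size and the number of right rows fetched per batch, at the cost of one lookup per batch.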

Code

<?php

declare(strict_types=1);

use function Flow\ETL\DSL\{data_frame, from_array, join_on, to_stream};
use Flow\ETL\Join\Join;

require __DIR__ . '/../../../autoload.php';

$users = [
    ['id' => 1, 'name' => 'John'],
    ['id' => 2, 'name' => 'Jane'],
    ['id' => 3, 'name' => 'Doe'],
    ['id' => 4, 'name' => 'Bruno'],
];

$emails = [
    ['id' => 2, 'email' => '[email protected]'],
    ['id' => 3, 'email' => '[email protected]'],
    ['id' => 4, 'email' => '[email protected]'],
];

// Left join: every user is kept; columns from $emails are added with the "joined_" prefix.
data_frame()
    ->read(from_array($users))
    ->join(
        data_frame()->read(from_array($emails)),
        join_on(['id' => 'id'], join_prefix: 'joined_'),
        Join::left
    )
    ->collect()
    ->write(to_stream(__DIR__ . '/output.txt', truncate: false))
    ->run();

Output

+----+-------+-----------+-------------------+
| id |  name | joined_id |      joined_email |
+----+-------+-----------+-------------------+
|  1 |  John |           |                   |
|  2 |  Jane |         2 | [email protected] |
|  3 |   Doe |         3 | [email protected] |
|  4 | Bruno |         4 | [email protected] |
+----+-------+-----------+-------------------+
4 rows
