flow php

Group By

Grouping is a common operation in data analysis. The concept is no different from the one you know from SQL.

To group data you need to use DataFrame::groupBy function.

Example:

data_frame()
    ->read(from_array([
        ['id' => 1, 'group' => 'A'],
        ['id' => 2, 'group' => 'B'],
        ['id' => 3, 'group' => 'A'],
        ['id' => 4, 'group' => 'B'],
        ['id' => 5, 'group' => 'A'],
        ['id' => 6, 'group' => 'B'],
        ['id' => 7, 'group' => 'A'],
        ['id' => 8, 'group' => 'B'],
        ['id' => 9, 'group' => 'A'],
        ['id' => 10, 'group' => 'B'],
    ]))
    ->groupBy(ref('group'))
    ->write(to_output(truncate: false))
    ->run();

However, the result of this operation is not very useful. It will just return a DataFrame with one column group:

+-------+
| group |
+-------+
|     A |
|     B |
+-------+
2 rows

To make it more useful, you need to use one of the aggregation functions.

➡️ Aggregations


Contributors

Join us on GitHub external resource
scroll back to top