Window functions perform calculations across a set of rows that are related to the current row. Unlike regular aggregations, window functions don't collapse rows - each row retains its identity while gaining access to aggregate information about its "window" of related rows.
Use window() to define partitions and ordering, then apply window functions like rank(), dense_rank(), or row_number() to compute values within each partition.
The row_number() function assigns a unique sequential number to each row within a partition, starting at 1. Unlike rank() and dense_rank(), row_number() always assigns different numbers to each row, even when values are tied.
This is useful for pagination, selecting top N rows per group, or creating unique identifiers within partitions.
<?php
declare(strict_types=1);
use function Flow\ETL\DSL\{data_frame, row_number, from_array, ref, to_output, window};
require __DIR__ . '/vendor/autoload.php';
$df = data_frame()
->read(
from_array([
['id' => 1, 'name' => 'Greg', 'department' => 'IT', 'salary' => 6000],
['id' => 2, 'name' => 'Michal', 'department' => 'IT', 'salary' => 5000],
['id' => 3, 'name' => 'Tomas', 'department' => 'Finances', 'salary' => 11_000],
['id' => 4, 'name' => 'John', 'department' => 'Finances', 'salary' => 9000],
['id' => 5, 'name' => 'Jane', 'department' => 'Finances', 'salary' => 14_000],
['id' => 6, 'name' => 'Janet', 'department' => 'Finances', 'salary' => 9000],
])
)
->withEntry('row_num', row_number()->over(window()->partitionBy(ref('department'))->orderBy(ref('salary')->desc())))
->sortBy(ref('department'), ref('row_num'))
->collect()
->write(to_output(truncate: false))
->run();