Window functions perform calculations across a set of rows that are related to the current row. Unlike regular aggregations, window functions don't collapse rows - each row retains its identity while gaining access to aggregate information about its "window" of related rows.
Use window() to define partitions and ordering, then apply window functions like rank(), dense_rank(), or row_number() to compute values within each partition.
The dense_rank() function assigns a rank to each row within a partition, without gaps in ranking when there are ties. If two rows have the same value, they get the same rank, but the next rank is consecutive (no gaps).
For example, if two employees have the same salary and both get rank 2, the next employee gets rank 3 (not 4 like with regular rank()).
<?php
declare(strict_types=1);
use function Flow\ETL\DSL\{data_frame, dense_rank, from_array, ref, to_output, window};
require __DIR__ . '/vendor/autoload.php';
$df = data_frame()
->read(
from_array([
['id' => 1, 'name' => 'Greg', 'department' => 'IT', 'salary' => 6000],
['id' => 2, 'name' => 'Michal', 'department' => 'IT', 'salary' => 5000],
['id' => 3, 'name' => 'Tomas', 'department' => 'Finances', 'salary' => 11_000],
['id' => 4, 'name' => 'John', 'department' => 'Finances', 'salary' => 9000],
['id' => 5, 'name' => 'Jane', 'department' => 'Finances', 'salary' => 14_000],
['id' => 6, 'name' => 'Janet', 'department' => 'Finances', 'salary' => 9000],
])
)
->withEntry('dense_rank', dense_rank()->over(window()->partitionBy(ref('department'))->orderBy(ref('salary')->desc())))
->sortBy(ref('department'), ref('dense_rank'))
->collect()
->write(to_output(truncate: false))
->run();