Window functions perform calculations across a set of rows that are related to the current row. Unlike regular aggregations, window functions don't collapse rows - each row retains its identity while gaining access to aggregate information about its "window" of related rows.
Use window() to define partitions and ordering, then apply window functions like rank(), dense_rank(), or row_number() to compute values within each partition.
The rank() function assigns a rank to each row within a partition, with gaps in ranking when there are ties. If two rows have the same value, they get the same rank, and the next rank is skipped.
For example, if two employees have the same salary and both get rank 2, the next employee gets rank 4 (rank 3 is skipped).
<?php
declare(strict_types=1);
use function Flow\ETL\DSL\{data_frame, rank, from_array, ref, to_output, window};
require __DIR__ . '/vendor/autoload.php';
$df = data_frame()
->read(
from_array([
['id' => 1, 'name' => 'Greg', 'department' => 'IT', 'salary' => 6000],
['id' => 2, 'name' => 'Michal', 'department' => 'IT', 'salary' => 5000],
['id' => 3, 'name' => 'Tomas', 'department' => 'Finances', 'salary' => 11_000],
['id' => 4, 'name' => 'John', 'department' => 'Finances', 'salary' => 9000],
['id' => 5, 'name' => 'Jane', 'department' => 'Finances', 'salary' => 14_000],
['id' => 6, 'name' => 'Janet', 'department' => 'Finances', 'salary' => 9000],
])
)
->withEntry('rank', rank()->over(window()->partitionBy(ref('department'))->orderBy(ref('salary')->desc())))
->sortBy(ref('department'), ref('rank'))
->collect()
->write(to_output(truncate: false))
->run();