Error Handling
If any exception is thrown in a transform or load step, the ETL process stops. To change that behavior, set a custom ErrorHandler.
An ErrorHandler defines three behaviors using two methods:
ErrorHandler::throw(\Throwable $error, Rows $rows) : bool
ErrorHandler::skipRows(\Throwable $error, Rows $rows) : bool
If throw returns true, ETL will simply throw the error.
If skipRows returns true, ETL will stop processing the given rows and try to move on to the next batch.
If both methods return false, ETL will continue processing the rows using the next transformers/loaders.
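The three outcomes can be illustrated with a small standalone model. This is only a sketch of the decision order described above, not Flow's actual implementation: processBatch, the array-based rows, and the step callables are hypothetical stand-ins for the real pipeline, transformers, and loaders.

```php
<?php

declare(strict_types=1);

/**
 * Simplified model of the three behaviors; NOT Flow's actual implementation.
 * $steps are hypothetical stand-ins for transformers/loaders: callables that
 * take an array of rows, return an array of rows, and may throw.
 * Returns the processed rows, or null when the batch was skipped.
 */
function processBatch(array $rows, array $steps, callable $throw, callable $skipRows): ?array
{
    foreach ($steps as $step) {
        try {
            $rows = $step($rows);
        } catch (\Throwable $error) {
            if ($throw($error, $rows)) {
                throw $error;   // behavior 1: the whole run fails
            }
            if ($skipRows($error, $rows)) {
                return null;    // behavior 2: give up on this batch only
            }
            // behavior 3: both returned false, continue with the next step
        }
    }

    return $rows;
}

// Hypothetical steps: the first always fails, the second doubles each value.
$failing = static function (array $rows): array {
    throw new \RuntimeException('broken step');
};
$double = static fn (array $rows): array => array_map(static fn (int $v): int => $v * 2, $rows);

$yes = static fn (): bool => true;
$no = static fn (): bool => false;

$ignored = processBatch([1, 2], [$failing, $double], $no, $no);  // [2, 4]
$skipped = processBatch([1, 2], [$failing, $double], $no, $yes); // null
```

In this model, an ignored error leaves the rows untouched by the failing step and hands them to the next one, while a skipped batch never reaches the remaining steps.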
There are three built-in ErrorHandlers (look for more in adapters); one of them, ignore_error_handler(), is used in the example below.
An error handler can be set directly on the ETL:
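<?php
use function Flow\ETL\DSL\{data_frame, ignore_error_handler};
use function Flow\ETL\Adapter\CSV\from_csv;
use function Flow\ETL\Adapter\JSON\to_json;

data_frame()
    ->read(from_csv(...))
    ->onError(ignore_error_handler()) // keep processing despite errors
    ->write(to_json(...))
    ->run();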
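A custom handler can be passed to onError() in the same way. The following is a minimal sketch, assuming the interface lives at Flow\ETL\ErrorHandler and requires only the two methods listed above, and that flow-php/etl is installed through Composer. The class name SkipInvalidArguments and its policy are hypothetical, chosen only for illustration.

```php
<?php

declare(strict_types=1);

// Assumption: flow-php/etl is installed via Composer in this directory.
require __DIR__ . '/vendor/autoload.php';

use Flow\ETL\ErrorHandler;
use Flow\ETL\Exception\InvalidArgumentException;
use Flow\ETL\Rows;

/**
 * Hypothetical policy: skip the current batch when the error is Flow's
 * InvalidArgumentException, and throw every other error.
 */
final class SkipInvalidArguments implements ErrorHandler
{
    public function throw(\Throwable $error, Rows $rows) : bool
    {
        // Anything that is not an invalid-argument error stops the ETL.
        return !$error instanceof InvalidArgumentException;
    }

    public function skipRows(\Throwable $error, Rows $rows) : bool
    {
        // Invalid-argument errors drop the batch and move on to the next one.
        return $error instanceof InvalidArgumentException;
    }
}
```

With this class in place, ->onError(new SkipInvalidArguments()) would replace ->onError(ignore_error_handler()) in the pipeline. Because throw and skipRows never both return true here, the outcome does not depend on which method is consulted first.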
Row-level Error Handling
For fine-grained error handling during row processing operations:
<?php
use Flow\ETL\Exception\InvalidArgumentException;
use Flow\ETL\Row;
use function Flow\ETL\DSL\data_frame;

// Counters are captured by reference so the closure can update them.
$successCount = 0;
$errorCount = 0;

data_frame()
    ->read($unreliableDataExtractor)
    ->forEach(function (Row $row) use (&$successCount, &$errorCount) {
        try {
            validateAndProcess($row);
            $successCount++;
        } catch (InvalidArgumentException $e) {
            // Invalid data: log the row with the validation message.
            logInvalidRow($row, $e->getMessage());
            $errorCount++;
        } catch (\Exception $e) {
            // Any other failure: log it and keep going.
            logGeneralError($row, $e);
            $errorCount++;
        }
    });

echo "Success: {$successCount}, Errors: {$errorCount}";
Best Practice: When processing unreliable data sources, implement row-level error handling to prevent entire pipeline failures and provide detailed error reporting.