Flow PHP

Encoding

Final: Yes

Encodings supported by Parquet. Not all encodings are valid for all types. These enums are also used to specify the encoding of definition and repetition levels.

See the accompanying doc for the details of the more complicated encodings.

Constants

BIT_PACKED  = 4
Bit packed encoding. This can only be used if the data has a known max width. Usable for definition/repetition levels encoding.
BYTE_STREAM_SPLIT  = 9
Encoding for fixed-width data (FLOAT, DOUBLE, INT32, INT64, FIXED_LEN_BYTE_ARRAY).
DELTA_BINARY_PACKED  = 5
Delta encoding for integers. This can be used for int columns and works best on sorted data.
DELTA_BYTE_ARRAY  = 7
Incremental-encoded byte array. Prefix lengths are encoded using DELTA_BINARY_PACKED.
DELTA_LENGTH_BYTE_ARRAY  = 6
Encoding for byte arrays to separate the length values and the data. The lengths are encoded using DELTA_BINARY_PACKED.
PLAIN  = 0
Default encoding.
PLAIN_DICTIONARY  = 2
Deprecated: Dictionary encoding. The values in the dictionary are encoded in the plain type.
RLE  = 3
Group packed run length encoding. Usable for definition/repetition levels encoding and Booleans (one bit per value: 0 is false, 1 is true).
RLE_DICTIONARY  = 8
Dictionary encoding: the ids are encoded using the RLE encoding.

Properties

$__names  : mixed

Constants

BIT_PACKED

Bit packed encoding. This can only be used if the data has a known max width. Usable for definition/repetition levels encoding.

public mixed BIT_PACKED = 4

BYTE_STREAM_SPLIT

Encoding for fixed-width data (FLOAT, DOUBLE, INT32, INT64, FIXED_LEN_BYTE_ARRAY).

public mixed BYTE_STREAM_SPLIT = 9

K byte-streams are created where K is the size in bytes of the data type. The individual bytes of a value are scattered to the corresponding stream and the streams are concatenated. This itself does not reduce the size of the data but can lead to better compression afterwards.

Added in 2.8 for FLOAT and DOUBLE. Support for INT32, INT64 and FIXED_LEN_BYTE_ARRAY added in 2.11.
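The byte scattering described above can be sketched for FLOAT values (K = 4). This is a minimal illustration, not Flow PHP's implementation: `byteStreamSplitFloats()` and `byteStreamJoinFloats()` are hypothetical helper names, and only PHP's built-in `pack()`/`unpack()` are used.

```php
<?php
// Hypothetical helpers, not part of Flow PHP.
// Scatter the bytes of each 4-byte little-endian float into K = 4 streams,
// then concatenate the streams.
function byteStreamSplitFloats(array $values): string
{
    $k = 4; // FLOAT is 4 bytes wide
    $streams = array_fill(0, $k, '');
    foreach ($values as $value) {
        $bytes = pack('g', $value); // 'g' = little-endian 32-bit float
        for ($i = 0; $i < $k; $i++) {
            $streams[$i] .= $bytes[$i];
        }
    }
    return implode('', $streams);
}

// Reverse the split: byte $i of value $n sits at offset $i * $count + $n.
function byteStreamJoinFloats(string $encoded): array
{
    $k = 4;
    $count = intdiv(strlen($encoded), $k);
    $values = [];
    for ($n = 0; $n < $count; $n++) {
        $bytes = '';
        for ($i = 0; $i < $k; $i++) {
            $bytes .= $encoded[$i * $count + $n];
        }
        $values[] = unpack('g', $bytes)[1];
    }
    return $values;
}

$encoded = byteStreamSplitFloats([1.0, 2.0, 4.0]);
echo bin2hex($encoded), PHP_EOL; // 0000000000008000803f4040
```

The output has the same length as the input (12 bytes), but similar bytes now sit next to each other, which is what can help a general-purpose compressor applied afterwards.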

DELTA_BINARY_PACKED

Delta encoding for integers. This can be used for int columns and works best on sorted data.

public mixed DELTA_BINARY_PACKED = 5
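Why sorted data helps can be seen from the deltas alone. The sketch below is an assumption-laden illustration, not the actual DELTA_BINARY_PACKED layout (which also involves a header, blocks and bit-packed miniblocks): the hypothetical `deltaSketch()` only computes the first value, the deltas, and the bit width needed once the minimum delta is subtracted.

```php
<?php
// Hypothetical helper, not part of Flow PHP and not the full on-disk format.
// Expects a non-empty list of integers.
function deltaSketch(array $values): array
{
    $deltas = [];
    for ($i = 1; $i < count($values); $i++) {
        $deltas[] = $values[$i] - $values[$i - 1];
    }
    $minDelta = $deltas === [] ? 0 : min($deltas);

    // Largest delta after subtracting the minimum; this bounds the bit width.
    $maxAdjusted = 0;
    foreach ($deltas as $delta) {
        $maxAdjusted = max($maxAdjusted, $delta - $minDelta);
    }
    $bitWidth = 0;
    while (($maxAdjusted >> $bitWidth) > 0) {
        $bitWidth++;
    }

    return [
        'first' => $values[0],
        'deltas' => $deltas,
        'minDelta' => $minDelta,
        'bitWidth' => $bitWidth,
    ];
}

$sorted = deltaSketch([100, 101, 103, 104, 106]);
$unsorted = deltaSketch([100, 5, 900, 2]);
echo $sorted['bitWidth'], ' vs ', $unsorted['bitWidth'], PHP_EOL; // 1 vs 11
```

In this example the sorted column needs 1 bit per adjusted delta, while the unsorted one needs 11.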

DELTA_BYTE_ARRAY

Incremental-encoded byte array. Prefix lengths are encoded using DELTA_BINARY_PACKED.

public mixed DELTA_BYTE_ARRAY = 7

Suffixes are stored as delta length byte arrays (DELTA_LENGTH_BYTE_ARRAY).
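The prefix/suffix split can be sketched as follows. The helper names `incrementalEncode()` and `incrementalDecode()` are hypothetical, and the sketch stops at plain PHP arrays: it does not apply DELTA_BINARY_PACKED to the prefix lengths or serialize the suffixes.

```php
<?php
// Hypothetical helpers, not part of Flow PHP.
// For each string, record how many leading bytes it shares with the
// previous string, and keep only the remaining suffix.
function incrementalEncode(array $strings): array
{
    $prefixLengths = [];
    $suffixes = [];
    $previous = '';
    foreach ($strings as $current) {
        $max = min(strlen($previous), strlen($current));
        $shared = 0;
        while ($shared < $max && $previous[$shared] === $current[$shared]) {
            $shared++;
        }
        $prefixLengths[] = $shared;
        $suffixes[] = (string) substr($current, $shared);
        $previous = $current;
    }
    return [$prefixLengths, $suffixes];
}

// Rebuild each string from the previous one's prefix plus the stored suffix.
function incrementalDecode(array $prefixLengths, array $suffixes): array
{
    $strings = [];
    $previous = '';
    foreach ($prefixLengths as $i => $length) {
        $previous = substr($previous, 0, $length) . $suffixes[$i];
        $strings[] = $previous;
    }
    return $strings;
}

[$prefixLengths, $suffixes] = incrementalEncode(['axis', 'axle', 'babble', 'babyhood']);
echo implode(',', $prefixLengths), PHP_EOL; // 0,2,0,3
echo implode(',', $suffixes), PHP_EOL;      // axis,le,babble,yhood
```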

DELTA_LENGTH_BYTE_ARRAY

Encoding for byte arrays to separate the length values and the data. The lengths are encoded using DELTA_BINARY_PACKED.

public mixed DELTA_LENGTH_BYTE_ARRAY = 6
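A minimal sketch of the separation of lengths and data, assuming a hypothetical `splitLengthsAndData()` helper. The lengths are left as a plain PHP array here rather than being encoded with DELTA_BINARY_PACKED.

```php
<?php
// Hypothetical helpers, not part of Flow PHP.
// Collect all lengths first, then concatenate the raw bytes.
function splitLengthsAndData(array $byteArrays): array
{
    $lengths = array_map('strlen', $byteArrays);
    $data = implode('', $byteArrays);
    return [$lengths, $data];
}

// Walk the concatenated data, cutting it at the recorded lengths.
function joinLengthsAndData(array $lengths, string $data): array
{
    $byteArrays = [];
    $offset = 0;
    foreach ($lengths as $length) {
        $byteArrays[] = (string) substr($data, $offset, $length);
        $offset += $length;
    }
    return $byteArrays;
}

[$lengths, $data] = splitLengthsAndData(['Hello', 'World', 'Foobar', 'ABCDEF']);
echo implode(',', $lengths), PHP_EOL; // 5,5,6,6
echo $data, PHP_EOL;                  // HelloWorldFoobarABCDEF
```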

PLAIN

Default encoding.

public mixed PLAIN = 0

BOOLEAN - 1 bit per value. 0 is false; 1 is true.
INT32 - 4 bytes per value. Stored as little-endian.
INT64 - 8 bytes per value. Stored as little-endian.
FLOAT - 4 bytes per value. IEEE. Stored as little-endian.
DOUBLE - 8 bytes per value. IEEE. Stored as little-endian.
BYTE_ARRAY - 4 byte length stored as little endian, followed by bytes.
FIXED_LEN_BYTE_ARRAY - Just the bytes.
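The fixed-width and BYTE_ARRAY layouts above map directly onto PHP's built-in `pack()` format codes. The function names below are hypothetical and exist only for illustration. BOOLEAN bit packing is left out of this sketch.

```php
<?php
// Hypothetical helpers, not part of Flow PHP.
function plainInt32(int $value): string
{
    return pack('V', $value); // 4 bytes, little-endian
}

function plainInt64(int $value): string
{
    return pack('P', $value); // 8 bytes, little-endian
}

function plainFloat(float $value): string
{
    return pack('g', $value); // IEEE 754 single precision, little-endian
}

function plainDouble(float $value): string
{
    return pack('e', $value); // IEEE 754 double precision, little-endian
}

function plainByteArray(string $bytes): string
{
    // 4-byte little-endian length, followed by the bytes themselves.
    return pack('V', strlen($bytes)) . $bytes;
}

echo bin2hex(plainInt32(1)), PHP_EOL;         // 01000000
echo bin2hex(plainDouble(1.0)), PHP_EOL;      // 000000000000f03f
echo bin2hex(plainByteArray('abc')), PHP_EOL; // 03000000616263
```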

PLAIN_DICTIONARY

Deprecated: Dictionary encoding. The values in the dictionary are encoded in the plain type.

public mixed PLAIN_DICTIONARY = 2

In a data page, use RLE_DICTIONARY instead. In a dictionary page, use PLAIN instead.

RLE

Group packed run length encoding. Usable for definition/repetition levels encoding and Booleans (one bit per value: 0 is false, 1 is true).

public mixed RLE = 3

RLE_DICTIONARY

Dictionary encoding: the ids are encoded using the RLE encoding.

public mixed RLE_DICTIONARY = 8
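The idea of dictionary ids followed by run-length encoding can be sketched as below. Both helper names are hypothetical, and the `[id, runLength]` pairs are a simplification for illustration: they do not reproduce the byte layout of the RLE encoding.

```php
<?php
// Hypothetical helpers, not part of Flow PHP. Intended for string values.
// Replace each value by the index of its first occurrence in a dictionary.
function dictionaryEncode(array $values): array
{
    $dictionary = [];
    $index = [];
    $ids = [];
    foreach ($values as $value) {
        if (!array_key_exists($value, $index)) {
            $index[$value] = count($dictionary);
            $dictionary[] = $value;
        }
        $ids[] = $index[$value];
    }
    return [$dictionary, $ids];
}

// Collapse consecutive equal ids into [id, runLength] pairs.
function toRuns(array $ids): array
{
    $runs = [];
    foreach ($ids as $id) {
        $last = count($runs) - 1;
        if ($last >= 0 && $runs[$last][0] === $id) {
            $runs[$last][1]++;
        } else {
            $runs[] = [$id, 1];
        }
    }
    return $runs;
}

[$dictionary, $ids] = dictionaryEncode(['a', 'a', 'a', 'b', 'b', 'a']);
echo json_encode($dictionary), PHP_EOL;  // ["a","b"]
echo json_encode(toRuns($ids)), PHP_EOL; // [[0,3],[1,2],[0,1]]
```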

Properties

$__names

public static mixed $__names = [0 => 'PLAIN', 2 => 'PLAIN_DICTIONARY', 3 => 'RLE', 4 => 'BIT_PACKED', 5 => 'DELTA_BINARY_PACKED', 6 => 'DELTA_LENGTH_BYTE_ARRAY', 7 => 'DELTA_BYTE_ARRAY', 8 => 'RLE_DICTIONARY', 9 => 'BYTE_STREAM_SPLIT']
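A typical use of this map is turning a numeric encoding value back into its name. The sketch below works on a local copy of the documented array so that it runs without Flow PHP installed. `encodingName()` is a hypothetical helper, and the `UNKNOWN(...)` fallback is an assumption of this example, not library behavior.

```php
<?php
// Local copy of the documented $__names map (value 1 has no entry).
$names = [
    0 => 'PLAIN',
    2 => 'PLAIN_DICTIONARY',
    3 => 'RLE',
    4 => 'BIT_PACKED',
    5 => 'DELTA_BINARY_PACKED',
    6 => 'DELTA_LENGTH_BYTE_ARRAY',
    7 => 'DELTA_BYTE_ARRAY',
    8 => 'RLE_DICTIONARY',
    9 => 'BYTE_STREAM_SPLIT',
];

// Hypothetical helper: resolve a value to its name, with a fallback
// for values missing from the map.
function encodingName(array $names, int $value): string
{
    return $names[$value] ?? 'UNKNOWN(' . $value . ')';
}

echo encodingName($names, 8), PHP_EOL; // RLE_DICTIONARY
echo encodingName($names, 1), PHP_EOL; // UNKNOWN(1)

// Reverse lookup, from name to value.
$values = array_flip($names);
echo $values['BYTE_STREAM_SPLIT'], PHP_EOL; // 9
```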
