PreprocessResponse

code_loader.contract.datasetclasses.PreprocessResponse

An object that holds the sample data and the number of samples for a dataset slice. It is generated in the Preprocessing Function, once for each dataset slice, and is then passed as an argument to the input encoders, ground_truth encoders, and metadata functions.

from dataclasses import dataclass
from typing import Any, List, Optional, Type, Union

from code_loader.contract.enums import DataStateType


@dataclass
class PreprocessResponse:
    length: Optional[int] = None
    data: Any = None
    sample_ids: Optional[Union[List[str], List[int]]] = None
    state: Optional[DataStateType] = None
    sample_id_type: Optional[Union[Type[str], Type[int]]] = None


Args

length

(int, deprecated) Number of samples in the slice. Use sample_ids instead.

data

(Any) A dictionary, pandas.DataFrame, list, or any other object that describes the dataset features. It is later passed to the input encoders, ground_truth encoders, and metadata functions.

sample_ids

(List[str] or List[int], optional) A list of unique identifiers, one for each sample in the slice. The IDs should all be of the same type, either int or str. Preferred over length.

state

(DataStateType, optional) The dataset split this response belongs to. Recommended to always set explicitly.

sample_id_type

(Type[str] or Type[int], optional) The Python type of the sample IDs, either str or int. Inferred automatically when sample_ids is provided.

Examples

Basic Usage
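
A minimal sketch of constructing a PreprocessResponse directly from sample_ids rather than the deprecated length. So that it runs without code_loader installed, it defines local stand-ins for PreprocessResponse (mirroring the fields listed above) and DataStateType; the enum member names, the records dictionary, and its keys are assumptions made for illustration. In a real integration, both classes would be imported from code_loader instead.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, List, Optional, Type, Union


# Stand-ins so the sketch runs on its own. In a real integration, import
# PreprocessResponse from code_loader.contract.datasetclasses and
# DataStateType from code_loader.contract.enums instead.
class DataStateType(Enum):
    training = "training"      # member names are assumed for illustration
    validation = "validation"


@dataclass
class PreprocessResponse:
    length: Optional[int] = None
    data: Any = None
    sample_ids: Optional[Union[List[str], List[int]]] = None
    state: Optional[DataStateType] = None
    sample_id_type: Optional[Union[Type[str], Type[int]]] = None


# Hypothetical data: a dict mapping each sample ID to its features.
records = {
    "img_001": {"path": "images/img_001.png", "label": 0},
    "img_002": {"path": "images/img_002.png", "label": 1},
}

# Identify samples by ID and set the split explicitly.
response = PreprocessResponse(
    data=records,
    sample_ids=list(records.keys()),
    state=DataStateType.training,
)
```

Keying data by sample ID is only one possible layout; any object that the encoders can index by the chosen IDs would serve the same purpose.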

Within the Preprocess Function
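
The sketch below illustrates the pattern described at the top of this page: a preprocess function that returns one PreprocessResponse per dataset slice. It uses local stand-ins for PreprocessResponse and DataStateType so that it runs without code_loader; the function name preprocess_func, the in-memory samples, the 80/20 split, and the enum member names are all hypothetical. How the function is registered with the platform is omitted here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, List, Optional, Type, Union


class DataStateType(Enum):  # stand-in; member names are assumed
    training = "training"
    validation = "validation"


@dataclass
class PreprocessResponse:  # stand-in mirroring the fields documented above
    length: Optional[int] = None
    data: Any = None
    sample_ids: Optional[Union[List[str], List[int]]] = None
    state: Optional[DataStateType] = None
    sample_id_type: Optional[Union[Type[str], Type[int]]] = None


def preprocess_func() -> List[PreprocessResponse]:
    # Hypothetical in-memory dataset; real code would read files or a table.
    samples = [{"id": i, "value": i * 0.5} for i in range(10)]
    train, val = samples[:8], samples[8:]  # simple 80/20 split

    # One response per slice, each with its own IDs and explicit state.
    return [
        PreprocessResponse(
            data=train,
            sample_ids=[s["id"] for s in train],
            state=DataStateType.training,
        ),
        PreprocessResponse(
            data=val,
            sample_ids=[s["id"] for s in val],
            state=DataStateType.validation,
        ),
    ]


responses = preprocess_func()
```

Each returned object carries only its own slice, so downstream functions that receive it can look up a sample without knowing about the other slices.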

Full examples can be found in the Dataset Integration section of the following guides:
