Preprocess Function

The preprocess function (named preprocessing_func here; any name works) is called once, before training or evaluation begins. It prepares the data that the input encoders, output encoders, and metadata functions use later.

from typing import List

from code_loader.contract.datasetclasses import PreprocessResponse
from code_loader.contract.enums import DataStateType
from code_loader.inner_leap_binder.leapbinder_decorators import tensorleap_preprocess

@tensorleap_preprocess()
def preprocessing_func() -> List[PreprocessResponse]:
    ...
    train = PreprocessResponse(length=len(train_df), data=train_df, state=DataStateType.training)
    val = PreprocessResponse(length=len(val_df), data=val_df, state=DataStateType.validation)
    test = PreprocessResponse(length=len(test_df), data=test_df, state=DataStateType.test)
    # unlabeled_df is assumed to be a separate, unlabeled split prepared above
    unlabeled = PreprocessResponse(length=len(unlabeled_df), data=unlabeled_df, state=DataStateType.unlabeled)

    return [train, val, test, unlabeled]

The @tensorleap_preprocess decorator registers the preprocess function with the Tensorleap integration.

This function returns a list of PreprocessResponse objects. The elements of that list correspond to the train, validation, test, and unlabeled data slices.
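
To make the shape of that return value concrete, here is a minimal sketch that uses only the Python standard library. SliceState and SliceResponse are hypothetical stand-ins that mirror just the state, length, and data fields shown above; they are not the code_loader classes, and the sample records are made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List


class SliceState(Enum):
    # Hypothetical stand-in for DataStateType (not the code_loader enum)
    training = "training"
    validation = "validation"
    test = "test"
    unlabeled = "unlabeled"


@dataclass
class SliceResponse:
    # Hypothetical stand-in for PreprocessResponse (not the code_loader class)
    length: int
    data: Any
    state: SliceState


def index_by_state(responses: List[SliceResponse]) -> Dict[SliceState, SliceResponse]:
    """Map each slice's state to its response, rejecting duplicate states."""
    by_state: Dict[SliceState, SliceResponse] = {}
    for response in responses:
        if response.state in by_state:
            raise ValueError(f"duplicate slice: {response.state.name}")
        by_state[response.state] = response
    return by_state


# Made-up records standing in for the rows of train_df and val_df
train_rows = [{"path": "a.png", "label": 0}, {"path": "b.png", "label": 1}]
val_rows = [{"path": "c.png", "label": 1}]

slices = index_by_state([
    SliceResponse(length=len(train_rows), data=train_rows, state=SliceState.training),
    SliceResponse(length=len(val_rows), data=val_rows, state=SliceState.validation),
])
```

Each element carries its own state, so in this sketch the order of the list does not matter when looking a slice up.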

For a successful Tensorleap integration, supplying a train set and a validation set is mandatory; the test and unlabeled sets are optional.
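
The requirement above can be checked locally before uploading. The sketch below is an illustration only: SliceState is a hypothetical stand-in for DataStateType, and missing_required_slices is a helper invented for this example, not part of code_loader.

```python
from enum import Enum
from typing import Iterable, List


class SliceState(Enum):
    # Hypothetical stand-in for DataStateType (not the code_loader enum)
    training = "training"
    validation = "validation"
    test = "test"
    unlabeled = "unlabeled"


# The two slices that must be supplied; test and unlabeled are optional
REQUIRED_STATES = (SliceState.training, SliceState.validation)


def missing_required_slices(states: Iterable[SliceState]) -> List[str]:
    """Return the names of required slices that are absent from `states`."""
    present = set(states)
    return [state.name for state in REQUIRED_STATES if state not in present]
```

An empty result means both mandatory slices are present; a non-empty one names what still has to be added to the returned list.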

For usage within a full script, see the Dataset Script.
