Data pipeline

All processing steps support both map-style and iterable-style data.

The pipeline consists of three steps:

  1. Data preparation:
    • Folds are split here
    • Data is saved to a parquet file or kept in memory
    • The data format is a dict with feature arrays or scalars (see the first sketch after this list)
  2. Data interface - provides access to any prepared data
    • The data format is a dict with feature arrays or scalars
    • Data can be accessed via the __getitem__ or __iter__ methods
    • Input arrays may be of any type; output arrays are torch.Tensor
    • Tuple samples aren't supported at this stage
    • Parquet file reading happens here
    • Augmentations are applied here (see the second sketch after this list)
  3. Data endpoints - provide a dataloader
    • The target for a supervised task is extracted from the dict here
    • The target for an unsupervised task is defined here
    • No augmentation happens here; if required, it should be implemented as part of the endpoint (see the third sketch after this list)
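
A minimal sketch of the data preparation step, assuming in-memory records; the feature names (`client_id`, `amount`, `mcc_code`), the 80/20 split, and the file names are illustrative, not a fixed API:

```python
import numpy as np
import pyarrow as pa
import pyarrow.parquet as pq

# Each sample is a dict with feature arrays (event sequences) or scalars.
records = [
    {
        "client_id": i,                                    # scalar feature
        "amount": np.random.rand(8).tolist(),              # feature array
        "mcc_code": np.random.randint(0, 100, 8).tolist(), # feature array
    }
    for i in range(1000)
]

# Split folds here: a plain random train/valid split.
rng = np.random.default_rng(42)
idx = rng.permutation(len(records))
split = int(0.8 * len(records))
train = [records[i] for i in idx[:split]]
valid = [records[i] for i in idx[split:]]

# Save to a parquet file (or keep the lists in memory).
pq.write_table(pa.Table.from_pylist(train), "train.parquet")
pq.write_table(pa.Table.from_pylist(valid), "valid.parquet")
```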
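
A sketch of a map-style data interface over the prepared records; `DictDataset` and the `augmentations` argument are hypothetical names used only for illustration:

```python
import torch
from torch.utils.data import Dataset

class DictDataset(Dataset):
    """Map-style access to prepared samples; an iterable variant
    would implement __iter__ instead of __getitem__."""

    def __init__(self, records, augmentations=None):
        self.records = records            # in-memory list of sample dicts
        self.augmentations = augmentations or []

    def __len__(self):
        return len(self.records)

    def __getitem__(self, idx):
        rec = self.records[idx]
        # Augmentations run here, on the raw feature dict.
        for aug in self.augmentations:
            rec = aug(rec)
        # Inputs may be lists, numpy arrays, or scalars; outputs are
        # torch.Tensor. Samples stay dicts: no tuples at this stage.
        return {k: torch.as_tensor(v) for k, v in rec.items()}
```

A parquet-backed variant would read the file in `__init__` (or lazily in `__iter__`) instead of taking an in-memory list.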
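
A sketch of a supervised data endpoint, assuming each sample dict carries a `target` key and that `dataset` is the `DictDataset` from the previous sketch; real variable-length event sequences would need a padding collate instead of `torch.stack`:

```python
import torch
from torch.utils.data import DataLoader

def supervised_collate(batch):
    # The supervised target is extracted from each sample dict here;
    # everything else becomes the model input. Assumes equal-length
    # feature arrays; variable-length sequences would need padding.
    x = {k: torch.stack([rec[k] for rec in batch])
         for k in batch[0] if k != "target"}
    y = torch.stack([rec["target"] for rec in batch])
    return x, y

loader = DataLoader(dataset, batch_size=32, collate_fn=supervised_collate)
```

An unsupervised endpoint would instead define the target inside the collate function (for example, building positive pairs from the samples) rather than reading a `target` key.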