When re-initializing the train-test split, it fails because the dimension 'split' already exists.

This is easily solved by:
if 'split' not in data.dims: data = data.expand_dims("split")  # expand_dims returns a new object, so assign the result
However, it also exposes a more important issue. When the train-test split is re-initialized, the user should get feedback if the split has changed. This message is purely for clarification. A user of sklearn.model_selection.ShuffleSplit might not expect the split to change (because they assume the seed is fixed), but unless random_state is set, it will. Hence, a test is needed that checks whether the splits already present in data are identical to the newly generated ones.
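
A minimal sketch of such a check, assuming the split is stored as a boolean mask of shape (n_splits, n_samples). The helper names make_split_mask, splits_changed and reinitialize_split are hypothetical and not part of the existing code. Only ShuffleSplit and its random_state parameter come from scikit-learn.

```python
import warnings

import numpy as np
from sklearn.model_selection import ShuffleSplit


def make_split_mask(n_samples, n_splits=3, test_size=0.25, random_state=None):
    """Return a boolean (n_splits, n_samples) array, True for training samples.

    Hypothetical helper: the mask layout is an assumption for illustration.
    """
    splitter = ShuffleSplit(
        n_splits=n_splits, test_size=test_size, random_state=random_state
    )
    mask = np.zeros((n_splits, n_samples), dtype=bool)
    for i, (train_idx, _test_idx) in enumerate(splitter.split(np.arange(n_samples))):
        mask[i, train_idx] = True
    return mask


def splits_changed(existing_mask, new_mask):
    """Return True if the two masks differ in shape or in any element."""
    return not np.array_equal(np.asarray(existing_mask), np.asarray(new_mask))


def reinitialize_split(existing_mask, n_samples, **kwargs):
    """Generate a new split and warn if it differs from the existing one."""
    new_mask = make_split_mask(n_samples, **kwargs)
    if existing_mask is not None and splits_changed(existing_mask, new_mask):
        warnings.warn(
            "Train-test split changed on re-initialization; "
            "pass a fixed random_state to keep it reproducible.",
            UserWarning,
        )
    return new_mask
```

With a fixed random_state, re-initializing reproduces the same mask and stays silent. With the default random_state=None, ShuffleSplit draws a fresh shuffle on every call, so the warning would normally fire.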