To monitor catastrophic forgetting, we should also evaluate on a general-purpose ("normal") dataset, so that any drop in general language ability shows up alongside the task metrics. One option is a random subset of English Wikipedia pages from here: https://huggingface.co/datasets/wikimedia/wikipedia/viewer/20231101.en
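One possible way to draw such a subset is sketched below, assuming the Hugging Face `datasets` library is installed and network access is available. The function names `take_texts` and `sample_wikipedia`, the subset size, the seed, and the buffer size are illustrative choices, not something fixed by this note. Streaming avoids downloading the full dump; the shuffle is buffer-based, so the result is an approximately random subset rather than an exact uniform sample.

```python
import itertools
from typing import Iterable, List


def take_texts(rows: Iterable[dict], n: int) -> List[str]:
    """Collect the first n non-empty article texts from an iterable of rows."""
    texts = (row["text"] for row in rows if row["text"].strip())
    return list(itertools.islice(texts, n))


def sample_wikipedia(n: int = 1000, seed: int = 0, buffer_size: int = 10_000) -> List[str]:
    """Stream English Wikipedia and return roughly random article texts.

    Assumption: rows expose a "text" field, as in the 20231101.en config.
    """
    # Imported lazily so take_texts can be used without the library installed.
    from datasets import load_dataset

    stream = load_dataset(
        "wikimedia/wikipedia", "20231101.en", split="train", streaming=True
    )
    # For a streaming dataset, shuffle() reorders shards and then mixes
    # examples through a fixed-size buffer; a fixed seed keeps the
    # evaluation subset the same across runs.
    shuffled = stream.shuffle(seed=seed, buffer_size=buffer_size)
    return take_texts(shuffled, n)
```

Calling `sample_wikipedia()` once and saving the result to disk would keep the evaluation set fixed across checkpoints, so that changes in the metric reflect the model rather than the sample.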