Adding shuffled version of the toxicity dataset #1892

Draft
rcap107 wants to merge 8 commits into skrub-data:main from rcap107:shuffle-toxicity-dataset

Conversation

rcap107 (Member) commented Feb 9, 2026

Closes #1234

This PR adds a shuffled version of the toxicity dataset to the list of datasets used by the fetchers. The corresponding dataset has already been added to the dataset repository.

TODO:

  • Upload dataset to figshare
  • Upload dataset to OSF

rcap107 added this to the Release 0.8.0 milestone on Feb 9, 2026

Development

Successfully merging this pull request may close these issues.

DOC - shuffle the toxicity dataset in its example

2 participants