
Datasets not found #39

@icyjayd

I have installed this package, but I can't load the datasets.

My code is as follows:

from genomic_benchmarks.data_check import list_datasets
from genomic_benchmarks.dataset_getters.pytorch_datasets import get_dataset
from genomic_benchmarks.data_check import info
from genomic_benchmarks.loc2seq import download_dataset

When I try to download, for example, 'demo_coding_vs_intergenomic_seqs', I get: FileNotFoundError: Dataset demo_coding_vs_intergenomic_seqs not found.

For completeness, I wrote code that attempts to download each of the datasets:

for dset in list_datasets():
    try:
        get_dataset(dset, split='train')
        print("success!")
    except Exception:  # a bare "except:" would also swallow KeyboardInterrupt
        print(dset, "not found")

The output is as follows:

demo_coding_vs_intergenomic_seqs not found
human_enhancers_cohn not found
human_ocr_ensembl not found
demo_human_or_worm not found
human_ensembl_regulatory not found
drosophila_enhancers_stark not found
dummy_mouse_enhancers_ensembl not found
human_enhancers_ensembl not found
human_nontata_promoters not found

The same error occurs with the info and download_dataset functions. Any help figuring out what I'm doing wrong would be appreciated.
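
The loop above prints "not found" for every failure, so it hides whether each dataset fails with the same FileNotFoundError or with something else. A minimal sketch of a variant that records the exception type and message per dataset may help narrow this down. The names `probe` and `fake_load` are hypothetical helpers written for illustration, not part of genomic_benchmarks; `fake_load` only imitates the error message quoted above so that the example runs without the package installed.

```python
def probe(names, load):
    """Call load(name) for each name.

    Returns a dict mapping each name to None on success,
    or to "ExceptionType: message" on failure.
    """
    results = {}
    for name in names:
        try:
            load(name)
            results[name] = None
        except Exception as exc:
            # Keep the real exception type and text instead of a fixed message.
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


def fake_load(name):
    # Hypothetical stand-in for get_dataset(name, split='train').
    if name != "ok_dataset":
        raise FileNotFoundError(f"Dataset {name} not found.")


if __name__ == "__main__":
    report = probe(["ok_dataset", "missing_dataset"], fake_load)
    for name, err in report.items():
        print(name, "loaded" if err is None else err)
```

With the package installed, the same helper could presumably be driven by the calls from the snippet above, for example probe(list_datasets(), lambda n: get_dataset(n, split='train')).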
