CIRRIT2IRetrieval errored out #4436

@isaac-chung

Describe the bug

Evaluating task CIRRIT2IRetrieval:   0%|          | 0/1 [02:27<?, ?it/s]
Traceback (most recent call last):
  File "/data/home/niklas/isaac/mteb/.venv/bin/mteb", line 10, in <module>
    sys.exit(main())
  File "/data/home/niklas/isaac/mteb/mteb/cli/build_cli.py", line 497, in main
    args.func(args)
  File "/data/home/niklas/isaac/mteb/mteb/cli/build_cli.py", line 91, in run
    mteb.evaluate(
  File "/data/home/niklas/isaac/mteb/mteb/evaluate.py", line 390, in evaluate
    _res = evaluate(
  File "/data/home/niklas/isaac/mteb/mteb/evaluate.py", line 489, in evaluate
    result = _evaluate_task(
  File "/data/home/niklas/isaac/mteb/mteb/evaluate.py", line 162, in _evaluate_task
    task_results[split] = task.evaluate(
  File "/data/home/niklas/isaac/mteb/mteb/abstasks/retrieval.py", line 335, in evaluate
    return super().evaluate(
  File "/data/home/niklas/isaac/mteb/mteb/abstasks/abstask.py", line 209, in evaluate
    scores[hf_subset] = self._evaluate_subset(
  File "/data/home/niklas/isaac/mteb/mteb/abstasks/retrieval.py", line 403, in _evaluate_subset
    results = retriever(
  File "/data/home/niklas/isaac/mteb/mteb/_evaluators/retrieval_evaluator.py", line 71, in __call__
    return search_model.search(
  File "/data/home/niklas/isaac/mteb/mteb/models/search_wrappers.py", line 183, in search
    result_heaps = self._full_corpus_search(
  File "/data/home/niklas/isaac/mteb/mteb/models/search_wrappers.py", line 254, in _full_corpus_search
    sub_corpus_embeddings = self.model.encode(
  File "/data/home/niklas/isaac/mteb/mteb/models/model_implementations/lco_embedding_models.py", line 125, in encode
    for batch in tqdm(inputs, disable=not show_progress_bar):
  File "/data/home/niklas/isaac/mteb/.venv/lib/python3.10/site-packages/tqdm/std.py", line 1181, in __iter__
    for obj in iterable:
  File "/data/home/niklas/isaac/mteb/.venv/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 741, in __next__
    data = self._next_data()
  File "/data/home/niklas/isaac/mteb/.venv/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 801, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/data/home/niklas/isaac/mteb/.venv/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 57, in fetch
    return self.collate_fn(data)
  File "/data/home/niklas/isaac/mteb/mteb/_create_dataloaders.py", line 235, in _custom_collate_fn
    raise ValueError(f"Found None in batch for key '{key}'")
ValueError: Found None in batch for key 'text'
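
For context, the traceback shows the error being raised while the corpus is encoded (`_full_corpus_search` calls `self.model.encode`, and the dataloader's collate function rejects the batch). The snippet below is only a minimal sketch of that failure mode, not mteb's actual `_custom_collate_fn`. It assumes the corpus rows carry a `text` field whose value is `None`, which is a guess for an image-only corpus. The helper names `collate_with_none_check` and `find_none_rows` and the sample rows are hypothetical.

```python
from collections import defaultdict


def collate_with_none_check(batch):
    """Merge a list of row dicts into a dict of lists, rejecting None values.

    Hypothetical stand-in for a collate function; not mteb's implementation.
    """
    collated = defaultdict(list)
    for row in batch:
        for key, value in row.items():
            collated[key].append(value)
    for key, values in collated.items():
        if any(v is None for v in values):
            raise ValueError(f"Found None in batch for key '{key}'")
    return dict(collated)


def find_none_rows(rows, key):
    """Return the indices of rows whose `key` is missing or None."""
    return [i for i, row in enumerate(rows) if row.get(key) is None]


# Assumed shape of an image-only corpus: a `text` column exists but is empty.
corpus = [
    {"id": "img-0", "image": "<image 0>", "text": None},
    {"id": "img-1", "image": "<image 1>", "text": None},
]

# Locate the offending rows before they reach the collate step.
print(find_none_rows(corpus, "text"))

try:
    collate_with_none_check(corpus)
except ValueError as err:
    print(err)
```

A check like `find_none_rows` on the loaded corpus would help tell whether the `None` values come from the dataset itself or are introduced later, when the batches are built for the model.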

To reproduce

mteb run -t CIRRIT2IRetrieval -m LCO-Embedding/LCO-Embedding-Omni-3B

Additional information

No response

Are you interested to contribute a fix for this bug?

Yes
