Hi authors, thanks for your awesome work!
I'm attempting to train `Qwen/Qwen2.5-VL-3B-Instruct` using the provided training script, but I've encountered several issues that I'd like to clarify:
Training Script
```bash
#!/bin/bash
setting='dozen_vsr_qwen_add_grounded_reasoning_single_turn_think_rethink'
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
export WANDB_PROJECT=$setting

# Load config variables
source scripts/train_base_config.sh

# Run the training script with DeepSpeed
python -m accelerate.commands.launch \
    --config_file ./accelerate_configs/deepspeed_zero2.yaml \
    --main_process_port 20092 \
    grpo-gr/GRPO_GR.py \
    --train_data_path ./GRIT_data/tallyqa_train_10.jsonl,./GRIT_data/vsr_cot_train_10.jsonl \
    --train_image_folder_path ./GRIT_data/tallyqa,./GRIT_data/vsr \
    --eval_data_path ./GRIT_data/vsr_val.jsonl,./GRIT_data/mme_val.jsonl,./GRIT_data/tallyqa_val.jsonl,./GRIT_data/gqa_val.jsonl,./GRIT_data/mathvista_mini_val.jsonl,./GRIT_data/ovd_position_val.jsonl,./GRIT_data/ovd_relationship_val.jsonl,./GRIT_data/ovd_negation_val.jsonl \
    --eval_image_folder_path ./GRIT_data/vsr,./GRIT_data/mme,./GRIT_data/tallyqa,./GRIT_data/gqa,./GRIT_data/mathvista_mini,./GRIT_data/ovd_position,./GRIT_data/ovd_relationship,./GRIT_data/ovd_negation \
    --setting $setting \
    --max_turns 1 \
    --output_dir output/$setting \
    --hub_model_id $setting \
    $COMMON_ARGS \
    --eval_steps 50 \
    --save_steps 50 \
    --num_train_epochs 500 \
    --lr_scheduler_type cosine \
    --per_device_eval_batch_size 8
```
1. Dataset Issues
MME Dataset
Most datasets can be downloaded normally, but for the MME dataset, when I download it from the repository path specified in the paper (link), the image names in the downloaded files don't match the names listed in `mme_val.jsonl`.
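For reference, here's the quick check I used to spot the mismatch (a hypothetical, standalone sketch; the local paths and the `"image"` key are my guesses at the data layout, not something from the repo):

```python
import json
import os

# Hypothetical local paths -- adjust to wherever the MME data was downloaded.
jsonl_path = "./GRIT_data/mme_val.jsonl"
image_dir = "./GRIT_data/mme"

expected = set()
with open(jsonl_path) as f:
    for line in f:
        record = json.loads(line)
        # Assuming the image filename is stored under an "image" key;
        # the actual key in the released jsonl may differ.
        expected.add(os.path.basename(record["image"]))

actual = set(os.listdir(image_dir))
print("in jsonl but missing on disk:", sorted(expected - actual)[:10])
print("on disk but not in jsonl:   ", sorted(actual - expected)[:10])
```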
Missing Label Files
The following label files are missing:
- `./GRIT_data/ovd_relationship_val.jsonl`
- `./GRIT_data/ovd_negation_val.jsonl`
Could you please provide these files or clarify how to obtain them?
2. Flash Attention Issues
During initial training, I encounter the following warning/error:
```
You are attempting to use Flash Attention 2.0 without specifying a torch dtype. This might lead to unexpected behaviour
...
```
The specific error indicates that `float16` is not supported. I resolved it by manually specifying `torch_dtype=torch.bfloat16` during model initialization. Did you encounter this issue during your training, and what's the recommended way to handle it?
The relevant lines are in `GRIT/grpo-gr/GRPO_GRTrainer.py` (lines 233 to 234 at fd08d57):

```python
if "qwen" in model_id.lower():
    model = Qwen2_5_VLForConditionalGeneration.from_pretrained(model, **model_init_kwargs)
```
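For reference, this is roughly how I patched it (a minimal, self-contained sketch of my workaround; `model_init_kwargs` here just stands in for whatever keyword arguments the trainer actually builds):

```python
import torch
from transformers import Qwen2_5_VLForConditionalGeneration

model_id = "Qwen/Qwen2.5-VL-3B-Instruct"
model_init_kwargs = {"attn_implementation": "flash_attention_2"}

# Explicitly pin bfloat16 so Flash Attention 2 doesn't run with an
# unspecified dtype (and doesn't hit the unsupported-float16 error).
model_init_kwargs["torch_dtype"] = torch.bfloat16

if "qwen" in model_id.lower():
    model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
        model_id, **model_init_kwargs
    )
```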
3. Training Hyperparameters
I'd like to confirm a few things about the training parameters:
- **Epochs:** Is `--num_train_epochs 500` an experimental setting? It seems quite high; is it intentional?
- **Batch Size & Memory:** When training on 48GB VRAM, I can only set `per_device_train_batch_size` to 1, otherwise I get OOM errors. Is this normal? If the batch size can only be 1, should the learning rate be scaled accordingly, and what values would you recommend? (I've put the scaling I'm currently trying in the sketch after this list.)
- **Other Parameters:** Are the other hyperparameters in the script reasonable for this model size and task?
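For context, here's the back-of-the-envelope scaling I'm currently trying for the batch-size-1 case (my own guess based on the usual linear scaling rule, not anything from the paper; all numbers are made up):

```python
# Keep the effective batch size constant with gradient accumulation, and
# scale the learning rate linearly with the effective batch size.
base_lr = 1e-6             # hypothetical reference LR
reference_batch_size = 64  # hypothetical batch size the reference LR assumes

per_device_train_batch_size = 1
num_gpus = 8
gradient_accumulation_steps = 8

effective_batch_size = (
    per_device_train_batch_size * num_gpus * gradient_accumulation_steps
)
scaled_lr = base_lr * effective_batch_size / reference_batch_size
print(f"effective batch size: {effective_batch_size}, scaled lr: {scaled_lr:.2e}")
```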
4. Demo Environment
Regarding `gradio_qwen.py`, which is mentioned on the GitHub page: where can I find this file? It doesn't seem to be included in the current repository.
Environment:
- Model: Qwen/Qwen2.5-VL-3B-Instruct
- GPU: 8x GPUs with 48GB VRAM each
- Framework: DeepSpeed ZeRO-2
5. Logs
Also, it's very strange that the reward scores are always zero.
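To rule out a formatting issue on my side, I ran a quick standalone generation outside the trainer to eyeball whether the raw completions contain anything a format/accuracy reward could plausibly match (a hypothetical sketch; the image path and question are placeholders):

```python
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model_id = "Qwen/Qwen2.5-VL-3B-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Placeholder sample -- any local image/question pair will do.
image = Image.open("./GRIT_data/vsr/example.jpg")
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Is the cat on the mat?"},
]}]
prompt = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
completion = processor.batch_decode(
    out[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(completion)  # check by eye whether the expected answer/format shows up
```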

Any guidance on these issues would be greatly appreciated. Thank you again for your work on this project!