Run Evaluation

Step 1: Install dependencies

Download the evaluation code from GitHub, then install the dependencies:

pip install -r requirements.txt

Step 2: Run evaluation script

Each task has a corresponding evaluation script located in eval_scripts/. Example usage:

python eval_scripts/run_eval.py --task image_recog --pred path/to/your_predictions.json --gt path/to/ground_truth.json
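
The exact schema of the prediction and ground-truth JSON files is defined in each task's README. As a rough illustration only (the file name, sample IDs, and label layout below are hypothetical, not the official format), a classification-style prediction file could be written like this:

import json

# Hypothetical mapping from sample ID to predicted label; the real
# schema for each task is documented in its README under Task Details.
predictions = {
    "img_0001": "cat",
    "img_0002": "dog",
}

with open("your_predictions.json", "w") as f:
    json.dump(predictions, f, indent=2)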

Step 3: Metrics Reference

The evaluation metric varies by task (e.g., Accuracy, BLEU, F1, CLIPScore). Refer to each task's README in Task Details for the exact metrics and scoring protocol.
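
For classification-style tasks, Accuracy and F1 can be sanity-checked locally before running the official script. The snippet below is a minimal sketch, not the official scoring code; it assumes both JSON files map sample IDs to labels and that scikit-learn is installed.

import json
from sklearn.metrics import accuracy_score, f1_score

with open("path/to/your_predictions.json") as f:
    pred = json.load(f)
with open("path/to/ground_truth.json") as f:
    gt = json.load(f)

# Align predictions with ground truth by sample ID (assumed schema).
ids = sorted(gt)
y_true = [gt[i] for i in ids]
y_pred = [pred[i] for i in ids]

print("Accuracy:", accuracy_score(y_true, y_pred))
print("Macro F1:", f1_score(y_true, y_pred, average="macro"))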