Prepare Files
Step 1: Determine Your Target Data Category
The first step is to determine the target data category that best aligns with your model’s evaluation goals and readiness.
To accommodate varying computational budgets and model development stages, General-Bench is split into two distinct subsets: the closed set and the open set.
Closed Set: both the inputs and labels of all samples are publicly available, for free open-world use (e.g., academic experiments and comparisons).
Open Set: only the sample inputs are available; this subset is used for leaderboard ranking. Participants must submit their predictions to us for internal evaluation.

Scope Definitions
Tasks are further categorized into the following four evaluation scopes:
Scope-A: Full-spectrum Hero
Covers all modalities and task types under the General-Level benchmark. Designed for highly capable, general-purpose multimodal models.
Scope-B: Modality-specific Unified Hero
Focuses on a single modality or combinations of related modalities (e.g., image, video, audio, 3D), targeting modality-level generalists.
Scope-C: Comprehension/Generation Hero
Separates tasks into comprehension vs. generation within each modality. Suitable for lightweight or early-stage models due to its lower entry threshold.
Scope-D: Skill-specific Hero
Provides fine-grained evaluation focused on specific task clusters such as VQA, captioning, or speech recognition, making it ideal for partial generalists.
Step 2: Download the Dataset
Based on your selected scope and target tasks, download the corresponding dataset subsets.
Available Evaluation Datasets
- Scope-C:
Notes
If a scope contains multiple sub-tasks, please download each dataset individually.
All datasets follow a standardized file structure. For details, refer to Dataset Format.
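As a concrete illustration of that last point, a downloaded subset can be sanity-checked programmatically before evaluation. The sketch below is hypothetical: the file names `annotation.json` and `images/`, and the `data`/`file` keys, are assumptions for demonstration only; consult the Dataset Format page for the benchmark's actual layout.

```python
import json
import tempfile
from pathlib import Path

def check_subset(root: Path) -> bool:
    """Check that a dataset subset has an annotation file whose samples
    all reference existing media files.

    NOTE: 'annotation.json' and the 'data'/'file' schema are hypothetical
    names used for illustration; see the Dataset Format page for the
    real structure."""
    ann = root / "annotation.json"
    if not ann.is_file():
        return False
    data = json.loads(ann.read_text())
    # Every sample is expected to reference a media file that exists on disk.
    return all((root / s["file"]).is_file() for s in data.get("data", []))

# Build a tiny mock subset to demonstrate the check.
root = Path(tempfile.mkdtemp())
(root / "images").mkdir()
(root / "images" / "0001.jpg").write_bytes(b"\xff\xd8")  # stub JPEG bytes
(root / "annotation.json").write_text(
    json.dumps({"data": [{"file": "images/0001.jpg", "question": "?"}]})
)
print(check_subset(root))  # True for this mock subset
```

Running a check like this for each downloaded sub-task dataset catches missing or misplaced files early, before submitting predictions.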