Datasets

Browse 12 datasets for training, fine-tuning, and evaluation

LAION-5B-Curated

by laion-ai · Updated 1 month ago

image-text, multimodal, large-scale
89K
3400

ImageNet-22K

by stanfordai · Updated 2 weeks ago

image, classification
120K
2300

RedPajama-V3

by together-ai · Updated 4 days ago

text, pre-training, web
67K
1800

CommonVoice-18

by mozilla · Updated 5 days ago

audio, speech, multilingual
34K
1200

Wikipedia-EN-2025

by wikimedia · Updated 1 week ago

text, knowledge, english
45.2K
890

CodeSearchNet

by github · Updated 3 weeks ago

code, search, programming
28K
780

UltraChat-200K

by stingning · Updated 2 weeks ago

chat, instruction, sft
15K
560

Opus-4.6-Reasoning-3000x

by nohurry · Updated Feb 10

reasoning, filtered
7.5K
450

OmniAction

by OpenMOSS-Team · Updated 3 days ago

robotics, action
21.1K
222

Hacker-News

by open-index · Updated 1 minute ago

text, news
14.5K
217

OmniAction-LIBERO

by OpenMOSS-Team · Updated 3 days ago

robotics
1.42K
67

Eva Benchmark

by ServiceNow-AI · Updated 6 days ago

evaluation, benchmark
4.42K
57