BridgeData V2

Unverified

December 13, 2025, 03:47 AM
October 24, 2025, 09:36 PM

Last updated

Unknown

Release date

January 17, 2024

Size

60,096 samples | 441.0 GB

License

CC BY 4.0

Tags

robot learning

imitation learning

reinforcement learning

multi-task

multi-environment

object manipulation

environment manipulation

generalizable learning

BridgeData V2 is a diverse, large-scale robotic manipulation dataset containing 60,096 trajectories collected across 24 environments on a publicly available low-cost robot. Of these trajectories, 50,365 are teleoperated demonstrations across 13 skills and 9,731 are rollouts from a scripted, heavily randomized pick-and-place policy (intended to boost the robustness of the object repositioning skill). The dataset is compatible with open-vocabulary, multi-task learning methods conditioned on goal images or natural language instructions.
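The composition described above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the counts stated in the description; the constant and function names are illustrative and do not come from the dataset's own tooling.

```python
# Trajectory counts as stated in the dataset description.
TELEOP_DEMOS = 50_365       # teleoperated demonstrations
SCRIPTED_ROLLOUTS = 9_731   # rollouts from the scripted pick-and-place policy
STATED_TOTAL = 60_096       # total trajectories reported for the dataset


def composition(teleop: int, scripted: int) -> dict:
    """Return the total count and the share contributed by each data source."""
    total = teleop + scripted
    return {
        "total": total,
        "teleop_fraction": teleop / total,
        "scripted_fraction": scripted / total,
    }


summary = composition(TELEOP_DEMOS, SCRIPTED_ROLLOUTS)

# The two subsets should add up to the stated dataset size.
assert summary["total"] == STATED_TOTAL

print(f"total trajectories: {summary['total']}")
print(f"teleoperated share: {summary['teleop_fraction']:.1%}")
print(f"scripted share:     {summary['scripted_fraction']:.1%}")
```

Teleoperated demonstrations make up roughly 84% of the trajectories and scripted rollouts the remaining 16% or so, which may matter when deciding how to weight the two sources during training.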

To support broad generalization, data was collected for a wide range of tasks in many environments with varying objects, camera poses, and workspace positions. Each trajectory is labeled with a natural language instruction corresponding to the task the robot is performing.

The authors provide the teleoperated demonstration data and the data from the scripted pick-and-place policy as separate zip files. They also provide example model training code and pre-trained weights on their website.

All of the data was collected on a WidowX 250 6DOF robot arm. The authors collected demonstrations by teleoperating the robot with a VR controller. The control frequency is 5 Hz and the average trajectory length is 38 timesteps. For sensing, the setup uses an RGBD camera fixed in an over-the-shoulder view, two RGB cameras whose poses are randomized during data collection, and an RGB camera attached to the robot's wrist. Images are saved at a resolution of 640×480.
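The collection setup above implies a few derived quantities, such as the average trajectory duration. The sketch below works them out from the stated figures. The `Camera` class, the camera labels, and the variable names are hypothetical and are not part of the released data format; the totals are estimates because they rest on the average trajectory length.

```python
from dataclasses import dataclass

# Figures stated in the dataset description.
CONTROL_HZ = 5             # control frequency in Hz
AVG_TRAJ_STEPS = 38        # average trajectory length in timesteps
NUM_TRAJECTORIES = 60_096  # total trajectories
IMAGE_SIZE = (640, 480)    # saved image resolution (width, height)


@dataclass(frozen=True)
class Camera:
    """Hypothetical record describing one camera in the setup."""
    name: str    # illustrative label, not an official key
    depth: bool  # True for the RGBD camera
    mount: str   # "fixed", "randomized", or "wrist"


CAMERAS = [
    Camera("over_shoulder", depth=True, mount="fixed"),
    Camera("randomized_0", depth=False, mount="randomized"),
    Camera("randomized_1", depth=False, mount="randomized"),
    Camera("wrist", depth=False, mount="wrist"),
]

# Average duration of one trajectory in seconds: timesteps / frequency.
avg_duration_s = AVG_TRAJ_STEPS / CONTROL_HZ

# Rough totals, assuming every trajectory has the average length.
total_steps = NUM_TRAJECTORIES * AVG_TRAJ_STEPS
approx_hours = total_steps / CONTROL_HZ / 3600

print(f"average trajectory duration: {avg_duration_s:.1f} s")
print(f"estimated total timesteps:   {total_steps:,}")
print(f"estimated robot time:        {approx_hours:.0f} hours")
print(f"cameras: {[c.name for c in CAMERAS]} at {IMAGE_SIZE[0]}x{IMAGE_SIZE[1]}")
```

At 5 Hz, a 38-step trajectory lasts about 7.6 seconds on average.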

Data collection is credited to Abraham Lee, Mia Galatis, Caroline Johnson, Christian Aviña, Samantha Huang, and Nicholas Lofrese. Microsoft Research assisted in labeling parts of the data with language. Research was supported by the TPU Research Cloud and partly supported by ONR N00014-20-1-2383 and NSF IIS-2150826.

Sample

Modality

trajectory

Format

JPEG

Annotation

Content: Task description

Type: Natural language

Language: English

Annotators: Crowdsourced

Quality control: None

Source

Author

Homer Walke

Kevin Black

Frederik Ebert

Aviral Kumar

Anikait Singh

Yanlai Yang

Patrick Yin

Gengchen Yan

Kuan Fang

Ashvin Nair

Tony Zhao

Quan Vuong

Chongyi Zheng

Philippe Hansen-Estruch

Andre He

Vivek Myers

Moo Jin Kim

Max Du

Karl Schmeckpeper

Bernadette Bucher

Georgios Georgakis

Kostas Daniilidis

Chelsea Finn

Sergey Levine

Institution

University of California, Berkeley

Stanford University

Google DeepMind

Carnegie Mellon University

Contact

homer_walke@berkeley.edu

Citation

@inproceedings{walke2023bridgedata,
  author = {Walke, Homer and Black, Kevin and Lee, Abraham and Kim, Moo Jin and Du, Max and Zheng, Chongyi and Zhao, Tony and Hansen-Estruch, Philippe and Vuong, Quan and He, Andre and Myers, Vivek and Fang, Kuan and Finn, Chelsea and Levine, Sergey},
  booktitle = {Conference on Robot Learning (CoRL)},
  title = {BridgeData V2: A Dataset for Robot Learning at Scale},
  year = {2023}
}

Similar datasets

MIME

Open X-Embodiment

RoboNet

RT-1 Dataset

RoboSet

DROID: Distributed Robot Interaction Dataset

RoboTurk

RH20T

MT-Opt

Clione
