Clione

MT-Opt

Unverified
  • December 13, 2025, 06:06 PM
  • November 27, 2025, 05:18 AM
Last updated
Unknown
Release date
April 27, 2021
Size
800000 samples | -- GB
License
Unknown
Tags
robot learning
multi-task
object manipulation
multi-robot

MT-Opt is a diverse, multi-task robot dataset collected at scale: 7 robots gathered data on a set of 12 real-world tasks, with data for multiple distinct tasks collected simultaneously across multiple robots. The authors note that solutions to easier tasks can effectively bootstrap learning of more complex tasks. This is an important benefit of the multi-task system: an average MT-Opt policy for simple tasks might occasionally yield episodes that are successful for harder tasks. Over time, this allows them to start training an MT-Opt policy for the harder tasks and, consequently, to collect better data for those tasks. Importantly, this fluid data collection process results in an imbalanced dataset. They use data impersonation and re-balancing methods to address this imbalance by efficiently expanding and normalizing data.

In particular, they use 7 KUKA IIWA arms with two-finger grippers and 3 RGB cameras (left, right, and over the shoulder). To reset the environment automatically, they built an actuated resettable bin, which in turn allows the data collection process to be automated. More precisely, the environment consists of two bins (the right bin containing all the source objects and the left bin containing a plate fixture magnetically attached anywhere on the workbench) connected via a motorized hinge, so that after an episode ends the contents of the workbench can be automatically shuffled and then dumped back into the right bin to start the next episode. This process allows them to collect diverse data at scale: 24 hours per day, 7 days a week, across multiple robots.

One episode has ≈10 steps on average and takes ≈25 seconds to generate on a robot, including environment reset time. This amounts to ≈3300 episodes/day collected on a single robot, or ≈23K episodes/day collected across the fleet of 7 robots.
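The throughput figures above can be checked with a few lines of arithmetic. The sketch below is only a sanity check on the quoted numbers. The variable names are illustrative, and reading the gap between the ideal and the quoted rate as rounding or downtime is an assumption, not something the source states.

```python
# Figures quoted in the description above.
SECONDS_PER_EPISODE = 25        # approximate, includes environment reset time
NUM_ROBOTS = 7
REPORTED_PER_ROBOT = 3300       # approximate episodes/day on a single robot

SECONDS_PER_DAY = 24 * 60 * 60

# Upper bound if a robot generated episodes back to back all day.
ideal_per_robot = SECONDS_PER_DAY / SECONDS_PER_EPISODE

# Fleet-wide rate implied by the quoted per-robot rate.
reported_fleet = REPORTED_PER_ROBOT * NUM_ROBOTS

# Fraction of the ideal rate that the quoted rate corresponds to
# (assumption: the remainder is rounding and/or downtime).
implied_utilization = REPORTED_PER_ROBOT / ideal_per_robot

print(f"ideal per robot:     {ideal_per_robot:.0f} episodes/day")
print(f"reported fleet rate: {reported_fleet} episodes/day")
print(f"implied utilization: {implied_utilization:.1%}")
```

The ideal rate of 3456 episodes/day per robot is slightly above the quoted ≈3300, and 7 × 3300 = 23,100 matches the quoted ≈23K episodes/day for the fleet.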

For this project, nearly 800,000 episodes were collected over the course of 16 months.
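As a rough cross-check, the total can be compared with the quoted fleet rate of ≈23K episodes/day. The sketch below assumes an average month length of 30.44 days and treats "nearly 800,000" as exactly 800,000; both are simplifications made here, not figures from the source.

```python
TOTAL_EPISODES = 800_000        # "nearly 800,000" in the description
FLEET_RATE_PER_DAY = 23_000     # quoted approximate fleet rate
MONTHS = 16
DAYS_PER_MONTH = 30.44          # assumption: average calendar month

# Days of uninterrupted full-fleet collection needed to reach the total.
fleet_days_at_quoted_rate = TOTAL_EPISODES / FLEET_RATE_PER_DAY

# Average daily rate if collection were spread evenly over the whole period.
average_per_day = TOTAL_EPISODES / (MONTHS * DAYS_PER_MONTH)

print(f"days at quoted fleet rate: {fleet_days_at_quoted_rate:.1f}")
print(f"average episodes per day:  {average_per_day:.0f}")
```

The average over 16 months comes out far below the quoted daily rate, so the ≈23K episodes/day figure is best read as a rate the system could reach rather than one sustained across the whole collection period.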

The data was collected under varying conditions: 1) Locations: three different physical lab locations. 2) Time of day: robots ran as close to 24x7 as possible. 3) Robots: 6-7 KUKAs with variations in background, lighting, and slight variation in camera pose. 4) Success detectors: the authors iteratively improved their success detectors. 5) RL training regimes: they developed better training loops, hyperparameters, and architectures as time went on. 6) Policies: a varied distribution of scripted, epsilon-greedy, and on-policy data collection over time. Data collection started in an original physical lab location and was paused due to COVID-19; the robots were later set up at a different physical lab location, which affected lighting and backgrounds.

MT-Opt

Modality
trajectory
Format
Unknown
Description None found.
Source
Author
Dmitry Kalashnikov
Jake Varley
Yevgen Chebotar
Ben Swanson
Rico Jonschkowski
Chelsea Finn
Sergey Levine
Karol Hausman
Institution
Google

Citation

@article{mtopt2021arxiv,
    title={MT-OPT: Continuous Multi-Task Robotic Reinforcement Learning at Scale},
    author={Dmitry Kalashnikov and Jake Varley and 
            Yevgen Chebotar and Ben Swanson and 
            Rico Jonschkowski and Chelsea Finn and 
            Sergey Levine and Karol Hausman},
    journal={arXiv},
    year={2021}
}
        

Clione is an open repository for transparent dataset sourcing, supporting responsible research in robotics and machine learning.
Our mission is to make finding and understanding datasets easy and intuitive.