Clione

GoLD: Grounded Language Dataset

Verified
  • October 16, 2025, 09:15 PM
Last updated
Unknown
Release date
September 28, 2020
Size
16500 samples | -- GB
License
Unknown
Tags
hri
language grounding
multi-modal
natural language
speech
hci

The Grounded Language Dataset (GoLD) is a multimodal dataset of RGB+depth images of common household objects, along with English natural language descriptions in multiple formats: text, speech (audio), and speech transcriptions. GoLD comprises RGB and depth point cloud images of 47 classes of objects in five high-level categories. It includes 16500 text and 16500 speech descriptions gathered with Amazon Mechanical Turk (AMT).

The dataset is intended for research at the intersection of robotics, NLP, and HCI. It may help researchers investigate how to learn from multiple modalities (image, depth, text, speech, and transcription) and how those modalities interact, as well as how differences in vernacular across the modalities affect results.
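Because the descriptions are distributed as TSV, a first step in comparing modalities is often to group them per item and per modality. The following is a minimal sketch of that step, not the dataset's documented loader: the column names (`item_id`, `modality`, `description`) and the sample rows are hypothetical placeholders, so check the actual GoLD files for the real schema before adapting it.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows with HYPOTHETICAL column names; the real GoLD
# TSV schema may differ and should be checked against the dataset files.
SAMPLE_TSV = (
    "item_id\tmodality\tdescription\n"
    "apple_1\ttext\tA red apple.\n"
    "apple_1\tspeech\tthis is a red apple\n"
    "mug_2\ttext\tA white coffee mug.\n"
)


def group_descriptions(tsv_file):
    """Group description rows by item, then by modality."""
    grouped = defaultdict(lambda: defaultdict(list))
    reader = csv.DictReader(tsv_file, delimiter="\t")
    for row in reader:
        grouped[row["item_id"]][row["modality"]].append(row["description"])
    return grouped


# Parse the in-memory sample; a real file would be opened with
# open(path, newline="", encoding="utf-8") instead of io.StringIO.
grouped = group_descriptions(io.StringIO(SAMPLE_TSV))
for item_id, by_modality in grouped.items():
    for modality, descriptions in by_modality.items():
        print(item_id, modality, descriptions)
```

Grouping by item first makes it straightforward to pair a typed description with the transcription of a spoken one for the same object, which is the kind of cross-modality comparison the dataset is meant to support.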

Modality
RGB-D image
audio
Format
PNG
WAV
TSV
Source
Author
Patrick Jenkins
Rishabh Sachdeva
Gaoussou Youssouf Kebe
Padraig Higgins
Kasra Darvish
Edward Raff
Don Engel
John Winder
Francis Ferraro
Cynthia Matuszek
Institution
University of Maryland, Baltimore County
Booz Allen Hamilton
Johns Hopkins Applied Physics Laboratory
Contact
pjenk1@umbc.edu
rishabs1@umbc.edu
mb88814@umbc.edu
phiggin1@umbc.edu
kasradarvish@umbc.edu
edraff1@umbc.edu
donengel@umbc.edu
jwinder1@umbc.edu
ferraro@umbc.edu
cmat@umbc.edu

Citation

@inproceedings{kebe2021a,
  title = {A Spoken Language Dataset of Descriptions for Speech-Based Grounded Language Learning},
  author = {Gaoussou Youssouf Kebe and Padraig Higgins and Patrick Jenkins and Kasra Darvish and Rishabh Sachdeva and Ryan Barron and John Winder and Donald Engel and Edward Raff and Francis Ferraro and Cynthia Matuszek},
  booktitle = {Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1)},
  year = {2021},
  url = {https://openreview.net/forum?id=Yx9jT3fkBaD}
}

Clione is an open repository for transparent dataset sourcing, supporting responsible research in robotics and machine learning.
Our mission is to make finding and understanding datasets easy and intuitive.
