The Dutch Image Description and Eye-tracking Corpus (DIDEC) contains 307 images from the MS COCO dataset, each accompanied by eye-tracking data and spoken descriptions (about 16 per image). A unique feature of this corpus is that it provides two kinds of eye-tracking data side by side:

  1. Free viewing data, where participants look at an image for three seconds.
  2. Production viewing data, collected simultaneously with the spoken descriptions.

The spoken descriptions are manually transcribed and annotated with tags indicating corrections, repetitions, and (filled) pauses. See this page for more information about the creation process.


Picture of a herd of sheep, with a shepherd and a mule, in a rural landscape.
Picture of sheep overlaid with eye-tracking data.
Photo by Jacinta Lluch Valero (CC BY-SA 2.0)

Description: Een hele kudde schapen met een herder erachter en een pakezel
Gloss: A whole herd of sheep with a shepherd behind them and a mule
Raw: Een hele kudde schapen <uh> met een man <corr> met een herder erachter en een pakezel
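
To illustrate how the annotation tags in a Raw transcription might be handled, here is a minimal Python sketch. It assumes that every tag is written in angle brackets, as `<uh>` and `<corr>` are in the example above; the function names `count_tags` and `strip_tags` are hypothetical and not part of the corpus tooling. The sketch only counts and removes the tags. It does not attempt to resolve corrections, so the corrected words ("met een man") remain in the output.

```python
import re
from collections import Counter

# Assumption: tags look like <name>, as in the Raw example (<uh>, <corr>).
TAG_PATTERN = re.compile(r"<(\w+)>")


def count_tags(raw: str) -> Counter:
    """Count how often each annotation tag occurs in a raw transcription."""
    return Counter(TAG_PATTERN.findall(raw))


def strip_tags(raw: str) -> str:
    """Remove the tags and collapse the leftover whitespace.

    Corrections are not resolved: the words before a <corr> tag are kept.
    """
    without_tags = TAG_PATTERN.sub(" ", raw)
    return " ".join(without_tags.split())


raw = (
    "Een hele kudde schapen <uh> met een man <corr> "
    "met een herder erachter en een pakezel"
)
tag_counts = count_tags(raw)
plain = strip_tags(raw)
print(tag_counts)
print(plain)
```

Counting tags per description is one simple way to get an overview of how often speakers hesitate or correct themselves. Turning a Raw transcription into the cleaned Description would additionally require knowing which words each correction replaces.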