GIVE Generating Instructions in Virtual Environments

GIVE-2.5: Early Announcement


Part of Generation Challenges 2011
Endorsed by SIGGEN, SIGDIAL, and SIGSEM

http://www.give-challenge.org/research/

Over the past two years, we have organized the Challenges on Generating Instructions in Virtual Environments (GIVE). In 2008-09, we evaluated five natural language generation systems in the first challenge (GIVE-1); the Second GIVE Challenge (GIVE-2), in which we are evaluating seven systems, is currently underway.

We are now announcing that we will organize the Second Second GIVE Challenge (GIVE-2.5) in the winter of 2010-11. The task in GIVE-2.5 will be essentially identical to the GIVE-2 task, so that GIVE-2 systems can be improved based on experience from the evaluation and more people can participate in the same task.

We invite you to consider participating in GIVE-2.5. For more information and to try out the GIVE-2 software, see http://www.give-challenge.org/research.

If you are potentially interested in participating, please email us at koller@mmci.uni-saarland.de so we know to keep you updated.

Overview

The Challenge on Generating Instructions in Virtual Environments (GIVE) is a novel approach to the notoriously hard problem of evaluating NLG systems. In this scenario, a human user performs a "treasure hunt" task in a virtual 3D environment. The NLG system's job is to generate, in real time, a sequence of natural-language instructions that will help the user perform this task. The crucial thing is that users connect to the generation systems over the Internet. By logging how well they were able to follow the system's instructions, we can evaluate the quality of these instructions in terms of task completion rates and times, subjective measures such as helpfulness and friendliness, and runtime performance. Because the user and the system don't need to be physically in the same place, access to experimental subjects over the Internet becomes easy.
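As a rough illustration of the objective measures mentioned above, the following Python sketch computes task completion rates and mean completion times per system from a list of logged game runs. It is not the GIVE evaluation code: the GameRun record, its field names, and the choice to average times over successful runs only are assumptions made for this example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class GameRun:
    """One logged game run (hypothetical record layout, not the GIVE log format)."""
    system: str        # name of the NLG system that gave the instructions
    success: bool      # did the user complete the treasure hunt?
    duration_s: float  # time from start to end of the run, in seconds


def summarize(runs):
    """Return per-system run count, completion rate, and mean time of successful runs."""
    by_system = {}
    for run in runs:
        by_system.setdefault(run.system, []).append(run)

    summary = {}
    for system, system_runs in by_system.items():
        successful_times = [r.duration_s for r in system_runs if r.success]
        summary[system] = {
            "runs": len(system_runs),
            "success_rate": len(successful_times) / len(system_runs),
            # Assumption: time is only meaningful for runs that were completed.
            "mean_success_time_s": mean(successful_times) if successful_times else None,
        }
    return summary


# Made-up example data, for illustration only.
example_runs = [
    GameRun("A", True, 300.0),
    GameRun("A", False, 120.0),
    GameRun("A", True, 500.0),
    GameRun("B", False, 90.0),
]
example_summary = summarize(example_runs)
```

Subjective measures such as helpfulness and friendliness would come from user questionnaires rather than from the logs, and are left out of this sketch.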

GIVE is a theory-neutral, end-to-end evaluation effort for NLG systems. It involves research opportunities in text planning, sentence planning, realization, and situated communication. One particularly interesting aspect of situating the generation problem in a virtual environment is that spatial and relational expressions play a bigger role than in other NLG tasks. Beyond NLG, GIVE can be interesting as a testbed for improving the NLG components of dialogue systems, and for computational semanticists working on spatial language.

The GIVE-2 Task

In the GIVE-1 Challenge, which we ran last year, five NLG systems were evaluated using data from almost 1200 game runs. To our knowledge, this made GIVE-1 the largest ever NLG evaluation effort in terms of the number of experimental subjects. We presented the results of the evaluation at the ENLG Workshop, and have verified that these results are consistent with (but more detailed than) the results that could be obtained from a traditional lab-based evaluation.

In GIVE-2 we are evaluating seven systems; the public evaluation is currently underway (see www.give-challenge.org). The main novelty in GIVE-2 is the move from discrete to continuous worlds. GIVE-1 worlds were based on square tiles: the user could only jump from the center of one tile to the center of the next, and turn in 90-degree steps. GIVE-2 permits free, continuous movement in the worlds. This makes the generation task more challenging because simple instructions of the form "walk three steps forward" are no longer possible. The results of GIVE-2 will be presented at the INLG conference this year.
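The difference between the two kinds of worlds can be sketched in a few lines of Python. This is an illustration only, not part of the GIVE software: the class names, the coordinate conventions, and the heading encoding are all assumptions made for this example.

```python
import math
from dataclasses import dataclass

# The four headings a tile-based world allows (90-degree turns only):
# north, east, south, west, as (dx, dy) unit steps.
HEADINGS = [(0, 1), (1, 0), (0, -1), (-1, 0)]


@dataclass(frozen=True)
class DiscretePose:
    """User state in a tile-based world (hypothetical model)."""
    x: int
    y: int
    heading: int  # index into HEADINGS

    def forward(self, steps):
        dx, dy = HEADINGS[self.heading]
        return DiscretePose(self.x + steps * dx, self.y + steps * dy, self.heading)

    def turn_right(self):
        return DiscretePose(self.x, self.y, (self.heading + 1) % 4)


@dataclass(frozen=True)
class ContinuousPose:
    """User state in a world with free movement (hypothetical model)."""
    x: float
    y: float
    angle_deg: float  # clockwise from north; any value is possible

    def forward(self, distance):
        rad = math.radians(self.angle_deg)
        return ContinuousPose(
            self.x + distance * math.sin(rad),
            self.y + distance * math.cos(rad),
            self.angle_deg,
        )


# In the discrete world, "walk three steps forward" names exactly one target tile.
after_three_steps = DiscretePose(0, 0, 0).forward(3)

# In the continuous world there is no step unit: distance and angle are arbitrary reals.
after_some_walking = ContinuousPose(0.0, 0.0, 37.5).forward(2.2)
```

In the discrete model an instruction maps to a unique resulting state, so the system can check it exactly. In the continuous model the user's position and orientation can take any value, which is why step-counting instructions no longer apply.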

Anyone is invited to submit an NLG system to participate in the GIVE-2.5 Challenge. We particularly invite contributions from students and student teams. To get an idea of what this involves, you may want to go to the GIVE website mentioned above and take a look at our EACL 2009 demo paper describing the software architecture, or download the GIVE-2 software and look at it in more detail.

Provisional Timeline

We plan to use essentially the same software for GIVE-2.5 that we used in GIVE-2. This means that GIVE-2 systems should be adaptable to GIVE-2.5 with minimal effort. While we don't have a precise schedule yet, we hope to present the results of GIVE-2.5 at ENLG 2011. This probably means that the public evaluation phase will take place at some point in the winter of 2010-11. We will distribute a call for participation with more details and provide the GIVE-2.5 software in due time.

Organizing committee

Donna Byron, Northeastern University
Justine Cassell, Northwestern University
Robert Dale, Macquarie University
Alexander Koller, Saarland University
Johanna Moore, University of Edinburgh
Jon Oberlander, University of Edinburgh
Kristina Striegnitz, Union College