GIVE: Generating Instructions in Virtual Environments

GIVE-2: Heat maps

On this page, we use heat maps to visualize the areas in which different NLG systems had difficulties. Each map colors the tiles of one evaluation world according to an intensity: either the number of users who were standing on a tile when they lost the game, or the number of users who were standing on a tile when they cancelled the game. Thus, "warmer" areas are those in which a system had particular difficulties. Below, we first link to the individual heat maps; then we explain in more detail how they were generated.

The heat maps

Locations where users lost:

Locations where the game was cancelled:

How the heat maps are computed

For each combination of world and server, we extract from the database the number of times, per tile, that users lost the game and the number of times the game was cancelled. "Average" means that we divide these total counts by the number of valid games for that world/server combination. An event on a tile is counted only if it occurred after the tutorial had been completed. Note that, to determine where users lost, we use the tile on which the user received the "lost" message; this may not be the tile on which the alarm was triggered.
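The counting and averaging step can be illustrated with a minimal sketch. It assumes the events for one world/server combination have already been read from the database and filtered to those occurring after the tutorial. The function name `average_tile_counts`, the representation of a tile as an `(x, z)` tuple, and the sample data are hypothetical and not taken from the GIVE-2 software.

```python
from collections import Counter


def average_tile_counts(event_tiles, num_valid_games):
    """Return the per-tile event count divided by the number of valid games.

    event_tiles: iterable of (x, z) tile coordinates, one entry per "lost"
        (or "cancelled") event; assumed to be post-tutorial events for a
        single world/server combination (hypothetical input format).
    num_valid_games: number of valid games for that combination.
    """
    if num_valid_games <= 0:
        raise ValueError("num_valid_games must be positive")
    counts = Counter(event_tiles)  # total events per tile
    return {tile: n / num_valid_games for tile, n in counts.items()}


# Made-up example: three "lost" events over four valid games.
lost_tiles = [(3, 4), (3, 4), (7, 1)]
averages = average_tile_counts(lost_tiles, num_valid_games=4)
```

Tiles on which no event occurred are simply absent from the result and can be treated as zero.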

We then normalize the values by scaling the maximum value to one. In the "normalized for each world" case, we use the maximum value over all tiles and all servers for that world; in the "normalized for each world+server" case, we use the maximum value over all tiles for one specific combination of world and server.
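The difference between the two normalization modes can be sketched as follows. This is an illustration under assumptions, not the original implementation: the data layout (a dictionary keyed by `(world, server)` that maps tiles to averaged values), the function names, and the sample values are all hypothetical.

```python
def normalize_per_world_server(values):
    """Scale each (world, server) map so that its own maximum becomes one."""
    result = {}
    for key, tiles in values.items():
        peak = max(tiles.values(), default=0.0)
        result[key] = {t: (v / peak if peak > 0 else 0.0)
                       for t, v in tiles.items()}
    return result


def normalize_per_world(values):
    """Scale by the maximum over all tiles and all servers of each world."""
    peaks = {}
    for (world, _server), tiles in values.items():
        peak = max(tiles.values(), default=0.0)
        peaks[world] = max(peaks.get(world, 0.0), peak)
    return {
        (world, server): {t: (v / peaks[world] if peaks[world] > 0 else 0.0)
                          for t, v in tiles.items()}
        for (world, server), tiles in values.items()
    }


# Made-up averaged values for one world and two servers.
values = {
    ("world1", "A"): {(0, 0): 0.5, (1, 0): 0.25},
    ("world1", "B"): {(0, 0): 0.1},
}
by_world = normalize_per_world(values)
by_world_server = normalize_per_world_server(values)
```

In this example, server B's only tile is scaled relative to server A's peak in the per-world case, so it stays well below one. In the per-world+server case it becomes one, because it is the maximum of its own map. Per-world normalization therefore keeps servers comparable within a world, while per-world+server normalization makes the pattern within each single map easier to see.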

The tiles in the heat map are then colored according to the scale below. Left/blue represents low values, and right/red represents high values. Black is zero, and white is one.
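A rough sketch of such a color mapping is given here. The exact color scale is not reproduced in the text, so the straight blue-to-red interpolation for intermediate values is an assumption; the real scale may pass through other hues. Only the endpoints (black for zero, white for one) and the blue-low/red-high direction come from the description above. The function name `tile_color` is hypothetical.

```python
def tile_color(value):
    """Map a normalized value in [0, 1] to an (R, G, B) tuple.

    Assumed scheme: exactly 0 is black, exactly 1 is white, and values in
    between are interpolated linearly from blue (low) to red (high).
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be normalized to the range [0, 1]")
    if value == 0.0:
        return (0, 0, 0)        # no events on this tile
    if value == 1.0:
        return (255, 255, 255)  # the maximum used for normalization
    red = round(255 * value)
    blue = round(255 * (1.0 - value))
    return (red, 0, blue)


low = tile_color(0.25)   # mostly blue
high = tile_color(0.75)  # mostly red
```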