Interpreting natural language instructions using language, vision and behavior

Final version published in: ACM Transactions on Interactive Intelligent Systems, Vol. 4, No. 3, Article 13, Publication date: July 2014.

Bibliographic Details
Main Authors:	Benotti, Luciana, Lau, Tessa, Villalba, Martín Federico
Other Authors:	https://orcid.org/0000-0001-7456-4333
Format:	acceptedVersion
Language:	eng
Published:	2023
Subjects:	Natural language processing Natural language interpretation Multi-modal understanding Action recognition Visual feedback Situated virtual agent Unsupervised learning
Online Access:	http://hdl.handle.net/11086/546748

_version_	1801212303134162944
author	Benotti, Luciana Lau, Tessa Villalba, Martín Federico
author2	https://orcid.org/0000-0001-7456-4333
author_facet	https://orcid.org/0000-0001-7456-4333 Benotti, Luciana Lau, Tessa Villalba, Martín Federico
author_sort	Benotti, Luciana
collection	Repositorio Digital Universitario
description	Final version published in: ACM Transactions on Interactive Intelligent Systems, Vol. 4, No. 3, Article 13, Publication date: July 2014.
format	acceptedVersion
id	rdu-unc.546748
institution	Universidad Nacional de Cordoba
language	eng
publishDate	2023
record_format	dspace
spelling	rdu-unc.546748 2023-03-22T16:12:15Z Interpreting natural language instructions using language, vision and behavior Benotti, Luciana Lau, Tessa Villalba, Martín Federico https://orcid.org/0000-0001-7456-4333 Natural language processing Natural language interpretation Multi-modal understanding Action recognition Visual feedback Situated virtual agent Unsupervised learning Final version published in: ACM Transactions on Interactive Intelligent Systems, Vol. 4, No. 3, Article 13, Publication date: July 2014. acceptedVersion
Affiliation: Benotti, Luciana. Universidad Nacional de Córdoba. Facultad de Matemática, Astronomía y Física; Argentina.
Affiliation: Benotti, Luciana. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina.
Affiliation: Lau, Tessa. Savioke Incorporation; United States of America.
Affiliation: Villalba, Martín Federico. University of Potsdam; Germany.
We define the problem of automatic instruction interpretation as follows. Given a natural language instruction, can we automatically predict what an instruction follower, such as a robot, should do in the environment to follow that instruction? Previous approaches to automatic instruction interpretation have required either extensive domain-dependent rule writing or extensive manually annotated corpora. This article presents a novel approach that leverages a large amount of unannotated, easy-to-collect data from humans interacting in a game-like environment. Our approach uses an automatic annotation phase based on artificial intelligence planning, for which two different annotation strategies are compared: one based on behavioral information and the other based on visibility information. The resulting annotations are used as training data for different automatic classifiers. This algorithm is based on the intuition that the problem of interpreting a situated instruction can be cast as a classification problem of choosing among the actions that are possible in the situation. Classification is done by combining language, vision, and behavior information. Our empirical analysis shows that machine learning classifiers achieve 77% accuracy on this task on available English corpora and 74% on similar German corpora. Finally, the inclusion of human feedback in the interpretation process is shown to boost performance to 92% for the English corpus and 90% for the German corpus.
http://dl.acm.org/citation.cfm?id=2629632 Computer Science 2023-03-22T15:00:04Z 2023-03-22T15:00:04Z 2010 article http://hdl.handle.net/11086/546748 eng https://doi.org/10.1145/2629632 Attribution-NonCommercial-ShareAlike 4.0 International https://creativecommons.org/licenses/by-nc-sa/4.0/
title	Interpreting natural language instructions using language, vision and behavior
topic	Natural language processing Natural language interpretation Multi-modal understanding Action recognition Visual feedback Situated virtual agent Unsupervised learning
url	http://hdl.handle.net/11086/546748

