Stanford CS 224N - Semantic Parsing for Robot Commands


Semantic Parsing for Robot Commands
Jeremy Hoffman
Justin Driemeyer
CS 224N Final Project
June 10, 2006

1. Introduction

The STAIR (STanford AI Robot) project aims to build a broadly competent "office assistant" robot that includes a spoken dialogue interface. The robot has limited capabilities, so it is only intended to handle a small set of utterance types, such as introductions, simple inquiries, and requests to fetch objects or deliver messages. Still, language understanding (extracting the semantic content of a human utterance) is difficult because people use a variety of words and sentence forms to express their intentions, and disfluencies or automated speech recognition (ASR) errors may occur.

To address the language understanding problem for this task, we built a PCFG-based semantic parser for transcriptions of utterances. Given a string of text from the ASR component, our parser extracts the appropriate semantic values or indicates that it could not parse the utterance. A probabilistic approach is ideal for this task, since confidence margins from the speech recognition and language understanding components could be combined, influencing the robot's decisions appropriately. Our semantic parser is trained from a treebank of command utterances tagged with our hand-crafted semantic labels. Because no such treebank currently exists, we used automated methods to construct an artificial treebank so that we could test our semantic parser. In the constrained time frame of this project, we limited our attention to the most essential type of command: commands for the robot to move.

In this paper, we describe our methodology in designing, training, and testing our semantic parser, and discuss our promising test results. Finally, we suggest future improvements to the system.

2. Methodology

2.1 Frame and Slots Model

We used a standard "frame and slots" model to represent the semantics of utterances. We had three frames: "Macro Moves" (moves from one room or place in the building to another, requiring planning and a sequence of movements); "Micro Moves" (moves of short, simple distances); and "Control" (halting or restoring movement). The frames and their slots are shown in Figure 1.

Frame       Slots
MacroMove   Destination: holds either a room number or a named place (e.g., "front lobby") that can be converted to a room number from a building map
MicroMove   Direction: e.g., forward, back, left, right, around
            Distance: holds a number and a unit of distance (e.g., feet)
Control     CommandWord: e.g., stop, wait, cancel, go, ok, continue

Figure 1: Frames and Slots Used by the Semantic Parser

2.2 Semantic Grammar

To extract the semantic contents of an utterance, we adapted our PCFG parser from CS224N Assignment 3. Our new parser still builds parse trees for sentences in the style of the Penn Treebank; however, all the labels used to annotate the sentences represent semantic functions rather than syntactic functions. To accomplish this, we trained the PCFG parser on a corpus/treebank of semantically annotated robot-command utterances. Just as the rules for Treebank-style annotation are explicitly defined so that human annotators have a model to follow (Bies et al. 1995), we defined our semantic labels and production rules explicitly in a handwritten CFG with 32 nonterminals/labels and about 260 production rules (including preterminals and lexical productions). (As discussed below, we later adapted this to a PCFG for building random annotated sentences for the artificial corpus.)
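The full 260-rule grammar is not reproduced in this excerpt; the fragment below is an invented illustration of the general style of such semantic productions, built from the frame and slot names in Figure 1 (the actual nonterminals, rules, and vocabulary of the handwritten grammar may differ):

    Command     -> MacroMove | MicroMove | Control
    MicroMove   -> MoveVerb Direction | MoveVerb Direction Distance
    Direction   -> "forward" | "back" | "left" | "right" | "around"
    Distance    -> Number Units
    Units       -> "feet"
    Control     -> CommandWord
    CommandWord -> "stop" | "wait" | "continue"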
Naturally, the frames and slots in Figure 1 appear as nonterminals in this semantic grammar. In order to recognize complete sentences and possibly succeed in the face of disfluencies or recognizer errors, the semantic grammar also includes constructions for functional words (e.g., "go to the"; various ways of forming numbers), politeness and robot-addressing words (e.g., "robot could you please"), and meaningless words (e.g., "uh"; any word not known to the system).

2.3 Filling Slots From the Parse Tree

To generate the correct command sequence, we pass the predicted parse tree into a command extraction function, which traces the tree via a depth-first search and looks for any command in its list of valid commands. Once it has found a command, it gives default values to each parameter (because, for example, someone who says "Move!" probably doesn't want the robot to sit still just because they didn't specify how far to go or in what units). The extractor then looks at the children of that command node to find out whether any values were given to overwrite its defaults. The defaults for MacroMove and Command are to not move, and the default for MicroMove is to go 5 feet forward (to handle the "Move!" example).

How this is handled differs for each command; for example, the MacroMove command must decide whether it was told to go to a specific room or to a named place (e.g., "Christopher's office"). If it was given a named place as a destination, it looks through a known list of places to get a floor and room location for that place. On the real robot, a search that brings back an unknown location could indicate that the robot should prompt the user, e.g., "I'm sorry, I don't know where Christopher's office is. What is the room number?", and the answer can then be added to its list of known places. Either way, it then has a room number to which it can go. Of course, a room number given directly will come in as a string sequence, e.g., "two hundred twelve," but others have studied robust string-to-integer translation, so we focused on the semantic grammar aspect instead and left these numbers in string form.

In this manner, the varied tags which result from parsing the sentence are resolved into a standard format with all slots needed to execute the command filled in. Thus, MacroMove will have a filled Destination (i.e., RoomNumber) tag; MicroMove will have filled Direction, Distance, and Units tags; and Command will have a filled CommandWord tag. Currently, our parser leaves this output in a Tree<String> form for ease of display and testing, with a Next tag as the only optional tag of a command, used to handle sequences of commands in a sentence. From here, translating to a form more readily consumed by the robot would be trivial. (A small sketch of this extraction procedure appears below.)

2.4 Artificial Corpus

Accurately training a PCFG requires a sufficiently large treebank. Unfortunately, no such semantically annotated robot-command treebank exists.
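As noted in Section 2.2, the handwritten grammar was adapted to a PCFG for generating random annotated sentences for this corpus. The following is a minimal sketch, not the project's actual code, of how such top-down sampling could work; the class name, grammar encoding, and toy rules are invented for illustration, and a real corpus builder would also record the annotation tree produced during expansion.

    import java.util.AbstractMap;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Random;

    // Hypothetical sketch: expand a start symbol top-down, choosing each
    // production according to its probability, to yield one random sentence.
    public class PcfgSampler {

        // Each nonterminal maps to a list of (probability, right-hand side) pairs.
        private final Map<String, List<Map.Entry<Double, String[]>>> rules = new HashMap<>();
        private final Random rng = new Random();

        public void addRule(String lhs, double prob, String... rhs) {
            rules.computeIfAbsent(lhs, k -> new ArrayList<>())
                 .add(new AbstractMap.SimpleEntry<>(prob, rhs));
        }

        // Symbols with no rules are treated as terminal words.
        public List<String> generate(String symbol) {
            List<Map.Entry<Double, String[]>> options = rules.get(symbol);
            if (options == null) {
                return Collections.singletonList(symbol);
            }
            double r = rng.nextDouble();
            double cumulative = 0.0;
            String[] chosen = options.get(options.size() - 1).getValue();
            for (Map.Entry<Double, String[]> option : options) {
                cumulative += option.getKey();
                if (r <= cumulative) {
                    chosen = option.getValue();
                    break;
                }
            }
            List<String> words = new ArrayList<>();
            for (String child : chosen) {
                words.addAll(generate(child));
            }
            return words;
        }

        public static void main(String[] args) {
            PcfgSampler g = new PcfgSampler();
            // Toy rules only; the real grammar has 32 nonterminals and ~260 rules.
            g.addRule("MicroMove", 1.0, "MoveVerb", "Direction");
            g.addRule("MoveVerb", 0.5, "go");
            g.addRule("MoveVerb", 0.5, "move");
            g.addRule("Direction", 0.5, "forward");
            g.addRule("Direction", 0.5, "left");
            System.out.println(String.join(" ", g.generate("MicroMove")));
        }
    }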
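The slot-filling step of Section 2.3 can likewise be sketched in code. The fragment below is only an illustration, not the project's extractor: the Node class is a stand-in for the parser's Tree<String>, the tag names follow Figure 1, and the defaults are the 5-feet-forward MicroMove defaults described above.

    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical minimal tree type standing in for the parser's Tree<String>.
    class Node {
        final String label;
        final List<Node> children = new ArrayList<>();
        Node(String label) { this.label = label; }
    }

    // Sketch of the extraction idea: find a command node by depth-first search,
    // start from default slot values, then overwrite any slot whose tag appears
    // among the command node's descendants.
    public class MicroMoveExtractor {

        public static String extract(Node root) {
            Node command = findLabel(root, "MicroMove");
            if (command == null) {
                return null; // no MicroMove command in this parse
            }
            // Defaults, so that a bare "Move!" still yields a usable command.
            String direction = "forward";
            String distance = "5";
            String units = "feet";
            Node dir = findLabel(command, "Direction");
            if (dir != null) direction = leafText(dir);
            Node dist = findLabel(command, "Distance");
            if (dist != null) distance = leafText(dist);
            Node unit = findLabel(command, "Units");
            if (unit != null) units = leafText(unit);
            return "MicroMove(" + direction + ", " + distance + " " + units + ")";
        }

        // Depth-first search for the first node with the given label.
        private static Node findLabel(Node node, String label) {
            if (node.label.equals(label)) return node;
            for (Node child : node.children) {
                Node found = findLabel(child, label);
                if (found != null) return found;
            }
            return null;
        }

        // Concatenate the words under a node (numbers stay in string form,
        // as described in Section 2.3).
        private static String leafText(Node node) {
            if (node.children.isEmpty()) return node.label;
            StringBuilder sb = new StringBuilder();
            for (Node child : node.children) {
                if (sb.length() > 0) sb.append(' ');
                sb.append(leafText(child));
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            // "go left": a Direction child is present; Distance and Units default.
            Node root = new Node("MicroMove");
            root.children.add(new Node("go")); // move verb word, ignored by the extractor
            Node dir = new Node("Direction");
            dir.children.add(new Node("left"));
            root.children.add(dir);
            System.out.println(extract(root)); // prints: MicroMove(left, 5 feet)
        }
    }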

