Robotics

"Tell me Dave" robot learns simply by people talking to it

"Tell me Dave" robot learns simply by people talking to it
Cornell researchers have developed a robot that follows spoken instructions to learn new tasks

An instruction algorithm for the Tell me Dave robot (Image: Cornell University)

Many robots today are able to follow verbal instructions. However, the robot first has to be programmed with software that allows it to respond to those instructions in some predetermined way, and that software must be extended every time the robot's task list grows. Wouldn't it be easier if we could avoid all that messy fiddling with software and simply talk to a machine as we would to a human, explaining what we want it to do? Researchers at Cornell University thought so, which is why they designed and built a learning robot as part of their "Tell me Dave" project.

Based on Willow Garage's PR2 robot, the Tell me Dave robot follows on from previous research at Cornell that includes teaching robots to identify people's activities by observing their movements, identifying objects and situations in an environment and responding accordingly based on previous experience, and using visual and non-visual data to refine a robot's understanding of objects.

The Tell me Dave robot is equipped with a 3D camera and, using computer vision software previously developed in the Cornell computer science laboratory, has been trained to associate objects with what they are used for. For example, the robot examines its surroundings and identifies the things in them, such as a saucepan. It knows that a saucepan can have things poured into it or out of it, and that it can be placed on a stove to be heated. It also knows that a stove is used for heating and has controls that operate it. The robot can likewise identify other things in its environment, such as the water faucet and the microwave oven.
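
As a rough sketch of that kind of affordance knowledge, the idea can be captured as a mapping from detected objects to the operations they support. The object and affordance names below are illustrative assumptions, not Cornell's actual data or code.

# Hypothetical object-affordance table of the kind described above.
AFFORDANCES = {
    "saucepan":  {"pour_into", "pour_from", "place_on_heat"},
    "stove":     {"heat_source", "has_controls"},
    "bowl":      {"pour_into", "microwaveable"},
    "microwave": {"heat_source", "has_door", "has_controls"},
    "faucet":    {"water_source", "has_controls"},
}

def objects_supporting(affordance, detected_objects):
    """Return the detected objects that offer a given affordance."""
    return [obj for obj in detected_objects
            if affordance in AFFORDANCES.get(obj, set())]

# Example: which detected objects can serve as a heat source?
print(objects_supporting("heat_source", ["saucepan", "stove", "faucet"]))  # ['stove']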

As a result, if you ask the robot to "make me a bowl of noodles," it uses the knowledge it has gained from scanning its environment to assemble the routine it will need to make noodles from the objects at hand. That is, it will fill the saucepan with water from the faucet, put the saucepan on the stove top, and then proceed to cook the noodles. The clever thing is that the robot can do this even if you change the kitchen around by adding or removing utensils; it adapts its routines to the available equipment. So if you tell it to "boil water," it will use either the stove top and a saucepan or a bowl and a microwave oven, depending on the objects at hand.
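
A minimal sketch of that kind of adaptive planning, again with hypothetical object and step names rather than the team's real code, might check which combination of equipment is present and pick a routine accordingly:

# Hypothetical sketch: adapt a "boil water" routine to the utensils at hand.
def plan_boil_water(detected_objects):
    objs = set(detected_objects)
    if {"faucet", "saucepan", "stove"} <= objs:
        return ["fill saucepan from faucet",
                "place saucepan on stove",
                "turn on stove",
                "wait until boiling"]
    if {"faucet", "bowl", "microwave"} <= objs:
        return ["fill bowl from faucet",
                "place bowl in microwave",
                "start microwave",
                "wait until boiling"]
    return []  # no workable combination of equipment found

print(plan_boil_water(["faucet", "bowl", "microwave"]))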

To achieve these capabilities, Ashutosh Saxena, assistant professor of computer science at Cornell University, is training robots to understand directions given in naturally spoken language by a variety of speakers. But human language can be vague, and instructors often leave out important details. Saxena and his colleagues are therefore helping robots account for missing information and adapt to the environment at hand, using an algorithm that translates ordinary spoken instructions, picks out the key words that correspond to objects the robot has detected in its environment, and then compares them with inputs previously learned in a virtual environment.

This is all then used to assemble a set of instructions that fits the environment, the objects in it, and the processes required to use each of those objects to complete the entire task. The robot still doesn't get it right all the time, however. The fuzzy logic it applies to match the environment with the available objects and the stored instructions means it is correct about 64 percent of the time. That figure includes trials in which the commands were changed or the environment was altered, and even then the robot was able to fill in many of the missing steps three to four times better than with earlier methods the researchers had tried.
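
To make the matching idea concrete, here is a deliberately simplified sketch of how stored action sequences might be scored against a spoken command and the objects the robot has detected. The scoring rule, the library entries, and all names are assumptions for illustration, not the published algorithm.

# Hypothetical sketch: pick the stored action sequence that best fits a command
# and the current environment.
def score(candidate, keywords, detected_objects):
    """Reward keyword overlap, penalize required objects that are missing."""
    keyword_hits = len(keywords & candidate["keywords"])
    missing_objects = len(candidate["objects"] - set(detected_objects))
    return keyword_hits - missing_objects

def ground_instruction(command, library, detected_objects):
    keywords = set(command.lower().split())
    best = max(library, key=lambda c: score(c, keywords, detected_objects))
    return best["steps"]

# Toy stand-in for the crowd-sourced library of learned sequences.
library = [
    {"keywords": {"boil", "water"}, "objects": {"saucepan", "stove", "faucet"},
     "steps": ["fill saucepan", "heat on stove"]},
    {"keywords": {"boil", "water"}, "objects": {"bowl", "microwave", "faucet"},
     "steps": ["fill bowl", "heat in microwave"]},
]

print(ground_instruction("Boil water", library, ["bowl", "microwave", "faucet"]))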

As part of the continuing growth of the knowledge available to the Tell Me Dave robots in a virtual environment, people are invited to teach a simulated robot to perform a kitchen task at the Cornell Computer Science website, where their inputs will form part of a crowd-sourced library of instructions for this and other robots. The researchers hope that the library will eventually contain millions of examples for the robots to draw on.

"With crowd-sourcing at such a scale, robots will learn at a much faster rate," Saxena said.

Saxena and graduate students Dipendra K. Misra and Jaeyong Sung will outline their work at the Robotics: Science and Systems conference at the University of California, Berkeley, July 12-16.

The video below shows the Cornell University robot in action making an affogato for its human instructor.

Source: Cornell University

Tell Me Dave: making affogato

2 comments
Bob Flint
How long did this preparation actually take?
A quick calculation based on 30x average speed during the 33-second runtime gives 990 seconds, or 16.5 minutes. In today's instant-gratification society, having the robot prepare an ice cream sundae is not practical, since by that point the ice cream has already melted.
Have it clean up instead; that is not really time sensitive, and it's probably high on the list of menial chores.
warren52nz
You have to crawl before you walk and walk before you run. I think this is a really good step forward. We shouldn't be expecting someone to pop out with C-3PO.