Saturday, July 7, 2007

Voice Control

Processor- Dell Laptop
Voice recognition software- Dragon Naturally Speaking 9
Control Software- Visual Basic Program
Microphone- Logitech Webcam
Interface- Phidgets 0/16/16 Interface Kit


[Screenshot of Program at Bottom of Page]


The Dragon software scans my Visual Basic program for command buttons. My command buttons' captions are phrases like Go Forward or STOP. Whenever Dragon hears one of those captions spoken, it clicks that button automatically, so every button can be activated by voice or by mouse click (read my previous posts about the voice control). Each command button contains an instruction that tells the Phidgets 0/16/16 Interface Kit, which the robot is connected to, which outputs to switch on.
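The caption matching described above can be sketched roughly like this. It is written in Python rather than the Visual Basic of the actual program, and `ButtonPanel`, `normalize` and the handlers are made-up names for illustration only; Dragon's real matching is internal to that product and is not shown here.

```python
# Hypothetical stand-in for the caption/button mechanism.
# This is NOT the original Visual Basic code and NOT how Dragon works internally.

def normalize(phrase):
    """Lower-case and collapse whitespace so "GO  Forward" matches "Go Forward"."""
    return " ".join(phrase.lower().split())


class ButtonPanel:
    """Holds command buttons keyed by caption, like the buttons on a form."""

    def __init__(self):
        self._handlers = {}

    def add_button(self, caption, handler):
        self._handlers[normalize(caption)] = handler

    def click(self, caption):
        """Run the handler for a caption; the same path serves voice and mouse."""
        handler = self._handlers.get(normalize(caption))
        if handler is None:
            return False  # no button has this caption, so nothing happens
        handler()
        return True


log = []
panel = ButtonPanel()
panel.add_button("Go Forward", lambda: log.append("forward"))
panel.add_button("STOP", lambda: log.append("stop"))

panel.click("go forward")   # spoken phrase matches a caption
panel.click("stop")
panel.click("make coffee")  # no matching caption: ignored
print(log)  # ['forward', 'stop']
```

The point of the design is that voice adds no separate code path: a spoken caption ends up running exactly the same handler as a mouse click.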

The Phidgets Interface Kit is connected to the USB port and has 16 digital inputs and 16 digital outputs, with an LED indicator by each pin. The Visual Basic program sends a simple command over USB to turn on an output or to read an input. A sample command is "IFKit.OutputState(0) = True", which turns on output 0. Video feed from my Logitech Quickcam and some other minor fixes to the program were added later on.
The all-digital interface with 32 pins was purchased for 100 bucks.
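To make the idea of "each button sets some outputs" concrete, here is a small simulation in Python (not the original Visual Basic). `FakeInterfaceKit` is a made-up stand-in for the board, and the wiring in `BUTTON_OUTPUTS` (which outputs drive which motor direction) is purely an assumption, since the post does not say how the motors are connected.

```python
class FakeInterfaceKit:
    """Simulated board with 16 digital outputs; not the real Phidgets API."""

    NUM_OUTPUTS = 16

    def __init__(self):
        self.output_state = [False] * self.NUM_OUTPUTS

    def set_output(self, index, state):
        # Plays the role of IFKit.OutputState(index) = state in the VB program.
        if not 0 <= index < self.NUM_OUTPUTS:
            raise IndexError("output index must be between 0 and 15")
        self.output_state[index] = bool(state)


# ASSUMED wiring: outputs 0/1 = left motor forward/reverse,
# outputs 2/3 = right motor forward/reverse.
BUTTON_OUTPUTS = {
    "Go Forward": {0: True, 1: False, 2: True, 3: False},
    "Go Backward": {0: False, 1: True, 2: False, 3: True},
    "STOP": {0: False, 1: False, 2: False, 3: False},
}


def press(kit, caption):
    """Apply the output pattern stored for one command button."""
    for index, state in BUTTON_OUTPUTS[caption].items():
        kit.set_output(index, state)


kit = FakeInterfaceKit()
press(kit, "Go Forward")
print(kit.output_state[:4])  # [True, False, True, False]
press(kit, "STOP")
print(kit.output_state[:4])  # [False, False, False, False]
```

Keeping the patterns in one table means a new command button only needs a new entry, not new control code.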

AT&T Text to Speech was used to generate confirmation messages. If I say "Go Forward", the Dragon program clicks the Go Forward button and a sound is played (a voice saying "Going Forward"). Each command given (besides Cruise Control) lasts for a certain number of seconds before all motors automatically receive a STOP command from the computer.
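The timed auto-stop could look something like the sketch below, again in Python rather than the original Visual Basic. The event list stands in for the sound clips and motor outputs. The 3 seconds for Go Forward comes from the sample dialogue in this post; the Go Backward duration and the Cruise Control confirmation wording are assumptions.

```python
import time

# "Going Forward" and "I have stopped" are quoted in the post; the rest is assumed.
CONFIRMATIONS = {
    "Go Forward": "Going Forward",
    "Go Backward": "Going Backward",       # assumed wording
    "Cruise Control": "Cruise Control on", # assumed wording
}
DURATION_SECONDS = {"Go Forward": 3, "Go Backward": 3}  # Go Backward is assumed
UNTIMED = {"Cruise Control"}  # keeps running until an explicit STOP


def run_command(name, events, sleep=time.sleep):
    """Confirm the command, start the motors, and auto-stop timed commands."""
    events.append(("say", CONFIRMATIONS[name]))
    events.append(("motors", name))
    if name in UNTIMED:
        return  # no automatic STOP for Cruise Control
    sleep(DURATION_SECONDS[name])
    events.append(("motors", "STOP"))
    events.append(("say", "I have stopped"))


events, waits = [], []
# waits.append replaces time.sleep here so the demo finishes instantly.
run_command("Go Forward", events, sleep=waits.append)
print(waits)   # [3]
print(events)
```

Passing the sleep function in as a parameter is only there to make the sketch easy to try out; with the default `time.sleep` the call really does block for the full duration.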

To start the voice command program, the user must first open Dragon Naturally Speaking in the background. Then the user says "Listen." The voice recognition software then clicks the "Listen" button found on the page. This links to the Start Command page, where an AI voice says "I am listening". On that page there is only one button, whose caption reads "Chives." So far we have only told Chives to start the voice command program, so now we have to tell him to actually listen for commands. So the user says "Chives", and the big "Chives" button onscreen is clicked. This leads us to the voice command list. The user now says a voice command and that command button is clicked. Clicking that button sends an instruction to the Phidgets, which in turn switches the corresponding outputs. After each command, excluding movement commands, the page automatically links back to the Chives page. Also, an AI voice clip is played for each command as confirmation.

Here is a sample dialogue:

User: "Listen"

Robot: "I am listening"

User: "Chives"

Robot: "Yes Master"

User: "Go Forward"

Robot: "Going Forward"

(Both motors are turning clockwise for 3 seconds)

Robot: (After 3 seconds) "I have stopped"

***Commands must be given in that order. However, after the first command the user is only sent back to the Chives page, not to the Listen page, so the next command would be "Chives, go backward". Requiring the name first keeps the robot from reacting if anybody happens to say "go backward" in normal conversation.
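The page flow (Listen page, then Chives page, then command list, then back to the Chives page) behaves like a small state machine. Here is a rough Python sketch of it, not the original Visual Basic. `VoiceMenu` and the page names are invented for illustration, "Going Backward" is an assumed reply, and for simplicity every command returns straight to the Chives page.

```python
class VoiceMenu:
    """Tracks which page is showing and which spoken phrase it accepts."""

    COMMAND_REPLIES = {
        "go forward": "Going Forward",
        "go backward": "Going Backward",  # assumed wording
    }

    def __init__(self):
        self.page = "listen"  # start page with only the Listen button

    def hear(self, phrase):
        """Return the spoken reply, or None if no button on this page matches."""
        words = " ".join(phrase.lower().split())
        if self.page == "listen" and words == "listen":
            self.page = "chives"
            return "I am listening"
        if self.page == "chives" and words == "chives":
            self.page = "commands"
            return "Yes Master"
        if self.page == "commands" and words in self.COMMAND_REPLIES:
            self.page = "chives"  # simplification: always back to the Chives page
            return self.COMMAND_REPLIES[words]
        return None  # phrase ignored on this page


menu = VoiceMenu()
heard = ["Go Forward", "Listen", "Chives", "Go Forward",
         "Go Backward", "Chives", "Go Backward"]
replies = [menu.hear(p) for p in heard]
print(replies)
# [None, 'I am listening', 'Yes Master', 'Going Forward',
#  None, 'Yes Master', 'Going Backward']
```

The two `None` entries are the interesting part: a command spoken without "Chives" first is ignored, which is what protects the robot from normal conversation.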


2 comments:

Suren said...

Hi Eric,
I got this link from SoR..
your project is really nice..
I have posted a reply on your post in SoR..check it out..
Even have used a parallel port to drive 4 motors and take input from a IR sensor

shimniok said...

Very cool. Nevermind my question in my previous comment. :)