From Speech Recognition to Full-Blown Multithreaded Conversation
Published: Jun 20, 2007
Anticipating what your interlocutor will say helps a dialogue run smoothly. Faster recognition of a conversation’s context – from the phrases, speech and grammar – is something language technologists have been striving to achieve since the early days of man-machine speech recognition. Now, European and US researchers have built a dialogue management system that promises to make conversing with our machine friends at work, in cars and at home that much more realistic.
Researchers from Edinburgh University (UK) and Stanford University (US) have designed and built a dialogue management system that, among other things, promises to make verbal communication between people and computers more realistic and responsive by alerting the machine to the type of phrase the operator is likely to say next.
The system has applications in a variety of speech recognition exchanges, the researchers claim, including robots, telephone-based information systems, interactive gaming, in-car voice recognition systems, as well as speech interfaces for personal computers.
It could also be used in computer interfaces for the visually impaired, a major field of research in the EU’s Information Society Technologies (IST) program. Examples of projects include PLAY and PLAY2, which are developing software and hardware for blind or visually impaired musicians, and VISUAL, which is using voice-based technology to improve the access of visually impaired people to the information society.
The system goes well beyond current so-called ‘slot-filling dialogue’ systems used, for example, by airline ticketing systems. It can track multiple conversation threads – ones that switch back and forth between topics – without having to be programmed each time.
It can regulate certain topics, say the researchers, and use this information to improve the rate at which it recognizes speech. It also permits the user to initiate, extend and correct dialogue threads at any time. It does this by tracking different turns of phrase and sentence types, including closed ‘yes’ or ‘no’ answers; more open ‘who, what, where’ answers; and corrections, such as “not the tree”.
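To make the idea of switching between threads concrete, here is a minimal sketch in Python. It is not the researchers' implementation: the class names, the methods and the choice to pass the topic in by hand are all assumptions made for illustration, whereas a real system would have to infer the topic from the utterance itself.

```python
from dataclasses import dataclass, field


@dataclass
class Thread:
    """One conversation topic and the utterances that belong to it."""
    topic: str
    turns: list = field(default_factory=list)


class DialogueTracker:
    """Toy tracker (hypothetical): keeps several threads open at once."""

    def __init__(self):
        self.threads = {}   # topic -> Thread
        self.active = None  # topic of the thread currently in focus

    def hear(self, topic, utterance):
        # Initiate a new thread or resume an existing one, then extend it.
        thread = self.threads.setdefault(topic, Thread(topic))
        thread.turns.append(utterance)
        self.active = topic

    def correct(self, utterance):
        # A correction replaces the last turn of the active thread.
        if self.active is None:
            raise ValueError("nothing to correct yet")
        self.threads[self.active].turns[-1] = utterance


tracker = DialogueTracker()
tracker.hear("route", "fly to the tree")
tracker.hear("music", "play some jazz")   # switch to a second topic
tracker.hear("route", "then turn left")   # switch back to the first
tracker.correct("then turn right")        # fix the last route instruction
```

Each topic keeps its own history, so returning to the route after the music request does not lose the earlier instruction, and the correction only touches the thread that is in focus.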
“Tracking the context of a conversation simplifies the speech recognition task by limiting the range of words the system must attempt to recognize. Existing dialogue management systems constrain speech recognition choices by limiting them to a certain topic, like city names,” according to a report on this new system in Technology Research News (TRN).
The Scottish-US team’s method, by contrast, works with almost any application because it bases recognition on the type of utterance instead of a specific topic, the researchers explain. Their system can anticipate where the conversation might be going and uses this knowledge to fine-tune itself and improve recognition accuracy significantly.
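As a rough illustration of recognition guided by utterance type, the Python sketch below guesses which kinds of reply may follow a system prompt and gives a small bonus to recognizer hypotheses that fit. Everything in it is an assumption made for the example: the patterns, the word lists, the bonus value and the scores are invented, and the article does not describe how the actual system ranks its candidates.

```python
import re

# Hypothetical patterns for three utterance types named in the article.
PATTERNS = {
    "yes_no": re.compile(r"^(yes|no|yeah|nope)\b"),
    "wh_answer": re.compile(r"^(the|a|an|at|in|to)\b"),
    "correction": re.compile(r"^not\b"),
}


def expected_types(system_prompt):
    """Guess which utterance types may follow a system prompt (toy rules)."""
    words = system_prompt.lower().split()
    first = words[0] if words else ""
    if first in {"who", "what", "where", "when", "which"}:
        kinds = {"wh_answer"}
    elif first in {"is", "are", "do", "does", "did", "shall", "should", "can"}:
        kinds = {"yes_no"}
    else:
        kinds = set(PATTERNS)
    # Assume the user may correct the system at any time.
    return kinds | {"correction"}


def rescore(hypotheses, system_prompt, bonus=0.2):
    """Return the best transcript after boosting expected utterance types."""
    kinds = expected_types(system_prompt)
    rescored = []
    for text, score in hypotheses:
        fits = any(PATTERNS[k].match(text.lower()) for k in kinds)
        rescored.append((text, score + bonus if fits else score))
    return max(rescored, key=lambda pair: pair[1])[0]


# Made-up recognizer output: (transcript, acoustic score).
n_best = [("know the tree", 0.50), ("no", 0.45), ("not the tree", 0.40)]
best = rescore(n_best, "Shall I fly to the tree?")
print(best)  # no
```

On acoustic score alone the nonsensical "know the tree" would win. Because the prompt is a closed question, the sketch expects a yes/no answer or a correction, so "no" overtakes it without the system needing any knowledge of the topic.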
According to press reports, the team is working to boost their system with better dialogue-modelling software and ways to parse text. The current multithreaded dialogue management system could be found in applications within two years, the team predicts.
Source: European Union Research

