echo: electronics
to: JAY EMRIE
from: MIKE ROSS
date: 2002-12-13 15:03:22
subject: TALKING BOOKS

"JAY EMRIE" wrote to "MIKE ROSS" (13 Dec 02  09:32:00)
 --- on the topic of "TALKING BOOKS"

 MR> OTOH speech to text is actually quite easy

 JE> I wonder why, if it is so easy, is it so hard for voice
 JE> recognition programs to function reasonably accurately???

Oh, rats! Yet another case of forgetting to proofread what goes out of the
mail box. Thanks for pointing it out! Indeed, it should have read "text to
speech" and not the other way around.

Mind you, I am definitely impressed with the speech recognition program
in the phone system at the airport. It has never goofed in all the times
I've had to use it. Then again, it is usually just simple numbers, so I
suppose the program already knows what to expect.

My guess as to why speech recognition is so difficult is basically that
there is a lot of information in the audio signal. Any number of quirks
in a voice can make the same word sound quite different to the program
from one individual to the next. The program has to allow for pitch,
tonal quality, speed, inflection, fricatives, and that's just a start.
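
To put a rough number on the pitch problem alone, here is a minimal
sketch. It assumes two pure sine tones as a very crude stand-in for two
voices (real speech is far messier), and the names tone() and
similarity() are made up for this illustration.

```python
import math

def tone(freq_hz, n_samples=800, rate=8000):
    """Return n_samples of a sine wave at freq_hz, sampled at rate Hz."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n_samples)]

def similarity(a, b):
    """Normalized sample-by-sample correlation; 1.0 means identical shape."""
    n = min(len(a), len(b))
    dot = sum(a[i] * b[i] for i in range(n))
    norm_a = math.sqrt(sum(x * x for x in a[:n]))
    norm_b = math.sqrt(sum(x * x for x in b[:n]))
    return dot / (norm_a * norm_b)

# Hypothetical "speakers": the same steady sound at two different pitches.
speaker_a = tone(120.0)
speaker_b = tone(210.0)

same = similarity(speaker_a, speaker_a)   # a waveform against itself
diff = similarity(speaker_a, speaker_b)   # same "sound", different pitch

print(f"same pitch: {same:.3f}, different pitch: {diff:.3f}")
```

The first score comes out at essentially 1 and the second near zero,
even though only the pitch changed. That is before speed, inflection or
tonal quality enter the picture.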

One can't simply match one waveform against another; rather, a lot of
computation is done on-the-fly to extract the information sought from a
forest of waveform information. Text to speech, in contrast, is almost
child's play since the information is already known and the data is
simply output from pre-existing lookup tables.
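
The lookup-table idea can be sketched in a few lines. This is only an
illustration under stated assumptions: the PHONEMES table is a tiny
hand-typed dictionary using ARPAbet-style symbols, to_phonemes() is a
made-up helper, and a real synthesizer would go on to map each symbol
to a stored sound and smooth the joins.

```python
# Hypothetical two-word pronunciation table (ARPAbet-style symbols).
PHONEMES = {
    "talking": ["T", "AO", "K", "IH", "NG"],
    "books": ["B", "UH", "K", "S"],
}

def to_phonemes(text):
    """Look up each word of text; unknown words become a '?' placeholder."""
    result = []
    for word in text.lower().split():
        result.extend(PHONEMES.get(word, ["?"]))
    return result

print(to_phonemes("Talking Books"))
```

No analysis of any signal is needed here. Every step is a table lookup
on text the program already has.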

 Mike
 ****

... Back when I was a boy, we carved our own ICs out of wood.
--- Blue Wave/DOS v2.30
* Origin: Juxtaposition BBS, Telnet:juxtaposition.dynip.com (1:167/133)

SOURCE: echomail via fidonet.ozzmosis.com
