Software IT

Are You Talking to Your PC Yet? 333

An anonymous reader writes "If you have ever asked 'Do those speech-to-text apps like Dragon NaturallySpeaking and IBM ViaVoice really work?', Pocket PC Addict has posted a detailed review of Dragon NaturallySpeaking for Pocket PC and desktop machines. It is written from the perspective of someone who has been burned by speech-to-text software in the past and had vowed never to try one of these apps again. It is encouraging for slow typists who would like to use their voice to write. Plus it details some valuable tips for using it with Pocket PCs."
This discussion has been archived. No new comments can be posted.

  • Clippy (Score:5, Funny)

    by Anonymous Coward on Thursday December 09, 2004 @02:03PM (#11043904)
    So if I ask Clippy to STFU, will he?
  • I tried with my Mac (Score:3, Interesting)

    by the_2nd_coming ( 444906 ) on Thursday December 09, 2004 @02:05PM (#11043923) Homepage
    OS X has really good system-wide integration for voice commands, and the voice interpreter is pretty good for one that comes with the OS, but I could not get it to work consistently....

    Other than that, I thought it was cool to say "computer give me brad's number" and it would display my buddy Brad's phone number on the screen :-) (when it worked)
    • I just remember that while I had it on, someone behind me dropped some books and went "crap", and it recognized it as "close application". Another trick is to adjust the sensitivity of the microphone to the right level: if it goes all the way up to the top, your voice starts clipping and it doesn't work well.
    • by pdiaz ( 262591 )
      I don't know the OS X voice command system, but I'm guessing that it works on a reduced vocabulary (i.e.: "close", "open", "mail", and such). That's the key there: with reduced vocabularies and a strict syntax, speech recognition works pretty well. The challenge is to make it work with a natural language such as English (and in real time)

      Said in another way: when you issue a voice command to OS X, it has to choose between 40-100 alternatives (this is guesswork). (True) speech recognizers work with +60K vocabula

  • by farsideofthemoon ( 766786 ) on Thursday December 09, 2004 @02:05PM (#11043924) Homepage
    Any recommended ones? http://sourceforge.net/projects/cmusphinx/
  • by Ingolfke ( 515826 ) on Thursday December 09, 2004 @02:05PM (#11043929) Journal
    All though Text two speach is a grape gnu technology it is not red E for the main stream yet.
  • by nherc ( 530930 )
    Two comments and the link is dead! lol
  • My Accent screws up everything. I hate my Accent.
  • My problem (Score:5, Funny)

    by teiresias ( 101481 ) on Thursday December 09, 2004 @02:06PM (#11043940)
    The problem I always found with uuhhhhh voice writing was mmmmm filtering out unwanted noises and shhhhh distractions from my posts period return But I uhh guess they've fixed most of those burp problems by now right question mark
  • by The G ( 7787 ) on Thursday December 09, 2004 @02:07PM (#11043948)
    Is anyone out there giving any thought to how a programming language should be structured to make it easy to code using a speech recognition engine?

    If not, why not?
  • by mistersooreams ( 811324 ) on Thursday December 09, 2004 @02:07PM (#11043949) Homepage
    It walks just fin four mee!
  • In Japan, talking PCs are for schoolgirls.
  • by glsunder ( 241984 ) on Thursday December 09, 2004 @02:07PM (#11043953)
    I've been talking to my PC for years:

    You god damned son of a bitch! F'n Piece of shit!
  • Fun with Macros (Score:3, Insightful)

    by BlueCup ( 753410 ) on Thursday December 09, 2004 @02:07PM (#11043954) Homepage Journal
    My brother and I work at a company making efficiency programs... for a while we toyed with the idea of having all of the programs activated by voice... we tested it out for a while with an open source cantation originally used for games, that would execute a command, or type text based on what you said... for a while, it was awesome, every time we said something, it'd find the word from our list, and activate the program... problem was, when it listened to your voice, it only compared it to the words you had programs assigned to... so if you had four words, no problem, but if you had 60, it started choosing horribly... we eventually had to scrap the program altogether... though it was funny watching what programs it would have to run through when I started cursing in frustration... I'm pretty sure the annoyance of people talking to their computers all over the building would have caused problems as well.
    • My dad got one of those really primitive speech recognition software programs like 10 years ago, one of the ones that only recognizes volume changes. It worked well for about 20 minutes, and then I coughed into the mic. It opened the calculator, we both laughed about it, and stopped using the software.
    • though it was funny watching what programs it would have to run through when I started cursing in frustration

      Years ago Apple implemented voice recognition built into OS9. The peak for me was when I managed to get it to connect to the internet, download my e-mail, and read the messages to me, without having to get out of bed. Unfortunately, the system suffered from just the problem you are mentioning. If it did not understand what you said it would assume you meant "open netscape." Do you know how long it

    • Re:Fun with Macros (Score:3, Interesting)

      by drinkypoo ( 153816 )
      The Macintosh has had this since the Quadra, and it has just the same problem. You drop aliases (or whatever) into your Speakable Items folder and it matches them to your speech. Problem is, when you get fifteen or twenty things in there (maybe more on a Power Mac, but I'm talking the Quadra experience here) it frequently would match things that made no sense whatsoever.
    • One question: why not have your software expect the vocal equivalent of a command prompt, like saying "Computer!" ala Star Trek?

      Seems to me that would eliminate a lot of the trouble.
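A rough sketch of the two ideas in this thread: require a wake word before listening, and reject anything that is not a close match instead of falling back to the nearest command. This is only an illustration under stated assumptions. It compares text strings with Python's `difflib` as a stand-in for acoustic scoring, and the wake word, the command table, and the `cutoff` value are all hypothetical.

```python
import difflib

WAKE_WORD = "computer"  # hypothetical trigger, per the "Computer!" suggestion
COMMANDS = {            # hypothetical phrase -> action table
    "open mail": "launch_mail",
    "open browser": "launch_browser",
    "close application": "close_app",
}


def match_command(heard, commands=COMMANDS, wake_word=WAKE_WORD, cutoff=0.8):
    """Return the action name for `heard`, or None if it should be ignored."""
    words = heard.lower().split()
    # Ignore anything that is not addressed to the machine.
    if not words or words[0].strip("!,.") != wake_word:
        return None
    phrase = " ".join(words[1:])
    # Accept only a close match; a poor match is rejected rather than guessed.
    best = difflib.get_close_matches(phrase, list(commands), n=1, cutoff=cutoff)
    return commands[best[0]] if best else None
```

With this shape, a stray "crap" from across the room is dropped because it lacks the wake word, and "computer crap" is dropped because nothing in the table scores above the cutoff. Raising `cutoff` trades missed commands for fewer wrong launches.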
  • by narsiman ( 67024 ) on Thursday December 09, 2004 @02:07PM (#11043963)
    Do not run webservers on PocketPCs even if you are an addict
  • ... for attempting to dictate message board posts for humorous effect. Gave me many hours of amusement. Plus I got a free mic which I now use with Skype :)
  • I talk to my PC all the time...if you consider swearing at it and yelling profanity at it talking to it.
  • And mine works just fine. Submit. Submit. I said submit. Why isn't this expletivedeleted thing triggering the submit button. Submit! Submit! Damn it, I have to move the mouse.
  • My computer would be asking me to repeat anything I tried to communicate by yelling over its own leafblower noise levels.
  • I recall back in the early 80s I was in a Singer shop (as in sewing machines) and they sold IBM PCs as well... including speech-to-text recognition software.

    I tried it out, and surprise! it didn't work very well.

    I see nothing has changed.
  • ...until I noticed that the PocketPC version is just a delayed dictation device - it records, then you transfer it to your desktop computer and it's the host computer that actually does all the speech recognition.

    No wireless. Less space than a Nomad. Lame.
  • I tried Dragon several years ago. It worked, but you really need accuracy to the nines (99.99%) to be productive with it. One mistake in 100 syllables means constant corrections. I did make a little flying demo that took English commands (right, left, up, down, slower, faster) and it was cool to control it via voice commands. There was no distinction between typing commands and speaking them though. I would recommend (if they don't have it already) the Gnome and KDE folks provide a separate input stream for
    • With KDE it's already a possibility with DCOP and all. Every KDE app's function is accessible by it. Dunno about GNOME though, I never use that :)
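The gap between "one mistake in 100" (99%) and "the nines" (99.99%) is easy to put numbers on. A minimal sketch, assuming errors are independent and equally likely for every unit (syllable or word), which real recognizers do not guarantee; the function names are made up for illustration.

```python
def expected_corrections(units, accuracy):
    """Expected number of misrecognised units, assuming independent errors."""
    return units * (1.0 - accuracy)


def chance_clean(units, accuracy):
    """Probability that a run of `units` contains no error at all."""
    return accuracy ** units
```

Under those assumptions, a 500-word page at 99% accuracy needs about 5 corrections, while at 99.99% it needs about 0.05, that is, roughly one correction every 20 pages. `chance_clean(20, 0.99)` gives the odds that a 20-word sentence comes out untouched.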
  • No...not yet. It's too early, and I need some time to myself right now.

    It was her fault though...crapping out on me like that when I was just past Level 5 in digdug.exe.

    And just when I was going to get her a shiny new Windows 3.11 for Christmas too. It sure is a pity. It'll be a while before I'm ready for another relationship.

  • by Rude Turnip ( 49495 ) <.valuation. .at. .gmail.com.> on Thursday December 09, 2004 @02:15PM (#11044099)
    1. It's awkward to talk when you're trying to compose something that requires a lot of thought first. I usually like to talk to myself (either out-loud or in my head) and type out what I'm thinking in a more formal fashion.

    2. It is very tedious to go back and edit or make corrections. If I make an error while typing, I'm cognizant of the error very soon after it happens. With voice recognition, technically "someone else" is typing and it takes more time to see where the mistakes were made.

    3. I deal with lots of boilerplate text with original content intermingled. A lot of times working on such a text becomes an editing process where using the keyboard & mouse is more efficient.

    4. My voice doesn't last for much longer than 30 minutes for non-stop speaking...and that's with short breaks for water.

    Conclusion: Just hire a hot secretary that can type.
  • by bADlOGIN ( 133391 ) on Thursday December 09, 2004 @02:16PM (#11044103) Homepage
    I'll be happy when someone codes a DWIM method (Do What I mean):)


  • Though I've never used this personally, I had a co-worker who was stricken with carpal tunnel. We both worked in tech support at the time. To accommodate her, she was allowed to use DNS (Dragon NaturallySpeaking). To be honest, after training it (which is the most tedious part) it worked quite well. For an author or someone who types for a living, this is a great tool. The only other concern I've heard is that it requires a reasonable amount of computing power.
  • by eno2001 ( 527078 ) on Thursday December 09, 2004 @02:19PM (#11044160) Homepage Journal
    ...it worked OK as long as you trained it properly and you had a nice quiet room and a good mic. However, there are issues with "voice typing" that can't be overlooked. Primary is security. If you want to type a document or e-mail that contains sensitive data, make damn sure that no one can hear you. My bank recently moved to a voice activated system. I'm surprised they haven't gotten a ton of complaints from people since it REQUIRES you to say your SS# and PIN out loud. This means I can no longer check my account from my cell phone or at work. If you sit down and think about how many things you type that you would never want to say out loud, you can see why voice typing hasn't taken off. Imagine this emanating from your cubicle in a monotone:

    "http://www.goat.cx/ Take that you bukkake loving lunixtards."

    Your co-workers would think you were a nutjob if they saw half of what you posted as AC to Slashdot. ;P
    • Your co-workers would think you were a nutjob if they saw half of what you posted as AC to Slashdot.

      As AC? Shit man, most people would think I was nuts if they saw half of what I posted _with_ my username.
    • "Your co-workers would think you were a nutjob if they saw half of what you posted as AC to Slashdot. ;P"

      Your coworkers probably already consider you a nutjob.

    • Have you tried using your touch-tone pad when prompted to say your SSN and PIN? Your bank may very well be different, but every automated phone system I've used has allowed me to do either.

      Granted, someone with a tap on your line would still be able to determine your numbers by analyzing the touch tones just as easily as from hearing you say them...
    • Normally these systems should only ask for *parts* of your PIN, e.g. the second and third number.

      If you listen long/often enough, you'll have all the numbers together, but still not in the right order. Safe enough for most applications.

      Kind of offtopic though.

      Bye egghat.
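The partial-PIN idea above can be sketched in a few lines. This is an illustration only, not how any particular bank does it; `make_challenge` and `check_response` are hypothetical names, and the PIN is assumed to be a string of digits.

```python
import random


def make_challenge(pin_length, k=2, rng=random):
    """Pick k distinct 1-based PIN positions to ask for, in ascending order."""
    return sorted(rng.sample(range(1, pin_length + 1), k))


def check_response(pin, positions, digits):
    """True if `digits` are the PIN digits at the requested positions."""
    return [pin[p - 1] for p in positions] == list(digits)
```

For a PIN of "4821" and a challenge of positions 2 and 3, the caller says "8 2" and never speaks the other two digits on that call. Whether the order stays hidden depends on whether the listener also hears which positions were requested; someone who hears both the prompt and the answer learns the position of every digit they overhear.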
  • Back in '96, I bought OS/2 4, which came with a headset and ViaVoice. It could be used for dictating or just for commands. But I didn't like the training part, and realized that I write faster than I talk and, more importantly, much faster than I can talk correctly.

    Looking back, depending on my mood my voice must be like... mmm, remember those distorted graphics used for confirmations on web sites, where people can tell what the text says but OCR can't? Well, the same :)

  • Even if this were perfect, it would still be stupid. Have you ever listened to yourself talk? Do you really want that recorded?

    More to the point: have you ever learned a foreign language? Remember how obscenely different the written language is than the spoken one? The same is true of English, we just don't notice it as much. There are more stringent requirements for written speech -- that's why giving dictation is so hard. Complete sentences, no body language or appreciable emphases, paragraph stru

  • An important read on this topic is The Unfinished Revolution [readinggroupguides.com] by the late Michael Dertouzos. In the book he describes the core technologies and approaches of human-centric computing, and speech interface is included as an essential ingredient. It's not just for "slow typists who would like to use their voice to write", it's for the future of computing.

  • If by "talking" you mean verbally abusing and threatening Windows with a loaded gun.... then: YES, YES I DO TALK TO MY PC.
  • Just go to System Preferences, click on Speech, choose the Recognition tab, and away you go. How well does it perform? "Naught 2 wheel." Cancel; "Knot 2 veil." Cancel; "Not to L." Cancel; oh forget it, gimme my keyboard!
  • by techmuse ( 160085 ) on Thursday December 09, 2004 @02:35PM (#11044342)
    I have used Dragon NaturallySpeaking Professional for many years, and ViaVoice before that. ViaVoice's recognition was not so great, and the program crashed constantly. Dragon works very well. I can do around 140-150 wpm with it. I seem to have to make 1-2 corrections per sentence, sometimes less. I am using Dragon 7, but there is a new version (8) out now. I highly recommend this program if you have repetitive stress injuries, or would like to avoid developing them.
  • by Speare ( 84249 ) on Thursday December 09, 2004 @02:36PM (#11044348) Homepage Journal
    I've got a few machines around the house, and one is an eMac in the living room. It's mostly for edutainment titles, games, and to act as a print server, but I played with Mac OS X's "Speakable Items" capabilities.

    I use the text-to-speech on several crontab entries. Chip (yes, that's the computer's name) will announce basic daily schedule items, such as the date in the morning, kid's bedtime, and a final signoff at 11pm. I added some checks so it wouldn't talk whenever iDVD or iTunes was running. I used to have it monitor news headlines too, but it would talk too often and we would tune it out.

    I also tried some "Speakable Items" for basic tasks. Essentially, there is a special folder with a number of AppleScript files. The filenames are their voice triggers. If the computer hears you say one of those filenames, it runs the AppleScript. There are nested directories with items for specific applications, so you can speak the global commands or the active app's specific commands. Well thought-out.

    Some Speakable Items could come in handy, but the eMac microphone is too limited to be able to command the machine from across the room. You also cannot have a set of Speakable Items somewhere which are still active when nobody's logged in. Thus, I need to have a user logged in (and then turned away with user switch). Lastly, for most of the automation tasks I'd like to run, Perl or Bash is a better choice than AppleScript, but Speakable Items must be special text-command files or AppleScript, and I can't imagine making a bunch of AppleScript stubs for each Unix-style script I would write. These each limit the usefulness of the voice-commandable appliance I was hoping for.

    On the utility side, speech command would be great for specific queries, "Chip, what day is it?" and generic countdowns: "Chip, give me ten!" and he'll tell you when ten minutes have elapsed.
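The folder-of-triggers design described above is simple enough to imitate for ordinary Unix scripts. The following is a minimal sketch of that idea, not Apple's implementation: it assumes some recognizer already hands over the heard phrase as text, that every file in the folder is executable (with a shebang line, so Perl or Bash both qualify), and the function names are hypothetical.

```python
import os
import subprocess


def load_triggers(folder):
    """Map each file's base name (lower-cased, extension dropped) to its path."""
    triggers = {}
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            phrase = os.path.splitext(name)[0].lower()
            triggers[phrase] = path
    return triggers


def run_trigger(triggers, heard):
    """Run the script whose file name matches the recognised phrase, if any."""
    path = triggers.get(heard.strip().lower())
    if path is None:
        return None
    # Assumes the file is directly executable; its stdout is returned as text.
    result = subprocess.run([path], capture_output=True, text=True)
    return result.stdout
```

A file named `what day is it.sh` in the folder would then answer to the phrase "what day is it", with no AppleScript stub per script.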

  • Shouting to yer neighbour: cee dee slash enter! are em space dash are ef! enter!
  • by Infonaut ( 96956 ) <infonaut@gmail.com> on Thursday December 09, 2004 @02:38PM (#11044376) Homepage Journal
    of people talking to their computers. Some people aren't bothered by noise pollution, but it drives me crazy. The babble of people on the phone in a crowded space is bad enough. Add to that people talking to their computers constantly, and postal employees won't be the only people going off with AKs.

    • And of course, what happens when you have a room full of people all talking is that the overall volume level slowly rises as they try to compensate for not being able to hear themselves by talking louder... and louder... and louder... until everyone in the entire cube farm is screaming at the tops of their voices at their computers.

      Might make for a good sitcom scene...
  • by Spy der Mann ( 805235 ) <spydermann.slash ... m ['mai' in gap]> on Thursday December 09, 2004 @02:38PM (#11044378) Homepage Journal
    IMHO, the problem with this kind of engine is that it doesn't separate speech-to-phoneme from phoneme-to-text.

    If someone designs a good open source speech to phoneme architecture, I'm sure people would start working on phoneme to text AI algorithms.

    They say: "Open source? Death!!! Where will our revenues for research go?"

    But... what use is patenting/selling something that doesn't work in the first place?

    Again, this is only my personal opinion. (I couldn't RTFA because... *slashdotted* :-/ )
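To make the proposed split concrete, here is a toy version of the second stage only (phoneme to text). It assumes the first stage has already produced a clean phoneme sequence, which is the hard part and is not shown. The lexicon and its phoneme symbols are invented for illustration and are not taken from any real pronunciation dictionary.

```python
# Hypothetical toy lexicon: word -> phoneme sequence (symbols made up for this sketch).
LEXICON = {
    "to": ("T", "UW"),
    "two": ("T", "UW"),
    "red": ("R", "EH", "D"),
    "read": ("R", "EH", "D"),
    "e": ("IY",),
    "ready": ("R", "EH", "D", "IY"),
}


def phonemes_to_texts(phonemes, lexicon=LEXICON):
    """Return every word sequence whose pronunciations concatenate to `phonemes`."""
    phonemes = tuple(phonemes)
    if not phonemes:
        return [[]]  # one way to spell nothing: the empty sentence
    results = []
    for word, pron in lexicon.items():
        # Try each word whose pronunciation is a prefix of what remains.
        if phonemes[:len(pron)] == pron:
            for rest in phonemes_to_texts(phonemes[len(pron):], lexicon):
                results.append([word] + rest)
    return results
```

Even with six words, the sequence R EH D IY comes back as "ready", "red e" and "read e". Picking among such candidates is where a language model or other "AI algorithm" would sit, and it is a separate problem from the acoustic one.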
  • and it talks back.

    Which is weird, since it doesn't have a mic or speakers.

    Right now it's telling me that it's time to go home and clean the guns.

    • WireDog;

      Please let us know when your computer tells you to bring those guns to work for "show and tell".

      Unfortunately, whatever day that will be is the day I have an off-site meeting. :-(

      But let me know the day before, just in case.

  • I played around with Dragon and Kurzweil back in the day and man were they horrible. You practically had to read the thing an entire novel to get it to 95%.

    Sometime in the post Pentium revolution, algorithms got a shot in the arm and dictation software started getting significantly better even before training.

    The biggest problem I've had is that reading a predetermined text to a computer doesn't sound anything like the casual style of speech I'm going to use for voice input. Anything I read over turned out re
  • by sootman ( 158191 ) on Thursday December 09, 2004 @02:45PM (#11044454) Homepage Journal
    I've played with the speech recognition that came with my tablet PC. Works OK if I'm by myself in a quiet room where I can non-self-consciously talk unusually loud-n-clear. Every time I've demo'ed it to people, in an office environment talking normally, the results are laughable.

    The good news is, you can play "Telephone" all by yourself! Remember that game where you sat in a circle, and one person says a sentence to the person next to him, and he tells the next person, and so on all around the circle, and then you hear the final version? Just talk to your computer, then when your words are shown (incorrectly) on the screen, read those words back, and so on. Easier and more fun than going from German to French to English to Spanish to French to German to English in Babelfish. :-)
  • ...to Opera [opera.com]. The Voice-Feature [opera.com] lets you control the browser easily, including zooming in and out, which is very interesting for those of us who are handicapped.
  • I've been using voice for about 6 months. I had a big issue with right hand pain this year so I talked to a fellow developer who helped me get set up. We've done some custom grammars in Python for our dev environment. It's been helpful. It's a long way to go if you want to reduce mouse usage. The mouse has to be the worst peripheral for the PC. I'm considering buying the SmartNav http://www.naturalpoint.com/smartnav/ [naturalpoint.com] to get rid of my mouse. I've messed with one on a PC with 2 screens. It was nice.
  • David Pogue in the New York Times did a very effective review of Dragon NaturallySpeaking by using the product in a split screen video. http://www.nytimes.com/videopages/2004/12/02/technology/20041202_STAT_VIDEO.html [nytimes.com]

    I wish I could figure out how to embed a url without printing out the entire url.

  • As much as I type, and as fast as I do it, can you imagine how hoarse I would get trying to keep up?

    Also, how would you say:
    $preg_match="/([0-9a-zA-Z\_].*)[Ee]nd/";
    if ($a == 5)
    {
    echo $theTotalAmount;
    echo SubRoutine($subTotalAmount, $preg_match);
    echo '<br>There is <B>not enough money</B>, stupid!';
    }
    Keyboard for me....
  • In Korea, voice recognition is only for old people
  • by Mantorp ( 142371 )
    but I often swear at it
  • I've played with voice recog stuff on my Mac under OS X. I've written several AppleScripts that I tell to do things like Open Firefox, Open App. Each app I want to open has to have a separate entry, but to be honest, using AppleScript this is an absolute cakewalk. Also, the speech-like nature of AppleScript makes me think that one day you will just start talking at your computer and it will run what you say.

    I must say this stuff is very forgiving and for the kinds of actions like open, c

  • It is encouraging for slow typists who would like to use their voice to write.
    Apart from the physically handicapped, are there really that many "slow typists" who type so slowly that this would be a speed improvement?

    I always assumed that speech recognition was for cases where typing was not possible (e.g. in a car, if you are handicapped, etc).

    I get more throughput typing my own letters than dictating to a stenographer, for example.

  • by Dammital ( 220641 ) on Thursday December 09, 2004 @04:48PM (#11045889)
    Nat (the Ximian dude) recently hurt himself and has been reduced to being a one-handed typist. In order to stay connected, he's hired someone to take dictation for him. In today's blog entry [nat.org] he talks about the experience, what it's like for a very competent typist to use a dictation system, and thinks aloud about future intelligent speech-to-text applications.
