james mckay dot net
because there are few things that are less logical than business logic

Trying out speech recognition in Windows Vista

Over the past few months I’ve been rather intrigued by some of the reports that I’ve read about Windows Vista speech recognition.  For example, Scott Hanselman claims an increase in speed from approximately 72 words per minute when typing to about 125 words per minute with voice recognition—an improvement of approximately 75%.

Now, Hanselman works for Microsoft, so it is only natural that he would give anything to do with Windows Vista a glowing report. And let’s face it, a typing test is a very artificial way of trying out this kind of thing: you’re not chopping and changing all the time but reading verbatim off the screen.  Other people are less complimentary.  Microsoft’s ill-fated demonstration of Vista’s speech recognition went down in history, with the immortal phrase “Dear aunt, let’s set so double the killer delete select all” appearing on the evening news and geek T-shirts.

I hadn’t actually used Windows Vista that much until recently.  My old laptop only had Windows XP on it, and I use Windows Server 2003 at work.  However, a couple of weeks ago I got a new laptop, complete with Windows Vista, so I thought I might as well put it through its paces, and today I’ve been trying it out.  I’ve set myself the goal of writing a complete blog entry without using the keyboard or the mouse: opening Windows Live Writer, navigating Firefox and Google Reader, finding other pages that I want to reference, and inserting hyperlinks, all using speech alone.

My experience so far has been more along the lines of the ill-fated demonstration than Hanselman’s glowing report, but to be fair, it’s still early days.  Initially it was painfully awkward, and after a few hours it’s still pretty clunky, but it does seem to be learning from its mistakes and it does get it right about 80% of the time.  The problem is that in the other 20% of the time, when it doesn’t get it right or when you want to chop and change things, it is very slow and fiddly.  Some things don’t even work: “show numbers” doesn’t show numbers correctly for hyperlinks on Ajax-enabled sites in Firefox, and your blog’s category names don’t appear in the “categories” drop-down list in Windows Live Writer.

Yes, it’s more comfortable than using the keyboard, but it does take a lot longer, and thoughts only trickle from your brain into your document rather than flowing.  It is also pretty frustrating if you keep chopping and changing things, as I do when I’m writing. Perhaps if I persevere at it I might find it improves, but I don’t think we’re going to be dispensing with our keyboards anytime soon. 

1 comment:

  • # Reply from Scott Hanselman at 05:35 on 13 Apr 2008

    A little presumptuous that I’d lie or give a positive review about a product just because I work for The Man. 😉 My experience is my experience regardless of employer.

    I use Dragon NaturallySpeaking in my everyday work and I’ve written 3 books using Dragon. It takes time, training, patience, and a good, high-quality microphone.
