Open Source Speech Recognition - With Source

Paul Lamere writes "This story on ZDNet and this recent story on Slashdot describe the recent open sourcing of IBM's voice recognition software. This release, unfortunately, doesn't include any source for the actual speech recognition engine. Olaf Schmidt, a developer on the KDE Accessibility Project, is quoted as saying 'There is no speech-recognition system available for Linux, which is a big gap.' In an attempt to close this gap, we have just released Sphinx-4, a state-of-the-art, speaker-independent, continuous speech recognition system written entirely in the Java programming language. It was created by researchers and engineers from Sun, CMU, MERL, HP, MIT and UCSC. Despite (or because of) being written in the Java programming language, Sphinx-4 performs as well as similar systems written in C. Here are the release notes and some performance data."
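
For readers who want a feel for what using the engine involves, here is a rough sketch of a live-microphone recognition loop modelled on the HelloWorld demo that ships with Sphinx-4. The class and method names are recalled from that demo and should be treated as approximate rather than authoritative.

    // Rough sketch of a Sphinx-4 recognition loop, based on the bundled
    // HelloWorld demo. Exact class/method names may differ between releases.
    import edu.cmu.sphinx.frontend.util.Microphone;
    import edu.cmu.sphinx.recognizer.Recognizer;
    import edu.cmu.sphinx.result.Result;
    import edu.cmu.sphinx.util.props.ConfigurationManager;

    public class SphinxSketch {
        public static void main(String[] args) throws Exception {
            // The engine is wired together from an XML configuration file
            // (front end, acoustic model, language model, search manager).
            ConfigurationManager cm = new ConfigurationManager(
                    SphinxSketch.class.getResource("helloworld.config.xml"));

            Recognizer recognizer = (Recognizer) cm.lookup("recognizer");
            recognizer.allocate();                      // loads the acoustic model

            Microphone microphone = (Microphone) cm.lookup("microphone");
            if (!microphone.startRecording()) {
                System.err.println("Cannot open the microphone.");
                return;
            }

            while (true) {
                Result result = recognizer.recognize(); // blocks until an utterance ends
                if (result != null) {
                    System.out.println("You said: " + result.getBestFinalResultNoFiller());
                }
            }
        }
    }
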
  • by Anonymous Coward on Tuesday September 28, 2004 @07:19PM (#10378930)
    Ate lurks barry wall.
  • Java!?! (Score:3, Funny)

    by Anonymous Coward on Tuesday September 28, 2004 @07:19PM (#10378932)
    Quick, someone port this to C.
  • by RobertTaylor ( 444958 ) <roberttaylor1234 AT gmail DOT com> on Tuesday September 28, 2004 @07:22PM (#10378953) Homepage Journal
    "Open Source Speech Recognition - With Source"

    "This release, unfortunately, doesn't include any source for the actual speech recognition engine."
    • That's for the IBM one, dummy. Let me guess - you saw that sentence and had an instant knee-jerk reaction without reading the rest of the summary to find out what it's talking about.
  • Java comment (Score:3, Insightful)

    by Aragorn992 ( 740050 ) on Tuesday September 28, 2004 @07:24PM (#10378962)
    Despite (or because of) being written in the Java programming language, Sphinx-4 performs as well as similar systems written in C.

    I'm sick of these comments. Anyone who needs to know about the performance of Java knows it's very fast. Why bother commenting on it anymore?

    It's like saying "... and because it was written in C, it's very fast...", as if we didn't know already.
    • Re:Java comment (Score:4, Insightful)

      by Taladar ( 717494 ) on Tuesday September 28, 2004 @07:36PM (#10379056)
      Java might be as fast as C in code execution, but if you want to build a library that open source applications outside the Java developer niche will use, you have to write it in C. C is still THE No. 1 language for libraries used by programs written in lots of different programming languages.
    • Obviously (Score:5, Funny)

      by Moderation abuser ( 184013 ) on Tuesday September 28, 2004 @07:59PM (#10379231)
      "This data was collected on a dual CPU UltraSPARC(R)-III running at 1015 MHz with 2G of memory."

      Looking at the performance data it just blazes along on that config. Not exactly what I'd call an embeddable system, though Microsoft might beg to differ.

    • Funny, as an end user of those applications I ... feel... like ... I ... am ... constantly... WAITING, FOR SOMETHING, a screen refresh, button click, tree menu to expand; what I'm really waiting for is a reasonably performing Java app. Floating point operations, integer math, and benchmarks may say Java is fast, but as an end user, it's Slow.
    • by Ndr_Amigo ( 533266 ) on Tuesday September 28, 2004 @10:28PM (#10380081)
      While I've been waiting for Sphinx to mature into something useful for a long time now, the move to Java makes the whole package pretty useless to me.

      Java is a memory hog, and it's certainly not going to be on any device I would want speech recognition on. Heck, I don't have Java installed on any of my machines, mostly because of the absolutely ridiculous footprint on disk as well as when running in RAM.

      And integrating Java applications into other applications is very difficult. Now, Java is good for certain things, but a speech recognition engine in Java sounds like the worst abuse possible :)

      That and I still can't train it to recognise my slight Australian accent, unlike every other bit of SR software I've used on Win32 :P

      Whether or not Sphinx-4 works, and whether or not Java is 'fast' enough to do speech recognition processing, it's of no use to me.
      • Every year the Java naysayers get more and more frustrated and more desperate to find a reason that Java just won't do. For years it was that Java was too slow... that one was true for about 18 months in 1995. Well, maybe now that we can do crypto in Java, play DOOM in Java, and do speech recognition in Java we can finally put it to rest.

        Next up - Java's footprint is too big and its startup time too slow... Take a look at what they're doing in Java 1.5 to memory-map and share core classes and pre-bind read only cl
  • by Anonymous Coward on Tuesday September 28, 2004 @07:26PM (#10378983)
    When are we going to get GOOD text to speech, that uses modeled parameters of human vocal tracts rather than stitching together a bunch of pre-recorded phonemes?
    • by QuantumG ( 50515 ) <qg@biodome.org> on Tuesday September 28, 2004 @07:30PM (#10379014) Homepage Journal
      By "we" I assume you mean "the open source community", and the answer is "when you get off your ass and code it". If by "we" you mean the world at large, then go and look at AT&T's Natural Voices [att.com] project.
      • It still doesn't sound natural; this text sounds like a female Kirk read it.

        We would like to know if something does not sound quite right. After entering some text and listening to it, please fill out a feedback form and tell us what was mispronounced. And please note that no language translation is done so, for example, if you choose a French voice you should submit French text.

        (That text is from the same page.)
        • Yep, you're right. The voice is fine, it's the pronunciation and the complete lack of feeling that sux. I think what they need to do is get recordings of people reading a passage. Then make a speech synth that produces a similar sounding voice. Get it to read the same passage and then train it to produce identical output to the natural speaker. Repeat this with a few hundred passages and you'll capture a single person's reading style.
          • by cheezit ( 133765 ) on Tuesday September 28, 2004 @07:57PM (#10379224) Homepage
            I'm thinking it might be a bit more complicated than that...the human voice is unfortunately far too expressive.

            Have the same person read the same passage ten times the same way and you will get ten very different results. Ask them to change tones/emotions and the results will differ even more.
    • by Sheetrock ( 152993 ) on Tuesday September 28, 2004 @07:34PM (#10379043) Homepage Journal
      Given that there is already a rudimentary text-to-speech package available for Linux, and now a speech-to-text package, perhaps the secret is to pipe one to the other in a closed loop until one learns how to enunciate and the other how to listen?
    • But I like my text-to-speech output when it sounds like a Berserker. -Goodlife
    • There's a program called Praat [hum.uva.nl] that does this. However, you need a medical degree, or at least working knowledge of the muscles of the human vocal tract and what positions they must be in to produce certain sounds, in order to get any use out of it. After about 5 hours of playing with the parameters, I got it to say 'e'.

      Now, if someone were to make a program that generated coordinates for the muscles that corresponded to going between different utterances, we'd be in business.
    • by winterlens ( 258578 ) on Tuesday September 28, 2004 @11:46PM (#10380525)

      Probably because speaking is incredibly complicated, and providing realistic speech from unmarked text is an intractable problem.

      When you write something down, you don't provide a pronunciation guide. Rather, the reader is guided by context. For instance, if I write the word "import", how do you pronounce it? If we're talking about trade deficits, you probably know that the stress is on the second syllable; but if we're discussing meaning, the stress is on the first.

      How do we expect computers that have a difficult time with context to make a pronunciation decision? This is a serious barrier to "good" text to speech (whatever "good" means).

      If you mean that you want the voice to sound more natural, even if it's pronouncing words incorrectly, you still have a lot of hard problems. For instance, the muscles in the tongue and lips move differently based on how phonemes are grouped. Coarticulation models are difficult to construct, and when you try to account for a convincing number of muscles and vibrations, the problem may quickly become intractable.

      Not only do we have to pay attention to the physics of speaking, but also the physics of hearing. The amount of signal processing involved can be pretty staggering if you're going to implement a complete system. Thierry Dutoit has a really good book on the subject called An Introduction to Text-to-Speech Synthesis. You should check it out if you want a somewhat more exhaustive answer to your question.
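
      The point about stress shifting with context can be made concrete with a toy lookup keyed on part of speech. The pronunciation strings and the NOUN/VERB tags below are invented for illustration, not output from any real TTS front end:

          import java.util.HashMap;
          import java.util.Map;

          // Toy illustration of why unmarked text is ambiguous for a synthesizer:
          // the same spelling maps to different stress patterns depending on the
          // part of speech, which the engine has to guess from context.
          public class StressDemo {
              enum Pos { NOUN, VERB }

              // Stress is marked with an apostrophe before the stressed syllable.
              static final Map<String, String> PRONUNCIATIONS = new HashMap<>();
              static {
                  PRONUNCIATIONS.put("import|NOUN", "'im-port");  // noun sense
                  PRONUNCIATIONS.put("import|VERB", "im-'port");  // verb sense
                  PRONUNCIATIONS.put("record|NOUN", "'re-cord");
                  PRONUNCIATIONS.put("record|VERB", "re-'cord");
              }

              static String pronounce(String word, Pos pos) {
                  return PRONUNCIATIONS.getOrDefault(word + "|" + pos, word);
              }

              public static void main(String[] args) {
                  System.out.println(pronounce("import", Pos.NOUN)); // 'im-port
                  System.out.println(pronounce("import", Pos.VERB)); // im-'port
              }
          }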

  • by Anonymous Coward
    In OS/2. Really, it was just about a decade ago. It worked pretty well, especially when you take into account the computer power of the time.
  • by Anonymous Coward
    Old and busted = voice recognition

    New hotness = word spotting

    When are we going to see software for Linux that allows us to search for keywords in audio or video files like Dragon MediaIndexer [scansoft.com] does?
  • by nihilogos ( 87025 ) on Tuesday September 28, 2004 @07:30PM (#10379020)
    Colloquially known as "pointer-envy", this condition may affect all programmers, but is especially prevalent in java and C# developers. It is most easily recognized in a release announcement, where for no reason whatsoever the afflicted developer suddenly interjects a statement like "and it's just as fast as C", to the bewilderment of the audience.

    Treat suspected cases with caution, and under no condition contradict the patient. There is no known cure.
    • by Xeger ( 20906 ) <slashdot@tracAAA ... inus threevowels> on Tuesday September 28, 2004 @07:45PM (#10379127) Homepage
      KNOWN CAUSES: Recent research results from information-theoretic psychoanalysts show that Virtual Machine Syndrome is most likely a pre-emptive defensive discourse strategy. VMS sufferers typically become symptomatic after months or years of constant haranguing by colleagues, friends and professional contacts insisting that anything they write, regardless of its execution environment or portability requirements, could have been done "better and faster in C." Oftentimes, such criticism is leveled at VMS sufferers even when the application in question is I/O-bound and spends 80% or more of its time suspended, waiting for network or disk I/O to complete.

      TREATMENT: Implement reliable and efficient systems using virtual machine of choice, regardless of criticisms. Apply free-market therapy judiciously, allowing adopters of Virtual Machine technology to thrive and become prosperous if warranted. VMS symptoms typically disappear when sufferer's stock options are valued at 300% of their strike price. Symptoms may also be temporarily relieved through just-in-time compilation.

      RELATED SYNDROMES: Ossified Self-Important Myopia (OSIM), which is the tendency to assume that one's favorite programming paradigm, language, or OS is unconditionally and unreservedly the best choice for any software project. Characterised by the inability to understand that the only way to guarantee maximum efficiency is to write everything in assembly language, with complete and perfect knowledge of all quirks of the specific target instruction set.
    • by pslam ( 97660 ) on Tuesday September 28, 2004 @07:50PM (#10379171) Homepage Journal
      It is most easily recognized in a release announcement, where for no reason whatsoever the afflicted developer suddenly interjects a statement like "and it's just as fast as C", to the bewilderment of the audience.

      An especially odd statement considering much of speech recognition can be broken down into great big vector operations, which are perfect for hand-coding in C. Bet I could quadruple the speed of it in a couple of hours with some hand-coded SIMD ops in x86 assembler.

      It's funny because Java is fantastic at JIT compiling code with lots of non-local behaviour (e.g. complex UIs) because it can take into account global behaviour at runtime. But it sucks at tight, heavy computation loops. DSP is a fantastic example of something Java is going to get creamed at when pitted against non-virtual machines.

      Of course, if you have some cross-platform standard API calls for those vector DSP ops, then it's a different argument...
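
      For concreteness, the kind of inner loop in question is essentially a multiply-accumulate over float arrays, e.g. scoring a feature vector against a set of weights; a minimal Java version, which is exactly what a hand-coded SIMD routine would target, might look like this (the 39-element length is just a typical MFCC feature size, chosen here for illustration):

          // The flavour of tight loop that dominates acoustic scoring: a plain
          // element-wise multiply-accumulate over float arrays. Whether the JIT
          // or hand-written SIMD wins on loops like this is the argument above.
          public final class Dot {
              static float dot(float[] a, float[] b) {
                  float sum = 0f;
                  for (int i = 0; i < a.length; i++) {
                      sum += a[i] * b[i];
                  }
                  return sum;
              }

              public static void main(String[] args) {
                  float[] feature = new float[39];   // e.g. a 39-dimensional feature vector
                  float[] weights = new float[39];
                  java.util.Arrays.fill(feature, 0.5f);
                  java.util.Arrays.fill(weights, 2.0f);
                  System.out.println(dot(feature, weights)); // prints 39.0
              }
          }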

    • by Brandybuck ( 704397 ) on Tuesday September 28, 2004 @08:18PM (#10379369) Homepage Journal
      Try my new text editor, it's written in Java!

      Why should I?

      Because it's written in Java!

      How is it better than what I'm currently using?

      It's written in Java!

      I'm already using vi, emacs, kate and gedit, why should I use yours as well?

      Because it's written in Java!

      Does it have a spell checker, syntax highlighting, and auto-indent?

      Who cares? It's written in Java!

      Name two benefits of your text editor?

      That's easy! First, it's written in Java. Second, it's uh... uh... hang on, uh... it's written in Java! Yeah, that's it, it's written in Java!
      • by deanj ( 519759 ) on Tuesday September 28, 2004 @09:24PM (#10379719)
        Heh...you could substitute "uses Linux" for "written in Java", and you'd have the same thing.

        Seriously though, Sphinx-4 is really worth looking at. That group at Sun does great work.
    • In theory, it's all compiled down to assembly in the end anyway, so it has an equal chance of being just as fast. For some types of code, JIT can be faster.

      Some of the advantages of byte-code are:
      - branch prediction and other speculative optimisations can be done based on observing the flow at runtime rather than guessing at compile time
      - it's not necessarily tied to a specific architecture
      - if code optimisation technology improves, you don't need to recompile anything; the new JIT engine can do it all for yo
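
      One way to see the 'observe the flow at runtime' point in practice is to time the same work on successive passes within a single JVM run: later passes usually run code the JIT has recompiled after watching it. A crude sketch (the numbers it prints are machine- and JVM-dependent):

          // Crude demonstration of JIT warm-up: the same work usually gets faster
          // once the JIT has observed and recompiled the hot loop.
          public class WarmUp {
              static long work() {
                  long sum = 0;
                  for (int i = 0; i < 10_000_000; i++) {
                      sum += i % 7;
                  }
                  return sum;
              }

              public static void main(String[] args) {
                  for (int pass = 1; pass <= 5; pass++) {
                      long start = System.nanoTime();
                      long result = work();
                      long micros = (System.nanoTime() - start) / 1_000;
                      System.out.println("pass " + pass + ": " + micros + " us (result " + result + ")");
                  }
              }
          }
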
  • Another open source system, implemented in C++ (like all industrial systems I know of), can be found here [msstate.edu] (a vision statement is here [msstate.edu]).

    --
    Try Nuggets [mynuggets.net], the mobile search engine. We answer your questions via SMS, across the UK.

  • Cool... (Score:3, Funny)

    by j_cavera ( 758777 ) on Tuesday September 28, 2004 @07:35PM (#10379053)
    Now my linux box can wreck a nice beach!

  • by NSash ( 711724 ) on Tuesday September 28, 2004 @07:37PM (#10379064) Journal
    From dept-of-redundancy-department?

    I'm not one to be picky about titles, but sheesh...
  • Build Instructions (Score:2, Insightful)

    by Anonymous Coward
    Given those build instructions, you are better off writing your own engine. This is exactly what is wrong with Linux today, and I don't see *any* solution to it. A maze of hidden dependencies and incompatibilities. No thanks.
  • Telephony (Score:2, Interesting)

    by Anonymous Coward
    So how long before this is integrated with Asterisk for voice-activated Linux telephone apps?

    Michael
  • Sphinx 2 (Score:5, Informative)

    by PiGuy ( 531424 ) <squirrel@@@wpi...edu> on Tuesday September 28, 2004 @07:44PM (#10379122) Homepage

    "There is no speech-recognition system available for Linux, which is a big gap."

    Um, Sphinx 2 [sourceforge.net] (a predecessor of Sphinx 4) has been around for quite some time now. Like Sphinx 4, it's speaker-independent. Unlike Sphinx 4, it's a C library, and is thus easily interfaced with other languages (insert shameless plug for a simple Python interface [wpi.edu] for Sphinx 2 I wrote).
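
    To illustrate the point about a C engine being easy to wrap from other languages, this is roughly what a minimal Java-side JNI binding could look like. The library name sphinx2bridge and the native method names are hypothetical, invented for the sketch rather than taken from the real Sphinx 2 API:

        // Hypothetical JNI wrapper around a C recognition engine, showing why a
        // plain C library is easy to call from other languages. The library and
        // method names here are made up; they are not the actual Sphinx 2 API.
        public class NativeRecognizer {
            static {
                System.loadLibrary("sphinx2bridge"); // loads libsphinx2bridge.so
            }

            // Implemented in C; declared here so Java can call straight into it.
            private native void init(String modelDir);
            private native String decode(short[] pcmSamples);
            private native void close();

            public String recognize(short[] pcmSamples) {
                return decode(pcmSamples);
            }
        }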

    • Re:Sphinx 2 (Score:3, Informative)

      by Ndr_Amigo ( 533266 )
      The speaker independence of Sphinx2 is debatable; I have never been able to get a single word recognised successfully :)
  • Speech recognition (Score:5, Interesting)

    by CastrTroy ( 595695 ) on Tuesday September 28, 2004 @07:44PM (#10379123)
    Speech recognition is one of the worst means of input there is for a computer. Keyboards work so much better. Even for those who don't have full use of their hands, there are many other options for user input, all of which are better than speech recognition. The worst thing ever is someone trying to use speech input in a cubicle environment.
    • "Speech recognition is one of the worst means of input there is for a computer. Keyboards work so much better."

      This statement is far too general to be true. The keyboard is only faster if you know what command you're trying to enter AND how to spell it. Voice recognition, used correctly, is much more intuitive. Maybe it's not so hot for dictation, but imagine if an app you're using didn't have to have a bunch of hard-to-sift-through menus. Just say 'Italic!' or 'Bold!'

      SR is much more interest
    • by DarkMan ( 32280 ) on Tuesday September 28, 2004 @08:49PM (#10379553) Journal
      You are, of course, perfectly correct in everything you said.

      There are a number of HCI aspects where speech recognition is not a good solution.

      However, let me enumerate a number of other ones, where it's superior:

      Minutes of meetings, or similar. Imagine having a verbatim record of a discussion ready by the time you get back to your desk.

      Someone who cannot type - e.g. no hands. Rare, granted, but still a viable use.

      Someone whose hands are busy. The canonical example here is a pathologist doing an autopsy, where they dictate everything. Speech recognition saves time in transcription (and money for the audio typist).

      I'd love to be able to issue voice commands to a computer, for a few, isolated cases. For example, diagnosing hardware. Bring up a doc, and be able to get the computer to flip pages, without having to remove the probes from the hardware. Relocating them is a pain and wastes time.

      Moreover, I'm certain that there are others, some of which will only be realised when it's common and cheap enough to be widely available.

      It's like a mouse. It's one of the worst general purpose input devices for a computer [0], but it excels at indicating a single element on a display. The mouse and keyboard complement each other, and there are a bunch of other, more specific input devices, such as the graphics tablet. I have no doubt that if speech recognition were as accurate and reliable as a graphics tablet, it would get a similar amount of use.

      [0] Try inputting a block of prose with only a mouse. Even specialist software makes it only suck marginally less.
      • Replying to myself, because I just had one of those silly ideas, that Might Just Work (tm).

        You know how the TV remote gets lost from time to time? And it's always a pain to find. Or the remote is on the other side of the room, so you have to walk away from the TV to pick it up, moving further than if you had just walked to the TV itself?

        Put a microphone on the set top box, and use voice recognition instead / as well as a remote.

        Sonic contamination is easily solved by subtracting from the picked-up audio the TV
    • If I didn't have speech recognition on my PowerBook... I couldn't tell it to turn the lights on or off. Jesus... are you saying I should get off my lazy ass and actually hit a light switch?

      I love x10 :-)

  • by Noksagt ( 69097 ) on Tuesday September 28, 2004 @07:45PM (#10379129) Homepage
    Woman: [dictating into cell phone] To: Mike. I had fun last night.
    Cell Phone: To: Mike. I have lip fungus.
    Woman: [into cell phone, angrily] I had FUN, not lip fungus!
    Cell Phone: I have fungus, not lip fungus.
    Woman: I DON'T HAVE LIP FUNGUS!!!
  • by ImaLamer ( 260199 ) <john.lamar@gma[ ]com ['il.' in gap]> on Tuesday September 28, 2004 @07:47PM (#10379146) Homepage Journal
    I've used a few packages for speech recognition but none really got me too excited. Well, Dragon NaturallySpeaking did have me read a few chapters of Dave Barry to it. I bet it didn't work because of all the laughing; I was in tears.

    I must say though that speech recognition is something that the whole computer community needs to work on. Now, we can finally do that. All the "open source community" needs is source that works a little. In a year or so, I bet this works better than most options available today.

    Now, I know that isn't the rule, but this is the type of thing that computer/math engineers could sit down with and contribute to where others can't. It seems to be the rule that the really smart ones tend to work with open source software...

    Really the cool thing is that this could get people involved who otherwise wouldn't because they don't know where to start.

    • by jwsd ( 718491 )
      It seems to be the rule that the really smart ones tend to work with open source software...

      Or it seems to be the rule that those who work with open source software tend to think they are the really smart ones...
  • "Despite (or because of) being written in the Java programming language, Sphinx-4 performs as well as similar systems written in C"

    It's amazing that the myth of Java being slow is so persistent. In fact, for computational tasks, many benchmarks have shown that a modern optimized JVM with JIT compilation is roughly equivalent to most implementations of C++, with some benchmarks being better for Java and some being better for C++ [javaworld.com].

    Java *used* to be slow, in the days before optimized JIT JVMs. IMHO, anot
    • by nihilogos ( 87025 ) on Tuesday September 28, 2004 @08:09PM (#10379303)
      many benchmarks have shown that a modern optimized JVM with JIT compilation is roughly equivalent with most implementations of C++, with some benchmarks being better for Java and some being better for C++.

      And many studies have shown that going with Microsoft software is cheaper than going with open sourced software.
    • I could easily live with 10-15% slower, IF Java didn't have the startup overhead. I can run inetd-style fork-exec-terminate servers in C on CPUs that a cellphone would spit on, and handle hundreds of connections a second. Bringing up a JVM on the same processor would take minutes. Bringing up a JIT runtime would be out of the question.

      For applications where you can create a JVM and use it as you need it, Java's great. Webservers, sure, no problem. Desktop applications, heck, the GUI overhead's getting to be the same order of magnitude (though that HAS to change, we can't afford to depend on Moore's Law much longer unless someone comes up with a clever way to cut the power consumption of processors faster than the speed increases). Browser plugins? For content, yes, but not for navigation... if it takes 10s to start up a JVM your customer's already hit "back".
      • I could easily live with 10-15% slower, IF Java didn't have the startup overhead.

        This is the reason I keep wondering about whether or not this, that, or the other Java package can be compiled to native code with GCJ [gnu.org]. If so, that should solve the overhead issues involved in calling up a JVM...

      • by LarryRiedel ( 141315 ) on Tuesday September 28, 2004 @09:54PM (#10379901)
        I can run inetd-style fork-exec-terminate servers in C on CPUs that a cellphone would spit on, and handle hundreds of connections a second. Bringing up a JVM on the same processor would take minutes.
        [...]
        if it takes 10s to start up a JVM your customer's already hit "back".

        I find that startup/shutdown for a simple Java program takes about 200ms at 1GHz with the vanilla Sun JDK 1.5 JVM, or 150ms using gcj (gcc), and an equivalent C program takes about 2ms.

        Browser plugins? For content, yes, but not for navigation.

        The overhead of starting a JVM should be incurred only once per browsing session.

        Larry

        • I find that startup/shutdown for a simple Java program takes about 200ms at 1GHz with the vanilla Sun JDK 1.5 JVM, or 150ms using gcj (gcc), and an equivalent C program takes about 2ms.

          A factor of 100 difference in the overhead is a bit better than I've seen. I assume that I've never tried it on a sufficiently simple Java program, or you're talking about a dynamically linked C program. Still, a factor of 100 difference in the startup overhead is hardly a negligible consideration.

          The overhead of starting
    • It's amazing the myth that Java is fast just because it can optimize some benchmarks and computationally intensive loops. In my real world with a real world computer running real world applications, Java still runs slower than C.

      I've ported several computationally intensive image processing programs from C to Java and have experienced a speed degradation of perhaps 10-15%

      Aha! Even you admit it!
    • by Gopal.V ( 532678 ) on Wednesday September 29, 2004 @03:22AM (#10381314) Homepage Journal
      > It's amazing that the myth of Java being slow is so persistent

      Before you mod me down as a troll: I work on a virtual machine as a hobby.

      The problems with Java being slow have little to do with the "execution of code" part. The parts that take a hit are the garbage collector and the class loader. The latter causes a HUGE hit at startup. The former is responsible for those strange Swing freezes I've been seeing when I switch into a Java app.

      Unicode also brings its own set of junk; for example, "Hello World" in dotgnu's JIT does 7302 hashtable inserts and 6000+ StringBuffer operations to initialize the Unicode encoder/decoder. And that is the standard way of decoding Unicode (Mono uses the same code).

      Lastly, C/C++ commonly uses a lot of fields while Java brings in get/set methods for these. A method call for a get or set is a LOT more expensive than a pointer read (see the microbenchmark sketch below). Design has a lot to do with why Java is slow.

      The enterprise apps where Java is popular are essentially backend applications which run for long periods of time (so they have all the classes looked up and loaded) with a HUGE heap (256 MB or more), where an occasional GC freeze won't destroy the entire experience (as the interfaces are often JSP/web based).

      Java *is* fast, if you don't count the slow parts.
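
      The field-versus-getter claim above is easy to probe with a crude microbenchmark like the sketch below. Note that a modern JIT usually inlines trivial accessors, so the measured gap may turn out to be close to zero; treat the numbers as a probe, not a verdict:

          // Crude micro-benchmark for the field-access vs. getter claim. Naive
          // loops like this are easily distorted by inlining and dead-code
          // elimination, so the results only hint at the real cost.
          public class AccessBench {
              private int value = 1;
              public int getValue() { return value; }

              public static void main(String[] args) {
                  AccessBench o = new AccessBench();
                  final int N = 200_000_000;

                  long t0 = System.nanoTime();
                  long sumField = 0;
                  for (int i = 0; i < N; i++) sumField += o.value;      // direct field read
                  long fieldNs = System.nanoTime() - t0;

                  t0 = System.nanoTime();
                  long sumGetter = 0;
                  for (int i = 0; i < N; i++) sumGetter += o.getValue(); // accessor call
                  long getterNs = System.nanoTime() - t0;

                  System.out.println("field:  " + fieldNs / 1_000_000 + " ms (sum " + sumField + ")");
                  System.out.println("getter: " + getterNs / 1_000_000 + " ms (sum " + sumGetter + ")");
              }
          }
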
  • Argh! (Score:3, Funny)

    by kaffiene ( 38781 ) on Tuesday September 28, 2004 @07:57PM (#10379222)
    Resist! It's SUN trying to ruin Linux and OSS again!
    It even uses Java!!!! Slashbots must fight back!
  • Good Success (Score:2, Informative)

    by billdar ( 595311 )
    I've been using Sphinx for about a year or so now for a Linux-based home automation project. I must say that it has worked out very well for me so far.

    The speaker-independent feature is the best part. Not all words were recognized, only about 70%; probably because I slur the other 30%. It works equally well with either my wife or myself issuing commands.

    70% is more than I need for this particular project, but I'm sure this new release closes the gap even further.

  • by kongit ( 758125 ) on Tuesday September 28, 2004 @08:08PM (#10379300)
    Guess I won't be listening to music when root anymore. In fact I am soundproofing my room to keep the noises from infiltrating my microphone and causing me to accidentally delete /home
  • Sorry for airing reruns, but it's more relevant here. As I have said before:
    Re NSF blowing a measly million to put speech recognition in silicon [slashdot.org] [for which there were many interesting and informative comments posted] I said:
    Just a million? Pfft! I went down the tubes with one S.R. startup back in '92 that ate far more of some VC's money than that. Now NSF is not in it to get rich and I hope I am right in assuming that a successful chip design, if a mere $1000000 gets that far, would then be available at no
  • That's what I want, not SR. I tried using the voice recorder feature on my PDA, but it's not something I can use without a secretary to transcribe my voice into text. It's bad enough taking or leaving voice mail... it's just not my medium.

    But if I could take those wave files from my PDA and convert them to text notes... even in the background offline after I sync, then they'd be useful. But you need accurate transcription for that. Is that in there?


  • Open Source Speech Recognition - With Source

    Does it come "with au jus sauce" ?

    Would that make it "with with source source" ?
  • by Danny Rathjens ( 8471 ) <slashdot2NO@SPAMrathjens.org> on Tuesday September 28, 2004 @09:07PM (#10379635)
    http://perlbox.sourceforge.net/ [sourceforge.net]

    The very small vocabulary needed for desktop control makes the speech recognition much more accurate and usable.
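
    The small-vocabulary point is easy to picture: the recognizer only has to choose among a handful of phrases, each of which maps to an action. A toy dispatcher follows; the phrases and the programs they launch are made up for the example:

        import java.io.IOException;
        import java.util.HashMap;
        import java.util.Map;

        // Toy desktop-control dispatcher: with a vocabulary this small, the
        // recognizer only has to tell a few phrases apart. Phrases and the
        // commands they run are invented for illustration.
        public class VoiceCommands {
            private static final Map<String, String[]> COMMANDS = new HashMap<>();
            static {
                COMMANDS.put("open browser", new String[] {"firefox"});
                COMMANDS.put("lock screen",  new String[] {"xscreensaver-command", "-lock"});
                COMMANDS.put("next track",   new String[] {"xmms", "--fwd"});
            }

            public static void handle(String recognizedText) throws IOException {
                String[] cmd = COMMANDS.get(recognizedText.toLowerCase().trim());
                if (cmd != null) {
                    new ProcessBuilder(cmd).start();   // run the mapped command
                } else {
                    System.err.println("Unknown command: " + recognizedText);
                }
            }
        }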

  • Speech recognition seems similar to VRML. It would be really cool if it worked. But it never quite seems to work.
  • And before that, I worked for Articulate Systems, also doing UI work.

    With that said, you can probably guess I have a lot to say about Speech Recognition. (Not Voice Recognition; that's different: that would be able to distinguish Ben from Charlie, for example.)

    A good SR engine is, of course, essential. And I've not read the details on the two recent giveaways, but I suspect that they are only the engine.

    The SR engine is just a beginning. There is a ton of UI work that needs to be done. Sit and think about spacing around punctuation marks, and then think about capitalization around punctuation marks (a toy post-processing sketch follows this comment). Yeah, it is all pretty cut and dried and known, but the details really need to be sweated to get it right. This is very time consuming.

    Next you have to worry about exactly where you are editing. Is that in Microsoft Word (or Open Office), or emacs, or where? It can make a huge difference when you want to go back and correct misrecognitions. You just don't want to send N delete characters and retype it; that results in a lousy user experience. So just exactly where is the input cursor at all times? This is not an impossible problem, but one where the details must be sweated.

    Next is command and control. Just how are you going to let the user grab the text of all the menus and all the text in the dialog box buttons? Again, not impossible, but more of those pesky details.

    Finally, is your SR engine good enough? Maybe, maybe not. Let's just say that 98% accuracy might look good on paper, but that is one in 50 words wrong. Unless your correction mechanism is smooth, an error rate that high will greatly slow you down.

    Is Open Source SR a good thing? Oh yes sir, yes! But let's not forget the details. One thing the Open Source community has been accused of, perhaps justly, perhaps unjustly, is not sweating the details.

    Speech Recognition has an awful lot of details.
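
    As a taste of those punctuation details, here is a toy post-processor that turns a stream of recognized tokens into readable text. The token names 'period' and 'comma' are assumptions about how an engine might label spoken punctuation:

        // Toy dictation post-processor: attach punctuation to the preceding
        // word, keep a space before the next one, and capitalize after a
        // sentence-ending mark. Real systems sweat many more cases than this.
        public class DictationFormatter {
            public static String format(String[] tokens) {
                StringBuilder out = new StringBuilder();
                boolean capitalizeNext = true;
                for (String token : tokens) {
                    if (token.equals("period") || token.equals("comma")) {
                        out.append(token.equals("period") ? "." : ",");  // no space before punctuation
                        capitalizeNext = token.equals("period");         // new sentence after a period
                    } else {
                        if (out.length() > 0) out.append(' ');
                        out.append(capitalizeNext
                                ? Character.toUpperCase(token.charAt(0)) + token.substring(1)
                                : token);
                        capitalizeNext = false;
                    }
                }
                return out.toString();
            }

            public static void main(String[] args) {
                String[] tokens = {"open", "source", "matters", "period", "details", "matter", "too", "period"};
                System.out.println(format(tokens)); // Open source matters. Details matter too.
            }
        }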

  • by MarsF ( 631122 ) on Tuesday September 28, 2004 @09:41PM (#10379825)

    I was thinking about this the other day, and was wondering if this is a huge gap in the Windows user interaction model.

    Think about how you input info using Windows. You click on a few locations using the mouse, perhaps use some keyboard input, click some more. The output from these inputs is arbitrary: it may result in anything from a 'File/Save' dialog to a custom error dialog box. There is no linear path for inputting commands, or for mapping inputs to results.

    Compare this to the command line. You enter a few distinct atomic commands, and view the results in the same medium. You then enter more commands, refining your actions. The key here is that you already have a linear model for input that produces well defined expected results, all in a common medium that is conceptually simple, visible to the user, and easily processed by machines. Extending this model to accept voice input or output is trivial.

    How is one supposed to quantify basic tasks and turn them into equivalent voice commands without a baseline framework or paradigm to extend from? How do you automate, simplify, or extend existing tasks without a common input or output medium? GUIs provide no such medium or framework; that same framework is at the heart of the command line interface!

    Perhaps this is why we never saw voice recognition technology take off on Windows. It's blinking impossible to script actions for an arbitrary task, let alone process the arbitrary results!

    On a similar note, we may see voice recognition on Linux take off like a rocket. Anybody can add voice recognition to perform almost any command because the actions are all scriptable through the CLI already. If you can type it, you can get your computer to do it when you say 'computer, foo!'

    Mars

    P.S. It would be greatly appreciated if someone could please clarify my point. It's buried in there somewhere...



  • by janoc ( 699997 ) on Wednesday September 29, 2004 @07:30AM (#10382011)
    Hello, did any of you Java bashers actually try the Sphinx4 engine? I tried it and it is pretty good. Actually a lot better (faster and more accurate) than the older Sphinx2 engine, which was written in *gasp* C! Or are we bashing a project just because it is written in "slow and bloated" Java?

    I think some people should open their eyes, otherwise the world will leave you behind while you are happily consoling each other about how Java is slow and unusable. Wake up, folks!

    To the people arguing for hand-writing C and assembly: you obviously haven't tried to implement any of the algorithms (like hidden Markov models or the statistical searches) used in speech recognition; a bare-bones sketch of one follows this comment. It is a pain in the butt to do even in Java, but at least you do not have the pointer mess you would have in C/C++. The engine already performs well; I am not sure what you would gain by rewriting it, except bugs (the older Sphinx2 was for sure buggy as hell).

    A word about the memory footprint: Java can have a large memory footprint, but with speech recognition you will have one anyway. Just the acoustic models for one language can easily be on the order of several hundred megabytes. The memory footprint of Java is completely irrelevant here.

    And before somebody compares Sphinx with the speech "recognition" on your mobile phone or in your car: be aware that you are comparing a skateboard with a Concorde here. The Sphinx family of engines is intended for recognition of continuous, large-vocabulary speech and to be speaker independent. Your phone/car does small-vocabulary, single-word, speaker-dependent recognition, i.e. a completely different problem. You cannot think of Sphinx as something "to have on some device". It is intended more to act as a speech recognition server on a dedicated machine, e.g. for a large call center or ticket reservation system. I guess it could also be used in KDE for KAccessibility purposes, but it is a bit heavy for that (especially with the large datasets).

    So next time, before you start spouting BS about Java and applications written in it, at least check the facts, so people will not see you as a complete idiot.
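
    For readers who have not met them, the hidden Markov models mentioned above come down to dynamic programming over state sequences. A bare-bones Viterbi decoder over a toy two-state model (all probabilities invented) looks like this; a real recognizer does the same thing in log space over thousands of states and continuous observations:

        // Bare-bones Viterbi decoding over a toy 2-state HMM with 2 observation
        // symbols. All numbers are invented for the example.
        public class TinyViterbi {
            public static void main(String[] args) {
                double[] start   = {0.6, 0.4};                      // P(state at t=0)
                double[][] trans = {{0.7, 0.3}, {0.4, 0.6}};        // P(next state | current state)
                double[][] emit  = {{0.9, 0.1}, {0.2, 0.8}};        // P(symbol | state)
                int[] obs = {0, 0, 1};                              // observed symbol sequence

                int nStates = start.length;
                double[][] best = new double[obs.length][nStates];  // best path probability so far
                int[][] back    = new int[obs.length][nStates];     // backpointers for path recovery

                for (int s = 0; s < nStates; s++)
                    best[0][s] = start[s] * emit[s][obs[0]];

                for (int t = 1; t < obs.length; t++)
                    for (int s = 0; s < nStates; s++)
                        for (int prev = 0; prev < nStates; prev++) {
                            double p = best[t - 1][prev] * trans[prev][s] * emit[s][obs[t]];
                            if (p > best[t][s]) { best[t][s] = p; back[t][s] = prev; }
                        }

                // Follow backpointers from the best final state to recover the path.
                int[] path = new int[obs.length];
                int last = 0;
                for (int s = 1; s < nStates; s++)
                    if (best[obs.length - 1][s] > best[obs.length - 1][last]) last = s;
                path[obs.length - 1] = last;
                for (int t = obs.length - 1; t > 0; t--)
                    path[t - 1] = back[t][path[t]];

                System.out.print("Most likely state sequence:");
                for (int s : path) System.out.print(" " + s);
                System.out.println();
            }
        }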

  • by mattr ( 78516 ) <mattr&telebody,com> on Wednesday September 29, 2004 @10:08AM (#10383006) Homepage Journal
    Wow, it is great that Sphinx-4 is open! And guess what: Galaxy Communicator [sourceforge.net] also snuck onto SourceForge, quietly, a year ago. A year or so ago I had written to one of the partners to try to get a copy, with no reply... but some googling found it. Most Slashdotters probably don't know Galaxy, but it comes from the same partners: CMU, MITRE, DARPA, etc. It is the plug-and-play hub [sourceforge.net] for related technologies. This stuff has been used to make voice-recognizing automated telephone information services for weather and flight info, I believe. What I found on SourceForge is the 2002-2003 version (from when the grant ran out?) and it has a list of modules [sourceforge.net] which could use some updating, i.e. to mention that Sphinx-4 is available. So can we expect a new Galaxy Communicator distro? I always had trouble finding out about it because each participating institution had its own site and its own distro, some focusing on different things, etc. I remember looking at CMU and I think Colorado U., anyway.

    Note that in the 2002 version the dialog server is not included; this would be great to have too. MIT also has some very cool technologies in this area (SUMMIT, TINA, GENESIS, ...) which I do not believe are public; they just show little bits and pieces of PR about them, but they include natural language parsing, question answering, sentence generation, etc. It would be cool if someone on the inside could document just what things are available, what works with what, what is definitely ready for prime time, etc. There must be some people who hacked on this in the past few years and are still developing things; it would be cool if some of their experimentation were available to the open source community so people could get an idea of what things are possible. When I did my survey about a year ago, Communicator was daunting and intriguing, and it looked like you could do tons of stuff if you had some secret decoder docs and a spare year to hack. Maybe now's the time to dig into it hip deep?

"What man has done, man can aspire to do." -- Jerry Pournelle, about space flight

Working...