State of Sound Development On Linux Not So Sorry After All
An anonymous reader writes "There have been past claims by Adobe and others that development on Linux is a jungle, particularly with regard to audio. However today, the author of the popular 'The Sorry State of Sound in Linux' has posted a follow-up showing Adobe's claims to be FUD, as well as being a good update on where OSS and ALSA stand today, and why PulseAudio isn't a good idea."
Re:Main blocker (Score:3, Interesting)
Re:it's all relative (Score:5, Interesting)
Sorry - It's still pretty "sorry"... (Score:2, Interesting)
Really, it is.
It can be a pain in the ass to get working still, and is buggy.
I'm sure it works well for some, but many others still have problems.
What are we trying to achieve? (Score:5, Interesting)
Yes, Linux audio sucks. If nothing else, we have three common and incompatible APIs to perform a single task, and none of them is definitively better than the others. So, my question: what exactly is it that we're trying to achieve? What's the end goal of creating newer APIs instead of perfecting the old ones, such as moving from OSS to ALSA to whatever they roll out this month?
For comparison, FreeBSD uses multi-channel OSS. You can have a whole passel of processes writing to /dev/dsp simultaneously, because whenever a process attempts to open it, the OS spawns off a new copy. It Just Works. I'm a little amazed that my FreeBSD server's sound handling is so much better than my Linux desktop's and requires approximately zero client configuration. So again, what was Linux hoping to achieve by dropping old "obsolete" OSS in favor of increasingly complex solutions?
He makes one excellent and crucial point (Score:5, Interesting)
And that is that ALSA's way of handling mixing is completely moronic.
As a user, I care about hearing sound first of all. Sound quality (no pops or crackles) comes second, latency comes third.
There should always be sound mixing, with no ifs, buts, exceptions, or configuration required. It should be there by default for anything that tries to play sound, whether through ALSA or the OSS backwards compatibility.
The result of this nonsense is that crap like pulseaudio continues to exist, which is CPU hungry, often skips, fails to work with some programs and crashes frequently (what the hell is up with that?).
Is there any document out there which explains why /dev/dsp doesn't get mixing with ALSA? And why nobody tried to patch that yet?
Audio sounds better on Windows (Score:1, Interesting)
The audio quality is disappointing on Linux. I don't know if it's the decoding or the playback, but audio sounds much better on Windows.
Re:Main blocker (Score:3, Interesting)
Not to mention the state of media players on Linux...
Going a little bit off-topic first: a cousin of mine had Ubuntu on his laptop (featuring a Geforce 9300M G) and couldn't get rid of image tearing in VLC. Who would be the culprit in this case? The video drivers or the media player? I have kept wondering since then and my enquiring mind would like to know.
At any rate, could you please elaborate? What makes media players bad under Linux?
Re:it's all relative (Score:3, Interesting)
Alsa to OSS (Score:3, Interesting)
Over the years I had a lot of problems with ALSA, the biggest being the lack of sound mixing with the sound card on my motherboard. To get around it, I went out and bought a different sound card that supported hardware mixing. I still had problems where ALSA would just break periodically and require restarting it. Then at one point it just plain broke and nothing would fix it.
I had enough and installed OSS. What a difference. Latency is better and it just works. There is no excuse for not providing consistent audio mixing. I should have switched to OSS in the beginning rather than buy an expensive sound card because ALSA couldn't do software mixing.
A sound API should provide sufficient abstraction so that basic operations do not depend on the underlying hardware. Mixing, sample rate conversion (when needed) and per-application volume settings fall under basic operation as far as I'm concerned.
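To make "mixing plus per-application volume" concrete, here is a minimal sketch of what a software mixer does at its core. It assumes mono, signed 16-bit PCM streams that already share one sample rate; the name mix_streams and the sample values are made up for illustration and are not taken from OSS, ALSA or PulseAudio.

```python
from array import array

INT16_MIN, INT16_MAX = -32768, 32767


def mix_streams(streams, volumes=None):
    """Mix mono 16-bit PCM streams (same sample rate) with a gain per stream."""
    if volumes is None:
        volumes = [1.0] * len(streams)
    # The mix is as long as the longest input; shorter streams just end early.
    length = max((len(s) for s in streams), default=0)
    out = array("h", [0] * length)
    for i in range(length):
        total = 0.0
        for samples, gain in zip(streams, volumes):
            if i < len(samples):
                total += samples[i] * gain
        # Saturate instead of wrapping around, which would sound like a loud click.
        out[i] = max(INT16_MIN, min(INT16_MAX, int(round(total))))
    return out


# Hypothetical data: a music stream at half volume plus a VoIP stream at full volume.
music = array("h", [1000, 2000, 30000, -30000])
voip = array("h", [500, -500, 10000, -10000, 42])
mixed = mix_streams([music, voip], volumes=[0.5, 1.0])
print(list(mixed))
```

A real mixer would do this in fixed-size blocks, usually in C and often in a wider intermediate format, but the sum, scale and saturate steps are the whole idea.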
multiple sound cards and braindead applications (Score:3, Interesting)
Re:The fundamental problem (Score:1, Interesting)
Moreover, I don’t think many programmers get excited about managing audio buffers and performing sample rate and format conversion, so they would still need a userspace library to do at least those jobs (the kernel can’t even do floating point!). So here comes PulseAudio, which gives the developers a far greater freedom than any kernel-based implementation could ever do. How would you deal in-kernel with features like sound over bluetooth, user-provided codecs, sound over the network, or sound redirection for whatever reason you could ever think of?
Re:The fundamental problem (Score:5, Interesting)
I agree with the specification point, and agree that the lowest level API should be as basic (and standard) as possible. Then, once you have that, you can layer whatever higher-level architecture you like on top, as the low-level drivers are "just there" and will "just work".
However, this doesn't necessarily help applications. I would argue that to help app writers, you need to standardize the glue between layers, such that sound and commands can be passed from one layer to another in a predictable manner. Innovators can always add new commands that are parsed by their own injectable layer.
I would also argue that it's impossible to chain userland software a-la JACK via the kernel efficiently, as you've a double context switch per element in the chain. Since transforms are CPU intensive, you want to do the fewest composite transforms possible, which means a software mixer should be something you can chain, which means that the heavy-lifting mixer needs to be in userspace.
(Either that, or you're going to need LADSPA and LV2 support in the kernel, plus some way of coaxing "smart" sound cards into supporting such effects. Since the kernel developers would force the first coder who tried to submit such a patch to walk the plank, I don't see it as likely.)
This would leave the low-level mixer for mixing between kernel threads (rather than between applications per-se) and normalizing the inputs. If we're not having to normalize values anywhere else in the process, we should end up with improved quality and less latency. (Anything that mucks with precision hurts quality, and any operation at all hurts latency.)
Are you kidding? (Score:5, Interesting)
App -> libao -> OSS API -> OSS Back-end - Good sound, low latency.
App -> libao -> OSS API -> ALSA Back-end - Good sound, minor latency.
App -> libao -> ALSA API -> OSS Back-end - Good sound, low latency.
App -> libao -> ALSA API -> ALSA Back-end - Bad sound, horrible latency.
App -> SDL -> OSS API -> OSS Back-end - Good sound, really low latency.
App -> SDL -> OSS API -> ALSA Back-end - Good sound, minor latency.
App -> SDL -> ALSA API -> OSS Back-end - Good sound, low latency.
App -> SDL -> ALSA API -> ALSA Back-end - Good sound, minor latency.
App -> OpenAL -> OSS API -> OSS Back-end - Great sound, really low latency.
App -> OpenAL -> OSS API -> ALSA Back-end - Adequate sound, bad latency.
App -> OpenAL -> ALSA API -> OSS Back-end - Bad sound, bad latency.
App -> OpenAL -> ALSA API -> ALSA Back-end - Adequate sound, bad latency.
App -> OSS API -> OSS Back-end - Great sound, really low latency.
App -> OSS API -> ALSA Back-end - Good sound, minor latency.
App -> ALSA API -> OSS Back-end - Great sound, low latency.
App -> ALSA API -> ALSA Back-end - Good sound, bad latency.
Do you by any chance buy Monster cables, and a wooden volume knob, because it "sounds better"?
I'm sorry, but without proper ABX tests, I do not believe a single word of this table.
And about the latency: please enlighten us, how did you actually measure it?
Re:What are we trying to achieve? (Score:3, Interesting)
I think the main difference is that they broke it up into a client/server design. The advantage of that being the freedom to use whatever UI you like.
In theory that sounds like a good idea, but I've never tried it since the development was going pretty slowly.
Tearing (Score:3, Interesting)
Try using the OpenGL output driver, and make sure 'wait for vertical blank' (vsync) or a similarly-worded option is enabled.
Re:By saying that he proves his former point (Score:5, Interesting)
Which is great, but it's not so great if you are trying to produce audio.
When I plug my guitar in, I can notice a latency greater than 5ms. And greater than 25ms, it drives me insane.
Compare that to what I get with PulseAudio (usually): 100-150ms. No thank you.
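For a rough sense of where numbers like these come from: the delay added by a single audio buffer is just its length in frames divided by the sample rate. The sketch below assumes a 48 kHz device and counts only one buffer; real round-trip latency also includes extra periods, converter delay and any sound server in the path, so treat it as a lower bound. The function names are illustrative.

```python
def buffer_latency_ms(frames, sample_rate):
    """Delay contributed by one buffer of `frames` frames, in milliseconds."""
    return 1000.0 * frames / sample_rate


def max_frames_for_latency(target_ms, sample_rate):
    """Largest single-buffer size (in frames) that stays within target_ms."""
    return int(target_ms * sample_rate / 1000)


RATE = 48000  # assumed device sample rate

for frames in (64, 256, 1024, 4096):
    print(f"{frames:5d} frames -> {buffer_latency_ms(frames, RATE):6.2f} ms")

# Buffer budgets for the thresholds mentioned in the comment above.
for target in (5, 25, 100):
    print(f"{target:3d} ms budget -> {max_frames_for_latency(target, RATE)} frames")
```

Under these assumptions a 5 ms budget leaves room for only 240 frames at 48 kHz, which is why large default buffers are a problem for live playing.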
Re:He makes one excellent and crucial point (Score:1, Interesting)
dmix doesn't work for OSS emulation, it's a different path, so if you have an app that uses OSS and want software mixing with the ALSA apps you're screwed. I've got plenty of those it seems including commercial games.
It is still a mess. Setting up ALSA is not awful but not great. Why should I have to know to edit /etc/modprobe.d/alsa-base and explicitly add this:
options snd-emu10k1 index=0
options snd-hda-intel index=1,2
in order to get my Audigy2 to take /dev/dsp and not the onboard Intel HDA or the onboard HDMI output? Don't know if that second line is right, good luck finding documentation on it but it works. Oh I know I can make another card the 'default' one for ALSA but that doesn't help the OSS part. Jeez...
OSS emulation isn't just for *old* apps, it's for new ones too where the developer takes one look at the ALSA API, barfs, trudges through ALSA wikis and sparse documentation, and decides screw this, I'll use the OSS API. Much cleaner.
The new OSS is cool, though mostly a one man show, but its new mixer API is a major headache. Kind of sympathise since the new hardware is also a headache.
So what do I do? Give up on getting ALSA and OSS to play nice with the onboard HDA and scour amazon marketplace for an old Audigy2 with a well supported emu10k1/2 that has hardware mixing. Ahhh sanity at last. It's damn silly really.
Re:Main blocker (Score:3, Interesting)
Re:Main blocker (Score:4, Interesting)
Pretty sure VLC doesn't do hardware acceleration on any platform period. Nvidia supports VDPAU in linux which allows you to play HD flawlessly with practically any card as long as the video player supports it (and a number do, mplayer and XBMC are two that come to mind off the top of my head).
See: http://www.phoronix.com/scan.php?page=article&item=nvidia_vdpau_gpu&num=1 [phoronix.com]
Re:A sure road to success ..... (Score:2, Interesting)
For so many ALSA just works when it is installed, not for me.
For so many OSS just works when it is installed, not for me.
For a few, PulseAudio just works when it is installed, not for me.
The one thing I will fault ALSA and OSS for, is not allowing multiple audio streams to play simultaneously without crashing the system; in all circumstances. The support forums are littered with issues like this. It definitely does NOT work for everyone. At least handle one and play one, just choose one, but to crash and not play anything, that just sucks. And to force a reboot before working again, well that is a FAIL! This is not from my personal experience, as I have posted, I could not get any of them to work for my scenario, but based on days and weeks of searching through forum posts looking for solutions, I know I am far from alone.
I read support requests for all three: ALSA, OSS and PulseAudio. So to date, outside of BSD (which I am currently NOT running, but may in the future) Linux + SOUND is a very REAL ISSUE! (see my solution at the end of this post, there is one solid solution, guaranteed to work)
For any one of these to be the best solution, they must handle additional audio streams, from any source, without crashing the system. For instance, when my VoIP phone rings, and I answer the phone, the Audio Radio stream, CD playing, or Video should pause until I restart it and let me answer the phone and hear the person talking to me.
Ideally if I want to listen to the music and watch a video at the same time, I should be able to mix in the sound levels and do that. The solution should have a way to handle it. Heck I should be able to hear the radio, video and VoIP phone all at the same time if I wanted this. I should be able to mix the sound levels and it should work.
Back in the mid 90s I was using a midi keyboard to play a sound track, save it. Then use that same keyboard to play another sound track and save it. I could even convert what I played on the keyboard and make it seem like it was a different musical instrument. An oboe, a flute, a trumpet, etc. The software (Audio Visual Communications 1.3 running on OS/2 1.2, when the marketeers would have you think only a Macintosh could do this; I was doing it on both IBM PCs and Macs.) would then let me play back all the sound tracks together. I could literally create my own symphony.
Based on what I read, this was one of the major things PulseAudio was going for that was NOT available in either OSS or ALSA. The fact that OSS had gone proprietary was not helpful either. I think they have both a proprietary and open source OSS solution today, but am not sure.
And this was over a decade and a half ago. So Linux should have this today. Perhaps an API solution would allow for this, but first, just to handle multiple audio streams in an intelligent way without crashing the system would be HUGE!
For anyone reading this that wants to avoid these types of issues, there is a solution. If your PC was installed with Linux out of the box, with everything you need: WiFi, 10/100/1000 Ethernet, Sound (audio), Video, Burn CDs, Burn DVDs, plug n play USB support, Ext Monitor support if a netbook or laptop, You should be okay!
Stop going to any vendor and buying a PC with any other operating system installed on it. Only buy hardware with Linux pre-installed and you avoid a lot of issues. Avoid vendor LOCK IN [slashdot.org].
Re:Problem with PulseAudio? (Score:5, Interesting)
I agree. I think it has more to do with some kernel developers who refuse to consider OSS after OSS3.
The OSS kernel interface is simple and the audio mixing is performed in the kernel (if needed) where it should be. All an app needs to do is open /dev/dsp and perform a few ioctl calls and they're ready to go. They don't need to care whether some other application is also playing audio or not.
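As a sketch of what "open /dev/dsp and perform a few ioctl calls" looks like in practice, the snippet below sets the sample format, channel count and rate, then writes a test tone. It is written in Python for brevity; a real application would more likely make the same calls from C. Assumptions: the ioctl request numbers are rebuilt by hand using the common Linux encoding (as on x86 and ARM; a few architectures and FreeBSD encode them differently), and /dev/dsp exists and is writable. The helper names (_iowr, sine_pcm, play_tone) are invented for this example.

```python
import fcntl
import math
import os
import struct


def _iowr(group, number, size):
    """Build a read/write ioctl request number (assumed common Linux layout)."""
    ioc_read_write = 3
    return (ioc_read_write << 30) | (size << 16) | (ord(group) << 8) | number


# OSS audio ioctls live in group 'P' and take a 4-byte int argument.
SNDCTL_DSP_SPEED = _iowr("P", 2, 4)
SNDCTL_DSP_SETFMT = _iowr("P", 5, 4)
SNDCTL_DSP_CHANNELS = _iowr("P", 6, 4)
AFMT_S16_LE = 0x10  # signed 16-bit little-endian samples


def sine_pcm(freq_hz, seconds, rate):
    """Return a mono sine tone as little-endian 16-bit PCM bytes."""
    frames = int(seconds * rate)
    samples = (int(32767 * 0.3 * math.sin(2 * math.pi * freq_hz * n / rate))
               for n in range(frames))
    return struct.pack("<%dh" % frames, *samples)


def _set(fd, request, value):
    """Issue an int-valued ioctl and return what the driver actually chose."""
    reply = fcntl.ioctl(fd, request, struct.pack("i", value))
    return struct.unpack("i", reply)[0]


def play_tone(device="/dev/dsp", rate=44100):
    """Play one second of 440 Hz on an OSS device. Not called automatically."""
    fd = os.open(device, os.O_WRONLY)
    try:
        _set(fd, SNDCTL_DSP_SETFMT, AFMT_S16_LE)
        _set(fd, SNDCTL_DSP_CHANNELS, 1)
        actual_rate = _set(fd, SNDCTL_DSP_SPEED, rate)
        view = memoryview(sine_pcm(440, 1.0, actual_rate))
        while len(view) > 0:  # write() may accept only part of the buffer
            written = os.write(fd, view)
            view = view[written:]
    finally:
        os.close(fd)
```

Calling play_tone() on a machine with a working OSS device (or OSS emulation) should produce a one-second tone. Note that nothing in the code knows or cares whether another process is also playing; with in-kernel mixing that is the driver's problem.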
It's much cleaner than ALSA, which is a mess IMO. I've had a lot of problems with ALSA until I finally dumped it for OSS4 which solved the constant clicking, stuttering and lack of audio mixing. ALSA would often need to be restarted and it finally got to the point after a kernel upgrade where ALSA just plain refused to work at all.
With OSS I can basically choose the format of the audio, the sample rate and the volume and just set it and go. If the hardware doesn't support multi-stream mixing and volume then OSS does it in software. Similarly, if the hardware doesn't support the sample rate (i.e. 44100) then OSS will resample it to match the hardware, thus abstracting the hardware from the software, which is the way it should be.
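The resampling step can be illustrated with a deliberately naive sketch: linear interpolation between neighbouring samples. This is not what OSS itself does; production resamplers use proper low-pass filtering to avoid aliasing. The function name resample_linear is hypothetical.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample mono PCM by linear interpolation (naive, no anti-alias filter)."""
    if not samples or src_rate == dst_rate:
        return list(samples)
    out_len = len(samples) * dst_rate // src_rate
    out = []
    for i in range(out_len):
        # Position of output sample i on the input timeline.
        pos = i * src_rate / dst_rate
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # hold the last sample at the end
        frac = pos - left
        out.append(round(samples[left] * (1 - frac) + samples[right] * frac))
    return out


# Doubling the rate inserts a midpoint between each pair of input samples.
print(resample_linear([0, 100, 200, 300], 1, 2))

# 10 ms of 44.1 kHz audio (441 frames) becomes 480 frames at 48 kHz.
print(len(resample_linear([0] * 441, 44100, 48000)))
```

The point is that the application keeps writing 44100 Hz data and never has to know what rate the hardware really runs at.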
Re:Main blocker (Score:1, Interesting)
Since you're the only one who asked politely, I'll share with you. Nothing necessarily makes media players bad under Linux, but similarly, nothing guarantees one will be good either. The reason I don't believe any Linux media players are good is simply that nobody has written one yet. While Amarok looks promising, that's all it looks, but next to a commercial offering like iTunes it's nothing. It's not that I particularly like iTunes, it's just that:
short version: bugs bugs bugs bugs bugs
All this ignoring the unacceptable quality of some sound drivers, the nForce4 AC97 being my current example, but most cards that I've used in the last ten years suffering OSS->esd->aRTS->ALSA and back again have exhibited artifacts, often DC bias in the output. I guess it's probably because a lot have been incompletely reverse engineered.
Believe me, I've never been a fan of iTunes either, but on the other hand it's given me no reason to hate it either. It does what's promised nice and smoothly, and stays out of my way. Ultimately, here's a hypothetical example of a key difference I believe to be between a commercial closed source player like iTunes and an OSS one like Amarok.
I accept that if you run Linux, Amarok is as good as you're going to get. Until a month ago, it was the best *I* had for about three years. (Before that was XMMS, but it became stable so Gentoo removed it). I hated every minute of it, and now that I've got the choice I'll never use it again.
Re:Main blocker (Score:3, Interesting)
I do. Actually, after I post this I'll be using it to produce a jazz album that I also recorded on a Linux machine (Fedora, ccrma). I think Jack is great to hook up different audio applications and I think the resulting production process is a real step forward from existing digital mixing/mastering processes. Not perfect, sure, but I'm not certain it can be duplicated on Macs and Windows boxen.
I've produced three albums so far under linux and the software has come forwards significantly since I started playing around with it in 2003.
In recording mode - where it matters most - I have a machine that is stable because if there are any problems during the recording the musicians are not likely to be understanding. Usually the machine is set to record over a day or two with 16 channels of input and very little interference. The underlying features that a Linux box offers like LVM's, fast file systems like reiserfs, tunable kernels are a bit of a hassle to set up at first but the result is an exceptionally stable system.
There are shortcomings but I just develop new habits to overcome them. With the money I saved on a mac and protools I have bought some great recording equipment. I plan to start donating to the Ardour and jack projects because that is what they need to improve and make them progress a lot faster.
Without the Alsa project as a foundation I don't think any of the sound projects happening now under linux would have been possible.
Re:So, when do we go ALSA - OSSv4? (Score:3, Interesting)
Of course, developers will have to support ALSA for a long time (dropping ALSA altogether would break nearly ALL the current linux applications, not just flash player) so the support burden for distributions maintainers would become even heavier.
All of this - because ALSA does not match the pipe dream about sound systems of TFA writer. In the end, the features offered to the end user by an OSSv4 stack would be fewer than those provided by a working ALSA + PulseAudio stack, as even the writer himself states (about hibernation support).
Not to mention the fact that nowadays many applications will make use of high level libraries that hide the details of the sound system from them, so they couldn’t care less about ALSA or OSS.
So no, thank you! Please report bugs, do complain as loud as you can, but yet another fork is the last thing we need now.
Re:What are we trying to achieve? (Score:3, Interesting)
OSS became commercial (non-Free) and so new versions couldn't be imported into the Linux or FreeBSD kernels. It also lacked in-kernel software mixing, so if your sound card didn't support multiple channels you could only have one device playing sound at once. At this point, the two camps went in different directions.
The FreeBSD team kept adding features to the open source version of OSS. They followed the 4Front APIs, and included support for mixing. They maintained backwards compatibility with all of the existing software, and exposed newer features to new software via new ioctls. FreeBSD now supports most of the OSS4 APIs with their own code.
The Linux team decided that OSS was now evil and proprietary, so they deprecated the OSS3 APIs in favour of the new ALSA APIs. ALSA fixed the problem of sound mixing but, unfortunately, did it in userspace. I say unfortunately, because the OSS compatibility APIs in ALSA are implemented in the kernel. This means that you can't have two 'legacy' (read: portable) OSS applications playing sound at the same time.
The moral of this story is that throwing away a working code base and starting again rarely produces better results than incremental improvements. Oddly enough, in spite of this you still get a lot of people claiming that Linux is ready for the desktop, while FreeBSD isn't.
Re:What are we trying to achieve? (Score:2, Interesting)
So again, what was Linux hoping to achieve by dropping old "obsolete" OSS in favor of increasingly complex solutions?
Some people have it in their heads that the only things that should be in kernel space are things that absolutely have to be, and everything else should be in user space. Since mixing doesn't absolutely have to be in kernel space, they decided to do it in user space. ...but, from user space, you can't receive device file ioctls, and so the userspace portion is alsalib, which is C, which means that if you want to use ALSA, you have to write in C. Sure, you could link to the libraries in any compiled language, but you'll still need those header files, and they're only in C.
Moronic decisions such as this piss me off. The purpose of a monolithic kernel is to provide services for applications so that they don't have to run on bare hardware, instead they run on an idealized system, regardless of the features of the actual hardware. For example, we don't say that you need ten CPUs to run ten processes at once, so why should my sound card have to have ten hardware mixers to play ten audio streams at once? The kernel multitasks the CPU, but why not the sound card?
Doing the bare minimum in each process is a micro kernel design. Personally, I'd prefer a micro kernel, but Linux just isn't designed that way, and trying to do both at once just gives you the worst of both designs.
Another area that really pisses me off is video drivers. X11 should not be the video driver, it should be an ordinary application that implements the X11 protocol via the kernel's video API. It's very slowly moving in that direction, but it's far from there yet. Things like "/dev/fb0" are bare-minimum solutions, for example, they don't implement console switching. Writing to the device writes directly to the screen no matter which console you switch to.
It's really dumb. The kernel has full and complete drivers for network cards, USB devices, hard drives, and everything else except audio and video where it does the bare minimum, creating situations where, for example, typing "killall -9 X" leaves your system completely fucked, whereas what should happen is that X11 dies, and the kernel kicks the console back to the text mode it was in before X11 asked for a graphics mode.