Quake Linux

Open Source ARM Mali Driver Runs Q3A Faster Than the Proprietary Driver

An anonymous reader writes "The lima driver project, the open source reverse engineered graphics driver for the ARM Mali, now has Quake 3 Arena timedemo running 2% faster than the ARM Binary driver." There's a video showing it off. Naturally, a few caveats apply; the major one is that they don't have a Free shader compiler and are forced to rely on the proprietary one from ARM, for now.

  • by TejWC ( 758299 ) on Wednesday February 06, 2013 @10:24AM (#42808625)

    Based on the article, it seems like they first ported Q3A from OpenGL ES1 to OpenGL ES2, and then they used the closed source shader compiler to do most of the work (OpenGL ES2 forces most of the code to be in the form of shaders). It seems like they really didn't make much of an actual driver and just off-loaded most of the work to the shaders (I could be wrong though).

    • That sounds like a feature to me, so long as all the pieces are there. I'd sure love a completely-open-to-the-microcode platform, but what I need is something sufficiently open for there to continue to be drivers.

    • Re: (Score:3, Informative)

      by Anonymous Coward

      Hey there, I'm Connor Abbott, the lima compiler guy. No, porting from GLES1 to GLES2 was not necessary, it was just to debug a performance issue. While it is true that the demo uses the binary compiler, we *do* have the knowledge to write our own shaders - it's just the compiler that's lacking, and maybe my laziness :). For fragment shaders, we could pretty easily write our own shaders in assembly, it just hasn't been done yet (when I get around to it ;) ). For vertex shaders, we can't write anything in ass

  • by Anonymous Coward

    AIUI the FOSS codebase is based on reverse-engineering the binary driver. So, there would be almost no reason to expect it to be faster. There may be some CPU time saved if they can create the command buffer quicker than the binary driver manages, but it's highly unlikely they can create a general solution that makes the GPU time reduce, since they're going to have to send the same commands to the hardware anyway. A better shader compiler might achieve something but ... they don't have that.

    Ergo, 2% is a

    • Quite often binary drivers are written by people who either ported the code from other operating systems, or must maintain the code in such a way as to be able to share the code base with operating systems having different driver models. A pure free driver can lose a lot of cruft and can often have things like memory management better tuned for the system or interact with the hardware in more efficient ways.

      The NVIDIA Ethernet driver from a few years back was a good example of that. The Linux people created a free driver that ran a lot faster than the binary driver, forcing NVIDIA to abandon their driver.

    • by Khyber ( 864651 )

      "There may be some CPU time saved if they can create the command buffer quicker than the binary driver manages, but it's highly unlikely they can create a general solution that makes the GPU time reduce, since they're going to have to send the same commands to the hardware anyway"

      Or, they don't have to send the same commands, and have implemented a wrapper that actually works more efficiently than the native graphics code.

    • Re: (Score:3, Informative)

      by Anonymous Coward

      The numbers are in the blog post, which you haven't bothered to look at.

      This is an ARM Cortex A8, running at 1GHz, with a Mali-400MP1 at 320MHz, and with 1GB DDR3 at 360MHz. Timedemo is fully consistent, every time. 46.2 for the binary opengles1 driver, 47.2 for the open source driver.
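For anyone who wants to check the headline figure against the two timedemo numbers quoted above, here is a minimal sketch of the arithmetic. The function and variable names are made up for illustration; only the 46.2 and 47.2 values come from the comment.

```python
def percent_speedup(baseline_fps, new_fps):
    """Relative change of new_fps over baseline_fps, in percent."""
    return (new_fps - baseline_fps) / baseline_fps * 100.0

# Timedemo results as quoted in the comment above (frames per second).
binary_fps = 46.2  # ARM binary driver
lima_fps = 47.2    # open source lima driver

speedup = percent_speedup(binary_fps, lima_fps)
print(f"{speedup:.2f}%")  # prints "2.16%"
```

That is roughly the 2% in the summary. It says nothing about run-to-run variance, which the comment separately reports as negligible.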

      We are getting close to a shader compiler of our own, yesterday we had our first stab at compiling the few shaders needed for q3a, it failed though, but we are creeping closer on this insane and massive task of reverse enginee

      • Re: (Score:2, Informative)

        by Anonymous Coward

        Furthermore, this blog extensively explains how well the hardware behaves and how this 2% is mostly due to the fact that the prototype driver has less checking to do than a proper driver. No special tricks were used, especially none which are Q3A specific, this is how fast the hardware is, and we succeeded in using it just as efficiently as the binary driver, which is unbelievably significant for a reverse engineered graphics driver.

  • by zAPPzAPP ( 1207370 ) on Wednesday February 06, 2013 @10:31AM (#42808693)

    if (Quake3) show_fps += 30;

    • Re: (Score:2, Redundant)

      by CastrTroy ( 595695 )
      Without much analysis done on the actual output, this is a very relevant statement. It's happened in the past that certain drivers have claimed better performance while at the same time completely ignoring certain things they were supposed to be doing in order to get the framerate up. Do the frames end up looking exactly the same with both drivers? What exactly is making it faster? Did they improve a specific part which only helps for Q3A demo files and doesn't actually make any difference when playing a
      • by serviscope_minor ( 664417 ) on Wednesday February 06, 2013 @10:43AM (#42808847) Journal

        It's happened in the past that certain drivers have claimed better performance while at the same time completely ignoring certain things they were supposed to be doing in order to get the framerate up. Do the frames end up looking exactly the same with both drivers? What exactly is making it faster? Did they improve a specific part which only helps for Q3A demo files and doesn't actually make any difference when playing a real game?

        All interesting questions. If only there was a long block of text which covered those points. I've never heard of such a thing, though. But I'm going to coin a new term, "TFA", to refer to the hypothetical object.

        Anyone with me on this?

        • by yabos ( 719499 )
          Quiet, you
        • by mikael ( 484 )

          Sometimes the driver uses special optimized paths depending on the name of the executable. That was known in the past, so they could optimize for benchmarks and games. Even certain configurations of GL function calls were faster than others, e.g. glDrawArrays.

          http://www.spec.org/gwpg/gpc.static/Rulesv16.html

  • Either make the original code open source for the benefit of all, or hire the open source team before someone else does, because they obviously rock.
  • While it's quite nice to have a Quake III bench, and be on a mobile platform that in fact means some great fun could be had amongst friends, it's an old bench, and an old game.
    It used to be something Amiga people benched against in later years to try to imply relevance.

    Having capable GPUs in mobile stuff (Hi Intel Atom based netbooks!) is a great idea. All for it, and you have to love the low cost of the platforms making it available to more people.

    • While it's quite nice to have a Quake III bench, and be on a mobile platform that in fact means some great fun could be had amongst friends, it's an old bench, and an old game.

      and yet it is the best looking usable game on tablets/phones right now :)

  • 2 whole percent? (Score:5, Interesting)

    by abigsmurf ( 919188 ) on Wednesday February 06, 2013 @10:39AM (#42808777)
    So it's a value that's well within random fluctuation levels? Meanwhile, how's the reliability, memory usage, compatibility, performance outside of that single game?
    • by Bigby ( 659157 )

      It isn't within random fluctuation levels. I would assume the tests were run with a large enough sample size to make random fluctuations statistically insignificant. Just that 2% is not a significant change for gaming. If we were in the world of high frequency trading, 2% would be worth billions.

    • So it's a value that's well within random fluctuation levels?

      Now compare it to the performance before this update, and get back to us on whether it's news, at least to people who care about this chip.

      • by Anonymous Coward

        The news is that there _is_ an open source, reverse engineered, driver which is matching the binary driver in performance. Matching as there really is little more to gain from this hardware without hacking Q3A itself. This is as fast as the hardware is, and we actually manage to use it just as well as the binary driver, without any Q3A specific hacks.

        --libv

  • Texture switching (Score:4, Informative)

    by Hsien-Ko ( 1090623 ) on Wednesday February 06, 2013 @10:44AM (#42808873)
    Quake III Arena has a ton of it. Not even its models are well paged, like the rocket, which alone uses around 4 different textures. The only things atlased are console text, menu text and lightmaps, so it's not a very efficient data set for OpenGL ES to begin with.
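To make the cost of texture switching concrete, here is a toy sketch that counts how often the bound texture changes across a sequence of draw calls. It is not how Q3A or any driver does it: the function name and texture names are hypothetical, and the 4-texture rocket is just the rough figure from the comment above.

```python
def count_texture_binds(draw_calls):
    """Count texture binds for a sequence of draw calls.

    Each entry names the texture a draw call needs; a bind is
    counted whenever it differs from the currently bound texture.
    """
    binds = 0
    current = None
    for texture in draw_calls:
        if texture != current:
            binds += 1
            current = texture
    return binds

# Hypothetical rocket model: four surfaces, each with its own texture.
rocket_separate = ["body", "fin", "nozzle", "glow"]
# Same four surfaces, all sampling from one (hypothetical) atlas texture.
rocket_atlased = ["rocket_atlas"] * 4

print(count_texture_binds(rocket_separate))  # prints 4
print(count_texture_binds(rocket_atlased))   # prints 1
```

With an atlas, the four surfaces could in principle go out with a single bind, which is the saving atlasing is after.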
    • by Khyber ( 864651 )

      >mfw models and textures shouldn't be shit on a more modern system like an ARM core.

    • This is probably because Q3 had to work on Voodoo cards, which had very limited texture sizes (256x256 I think? something dumb...) and so you couldn't atlas textures to the extent that you do on today's GPUs.

      • It wasn't due to texture size limits or the Voodoo. One 256x128 could fit everything an ammo box or rocket needed.

        Q3Test was worse in this regard, the Railgun model used around 12 textures
  • by Anonymous Coward

    Your work is appreciated!

    Ignore all the idiots who hate their lives that lurk around /. criticizing every accomplishment of others. /. is starting to suck. Your work though is great!
