Follow Slashdot blog updates by subscribing to our blog RSS feed

Linux x32 ABI Not Catching Wind 262

Posted by Soulskill on Tuesday December 24, 2013 @07:20PM from the try-a-bigger-sail dept.

jones_supa writes "The x32 ABI for Linux allows the OS to take full advantage of an x86-64 CPU while using 32-bit pointers and thus avoiding the overhead of 64-bit pointers. Though the x32 ABI limits the program to a virtual address space of 4GB, it also decreases the memory footprint of the program and in some cases can allow it to run faster. The ABI has been talked about since 2011 and there's been mainline support since 2012. x32 support within other programs has also trickled in. Despite this, there still seems to be no widespread interest. x32 support landed in Ubuntu 13.04, but no software packages were released. In 2012 we also saw some x32 support out of Gentoo and some Debian x32 packages. Besides the kernel support, we also saw last year the support for the x32 Linux ABI land in Glibc 2.16 and GDB 7.5. The only Linux x32 ABI news Phoronix had to report on in 2013 was of Google wanting mainline LLVM x32 support and other LLVM project x32 patches. The GCC 4.8.0 release this year also improved the situation for x32. Some people don't see the ABI as being worthwhile when it still requires 64-bit processors and the performance benefits aren't very convincing for all workloads to make maintaining an extra ABI worthwhile. Would you find the x32 ABI useful?"

This discussion has been archived. No new comments can be posted.

Linux x32 ABI Not Catching Wind

Load All Comments

Search 262 Comments Log In/Create an Account

Comments Filter:

no (Score:4, Insightful)

by Anonymous Coward writes: on Tuesday December 24, 2013 @07:24PM (#45778949)

no

Share
twitter facebook
- Re: (Score:2)
  
  by rudy_wayne ( 414635 ) writes:
  
  Catching Wind
  LOL
- Re:no (Score:5, Insightful)
  
  by mlts ( 1038732 ) writes: on Tuesday December 24, 2013 @09:27PM (#45779627)
  
  For general computing, iffish.
  For embedded computing where I am worried about every chunk of space, and I can deal with the 3-4 GB RAM limit, definitely.
  This is useful, and IMHO, should be considered the mainstream kernel, but it isn't something everyone would use daily.
  
  Parent Share
  twitter facebook
  - Re:no (Score:4, Informative)
    
    by Just Brew It! ( 636086 ) writes: on Tuesday December 24, 2013 @10:14PM (#45779849)
    
    For most embedded applications you're probably better off just running a 32-bit OS and calling it a day. Embedded is mostly on 32-bit ARM processors anyway.
    
    Parent Share
    twitter facebook
- Re:no (Score:5, Insightful)
  
  by GPLHost-Thomas ( 1330431 ) writes: on Wednesday December 25, 2013 @06:16AM (#45781123)
  
  Well, I do find it extremely useful. Especially in Debian & Ubuntu, we have multi-arch support. For some specific workload using interpreted languages, it just reduces the memory footprint by a half. For example, PHP and Perl. If you once ran Amavis and spamassassin, you certainly know what I mean: it takes double the amount of RAM on 64 bits. Since most of our servers are running PHP, Amavis and Spamassassin, this would be a huge benefits (from 800 MB to 400 MB as the minimum server footprint), while still being able to run the rest of the workloads using 64 bits: for example, Apache itself and MySQL, which aren't taking much RAM anyway compared to these anti-spam dogs.
  
  Parent Share
  twitter facebook
  - - Re: (Score:3)
      
      by Bengie ( 1121981 ) writes:
      
      You need to run in 64bit mode if you want to take advantage of many cache eviction reducing IPC increasing instructions. If you want to gain this benefit while keeping your pointer size to a minimum, then you need the x32 mode. aka, 64bit mode with truncated pointers. You can probably gain 10%-15% performance with few changes over true 32bit mode. A lot of that is hidden when using 64bit pointers because of the reducing data density for some work loads.
      
      x32 mode is great for anything that can take advantage
Subject (Score:2, Insightful)

by Daimanta ( 1140543 ) writes:

With memory being dirt cheap I ask: Who cares?
- Re:Subject (Score:4, Insightful)
  
  by mellon ( 7048 ) writes: on Tuesday December 24, 2013 @07:48PM (#45779103) Homepage
  
  Memory? What about cache? Is cache dirt cheap?
  
  Parent Share
  twitter facebook
  - Re: (Score:3)
    
    by Bengie ( 1121981 ) writes:
    
    Yes and no. The larger your cache, the higher its latency. Can't get around this. L1 caches tend to be small to keep the execution units fed with typically 1 or 2 cycle latencies. L2 caches tend to be about 16x larger, but have about 10x the latency.
    
    L2 cache may have high latency, but it still has decent bandwidth. To help hide the latency, modern CPUs have automatic pre-fetching and also async low-priority pre-fetching instructions that allow programmer to tell the CPU to attempt to load data from memor
  - - Re: (Score:2)
      
      by TheRealMindChild ( 743925 ) writes:
      
      Sort of. It will be in the form or l4 or even a next layer, l5 cache. While this is still faster than grabbing system memory, we are approaching the point where it isn't
    - Re:Subject (Score:5, Insightful)
      
      by mellon ( 7048 ) writes: on Tuesday December 24, 2013 @08:34PM (#45779327) Homepage
      
      In answer to my question, no, it is not dirt cheap. For any size cache you will get fewer cache misses if your data structures are smaller than if they are larger. Until the cache is so big that everything fits in it, you always win if you can double what you can cram into it.
      
      Parent Share
      twitter facebook
      - Re: (Score:2)
        
        by ultranova ( 717540 ) writes:
        
        Until the cache is so big that everything fits in it, you always win if you can double what you can cram into it.
        Which is all nice and good except this implies your data structure was mostly pointers to begin with, so if you want to increase cache efficiency forget about pointer size and redesign them for better locality.
        I suspect this is the real reason why this ABI has not caught wind: anyone who cares has already taken steps that render it pointless.
        
        Re:Subject (Score:5, Informative)
        
        by dmbasso ( 1052166 ) writes: on Tuesday December 24, 2013 @09:32PM (#45779651)
        
        Which is all nice and good except this implies your data structure was mostly pointers to begin with
        And that's exactly the case of scripting languages, where every structure (say, a Python object) is a collection of pointers to methods and data.
        
        Parent Share
        twitter facebook
        
        Re: (Score:3)
        
        by foobar bazbot ( 3352433 ) writes:
        
        You're running a Python script and you care about L1/L2 cache efficiency??
        Your system is probably context switching between hundreds of MB.
        Amongst.
        Your system is context-switching amongst hundreds of MB.
        Frohe Weihnachten from das Grammer-SS!
      - Re: (Score:2)
        
        by Austerity Empowers ( 669817 ) writes:
        
        Until the cache is so big that everything fits in it
        A day, that by virtue of this arguement, can never come!
- Re:Subject (Score:5, Interesting)
  
  by KiloByte ( 825081 ) writes: on Tuesday December 24, 2013 @07:49PM (#45779105)
  
  For some workloads, it's ~40% faster vs amd64, and for some, even more than that vs i386. For a typical case, though, it's typical to see ~7% speed and ~35% memory boost over amd64.
  As for memory being cheap, this might not matter on your home box where you use 2GB of 16GB you have installed, but vserver hosting tends to be memory-bound. And using bad old i386 means a severe speed loss due to ancient instructions and register shortage.
  
  Parent Share
  twitter facebook
  - Re: (Score:2)
    
    by Tim12s ( 209786 ) writes:
    
    That seams reasonable advantage. If it could take me from 60K tps to 100K tps per blade its a no-brainer. I doubt its going to allow office/home application to run any noticeably quicker but with a blade centre of 16 blades, I'll want to get my monies worth before needing to expand.
  - Re: (Score:2)
    
    by mwvdlee ( 775178 ) writes:
    
    ~35% memory boost is quite nice if you're running memory-bound multithreading processes; each thread being relatively light on CPU% but uses lots of memory.
    I run a webserver where one of the batch jobs is exactly that. ~35% memory boost would be very close to ~35% increase in throughput.
- Re:Subject (Score:4, Interesting)
  
  by Evan Teran ( 2911843 ) writes: on Tuesday December 24, 2013 @07:49PM (#45779107) Homepage
  
  It's not just about "having enough RAM". While that certainly is a factor, it's not the only one. As you suggest, pretty much everyone has enough RAM to run just about any normal application with 64-bit pointers.
  But if you want speed, you also have to pay attention to things like cache lines. 64-bit pointers often means larger instructions are needed to be encoded to do the same work, larger instructions means more cache misses. This can be a large difference in performance.
  
  Parent Share
  twitter facebook
  - - - Re:Subject (Score:5, Interesting)
        
        by TheRaven64 ( 641858 ) writes: on Wednesday December 25, 2013 @06:04AM (#45781101) Journal
        
        He's right. If you mix x32 and amd64 binaries on the same system, then you need two copies of every shared library that they use to be mapped at the same time. And this means that every context switch between them is going to be pulling things into the i-cache that would already be present (assuming a physically-mapped cache, which is a pretty safe assumption these days) because the other process is using them.
        This is why x32 doesn't make sense on a consumer platform like Ubuntu unless the entire system is compiled to use it, making the entire article a 'well, duh'. The real advantage of x32 is on custom deployments and embedded systems where you can build everything in x32 mode.
        Oh, and on the subject of caches, x86 chips typically have 64 byte cache lines. If you make pointers 4 bytes instead of 8, then you can fit twice as many in a cache line, which is usually nice. It can be a problem for multithreaded applications though, because you may now end up with more contention in the cache coherency protocol.
        
        Parent Share
        twitter facebook
- Re: (Score:2)
  
  by ProzacPatient ( 915544 ) writes:
  
  Desktop memory is cheap but ECC server memory can be very expensive
  - Re: (Score:2)
    
    by haruchai ( 17472 ) writes:
    
    Damn straight. Just spent $1000 for used 16 4GB sticks of HP DDR3 ECC registered memory; that's considered a bargain. New sticks would be $120 each.
    - Re:Subject (Score:5, Funny)
      
      by LordLimecat ( 1103839 ) writes: on Tuesday December 24, 2013 @08:50PM (#45779437)
      
      Of course its a tradeoff, because the new RAM will have less of its spare ECC bits used up.
      
      Parent Share
      twitter facebook
      - Re: (Score:2)
        
        by haruchai ( 17472 ) writes:
        
        "Spare ECC bits" - what??
    - - Re: (Score:2)
        
        by haruchai ( 17472 ) writes:
        
        These are RDIMMs, not UDIMMs.
        Besides it's the company's money and we can only buy from approved buyers or we don't get reimbursed.
  - Re: (Score:2, Insightful)
    
    by Anonymous Coward writes:
    
    ECC memory is artificially expensive. Were ECC standard as it ought to be, it would only cost about 12.5% more. (1 bit for every byte) That is a pittance when considering the cost of the machine and the value of one's data and time. It is disgusting that Intel uses this basic reliability feature to segment their products.
    - Re: (Score:3)
      
      by macpacheco ( 1764378 ) writes:
      
      That's right. Unfortunately it's called the market. The same boneheads that says x32 isn't worth it, are the same boneheads which have no idea how ECC is important, how hard it is to properly code everything worrying about cache hits is. Probably people that never wrote a single line of C or assembly code.
      But the Intel way of making the same physical hardware cost 50% more (with a simple on/off switch) will continue until ARM Cortex start giving intel some real competition (at least competing with the lates
- Re: (Score:2)
  
  by Mike Buddha ( 10734 ) writes:
  
  This is what Apple needs for it's silly 64 Bit mobile processors & OS.
- Re:Subject (Score:4, Informative)
  
  by Reliable Windmill ( 2932227 ) writes: on Tuesday December 24, 2013 @08:45PM (#45779391)
  
  You've not understood this correctly. x32 is an enhancement and optimization for executable files that do not require gigabytes of RAM, primarily regarding performance. It has nothing to do with the availability or lack of RAM in the system, or how much RAM costs to buy in the computer store.
  
  Parent Share
  twitter facebook
- Re: (Score:2)
  
  by Austerity Empowers ( 669817 ) writes:
  
  The other applications running on your system who also want to use memory and are programmed by people who don't care about resource utilization. Or the other VM, or the 8 other VMs.
  Processor power, memory and disk space should be considered like earth's natural resources. They're in limited supply and should never be wasted no matter how available they may seem at the present.
- Re:Subject (Score:5, Informative)
  
  by Forever Wondering ( 2506940 ) writes: on Wednesday December 25, 2013 @12:03AM (#45780287)
  
  With x32 you get:
  - You get 16 registers instead of 8. This allows much more efficient code to be generated because you don't have to dump/reload automatic variables to the stack because the register pressure is reduced.
  - You also get a crossover from the 64 bit ABI where the first 6 arguments are passed in registers instead of push/pop on the stack.
  - If you need a 64 bit arithmetic op (e.g. long long), compiler will gen a single 64 instruction (vs. using multiple 32 ops).
  - You also get the RIP relative addressing mode which works great when a lot of dynamic relocation of the program occurs (e.g. .so files).
  You get all these things [and more] if you port your program to 64 bit. But, porting to 64 bit requires that you go through the entire code base and find all the places where you said:
  int x = ptr1 - ptr2;
  instead of:
  long x = ptr1 - ptr2;
  Or, you put a long into a struct that gets sent across a socket. You'd need to convert those to int's
  Etc ...
  Granted, these should be cleaned up with abstract typedef's, but porting a large legacy 32 bit codebase to 64 bit may not be worth it [at least in the short term]. A port to x32 is pretty much just a recompile. You get [most of] the performance improvement for little hassle.
  It also solves the 2037 problem because time_t is now defined to be 64 bits, even in 32 bit mode. Likewise, in struct timeval, the tv_sec field is 64 bit
  
  Parent Share
  twitter facebook
  - Re:Subject (Score:4, Informative)
    
    by TheRaven64 ( 641858 ) writes: on Wednesday December 25, 2013 @06:15AM (#45781119) Journal
    
    The C standard does not guarantee that sizeof(long) is as big as sizeof(void*). The type that you want is intptr_t (or ptrdiff_t for differences between pointers). If you've gone through replacing everything with long, then good luck getting your code to run on win64 (where long is 4 bytes).
    
    Parent Share
    twitter facebook
- - Re: (Score:2)
    
    by Rockoon ( 1252108 ) writes:
    
    Having smaller data structures is much better for the small 64-byte cache lines of modern CPUs.
    If your data structure includes pointers that you actually use, then you are randomly accessing memory anyways. If you arent using those pointers, then I suggest 0-sized pointers which are compatible with x64.
- - - - Re: (Score:3)
        
        by Austerity Empowers ( 669817 ) writes:
        
        Colonel Panic to be precise. He reports directly to General P. Fault.
Eh? (Score:4, Insightful)

by fuzzyfuzzyfungus ( 1223518 ) writes: on Tuesday December 24, 2013 @07:27PM (#45778973) Journal

If I wanted to divide my nice big memory space into 32-bit address spaces, I'd dig my totally bitchin' PAE-enabled Pentium Pro rig out of the basement, assuming the rats haven't eaten it...

Share
twitter facebook
- - Re: (Score:2)
    
    by Phydeaux314 ( 866996 ) writes:
    
    http://ask.slashdot.org/story/09/02/12/2115242/how-to-keep-rats-from-eating-my-cables [slashdot.org]
  - Re: (Score:2)
    
    by hey! ( 33014 ) writes:
    
    I assumed "it" referred to his basement.
  - Re: (Score:2)
    
    by Pinhedd ( 1661735 ) writes:
    
    I used to have pet rats. Rats will eat anything
You got it. (Score:2)

by Qwertie ( 797303 ) writes:

Some people don't see the ABI as being worthwhile when it still requires 64-bit processors
There's your answer. If I'm writing a program that won't need over 2GB, the decision is obvious: target x86. How many developers even know about x32? Of those, how many need what it offers? That little fraction will be the number of users.
- Re: (Score:2)
  
  by loufoque ( 1400831 ) writes:
  
  This way you'll be able to make it magically much faster when building it for x32 or amd64.
- Re: (Score:2)
  
  by VortexCortex ( 1117377 ) writes:
  
  Some people don't see the ABI as being worthwhile when it still requires 64-bit processors
  There's your answer. If I'm writing a program that won't need over 2GB, the decision is obvious: target x86. How many developers even know about x32? Of those, how many need what it offers? That little fraction will be the number of users.
  Wait, what are you talking about? "target x86" Wat? Are you writing code in Assembly? How do you target C or higher level code code for x86 vs x86-64, or ARM for that matter?
  Ooooh, wait, you're one of those proprietary Linux software developers? Protip: 1's and 0's are in infinite supply, so Economics 101 says they have zero price regardless of cost to create. What's scarce is your ability to create new configurations of bits -- new source code -- not the bits. Just like a mechanic, home builder, burg
- - Re: (Score:2)
    
    by Chalnoth ( 1334923 ) writes:
    
    True. But for the vast majority of applications, that greater number of registers only translates into a small performance increase. I can potentially see x32 being useful for a rather small amount of heavily hand-optimized code (e.g. a massively optimized math or physics library), but for the vast majority of applications this performance benefit will be tiny.
    To me, the real problem for the adoption of x32 is that so few programs on PC's need to worry that much about optimization. When it does become wo
Nice concept (Score:3, Insightful)

by Anonymous Coward writes: on Tuesday December 24, 2013 @07:34PM (#45779025)

I do not see many cases where this would be useful. If we have a 64-bit processor and a 64-bit operating system then it seems the only benefit to running a 32-bit binary is it uses a slightly smaller amount of memory. Chances are that is a very small difference in memory used. Maybe the program loads a little faster, but is it a measurable, consistent amount? For most practical use case scenarios it does not look like this technology would be useful enough to justify compiling a new package. Now, if the process worked with 64-bit binaries and could automatically (and safely) decrease pointer size on 64-bit binaries then it might be worth while. But I'm not going to re-build an application just for smaller pointers.

Share
twitter facebook
- Re: (Score:2)
  
  by mjrauhal ( 144713 ) writes:
  
  You misunderstand the desired impact. "Loads a little faster" doesn't really enter into it. It's rather that system memory is _slow_, and you have to cram a lot of stuff into CPU cache for things to work quickly. That's were the smaller pointers help, with some workloads. Especially if you're doing a lot of pointery data structure heavy computing where you often compile your own stuff to run anyway.
  Still not saying it's necessarily worth the maintenance hassle, but let's understand the issues first.
- Re: (Score:3, Informative)
  
  by maswan ( 106561 ) writes:
  
  The main benefit is that it runs faster. 64-bit pointers take up twice the space in caches, and especially L1 cache is very space-limited. Loading and storing them also takes twice the bandwidth to main memory.
  So for code with lots of complex data types (as opposed to big arrays of floating point data), that still has to run fast, it makes sense. I imagine the Linux kernel developers No1 benchmark of compiling the kernel would run noticably faster with gcc in x32.
  The downside is that you need a proper fully
  - Re: (Score:3)
    
    by sribe ( 304414 ) writes:
    
    So for code with lots of complex data types (as opposed to big arrays of floating point data), that still has to run fast, it makes sense.
    Well, here's the problem. Code that is that performance-sensitive can often benefit a whole lot more from a better design that does not have so many pointers pointing to itty-bitty data bits. (For instance, instead of a binary tree, a B-tree with nodes that are at least a couple of cache lines, or maybe even a whole page, wide.) There are very very few problems that actually require that a significant portion of data memory be occupied by pointers. There are lots and lots of them where the most convenient
  - Re: (Score:3)
    
    by Rockoon ( 1252108 ) writes:
    
    64-bit pointers take up twice the space in caches, and especially L1 cache is very space-limited.
    
    L1 cache is typically 64KB, which is room for 8K 64-bit pointers or 16K 32-bit pointers. Now riddle me this.. if you are following thousands or more pointers, what are the chances that your access pattern is at all cache friendly?
    
    The chance is virtually zero.
    
    Of course, not all of the data is pointers, but that actually doesnt help the argument. The smaller the percentage of the cache that is pointers, the less important their size actually is, for after all when 0% are pointers then pointer size cannot
- Re: (Score:2)
  
  by LWATCDR ( 28044 ) writes:
  
  Simple.
  It is just as fast.
  Takes less drive space.
  Uses less memory.
  As to rebuilding apps it should be just a simple compile and yes while memory is cheap it is not always available even today. What about x86 tablets on Atom? I mean really does ls need to be 64bit what about more?
- - Re: (Score:2)
    
    by loufoque ( 1400831 ) writes:
    
    Any application that does heavy-numerical computation should not be affected by much by the ABI if at all. All function calls are inlined inside the critical loop.
    - Re: (Score:2)
      
      by cnettel ( 836611 ) writes:
      
      Any application that does heavy-numerical computation should not be affected by much by the ABI if at all. All function calls are inlined inside the critical loop.
      The ABI here also defines the size of all pointers. All pointers are 32-bit here. Any purely compute intensive application will not be affected much, but something including some complexity in data structures, with pointers, could possibly benefit a lot. On the other hand, if all your code does is traversing trees, you should seriously consider allocating them in one bunch and using internal indices (of smaller integer type) rather than native pointers anyway.
      - Re: (Score:2)
        
        by loufoque ( 1400831 ) writes:
        
        Number crunching rarely involve any pointers in the critical parts, the only exception I can think of is sparse matrices, which is actually usually done with fixed-size indexes rather than pointers.
        Game engines however probably have a lot of trees of pointers for their scene graph, so they could be affected. But if they're well-optimized, they're designed to that each level fits exactly inside a cache line, and changing the size of the pointers will mess that up.
Who cares if I'll use it? (Score:5, Interesting)

by 93 Escort Wagon ( 326346 ) writes: on Tuesday December 24, 2013 @07:37PM (#45779045)

The maintainer(s) find it interesting, and they're developing it on their own dime... so I don't get the hate in some of these first few posts. No one's forcing you to use it, or even to think about it when you're coding something else.
If it's useful to someone, that's all that matters.

Share
twitter facebook
- Re: (Score:2)
  
  by turkeydance ( 1266624 ) writes:
  
  "not catching wind"? now, there's an open to interpretation analogy.
  - Re: (Score:2)
    
    by Phydeaux314 ( 866996 ) writes:
    
    Yeah, you can't use sailing analogies here. Cars only.
It's not only RAM (Score:5, Informative)

by jandar ( 304267 ) writes: on Tuesday December 24, 2013 @07:41PM (#45779071)

The company I work for compiles almost all programms with 32 bits on x86-64 CPUs. It's not only cheap RAM usage, it's also expensive cache which is wasted with 64 pointer and 64 bit int. Since 3 GB is much more than our programms are using, x86-64 would be foolish. I'm eager waiting for a x32 SuSE version.

Share
twitter facebook
- Re: (Score:2)
  
  by Ecuador ( 740021 ) writes:
  
  I don't get it. x86-64 doubles the general purpose and SSE registers over x86. This alone makes a (usually quite big) difference even for programs that don't use 64bit arithmetic. The point of the x32 ABI as I understand it is to keep that advantage without having 64bit pointers.
  But you just compile with 32bits losing all the advantages of x86-64?
- - Re: (Score:3)
    
    by Austerity Empowers ( 669817 ) writes:
    
    Your comment reminded me of what Larry Wall, inventor of the wrecking ball, said about Miley Cirus:
    "Leeeeroooooy Jenkins!"
Comment removed (Score:4, Interesting)

by account_deleted ( 4530225 ) writes: on Tuesday December 24, 2013 @07:43PM (#45779081)

Comment removed based on user account deletion

Share
twitter facebook
- Re: (Score:2)
  
  by mysidia ( 191772 ) writes:
  
  But I suspect the problem is that the benefits simply outweigh the inconvenience of having to run with an entirely separate ABI.
  Well; if the benefits outweigh the inconvenience --- then it seems x32 should be catching on more than it is.
  Personally I think it is a bad idea because of the 4GB program virtual address space limit; which applications will be frequently exceeding, especially the server applications that would otherwise benefit the most from optimization.
  - Re: (Score:2)
    
    by account_deleted ( 4530225 ) writes:
    
    Comment removed based on user account deletion
  - Re: (Score:2)
    
    by tlhIngan ( 30335 ) writes:
    
    Personally I think it is a bad idea because of the 4GB program virtual address space limit; which applications will be frequently exceeding, especially the server applications that would otherwise benefit the most from optimization.
    
    You're making an assumption that the 4GB limit is prohibitive. For some applications, it could be - databases and scientific processing, and definitely games. But there are plenty of other applications that won't really benefit from the enlarged address space - would a word proc
- Re: (Score:2)
  
  by VortexCortex ( 1117377 ) writes:
  
  and faster (because the code is smaller, in theory cache should be used more efficiently
  Your skill is Not enough. when you blow registers onto the stack the code crawls. x86-64 has more registers. Code compiled for is far faster than x86 because of the extra registers. The L1 cache is how big on your CPU? Is your binary MEGABYTES in size? If your code is jumping all over the digital universe generating cache misses then you're purposefully doing something more idiotic than this universe should care about.
Maybe (Score:2)

by cold fjord ( 826450 ) writes:

It depends on the delta. There are still many 32bit problems out there, and there are plenty of cases where having extra performance helps. If you have enough of the right size problems you could even reduce the number of systems that you would need.
It looks like it could allow packing a single system tighter with less wasted resources.
Reducing the footprint of individual programs could also have some benefits from system performance / management, especially in tight resource situations.
One minor drawback
Try it (Score:3)

by KiloByte ( 825081 ) writes: on Tuesday December 24, 2013 @07:56PM (#45779145)

debootstrap --arch=x32 unstable /path/to/chroot http://ftp.debian-ports.org/debian/ [debian-ports.org]
Requires an amd64 kernel compiled with CONFIG_X86_X32=y (every x32-capable kernel can also run 64 bit stuff).

Share
twitter facebook
Great for smart phones (Score:2)

by MobyDisk ( 75490 ) writes:

This could have a home on smart phones. A smaller memory footprint is *key* on smartphone apps.
Seems reasonable. (Score:2)

by gallondr00nk ( 868673 ) writes:

There's plenty of applications around still without a 64 bit binary. From what I understand this layer just allows 32 bit programs to utilize some performance enhancing features of 64 bit architecture. It seems a genuinely good idea.
- Re: (Score:2)
  
  by cnettel ( 836611 ) writes:
  
  There's plenty of applications around still without a 64 bit binary. From what I understand this layer just allows 32 bit programs to utilize some performance enhancing features of 64 bit architecture. It seems a genuinely good idea.
  It allows 32-bit programs, which are *recompiled*, to benefit from those features. You still need the source and x32 builds of all dependencies. However, sometimes I guess there could be porting issues due to pointer size assumptions (but no other hard assumptions of x86 ABI behavior). Those codebases could not be recompiled for x64, but might port to x32 more easily.
Too little, too late (Score:2)

by TeknoHog ( 164938 ) writes:

x32 would have been nice as the first transition away from x86-32, but memory needs keep increasing, and we are far too used to full 64-bit spaces. In fact, it feels like we're finally over with the 32-64 bit transition, and people no longer worry about different kinds of x86 when buying new hardware. So introducing this alternative is a needless complication. As others have pointed out, it's too special a niche to warrant its own ABI.
- Re: (Score:2)
  
  by Reliable Windmill ( 2932227 ) writes:
  
  It's not a complication, it's an enhancement. A majority of software does not need a 64-bit address space and can thus be streamlined while still getting the benefits of doing fast 64-bit integer math, among other things. Obviously you just select the target when compiling and that's that, it's like enabling an optimization, so what are you talking about?
Is kernel still 64bit? (Score:2)

by ThePhilips ( 752041 ) writes:

General question about x32 ABI: is the OS still can use more than 4GB RAM w/o penalties? IOW, is kernel still 64bit? Only userspace is x32? Or x32 and pure 64-bit can run alongside?
Anyway. Most performance-sensitive programs went 64-bit anyway - since RAM is cheap and there are bunch of faster but memory-hogging algorithms.
- Re: (Score:2)
  
  by mjrauhal ( 144713 ) writes:
  
  The kernel needs to be an amd64 one for x32 to work, at least as things stand now. The most common situation would _probably_ be an amd64 system with some specialist x32 software doing performance intensive stuff. (Or possibly a hobbyist system running an all-x32 userspace for the hack value.)
  Yeah, working with big data is unlikely to benefit, and data _is_ generally getting bigger.
- Re: (Score:2)
  
  by Reliable Windmill ( 2932227 ) writes:
  
  Of course the OS is still 64-bit in that regard, it's just the address space of that particular application which is reduced to 32-bit to streamline it. The majority of all executable files do not require several gigabytes of RAM, hence it makes sense to streamline their address space.
  - Re: (Score:2)
    
    by ThePhilips ( 752041 ) writes:
    
    The majority of all executable files do not require several gigabytes of RAM, hence it makes sense to streamline their address space.
    
    I know that. Many commercial *NIX systems are doing it. Though... Having a 32-bit "cat" doesn't really changes anything.
    That why I have mentioned the memory hungry algorithms. Many applications are doing it this days. Needless to mention that java this days is started almost exclusively with the "-d64".
    The market for 4GB address space is really small. Because modern general programming practices generally disregard the resources in general, RAM in particular. (The (number of) CPUs being the most disrega
- Re: (Score:2)
  
  by VortexCortex ( 1117377 ) writes:
  
  I do some alternative OS development. When I setup a program to run there are 3 different 64bit modes (programming models) for me to select to run the program under: ILP64, LLP64, and LP64. In ILP64 you get 64 bit ints, longs, long longs, and pointers. In LLP64 you get 32bit longs and ints, and 64bit long longs and pointers. In LP64 you get 32bit ints, 64 bit longs, long longs and pointers. Note: All these pointers are 64 bit (but the hardware may have less bits than this, the OS will query it, code mus
What about shared libraries? (Score:4, Insightful)

by billcarson ( 2438218 ) writes: on Tuesday December 24, 2013 @08:22PM (#45779267)

Wouldn't this require all common shared libraries (glib, mpi, etc.) to be recompiled for both x86-64 and x32? What am I missing here?

Share
twitter facebook
- Re:What about shared libraries? (Score:4, Informative)
  
  by mjrauhal ( 144713 ) writes: on Tuesday December 24, 2013 @08:32PM (#45779307) Homepage
  
  Yes it would. That's among the nontrivial maintenance costs.
  
  Parent Share
  twitter facebook
  - Re: (Score:2)
    
    by Arker ( 91948 ) writes:
    
    Funny thing I notice in articles of this sort. There are always comments saying it's dumb because there is no point in optimising software for performance because hardware is so cheap. And there are comments like yours, complaining that having to do a recompile to achieve it is too big a burden.
    Do you see the tension between the thoughts? Because if hardware is so cheap that it is more reasonable to tell the user to upgrade his computer, rather than optimise your software, then does it not follow that same
    - Re: (Score:3)
      
      by bogjobber ( 880402 ) writes:
      
      Nontrivial doesn't necessarily mean large. It just means significant enough that it needs to be accounted for. The actual cost will of course be dependent on the size and complexity of your codebase.
The main use cases are vertically integrated (Score:2)

by BusterB ( 10791 ) writes:

Think Atom processors running Android, or High-performance computing applications. Neither of these require a huge external ecosystem, but if you get a 30-40% boost in some workload, they are worth it. It's my understanding that small-cache Atoms benefit from this more than huge Xeons.
No (Score:2)

by Technomancer ( 51963 ) writes:

And I don't want another set of libraries in my system in addition to 64 bit and 32 bit emulation.
The return of memory models? (Score:4, Interesting)

by Just Brew It! ( 636086 ) writes: on Tuesday December 24, 2013 @10:29PM (#45779933)

This sure feels a lot like a throwback to the old 16-bit DOS days, where you had small/medium/large memory models depending on the size of your code and data address spaces. We've already got 32-bit mode for supporting pure 32-bit apps and 64-bit mode for pure 64-bit; supporting yet a third ABI is just going to result in more bloat as all the runtime libraries need to be duplicated for yet another combination of code/data pointer size.
I hate to say this since I'm sure a lot of smart people put significant effort into this, but it seems like a solution in search of a problem. RAM is cheap, and the performance advantage of using 32-bit pointers is typically small.

Share
twitter facebook
BSD (Score:2)

by manu0601 ( 2221348 ) writes:

I understand it is the same beast as the COMPAT_NETBSD32 [netbsd.org] option that has been available in NetBSD for 15 years now. It works amazingly well: one can throw a 64 bit kernel on a 32 bit userland and it just works, except for a few binaries that rely on ioctl(2) on some special device to cooperate with the kernel.
NetBSD even had a COMPAT_LINUX32 [netbsd.org] option for 7 years, which enables running a 32 bit Linux binary on a 64 bit NetBSD kernel. Of course the Linux ABI is a fast moving target, and one often misses the lat
- Re: (Score:3)
  
  by adri ( 173121 ) writes:
  
  No, it's not the same.
  The idea is that you use the 32 bit pointer model, with 32 bit indirect instructions, but you're doing it all using the x86-64 instruction set. Ie, the task is in 64 bit mode. The 64 bit mode includes primarily more registers, so you can write / compile to tighter code.
  The stuff you described is for running 32 bit binaries that use the i386/i485/i586 instruction set, complete with the limited set of temporary registers. x86-64 has many more registers to use.
  It's not just about cache li
First let's understand this x32 correctly. (Score:3)

by macpacheco ( 1764378 ) writes: on Tuesday December 24, 2013 @11:38PM (#45780175)

While it's possible to have a system with 16GB that could use only x32 (the kernel is still x86_64 under x32, so the kernel can see the 16GB), for instance running thousands of tasks using up to 4GB each just fine, plus the page cache is a kernel thing, so the I/O cache can always use all memory.
On the other hand, there are workloads that run on a 4GB system but that need x86_64 (mmaping of huge files for instance), and so boneheaded tasks reserve tons of never used RAM, it could actually use 1GB of RAM but reserve 8GB, the issue there really should be putting the coder in jail, but I digress.
But the vast majority of linux workloads today that use even a 8GB system would run just fine under x32. Like 95-98%.
And nobody is even suggesting a mainstream linux distro without x86_64 userland. I'm sugesting all standard tools using x32, but keeping the x86_64 shared libraries and compilers, so if you need you could use some apps with full 64bit capability. Just use x32 by default.
Plus it's a good way to remind lazy developers that no matter how cheap RAM is, you should be serious about being efficient (specially to the KDE developers) !
KDE functionality is great, but they really have no clue about efficiency (RAM and CPU).

Share
twitter facebook
- - Re: (Score:2)
    
    by macpacheco ( 1764378 ) writes:
    
    humm, I'm running firefox / chrome with 3GB total system RAM just fine
    dozens of tabs, flash, java, you name it
    many pages with hundreds of jpegs open
    the maximum virtual memory space for those jobs don't even get to 1GB
    I'm a MySQL/pgsql/Progress DBA and the only case I've seen that would require x86_64 is a customer with 6 Progress databases, a single local client attaches to all 6 dbs, requiring over 4GB of address space, all other cases don't come even close, all jvm I've ever seen, maxed out at 1.3GB
    again,
Too specific (Score:2)

by Kagetsuki ( 1620613 ) writes:

So for me the answer is no. The whole thing reminds me of doing ARM assembler with thumb code mixed in. If you have a very specific usage for it then yes, it would certianly be useful - but it's going to be up to the people who need it to actually use and improve it. Everyone else has no need to care and the average developer shouldn't *need* to care or even be aware of it.
Errm (Score:5, Interesting)

by countach ( 534280 ) writes: on Wednesday December 25, 2013 @09:10AM (#45781517)

Won't this require a 2nd copy of the shared libraries in memory, which will negate the benefit of a slightly smaller binary?

Share
twitter facebook
- Re: (Score:3)
  
  by mjrauhal ( 144713 ) writes:
  
  x32 at least has some merit, unlike your grasp of the history of computing. (Just not very much and probably not worth the trouble; you can probably relate.)
- Re: (Score:3)
  
  by s.petry ( 762400 ) writes:
  
  I would not go that far since I'm sure a special case may exist, but that's exactly what it would be for. Hence the 'no massive wide scale adoption' or 'applications written for this' becomes an (what should be) obvious outcome.
  If I'm custom Joe and see a workload that benefits from 32 vs. 64bit OS constraints I load a 32bit OS. The reason we went to larger memory however means those special cases are extremely rare today. They happen more because "we can't get new hardware" than by choice.
- Re: (Score:3)
  
  by Reliable Windmill ( 2932227 ) writes:
  
  You've just misunderstood it. It is in essence a performance enhancement, and you would benefit from it simply from selecting x32 target (instead of x86-64) when compiling.
- - ARM (Score:2)
    
    by tepples ( 727027 ) writes:
    
    I thought "embedded cost-sensitive systems" would be using ARM CPUs, not Intel or AMD x86-64 CPUs with 32-bit pointers.
  - Re: It has some value for embedded systems (Score:2)
    
    by jmauro ( 32523 ) writes:
    
    I think the embedded systems that need this would be better off just getting a faster 32-bit processor.
  - Re: (Score:3)
    
    by Pinhedd ( 1661735 ) writes:
    
    wrong architecture.
    Cost sensitive embedded systems use ARM based microprocessors to which this is not applicable.
- - Re: (Score:3)
    
    by ShanghaiBill ( 739463 ) writes:
    
    Who would want this, some niche embedded guys?
    Not many NEGs are using 64 bit processors, and this ABI offers too little advantage to bother with. Most embedded systems run a single primary process. If that process fits in a 4GB address space (as is required to use this ABI), then the system would just use a native 32 bit ABI on a 32 bit CPU, not this 32 bit ABI on a more expensive 64 bit CPU.
- Re: (Score:2, Insightful)
  
  by Anonymous Coward writes:
  
  My dad drives a Ford and your dad drives a Chevy. Your dad sucks.
  Didn't we do this already? Like when we were twelve years old.
- Re: (Score:3)
  
  by mjrauhal ( 144713 ) writes:
  
  I could get into specifics but I shan't, because what you're blathering about has zero relevance for x32. It's not a replacement-to-be for the usual amd64 ABI, nobody is going to break amd64 to make x32 run. It's mostly a specialist tool for specific workloads (aside from being a hacker's playground, as are many things). Whether thinking it's useful as such is misguided or not, you're more so.
- Re: (Score:3)
  
  by armanox ( 826486 ) writes:
  
  I can recompile and run 20 year old SunOS apps no problem with OpenSolaris. Try that with Linux?
  Depends on what it's looking for, but in theory should work. 20 years? CLI or GUI based? Probably wants TCL/TK and/or Motiff if it's GUI, make sure they're installed. I'm willing to try, if you have source code that old...
  Hairyfeet mentioned he tried linux and people kept calling back angry that their printer stopped working after an Ubuntu update.
  I did not even know it existed? I will keep Linux on a VM I suppose but only CentOS as Redhat likes to make somewhat ABIs that do not break after each freaking update!
  If you need stability then you should go with a stable OS. Fedora, OpenSuSE, and Ubuntu change too fast for enterprise use - which is what makes RHEL great.
  With that said, I don't seem to have issues running some older software I have laying around for Linux. Oracle Database 8 instal
- More than one reason for x86-64 (Score:5, Interesting)
  
  by tepples ( 727027 ) writes: <tepples AT gmail DOT com> on Tuesday December 24, 2013 @08:23PM (#45779271) Homepage Journal
  
  we went 64 bit for a reason.
  We went to x86-64 for three reasons: 64-bit integer registers, more integer registers, and 64-bit pointers. Some applications need only the first two of these three, which is why x32 is supposed to exist.
  
  Parent Share
  twitter facebook
- Re: (Score:2, Interesting)
  
  by 0123456 ( 636235 ) writes:
  
  Eventually, I assume that all binaries which don't need 64-bit addressing (which will probably always be more than 90% of them) will switch to this ABI since having access to the extended register set without the overhead of all the bus bandwidth and cache space lost to store lots of zeroes is a HUGE win with zero cost.
  Uh, no.
  Really, no.
  It's just not going to happen.
  90+% of applications are not CPU-intensive, so they don't give a crap. 90% of the other applications that are CPU-intensive would benefit far more from removing pointer accesses than from making the pointers half the size. Only the remaining 1% are going to go through the hassle of dicking around with a complete second set of libraries on their system just so they can halve the size of their pointers.
  There's simply no benefit at all from compiling the vast maj
- Re: (Score:3)
  
  by Just Brew It! ( 636086 ) writes:
  
  ABI = Application Binary Interface. Defines the pointer sizes and conventions for passing function arguments at the object code level (among other things). The ABI determines how the compiler generates object code for function call/entry/exit, and the width of pointer types.
  API defines the interfaces seen by the programmer.

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

no (Score:4, Insightful)

Re: (Score:2)

Re:no (Score:5, Insightful)

Re:no (Score:4, Informative)

Re:no (Score:5, Insightful)

Re: (Score:3)

Subject (Score:2, Insightful)

Re:Subject (Score:4, Insightful)

Re: (Score:3)

Re: (Score:2)

Re:Subject (Score:5, Insightful)

Re: (Score:2)

Re:Subject (Score:5, Informative)

Re: (Score:3)

Re: (Score:2)

Re:Subject (Score:5, Interesting)

Re: (Score:2)

Re: (Score:2)

Re:Subject (Score:4, Interesting)

Re:Subject (Score:5, Interesting)

Re: (Score:2)

Re: (Score:2)

Re:Subject (Score:5, Funny)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2, Insightful)

Re: (Score:3)

Re: (Score:2)

Re:Subject (Score:4, Informative)

Re: (Score:2)

Re:Subject (Score:5, Informative)

Re:Subject (Score:4, Informative)

Re: (Score:2)

Re: (Score:3)

Eh? (Score:4, Insightful)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

You got it. (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Nice concept (Score:3, Insightful)

Re: (Score:2)

Re: (Score:3, Informative)

Re: (Score:3)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Who cares if I'll use it? (Score:5, Interesting)

Re: (Score:2)

Re: (Score:2)

It's not only RAM (Score:5, Informative)

Re: (Score:2)

Re: (Score:3)

Comment removed (Score:4, Interesting)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Maybe (Score:2)

Try it (Score:3)

Great for smart phones (Score:2)

Seems reasonable. (Score:2)

Re: (Score:2)

Too little, too late (Score:2)

Re: (Score:2)

Is kernel still 64bit? (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

What about shared libraries? (Score:4, Insightful)

Re:What about shared libraries? (Score:4, Informative)

Re: (Score:2)

Re: (Score:3)

The main use cases are vertically integrated (Score:2)

No (Score:2)