Linux Breaks 100 Petabyte Ceiling
*no comment* writes: "Linux has broken through the 100 petabyte ceiling, doing it at 144 petabytes." And this is even more impressive in pebibytes, too.
XFS (Score:5, Informative)
XFS is a full 64-bit filesystem, and thus, as a filesystem, is capable of handling files as
large as a million terabytes.
2^63 = 9 x 10^18 = 9 exabytes
In future, as the filesystem size limitations of Linux are eliminated, XFS will scale to the largest filesystems.
Article got it wrong on BeOS - 18 EXAbytes! (Score:5, Informative)
Just wanted to set the record straight.
FreeBSD had it first. (Score:1, Informative)
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/d
Re:Ok... (Score:3, Informative)
Well, it's good to see that Linux has caught up, but the article is not correct that Linux is the first OS to support 48-bit ATA; FreeBSD has had this support for over a month now.
See for example: this file [freebsd.org] which is one of the files containing the ATA-6r2 code, committed to FreeBSD on October 6.
Just to put this into perspective... (Score:2, Informative)
http://www.cacr.caltech.edu/~roy/dataquan/ [caltech.edu]
Uh, no? (Score:3, Informative)
I'm looking at the Linux XFS feature page [sgi.com], which states:
My understanding is that the 2TB limit per block device (including logical devices) is firm (regardless of the word size of your architecture), and unrelated to what Mr. Hedrick did. Am I wrong? Does this limit disappear if you build the kernel on a 64-bit architecture? And, on 32-bit architectures, there's no way to get the buffer cache to address more than 16TB.
Re:XFS (Score:1, Informative)
In future, as the filesystem size limitations of Linux are eliminated XFS will scale to the largest filesystems
Before this, you couldn't access drives bigger than 128GB, and a 64-bit filesystem wouldn't have helped. You make it sound like this update was for a specific filesystem, but that's not true; this update was at the device level.
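For reference, the arithmetic behind the two device-level limits, assuming the standard 512-byte sector (a quick bc check):

echo '2^28 * 512' | bc    # 137438953472 bytes: the old 28-bit limit (~137GB, 128GiB)
echo '2^48 * 512' | bc    # 144115188075855872 bytes: the new 48-bit limit (~144PB)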
Re:Nice! (Score:1, Informative)
BeOS.
Re:512? That can't be right. (Score:2, Informative)
2^48 blocks * 512 bytes/block = 144115188075855872 bytes
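In more familiar units (bc; scale sets the decimal places):

echo 'scale=2; 2^48 * 512 / 10^15' | bc    # 144.11 decimal petabytes
echo '2^48 * 512 / 2^50' | bc              # 128 pebibytes, exactly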
1st desktop OS? Well, not quite. (Score:5, Informative)
A slashdot story pointing out how without the FreeBSD ATA code, the Linux kernel would be 'lacking'
The FreeBSD press release announcing the code is stable [freebsd.org]
If The Reg had actually researched the story, Andy would have noticed it is not a 'first' but more of a 'dead heat' between the two leading software libre OSes. Instead, The Reg does more hyping of *Linux.
Pebibytes? (Score:4, Informative)
Well, according to the IEC standard [nist.gov], one petabyte is 10^15 (or 1e+15) bytes, while one pebibyte is 2^50 (or 1.125899e+15) bytes.
So 144 petabytes is 1.44e+17 bytes or 127.89769 pebibytes. Can't say that's more impressive tho. :P
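A quick bc check of that conversion (the .89769 tail comes from converting the rounded 144 PB figure; the exact capacity, 2^57 bytes, is 128 PiB even):

echo 'scale=5; 144 * 10^15 / 2^50' | bc    # 127.89769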
Re:working with large files (Score:4, Informative)
cat file | ssh user@host "cat > file"
More recent builds of SCP also support files over 2GB, so:
scp file user@host:/path
or
scp file user@host:/path/file
will both work.
In fact, probably the best way to sync two directories is rsync. Rsync's major weakness is that it's *tremendously* slow for large numbers of files, and I believe it has to read every byte of a large file before it can incrementally transfer it (so you're looking at 2GB+ of reading before transferring). The following will do rsync over ssh:
rsync -e ssh file user@host:/path/file
rsync -e ssh -r path user@host:/path
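For whole trees, a typical invocation looks something like the following (all standard rsync flags, nothing specific to large files):

# -a: archive mode (recursion, permissions, times); -z: compress in transit;
# --partial: keep interrupted transfers so a 2GB+ file doesn't restart from byte zero
rsync -az --partial -e ssh /some/path/ user@host:/some/path/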
For incremental log transfers, I actually had a system built that would ssh into the remote side, determine the filesize of the remote file, and then tail from the total file size minus the size of the remote file. It was a bit messy, but it was incredibly reliable. It did have problems when the remote logs got cycled, but it wasn't too ugly to detect that the remote filesize was smaller than the local filesize. Just a shell script, after all.
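A minimal sketch of that scheme (hostnames and paths are illustrative, and stat -c %s assumes GNU stat; treat it as a starting point rather than a finished tool):

#!/bin/sh
LOCAL=/var/log/app.log
REMOTE=user@host
RFILE=/var/log/app.log

# How many bytes does the remote copy already have? (0 if it's missing)
rsize=$(ssh "$REMOTE" "stat -c %s $RFILE 2>/dev/null || echo 0")
lsize=$(stat -c %s "$LOCAL")

if [ "$rsize" -gt "$lsize" ]; then
    # Remote copy is bigger than local: the log was cycled, so resend from scratch.
    ssh "$REMOTE" "cat > $RFILE" < "$LOCAL"
else
    # Ship only the tail the remote side hasn't seen yet.
    tail -c +$((rsize + 1)) "$LOCAL" | ssh "$REMOTE" "cat >> $RFILE"
fi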
SFTP should, as far as I know, handle 2GB+ without a hitch.
Both SCP and SSH of course have compression support via the -C flag; alternatively you can pipe the data through gzip yourself.
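For example (roughly equivalent in effect; the explicit gzip pipe just gives you control over the compression level):

scp -C file user@host:/path/file
gzip -c file | ssh user@host "gunzip -c > file"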
Email me for further info; there are some SSH docs on my home page as well. Good luck!
--Dan
www.doxpara.com
Reality check... (Score:5, Informative)
I'm already fed up with the time it takes to back up large disks to tape. Tape drive transfer rates have not improved at the rate of disk capacity in the last few years and are becoming a bottleneck. This was unimportant when the backup time of a single disk was well below one hour (our Ultrium tapes give about 40GB/hour).
Just figure that if you want to transfer 144PB in about one day, you need a transfer rate on the order of 1TB/s, which means about 10 terabits/second. Electronics is far from there, and even fiber is not yet there. Barring a major revolution, magnetic media and heads can't be pushed that far; at the very least it is well beyond the foreseeable future.
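For the record, the arithmetic (bc, integer division):

echo '144 * 10^15 / 86400' | bc    # 1666666666666 bytes/s to move 144PB in one day
# times 8 bits/byte: roughly 13 terabits/second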
Don't get me wrong, it is much better to have more address bits than needed, to avoid painful limitations like the 528 MB and 1024-cylinder ceilings. But as somebody who used disks over 1GB on mainframes around 1984-1985, I easily saw all the limitations of the early IDE interfaces (with the hell of CHS addressing and its ridiculously small bit fields once you mixed the BIOS and interface limitations) and insisted on SCSI for my first computer (CHS is now history thanks to LBA, but the transition was sometimes painful).
However, right now big data centers don't always use the biggest drives, because they can get more bandwidth by spreading the load over more drives (they are also slightly wary of the latest and greatest because reliability is very important). Backing up already starts to take too much time.
In short, the 48-bit block number will not be a limit for the next 20 years or so. I may be wrong, but I'd bet it'll take at least 15 years, perhaps much more, because filling it depends on radically new technologies, and the demand for bandwidth to match the increase in capacity will become more pressing. Increasing the bandwidth is much harder, since you'll likely run into noise problems, which are fundamental physical limitations.
Re:Big deal (Score:4, Informative)
This obviously mattered to the people who implemented it. If you'd rather see development move in a different direction, by all means, write some code that you feel is useful.
See, the people who implemented this probably don't give a damn what you feel is important, they care about what they feel is important.
It's really very simple, put up or shut up.
Limit is for a single IDE disk (Score:3, Informative)
Since my machine has 2 IDE controllers, with 2 buses each and 2 drives per bus, you could make a system with eight 144 PB drives, put an XFS partition on it, and have 1152.92 PB of storage.
And for meaningless statistics' sake: I make my MP3s (from CDs that I own, thankyouverymuch) at an average of 160 kbit/sec. At that rate, the specified drive array would store 1826693 YEARS of MP3s. None of which would be Britney Spears.
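That figure checks out: eight such drives hold 8 x 2^48 x 512 = 2^60 bytes, i.e. 2^63 bits. Assuming 160,000 bits/sec and 365.25-day years (bc):

echo '2^63 / (160000 * 31557600)' | bc    # 1826693 years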
Re:OK this is great... (Score:1, Informative)
Not intended as a flame, just interested
I work at a large credit card bank (we're the largest issuer of VISA cards, and our analytic data store is in the top 500 supercomputer sites [top500.org]). Our main Oracle data warehouse has about 38 TB of tablespace in use. It'll be a while before we need drives with PB capacity.
Re:OK this is great... (Score:1, Informative)
NATIONAL VIRTUAL OBSERVATORY TO PUT UNIVERSE ONLINE
The National Science Foundation has earmarked $10 million for the
development of a National Virtual Observatory (NVO), a single,
searchable database of astronomical knowledge culled from
observatories. The current total volume of astronomical information
comprises roughly 100 terabytes, and scientists predict this number
will swell to over 10 petabytes by 2008. Caltech computer scientist
Paul Messina said that a single repository for this vast amount of
data is essential, otherwise, "we will end up like shipwrecked
sailors on a desert island, surrounded by an ocean of salt water
and unable to slake our thirst." The goal of the project is to be
able to conduct intricate computations by using the NVO to leverage
the computing power of 17 research databases.
(Newsbytes, 30 October 2001)
Article Updated (Score:2, Informative)
The Register [slashdot.org] updated their article. It now acknowledges FreeBSD as being the first Unix to support multi-petabyte filesizes.
However, NTFS 5.0 (the filesystem used by Windows 2000) has had 64-bit addressing since Windows 2000 was released. This yields a maximum capacity of 2^64 bytes: 16 exabytes, or 16384 pebibytes. That's right: for the past few years Windows has supported files 128 times larger than what Linux now manages with an experimental patch. Still, by the time people actually start needing this kind of storage, I don't think it'll actually matter much...
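The ratio to the new Linux limit is easy to check, since 2^48 sectors of 512 bytes is exactly the 144 PB figure from the story (bc):

echo '2^64 / (2^48 * 512)' | bc    # 128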