The Many Paths To Data Corruption
Runnin'Scared writes "Linux guru Alan Cox has a writeup on KernelTrap in which he talks about all the possible ways for data to get corrupted when being written to or read from a hard disk drive. Much of the information applies to all operating systems. He prefaces his comments by noting that the details are entirely device specific, then dives right into a fascinating and somewhat disturbing path, tracing data from the drive, through the cable, into the bus, main memory and CPU cache. He also discusses the transfer of data via TCP and cautions, 'unfortunately lots of high performance people use checksum offload which removes much of the end to end protection and leads to problems with iffy cards and the like. This is well studied and known to be very problematic but in the market speed sells not correctness.'"
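To make the TCP point concrete, here is a minimal sketch of a 16-bit ones' complement checksum in the style of RFC 1071, the kind of sum TCP relies on. It is illustrative only, not code from the writeup; the function name and the sample bytes are made up. Because the sum is just addition, exchanging two 16-bit words leaves it unchanged, while a CRC-32 over the same bytes notices the difference.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Hypothetical payloads: the first two 16-bit words are exchanged.
original = b"\x12\x34\x56\x78\x9a\xbc"
swapped = b"\x56\x78\x12\x34\x9a\xbc"

same_sum = internet_checksum(original) == internet_checksum(swapped)
same_crc = zlib.crc32(original) == zlib.crc32(swapped)
print("ones' complement sums match:", same_sum)
print("CRC-32 values match:", same_crc)
```

The sum is cheap, and the sketch shows one class of corruption it cannot see. That is part of why an independent end-to-end check above the transport is worth having whenever the checksum is computed by the card rather than the host.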
End-to-end (Score:5, Informative)
Hello ZFS (Score:5, Informative)
I am looking forward to the day when all RAM has ECC and all filesystems have checksums.
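Until the filesystem does it for you, the same idea can be approximated at the application level. The sketch below is only an analogue of a checksumming filesystem, not how ZFS is implemented: it stores a SHA-256 digest in a sidecar file next to the data and refuses to return data that no longer matches. The helper names, the `.sha256` suffix, and the exception class are assumptions made for illustration.

```python
import hashlib
import os
import tempfile


class ChecksumMismatch(Exception):
    """Raised when stored data no longer matches its recorded digest."""


def write_with_digest(path: str, payload: bytes) -> None:
    # Write the data, then record its digest in a separate sidecar file.
    with open(path, "wb") as f:
        f.write(payload)
    with open(path + ".sha256", "w") as f:
        f.write(hashlib.sha256(payload).hexdigest())


def read_verified(path: str) -> bytes:
    # Re-hash what came back from storage and compare with the recorded digest.
    with open(path, "rb") as f:
        payload = f.read()
    with open(path + ".sha256") as f:
        expected = f.read().strip()
    if hashlib.sha256(payload).hexdigest() != expected:
        raise ChecksumMismatch(path)
    return payload


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "record.bin")
    data = b"important data" * 100
    write_with_digest(path, data)
    clean_ok = read_verified(path) == data

    # Simulate silent corruption by flipping one bit of the first byte.
    with open(path, "r+b") as f:
        first = f.read(1)
        f.seek(0)
        f.write(bytes([first[0] ^ 0x01]))

    try:
        read_verified(path)
        corruption_detected = False
    except ChecksumMismatch:
        corruption_detected = True

print("clean read verified:", clean_ok)
print("corruption detected:", corruption_detected)
```

This only detects corruption; repairing it needs a redundant copy to fall back on. It also cannot catch data that was already corrupted in RAM before the digest was computed, which is where ECC comes in.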
Re:Hello ZFS (Score:4, Informative)
But ECC is available. If it is important to you, pay for it.
Real-life proof of ZFS detecting problems (Score:4, Informative)
http://blogs.sun.com/elowe/entry/zfs_saves_the_day_ta [sun.com]
And you'll understand
Re:RAM = the weakest link (Score:1, Informative)
You meant 975x, not 965x. The successor of the 975x is the X38 (Bearlake-X) chipset, which supports ECC DRAM. It should debut this month.
Re:RAM = the weakest link (Score:1, Informative)
What's worse? It IS free!
Motherboard chips (e.g. south bridge, north bridge) are generally limited in size NOT by the transistors inside but by the number of IO connections. There's silicon to burn, so to speak, and therefore plenty of room to add features like this.
How do I know this? Oh wait, my company made them.... We never had to worry about state-of-the-art process technology because it wasn't worth it. We could afford to be several generations behind for exactly this reason.