World's Five Biggest SANs 161
An anonymous reader writes "ByteandSwitch is searching for the World's Biggest SANs, and has compiled a list of 5 candidates whose networks support 10+ Petabytes of active storage. Leading the list is JPMorgan Chase, which uses a mix of IBM and Sun equipment to deliver 14 Pbytes for 170k employees. Also on the list are the U.S. DoD, which uses 700 Fibre Channel switches, NASA, the San Diego Supercomputer Center (it's got 18 Pbytes of tape! storage), and Lawrence Livermore."
Not so accurate (Score:4, Informative)
Shouldn't this be written somewhere? (Score:5, Informative)
...and why does the article say "Pbytes", "Tbytes" (Score:3, Informative)
The abbreviated units are "PB" and "TB".
See: http://en.wikipedia.org/wiki/Petabyte [wikipedia.org]
Re:14Pb for 170k employees... (Score:3, Informative)
I'm also curious about Google and the like. Do they not disclose their storage?
Re:Shouldn't this be written somewhere? (Score:3, Informative)
Re:Very U.S. Centric... (Score:3, Informative)
Re:14Pb for 170k employees... (Score:3, Informative)
To a certain extent, they have disclosed some numbers in a paper about their distributed storage system called "BigTable". The title of the paper is "Bigtable: A Distributed Storage System for Structured Data" and it can be found right here [google.com].
Some numbers can be found on page 11:
Project and Table size in TB:
Crawl: 800
Crawl: 50
Google Analytics: 20
Google Analytics: 200 (Raw click table)
Google Base: 2
Google Earth: 0.5
Google Earth: 70
Orkut: 9
Personalized Search: 4
Total so far: 1,155.5 TB
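As a quick sanity check (my own, not from the paper), the quoted per-table sizes do add up to the stated total:

```python
# Per-table sizes in TB, as quoted above from the Bigtable paper (page 11).
# The labels in parentheses are mine, just to tell duplicate project names apart.
sizes_tb = [
    ("Crawl", 800),
    ("Crawl (second table)", 50),
    ("Google Analytics", 20),
    ("Google Analytics (raw click table)", 200),
    ("Google Base", 2),
    ("Google Earth", 0.5),
    ("Google Earth (second table)", 70),
    ("Orkut", 9),
    ("Personalized Search", 4),
]
total_tb = sum(size for _, size in sizes_tb)
print(total_tb)  # 1155.5 -- a bit over 1 PB across the listed tables
```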
It's a very interesting paper to read. It's one of the many papers [google.com] Google has put online.
Re:security, resilience, risk, etc (Score:3, Informative)
You also talk about copies of data as if a disk went bad, you'd lose the data. These storage arrays have multiple redundancies (RAIDs of VDisks which are RAIDs themselves) as well as having live replication capability to remote sites -- at which point you likely have a copy (or copies) of an entire datacenter in a different geographic location that is running as a hot spare.
Within a datacenter, you would not have more than dual fabrics. Your fabrics' switches will also be redundantly connected within themselves. And if you're killing an entire fabric with an upgrade, you're doing it wrong.
You'll also have service contracts with lockers of disks, switches, linecards, etc., *on site* with field technicians from the vendors on-call 24/7.
Fibre Channel installations are not like some small company's Ethernet LAN.
Re:... That we know about (Score:1, Informative)
Re:14Pb for 170k employees... (Score:4, Informative)
Re:Very U.S. Centric... (Score:2, Informative)
Still, the data acquisition and storage system is impressive. Most of the storage will be distributed over different sites, so I don't know if there will be a huge central storage system.
Re:Very U.S. Centric... (Score:5, Informative)
I'll talk about one of the experiments, ATLAS. Yes, we "generate" petabytes of data per day. It's rather easy to calculate, actually. One collision in the detector can be compressed down to about 2 MB of raw data -- after lots of zero-suppression and smart storage of bits from a detector that has ~100 million channels worth of readout information.
There are ~30 million collisions a second -- the LHC machine runs at 40 MHz but has a "gap" in its beam structure.
Multiplying: 2 * 10^6 * 30 * 10^6 = 6 * 10^13 bytes per second. So ATLAS "produces" 1 petabyte of information in about 17 seconds!! :)
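The back-of-the-envelope rate above can be checked in a couple of lines (decimal petabytes assumed):

```python
# ATLAS raw data rate, using the figures quoted above.
event_size = 2e6         # ~2 MB of compressed raw data per collision
collision_rate = 30e6    # ~30 million collisions per second

bytes_per_second = event_size * collision_rate   # 6e13 B/s
petabyte = 1e15                                  # decimal petabyte

seconds_per_pb = petabyte / bytes_per_second
print(round(seconds_per_pb, 1))  # ~16.7 s to "produce" one petabyte
```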
But ATLAS is limited to storing about ~300 MB per second. This limit comes from how fast you can write things to storage. Remember, there are 4 LHC experiments after all, and ATLAS gets its fair share of the storage capacity.
Which means that out of 30 million collisions per second, ATLAS can only store 150 collisions per second... which, it turns out, is just fine!! The *interesting* physics only happens **very** rarely -- due to the nature of *weak* interactions. At the LHC, we are no longer interested in the atom falling apart and spilling its guts (quarks and gluons) out. We are interested in rare processes such as dark-matter candidates, the Higgs, or top-antitop production (which will dominate the 150 Hz, btw), and other interesting and rare things. In most of the 30 million collisions, the protons just spill their guts out and nothing rare occurs.

The job of the trigger of ATLAS (and any other LHC experiment, for that matter) is to find those *interesting* 150 events out of 30 million every second -- and do this in real time, and without a glitch. ATLAS uses about ~2000 computers to do this real-time data reduction and processing... CMS uses more, I believe.
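The 150 Hz figure and the implied trigger rejection follow directly from the quoted numbers; a quick sketch:

```python
# Trigger selectivity implied by the numbers quoted above.
storage_bandwidth = 300e6   # ~300 MB/s that ATLAS can write to storage
event_size = 2e6            # ~2 MB per collision
collision_rate = 30e6       # ~30 million collisions per second

events_stored_per_second = storage_bandwidth / event_size   # 150 Hz
rejection_factor = collision_rate / events_stored_per_second

# The trigger must throw away all but 1 in 200,000 collisions, in real time.
print(int(events_stored_per_second), int(rejection_factor))  # 150 200000
```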
In the end, we get 300 MB/second worth of raw data, and that's stored on tape at Tier 0 at CERN permanently -- until the end of time, as far as anyone is concerned. That data will never *ever* be removed. Actually, the 5 Tier 1 sites will also hold a full copy of the data among themselves.
Which brings me to my point that CERN storage is technically not a SAN (Storage Area Network)... (My IT buddies insist on this one.) I am told that CERN storage counts as a NAS (Network Attached Storage). But I am going to alert them to this thread and will let them elaborate on that one!
Re:Pebibytes (Score:2, Informative)
Re:... That we know about (Score:3, Informative)
Re:Very U.S. Centric... (Score:3, Informative)
Can't imagine why you write this as AC, but ok...
Answer: "That's only 300 MB/s, 24/7, for more than half a year, just for writing the raw data to storage. Then there are the other three experiments with the same amount of data; actually, one of them does 1.2 GB/s of raw data. The data ends up on disk first with an aggregate write speed of ~1.5 GB/s (let's not exaggerate). The data is read immediately from disk again to be written to tape (our final storage media), so ~1.5 GB/s of reads... Then, all this data is being exported to external computing centers pretty much immediately too (multiple copies, etc., so the aggregate is much higher than 1.5 GB/s), so we get ~3 GB/s of reads just from this data export (it can, potentially, be a lot more -- we already have a total of 120 Gbit/s of network connectivity to those sites).

So, we are already at ~6 GB/s of I/O and nobody has even had a look into the data itself!! If we talk data analysis, we talk about repeated reprocessing runs over the entire collection of raw data in order to "create" the data format that physicists can more easily use for their analysis, and we talk about several thousand people accessing all the accumulated data in a perfectly random way... Mind you, we keep all the raw data active, so in 10 years there will be at least 100 PB, probably more like 150 PB, maybe even 200 PB of active storage. The current estimate for the I/O caused by the data analysis is on the order of 50 GB/s (big B)."
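A rough tally of the I/O figures quoted above; the ~200 running days standing in for "more than half a year" is my assumption, not a number from the post:

```python
# Aggregate I/O at CERN Tier 0, per the figures quoted above.
disk_writes = 1.5e9    # ~1.5 GB/s aggregate raw-data writes to disk
tape_reads = 1.5e9     # the same data read back for migration to tape
export_reads = 3.0e9   # ~3 GB/s of reads for export to external centers

aggregate_io = disk_writes + tape_reads + export_reads
print(aggregate_io / 1e9)  # ~6 GB/s before anyone looks at the data

# Raw-data volume for one 300 MB/s experiment over an assumed ~200 running days:
seconds = 200 * 86400
print(round(300e6 * seconds / 1e15, 1))  # ~5.2 PB of raw data per year
```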