History of Supercomputing at Los Alamos, from the Bradbury Science Museum.
Created by BradburySciMuseum on Jul 25, 2011
Last updated: 04/05/12 at 02:20 PM
The Road to Roadrunner and Beyond
The next milestone in computing speed is the exaflop: a million trillion floating point operations per second. As with the petaflop, reaching it will likely rely on new technologies and perhaps new materials. Some predict this could come as early as 2018.
1 exaflop = 1,000,000,000,000,000,000 floating point ops/sec = 1 quintillion floating point operations per second
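The speed units used throughout this timeline (megaflops, gigaflops, teraflops, petaflops, exaflops) each differ by a factor of 1,000. The following is a minimal sketch of that ladder; the function name `describe_flops` and the table `PREFIXES` are illustrative choices, not part of the exhibit.

```python
# Each step up the ladder is 1,000 times the previous one.
PREFIXES = [
    ("exaflop", 10**18),
    ("petaflop", 10**15),
    ("teraflop", 10**12),
    ("gigaflop", 10**9),
    ("megaflop", 10**6),
]


def describe_flops(ops_per_sec: int) -> str:
    """Express a rate in floating point ops/sec using the largest fitting unit."""
    for name, scale in PREFIXES:
        if ops_per_sec >= scale:
            value = ops_per_sec / scale
            unit = name if value == 1 else name + "s"
            return f"{value:g} {unit}"
    return f"{ops_per_sec} flops"


if __name__ == "__main__":
    # Rates quoted in the timeline, written out in full.
    for rate in (133_000_000, 3 * 10**12, 13 * 10**14, 10**18):
        print(f"{rate:,} ops/sec = {describe_flops(rate)}")
```

Read this way, an exaflop machine would be roughly 770 times faster than the 1.3 petaflops quoted for Roadrunner.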
Cielo arrived at Los Alamos from Cray Inc., with more installed in 2011. As a “capability” machine, it is designed to support large single computing jobs that take the entire platform to run.
At a yearly conference, major designers, manufacturers, and users of supercomputers convene to show off their latest work. Here are two videos recently presented by Los Alamos National Laboratory computer scientists, engineers, and technicians.
Roadrunner is helping scientists study the HIV "family tree."
Ten unclassified projects were selected to use Roadrunner so that scientists could optimize the way large codes are able to run.
Petaflop barrier broken during trial runs
• Power draw: 2.3 megawatts
• 278 server racks containing 6,562 AMD Opteron dual-core processors and 12,240 PowerXCell 8i Cell processors, a special IBM-developed variant of the Cell processor used in the PlayStation 3
1.3 quadrillion floating point operations per second = 1,300,000,000,000,000 floating point ops/sec = 1.3 petaflops
2009
Roadrunner moved from IBM and installed in the Metropolis Center at Los Alamos.
Novel hybrid architecture
• 3,060 tri-blade nodes, each containing two Cell blades, one host blade, and one expansion blade that connects them
• Each tri-blade capable of 409.6 gigaflops from the IBM chips and an additional 14.4 gigaflops from the AMD chips, for a total of 424 gigaflops
• Tri-blades interconnected with over 95 miles of fiber cabling and InfiniBand network switches
1.3 quadrillion floating point operations per second = 1,300,000,000,000,000 floating point ops/sec = 1.3 petaflops
Roadrunner Tri-blade Node
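The tri-blade figures can be checked with simple arithmetic: the per-node capability times the node count should land near the quoted 1.3 petaflops. The sketch below uses only the numbers given in the exhibit text; the constant and function names are illustrative.

```python
# Figures quoted in the exhibit text.
CELL_GFLOPS_PER_TRIBLADE = 409.6    # from the IBM Cell chips
OPTERON_GFLOPS_PER_TRIBLADE = 14.4  # from the AMD chips
TRIBLADE_COUNT = 3_060

GFLOPS_PER_PETAFLOP = 1_000_000     # 10**15 / 10**9


def triblade_gflops() -> float:
    """Combined capability of one tri-blade node, in gigaflops."""
    return CELL_GFLOPS_PER_TRIBLADE + OPTERON_GFLOPS_PER_TRIBLADE


def system_petaflops() -> float:
    """Capability of all tri-blades added together, in petaflops."""
    return TRIBLADE_COUNT * triblade_gflops() / GFLOPS_PER_PETAFLOP


if __name__ == "__main__":
    print(f"One tri-blade: {triblade_gflops():.1f} gigaflops")
    print(f"{TRIBLADE_COUNT:,} tri-blades: {system_petaflops():.3f} petaflops")
```

The product, 3,060 × 424 gigaflops, is 1,297,440 gigaflops, which rounds to the 1.3 petaflops shown for the machine. This adds up what each node is capable of and says nothing about the speed sustained on a real workload.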
The Lab assesses the PlayStation Cell processor for Roadrunner.
On May 26, 2008, the Roadrunner supercomputer exceeded a sustained speed of 1 petaflop/s—a million billion calculations per second—becoming the first computing system ever to reach a petaflop and holding the record as fastest computer in the world for a year and a half.
2002 Q Machine: Processor Card from Los Alamos National Laboratory's Hewlett-Packard Q Machine
• 8,192 microprocessors
• 2,000 computers in cluster
• Able to vector-process
• Capable of 3-D simulations comprising millions of atoms
20 trillion floating point operations per second = 20,000,000,000,000 floating point ops/sec = 20 teraflops
9/11 attacks generate requirement for the computing power to receive, store, and correlate vast amounts of counter-terrorism information.
Blue Mountain Processor Chip
• 6,144 microprocessors
• 48 computers in cluster
3 trillion floating point operations per second = 3,000,000,000,000 floating point ops/sec = 3 teraflops
Lab reaches major production and simulation milestones in Stockpile Stewardship program to ensure reliability of U.S. nuclear weapons.
United Nations panel finds threat of global warming serious, reinforcing the demand for continually more accurate and detailed climate models.
Major powers agree to cease all nuclear testing, making continuous advances in modeling and simulation essential to the Lab’s nuclear mission.
In 1983, decades after playing a key role in the Manhattan Project, Richard Feynman was recruited by Thinking Machines Corporation to help develop a supercomputer that would combine thousands of microprocessors in one machine. He brought to the task the same conceptual tools he had invented to analyze problems in quantum mechanics and for which he had won the 1965 Nobel Prize in Physics.
In 1965, computer engineer Gordon Moore predicted that the number of components that can be packed into a chip would double every year into the foreseeable future. He later amended the period to two years, and a doubling of computer speed every eighteen months is the figure most often quoted. Moore’s Law has held true in large part because integrated-circuit technology has advanced at a steady pace, and because a higher component count translates directly into speed. Vector, parallel, and cluster computing have also played crucial roles in the growth of computing speed, as have the innovative software and brilliant engineering that enabled every other advance.
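To get a feel for what a fixed doubling period implies, the growth can be written as 2 raised to the number of doublings. This is a minimal sketch of that formula only; the function name `growth_factor` is illustrative, and real hardware never followed the curve exactly.

```python
def growth_factor(years: float, doubling_period_years: float) -> float:
    """Multiplication in capability after `years` of steady doubling."""
    doublings = years / doubling_period_years
    return 2 ** doublings


if __name__ == "__main__":
    # Compare the doubling periods mentioned in the text over one decade.
    for period in (1.0, 1.5, 2.0):
        factor = growth_factor(10, period)
        print(f"Doubling every {period} years: about {factor:,.0f}x in 10 years")
```

Over ten years, doubling every year gives a factor of 1,024, doubling every eighteen months gives roughly 100, and doubling every two years gives 32, so the choice of period matters a great deal.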
• 1,024 high-performance microprocessors
• Unprecedented power tapped to run Lab-developed PAGOSA, the first fully 3-dimensional fluid-modeling program
• Provides quantum leap in the simulation of processes ranging from nuclear reactions to oil-well extraction
130 billion operations per second = 130,000,000,000 ops/sec = 130 gigaflops
In the late 1980s, it became feasible to combine thousands of microprocessors into one machine, enabling it—with the right programming—to work on thousands of parts of a problem at the same time. This “massively parallel” approach led to vast increases in speed and power. Another way to do the same thing: link many individual computers—each quite powerful—and operate the resulting cluster as one giant, massively parallel machine.
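The idea of working on many parts of a problem at once can be sketched in a few lines: split the data into pieces, hand each piece to a worker, then combine the partial answers. The example below is an illustration of that decomposition only, with hypothetical function names. It uses Python threads for simplicity, which in standard CPython show the structure of the approach but do not speed up this kind of arithmetic.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, parts):
    """Divide the problem into at most `parts` nearly equal pieces."""
    if not values:
        return []
    size = -(-len(values) // parts)  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def sum_of_squares(chunk):
    """The work done by one processor on its own piece of the problem."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, workers=4):
    """Process every chunk with its own worker, then combine the results."""
    chunks = split_into_chunks(values, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum_of_squares, chunks))
    return sum(partial_sums)


if __name__ == "__main__":
    data = list(range(1_000))
    print(parallel_sum_of_squares(data, workers=4))
```

The "right programming" mentioned above is largely this: finding a way to cut the problem so that the pieces can be computed independently and the partial results merged at the end.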
Parallel-processor and cluster computers would be impossible without systems to transfer huge amounts of data at tremendously high speeds. HIPPI became an industry standard for networking computer clusters.
HIPPI tester
Say you have two sets of numbers, and want to add the two numbers in the first position, the two in the second position, and so on, to get a set of sums. A scalar processor would first issue an instruction to compute the sum of the elements in the first position, then issue the next instruction to compute the sum for the second position, then the next, and so on. A vector processor, on the other hand, can be programmed to calculate the sums for the entire set at once. This ability gives vector processors an advantage whenever the same operation must be applied to every value in a set of a given size.
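The difference can be modeled by counting how many add instructions each kind of processor would issue. The sketch below is a toy model, not a description of any real machine: the names are illustrative, and `VECTOR_LENGTH` is a hypothetical number of values that one vector instruction can handle.

```python
VECTOR_LENGTH = 64  # hypothetical: values handled by one vector instruction


def scalar_add(a, b):
    """Scalar model: one add instruction is issued per pair of elements."""
    sums, instructions = [], 0
    for x, y in zip(a, b):
        sums.append(x + y)
        instructions += 1
    return sums, instructions


def vector_add(a, b, vector_length=VECTOR_LENGTH):
    """Vector model: one add instruction covers a whole block of pairs."""
    sums, instructions = [], 0
    for start in range(0, len(a), vector_length):
        block_a = a[start:start + vector_length]
        block_b = b[start:start + vector_length]
        # The whole block is counted as a single issued instruction.
        sums.extend(x + y for x, y in zip(block_a, block_b))
        instructions += 1
    return sums, instructions


if __name__ == "__main__":
    first = list(range(200))
    second = list(range(200, 400))
    _, scalar_count = scalar_add(first, second)
    _, vector_count = vector_add(first, second)
    print(f"scalar instructions: {scalar_count}, vector instructions: {vector_count}")
```

Both functions produce the same sums. Only the number of instructions issued differs, which is where the vector processor gains its advantage.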
Seymour Cray first gained fame as the genius behind Control Data Corporation, but he became a legend with his line of Cray supercomputers, machines that broke ground in every area—from fundamental architecture to the smallest details— and held the title of fastest computers in the world for years. Cray was in fierce competition with a new generation of supercomputer designers when he died after a car accident in 1996.
Lab introduces High-Performance Parallel Interface (HIPPI) to remove communications roadblocks in parallel and cluster computing.
AIDS epidemic is shown to be caused by Human Immunodeficiency Virus, or HIV, which immediately becomes high-priority target for Lab modeling and simulation.
Lab acquires series of Cray X-MP/4 computers that approach gigaflop speeds.
• Unique circular design contributed to speed by allowing designers to minimize the length of internal wiring
• Operable in scalar or vector mode
• Built-in refrigeration system to control the heat generated by the machine's high component density
• Footprint so small that the entire machine can fit inside this exhibit. Take a look!
133 million floating point operations per second = 133,000,000 floating point ops/sec = 133 megaflops
Early integrated circuits (ICs) contained just a handful of components, but the number grew steadily into the thousands, millions, and beyond. By the early 1970s, a single IC could provide all the basic functions of the room-sized computers that preceded it. Relentless advances in technology then took microprocessors, as these “computers on a chip” came to be known, far beyond even this level of performance.
Intel 4004, the first general-purpose, commercial microprocessor
Intel 4004 is introduced; the first commercially available microprocessor, it ignites an explosion in computer and electronics power that continues to this day.
New missions and challenges—from gauging the reliability of nuclear weapons to revealing the structure of proteins and viruses—have created an endless demand for ever-faster, ever-more powerful computers. The answer has been continuous development of the supercomputer—made possible by technological breakthroughs such as the microprocessor and new concepts such as parallel processing and cluster computing.
ENIAC and MANIAC used fixed point arithmetic, storing a number in a single location in memory, with the decimal point always assumed to be in the same position (say, in the middle of eight digits). They might, for example, store 0.031415926 as “00000314,” losing much of the original precision. Much like scientific notation, floating point arithmetic keeps track of the digits and the decimal point separately. The computer uses two locations to store the number (“31415926” in one and “–2” in another, meaning 3.1415926 × 10⁻²) and thus retains the original precision.
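The two storage schemes can be reproduced with the same example number. The sketch below is a decimal illustration of the idea, assuming an eight-digit location with the point in the middle and handling non-negative values only. It does not model how ENIAC, MANIAC, or any later machine actually encoded numbers, and the function names are illustrative.

```python
from decimal import Decimal, ROUND_DOWN


def to_fixed(value: str, total_digits: int = 8, frac_digits: int = 4) -> str:
    """Store a number in one location, with the point fixed after 4 digits."""
    scaled = (Decimal(value) * 10**frac_digits).to_integral_value(rounding=ROUND_DOWN)
    return f"{int(scaled):0{total_digits}d}"


def from_fixed(stored: str, frac_digits: int = 4) -> Decimal:
    """Read a fixed point location back as a number."""
    return Decimal(int(stored)) / 10**frac_digits


def to_floating(value: str):
    """Store the digits in one location and the exponent in another."""
    _, digits, exponent = Decimal(value).as_tuple()
    digit_string = "".join(str(d) for d in digits)
    # Exponent as in scientific notation: one digit before the point.
    return digit_string, len(digits) + exponent - 1


def from_floating(digit_string: str, exponent: int) -> Decimal:
    """Rebuild the number from its digits and its exponent."""
    return Decimal(digit_string).scaleb(exponent - (len(digit_string) - 1))


if __name__ == "__main__":
    number = "0.031415926"
    fixed = to_fixed(number)
    digits, exp = to_floating(number)
    print(f"fixed point:    {fixed!r} reads back as {from_fixed(fixed)}")
    print(f"floating point: {digits!r} and {exp} read back as {from_floating(digits, exp)}")
```

The fixed point location keeps only 0.0314, while the pair of floating point locations reads back as the full 0.031415926.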
• Another product of Seymour Cray and a small group of engineers at CDC
• Complete makeover of the 6600
• Temperamental, but powerful and productive
• World’s fastest computer in general use until Cray-1
10 million floating point operations per second = 10,000,000 floating point ops/sec = 10 megaflops

