Businesses

Stability AI Reportedly Ran Out of Cash To Pay Its Bills For Rented Cloud GPUs (theregister.com) 45

An anonymous reader writes: The massive GPU clusters needed to train Stability AI's popular text-to-image generation model Stable Diffusion are apparently also at least partially responsible for former CEO Emad Mostaque's downfall -- because he couldn't find a way to pay for them. An extensive exposé citing company documents and dozens of people familiar with the matter indicates that the British model builder's extreme infrastructure costs drained its coffers, leaving the biz with just $4 million in reserve by last October. Stability rented its infrastructure from Amazon Web Services, Google Cloud Platform, and GPU-centric cloud operator CoreWeave, at a reported cost of around $99 million a year. That's on top of the $54 million in wages and operating expenses required to keep the AI upstart afloat.

What's more, it appears that a sizable portion of the cloudy resources Stability AI paid for were being given away to anyone outside the startup interested in experimenting with Stability's models. One external researcher cited in the report estimated that a now-cancelled project was provided with at least $2.5 million worth of compute over the span of four months. Stability AI's infrastructure spending was not matched by revenue or fresh funding. The startup was projected to make just $11 million in sales for the 2023 calendar year. Its financials were apparently so bad that it allegedly underpaid its July 2023 bills to AWS by $1 million and had no intention of paying its August bill for $7 million. Google Cloud and CoreWeave were also not paid in full, with debts to the pair reaching $1.6 million as of October, it's reported.
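For a rough sense of the math, here is a minimal back-of-envelope sketch in Python based only on the figures above, assuming the reported annual amounts were spread evenly across the year (the report does not say how lumpy the actual spending was):

```python
# Rough burn-rate sketch from the reported figures (assumption: costs spread
# evenly across the year; real spending was likely lumpy).
annual_compute_cost = 99_000_000   # reported cloud bills (AWS, GCP, CoreWeave)
annual_wages_opex   = 54_000_000   # reported wages and operating expenses
annual_revenue      = 11_000_000   # projected 2023 sales
cash_reserve        = 4_000_000    # reported reserve as of last October

net_burn_per_month = (annual_compute_cost + annual_wages_opex - annual_revenue) / 12
runway_weeks = cash_reserve / net_burn_per_month * 52 / 12   # months -> weeks

print(f"Net burn: ~${net_burn_per_month/1e6:.1f}M per month")
print(f"Implied runway on $4M: ~{runway_weeks:.1f} weeks")
```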

It's not clear whether those bills were ultimately paid, but it's reported that the company -- once valued at a billion dollars -- weighed delaying tax payments to the UK government rather than skimping on its American payroll and risking legal penalties. The failing was pinned on Mostaque's inability to devise and execute a viable business plan. The company also failed to land deals with clients including Canva, NightCafe, Tome, and the Singaporean government, which contemplated a custom model, the report asserts. Stability's financial predicament spiraled, eroding trust among investors and making it difficult for the generative AI darling to raise additional capital, it is claimed. According to the report, Mostaque hoped to bring in a $95 million lifeline at the end of last year, but only managed to secure $50 million from Intel. Only $20 million of that sum was disbursed, a significant shortfall given that the processor titan has a vested interest in Stability, with the AI biz slated to be a key customer for a supercomputer powered by 4,000 of its Gaudi2 accelerators.
The report goes on to mention further fundraising challenges, issues retaining employees, and copyright infringement lawsuits challenging the company's future prospects. The full expose can be read via Forbes (paywalled).
Businesses

Microsoft, OpenAI Plan $100 Billion 'Stargate' AI Supercomputer (reuters.com) 41

According to The Information (paywalled), Microsoft and OpenAI are planning a $100 billion datacenter project that will include an artificial intelligence supercomputer called "Stargate." Reuters reports: The Information reported that Microsoft would likely be responsible for financing the project, which would be 100 times more costly than some of the biggest current data centers, citing people involved in private conversations about the proposal. OpenAI's next major AI upgrade is expected to land by early next year, the report said, adding that Microsoft executives are looking to launch Stargate as soon as 2028. The proposed U.S.-based supercomputer would be the biggest in a series of installations the companies are looking to build over the next six years, the report added.

The Information attributed the tentative cost of $100 billion to a person who spoke to OpenAI CEO Sam Altman about it and a person who has viewed some of Microsoft's initial cost estimates. It did not identify those sources. Altman and Microsoft employees have spread supercomputers across five phases, with Stargate as the fifth phase. Microsoft is working on a smaller, fourth-phase supercomputer for OpenAI that it aims to launch around 2026, according to the report. Microsoft and OpenAI are in the middle of the third phase of the five-phase plan, with much of the cost of the next two phases involving procuring the AI chips that are needed, the report said. The proposed efforts could cost in excess of $115 billion, more than three times what Microsoft spent last year on capital expenditures for servers, buildings and other equipment, the report stated.

Crime

Former Google Engineer Indicted For Stealing AI Secrets To Aid Chinese Companies 28

Linwei Ding, a former Google software engineer, has been indicted for stealing trade secrets related to AI to benefit two Chinese companies. He faces up to 10 years in prison and a $250,000 fine on each criminal count. Reuters reports: Ding's indictment was unveiled a little over a year after the Biden administration created an interagency Disruptive Technology Strike Force to help stop advanced technology from being acquired by countries such as China and Russia, or from otherwise threatening national security. "The Justice Department just will not tolerate the theft of our trade secrets and intelligence," U.S. Attorney General Merrick Garland said at a conference in San Francisco.

According to the indictment, Ding stole detailed information about the hardware infrastructure and software platform that lets Google's supercomputing data centers train large AI models through machine learning. The stolen information included details about chips and systems, and software that helps power a supercomputer "capable of executing at the cutting edge of machine learning and AI technology," the indictment said. Google designed some of the allegedly stolen chip blueprints to gain an edge over cloud computing rivals Amazon.com and Microsoft, which design their own, and reduce its reliance on chips from Nvidia.

Hired by Google in 2019, Ding allegedly began his thefts three years later, while he was being courted to become chief technology officer for an early-stage Chinese tech company, and by May 2023 had uploaded more than 500 confidential files. The indictment said Ding founded his own technology company that month, and circulated a document to a chat group that said "We have experience with Google's ten-thousand-card computational power platform; we just need to replicate and upgrade it." Google became suspicious of Ding in December 2023 and took away his laptop on Jan. 4, 2024, the day before Ding planned to resign.
A Google spokesperson said: "We have strict safeguards to prevent the theft of our confidential commercial information and trade secrets. After an investigation, we found that this employee stole numerous documents, and we quickly referred the case to law enforcement."
Supercomputing

How a Cray-1 Supercomputer Compares to a Raspberry Pi (roylongbottom.org.uk) 145

Roy Longbottom worked for the U.K. government's Central Computer Agency from 1960 to 1993, and "from 1972 to 2022 I produced and ran computer benchmarking and stress testing programs..." Known as the official design authority for the Whetstone benchmark, Longbottom writes that "In 2019 (aged 84), I was recruited as a voluntary member of Raspberry Pi pre-release Alpha testing team."

And this week — now at age 87 — Longbottom has created a web page titled "Cray 1 supercomputer performance comparisons with home computers, phones and tablets." And one statistic really captures the impact of our decades of technological progress.

"In 1978, the Cray 1 supercomputer cost $7 Million, weighed 10,500 pounds and had a 115 kilowatt power supply. It was, by far, the fastest computer in the world. The Raspberry Pi costs around $70 (CPU board, case, power supply, SD card), weighs a few ounces, uses a 5 watt power supply and is more than 4.5 times faster than the Cray 1."


Thanks to long-time Slashdot reader bobdevine for sharing the link.
China

China's Secretive Sunway Pro CPU Quadruples Performance Over Its Predecessor (tomshardware.com) 73

An anonymous reader shares a report: Earlier this year, the National Supercomputing Center in Wuxi (an entity blacklisted in the U.S.) launched its new supercomputer based on the enhanced China-designed Sunway SW26010 Pro processors with 384 cores. Sunway's SW26010 Pro CPU not only packs more cores than its non-Pro SW26010 predecessor, but it more than quadrupled FP64 compute throughput due to microarchitectural and system architecture improvements, according to Chips and Cheese. However, while the manycore CPU is good on paper, it has several performance bottlenecks.

The first details of the manycore Sunway SW26010 Pro CPU and the supercomputers that use it emerged back in 2021. Now, at SC23, the company has showcased actual processors and disclosed more details about their architecture and design, which represent a significant leap in performance. The new CPU is expected to enable China to build high-performance supercomputers based entirely on domestically developed processors. Each Sunway SW26010 Pro has a maximum FP64 throughput of 13.8 TFLOPS, which is massive. For comparison, AMD's 96-core EPYC 9654 has a peak FP64 performance of around 5.4 TFLOPS.

The SW26010 Pro is an evolution of the original SW26010, so it maintains the foundational architecture of its predecessor but introduces several key enhancements. The new SW26010 Pro processor is based on an all-new proprietary 64-bit RISC architecture and packs six core groups (CG) and a protocol processing unit (PPU). Each CG integrates 64 2-wide compute processing elements (CPEs) featuring a 512-bit vector engine as well as 256 KB of fast local store (scratchpad cache) for data and 16 KB for instructions; one management processing element (MPE), which is a superscalar out-of-order core with a vector engine, 32 KB/32 KB L1 instruction/data cache, 256 KB L2 cache; and a 128-bit DDR4-3200 memory interface.
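The quoted 13.8 TFLOPS peak is roughly consistent with those core counts and vector widths. A hedged back-of-envelope check, assuming each 512-bit vector unit retires one fused multiply-add across eight FP64 lanes per cycle and ignoring the small contribution of the MPEs (assumptions, not details confirmed in the report):

```python
# Back-of-envelope peak FP64 for the SW26010 Pro from the figures above.
core_groups    = 6
cpes_per_group = 64
fp64_lanes     = 512 // 64     # 512-bit vector = 8 double-precision lanes
flops_per_lane = 2             # fused multiply-add counted as 2 FLOPs (assumption)
peak_tflops    = 13.8          # quoted peak

flops_per_cycle = core_groups * cpes_per_group * fp64_lanes * flops_per_lane
implied_clock_ghz = peak_tflops * 1e12 / flops_per_cycle / 1e9
print(f"FLOPs per cycle: {flops_per_cycle}")             # 6144
print(f"Implied clock:   ~{implied_clock_ghz:.2f} GHz")  # ~2.25 GHz
```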

AMD

AMD-Powered Frontier Remains Fastest Supercomputer in the World (tomshardware.com) 25

The Top500 organization released its semi-annual list of the fastest supercomputers in the world, with the AMD-powered Frontier supercomputer retaining its spot at the top of the list with 1.194 Exaflop/s (EFlop/s) of performance, fending off a half-scale 585.34 Petaflop/s (PFlop/s) submission from the Argonne National Laboratory's Intel-powered Aurora supercomputer. From a report: Argonne's submission, which only employs half of the Aurora system, lands at the second spot on the Top500, unseating Japan's Fugaku as the second-fastest supercomputer in the world. Intel also made inroads with 20 new supercomputers based on its Sapphire Rapids CPUs entering the list, but AMD's EPYC continues to take over the Top500 as it now powers 140 systems on the list -- a 39% year-over-year increase.

Intel and Argonne are currently still working to bring Aurora fully online for users in 2024. As such, the Aurora submission represented 10,624 Intel CPUs and 31,874 Intel GPUs working in concert to deliver 585.34 PFlop/s at a total of 24.69 megawatts (MW) of power. In contrast, AMD's Frontier holds the performance title at 1.194 EFlop/s, which is more than twice the performance of Aurora, while consuming a comparably miserly 22.70 MW of power (yes, that's less power for the full Frontier supercomputer than half of the Aurora system). Aurora did not land on the Green500, a list of the most power-efficient supercomputers, with this submission, but Frontier continues to hold eighth place on that list. However, Aurora is expected to eventually reach up to 2 EFlop/s of performance when it comes fully online. When complete, Aurora will have 21,248 Xeon Max CPUs and 63,744 Max Series 'Ponte Vecchio' GPUs spread across 166 racks and 10,624 compute blades, making it the largest known single deployment of GPUs in the world. The system leverages HPE Cray EX - Intel Exascale Compute Blades and uses HPE's Slingshot-11 networking interconnect.
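The performance and power figures above translate directly into the energy efficiency that the Green500 ranks. A quick sketch using only the numbers in this summary:

```python
# Energy efficiency implied by the Top500 submissions quoted above.
frontier_flops, frontier_mw = 1.194e18, 22.70   # full system
aurora_flops, aurora_mw     = 585.34e15, 24.69  # half-scale submission

def gflops_per_watt(flops, megawatts):
    """HPL performance divided by power draw, in GFlops/W."""
    return flops / 1e9 / (megawatts * 1e6)

print(f"Frontier: {gflops_per_watt(frontier_flops, frontier_mw):.1f} GFlops/W")
print(f"Aurora (half-scale): {gflops_per_watt(aurora_flops, aurora_mw):.1f} GFlops/W")
```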

China

Chinese Scientists Claim Record-Smashing Quantum Computing Breakthrough (scmp.com) 44

From the South China Morning Post: Scientists in China say their latest quantum computer has solved an ultra-complicated mathematical problem within a millionth of a second — more than 20 billion years quicker than the world's fastest supercomputer could achieve the same task. The JiuZhang 3 prototype also smashed the record set by its predecessor in the series, with a one million-fold increase in calculation speed, according to a paper published on Tuesday by the peer-reviewed journal Physical Review Letters...

The series uses photons — tiny particles that travel at the speed of light — as the physical medium for calculations, with each one carrying a qubit, the basic unit of quantum information... The fastest classical supercomputer Frontier — developed in the US and named the world's most powerful in mid-2022 — would take over 20 billion years to complete the same task, the researchers said.

The article notes that the number of photons increased from 76 to 113 across the first two versions, improving to 255 in the latest iteration.

Thanks to long-time Slashdot reader hackingbear for sharing the news.
Supercomputing

Europe's First Exascale Supercomputer Will Run On ARM Instead of X86 (extremetech.com) 40

An anonymous reader quotes a report from ExtremeTech: One of the world's most powerful supercomputers will soon be online in Europe, but it's not just the raw speed that will make the Jupiter supercomputer special. Unlike most of the Top 500 list, the exascale Jupiter system will rely on ARM cores instead of x86 parts. Intel and AMD might be disappointed, but Nvidia will get a piece of the Jupiter action. [...] Jupiter is a project of the European High-Performance Computing Joint Undertaking (EuroHPC JU), which is working with computing firms Eviden and ParTec to assemble the machine. Europe's first exascale computer will be installed at the Julich Supercomputing Centre in Munich, and assembly could start as soon as early 2024.

EuroHPC has opted to go with SiPearl's Rhea processor, which is based on ARM architecture. Most of the top 10 supercomputers in the world are running x86 chips, and only one is running on ARM. While ARM designs were initially popular in mobile devices, the compact, efficient cores have found use in more powerful systems. Apple has recently finished moving all its desktop and laptop computers to the ARM platform, and Qualcomm has new desktop-class chips on its roadmap. Rhea is based on ARM's Neoverse V1 CPU design, which was developed specifically for high-performance computing (HPC) applications, and packs 72 cores. It supports HBM2e high-bandwidth memory, as well as DDR5, and the cache tops out at an impressive 160MB.
The report says the Jupiter system "will have Nvidia's Booster Module, which includes GPUs and Mellanox ultra-high bandwidth interconnects," and will likely include the current-gen H100 chips. "When complete, Jupiter will be near the very top of the supercomputer list."
AI

To Build Their AI Tech, Microsoft and Google are Using a Lot of Water (apnews.com) 73

An anonymous Slashdot reader shares this report from the Associated Press: The cost of building an artificial intelligence product like ChatGPT can be hard to measure. But one thing Microsoft-backed OpenAI needed for its technology was plenty of water, pulled from the watershed of the Raccoon and Des Moines rivers in central Iowa to cool a powerful supercomputer as it helped teach its AI systems how to mimic human writing.

As they race to capitalize on a craze for generative AI, leading tech developers including Microsoft, OpenAI and Google have acknowledged that growing demand for their AI tools carries hefty costs, from expensive semiconductors to an increase in water consumption. But they're often secretive about the specifics. Few people in Iowa knew about its status as a birthplace of OpenAI's most advanced large language model, GPT-4, before a top Microsoft executive said in a speech it "was literally made next to cornfields west of Des Moines."

Building a large language model requires analyzing patterns across a huge trove of human-written text. All of that computing takes a lot of electricity and generates a lot of heat. To keep it cool on hot days, data centers need to pump in water — often to a cooling tower outside its warehouse-sized buildings. In its latest environmental report, Microsoft disclosed that its global water consumption spiked 34% from 2021 to 2022 (to nearly 1.7 billion gallons, or more than 2,500 Olympic-sized swimming pools), a sharp increase compared to previous years that outside researchers tie to its AI research. "It's fair to say the majority of the growth is due to AI," including "its heavy investment in generative AI and partnership with OpenAI," said Shaolei Ren, a researcher at the University of California, Riverside who has been trying to calculate the environmental impact of generative AI products such as ChatGPT. In a paper due to be published later this year, Ren's team estimates ChatGPT gulps up 500 milliliters of water (close to what's in a 16-ounce water bottle) every time you ask it a series of between 5 and 50 prompts or questions...
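The swimming-pool comparison checks out arithmetically. A small sketch, assuming the standard 2,500 cubic metre volume for an Olympic pool (the report doesn't define the unit):

```python
# Sanity check on "nearly 1.7 billion gallons ~ more than 2,500 Olympic pools".
GALLONS_PER_CUBIC_METRE = 264.172
olympic_pool_m3 = 2_500            # 50m x 25m x 2m minimum volume (assumption)
microsoft_gallons = 1.7e9

pools = microsoft_gallons / (olympic_pool_m3 * GALLONS_PER_CUBIC_METRE)
print(f"~{pools:,.0f} Olympic-sized pools")   # ~2,574
```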

Google reported a 20% growth in water use in the same period, which Ren also largely attributes to its AI work.

OpenAI and Microsoft both said they were working on improving "efficiencies" of their AI model-training.
Supercomputing

Can Computing Clean Up Its Act? (economist.com) 107

Long-time Slashdot reader SpzToid shares a report from The Economist: "What you notice first is how silent it is," says Kimmo Koski, the boss of the Finnish IT Centre for Science. Dr Koski is describing LUMI -- Finnish for "snow" -- the most powerful supercomputer in Europe, which sits 250km south of the Arctic Circle in the town of Kajaani in Finland. LUMI, which was inaugurated last year, is used for everything from climate modeling to searching for new drugs. It has tens of thousands of individual processors and is capable of performing up to 429 quadrillion calculations every second. That makes it the third-most-powerful supercomputer in the world. Powered by hydroelectricity, and with its waste heat used to help warm homes in Kajaani, it even boasts negative emissions of carbon dioxide. LUMI offers a glimpse of the future of high-performance computing (HPC), both on dedicated supercomputers and in the cloud infrastructure that runs much of the internet. Over the past decade the demand for HPC has boomed, driven by technologies like machine learning, genome sequencing and simulations of everything from stockmarkets and nuclear weapons to the weather. It is likely to carry on rising, for such applications will happily consume as much computing power as you can throw at them. Over the same period the amount of computing power required to train a cutting-edge AI model has been doubling every five months. All this has implications for the environment.

HPC -- and computing more generally -- is becoming a big user of energy. The International Energy Agency reckons data centers account for between 1.5% and 2% of global electricity consumption, roughly the same as the entire British economy. That is expected to rise to 4% by 2030. With its eye on government pledges to reduce greenhouse-gas emissions, the computing industry is trying to find ways to do more with less and boost the efficiency of its products. The work is happening at three levels: that of individual microchips; of the computers that are built from those chips; and the data centers that, in turn, house the computers. [...] The standard measure of a data centre's efficiency is the power usage effectiveness (PUE), the ratio between the data centre's overall power consumption and how much of that is used to do useful work. According to the Uptime Institute, a firm of IT advisers, a typical data centre has a PUE of 1.58. That means that about two-thirds of its electricity goes to running its computers while a third goes to running the data centre itself, most of which will be consumed by its cooling systems. Clever design can push that number much lower.

Most existing data centers rely on air cooling. Liquid cooling offers better heat transfer, at the cost of extra engineering effort. Several startups even offer to submerge circuit boards entirely in specially designed liquid baths. Thanks in part to its use of liquid cooling, Frontier boasts a PUE of 1.03. One reason LUMI was built near the Arctic Circle was to take advantage of the cool sub-Arctic air. A neighboring computer, built in the same facility, makes use of that free cooling to reach a PUE rating of just 1.02. That means 98% of the electricity that comes in gets turned into useful mathematics. Even the best commercial data centers fall short of such numbers. Google's, for instance, have an average PUE value of 1.1. The latest numbers from the Uptime Institute, published in June, show that, after several years of steady improvement, global data-centre efficiency has been stagnant since 2018.
The report notes that the U.S., Britain and the European Union, among others, are considering new rules that "could force data centers to become more efficient." Germany has proposed the Energy Efficiency Act, which would mandate a maximum PUE of 1.5 by 2027, and 1.3 by 2030.
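PUE arithmetic is simple: the fraction of a facility's electricity that actually reaches the computers is 1/PUE. A minimal illustration using the figures quoted above:

```python
# Fraction of a data centre's electricity that reaches the IT equipment,
# for the PUE values quoted above (PUE = total facility power / IT power).
for label, pue in [("Typical data centre (Uptime Institute)", 1.58),
                   ("Google fleet average", 1.10),
                   ("Frontier", 1.03),
                   ("LUMI's neighbour", 1.02),
                   ("Proposed German cap, 2030", 1.30)]:
    print(f"{label:40s} PUE {pue:.2f} -> {100/pue:.0f}% to compute")
```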
Supercomputing

Cerebras To Enable 'Condor Galaxy' Network of AI Supercomputers 20

Cerebras Systems and G42 have introduced the Condor Galaxy project, a network of nine interconnected supercomputers designed for AI model training with a combined performance of 36 FP16 ExaFLOPs. The first supercomputer, CG-1, located in California, offers 4 ExaFLOPs of FP16 performance and 54 million cores, focusing on Large Language Models and Generative AI without the need for complex distributed programming languages. AnandTech reports: CG-2 and CG-3 will be located in the U.S. and will follow in 2024. The remaining systems will be located across the globe and the total cost of the project will be over $900 million. The CG-1 supercomputer, situated in Santa Clara, California, combines 64 Cerebras CS-2 systems into a single user-friendly AI supercomputer, capable of providing 4 ExaFLOPs of dense, systolic FP16 compute for AI training. Based around Cerebras's 2.6 trillion transistor second-generation wafer scale engine processors, the machine is designed specifically for Large Language Models and Generative AI. It supports up to 600 billion parameter models, with configurations that can be expanded to support up to 100 trillion parameter models. Its 54 million AI-optimized compute cores and massive fabric network bandwidth of 388 Tb/s allow for nearly linear performance scaling from 1 to 64 CS-2 systems, according to Cerebras. The CG-1 supercomputer also offers inherent support for long sequence length training (up to 50,000 tokens) and does not require any complex distributed programming languages, which are commonly needed for GPU clusters.

This supercomputer is provided as a cloud service by Cerebras and G42 and since it is located in the U.S., Cerebras and G42 assert that it will not be used by hostile states. CG-1 is the first of three 4 FP16 ExaFLOP AI supercomputers (CG-1, CG-2, and CG-3) created by Cerebras and G42 in collaboration and located in the U.S. Once connected, these three AI supercomputers will form a 12 FP16 ExaFLOP, 162 million core distributed AI supercomputer, though it remains to be seen how efficient this network will be. In 2024, G42 and Cerebras plan to launch six additional Condor Galaxy supercomputers across the world, which will increase the total compute power to 36 FP16 ExaFLOPs delivered by 576 CS-2 systems. The Condor Galaxy project aims to democratize AI by offering sophisticated AI compute technology in the cloud.
"Delivering 4 exaFLOPs of AI compute at FP16, CG-1 dramatically reduces AI training timelines while eliminating the pain of distributed compute," said Andrew Feldman, CEO of Cerebras Systems. "Many cloud companies have announced massive GPU clusters that cost billions of dollars to build, but that are extremely difficult to use. Distributing a single model over thousands of tiny GPUs takes months of time from dozens of people with rare expertise. CG-1 eliminates this challenge. Setting up a generative AI model takes minutes, not months and can be done by a single person. CG-1 is the first of three 4 ExaFLOP AI supercomputers to be deployed across the U.S. Over the next year, together with G42, we plan to expand this deployment and stand up a staggering 36 exaFLOPs of efficient, purpose-built AI compute."
Supercomputing

Tesla Starts Production of Dojo Supercomputer To Train Driverless Cars (theverge.com) 45

An anonymous reader quotes a report from The Verge: Tesla says it has started production of its Dojo supercomputer to train its fleet of autonomous vehicles. In its second quarter earnings report for 2023, the company outlined "four main technology pillars" needed to "solve vehicle autonomy at scale: extremely large real-world dataset, neural net training, vehicle hardware and vehicle software." "We are developing each of these pillars in-house," the company said in its report. "This month, we are taking a step towards faster and cheaper neural net training with the start of production of our Dojo training computer."

The automaker already has a large Nvidia GPU-based supercomputer that is one of the most powerful in the world, but the new Dojo custom-built computer is using chips designed by Tesla. In 2019, Tesla CEO Elon Musk gave this "super powerful training computer" a name: Dojo. Previously, Musk has claimed that Dojo will be capable of an exaflop, or 1 quintillion (10^18) floating-point operations per second. That is an incredible amount of power. "To match what a one exaFLOP computer system can do in just one second, you'd have to perform one calculation every second for 31,688,765,000 years," Network World wrote.
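That "one calculation per second" figure is easy to verify. A minimal sketch:

```python
# One exaFLOP machine does 1e18 operations per second; at one operation
# per second, the same work takes 1e18 seconds.
SECONDS_PER_YEAR = 365.2425 * 24 * 3600   # average Gregorian calendar year
years = 1e18 / SECONDS_PER_YEAR
print(f"~{years:,.0f} years")             # ~31.7 billion, matching the quoted figure
```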

Earth

Study the Risks of Sun-Blocking Aerosols, Say 60 Scientists, the US, the EU, and One Supercomputer (scientificamerican.com) 101

Nine days ago the U.S. government released a report on the advantages of studying "scientific and societal implications" of "solar radiation modification" (or SRM) to explore its possible "risks and benefits...as a component of climate policy."

The report's executive summary seems to concede the technique would "negate (explicitly offset) all current or future impacts of climate change" — but would also introduce "an additional change" to "the existing, complex climate system, with ramifications which are not now well understood." Or, as Politico puts it, "The White House cautiously endorsed the idea of studying how to block sunlight from hitting Earth's surface as a way to limit global warming in a congressionally mandated report that could help bring efforts once confined to science fiction into the realm of legitimate debate."

But again, the report endorsed the idea of studying it — to further understand the risks, and also help prepare for "possible deployment of SRM by other public or private actors." Politico emphasized how this report "added a degree of skepticism by noting that Congress has ordered the review, and the administration said it does not signal any new policy decisions related to a process that is sometimes referred to — or derided as — geoengineering." "Climate change is already having profound effects on the physical and natural world, and on human well-being, and these effects will only grow as greenhouse gas concentrations increase and warming continues," the report said. "Understanding these impacts is crucial to enable informed decisions around a possible role for SRM in addressing human hardships associated with climate change..."

The White House said that any potential research on solar radiation modification should be undertaken with "appropriate international cooperation."

It's not just the U.S. making official statements. Their report was released "the same week that European Union leaders opened the door to international discussions of solar radiation modification," according to Politico's report: Policymakers in the European Union have signaled a willingness to begin international discussions of whether and how humanity could limit heating from the sun. "Guided by the precautionary principle, the EU will support international efforts to assess comprehensively the risks and uncertainties of climate interventions, including solar radiation modification and promote discussions on a potential international framework for its governance, including research related aspects," the European Parliament and European Council said in a joint communication.
And it also "follows an open letter by more than 60 leading scientists calling for more research," reports Scientific American. They also note a new supercomputer helping climate scientists model the effects of injecting human-made, sun-blocking aerosols into the stratosphere: The machine, named Derecho, began operating this month at the National Center for Atmospheric Research (NCAR) and will allow scientists to run more detailed weather models for research on solar geoengineering, said Kristen Rasmussen, a climate scientist at Colorado State University who is studying how human-made aerosols, which can be used to deflect sunlight, could affect rainfall patterns... "To understand specific impacts on thunderstorms, we require the use of very high-resolution models that can be run for many, many years," Rasmussen said in an interview. "This faster supercomputer will enable more simulations at longer time frames and at higher resolution than we can currently support..."

The National Academies of Sciences, Engineering and Medicine released a report in 2021 urging scientists to study the impacts of geoengineering, which Rasmussen described as a last resort to address climate change.

"We need to be very cautious," she said. "I am not advocating in any way to move forward on any of these types of mitigation efforts. The best thing to do is to stop fossil fuel emissions as much as we can."

Google

Quantum Supremacy? Google Claims 70-Qubit Quantum Supercomputer (telegraph.co.uk) 35

Google says it would take the world's leading supercomputer more than 47 years to match the calculation speed of its newest quantum computer, reports the Telegraph: Four years ago, Google claimed to be the first company to achieve "quantum supremacy" — a milestone point at which quantum computers surpass existing machines. This was challenged at the time by rivals, which argued that Google was exaggerating the difference between its machine and traditional supercomputers. The company's new paper — Phase Transition in Random Circuit Sampling — published on the open access science website ArXiv, demonstrates a more powerful device that aims to end the debate.

While [Google's] 2019 machine had 53 qubits, the building blocks of quantum computers, the next generation device has 70. Adding more qubits improves a quantum computer's power exponentially, meaning the new machine is 241 million times more powerful than the 2019 machine...
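The exponential claim reflects how the state space grows with qubit count: a classical simulator has to track 2^n complex amplitudes for n qubits. A minimal sketch of that scaling; note that the quoted 241-million figure is presumably derived from more than raw state-space size (the article doesn't say how it was calculated):

```python
# State-space growth from 53 to 70 qubits: a classical simulator must track
# 2**n complex amplitudes for n qubits.
old_qubits, new_qubits = 53, 70
growth = 2 ** (new_qubits - old_qubits)
print(f"State space grows by 2^{new_qubits - old_qubits} = {growth:,}x")   # 131,072x

# Memory just to store the amplitudes at double precision (16 bytes each):
print(f"70-qubit state: ~{2**new_qubits * 16 / 2**60:.0f} EiB of RAM")
```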

Steve Brierley, the chief executive of Cambridge-based quantum company Riverlane, said: "This is a major milestone. The squabbling about whether we had reached, or indeed could reach, quantum supremacy is now resolved."

Thanks to long-time Slashdot reader schwit1 for sharing the article.
Supercomputing

Inflection AI Develops Supercomputer Equipped With 22,000 Nvidia H100 AI GPUs 28

Inflection AI, an AI startup company, has built a cutting-edge supercomputer equipped with 22,000 NVIDIA H100 GPUs. Wccftech reports: For those unfamiliar with Inflection AI, it is a business that aims at creating "personal AI for everyone." The company is widely known for its recently introduced Inflection-1 AI model, which powers the Pi chatbot. Although the AI model hasn't yet reached the level of ChatGPT or Google's LaMDA models, reports suggest that Inflection-1 performs well on "common sense" tasks, making it much more suitable for applications such as personal assistance.

Coming back, Inflection announced that it is building one of the world's largest AI-based supercomputers, and it looks like we finally have a glimpse of what it would be. It is reported that the Inflection supercomputer is equipped with 22,000 H100 GPUs, and based on analysis, it would contain almost 700 four-node racks of Intel Xeon CPUs. The supercomputer will utilize an astounding 31 megawatts of power.
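The "almost 700 racks" estimate is consistent with a common deployment pattern. A hedged sketch, assuming 8 H100 GPUs per node and the four-node racks mentioned above (the per-node GPU count is an assumption, not something Inflection has confirmed):

```python
# How ~22,000 H100s map onto "almost 700 four-node racks" under assumed density.
total_gpus     = 22_000
gpus_per_node  = 8      # assumption: typical 8-GPU H100 node
nodes_per_rack = 4      # as stated in the analysis

racks = total_gpus / (gpus_per_node * nodes_per_rack)
print(f"~{racks:.0f} racks")                                           # ~688
print(f"Average power budget per GPU: {31e6 / total_gpus:.0f} W")      # ~1.4 kW incl. overhead
```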

The surprising fact about the supercomputer is the acquisition of 22,000 NVIDIA H100 GPUs. We all are well aware that, in recent times, it has been challenging to acquire even a single unit of the H100s since they are in immense demand, and NVIDIA cannot cope with the influx of orders. In the case of Inflection AI, NVIDIA is considering being an investor in the company, which is why in their case, it is easier to get their hands on such a massive number of GPUs.
Open Source

Peplum: F/OSS Distributed Parallel Computing and Supercomputing At Home With Ruby Infrastructure (ecsypno.com) 20

Slashdot reader Zapotek brings an update from the Ecsypno skunkworks, where they've been busy with R&D for distributed computing systems: Armed with Cuboid, they built Qmap, which tackled the handling of nmap in a distributed environment, with great results. Afterwards, an iterative clean-up process led to a template of sorts for scheduling most applications in such environments.

With that, Peplum was born, which allows for OS applications, Ruby code and C/C++/Rust code (via Ruby extensions) to be distributed across machines and tackle the processing of neatly grouped objects.

In essence, Peplum:

- Is a distributed computing solution backed by Cuboid.
- Its basic function is to distribute workloads and deliver payloads across multiple machines and thus parallelize otherwise time-consuming tasks.
- Allows you to combine several machines and build a cluster/supercomputer of sorts with great ease.

After that was dealt with, it was time to port Qmap over to Peplum for easier long-term maintenance, thus renamed Peplum::Nmap.

We have high hopes for Peplum as it basically means easy, simple and joyful cloud/clustering/super-computing at home, on-premise, anywhere really. Along with the capability to turn a lot of security oriented apps into super versions of themselves, it is quite the infrastructure.

Yes, this means there's a new solution if you're using multiple machines for "running simulations, to network mapping/security scans, to password cracking/recovery or just encoding your collection of music and video" -- or anything else: Peplum is a F/OSS (MIT licensed) project aimed at making clustering/super-computing affordable and accessible, by making it simple to setup a distributed parallel computing environment for abstract applications... TLDR: You no longer have to only imagine a Beowulf cluster of those, you can now easily build one yourself with Peplum.
Some technical specs: It is written in the Ruby programming language, thus coming with an entire ecosystem of libraries and the capability to run abstract Ruby code, execute external utilities, run OS commands, call C/C++/Rust routines and more...

Peplum is powered by Cuboid, a F/OSS (MIT licensed) abstract framework for distributed computing — both of them are funded by Ecsypno Single Member P.C., a new R&D and Consulting company.

Supercomputing

IBM Wants To Build a 100,000-Qubit Quantum Computer (technologyreview.com) 27

IBM has announced its goal to build a 100,000-qubit quantum computing machine within the next 10 years in collaboration with the University of Tokyo and the University of Chicago. MIT Technology Review reports: Late last year, IBM took the record for the largest quantum computing system with a processor that contained 433 quantum bits, or qubits, the fundamental building blocks of quantum information processing. Now, the company has set its sights on a much bigger target: a 100,000-qubit machine that it aims to build within 10 years. IBM made the announcement on May 22 at the G7 summit in Hiroshima, Japan. The company will partner with the University of Tokyo and the University of Chicago in a $100 million initiative to push quantum computing into the realm of full-scale operation, where the technology could potentially tackle pressing problems that no standard supercomputer can solve.

Or at least it can't solve them alone. The idea is that the 100,000 qubits will work alongside the best "classical" supercomputers to achieve new breakthroughs in drug discovery, fertilizer production, battery performance, and a host of other applications. "I call this quantum-centric supercomputing," IBM's VP of quantum, Jay Gambetta, told MIT Technology Review in an in-person interview in London last week. [...] IBM has already done proof-of-principle experiments (PDF) showing that integrated circuits based on "complementary metal oxide semiconductor" (CMOS) technology can be installed next to the cold qubits to control them with just tens of milliwatts. Beyond that, he admits, the technology required for quantum-centric supercomputing does not yet exist: that is why academic research is a vital part of the project.

The qubits will exist on a type of modular chip that is only just beginning to take shape in IBM labs. Modularity, essential when it will be impossible to put enough qubits on a single chip, requires interconnects that transfer quantum information between modules. IBM's "Kookaburra," a 1,386-qubit multichip processor with a quantum communication link, is under development and slated for release in 2025. Other necessary innovations are where the universities come in. Researchers at Tokyo and Chicago have already made significant strides in areas such as components and communication innovations that could be vital parts of the final product, Gambetta says. He thinks there will likely be many more industry-academic collaborations to come over the next decade. "We have to help the universities do what they do best," he says.

Intel

Intel Gives Details on Future AI Chips as It Shifts Strategy (reuters.com) 36

Intel on Monday provided a handful of new details on a chip for artificial intelligence (AI) computing it plans to introduce in 2025 as it shifts strategy to compete against Nvidia and Advanced Micro Devices. From a report: At a supercomputing conference in Germany on Monday, Intel said its forthcoming "Falcon Shores" chip will have 288 gigabytes of memory and support 8-bit floating point computation. Those technical specifications are important as artificial intelligence models similar to services like ChatGPT have exploded in size, and businesses are looking for more powerful chips to run them.
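The 288 GB figure matters because inference capacity is largely bounded by weight storage. A rough sketch of what fits on a single such chip at different precisions, ignoring activations, KV caches and other overhead:

```python
# Parameters that fit in 288 GB of on-package memory, counting weights only.
memory_gb = 288
for precision, bytes_per_param in [("FP16", 2), ("FP8", 1)]:
    params_billion = memory_gb * 1e9 / bytes_per_param / 1e9
    print(f"{precision}: ~{params_billion:.0f}B parameters fit in {memory_gb} GB")
```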

The details are also among the first to trickle out as Intel carries out a strategy shift to catch up to Nvidia, which leads the market in chips for AI, and AMD, which is expected to challenge Nvidia's position with a chip called the MI300. Intel, by contrast, has essentially no market share after its would-be Nvidia competitor, a chip called Ponte Vecchio, suffered years of delays. Intel on Monday said it has nearly completed shipments for Argonne National Lab's Aurora supercomputer based on Ponte Vecchio, which Intel claims has better performance than Nvidia's latest AI chip, the H100. But Intel's Falcon Shores follow-on chip won't come to market until 2025, when Nvidia will likely have another chip of its own out.

AMD

AMD Now Powers 121 of the World's Fastest Supercomputers (tomshardware.com) 22

The Top 500 list of the fastest supercomputers in the world was released today, and AMD continues its streak of impressive wins with 121 systems now powered by AMD's silicon -- a year-over-year increase of 29%. From a report: Additionally, AMD continues to hold the #1 spot on the Top 500 with the Frontier supercomputer, while the test and development system based on the same architecture continues to hold the second spot in power efficiency metrics on the Green 500 list. Overall, AMD also powers seven of the top ten systems on the Green 500 list. The AMD-powered Frontier remains the only fully-qualified exascale-class supercomputer on the planet, as the Intel-powered two-exaflop Aurora has still not submitted a benchmark result after years of delays.

In contrast, Frontier is now fully operational and is being used by researchers in a multitude of science workloads. In fact, Frontier continues to improve from tuning -- the system entered the Top 500 list with 1.102 exaflops of performance in June 2022 but has now improved to 1.194 exaflops, an 8% increase. That's an impressive increase from the same 8,699,904 CPU cores it debuted with. For perspective, that extra 92 petaflops of performance from tuning represents the same amount of computational horsepower as the entire Perlmutter system that ranks eighth on the Top 500.

AI

Meta's Building an In-House AI Chip to Compete with Other Tech Giants (techcrunch.com) 17

An anonymous reader shared this report from the Verge: Meta is building its first custom chip specifically for running AI models, the company announced on Thursday. As Meta increases its AI efforts — CEO Mark Zuckerberg recently said the company sees "an opportunity to introduce AI agents to billions of people in ways that will be useful and meaningful" — the chip and other infrastructure plans revealed Thursday could be critical tools for Meta to compete with other tech giants also investing significant resources into AI.

Meta's new MTIA chip, which stands for Meta Training and Inference Accelerator, is its "in-house, custom accelerator chip family targeting inference workloads," Meta VP and head of infrastructure Santosh Janardhan wrote in a blog post... But the MTIA chip is seemingly a long ways away: it's not set to come out until 2025, TechCrunch reports.

Meta has been working on "a massive project to upgrade its AI infrastructure in the past year," Reuters reports, "after executives realized it lacked the hardware and software to support demand from product teams building AI-powered features."

As a result, the company scrapped plans for a large-scale rollout of an in-house inference chip and started work on a more ambitious chip capable of performing training and inference, Reuters reported...

Meta said it has an AI-powered system to help its engineers create computer code, similar to tools offered by Microsoft, Amazon and Alphabet.

TechCrunch calls these announcements "an attempt at a projection of strength from Meta, which historically has been slow to adopt AI-friendly hardware systems — hobbling its ability to keep pace with rivals such as Google and Microsoft."

Meta's VP of Infrastructure told TechCrunch "This level of vertical integration is needed to push the boundaries of AI research at scale." Over the past decade or so, Meta has spent billions of dollars recruiting top data scientists and building new kinds of AI, including AI that now powers the discovery engines, moderation filters and ad recommenders found throughout its apps and services. But the company has struggled to turn many of its more ambitious AI research innovations into products, particularly on the generative AI front. Until 2022, Meta largely ran its AI workloads using a combination of CPUs — which tend to be less efficient for those sorts of tasks than GPUs — and a custom chip designed for accelerating AI algorithms...

The MTIA is an ASIC, a kind of chip that combines different circuits on one board, allowing it to be programmed to carry out one or many tasks in parallel... Custom AI chips are increasingly the name of the game among the Big Tech players. Google created a processor, the TPU (short for "tensor processing unit"), to train large generative AI systems like PaLM-2 and Imagen. Amazon offers proprietary chips to AWS customers both for training (Trainium) and inferencing (Inferentia). And Microsoft, reportedly, is working with AMD to develop an in-house AI chip called Athena.

Meta says that it created the first generation of the MTIA — MTIA v1 — in 2020, built on a 7-nanometer process. It can scale beyond its internal 128 MB of memory to up to 128 GB, and in a Meta-designed benchmark test — which, of course, has to be taken with a grain of salt — Meta claims that the MTIA handled "low-complexity" and "medium-complexity" AI models more efficiently than a GPU. Work remains to be done in the memory and networking areas of the chip, Meta says, which present bottlenecks as the size of AI models grow, requiring workloads to be split up across several chips. (Not coincidentally, Meta recently acquired an Oslo-based team building AI networking tech at British chip unicorn Graphcore.) And for now, the MTIA's focus is strictly on inference — not training — for "recommendation workloads" across Meta's app family...

If there's a common thread in today's hardware announcements, it's that Meta's attempting desperately to pick up the pace where it concerns AI, specifically generative AI... In part, Meta's feeling increasing pressure from investors concerned that the company's not moving fast enough to capture the (potentially large) market for generative AI. It has no answer — yet — to chatbots like Bard, Bing Chat or ChatGPT. Nor has it made much progress on image generation, another key segment that's seen explosive growth.

If the predictions are right, the total addressable market for generative AI software could be $150 billion. Goldman Sachs predicts that it'll raise GDP by 7%. Even a small slice of that could erase the billions Meta's lost in investments in "metaverse" technologies like augmented reality headsets, meetings software and VR playgrounds like Horizon Worlds.
