China

Chinese Scientists Claim Record-Smashing Quantum Computing Breakthrough (scmp.com) 44

From the South China Morning Post: Scientists in China say their latest quantum computer has solved an ultra-complicated mathematical problem within a millionth of a second — a task the researchers say would take the world's fastest supercomputer more than 20 billion years. The JiuZhang 3 prototype also smashed the record set by its predecessor in the series, with a one million-fold increase in calculation speed, according to a paper published on Tuesday by the peer-reviewed journal Physical Review Letters...

The series uses photons — tiny particles that travel at the speed of light — as the physical medium for calculations, with each one carrying a qubit, the basic unit of quantum information... The fastest classical supercomputer Frontier — developed in the US and named the world's most powerful in mid-2022 — would take over 20 billion years to complete the same task, the researchers said.
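The quoted comparison implies an enormous speedup factor. A back-of-the-envelope sketch (using a Julian year of 365.25 days, an assumption; the task-time figures are from the article):

```python
# Claimed advantage: ~1 microsecond on JiuZhang 3 vs ~20 billion years on Frontier.
SECONDS_PER_YEAR = 365.25 * 24 * 3600      # Julian year, an assumption
frontier_seconds = 20e9 * SECONDS_PER_YEAR  # ~6.3e17 seconds
quantum_seconds = 1e-6                      # a millionth of a second

speedup = frontier_seconds / quantum_seconds
print(f"~{speedup:.1e}x")  # on the order of 10^23
```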

The article says the first two versions of the machine used 76 and then 113 photons, while the latest iteration uses 255.

Thanks to long-time Slashdot reader hackingbear for sharing the news.
Supercomputing

Europe's First Exascale Supercomputer Will Run On ARM Instead of X86 (extremetech.com) 40

An anonymous reader quotes a report from ExtremeTech: One of the world's most powerful supercomputers will soon be online in Europe, but it's not just the raw speed that will make the Jupiter supercomputer special. Unlike most of the Top 500 list, the exascale Jupiter system will rely on ARM cores instead of x86 parts. Intel and AMD might be disappointed, but Nvidia will get a piece of the Jupiter action. [...] Jupiter is a project of the European High-Performance Computing Joint Undertaking (EuroHPC JU), which is working with computing firms Eviden and ParTec to assemble the machine. Europe's first exascale computer will be installed at the Jülich Supercomputing Centre in Jülich, Germany, and assembly could start as soon as early 2024.

EuroHPC has opted to go with SiPearl's Rhea processor, which is based on the ARM architecture. Most of the top 10 supercomputers in the world run x86 chips, and only one runs on ARM. While ARM designs were initially popular in mobile devices, the compact, efficient cores have found use in more powerful systems. Apple has recently finished moving all its desktop and laptop computers to the ARM platform, and Qualcomm has new desktop-class chips on its roadmap. Rhea is based on ARM's Neoverse V1 CPU design, which was developed specifically for high-performance computing (HPC) applications; the chip has 72 cores. It supports HBM2e high-bandwidth memory as well as DDR5, and its cache tops out at an impressive 160MB.
The report says the Jupiter system "will have Nvidia's Booster Module, which includes GPUs and Mellanox ultra-high bandwidth interconnects," and will likely include the current-gen H100 chips. "When complete, Jupiter will be near the very top of the supercomputer list."
AI

To Build Their AI Tech, Microsoft and Google are Using a Lot of Water (apnews.com) 73

An anonymous Slashdot reader shares this report from the Associated Press: The cost of building an artificial intelligence product like ChatGPT can be hard to measure. But one thing Microsoft-backed OpenAI needed for its technology was plenty of water, pulled from the watershed of the Raccoon and Des Moines rivers in central Iowa to cool a powerful supercomputer as it helped teach its AI systems how to mimic human writing.

As they race to capitalize on a craze for generative AI, leading tech developers including Microsoft, OpenAI and Google have acknowledged that growing demand for their AI tools carries hefty costs, from expensive semiconductors to an increase in water consumption. But they're often secretive about the specifics. Few people in Iowa knew about its status as a birthplace of OpenAI's most advanced large language model, GPT-4, before a top Microsoft executive said in a speech it "was literally made next to cornfields west of Des Moines."

Building a large language model requires analyzing patterns across a huge trove of human-written text. All of that computing takes a lot of electricity and generates a lot of heat. To keep it cool on hot days, data centers need to pump in water — often to a cooling tower outside their warehouse-sized buildings. In its latest environmental report, Microsoft disclosed that its global water consumption spiked 34% from 2021 to 2022 (to nearly 1.7 billion gallons, or more than 2,500 Olympic-sized swimming pools), a sharp increase compared to previous years that outside researchers tie to its AI research. "It's fair to say the majority of the growth is due to AI," including "its heavy investment in generative AI and partnership with OpenAI," said Shaolei Ren, a researcher at the University of California, Riverside who has been trying to calculate the environmental impact of generative AI products such as ChatGPT. In a paper due to be published later this year, Ren's team estimates ChatGPT gulps up 500 milliliters of water (close to what's in a 16-ounce water bottle) every time you ask it a series of between 5 and 50 prompts or questions...
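Ren's estimate implies a rough per-prompt range. A quick sketch (assuming, for illustration, that the 500 ml figure applies uniformly across the stated prompt range):

```python
# Back-of-the-envelope water cost per prompt, from the reported
# figure of 500 ml per series of 5-50 prompts.
WATER_ML = 500
PROMPTS_LOW, PROMPTS_HIGH = 5, 50

per_prompt_high = WATER_ML / PROMPTS_LOW   # 100 ml/prompt (short series)
per_prompt_low = WATER_ML / PROMPTS_HIGH   # 10 ml/prompt (long series)

print(f"Roughly {per_prompt_low:.0f}-{per_prompt_high:.0f} ml of water per prompt")
```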

Google reported a 20% growth in water use in the same period, which Ren also largely attributes to its AI work.

OpenAI and Microsoft both said they were working on improving "efficiencies" of their AI model-training.
Supercomputing

Can Computing Clean Up Its Act? (economist.com) 107

Long-time Slashdot reader SpzToid shares a report from The Economist: "What you notice first is how silent it is," says Kimmo Koski, the boss of the Finnish IT Centre for Science. Dr Koski is describing LUMI -- Finnish for "snow" -- the most powerful supercomputer in Europe, which sits 250km south of the Arctic Circle in the town of Kajaani in Finland. LUMI, which was inaugurated last year, is used for everything from climate modeling to searching for new drugs. It has tens of thousands of individual processors and is capable of performing up to 429 quadrillion calculations every second. That makes it the third-most-powerful supercomputer in the world. Powered by hydroelectricity, and with its waste heat used to help warm homes in Kajaani, it even boasts negative emissions of carbon dioxide. LUMI offers a glimpse of the future of high-performance computing (HPC), both on dedicated supercomputers and in the cloud infrastructure that runs much of the internet. Over the past decade the demand for HPC has boomed, driven by technologies like machine learning, genome sequencing and simulations of everything from stockmarkets and nuclear weapons to the weather. It is likely to carry on rising, for such applications will happily consume as much computing power as you can throw at them. Over the same period the amount of computing power required to train a cutting-edge AI model has been doubling every five months. All this has implications for the environment.

HPC -- and computing more generally -- is becoming a big user of energy. The International Energy Agency reckons data centres account for between 1.5% and 2% of global electricity consumption, roughly the same as the entire British economy. That is expected to rise to 4% by 2030. With its eye on government pledges to reduce greenhouse-gas emissions, the computing industry is trying to find ways to do more with less and boost the efficiency of its products. The work is happening at three levels: that of individual microchips; of the computers that are built from those chips; and of the data centres that, in turn, house the computers. [...] The standard measure of a data centre's efficiency is the power usage effectiveness (PUE), the ratio between a data centre's overall power consumption and the portion of it used to do useful work. According to the Uptime Institute, a firm of IT advisers, a typical data centre has a PUE of 1.58. That means that about two-thirds of its electricity goes to running its computers while a third goes to running the data centre itself, most of which will be consumed by its cooling systems. Clever design can push that number much lower.
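The PUE arithmetic in the passage can be sketched directly (the 1.58, 1.03 and 1.02 figures are the article's; the breakdown formula is the standard PUE definition):

```python
# PUE = total facility power / power delivered to the IT equipment.
# The useful fraction of incoming electricity is therefore 1 / PUE.
def useful_fraction(pue: float) -> float:
    return 1.0 / pue

for name, pue in [("Typical data centre", 1.58),
                  ("Frontier", 1.03),
                  ("LUMI's neighbour", 1.02)]:
    print(f"{name}: PUE {pue} -> {useful_fraction(pue):.1%} of power does useful work")
```

A PUE of 1.58 works out to about 63% of power reaching the computers (the article's "about two-thirds"), while 1.02 yields the quoted 98%.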

Most existing data centres rely on air cooling. Liquid cooling offers better heat transfer, at the cost of extra engineering effort. Several startups even offer to submerge circuit boards entirely in specially designed liquid baths. Thanks in part to its use of liquid cooling, Frontier boasts a PUE of 1.03. One reason LUMI was built near the Arctic Circle was to take advantage of the cool sub-Arctic air. A neighboring computer, built in the same facility, makes use of that free cooling to reach a PUE rating of just 1.02. That means 98% of the electricity that comes in gets turned into useful mathematics. Even the best commercial data centres fall short of such numbers. Google's, for instance, have an average PUE value of 1.1. The latest numbers from the Uptime Institute, published in June, show that, after several years of steady improvement, global data-centre efficiency has been stagnant since 2018.
The report notes that the U.S., Britain and the European Union, among others, are considering new rules that "could force data centers to become more efficient." Germany's proposed Energy Efficiency Act would mandate a PUE of at most 1.5 by 2027, and 1.3 by 2030.
Supercomputing

Cerebras To Enable 'Condor Galaxy' Network of AI Supercomputers 20

Cerebras Systems and G42 have introduced the Condor Galaxy project, a network of nine interconnected supercomputers designed for AI model training with a combined performance of 36 FP16 ExaFLOPs. The first supercomputer, CG-1, located in California, offers 4 ExaFLOPs of FP16 performance and 54 million cores, focusing on Large Language Models and Generative AI without the need for complex distributed programming languages. AnandTech reports: CG-2 and CG-3 will be located in the U.S. and will follow in 2024. The remaining systems will be located across the globe and the total cost of the project will be over $900 million. The CG-1 supercomputer, situated in Santa Clara, California, combines 64 Cerebras CS-2 systems into a single user-friendly AI supercomputer, capable of providing 4 ExaFLOPs of dense, systolic FP16 compute for AI training. Based around Cerebras's 2.6 trillion transistor second-generation wafer scale engine processors, the machine is designed specifically for Large Language Models and Generative AI. It supports up to 600 billion parameter models, with configurations that can be expanded to support up to 100 trillion parameter models. Its 54 million AI-optimized compute cores and massive fabric network bandwidth of 388 Tb/s allow for nearly linear performance scaling from 1 to 64 CS-2 systems, according to Cerebras. The CG-1 supercomputer also offers inherent support for long sequence length training (up to 50,000 tokens) and does not require any complex distributed programming languages, as is common with GPU clusters.
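The headline numbers are internally consistent, as a quick sketch shows (all inputs are the announced figures; the per-system values are derived for illustration, not quoted by Cerebras):

```python
# Condor Galaxy arithmetic from the announced figures.
CS2_PER_SYSTEM = 64              # CS-2 units per CG supercomputer
SYSTEMS = 9                      # CG-1 through CG-9
EXAFLOPS_PER_SYSTEM = 4          # FP16
CORES_PER_SYSTEM = 54_000_000

total_cs2 = CS2_PER_SYSTEM * SYSTEMS                   # 576 CS-2 systems, as announced
total_exaflops = EXAFLOPS_PER_SYSTEM * SYSTEMS         # 36 FP16 ExaFLOPs
per_cs2_pflops = EXAFLOPS_PER_SYSTEM * 1000 / CS2_PER_SYSTEM  # 62.5 PFLOPs per CS-2

print(total_cs2, total_exaflops, per_cs2_pflops)
```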

This supercomputer is provided as a cloud service by Cerebras and G42 and since it is located in the U.S., Cerebras and G42 assert that it will not be used by hostile states. CG-1 is the first of three 4 FP16 ExaFLOP AI supercomputers (CG-1, CG-2, and CG-3) created by Cerebras and G42 in collaboration and located in the U.S. Once connected, these three AI supercomputers will form a 12 FP16 ExaFLOP, 162 million core distributed AI supercomputer, though it remains to be seen how efficient this network will be. In 2024, G42 and Cerebras plan to launch six additional Condor Galaxy supercomputers across the world, which will increase the total compute power to 36 FP16 ExaFLOPs delivered by 576 CS-2 systems. The Condor Galaxy project aims to democratize AI by offering sophisticated AI compute technology in the cloud.
"Delivering 4 exaFLOPs of AI compute at FP16, CG-1 dramatically reduces AI training timelines while eliminating the pain of distributed compute," said Andrew Feldman, CEO of Cerebras Systems. "Many cloud companies have announced massive GPU clusters that cost billions of dollars to build, but that are extremely difficult to use. Distributing a single model over thousands of tiny GPUs takes months of time from dozens of people with rare expertise. CG-1 eliminates this challenge. Setting up a generative AI model takes minutes, not months and can be done by a single person. CG-1 is the first of three 4 ExaFLOP AI supercomputers to be deployed across the U.S. Over the next year, together with G42, we plan to expand this deployment and stand up a staggering 36 exaFLOPs of efficient, purpose-built AI compute."
Supercomputing

Tesla Starts Production of Dojo Supercomputer To Train Driverless Cars (theverge.com) 45

An anonymous reader quotes a report from The Verge: Tesla says it has started production of its Dojo supercomputer to train its fleet of autonomous vehicles. In its second quarter earnings report for 2023, the company outlined "four main technology pillars" needed to "solve vehicle autonomy at scale: extremely large real-world dataset, neural net training, vehicle hardware and vehicle software." "We are developing each of these pillars in-house," the company said in its report. "This month, we are taking a step towards faster and cheaper neural net training with the start of production of our Dojo training computer."

The automaker already has a large Nvidia GPU-based supercomputer that is one of the most powerful in the world, but the new Dojo custom-built computer is using chips designed by Tesla. In 2019, Tesla CEO Elon Musk gave this "super powerful training computer" a name: Dojo. Previously, Musk has claimed that Dojo will be capable of an exaflop, or 1 quintillion (10^18) floating-point operations per second. That is an incredible amount of power. "To match what a one exaFLOP computer system can do in just one second, you'd have to perform one calculation every second for 31,688,765,000 years," Network World wrote.
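The Network World figure is easy to sanity-check (using a Julian year of 365.25 days as an assumption; the quoted figure uses a slightly shorter year, so the results differ in the last digits):

```python
# How many years is 10^18 seconds? One exaFLOP machine does 10^18
# operations per second, so at one operation per second the same
# work takes 10^18 seconds.
OPS = 10 ** 18
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, an assumption

years = OPS / SECONDS_PER_YEAR
print(f"{years:.3e} years")  # ~3.17e10, i.e. roughly 31.7 billion years
```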

Earth

Study the Risks of Sun-Blocking Aerosols, Say 60 Scientists, the US, the EU, and One Supercomputer (scientificamerican.com) 101

Nine days ago the U.S. government released a report on the advantages of studying "scientific and societal implications" of "solar radiation modification" (or SRM) to explore its possible "risks and benefits...as a component of climate policy."

The report's executive summary seems to concede the technique would "negate (explicitly offset) all current or future impacts of climate change" — but would also introduce "an additional change" to "the existing, complex climate system, with ramifications which are not now well understood." Or, as Politico puts it, "The White House cautiously endorsed the idea of studying how to block sunlight from hitting Earth's surface as a way to limit global warming in a congressionally mandated report that could help bring efforts once confined to science fiction into the realm of legitimate debate."

But again, the report endorsed the idea of studying it — to further understand the risks, and also help prepare for "possible deployment of SRM by other public or private actors." Politico emphasized how this report "added a degree of skepticism by noting that Congress has ordered the review, and the administration said it does not signal any new policy decisions related to a process that is sometimes referred to — or derided as — geoengineering." "Climate change is already having profound effects on the physical and natural world, and on human well-being, and these effects will only grow as greenhouse gas concentrations increase and warming continues," the report said. "Understanding these impacts is crucial to enable informed decisions around a possible role for SRM in addressing human hardships associated with climate change..."

The White House said that any potential research on solar radiation modification should be undertaken with "appropriate international cooperation."

It's not just the U.S. making official statements. Their report was released "the same week that European Union leaders opened the door to international discussions of solar radiation modification," according to Politico's report: Policymakers in the European Union have signaled a willingness to begin international discussions of whether and how humanity could limit heating from the sun. "Guided by the precautionary principle, the EU will support international efforts to assess comprehensively the risks and uncertainties of climate interventions, including solar radiation modification and promote discussions on a potential international framework for its governance, including research related aspects," the European Parliament and European Council said in a joint communication.
And it also "follows an open letter by more than 60 leading scientists calling for more research," reports Scientific American. The magazine also notes a new supercomputer helping climate scientists model the effects of injecting human-made, sun-blocking aerosols into the stratosphere: The machine, named Derecho, began operating this month at the National Center for Atmospheric Research (NCAR) and will allow scientists to run more detailed weather models for research on solar geoengineering, said Kristen Rasmussen, a climate scientist at Colorado State University who is studying how human-made aerosols, which can be used to deflect sunlight, could affect rainfall patterns... "To understand specific impacts on thunderstorms, we require the use of very high-resolution models that can be run for many, many years," Rasmussen said in an interview. "This faster supercomputer will enable more simulations at longer time frames and at higher resolution than we can currently support..."

The National Academies of Sciences, Engineering and Medicine released a report in 2021 urging scientists to study the impacts of geoengineering, which Rasmussen described as a last resort to address climate change.

"We need to be very cautious," she said. "I am not advocating in any way to move forward on any of these types of mitigation efforts. The best thing to do is to stop fossil fuel emissions as much as we can."

Google

Quantum Supremacy? Google Claims 70-Qubit Quantum Supercomputer (telegraph.co.uk) 35

Google says it would take the world's leading supercomputer more than 47 years to match the calculation speed of its newest quantum computer, reports the Telegraph: Four years ago, Google claimed to be the first company to achieve "quantum supremacy" — a milestone point at which quantum computers surpass existing machines. This was challenged at the time by rivals, which argued that Google was exaggerating the difference between its machine and traditional supercomputers. The company's new paper — Phase Transition in Random Circuit Sampling — posted on the open-access preprint server arXiv, demonstrates a more powerful device that aims to end the debate.

While [Google's] 2019 machine had 53 qubits, the building blocks of quantum computers, the next generation device has 70. Adding more qubits improves a quantum computer's power exponentially, meaning the new machine is 241 million times more powerful than the 2019 machine...
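The exponential claim is easy to illustrate from the classical-simulation side: a full state vector for n qubits holds 2^n complex amplitudes, so each added qubit doubles the memory a brute-force classical simulator would need. A sketch (the 16-bytes-per-amplitude figure assumes double-precision complex values, an assumption; real simulations use cleverer methods than a full state vector):

```python
# Memory required to store a full n-qubit state vector classically.
BYTES_PER_AMPLITUDE = 16  # complex128: two 8-byte floats (assumed)

def state_vector_bytes(n_qubits: int) -> int:
    return (2 ** n_qubits) * BYTES_PER_AMPLITUDE

for n in (53, 70):
    zettabytes = state_vector_bytes(n) / 1e21
    # 53 qubits -> ~144 petabytes; 70 qubits -> ~19 zettabytes
    print(f"{n} qubits -> {zettabytes:.3g} ZB")
```

Going from 53 to 70 qubits multiplies the state-vector size by 2^17, pushing it from the realm of the largest disk systems to far beyond all storage on Earth.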

Steve Brierley, the chief executive of Cambridge-based quantum company Riverlane, said: "This is a major milestone. The squabbling about whether we had reached, or indeed could reach, quantum supremacy is now resolved."

Thanks to long-time Slashdot reader schwit1 for sharing the article.
Supercomputing

Inflection AI Develops Supercomputer Equipped With 22,000 Nvidia H100 AI GPUs 28

Inflection AI, an AI startup, has built a cutting-edge supercomputer equipped with 22,000 NVIDIA H100 GPUs. Wccftech reports: For those unfamiliar with Inflection AI, it is a company that aims to create "personal AI for everyone." The company is widely known for its recently introduced Inflection-1 AI model, which powers the Pi chatbot. Although the AI model hasn't yet reached the level of ChatGPT or Google's LaMDA models, reports suggest that Inflection-1 performs well on "common sense" tasks, making it much more suitable for applications such as personal assistance.
Inflection has announced that it is building one of the world's largest AI supercomputers, and we now have a glimpse of what it will look like. The Inflection supercomputer is reported to be equipped with 22,000 H100 GPUs and, based on analysis, would contain almost 700 four-node racks of Intel Xeon CPUs. The supercomputer will draw an astounding 31 megawatts of power.
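The reported figures roughly cohere, as a sketch shows (the 8-GPUs-per-node value is an assumption based on common H100 server configurations, not something the article states):

```python
# Rough consistency check of the reported Inflection cluster figures.
TOTAL_GPUS = 22_000
GPUS_PER_NODE = 8            # assumed: a typical 8-GPU H100 node
NODES_PER_RACK = 4           # per the report
TOTAL_WATTS = 31_000_000     # 31 MW, per the report

nodes = TOTAL_GPUS / GPUS_PER_NODE       # 2750 nodes
racks = nodes / NODES_PER_RACK           # 687.5 -> "almost 700" racks
watts_per_gpu = TOTAL_WATTS / TOTAL_GPUS # ~1.4 kW per GPU, incl. overhead

print(f"{nodes:.0f} nodes, {racks:.0f} racks, {watts_per_gpu:.0f} W per GPU")
```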

The surprising fact about the supercomputer is the acquisition of 22,000 NVIDIA H100 GPUs. It has lately been challenging to acquire even a single H100, since the chips are in immense demand and NVIDIA cannot keep up with the influx of orders. In Inflection AI's case, NVIDIA is considering investing in the company, which makes it easier for the startup to get its hands on such a massive number of GPUs.
Open Source

Peplum: F/OSS Distributed Parallel Computing and Supercomputing At Home With Ruby Infrastructure (ecsypno.com) 20

Slashdot reader Zapotek brings an update from the Ecsypno skunkworks, where they've been busy with R&D for distributed computing systems: Armed with Cuboid, Qmap was built to handle running nmap in a distributed environment, with great results. Afterwards, an iterative clean-up process led to a template of sorts for scheduling most applications in such environments.

With that, Peplum was born, which allows for OS applications, Ruby code and C/C++/Rust code (via Ruby extensions) to be distributed across machines and tackle the processing of neatly grouped objects.

In essence, Peplum:

- Is a distributed computing solution backed by Cuboid.
- Its basic function is to distribute workloads and deliver payloads across multiple machines, thus parallelizing otherwise time-consuming tasks.
- Allows you to combine several machines and build a cluster/supercomputer of sorts with great ease.

After that was dealt with, it was time to port Qmap over to Peplum for easier long-term maintenance, thus renamed Peplum::Nmap.

We have high hopes for Peplum, as it basically means easy, simple and joyful cloud/clustering/super-computing at home, on-premise, anywhere really. Along with the capability to turn a lot of security-oriented apps into super versions of themselves, it is quite the piece of infrastructure.

Yes, this means there's a new solution if you're using multiple machines for "running simulations, to network mapping/security scans, to password cracking/recovery or just encoding your collection of music and video" -- or anything else: Peplum is a F/OSS (MIT licensed) project aimed at making clustering/super-computing affordable and accessible, by making it simple to set up a distributed parallel computing environment for abstract applications... TLDR: You no longer have to only imagine a Beowulf cluster of those, you can now easily build one yourself with Peplum.
Some technical specs: It is written in the Ruby programming language, thus coming with an entire ecosystem of libraries and the capability to run abstract Ruby code, execute external utilities, run OS commands, call C/C++/Rust routines and more...

Peplum is powered by Cuboid, a F/OSS (MIT licensed) abstract framework for distributed computing — both of them are funded by Ecsypno Single Member P.C., a new R&D and Consulting company.

Supercomputing

IBM Wants To Build a 100,000-Qubit Quantum Computer (technologyreview.com) 27

IBM has announced its goal to build a 100,000-qubit quantum computing machine within the next 10 years in collaboration with the University of Tokyo and the University of Chicago. MIT Technology Review reports: Late last year, IBM took the record for the largest quantum computing system with a processor that contained 433 quantum bits, or qubits, the fundamental building blocks of quantum information processing. Now, the company has set its sights on a much bigger target: a 100,000-qubit machine that it aims to build within 10 years. IBM made the announcement on May 22 at the G7 summit in Hiroshima, Japan. The company will partner with the University of Tokyo and the University of Chicago in a $100 million initiative to push quantum computing into the realm of full-scale operation, where the technology could potentially tackle pressing problems that no standard supercomputer can solve.

Or at least it can't solve them alone. The idea is that the 100,000 qubits will work alongside the best "classical" supercomputers to achieve new breakthroughs in drug discovery, fertilizer production, battery performance, and a host of other applications. "I call this quantum-centric supercomputing," IBM's VP of quantum, Jay Gambetta, told MIT Technology Review in an in-person interview in London last week. [...] IBM has already done proof-of-principle experiments (PDF) showing that integrated circuits based on "complementary metal oxide semiconductor" (CMOS) technology can be installed next to the cold qubits to control them with just tens of milliwatts. Beyond that, he admits, the technology required for quantum-centric supercomputing does not yet exist: that is why academic research is a vital part of the project.

The qubits will exist on a type of modular chip that is only just beginning to take shape in IBM labs. Modularity, essential when it will be impossible to put enough qubits on a single chip, requires interconnects that transfer quantum information between modules. IBM's "Kookaburra," a 1,386-qubit multichip processor with a quantum communication link, is under development and slated for release in 2025. Other necessary innovations are where the universities come in. Researchers at Tokyo and Chicago have already made significant strides in areas such as components and communication innovations that could be vital parts of the final product, Gambetta says. He thinks there will likely be many more industry-academic collaborations to come over the next decade. "We have to help the universities do what they do best," he says.

Intel

Intel Gives Details on Future AI Chips as It Shifts Strategy (reuters.com) 36

Intel on Monday provided a handful of new details on a chip for artificial intelligence (AI) computing it plans to introduce in 2025 as it shifts strategy to compete against Nvidia and Advanced Micro Devices. From a report: At a supercomputing conference in Germany on Monday, Intel said its forthcoming "Falcon Shores" chip will have 288 gigabytes of memory and support 8-bit floating point computation. Those technical specifications are important as artificial intelligence models similar to services like ChatGPT have exploded in size, and businesses are looking for more powerful chips to run them.

The details are also among the first to trickle out as Intel carries out a strategy shift to catch up to Nvidia, which leads the market in chips for AI, and AMD, which is expected to challenge Nvidia's position with a chip called the MI300. Intel, by contrast, has essentially no market share after its would-be Nvidia competitor, a chip called Ponte Vecchio, suffered years of delays. Intel on Monday said it has nearly completed shipments for Argonne National Lab's Aurora supercomputer based on Ponte Vecchio, which Intel claims has better performance than Nvidia's latest AI chip, the H100. But Intel's Falcon Shores follow-on chip won't come to market until 2025, when Nvidia will likely have another chip of its own out.

AMD

AMD Now Powers 121 of the World's Fastest Supercomputers (tomshardware.com) 22

The Top 500 list of the fastest supercomputers in the world was released today, and AMD continues its streak of impressive wins with 121 systems now powered by AMD's silicon -- a year-over-year increase of 29%. From a report: Additionally, AMD continues to hold the #1 spot on the Top 500 with the Frontier supercomputer, while the test and development system based on the same architecture continues to hold the second spot in power efficiency metrics on the Green 500 list. Overall, AMD also powers seven of the top ten systems on the Green 500 list. The AMD-powered Frontier remains the only fully-qualified exascale-class supercomputer on the planet, as the Intel-powered two-exaflop Aurora has still not submitted a benchmark result after years of delays.

In contrast, Frontier is now fully operational and is being used by researchers in a multitude of science workloads. In fact, Frontier continues to improve from tuning -- the system entered the Top 500 list with 1.102 exaflops of performance in June 2022 but has now improved to 1.194 exaflops, a roughly 8% increase. That's an impressive increase from the same 8,699,904 CPU cores it debuted with. For perspective, that extra 92 petaflops of performance from tuning represents the same amount of computational horsepower as the entire Perlmutter system that ranks eighth on the Top 500.

AI

Meta's Building an In-House AI Chip to Compete with Other Tech Giants (techcrunch.com) 17

An anonymous reader shared this report from the Verge: Meta is building its first custom chip specifically for running AI models, the company announced on Thursday. As Meta increases its AI efforts — CEO Mark Zuckerberg recently said the company sees "an opportunity to introduce AI agents to billions of people in ways that will be useful and meaningful" — the chip and other infrastructure plans revealed Thursday could be critical tools for Meta to compete with other tech giants also investing significant resources into AI.

Meta's new MTIA chip, which stands for Meta Training and Inference Accelerator, is its "in-house, custom accelerator chip family targeting inference workloads," Meta VP and head of infrastructure Santosh Janardhan wrote in a blog post... But the MTIA chip is seemingly a long way off: it's not set to come out until 2025, TechCrunch reports.

Meta has been working on "a massive project to upgrade its AI infrastructure in the past year," Reuters reports, "after executives realized it lacked the hardware and software to support demand from product teams building AI-powered features."

As a result, the company scrapped plans for a large-scale rollout of an in-house inference chip and started work on a more ambitious chip capable of performing training and inference, Reuters reported...

Meta said it has an AI-powered system to help its engineers create computer code, similar to tools offered by Microsoft, Amazon and Alphabet.

TechCrunch calls these announcements "an attempt at a projection of strength from Meta, which historically has been slow to adopt AI-friendly hardware systems — hobbling its ability to keep pace with rivals such as Google and Microsoft."

Meta's VP of Infrastructure told TechCrunch "This level of vertical integration is needed to push the boundaries of AI research at scale." Over the past decade or so, Meta has spent billions of dollars recruiting top data scientists and building new kinds of AI, including AI that now powers the discovery engines, moderation filters and ad recommenders found throughout its apps and services. But the company has struggled to turn many of its more ambitious AI research innovations into products, particularly on the generative AI front. Until 2022, Meta largely ran its AI workloads using a combination of CPUs — which tend to be less efficient for those sorts of tasks than GPUs — and a custom chip designed for accelerating AI algorithms...

The MTIA is an ASIC — an application-specific integrated circuit, a chip tailored to carry out one or many tasks in parallel... Custom AI chips are increasingly the name of the game among the Big Tech players. Google created a processor, the TPU (short for "tensor processing unit"), to train large generative AI systems like PaLM-2 and Imagen. Amazon offers proprietary chips to AWS customers both for training (Trainium) and inferencing (Inferentia). And Microsoft, reportedly, is working with AMD to develop an in-house AI chip called Athena.

Meta says that it created the first generation of the MTIA — MTIA v1 — in 2020, built on a 7-nanometer process. It can scale beyond its internal 128 MB of memory to up to 128 GB, and in a Meta-designed benchmark test — which, of course, has to be taken with a grain of salt — Meta claims that the MTIA handled "low-complexity" and "medium-complexity" AI models more efficiently than a GPU. Work remains to be done in the memory and networking areas of the chip, Meta says, which present bottlenecks as the size of AI models grows, requiring workloads to be split up across several chips. (Not coincidentally, Meta recently acquired an Oslo-based team building AI networking tech at British chip unicorn Graphcore.) And for now, the MTIA's focus is strictly on inference — not training — for "recommendation workloads" across Meta's app family...

If there's a common thread in today's hardware announcements, it's that Meta's attempting desperately to pick up the pace where it concerns AI, specifically generative AI... In part, Meta's feeling increasing pressure from investors concerned that the company's not moving fast enough to capture the (potentially large) market for generative AI. It has no answer — yet — to chatbots like Bard, Bing Chat or ChatGPT. Nor has it made much progress on image generation, another key segment that's seen explosive growth.

If the predictions are right, the total addressable market for generative AI software could be $150 billion. Goldman Sachs predicts that it'll raise GDP by 7%. Even a small slice of that could erase the billions Meta's lost in investments in "metaverse" technologies like augmented reality headsets, meetings software and VR playgrounds like Horizon Worlds.

Google

Google Says Its AI Supercomputer is Faster, Greener Than Nvidia A100 Chip (reuters.com) 28

Alphabet's Google released new details about the supercomputers it uses to train its artificial intelligence models, saying the systems are both faster and more power-efficient than comparable systems from Nvidia. From a report: Google has designed its own custom chip called the Tensor Processing Unit, or TPU. It uses those chips for more than 90% of the company's work on artificial intelligence training, the process of feeding data through models to make them useful at tasks such as responding to queries with human-like text or generating images. The Google TPU is now in its fourth generation. Google on Tuesday published a scientific paper detailing how it has strung more than 4,000 of the chips together into a supercomputer using its own custom-developed optical switches to help connect individual machines.

Improving these connections has become a key point of competition among companies that build AI supercomputers because so-called large language models that power technologies like Google's Bard or OpenAI's ChatGPT have exploded in size, meaning they are far too large to store on a single chip. The models must instead be split across thousands of chips, which must then work together for weeks or more to train the model. Google's PaLM model - its largest publicly disclosed language model to date - was trained by splitting it across two of the 4,000-chip supercomputers over 50 days.
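A quick back-of-envelope check on the scale of that training run, using only the figures quoted above (two 4,000-chip pods running for over 50 days):

```python
# Total compute for the PaLM training run described above,
# expressed in chip-days (all figures quoted in the article).
pods = 2              # two 4,000-chip TPU v4 supercomputers
chips_per_pod = 4000  # chips strung together per pod
days = 50             # duration of the training run

chip_days = pods * chips_per_pod * days
print(chip_days)  # 400000 chip-days of TPU time
```

That works out to 400,000 chip-days, which gives a sense of why the chip-to-chip interconnect has become such a point of competition.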

Bitcoin

Cryptocurrencies Add Nothing Useful To Society, Says Nvidia (theguardian.com) 212

The US chip-maker Nvidia has said cryptocurrencies do not "bring anything useful for society" despite the company's powerful processors selling in huge quantities to the sector. From a report: Michael Kagan, its chief technology officer, said other uses of processing power such as the artificial intelligence chatbot ChatGPT were more worthwhile than mining crypto. Nvidia never embraced the crypto community with open arms. In 2021, the company even released software that artificially limited its graphics cards' ability to mine the popular Ethereum cryptocurrency, in an effort to ensure supply went to its preferred customers instead, who include AI researchers and gamers. Kagan said the decision was justified because of the limited value of using processing power to mine cryptocurrencies.

The first version of ChatGPT was trained on a supercomputer made up of about 10,000 Nvidia graphics cards. "All this crypto stuff, it needed parallel processing, and [Nvidia] is the best, so people just programmed it to use for this purpose. They bought a lot of stuff, and then eventually it collapsed, because it doesn't bring anything useful for society. AI does," Kagan told the Guardian. "With ChatGPT, everybody can now create his own machine, his own programme: you just tell it what to do, and it will. And if it doesn't work the way you want it to, you tell it 'I want something different.'" Crypto, by contrast, was more like high-frequency trading, an industry that had led to a lot of business for Mellanox, the company Kagan founded before it was acquired by Nvidia. "We were heavily involved in also trading: people on Wall Street were buying our stuff to save a few nanoseconds on the wire, the banks were doing crazy things like pulling the fibres under the Hudson taut to make them a little bit shorter, to save a few nanoseconds between their datacentre and the stock exchange," he said. "I never believed that [crypto] is something that will do something good for humanity. You know, people do crazy things, but they buy your stuff, you sell them stuff. But you don't redirect the company to support whatever it is."

AI

Nvidia DGX Cloud: Train Your Own ChatGPT in a Web Browser For $37K a Month 22

An anonymous reader writes: Last week, we learned that Microsoft spent hundreds of millions of dollars to buy tens of thousands of Nvidia A100 graphics chips so that partner OpenAI could train the large language models (LLMs) behind Bing's AI chatbot and ChatGPT.

Don't have access to all that capital or space for all that hardware for your own LLM project? Nvidia's DGX Cloud is an attempt to sell remote web access to the very same thing. Announced today at the company's 2023 GPU Technology Conference, the service rents virtual versions of its DGX Server boxes, each containing eight Nvidia H100 or A100 GPUs and 640GB of memory. The service includes interconnects that scale up to the neighborhood of 32,000 GPUs, storage, software, and "direct access to Nvidia AI experts who optimize your code," starting at $36,999 a month for the A100 tier.

Meanwhile, a physical DGX Server box can cost upwards of $200,000 for the same hardware if you're buying it outright, and that doesn't count the efforts companies like Microsoft say they made to build working data centers around the technology.
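Taken at face value, the two prices above imply a short rent-versus-buy break-even. This is only a sketch: it ignores the data-center, power and staffing costs mentioned above, and compares the A100 cloud tier against a roughly equivalent physical box.

```python
# Rent-vs-buy break-even using the figures quoted above.
monthly_rent = 36_999   # DGX Cloud, A100 tier, per month
box_price = 200_000     # approximate cost of a physical DGX server

breakeven_months = box_price / monthly_rent
print(round(breakeven_months, 1))  # ~5.4 months of rent equals the box's sticker price
```

In other words, renting costs more than the hardware itself in under half a year, so the cloud offering's appeal rests on the bundled interconnects, storage, software, and support rather than on raw price.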

Supercomputing

UK To Invest 900 Million Pounds In Supercomputer In Bid To Build Own 'BritGPT' (theguardian.com) 35

An anonymous reader quotes a report from The Guardian: The UK government is to invest 900 million pounds in a cutting-edge supercomputer as part of an artificial intelligence strategy that includes ensuring the country can build its own "BritGPT". The Treasury outlined plans to spend around 900 million pounds on building an exascale computer, which would be several times more powerful than the UK's biggest computers, and establishing a new AI research body. An exascale computer can be used for training complex AI models, but also has other uses across science, industry and defense, including modeling weather forecasts and climate projections. The Treasury said the investment will "allow researchers to better understand climate change, power the discovery of new drugs and maximize our potential in AI."

An exascale computer is one that can carry out more than one billion billion simple calculations a second, a metric known as an "exaflops". Only one such machine is known to exist, Frontier, which is housed at America's Oak Ridge National Laboratory and used for scientific research -- although supercomputers have such important military applications that it may be the case that others already exist but are not acknowledged by their owners. Frontier, which cost about 500 million pounds to produce and came online in 2022, is more than twice as powerful as the next fastest machine.
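To put "one billion billion calculations a second" in perspective, here is a small sketch; the 100-gigaflops figure for an ordinary laptop is a hypothetical round number for illustration, not something from the article.

```python
# One exaflops = 10**18 floating-point operations per second.
EXAFLOPS = 1e18

# Hypothetical laptop sustaining 100 gigaflops (1e11 flops).
laptop_flops = 100e9

# Time for the laptop to match one second of exascale work.
seconds = EXAFLOPS / laptop_flops
days = seconds / 86_400
print(seconds, round(days, 1))  # 10 million seconds, roughly 115.7 days
```

A single second of exascale computation would keep such a laptop busy for nearly four months.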

The Treasury said it would award a 1 million-pound prize every year for the next 10 years to the most groundbreaking AI research. The award will be called the Manchester Prize, in memory of the so-called Manchester Baby, a forerunner of the modern computer built at the University of Manchester in 1948. The government will also invest 2.5 billion pounds over the next decade in quantum technologies. Quantum computing is based on quantum physics -- which looks at how the subatomic particles that make up the universe work -- and quantum computers are capable of computing their way through vast numbers of different outcomes.

Microsoft

Microsoft Strung Together Tens of Thousands of Chips in a Pricey Supercomputer for OpenAI (bloomberg.com) 25

When Microsoft invested $1 billion in OpenAI in 2019, it agreed to build a massive, cutting-edge supercomputer for the artificial intelligence research startup. The only problem: Microsoft didn't have anything like what OpenAI needed and wasn't totally sure it could build something that big in its Azure cloud service without it breaking. From a report: OpenAI was trying to train an increasingly large set of artificial intelligence programs called models, which were ingesting greater volumes of data and learning more and more parameters, the variables the AI system has sussed out through training and retraining. That meant OpenAI needed access to powerful cloud computing services for long periods of time. To meet that challenge, Microsoft had to find ways to string together tens of thousands of Nvidia's A100 graphics chips -- the workhorse for training AI models -- and change how it positions servers on racks to prevent power outages. Scott Guthrie, the Microsoft executive vice president who oversees cloud and AI, wouldn't give a specific cost for the project, but said "it's probably larger" than several hundred million dollars. [...] Now Microsoft uses that same set of resources it built for OpenAI to train and run its own large artificial intelligence models, including the new Bing search bot introduced last month. It also sells the system to other customers. The software giant is already at work on the next generation of the AI supercomputer, part of an expanded deal with OpenAI in which Microsoft added $10 billion to its investment.

IBM

IBM Says It's Been Running a Cloud-Native, AI-Optimized Supercomputer Since May (theregister.com) 25

"IBM is the latest tech giant to unveil its own "AI supercomputer," this one composed of a bunch of virtual machines running within IBM Cloud," reports the Register: The system known as Vela, which the company claims has been online since May last year, is touted as IBM's first AI-optimized, cloud-native supercomputer, created with the aim of developing and training large-scale AI models. Before anyone rushes off to sign up for access, IBM stated that the platform is currently reserved for use by the IBM Research community. In fact, Vela has become the company's "go-to environment" for researchers creating advanced AI capabilities since May 2022, including work on foundation models, it said.

IBM states that it chose this architecture because it gives the company greater flexibility to scale up as required, and also the ability to deploy similar infrastructure into any IBM Cloud datacenter around the globe. But Vela is not running on any old standard IBM Cloud node hardware; each node is a twin-socket system with 2nd Gen Xeon Scalable processors configured with 1.5TB of DRAM, and four 3.2TB NVMe flash drives, plus eight 80GB Nvidia A100 GPUs, the latter connected by NVLink and NVSwitch. This makes the Vela infrastructure closer to that of a high performance compute site than typical cloud infrastructure, despite IBM's insistence that it was taking a different path as "traditional supercomputers weren't designed for AI."

It is also notable that IBM chose to use x86 processors rather than its own Power 10 chips, especially as these were touted by Big Blue as being ideally suited for memory-intensive workloads such as large-model AI inferencing.

Thanks to Slashdot reader guest reader for sharing the story.
