Oracle

Largest Local Government Body In Europe Goes Under Amid Oracle Disaster (theregister.com) 110

Birmingham City Council, the largest local authority in Europe, has declared itself in financial distress after troubled Oracle project costs ballooned from $25 million to around $125.5 million. The Register reports: Contributing to the publication of a legal Section 114 Notice, which says the $4.3 billion revenue organization is unable to balance the books, is a bill of up to $954 million to settle equal pay claims. In a statement today, councillors John Cotton and Sharon Thompson, leader and deputy leader respectively, said the authority was also hit by financial stress owing to issues with the implementation of its Oracle IT system. The council has made a request to the Local Government Association for additional strategic support, the statement said.

In May, Birmingham City Council said it was set to pay up to $125.5 million for its Oracle ERP system -- potentially a fourfold increase on initial estimated expenses -- in a project suffering from delays, cost over-runs, and a lack of controls. After grappling with the project to replace SAP for core HR and finance functions since 2018, the council reviewed the plan in 2019, 2020, and again in 2021, when the total implementation cost for the project almost doubled to $48.5 million. The project, dubbed Financial and People, was "crucial to an organisation of Birmingham City Council's size," a spokesperson said at the time. Cotton said the system had a problem with how it was "tracking our financial transactions and HR transactions issues as well. That's got to be fixed," he said.

Earlier this year, one insider told The Register that Oracle Fusion, the cloud-based ERP system the council is moving to, "is not a product that is suitable for local authorities, because it's very much geared towards a manufacturing/trading organization." They said the previous SAP system had been heavily customized to meet the council's needs and it was struggling to recreate these functions in Oracle.

AI

OpenAI Disputes Authors' Claims That Every ChatGPT Response is Derivative Work 119

OpenAI has responded to a pair of nearly identical class-action lawsuits from book authors -- including Sarah Silverman, Paul Tremblay, Mona Awad, Chris Golden, and Richard Kadrey -- who earlier this summer alleged that ChatGPT was illegally trained on pirated copies of their books. From a report: In OpenAI's motion to dismiss (filed in both lawsuits), the company asked a US district court in California to toss all but one claim alleging direct copyright infringement, which OpenAI hopes to defeat at "a later stage of the case." The authors' other claims -- alleging vicarious copyright infringement, violation of the Digital Millennium Copyright Act (DMCA), unfair competition, negligence, and unjust enrichment -- need to be "trimmed" from the lawsuits "so that these cases do not proceed to discovery and beyond with legally infirm theories of liability," OpenAI argued.

OpenAI claimed that the authors "misconceive the scope of copyright, failing to take into account the limitations and exceptions (including fair use) that properly leave room for innovations like the large language models now at the forefront of artificial intelligence." According to OpenAI, even if the authors' books were a "tiny part" of ChatGPT's massive dataset, "the use of copyrighted materials by innovators in transformative ways does not violate copyright." Unlike plagiarists who seek to directly profit off distributing copyrighted materials, OpenAI argued that its goal was "to teach its models to derive the rules underlying human language" in order to do things like help people "save time at work," "make daily life easier," or simply entertain themselves by typing prompts into ChatGPT.

The purpose of copyright law, OpenAI argued, is "to promote the Progress of Science and useful Arts" by protecting the way authors express ideas, but "not the underlying idea itself, facts embodied within the author's articulated message, or other building blocks of creative," which are arguably the elements of authors' works that would be useful to ChatGPT's training model. Citing a notable copyright case involving Google Books, OpenAI reminded the court that "while an author may register a copyright in her book, the 'statistical information' pertaining to 'word frequencies, syntactic patterns, and thematic markers' in that book are beyond the scope of copyright protection."
AI

'Life Or Death:' AI-Generated Mushroom Foraging Books Are All Over Amazon (404media.co) 75

samleecole writes: A genre of AI-generated books on Amazon is scaring foragers and mycologists: cookbooks and identification guides for mushrooms aimed at beginners.

Amazon has an AI-generated books problem that's been documented by journalists for months. Many of these books are obviously gibberish designed to make money. But experts say that AI-generated foraging books, specifically, could actually kill people if they eat the wrong mushroom because a guidebook written by an AI prompt said it was safe.

The New York Mycological Society (NYMS) warned on social media that the proliferation of AI-generated foraging books could "mean life or death."

A quick scan of Amazon's mushroom and foraging books revealed a bunch of books that were likely written by ChatGPT but are sold without any indication that they're AI-generated, and are marketed as having been written by a human when they probably were not. 404 Media used GPT text detectors and AI image detection tools on some of the suspicious books, and found that they were very likely made with AI, with authors who may not even exist.

AI

Stephen King, Zadie Smith and Rachel Cusk's Pirated Works Used To Train AI (theguardian.com) 129

Zadie Smith, Stephen King, Rachel Cusk and Elena Ferrante are among thousands of authors whose pirated works have been used to train artificial intelligence tools, a story in The Atlantic has revealed. The Guardian: More than 170,000 titles were fed into models run by companies including Meta and Bloomberg, according to an analysis of "Books3" -- the dataset harnessed by the firms to build their AI tools. Books3 was used to train Meta's LLaMA, one of a number of large language models -- the best-known of which is OpenAI's ChatGPT -- that can generate content based on patterns identified in sample texts. The dataset was also used to train Bloomberg's BloombergGPT, EleutherAI's GPT-J and it is "likely" it has been used in other AI models.

The titles contained in Books3 are roughly one-third fiction and two-thirds nonfiction, and the majority were published within the last two decades. Along with Smith, King, Cusk and Ferrante's writing, copyrighted works in the dataset include 33 books by Margaret Atwood, at least nine by Haruki Murakami, nine by bell hooks, seven by Jonathan Franzen, five by Jennifer Egan and five by David Grann. Books by George Saunders, Junot Díaz, Michael Pollan, Rebecca Solnit and Jon Krakauer also feature, as well as 102 pulp novels by Scientology founder L Ron Hubbard and 90 books by pastor John MacArthur. The titles span large and small publishers including more than 30,000 published by Penguin Random House, 14,000 by HarperCollins, 7,000 by Macmillan, 1,800 by Oxford University Press and 600 by Verso.

Books

On Bill Watterson's Upcoming Book - And Why He Vanished (theamericanconservative.com) 77

In 1995 Bill Watterson walked away from "the madness that had consumed him for practically his entire adulthood," writes the American Conservative.

Though everyone loved his Calvin & Hobbes comic strip, "I had virtually no life beyond the drawing board," he said of the years leading up to the decision... So it came as some surprise earlier this year when Watterson's publisher announced his first new book in nearly thirty years. The Mysteries is a "modern fable"... ["For the book's illustrations, Watterson and caricaturist John Kascht worked together for several years in unusually close collaboration," explains the upcoming book's web page. "Both artists abandoned their past ways of working, inventing images together that neither could anticipate — a mysterious process in its own right."] At seventy-two pages, the book itself is a slight thing, in no way a return to the daily grind of the funny pages. It is being sold exclusively in print. And, typical of Watterson, press access is limited. [Publisher] Andrews McMeel is not sending review copies until the week of its publication in early October...

In the years since the strip's end, Watterson has indicated that there was something false inherent to Calvin and Hobbes, some impurity either in his approach or encoded in the strip itself that made it impossible to continue in good faith. That, combined with the fight over licensing with his syndicate, crushed him. "I lost the conviction that I wanted to spend my life cartooning," he remembers realizing in 1991, four years before he ended the strip. Beyond stray comments such as this one, he has never forthrightly explained where exactly he went wrong. But I think I have an explanation...

"Work and home were so intermingled that I had no refuge from the strip when I needed a break," Watterson recalls. "Day or night, the work was always right there, and the book-publishing schedule was as relentless as the newspaper deadlines. Having certain perfectionist and maniacal tendencies, I was consumed by Calvin and Hobbes." By Watterson's own admission, he cannot accurately recall a whole decade of his life because of his "Ahab-like obsession" with his work. "The intensity of pushing the writing and drawing as far as my skills allowed was the whole point of doing it," he says. "I eliminated pretty much everything from my life that wasn't the strip." While Watterson's wife, Melissa Richmond, organized everything around him, he furthered his isolation, burrowing ever more deeply into the strip's world. There was no other way, he believed, to keep its integrity absolute. "My approach was probably too crazy to sustain for a lifetime," he says, "but it let me draw the exact strip I wanted while it lasted...."

But Watterson had designed a world for himself so self-contained that any disruption could mean its destruction: "I just knew it was time to go." This much became clear in the middle of the licensing fight. It took up so much of his energy that he lost his lead time on the strip and found himself in a situation where he was drawing practically every single comic on press night. After a few weeks of this, he broke down. "I was in a black despair," he says. "I was absolutely frantic. I had to publish everything I thought of, no matter what it was, and I found that idea almost unbearable." His wife saw him spiraling out of control and drew up a schedule that helped him slowly, over the course of six months, rebuild his lead time. Not long after, Watterson crashed his bike, bruised a rib, and broke a finger. He was so afraid of losing his lead again that he propped his drawing board on his knees in his sickbed and drew anyway. That freaked him out, too, and so gradually he scaled his life down to the point where nothing unpredictable could happen...

Watterson compares ending Calvin and Hobbes to reaching the summit of a high mountain... He had no desire to return whence he came. And he couldn't go any higher; no one can ascend into the air itself. So he took his next best option. He jumped.

The Almighty Buck

Thousands of Crypto Scammers are Enslaved by Human-Trafficking Gangsters, Says Bloomberg Reporter (bloomberg.com) 100

A Bloomberg investigative reporter has written a new book titled Number Go Up: Inside Crypto's Wild Rise and Staggering Fall. This week Bloomberg published an excerpt that begins when the reporter receives a flirtatious text message from a woman named Vicky Ho, the opening move in a scam called "pig butchering."

"Vicky's random text had found its way to pretty much exactly the wrong target. I'd been investigating the crypto bubble for more than a year..." After a day, Vicky revealed her true love language: Bitcoin price data. She started sending me charts. She told me she'd figured out how to predict market fluctuations and make quick gains of 20% or more. The screenshots she shared showed that during that week alone she'd made $18,600 on one trade, $4,320 on another and $3,600 on a third... For days, she went on chatting without asking for me to send any money. I was supposed to be the mark, but I had to work her to con me.... Vicky sent me a link to download an app called ZBXS. It looked pretty much like other crypto-exchange apps. "New safe and stable trading market," a banner read at the top. Then Vicky gave me some instructions. They involved buying one cryptocurrency using another crypto-exchange app, then transferring the crypto to ZBXS's deposit address on the blockchain, a 42-character string of letters and numbers...

People around the world really were losing huge sums of money to the con. A project finance lawyer in Boston with terminal cancer handed over $2.5 million. A divorced mother of three in St. Louis was defrauded of $5 million. And the victims I spoke to all told me they'd been told to use Tether, the same coin Vicky suggested to me. Rich Sanders, the lead investigator at CipherBlade, a crypto-tracing firm, said that at least $10 billion had been lost to crypto romance scams.

The huge sums involved weren't the most shocking part. I learned that whoever was posing as Vicky was likely a victim as well — of human trafficking. Most "pig-butchering" operations were orchestrated by Chinese gangsters based in Cambodia or Myanmar. They'd lure young people from across Southeast Asia to move abroad with the promise of well-paying jobs in customer service or online gambling. Then, when the workers arrived, they'd be held captive and forced into a criminal racket. Thousands have been tricked this way. Entire office towers are filled with floor after floor of people sending spam messages around the clock, under threat of torture or death.

With the assistance of translators, I started video chatting with people who'd escaped...

I'd heard that [southwestern Cambodia's giant building complex] Chinatown alone held as many as 6,000 captive workers like "Vicky Ho."

Two of the workers interviewed "said they'd seen workers murdered." And another worker said Tether was used specifically because "It's more safe. We are afraid people will track us... It's untraceable."

The reporter's conclusion? "It was hard to see how this slave complex could exist without cryptocurrency."
Censorship

Iowa School District Is Using AI To Ban Books 394

According to the Globe Gazette, the school board of Mason City, Iowa has begun leveraging AI technology to cultivate lists of potentially bannable books from the district's libraries ahead of the 2023/24 school year. Engadget reports: In May, the Republican-controlled state legislature passed, and Governor Kim Reynolds subsequently signed, Senate File 496 (SF 496), which enacted sweeping changes to the state's education curriculum. Specifically it limits what books can be made available in school libraries and classrooms, requiring titles to be "age appropriate" and without "descriptions or visual depictions of a sex act," per Iowa Code 702.17. But ensuring that every book in the district's archives adheres to these new rules is quickly turning into a mammoth undertaking. "Our classroom and school libraries have vast collections, consisting of texts purchased, donated, and found," Bridgette Exman, assistant superintendent of curriculum and instruction at Mason City Community School District, said in a statement. "It is simply not feasible to read every book and filter for these new requirements."

As such, the Mason City School District is bringing in AI to parse suspect texts for banned ideas and descriptions since there are simply too many titles for human reviewers to cover on their own. Per the district, a "master list" is first cobbled together from "several sources" based on whether there were previous complaints of sexual content. Books from that list are then scanned by "AI software" -- the district doesn't specify which systems will be employed -- which tells the state censors whether or not there actually is a depiction of sex in the book. So far, the AI has flagged 19 books for removal. [The full list is available here.]
Books

Publishers, Internet Archive Agree To Streamline Digital Book-Lending Case (reuters.com) 6

An anonymous reader quotes a report from Reuters: The Internet Archive and a group of leading book publishers told a Manhattan federal court on Friday that they have resolved aspects of their legal battle over the Archive's digital lending of their scanned books. If accepted, the consent judgment would settle questions over potential money damages in the case and the scope of a ban on the Archive's lending and would clear the way for the Archive to appeal U.S. District Judge John Koeltl's decision that it infringed the publishers' copyrights.

The proposed order would require the Archive to pay Lagardere SCA's Hachette Book Group, News Corp's HarperCollins Publishers, John Wiley & Sons and Bertelsmann SE & Co's Penguin Random House an undisclosed amount of money if it loses its appeal. The order would also permanently block the Archive from lending out copies of the publishers' books without permission, pending the result of the appeal. They asked Koeltl to resolve a dispute over whether the order will apply only to the publishers' books that are already available for electronic licensing or books commercially available in any format.

The Internet Archive said in a blog post that the fight was "far from over," and founder Brewster Kahle said in a statement that "we must have strong libraries, which is why we are appealing this decision." Maria Pallante, the CEO of the Association of American Publishers, said in a statement that the plaintiffs were "extremely pleased" with the proposed injunction, which will "extend not only to the Plaintiffs' 127 works in suit but also to thousands of other literary works in their catalogs."

Television

Neil Gaiman To Continue 'Good Omens' Story Even If It's Not Renewed For Season 3 (gizmodo.com) 42

In the unfortunate event that Amazon cancels Good Omens, a British fantasy comedy series created by Neil Gaiman, the New York Times bestselling author says a novel would be written to continue where the show left off. For those unaware, Good Omens recently launched season two on Amazon Prime and follows various characters all trying to either encourage or prevent an imminent Armageddon, seen through the eyes of the angel Aziraphale and the demon Crowley. According to Gizmodo's Linda Codega, it "ends on an absolutely devastating cliffhanger. Emotionally speaking." From the report: Neil Gaiman, the co-author of Good Omens (the book) alongside Terry Pratchett and the lead writer on Good Omens (the show), has always been active on Tumblr. Naturally, people have been asking him about that ending -- mostly because Good Omens, for all the hype, hasn't yet been renewed for a third season, and I will reiterate, the ending of season two is heart-wrenching. Gaiman had a lovely answer for one fan [poohbear0915] who asked: "In the unfortunate event that Good Omens is not renewed for a season three, would you consider releasing a script book of what would have happened for the fans to read?" Neil Gaiman responded: "No, I'd write a novel."
AI

A New Frontier for Travel Scammers: AI-Generated Guidebooks (nytimes.com) 15

Shoddy guidebooks, promoted with deceptive reviews, have flooded Amazon in recent months. Their authors claim to be renowned travel writers.

But do they even exist?

The New York Times: The books are the result of a swirling mix of modern tools: A.I. apps that can produce text and fake portraits; websites with a seemingly endless array of stock photos and graphics; self-publishing platforms -- like Amazon's Kindle Direct Publishing -- with few guardrails against the use of A.I.; and the ability to solicit, purchase and post phony online reviews, which runs counter to Amazon's policies and may soon face increased regulation from the Federal Trade Commission. The use of these tools in tandem has allowed the books to rise near the top of Amazon search results and sometimes garner Amazon endorsements such as "#1 Travel Guide on Alaska." A recent Amazon search for the phrase "Paris Travel Guide 2023," for example, yielded dozens of guides with that exact title. One, whose author is listed as Stuart Hartley, boasts, ungrammatically, that it is "Everything you Need to Know Before Plan a Trip to Paris."

The book itself has no further information about the author or publisher. It also has no photographs or maps, though many of its competitors have art and photography easily traceable to stock-photo sites. More than 10 other guidebooks attributed to Stuart Hartley have appeared on Amazon in recent months that rely on the same cookie-cutter design and use similar promotional language. The Times also found similar books on a much broader range of topics, including cooking, programming, gardening, business, crafts, medicine, religion and mathematics, as well as self-help books and novels, among many other categories. Amazon declined to answer a series of detailed questions about the books.

Books

Amazon Reverses Course On 'Garbage Books' Written By AI 25

Amazon removed several books believed to be written using AI and listed under a real author's name. Decrypt reports: When professor Jane Friedman complained about books that she didn't write being attributed to her on Monday, ecommerce giant Amazon initially said that it would not remove them. But after she took her case to Twitter, earning the backing of the Authors Guild, Amazon relented early this morning. Friedman -- a non-fiction writer, journalist, and educator -- said Amazon had refused to remove the books even though they appeared to trade on her name and reputation as an author who has published how-to guides for other writers.

The "garbage books," which Friedman says were probably churned out using generative AI, had the titles "Your Guide to Writing a Bestseller eBook on Amazon," "Publishing Power: Navigating Amazon's Kindle Direct Publishing," and "Promote to Prosper: Strategies to Skyrocket Your eBook Sales on Amazon." When Friedman acknowledged that she could not prove that she owned the trademark on her own name, she said Amazon said it would leave the book up and for sale. But that stance changed late Monday night when the books began disappearing from Amazon's website, and after the Authors Guild offered to step in on Friedman's behalf.

"We have clear content guidelines governing which books can be listed for sale and promptly investigate any book when a concern is raised," Amazon spokesperson Ashley Vanicek told Decrypt by email. "We welcome author feedback and work directly with authors to address any issues they raise and where we have made an error, we correct it." Other authors responding to Friedman's tweet said the same thing had happened to them, and in some cases, the publisher of the fraudulent books did more than just use their names. [...] On Tuesday, Friedman again took to Twitter to confirm that the fraudulent works were removed from Amazon. She remained concerned, however, that other writers like Hayes -- who do not have the large audience that she does -- would not be able to raise such a "big red flag."
Crime

Serial Murders Have Dwindled, Thanks To a Cautious Citizenry and Improved Technology (nytimes.com) 184

An anonymous reader quotes a report from the New York Times: Rex Heuermann, the meticulous architectural consultant who the authorities say murdered three women and buried them on a Long Island beach more than a decade ago, may have been among the last of the dying breed of American serial killers. Even as serial killers came to inhabit a central place in the nation's imagination -- inspiring hit movies, television shows, books, podcasts and more -- their actual number was dwindling dramatically. There were once hundreds at large, and a spike in the 1970s and '80s terrified the country. Now only a handful at most are known to be active, researchers say. The techniques that led to the arrest of Mr. Heuermann, who has pleaded not guilty to the crimes, help explain the waning of serial killing, which the F.B.I. defines as the same person killing two or more victims in separate events at different times.

It is harder to hide. Rapid advances in investigative technology, video and other digital surveillance tools, as well as the ability to analyze mountains of information, quickly allow the authorities to find killers who before would have gone undetected. At the same time, Americans have adopted more cautious habits in their everyday lives -- hitchhiking, for example, is less common, and children are driven to and from school. That reduces easy targets. And, some theorize, those bent on killing now opt for spectacular mass murders. "The 'perfect crime' concept is more of a concept than it ever has been before," said Adam Scott Wandt, an assistant professor at John Jay College of Criminal Justice. More than a decade ago, prosecutors said, Mr. Heuermann tried to cover his digital tracks by communicating with victims using so-called burner phones, prepaid units purchased anonymously for temporary use. But thanks to exponential progress in technology since 2010, investigators were able not only to chart Mr. Heuermann's decade-old movements; they could also monitor exactly what he was searching online in recent months. They saw that he was using an anonymous account for internet queries like "Why could law enforcement not trace the calls made by the long island serial killer," prosecutors said. He had also been visiting massage parlors and contacting women working as escorts, they said.

The ubiquity of technology has made it harder to get away with murder, Mr. Wandt said. The amount of data people create in their daily lives is more than many can conceptualize, he said. Just by walking outside, people are now tracked by ever-present cameras, from Amazon's Ring units outside homes to surveillance at banks and retail stores, he said. Every use of a phone or computer creates streams of data that are collected directly on devices or immortalized on servers, he said. A concerted effort by the federal government to ensure that even the smallest police departments can use technology to their benefit has also helped give investigators an upper hand, Mr. Wandt said. In 1987, there were 198 known active serial killers -- people connected to at least two murders -- and 404 known victims across the United States, according to a report published three years ago by researchers who run Radford University and Florida Gulf Coast University's Serial Killer Database. By 2018, there were only 12 known serial killers and 44 victims, according to the report.
"The big question is: Are they going underground and finding other techniques?" said Terence Leary, an associate professor in the psychology department at Florida Gulf Coast University and the team leader for the database.

He said that some serial murderers have killed for discrete periods before taking prolonged breaks: "Maybe they decided to give it up. Who knows?"
Books

Paramount Agrees To Sell Simon and Schuster To KKR (nytimes.com) 17

Paramount said on Monday it had reached a deal to sell Simon and Schuster, one of the biggest and most prestigious publishing houses in the United States, to the private-equity firm KKR, in a major changing of the guard in the books business. From a report: The deal, for $1.62 billion, will put control of the cultural touchstone behind authors like Stephen King and Bob Woodward in the hands of a financial buyer with an expanding presence in the publishing industry. While private equity investors have had a significant footprint in the book business -- different firms have owned literary agencies, publishing houses and the retailer Barnes & Noble -- the acquisition of one of the largest publishers in the country vastly increases the hold of financial interests in the business.

[...] Since Simon & Schuster was first put up for sale in 2020, many in the publishing industry have fretted over where the company might land. A sale to another publisher would mean the new management would understand the book business. But it would also mean further consolidation in the industry, with potentially fewer players available to bid on big books, and the chance of layoffs as redundant jobs were eliminated. It could also raise regulatory scrutiny: Paramount's first attempt to sell Simon & Schuster, to Penguin Random House in 2020, was derailed by government antitrust concerns.

AI

AI-Generated Art Banned from Future 'Dungeons & Dragons' Books After Fan Uproar (geekwire.com) 81

A Dungeons & Dragons expansion book included AI-generated artwork. Fans on Twitter spotted it before the book was even released (noting, among other things, a wolf with human feet). An embarrassed representative for Wizards of the Coast then tweeted out an announcement about new guidelines stating explicitly that "artists must refrain from using AI art generation as part of their creation process for developing D&D art." GeekWire reports: The artist in question, Ilya Shkipin, is a California-based painter, illustrator, and operator of an NFT marketplace, who has worked on projects for Renton, Wash.-based Wizards of the Coast since 2014. Shkipin took to Twitter himself on Friday, and acknowledged in several now-deleted tweets that he'd used AI tools to "polish" several original illustrations and concept sketches. As of Saturday morning, Shkipin had taken down his original tweets and announced that the illustrations for Glory of the Giants are "going to be reworked..."

While the physical book won't be out until August 15, the e-book is available now from Wizards' D&D Beyond digital storefront.

Wizards of the Coast emphasized this won't happen again. About this particular incident, they noted "We have worked with this artist since 2014 and he's put years of work into books we all love. While we weren't aware of the artist's choice to use AI in the creation process for these commissioned pieces, we have discussed with him, and he will not use AI for Wizards' work moving forward."

GeekWire adds that the latest D&D video game, Baldur's Gate 3, "went into its full launch period on Tuesday. Based on metrics such as its player population on Steam, BG3 has been an immediate success, with a high of over 709,000 people playing it concurrently on Saturday afternoon."
Books

What Role Does Intuition Play in Science? (theamericanscholar.org) 86

Recently science author Sam Kean reviewed In a Flight of Starlings, a "slender, uneven collection of essays by Giorgio Parisi about his life in physics, from his student days in Rome to the work that won him a share of the 2021 Nobel Prize in physics."

But the reviewer makes an interesting point: As someone who writes about science history, I have long grumbled about how misleading modern scientific papers are. I understand the need to present scientific findings in a clean, concise way, but the papers also omit all the false starts, blind alleys, broken equipment, and dumb mistakes that beset real scientific research every day. By omitting all the human stuff, the papers fail to explain how science really gets done. Parisi raises a related complaint — that scientific papers omit all sense of intuition. Indeed, the best sections of the book explore the role of intuition in scientific thinking.

He quotes a friend who says that "a good mathematician understands immediately which mathematical statements are true and which are false, whereas a bad mathematician has to try to prove them in order to know." The same applies to science: the early stages of any project are chaotic, and the data can be confusing and even contradictory. Scientists need intuition to cut through the mess and focus on the most promising explanations. Much of this intuition is unconscious and, while still grounded in physical brain processes, remains murky and hard to reconstruct. And for whatever reason, that vagueness makes scientists uncomfortable. "In almost all texts written by scientists," Parisi notes, "these themes are taboo."

So it's refreshing to see a scientist, especially one of Parisi's stature, honestly discuss the fuzzy side of scientific thinking, and not just during the early, groping stages but in the technical phases of a project, too. "The physicist sometimes uses mathematics ungrammatically," he admits, "a license that we grant to poets" as well... In a Flight of Starlings, Parisi writes, "is my attempt to convey to a wide readership something of the beauty, importance, and cultural value of modern science." Does he succeed? At times, yes... Perhaps it's not unlike the hodgepodge of science itself, then...

Books

Cory Doctorow's New Book On Beating Big Tech At Its Own Game (boingboing.net) 43

Cory Doctorow, author, digital rights advocate, and co-editor of the blog Boing Boing, has launched a Kickstarter campaign for his next book, called The Internet Con: How To Seize the Means of Computation. "The book presents an array of policy solutions aimed at dismantling the monopolistic power of Big Tech, making the internet a more open and user-focused space," writes Boing Boing's Mark Frauenfelder. "Key among these solutions is the concept of interoperability, which would allow users to take their apps, data, and content with them when they decide to leave a service, thus reducing the power of tech platforms." From Cory's Medium article announcing the Kickstarter: I won't sell my work with DRM, because DRM is key to the enshittification of the internet. Enshittification is why the old, good internet died and became "five giant websites filled with screenshots of the other four" (h/t Tom Eastman). When a tech company can lock in its users and suppliers, it can drain value from both sides, using DRM and other lock-in gimmicks to keep their business even as they grow ever more miserable on the platform.

Here is how platforms die: first, they are good to their users; then they abuse their users to make things better for their business customers; finally, they abuse those business customers to claw back all the value for themselves. Then, they die.

The Internet Con isn't just an analysis of where enshittification comes from: it's a detailed, shovel-ready policy prescription for halting enshittification, throwing it into reverse and bringing back the old, good internet.

Books

New Book about 'The Apple II Age' Celebrates Early Software Developers - and Users (thenewstack.io) 76

By 1983 there were a whopping 2,000 pieces of software for Apple's pre-Macintosh computer, the Apple II, more than for any other machine in the world. That software left a trail for one historian to follow in The Apple II Age: How the Computer Became Personal.

The new book (by New York University academic Laine Nooney) argues that the first purchasers of that software were the true overlooked pioneers of the seven years before the Macintosh. And (as this reviewer explains, with quotes from the book), collectively they form the most compelling story about the history of Apple: It's about all those brave and curious people, the users, who came "Not to hack, but to play... Not to program, but to print..." And you can trace their activities in perfect detail through the decades-old software programs they left behind. It's a fresh and original approach to the history of technology. Yes, the Apple II competed with Commodore's PET 2001 and Tandy's TRS-80... [But] this trove of programs uniquely offers "a glimpse of what users did with their personal computers, or perhaps more tellingly, what users hoped their computers might do."

Looking back in time, Nooney calls the period "one of unusually industrious and experimental software production, as mom-and-pop development houses cast about trying to create software that could satisfy the question, 'What is a computer even good for...?'" The book's jacket promises "a constellation of software creation stories," with each chapter revisiting an especially iconic program that also represents an entire category of software...

[T]he book ultimately focuses more heavily on the lessons that can be learned from what programmers envisioned for these strange new devices — and how the software-buying public did (or didn't) respond... The earliest emergence of personal computing in America was "a wondrous mangle," Nooney writes, saying it turned into an era where "overnight entrepreneurs hastily constructed a consumer computing supply chain where one had never previously existed."

Vice republished an excerpt in May which describes the "roiling debate" that took place over copy protection in 1981.
AI

Is AI Training on Libraries of Pirated Books? (nytimes.com) 96

The New York Times points out that so-called "shadow libraries," like Library Genesis, Z-Library or Bibliotik, "are obscure repositories storing millions of titles, in many cases without permission — and are often used as A.I. training data." A.I. companies have acknowledged in research papers that they rely on shadow libraries. OpenAI's GPT-1 was trained on BookCorpus, which has over 7,000 unpublished titles scraped from the self-publishing platform Smashwords. To train GPT-3, OpenAI said that about 16 percent of the data it used came from two "internet-based books corpora" that it called "Books1" and "Books2." According to a lawsuit by the comedian Sarah Silverman and two other authors against OpenAI, Books2 is most likely a "flagrantly illegal" shadow library.

These sites have been under scrutiny for some time. The Authors Guild, which organized the authors' open letter to tech executives, cited studies in 2016 and 2017 that suggested text piracy depressed legitimate book sales by as much as 14 percent.

Efforts to shut down these sites have floundered. Last year, the F.B.I., with help from the Authors Guild, charged two people accused of running Z-Library with copyright infringement, fraud and money laundering. But afterward, some of these sites were moved to the dark web and torrent sites, making it harder to trace them. And because many of these sites are run outside the United States and anonymously, actually punishing the operators is a tall task.

Tech companies are becoming more tight-lipped about the data used to train their systems.

Movies

Hollywood Movie Aside, Just How Good a Physicist Was Oppenheimer? (science.org) 91

sciencehabit shares a report from Science: This week, the much-anticipated movie Oppenheimer hits theaters, giving famed filmmaker Christopher Nolan's take on the theoretical physicist who during World War II led the Manhattan Project to develop the first atomic bomb. J. Robert Oppenheimer, who died in 1967, is known as a charismatic leader, eloquent public intellectual, and Red Scare victim who in 1954 lost his security clearance in part because of his earlier associations with suspected Communists. To learn about Oppenheimer the scientist, Science spoke with David C. Cassidy, a physicist and historian emeritus at Hofstra University. Cassidy has authored or edited 10 books, including J. Robert Oppenheimer and the American Century. How did Oppenheimer compare to Einstein? Did he actually make any substantive contributions to the bomb? And why did he eventually lose his security clearance?
AI

Why Synthetic Data is Being Used To Train AI Models (ft.com) 31

Artificial intelligence companies are exploring a new avenue to obtain the massive amounts of data needed to develop powerful generative models: creating the information from scratch. From a report: Microsoft, OpenAI and Cohere are among the groups testing the use of so-called synthetic data -- computer-generated information to train their AI systems known as large language models (LLMs) -- as they reach the limits of human-made data that can further improve the cutting-edge technology. The launch of Microsoft-backed OpenAI's ChatGPT last November has led to a flood of products rolled out publicly this year by companies including Google and Anthropic, which can produce plausible text, images or code in response to simple prompts.

The technology, known as generative AI, has driven a surge of investor and consumer interest, with the world's biggest technology companies including Google, Microsoft and Meta racing to dominate the space. Currently, LLMs that power chatbots such as OpenAI's ChatGPT and Google's Bard are trained primarily by scraping the internet. Data used to train these systems includes digitised books, news articles, blogs, search queries, Twitter and Reddit posts, YouTube videos and Flickr images, among other content. Humans are then used to provide feedback and fill gaps in the information in a process known as reinforcement learning from human feedback (RLHF). But as generative AI software becomes more sophisticated, even deep-pocketed AI companies are running out of easily accessible and high-quality data to train on. Meanwhile, they are under fire from regulators, artists and media organisations around the world over the volume and provenance of personal data consumed by the technology.
