Programming

Google's Jules Enters Developers' Toolchains As AI Coding Agent Competition Heats Up 2

An anonymous reader quotes a report from TechCrunch: Google is bringing its AI coding agent Jules deeper into developer workflows with a new command-line interface and public API, allowing it to plug into terminals, CI/CD systems, and tools like Slack -- as competition intensifies among tech companies to own the future of software development and make coding more of an AI-assisted task.

Until now, Jules -- Google's asynchronous coding agent -- was only accessible via its website and GitHub. On Thursday, the company introduced Jules Tools, a command-line interface that brings Jules directly into the developer's terminal. The CLI lets developers interact with the agent using commands, streamlining workflows by eliminating the need to switch between the web interface and GitHub. It allows them to stay within their environment while delegating coding tasks and validating results.
"We want to reduce context switching for developers as much as possible," Kathy Korevec, director of product at Google Labs, told TechCrunch.

Jules differs from Gemini CLI in that it focuses on "scoped," independent tasks rather than requiring iterative collaboration. Once a user approves a plan, Jules executes it autonomously, while the CLI needs more step-by-step guidance. Jules also has a public API for workflow and IDE integration, plus features like memory, a stacked diff viewer, PR comment handling, and image uploads -- capabilities not present in the CLI. Gemini CLI is limited to terminals and CI/CD pipelines and is better suited for exploratory, highly interactive use.
XBox (Games)

Microsoft is About To Launch Free Xbox Cloud Gaming With Ads (theverge.com) 14

An anonymous reader shares a report from The Verge: Microsoft is getting ready to announce an ad-supported version of Xbox Cloud Gaming. Sources familiar with Microsoft's plans tell The Verge that the software maker has started testing ad-supported games streaming internally, allowing employees to play select titles free without a Game Pass subscription.

I understand that the free ad-supported version of Xbox Cloud Gaming will include the ability to stream some games you own, as well as eligible Free Play Days titles, which let Xbox players try games over a weekend. You'll also be able to stream Xbox Retro Classics games. Sources tell me the internal testing includes around two minutes of preroll ads before a game is available to stream for free through Xbox Cloud Gaming. [...] The ad-supported Xbox Cloud Gaming version will be available on PC, Xbox consoles, handheld devices, and via the web.

Transportation

Tesla's Lead in Car Software Updates Remains Unchallenged (wired.com) 107

No automaker has matched Tesla's ability to deliver over-the-air software updates despite years of effort and billions in spending. Tesla introduced the technology in 2012 and issued 42 updates within six months, Jean-Marie Lapeyre, Capgemini's chief technology officer for automotive, told WIRED. Other automakers ship updates "maybe once a year," Lapeyre said.

General Motors actually introduced OTA functionality first in 2010, two years before Tesla, but limited it to the OnStar telematics system. Traditional automakers treat software as one bolt-on component among many. Tesla and other digital-native brands like Rivian, Lucid and Chinese companies including BYD and Xpeng treat it as central. There are now 69 million OTA-capable vehicles in the United States, S&P Global estimates. More than 13 million vehicles were recalled in 2024 due to software-related issues, a 35 percent increase over the prior year. OTA updates cost automakers $66.50 per vehicle for each gigabyte of data, Harman Automotive estimates.
Biotech

Microsoft Says AI Can Create 'Zero Day' Threats In Biology (technologyreview.com) 29

An anonymous reader quotes a report from MIT Technology Review: A team at Microsoft says it used artificial intelligence to discover a "zero day" vulnerability in the biosecurity systems used to prevent the misuse of DNA. These screening systems are designed to stop people from purchasing genetic sequences that could be used to create deadly toxins or pathogens. But now researchers led by Microsoft's chief scientist, Eric Horvitz, say they have figured out how to bypass the protections in a way previously unknown to defenders. The team described its work today in the journal Science.

Horvitz and his team focused on generative AI algorithms that propose new protein shapes. These types of programs are already fueling the hunt for new drugs at well-funded startups like Generate Biomedicines and Isomorphic Labs, a spinout of Google. The problem is that such systems are potentially "dual use." They can use their training sets to generate both beneficial molecules and harmful ones. Microsoft says it began a "red-teaming" test of AI's dual-use potential in 2023 in order to determine whether "adversarial AI protein design" could help bioterrorists manufacture harmful proteins.

The safeguard that Microsoft attacked is what's known as biosecurity screening software. To manufacture a protein, researchers typically need to order a corresponding DNA sequence from a commercial vendor, which they can then install in a cell. Those vendors use screening software to compare incoming orders with known toxins or pathogens. A close match will set off an alert. To design its attack, Microsoft used several generative protein models (including its own, called EvoDiff) to redesign toxins -- changing their structure in a way that let them slip past screening software but was predicted to keep their deadly function intact.
"This finding, combined with rapid advances in AI-enabled biological modeling, demonstrates the clear and urgent need for enhanced nucleic acid synthesis screening procedures coupled with a reliable enforcement and verification mechanism," says Dean Ball, a fellow at the Foundation for American Innovation, a think tank in San Francisco.
Crime

Cops: Accused Vandal Confessed To ChatGPT 59

alternative_right shares a report from the Smoking Gun: Minutes after vandalizing 17 cars in a Missouri college parking lot, a 19-year-old sophomore had a lengthy ChatGPT conversation during which he confessed to the crime, asked about the possibility of getting caught, and wondered, "is there any way they could know it was me," according to a police probable cause statement. Ryan Schaefer was arrested yesterday and charged with felony property damage for a rampage early Sunday at a Missouri State University parking lot. Investigators allege that Schaefer shattered car windows, ripped off side mirrors, dented hoods, and broke windshield wipers during the 3 AM spree.

When confronted with surveillance footage and other evidence, Schaefer said that he could see the resemblance between the suspect and himself. At that point, Schaefer reportedly consented to a search of his iPhone. A subsequent review of the device revealed location data placing Schaefer "at or near the scene of the crime," as well as a "troubling dialogue exchange this defendant seems to have had with artificial intelligence software installed on his phone," prosecutors reported.
The incriminating ChatGPT conversation can be found here.
Businesses

In a Sea of Tech Talent, Companies Can't Find the Workers They Want (wsj.com) 106

Tech companies are struggling to fill AI-specialized roles despite a surplus of available tech talent. U.S. colleges more than doubled the number of computer science degrees awarded between 2013 and 2022. Major layoffs at Google, Meta, and Amazon flooded the job market. The Bureau of Labor Statistics predicts businesses will employ 6% fewer computer programmers in 2034 than last year. The disconnect stems from companies seeking workers with specific AI expertise.

Runway CEO Cristobal Valenzuela estimates only hundreds of people worldwide possess the skills to train complex AI models. His company advertises base salaries up to $490,000 for a director of machine learning. Daniel Park's startup Pickle offers up to $500,000 base salary and expects candidates willing to work seven days a week. The WSJ story includes the example of one James Strawn, who was laid off from Adobe over the summer after 25 years as a senior software quality-assurance engineer. The 55-year-old has had one interview since his layoff. Matt Massucci, CEO of recruiting firm Hirewell, told the publication companies can automate some low-level engineering tasks and redirect that money to high-end talent.
AI

Mira Murati's Stealth AI Lab Launches Its First Product (wired.com) 33

An anonymous reader quotes a report from Wired: Thinking Machines Lab, a heavily funded startup cofounded by prominent researchers from OpenAI, has revealed its first product -- a tool called Tinker that automates the creation of custom frontier AI models. "We believe [Tinker] will help empower researchers and developers to experiment with models and will make frontier capabilities much more accessible to all people," said Mira Murati, cofounder and CEO of Thinking Machines, in an interview with WIRED ahead of the announcement.

Big companies and academic labs already fine-tune open source AI models to create new variants that are optimized for specific tasks, like solving math problems, drafting legal agreements, or answering medical questions. Typically, this work involves acquiring and managing clusters of GPUs and using various software tools to ensure that large-scale training runs are stable and efficient. Tinker promises to allow more businesses, researchers, and even hobbyists to fine-tune their own AI models by automating much of this work.

Essentially, the team is betting that helping people fine-tune frontier models will be the next big thing in AI. And there's reason to believe they might be right. Thinking Machines Lab is helmed by researchers who played a core role in the creation of ChatGPT. And, compared to similar tools on the market, Tinker is more powerful and user friendly, according to beta testers I spoke with. Murati says that Thinking Machines Lab hopes to demystify the work involved in tuning the world's most powerful AI models and make it possible for more people to explore the outer limits of AI. "We're making what is otherwise a frontier capability accessible to all, and that is completely game-changing," she says. "There are a ton of smart people out there, and we need as many smart people as possible to do frontier AI research."
"There's a bunch of secret magic, but we give people full control over the training loop," OpenAI veteran John Schulman says. "We abstract away the distributed training details, but we still give people full control over the data and the algorithms."
Security

Intel and AMD Trusted Enclaves, a Foundation For Network Security, Fall To Physical Attacks (arstechnica.com) 96

Researchers have unveiled two new hardware-based attacks, Battering RAM and Wiretap, that break Intel SGX and AMD SEV-SNP trusted enclaves by exploiting deterministic encryption and physical interposers. Ars Technica reports: In the age of cloud computing, protections baked into chips from Intel, AMD, and others are essential for ensuring confidential data and sensitive operations can't be viewed or manipulated by attackers who manage to compromise servers running inside a data center. In many cases, these protections -- which work by storing certain data and processes inside encrypted enclaves known as TEEs (Trusted Execution Enclaves) -- are essential for safeguarding secrets stored in the cloud by the likes of Signal Messenger and WhatsApp. All major cloud providers recommend that customers use it. Intel calls its protection SGX, and AMD has named it SEV-SNP.

Over the years, researchers have repeatedly broken the security and privacy promises that Intel and AMD have made about their respective protections. On Tuesday, researchers independently published two papers laying out separate attacks that further demonstrate the limitations of SGX and SEV-SNP. One attack, dubbed Battering RAM, defeats both protections and allows attackers to not only view encrypted data but also to actively manipulate it to introduce software backdoors or to corrupt data. A separate attack known as Wiretap is able to passively decrypt sensitive data protected by SGX and remain invisible at all times.
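
To illustrate why deterministic encryption matters here, the following is a toy sketch in Python. It is not the actual SGX or SEV-SNP memory encryption, and it is not the researchers' attack code. The cipher, key, address, and function names (`toy_encrypt`, `toy_decrypt`) are all hypothetical. It only models the general property that a deterministic scheme maps the same key, address, and plaintext to the same ciphertext every time, so someone who can observe or rewrite memory traffic can spot repeated values and put old ciphertext back.

```python
import hashlib
import hmac

BLOCK_SIZE = 16  # bytes per toy memory block


def _pad(key: bytes, address: int) -> bytes:
    """Derive a fixed pad from the key and the memory address only."""
    digest = hmac.new(key, address.to_bytes(8, "big"), hashlib.sha256).digest()
    return digest[:BLOCK_SIZE]


def toy_encrypt(key: bytes, address: int, plaintext: bytes) -> bytes:
    """Deterministic toy cipher: no nonce or counter, so output never varies."""
    block = plaintext.ljust(BLOCK_SIZE, b"\0")[:BLOCK_SIZE]
    return bytes(p ^ k for p, k in zip(block, _pad(key, address)))


def toy_decrypt(key: bytes, address: int, ciphertext: bytes) -> bytes:
    """XOR with the same pad reverses toy_encrypt."""
    return bytes(c ^ k for c, k in zip(ciphertext, _pad(key, address)))


key = b"hypothetical-enclave-key"  # assumption: stands in for a hardware key
address = 0x1000                   # assumption: arbitrary memory address

# 1. Equality leak: the same value written twice looks identical to an observer.
first_write = toy_encrypt(key, address, b"balance=100")
second_write = toy_encrypt(key, address, b"balance=100")
repeat_is_visible = first_write == second_write

# 2. Replay: an old ciphertext put back later decrypts cleanly to the old value.
memory = {address: first_write}
memory[address] = toy_encrypt(key, address, b"balance=0")
memory[address] = first_write  # observer restores the captured block
replayed_value = toy_decrypt(key, address, memory[address]).rstrip(b"\0")
```

In this toy, `repeat_is_visible` is true and `replayed_value` is the stale `b"balance=100"`, with no decryption error to signal that anything changed. The XOR pad used here is far weaker than a real memory cipher; it stands in only for the "same input, same output" behavior.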

Microsoft

Nadella Appoints New CEO To Run Microsoft's Biggest Businesses (theverge.com) 11

Microsoft is promoting Judson Althoff, currently executive vice president and chief commercial officer at Microsoft, to a new role as CEO of its commercial business. From a report: It's the latest shakeup inside the company, as Microsoft navigates what CEO Satya Nadella calls a "tectonic AI platform shift." It's also a move that will allow Nadella to focus on more technical work at Microsoft, while still remaining overall CEO.

In an internal memo to employees today, Nadella announced Althoff's promotion and said it's linked with the need for Microsoft to reinvent itself in the AI era and "bring together sales, marketing, operations, and engineering to drive growth and strengthen our position as the partner of choice for AI transformation." Althoff has led Microsoft's global sales organization for the past nine years, helping the company build out its Microsoft Customer and Partner Solutions (MCAPS) division. He will now also be responsible for the operations and marketing teams that help sell Microsoft's software and services to businesses, but not the engineering teams that help build them.

Books

Independent UK Bookshops To Begin Selling eBooks 17

Independent UK bookshops will now be able to sell ebooks via a new platform (Bookshop.org's expansion), keeping 100% of profits and offering a non-Amazon way to reach digital readers. "Bookshops now have an additional tool in their fight against Amazon," said Nicole Vanderbilt, managing director of Bookshop.org UK. "Digital readers don't depend on Amazon's monopoly any more, now that they can find ebooks at the same price on Bookshop.org." The Guardian reports: Bookshop.org launched in the UK in November 2020 as a platform for independent bookshops to sell physical books. Bookshops receive 30% of the cover price from each sale they generate; so far, the UK site has generated 4.5 million pounds for independent bookshops. Customers will also now be able to buy ebooks through a bookshop of their choice. Profits from orders without a specified bookshop will be added to a shared pool, which will be distributed among all participating bookshops on the platform. [...]

The platform will launch with a catalogue of more than a million ebooks from all major publishers. It will be available online via a web browser and through the Bookshop.org apps on Apple and Android. "Due to Amazon's proprietary digital rights management [DRM] software and publishers' DRM requirements, it's not currently possible to buy DRM-protected ebooks from Bookshop.org or local bookshops and read them on your Kindle," said Bookshop.org. However, the site is working with the e-reader company Kobo to support Kobo devices "later this year," and longer term would "love to offer our own eInk device."
Books

Kindle Scribe Redesign Adds Color Model and AI-powered Notebook Features (aboutamazon.com) 12

Amazon today announced three new Kindle Scribe models, its E Ink tablets designed for note-taking and reading. The lineup includes the standard Kindle Scribe, a version without a front light, and the Kindle Scribe Colorsoft. The new devices feature an 11-inch glare-free E Ink screen compared to the 10.2-inch display on previous models.

Amazon has reduced the weight to 400 grams from 433 grams and made the devices 5.4mm thin. The company added a quad-core processor and additional memory to deliver writing and page turns that are 40% faster than earlier versions. The Colorsoft model uses custom-built display technology to offer 10 pen colors and five highlighter colors. Amazon redesigned the software to include AI-powered notebook search and summaries. The devices will support Google Drive and Microsoft OneDrive for document access and allow users to export notes as editable text to OneNote. The standard Kindle Scribe will start at $499.99 and the Colorsoft at $629.99 when they become available later this year. The version without a front light will cost $429.99 and arrive early next year.
Iphone

FCC Mistakenly Leaks Confidential iPhone 16e Schematics (appleinsider.com) 50

The FCC mistakenly published a 163-page PDF containing detailed schematics for Apple's upcoming iPhone 16e, despite Apple explicitly requesting indefinite confidentiality to protect trade secrets. AppleInsider reports: A cover letter is also distributed alongside the schematics, addressed to the FCC and dated September 16, 2024. The letter from Apple is a request for the confidential treatment of documents that are filed with the FCC. [...] The letter from Apple requests a series of documents are withheld from public viewing "indefinitely." The justification is that they contain "confidential and proprietary trade secrets" that are not disclosed to the public post-release, due to giving competitors an "unfair advantage."

The list of documents, Apple states, includes: Block Diagrams, Electrical Schematic Diagrams, Technical Descriptions, Product Specifications, Antenna Locations, Tune-Up Procedure, and Software Security Description. Other documents, such as external and internal photographs, shots of the test setup, and the user manual, are deemed to be less damaging and have "short-term confidentiality" requirements. In those cases, Apple asks for short-term confidentiality for 180 days after the equipment authorization is granted by the FCC.

Microsoft

Microsoft Launches 'Vibe Working' in Excel and Word (theverge.com) 36

An anonymous reader shares a report: You've probably heard of vibe coding -- novices writing apps by creating a simple AI prompt -- but now Microsoft wants to introduce a similar thing for its Office apps. The software maker is launching a new Agent Mode in Excel and Word that can generate complex spreadsheets and documents with just a prompt. A new Office Agent in Copilot chat, powered by Anthropic models, is also launching today that can create PowerPoint presentations and Word documents from a "vibe working" chatbot.

[...] Agent Mode essentially takes a complex task and breaks it down with planning and reasoning that you can follow. It then uses OpenAI's GPT-5 model to break down each step of document creation into an agentic task and execute it. It's like watching an automated macro in real time, showing everything it's doing in the sidebar.

Programming

Will AI Bring an End to Top Programming Language Rankings? (ieee.org) 51

IEEE Spectrum ranks the popularity of programming languages — but is there a problem? Programmers "are turning away from many of these public expressions of interest. Rather than page through a book or search a website like Stack Exchange for answers to their questions, they'll chat with an LLM like Claude or ChatGPT in a private conversation." And with an AI assistant like Cursor helping to write code, the need to pose questions in the first place is significantly decreased. For example, across the total set of languages evaluated in the Top Programming Languages, the number of questions we saw posted per week on Stack Exchange in 2025 was just 22% of what it was in 2024...

However, an even more fundamental problem is looming in the wings... In the same way most developers today don't pay much attention to the instruction sets and other hardware idiosyncrasies of the CPUs that their code runs on, which language a program is vibe coded in ultimately becomes a minor detail... [T]he popularity of different computer languages could become as obscure a topic as the relative popularity of railway track gauges... But if an AI is soothing our irritations with today's languages, will any new ones ever reach the kind of critical mass needed to make an impact? Will the popularity of today's languages remain frozen in time?

That's ultimately the larger question. "how much abstraction and anti-foot-shooting structure will a sufficiently-advanced coding AI really need...?" [C]ould we get our AIs to go straight from prompt to an intermediate language that could be fed into the interpreter or compiler of our choice? Do we need high-level languages at all in that future? True, this would turn programs into inscrutable black boxes, but they could still be divided into modular testable units for sanity and quality checks. And instead of trying to read or maintain source code, programmers would just tweak their prompts and generate software afresh.

What's the role of the programmer in a future without source code? Architecture design and algorithm selection would remain vital skills... How should a piece of software be interfaced with a larger system? How should new hardware be exploited? In this scenario, computer science degrees, with their emphasis on fundamentals over the details of programming languages, rise in value over coding boot camps.

Will there be a Top Programming Language in 2026? Right now, programming is going through the biggest transformation since compilers broke onto the scene in the early 1950s. Even if the predictions that much of AI is a bubble about to burst come true, the thing about tech bubbles is that there's always some residual technology that survives. It's likely that using LLMs to write and assist with code is something that's going to stick. So we're going to be spending the next 12 months figuring out what popularity means in this new age, and what metrics might be useful to measure.

Having said that, IEEE Spectrum still ranks programming language popularity three ways — based on use among working programmers, demand from employers, and "trending" in the zeitgeist — using seven different metrics.

Their results? Among programmers, "we see that once again Python has the top spot, with the biggest change in the top five being JavaScript's drop from third place last year to sixth place this year. As JavaScript is often used to create web pages, and vibe coding is often used to create websites, this drop in the apparent popularity may be due to the effects of AI... In the 'Jobs' ranking, which looks exclusively at what skills employers are looking for, we see that Python has also taken 1st place, up from second place last year, though SQL expertise remains an incredibly valuable skill to have on your resume."
Transportation

When This EV Company Went Bankrupt, Its Customers Launched a Nonprofit to Keep Their Cars Running (theverge.com) 23

Cristian Fleming paid around $70,000 for one of Fisker Ocean's electric mid-size crossover SUVs. Seven months later the company filed for bankruptcy in June of 2024, reports the Verge, "having only delivered 11,000 vehicles."

"Early adopters were left with cars plagued by battery failures, glitchy software, inconsistent key fobs, and door handles that did not always open. With the company gone, there was no way to fix any issues." Regulators logged dozens of complaints as replacement parts vanished. Passionate owners who spent top dollar on high-end trims saw their cars reduced to expensive driveway ornaments.

Rather than accept defeat, thousands of Ocean owners have organized into their own makeshift car company. The Fisker Owners Association (FOA) is a nonprofit that's launched third-party apps, built a global parts supply chain, and come together around a future for their orphaned vehicles. It's part car club, part tech startup, part survival mission. Fleming now serves as the organization's president... FOA calls itself the first entirely owner-controlled EV fleet in history. So far, 4,055 Ocean owners have signed up, paying $550 a year in dues that the group estimates will raise around $3 million annually, about 0.1 percent of Fisker's peak valuation. Only verified Ocean owners can become full members, but anyone can donate.
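
The dues figures above can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the summary. The variable names are illustrative, and the summary does not say how the group arrives at its $3 million estimate.

```python
members = 4_055                     # owners signed up so far
annual_dues_usd = 550               # dues per member per year
estimated_revenue_usd = 3_000_000   # "around $3 million", the group's own estimate

# Dues collected if only the current members pay.
dues_from_current_members = members * annual_dues_usd

# Members needed for dues alone to reach the estimate (ceiling division).
members_needed_for_estimate = -(-estimated_revenue_usd // annual_dues_usd)

# If the estimate is "about 0.1 percent" of peak valuation, the valuation is 1,000x.
implied_peak_valuation_usd = estimated_revenue_usd * 1_000

print(f"{dues_from_current_members:,}")    # 2,230,250
print(f"{members_needed_for_estimate:,}")  # 5,455
print(f"{implied_peak_valuation_usd:,}")   # 3,000,000,000
```

On these figures, current membership accounts for roughly $2.2 million a year in dues, and the "0.1 percent" comparison implies a peak valuation of about $3 billion.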

The grassroots effort has precedent — DeLorean diehards and Saab enthusiasts have kept their favorite brands alive after factory closures. But those efforts focused on preserving aging vehicles. FOA is attempting something different: real-time software updates and hardware improvements for a connected, two-year-old EV fleet... The organization has spawned three separate companies. Tsunami Automotive handles parts in North America while Tidal Wave covers Europe, scavenging insurance auctions and contracting with tooling manufacturers to reproduce components. UnderCurrent Automotive, run by former Google and Apple engineers, focuses on software solutions.

UnderCurrent's first product is OceanLink Pro, a third-party mobile app now used by over 1,200 members that restores basic EV features, such as remote battery monitoring and climate control. A companion device called OceanLink Pulse adds wireless CarPlay and Android Auto, with plans for future upgrades including keyless entry. "Those are things you would have expected to be in a $70,000 luxury car," says Clint Bagley [FOA's treasurer]. "But, you know, we're happy to provide what the billion-dollar automaker apparently couldn't."

Robotics

Humanoid Robots Are Meta's Next 'AR-Sized Bet' (theverge.com) 44

Meta is making humanoid robots its next massive "AR-sized bet," investing billions into a project led by top roboticists. The focus will be less on hardware and more on software dexterity, aiming to license its robotics platform to manufacturers much like Google licenses Android. The Verge reports: During a recent conversation at Meta's headquarters, CTO Andrew Bosworth said he stood up a robotics "research effort" earlier this year at the direction of CEO Mark Zuckerberg. The team's existence has been reported on before, but Bosworth hadn't discussed its strategy in-depth until our interview. "I don't think the hardware is the hard part," he told me ahead of Meta's recent Connect conference. "I'm not saying the hardware isn't also hard, but it's not the bottleneck. The bottleneck is the software."

To demonstrate, Bosworth picked up my glass of water from a table between us. "If you know robotics, one of the biggest problems that you have is dexterous manipulation," he said. "These robots, they can stand, they can run, they can do a flip, because the ground is a super stable thing." By contrast, a robot trying to pick up the glass of water would likely "immediately crush it or spill all the water." While Meta is currently building its own humanoid, or "Metabot" as it's called internally, Bosworth envisions the company licensing its software platform to other robot manufacturers. "I don't care about us being the hardware manufacturers," he explained.

China

Chinese Hackers Breach US Software and Law Firms Amid Trade Fight (cnn.com) 3

An anonymous reader quotes a report from CNN: A team of suspected Chinese hackers has infiltrated US software developers and law firms in a sophisticated campaign to collect intelligence that could help Beijing in its ongoing trade fight with Washington, cybersecurity firm Mandiant said Wednesday. The hackers have been rampant in recent weeks, hitting the cloud-computing firms that numerous American companies rely on to store key data, Mandiant, which is owned by Google, said. In a sign of how important China's hacking army is in the race for tech supremacy, the hackers have also stolen US tech firms' proprietary software and used it to find new vulnerabilities to burrow deeper into networks, according to Mandiant.

[...] In some cases, the hackers have lurked undetected in the US corporate networks for over a year, quietly collecting intelligence, Mandiant said. The disclosure comes after the Trump administration escalated America's trade war with China this spring by slapping unprecedented tariffs on Chinese exports to the United States. The tit-for-tat tariffs set off a scramble in both governments to understand each other's positions. Mandiant analysts said the fallout from the breaches -- the task of kicking out the hackers and assessing the damage -- could last many months. They described it as a milestone hack, comparable in severity and sophistication to Russia's use of SolarWinds software to infiltrate US government agencies in 2020.

Operating Systems

Amazon Fire TV Devices Expected To Ditch Android for Linux in 2025 (arstechnica.com) 29

Amazon Fire TV devices will run the company's Linux-based Vega OS starting in 2025, according to a job listing that Amazon subsequently edited after press inquiries. The software development manager position originally sought someone to oversee "the Vega OS experience" and "the dedicated Prime Video app on Vega OS" launching in 2025. Amazon removed references to Vega after a reporter contacted the company for comment.

The proprietary OS already powers the Echo Hub, Echo Show 5 third generation, and Echo Spot, running on Linux kernel 5.16 according to Amazon's source code notices. Current Fire TV devices won't receive Vega updates. The shift from Android would eliminate Google's influence over Amazon's streaming hardware business and remove smartphone code unnecessary for TV devices.
Microsoft

Microsoft Disables Some Cloud Services Used by Israel's Defense Ministry (msn.com) 119

Microsoft has disabled the Israeli Defense Ministry's access to certain services and subscriptions, after finding evidence that the ministry used the tech company's cloud services to surveil Gaza citizens. WSJ adds: The software company made the move after an internal investigation indicated Israel's Defense Ministry used Microsoft's Azure cloud services for surveillance, according to a person familiar with the matter. The company probe is ongoing. "As employees, we all have a shared interest in privacy protection, given the business value it creates by ensuring our customers can rely on our services with rock solid trust," Microsoft President Brad Smith said in a blog post Thursday on Microsoft's company website.

Smith said Microsoft's investigation was guided by the company's "longstanding protection of privacy as a fundamental right." Microsoft opened the probe after the Guardian, the British news organization, reported in August that Israel used Azure to store data on Gaza civilians and surveil them. The issue has been the source of protests at the company.

AI

OpenAI Says GPT-5 Stacks Up To Humans in a Wide Range of Jobs (techcrunch.com) 39

An anonymous reader shares a report: OpenAI released a new benchmark on Thursday that tests how its AI models perform compared to human professionals across a wide range of industries and jobs. The test, GDPval, is an early attempt at understanding how close OpenAI's systems are to outperforming humans at economically valuable work -- a key part of the company's founding mission to develop artificial general intelligence or AGI.

OpenAI says it found that its GPT-5 model and Anthropic's Claude Opus 4.1 "are already approaching the quality of work produced by industry experts." That's not to say that OpenAI's models are going to start replacing humans in their jobs immediately. Despite some CEOs' predictions that AI will take the jobs of humans in just a few years, OpenAI admits that GDPval today covers a very limited number of tasks people do in their real jobs. However, it is one of the latest ways the company is measuring AI's progress towards this milestone. GDPval is based on nine industries that contribute the most to America's gross domestic product, including domains such as healthcare, finance, manufacturing, and government. The benchmark tests an AI model's performance in 44 occupations among those industries, ranging from software engineers to nurses to journalists.
