AI News Digest
2025-11-17
The novelty phase is officially over—this week’s AI news signals we’ve entered the stabilization era, where the industry’s focus has decisively shifted from “look what it can do” to “can we actually understand and trust what it’s doing?” Between OpenAI’s experiments with sparse models for debugging neural networks, new research on weight-sparse transformers revealing interpretable circuits, and the Upwork study confirming what we all suspected (AI agents still need human babysitters), there’s a clear pattern emerging: explainability and human oversight aren’t nice-to-haves anymore, they’re becoming prerequisites for production deployment. Meanwhile, the open-source versus proprietary battle continues heating up—with Weibo’s VibeThinker-1.5B claiming to outperform DeepSeek-R1 on a shoestring $7,800 budget and Meta releasing its SPICE self-reasoning framework.
📰 General News
ChatGPT Group Chats are here … but not for everyone (yet)
OpenAI has launched ChatGPT Group Chats as a limited pilot in Japan, New Zealand, South Korea, and Taiwan, allowing multiple users (1-20 participants) to collaborate in shared conversations with ChatGPT. The feature runs on GPT-5.1 Auto, supports various tools like image generation and file uploads, and operates independently of ChatGPT’s memory system for privacy. Group chats enable real-time collaboration for planning, brainstorming, and project work, with ChatGPT able to react with emojis and personalize responses. No API or developer access has been announced, keeping it a consumer-facing feature for now.
LinkedIn adds AI-powered search to help users find people
LinkedIn is rolling out an AI-powered people search feature to premium users in the United States. This new functionality aims to help users find and connect with people more effectively using artificial intelligence capabilities.
Weibo launches open-source AI model VibeThinker-1.5B
Weibo AI has released VibeThinker-1.5B, an open-source AI model with 1.5 billion parameters. The model is hosted on Hugging Face, making it publicly accessible for download and use. This represents Weibo’s entry into the open-source AI model space, though limited information is available from the brief announcement.
ChatGPT launches pilot group chats across Japan, New Zealand, South Korea, and Taiwan
OpenAI is piloting group chat functionality for ChatGPT in Japan, New Zealand, South Korea, and Taiwan. The feature allows invitation-only group conversations while maintaining privacy for individual chats and personal ChatGPT memory. OpenAI describes this as a small first step toward creating a more shared experience within the app, with members able to leave groups at any time.
Introducing OpenAI for Ireland
OpenAI announces the launch of OpenAI for Ireland, a partnership initiative with the Irish Government, Dogpatch Labs, and Patch. The program aims to support Irish small and medium enterprises (SMEs), founders, and young builders by providing them with AI tools and resources to drive innovation, enhance productivity, and develop the next generation of Irish technology startups.
Mozilla announces an AI ‘window’ for Firefox
Mozilla is developing a new AI feature for Firefox called ‘AI Window’ that will include an AI assistant and chatbot. The company describes it as an opt-in, user-controlled feature that is being developed openly with user input. Firefox positions itself as an independent browser alternative.
Introducing GPT-5.1 for developers
OpenAI has released GPT-5.1 through its API for developers. The new model features faster adaptive reasoning capabilities, extended prompt caching for improved efficiency, enhanced coding performance, and introduces two new tools: apply_patch and shell for developer workflows.
OpenAI reboots ChatGPT experience with GPT-5.1 after mixed reviews of GPT-5
OpenAI has released GPT-5.1 (Instant and Thinking variants) as an upgrade to GPT-5, which received mixed reviews at launch. The new models feature more conversational and natural tones, adaptive reasoning capabilities, and expanded personalization options including multiple personality presets. GPT-5.1 Thinking uses fewer tokens on simple tasks while maintaining performance on complex queries. The release follows criticism of GPT-5’s initial rollout, where users found it didn’t significantly outperform older models and OpenAI’s plan to sunset beloved models was poorly received.
Google is introducing its own version of Apple’s private AI cloud compute
Google is launching its own version of private AI cloud compute, similar to Apple’s Private Cloud Compute system. This represents Google’s effort to provide privacy-focused AI processing capabilities in the cloud, following Apple’s approach to handling sensitive AI workloads while maintaining user privacy guarantees.
ElevenLabs’ new AI marketplace lets brands use famous voices for ads
ElevenLabs, an AI audio startup, is launching an Iconic Voice Marketplace that allows companies to license AI-replicated voices of famous figures for content and advertisements. The company claims this marketplace addresses ethical concerns by providing a consent-based, performer-first approach to using AI-generated celebrity voices.
Chronosphere takes on Datadog with AI that explains itself, not just outages
Chronosphere, a $1.6B observability startup, announced AI-Guided Troubleshooting capabilities to help engineers diagnose software failures. The system uses a Temporal Knowledge Graph that maps services, infrastructure, and changes over time, combined with AI analysis that shows its reasoning rather than making automatic decisions. The company positions itself against competitors like Datadog, Dynatrace, and Splunk by emphasizing transparency, custom telemetry coverage, and cost reduction (claiming 84% average savings). Features enter limited availability with select customers, with general availability planned for 2026.
Wikipedia urges AI companies to use its paid API and stop scraping
Wikipedia has announced a plan to address declining traffic in the AI era by urging AI companies to use its paid API service instead of scraping its content. The nonprofit encyclopedia is seeking to ensure financial sustainability as AI systems increasingly use its data for training and responses, potentially reducing direct visits to the Wikipedia website.
Meta’s star AI scientist Yann LeCun plans to leave for own startup
Yann LeCun, Meta’s Chief AI Scientist and Turing Award winner, is reportedly planning to leave the company to start his own venture. The departure is attributed to frustration with Meta’s strategic shift from fundamental AI research toward rapid product development and commercialization. This represents a significant loss for Meta’s AI research division.
Faster Than a Click: Hyperlink Agent Search Now Available on NVIDIA RTX PCs
NVIDIA announces Hyperlink Agent Search, a new feature for RTX PCs that enables LLM-based AI assistants to access and search through local files including slides, notes, PDFs, and images. The technology aims to provide better context for AI responses by allowing assistants to retrieve information from users’ personal document collections stored on their computers.
Expanding support for AI developers on Hugging Face
Google Cloud and Hugging Face announced an expanded partnership to improve AI developer experience. Key improvements include: significantly reduced model download times (from hours to minutes) through a new caching gateway on Google Cloud, native TPU support for all Hugging Face open models alongside existing GPU support, and enhanced security through Google Cloud’s threat intelligence and Mandiant validation for models deployed via Vertex AI Model Garden.
ElevenLabs strikes deals with celebrities to create AI audio
ElevenLabs, an AI voice synthesis company, has signed deals with actors Michael Caine and Matthew McConaughey to create AI-generated versions of their voices. This represents a commercial partnership where celebrities are licensing their voices for AI audio generation purposes.
Announcing BigQuery-managed AI functions for better SQL
Google Cloud announces public preview of BigQuery-managed AI functions (AI.IF, AI.CLASSIFY, and AI.SCORE) that integrate LLM capabilities directly into SQL queries. These functions enable semantic filtering, data classification, and ranking using natural language criteria without requiring prompt tuning or model selection. BigQuery automatically optimizes prompts, query plans, and model parameters to reduce costs and improve performance when processing unstructured data like text and images alongside traditional SQL operations.
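To make this concrete, here is a rough sketch of what such a query might look like. The dataset, table, column names, and prompt are all hypothetical, and the AI.IF call follows the pattern described in the announcement; the exact public-preview syntax may differ.

```python
# Hypothetical project/dataset/column names; AI.IF usage is illustrative
# and may differ from the final public-preview syntax.
query = """
SELECT ticket_id, body
FROM `my_project.support.tickets`
WHERE AI.IF(('Does this support ticket describe a service outage? ', body))
"""

# Actually running it requires the google-cloud-bigquery client and a real
# GCP project, so the call is left commented out here:
# from google.cloud import bigquery
# rows = list(bigquery.Client().query(query).result())
```

The appeal, per Google, is that semantic filters like this sit directly in the WHERE clause, with prompt and model tuning handled automatically.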
Visa builds AI commerce infrastructure ahead of 2026 Asia Pacific pilot
Visa announced its Intelligent Commerce platform for Asia Pacific on November 12, designed to address the emerging challenge of AI agents flooding merchant websites. The infrastructure aims to distinguish between legitimate AI shopping agents and malicious bots, with a 2026 pilot planned for the region.
Piloting group chats in ChatGPT
OpenAI is piloting a new group chat feature in ChatGPT that allows multiple users to collaborate in a shared conversation with the AI. The feature is designed to facilitate planning, brainstorming, and collaborative creation among team members within a single ChatGPT conversation.
Fei-Fei Li’s World Labs speeds up the world model race with Marble, its first commercial product
World Labs, founded by AI pioneer Fei-Fei Li, has launched Marble, its first commercial product in the world model space. Marble differentiates itself from competitors like Odyssey, Decart, and Google’s Genie by creating persistent, downloadable 3D environments instead of generating worlds dynamically during exploration. This represents World Labs’ entry into the competitive AI-generated 3D world market.
BMW to Use Alexa+ for in-Vehicle Voice Assistance
BMW has announced it will be the first automaker to integrate Amazon’s upgraded Alexa+ technology for in-vehicle voice assistance. This integration will allow BMW to build uniquely branded AI assistants for their vehicles. The specific timeline for implementation has not been determined yet.
Meta’s chief AI scientist Yann LeCun reportedly plans to leave to build his own startup
Yann LeCun, Meta’s chief AI scientist and Turing Award winner, is reportedly planning to leave the company to start his own startup. The new venture will focus on continuing his research work on world models, a key area of AI research that aims to enable AI systems to understand and predict how the world works.
AWS AI to transform research data on chimpanzees
AWS has committed $1 million to digitize 65 years of handwritten chimpanzee research data from the Jane Goodall Institute using AI technology. The project aims to transform analog field notes into searchable digital archives, making decades of primate research more accessible to scientists and researchers.
Achieve better AI-powered code reviews using new memory capabilities on Gemini Code Assist
Google Cloud announces a new memory capability for Gemini Code Assist on GitHub that enables AI code review agents to learn from past interactions. The feature automatically extracts and stores coding standards from pull request feedback, creating dynamic rules that adapt to team preferences. Memory is stored securely in Google-managed projects and applies learned rules to future code reviews, both guiding initial analysis and filtering suggestions to avoid repeating previously rejected feedback.
Supporting Viksit Bharat: Announcing our newest AI investments in India
Google Cloud announces major AI infrastructure expansion in India, including deployment of Trillium TPUs and AI Hypercomputer architecture to support local data residency and sovereignty requirements. The company is making its latest Gemini models available in India with full data residency support, launching Document AI and batch processing capabilities locally, and partnering with IIT Madras to support the Indic Arena platform for evaluating AI models on India-specific multilingual tasks.
💰 BigMoneyDeals
Microsoft Confirms $10B Spend on Portuguese AI Data Center
Microsoft has announced a $10 billion investment in an AI data center in Portugal. This investment is part of Microsoft’s broader strategy to more than double its European data center capacity across 16 countries by 2027, reflecting the company’s commitment to expanding AI infrastructure in Europe.
Nebius Reveals $3B Deal With Meta
Nebius, a neocloud provider, announced a $3 billion five-year deal with Meta for AI infrastructure. The company disclosed this agreement to shareholders via letter. This follows a previous, even larger AI infrastructure deal that Nebius signed with Microsoft in September.
Alembic melted GPUs chasing causal A.I. — now it’s running one of the fastest supercomputers in the world
Alembic Technologies raised $145 million in Series B funding at a $645 million valuation (13x increase from Series A). The San Francisco startup builds causal AI systems that identify cause-and-effect relationships in enterprise data, rather than correlations. The company is deploying an Nvidia NVL72 superPOD, one of the fastest private supercomputers, after discovering its causal models work across business domains beyond initial marketing focus. Customers include Delta Air Lines, Mars, and Nvidia, using the platform to measure previously unmeasurable business impacts like Olympics sponsorship ROI and viral marketing effects.
Wonderful Raises $100M Series A Just 10 Months In
Tel Aviv-based AI startup Wonderful has raised $100 million in Series A funding just 10 months after its founding. The company specializes in developing multilingual customer service AI agents for enterprise applications. This represents a significant funding round for such an early-stage company in the enterprise AI space.
Building for an Open Future - our new partnership with Google Cloud
Hugging Face announces a strategic partnership with Google Cloud to enhance open-source AI development. The collaboration will integrate Hugging Face’s platform with Google Cloud infrastructure, making it easier for developers to build, train, and deploy AI models using Google’s cloud services. This partnership aims to strengthen the open-source AI ecosystem by combining Hugging Face’s model hub and community with Google Cloud’s computing resources.
Anthropic to invest $50B in U.S. AI infrastructure
Anthropic announces a $50 billion investment in U.S. AI infrastructure. This follows similar large-scale infrastructure investments by other generative AI companies like OpenAI in 2024. The investment represents a significant commitment to expanding AI computational capabilities and data center infrastructure.
New AI data center leads Google’s $6.4B investment in Germany
Google announces a $6.4 billion investment in Germany focused on AI infrastructure expansion, with a new AI data center as the centerpiece of this initiative. This represents a significant commitment to building AI computing capacity in Europe.
Immortality startup Eternos nabs $10.3M, pivots to personal AI that sounds like you
Uare.ai (formerly Eternos) raised $10.3 million in seed funding led by Mayfield and Boldstart Ventures. The startup has pivoted from its original immortality focus to developing personal AI technology that can replicate a user’s voice and communication style.
Cursor Raises $2.3B Bringing It to a $29.3B Valuation
Cursor, an AI-powered code development startup founded in 2022, has raised $2.3 billion in funding, bringing its valuation to $29.3 billion. The company focuses on AI-powered code development and ‘vibe coding’ capabilities, demonstrating significant investor confidence in AI development tools.
AI data startup WisdomAI has raised another $50M, led by Kleiner, Nvidia
WisdomAI, an AI data startup, has secured $50 million in funding led by Kleiner Perkins and Nvidia. The company specializes in AI-driven data analytics that can process and answer business questions from various data types, including structured, unstructured, and ‘dirty’ data that hasn’t been cleaned of errors or typos.
Anthropic announces $50 billion data center plan
Anthropic has announced a $50 billion partnership with U.K.-based company Fluidstack to build data center facilities across the United States. This represents a major infrastructure investment by the AI company to support its operations and growth.
Anthropic will invest $50 billion in building AI data centers in the US
Anthropic announced a $50 billion investment to build AI computing infrastructure in the United States. The company is partnering with AI cloud platform Fluidstack to construct data centers in Texas and New York, with additional locations planned. The data centers are expected to come online throughout 2026 and will create approximately 800 jobs.
Gamma Raises $68M for AI Tool
Gamma, an AI-powered presentation tool positioned as a PowerPoint alternative, has raised $68 million in funding. Following this investment round, the company is now valued at $2.1 billion, marking a significant valuation for a presentation software startup in the AI space.
Wonderful raised $100M Series A to put AI agents on the front lines of customer service
Israeli AI agent startup Wonderful has raised $100 million in Series A funding led by Index Ventures, with participation from Insight Partners, IVP, Bessemer, and Vine Ventures. The substantial funding round in a crowded AI agent market suggests investors believe Wonderful is building genuine infrastructure and orchestration capabilities rather than being just another GPT wrapper.
Nvidia Joins $2B India Deep Tech Alliance
Nvidia has joined a $2 billion India Deep Tech Alliance, where it will provide training and mentoring services to Indian startups operating in the deep tech sector. This partnership aims to support the development of India’s deep tech ecosystem through Nvidia’s expertise and resources.
Salesforce to Acquire Spindle AI in Agentic AI Boost
Salesforce is acquiring Spindle AI to enhance its Agentforce platform. The acquisition will add autonomous analytics and self-improving AI capabilities to Salesforce’s existing AI offerings, strengthening its position in the agentic AI market.
Kaltura acquires eSelf, founded by creator of Snap’s AI, in $27M deal
Kaltura, an enterprise video platform company, has acquired eSelf, an AI avatar startup, in a $27 million deal. eSelf was founded by the creator of Snap’s AI technology. The acquisition aims to integrate generative AI capabilities into Kaltura’s enterprise video and learning tools, enhancing their platform with AI avatar technology.
AI PowerPoint-killer Gamma hits $2.1B valuation, $100M ARR, founder says
Gamma, an AI-powered presentation software company positioning itself as a PowerPoint alternative, has reached a $2.1 billion valuation with $100 million in annual recurring revenue (ARR). Co-founder and CEO Grant Lee reports the company is growing quickly and operating profitably.
🔬 Technical
Weight-sparse transformers have interpretable circuits
Researchers from OpenAI have developed a method for creating interpretable circuits in Transformer models by training them with sparse weights, where most connections are zero. This produces models with highly understandable circuits that can be explained at granular levels (individual neurons, attention channels) and are simple enough to visualize completely. The main limitation is that these sparse models are expensive to train and deploy, making direct application to frontier models unlikely, though the team aims to eventually scale the method to create a fully interpretable moderate-sized model.
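To make "most connections are zero" concrete, here is a dependency-free toy that zeroes all but the largest-magnitude weights in a single row. Note this is only an illustration of the sparsity pattern: the OpenAI work enforces sparsity during training rather than pruning a dense model afterwards.

```python
def sparsify(weights, keep_fraction):
    """Zero out all but the largest-magnitude weights in one row.

    A toy stand-in for the sparsity constraint described above; real
    weight-sparse transformers bake this constraint into training.
    """
    k = max(1, int(len(weights) * keep_fraction))
    threshold = sorted((abs(w) for w in weights), reverse=True)[k - 1]
    return [w if abs(w) >= threshold else 0.0 for w in weights]

row = [0.9, -0.05, 0.02, -1.3, 0.4, 0.001, -0.7, 0.3]
sparse_row = sparsify(row, keep_fraction=0.25)  # keep 2 of 8 weights
```

With only a couple of nonzero connections left per row, tracing which inputs drive which outputs becomes tractable, which is the point of the interpretability argument.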
Steering Language Models with Weight Arithmetic
Researchers present a method for steering language model behavior by performing arithmetic operations on model weights rather than activations. The technique involves fine-tuning models on contrasting behaviors and subtracting the weight deltas to isolate behavior directions. Results show this ‘contrastive weight steering’ often generalizes better than activation steering for traits like sycophancy, and can detect emergence of problematic behaviors during training without requiring examples of bad behavior. The work was conducted as part of MATS and includes both paper and code releases.
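The weight arithmetic itself is simple enough to sketch. Below, a minimal dependency-free version using scalar weights in place of tensors; the parameter names and alpha value are invented for illustration, and real steering would operate on full model state dicts.

```python
def contrastive_steering_delta(base, pos_ft, neg_ft):
    """Per-parameter behavior direction: (pos - base) - (neg - base).

    base, pos_ft, neg_ft map parameter names to values. Subtracting the
    two fine-tuning deltas isolates the direction that distinguishes the
    contrasting behaviors, as described in the paper.
    """
    return {name: (pos_ft[name] - base[name]) - (neg_ft[name] - base[name])
            for name in base}

def steer(base, delta, alpha):
    # Add the scaled behavior direction back onto the base weights.
    return {name: base[name] + alpha * delta[name] for name in base}

base = {"w1": 0.5, "w2": -0.2}
pos  = {"w1": 0.8, "w2": -0.1}   # fine-tuned toward the target behavior
neg  = {"w1": 0.4, "w2": -0.4}   # fine-tuned toward the opposite behavior

delta = contrastive_steering_delta(base, pos, neg)
steered = steer(base, delta, alpha=0.5)
```

The claimed advantage over activation steering is that the direction lives in weight space, so it can be applied (or monitored for) without per-token intervention at inference time.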
OpenAI experiment finds that sparse models could give AI builders the tools to debug neural networks
OpenAI researchers are experimenting with sparse neural network architectures to improve AI model interpretability and debugging capabilities. By reducing connections between nodes and using circuit tracing techniques, they achieved 16-fold smaller circuits compared to dense models while maintaining comparable performance. The research focuses on mechanistic interpretability, which reverse-engineers a model’s mathematical structure to understand decision-making processes, though current experiments are limited to smaller models like GPT-2 rather than frontier models.
Inside LinkedIn’s generative AI cookbook: How it scaled people search to 1.3 billion users
LinkedIn has launched AI-powered people search for its 1.3 billion users, three years after ChatGPT’s debut. The system uses semantic understanding to interpret natural language queries and surface relevant professionals, even without exact keyword matches. The technical implementation involved a multi-stage pipeline: distilling a 7B parameter model into smaller models (ultimately 220M parameters), using synthetic training data, GPU-based infrastructure for retrieval, and RL-trained summarizers that reduced input size 20x, achieving 10x throughput gains. LinkedIn’s approach emphasizes pragmatic optimization over hype, focusing on perfecting recommender systems as tools for future agents rather than building agents directly.
Weibo’s new open source AI model VibeThinker-1.5B outperforms DeepSeek-R1 on $7,800 post-training budget
Weibo’s AI division released VibeThinker-1.5B, a 1.5 billion parameter open-source LLM that outperforms much larger models including DeepSeek-R1 (671B parameters) on specific reasoning benchmarks. The model was post-trained for only $7,800 using a novel Spectrum-to-Signal Principle (SSP) training approach that prioritizes solution diversity before reinforcement learning. It excels at math and coding tasks (scoring 74.4 on AIME25 and 51.1 on LiveCodeBench) but lags on general knowledge benchmarks, demonstrating that smaller, efficiently-trained models can match larger systems in specialized domains.
Meta’s SPICE framework lets AI systems teach themselves to reason
Meta FAIR and the National University of Singapore have developed SPICE (Self-Play In Corpus Environments), a reinforcement learning framework that enables AI systems to self-improve through adversarial interaction. The system uses two AI agents: a ‘Challenger’ that creates problems from document corpora and a ‘Reasoner’ that solves them without access to source documents. Testing on models like Qwen3-4B-Base showed consistent improvements across mathematical and general reasoning benchmarks, with the Reasoner’s pass rate increasing from 55% to 85% over time.
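The Challenger/Reasoner loop can be sketched with stub agents. There is no reinforcement learning here, just the interaction shape: the Challenger derives problems from a corpus, the Reasoner answers without seeing the source documents, and the pass rate is scored against the Challenger's held-back answer. Both agents are toy stand-ins (arithmetic over a corpus of number pairs), not the models from the paper.

```python
import random

def challenger(corpus, rng):
    """Draw a 'document' and turn it into a QA pair. The Reasoner never
    sees the source document, only the question."""
    a, b = rng.choice(corpus)
    return f"What is {a} + {b}?", a + b

def reasoner(question):
    # Stub solver standing in for the learning agent: parse and add.
    left, right = question.removeprefix("What is ").removesuffix("?").split(" + ")
    return int(left) + int(right)

rng = random.Random(0)
corpus = [(rng.randrange(100), rng.randrange(100)) for _ in range(50)]
results = []
for _ in range(20):
    question, answer = challenger(corpus, rng)
    results.append(reasoner(question) == answer)
pass_rate = sum(results) / len(results)
```

In SPICE proper, both agents learn: the Challenger is rewarded for posing problems at the edge of the Reasoner's ability, which is what drives the reported 55% to 85% pass-rate climb.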
Meta returns to open source AI with Omnilingual ASR models that can transcribe 1,600+ languages natively
Meta has released Omnilingual ASR, an open-source automatic speech recognition system supporting 1,600+ languages natively, with zero-shot learning capabilities extending coverage to 5,400+ languages. Released under Apache 2.0 license (unlike previous restrictive Llama licenses), it includes models up to 7B parameters, a 3,350-hour corpus covering 348 low-resource languages, and achieves character error rates under 10% in 78% of supported languages. This release follows Meta’s troubled Llama 4 launch and represents a strategic reset in their AI approach.
Baidu unveils proprietary ERNIE 5 beating GPT-5 performance on charts, document understanding and more
Baidu unveiled ERNIE 5.0, a proprietary omni-modal AI model that claims to outperform GPT-5 and Gemini 2.5 Pro on document understanding, chart reasoning, and multimodal tasks. The model is available via Baidu’s ERNIE Bot and Qianfan API at $0.85/$3.40 per million input/output tokens. Baidu also released an open-source model (ERNIE-4.5-VL-28B) under Apache 2.0 license and announced global expansion of AI products including MeDo, Oreate, and digital human platforms. Independent verification of benchmark claims is pending, and early users reported tool-invocation bugs that Baidu acknowledged.
Understanding neural networks through sparse circuits
OpenAI is researching mechanistic interpretability to understand neural network reasoning processes. They are developing a sparse model approach aimed at making AI systems more transparent and improving their safety and reliability. This work focuses on understanding the internal circuits and mechanisms within neural networks.
BlueCodeAgent: A blue teaming agent enabled by automated red teaming for CodeGen AI
Microsoft Research has developed BlueCodeAgent, an end-to-end blue-teaming framework designed to enhance code security in AI-generated code. The system leverages automated red-teaming processes, data, and safety rules to guide large language models in making defensive security decisions. The framework incorporates dynamic testing to reduce false positives in vulnerability detection.
OpenAI’s new LLM exposes the secrets of how AI really works
OpenAI has developed an experimental large language model designed to be more transparent and interpretable than typical LLMs. This is significant because current LLMs function as ‘black boxes’ where their internal decision-making processes are not fully understood. The new model aims to shed light on how LLMs work in general, which could help researchers better understand AI systems.
Google DeepMind is using Gemini to train agents inside Goat Simulator 3
Google DeepMind has developed SIMA 2, an advanced video-game-playing agent capable of navigating and problem-solving across multiple 3D virtual worlds, including Goat Simulator 3. The company positions this as a significant advancement toward general-purpose AI agents and improved real-world robotics. SIMA 2 is an evolution of the original SIMA (scalable instructable multiworld agent) that was first demonstrated last year.
Researchers isolate memorization from problem-solving in AI neural networks
Researchers have discovered that AI neural networks store memorized information and logical reasoning capabilities in distinct pathways. The study reveals that basic arithmetic ability resides in memorization pathways rather than logic circuits, suggesting a fundamental separation between how AI models handle rote learning versus problem-solving tasks.
MMCTAgent: Enabling multimodal reasoning over large video and image collections
Microsoft Research has announced MMCTAgent, a multimodal AI system built on the AutoGen framework that enables dynamic reasoning over large collections of videos and images. The system combines language, vision, and temporal understanding capabilities with iterative planning and reflection mechanisms to handle complex analysis tasks involving long-form video content and image collections.
Project Fetch: Can Claude train a robot dog?
Project Fetch is an Anthropic research initiative exploring whether Claude, their AI language model, can be used to train a robot dog. The project investigates the application of large language models in robotics training and control, representing an expansion of Claude’s capabilities beyond text-based interactions into physical embodied AI systems.
Baidu just dropped an open-source multimodal AI that it claims beats GPT-5 and Gemini
Baidu released ERNIE-4.5-VL-28B-A3B-Thinking, an open-source multimodal AI model under Apache 2.0 license that claims to outperform Google's Gemini 2.5 Pro and OpenAI's GPT-5-High on vision-related benchmarks. The model uses a Mixture-of-Experts architecture with 28 billion total parameters but only activates 3 billion during operation, allowing it to run on a single 80GB GPU. Key features include dynamic image examination ('Thinking with Images'), enhanced visual grounding, and video understanding capabilities, though independent verification of performance claims is pending.
How to Unlock Accelerated AI Storage Performance With RDMA for S3-Compatible Storage
The article discusses how RDMA (Remote Direct Memory Access) technology can enhance storage performance for S3-compatible storage systems in AI workloads. It highlights the growing data demands of AI applications, noting that enterprises are projected to generate nearly 400 zettabytes of data annually by 2028, with 90% being unstructured data including audio, video, PDFs, and images. The piece focuses on technical solutions for scalable and affordable storage infrastructure.
A new top score: Advancing Text-to-SQL on the BIRD benchmark
Google Cloud achieved a state-of-the-art score of 76.13 on the BIRD benchmark’s Single Trained Model Track for text-to-SQL translation, surpassing other single-model solutions (human performance benchmark is 92.96). The achievement was accomplished through a three-phase approach: rigorous data filtering to create a gold-standard dataset, multitask learning using supervised fine-tuning of Gemini 2.5-pro, and self-consistency testing with 1-7 query candidates. This advancement is being integrated into Google Cloud products including AlloyDB AI’s natural language capability, BigQuery’s conversational analytics, and Gemini Code Assist.
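Self-consistency here means sampling several candidate queries and keeping the one the model converges on most often. A minimal voting sketch, using whitespace/case normalization as a crude stand-in for the real equivalence checking (the candidate queries below are invented for illustration):

```python
from collections import Counter

def self_consistent_pick(candidates):
    """Majority vote over sampled query candidates after light
    normalization; a toy stand-in for execution-based consistency checks."""
    normalized = [" ".join(c.lower().split()) for c in candidates]
    winner, _ = Counter(normalized).most_common(1)[0]
    return winner

candidates = [
    "SELECT name FROM users WHERE age > 30",
    "select name  from users where age > 30",
    "SELECT name, age FROM users WHERE age > 30",
]
best = self_consistent_pick(candidates)
```

The intuition: an LLM's occasional SQL mistakes tend to be inconsistent across samples, while correct translations recur, so agreement among 1-7 candidates filters out one-off errors.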
Introducing Agent Sandbox: Strong guardrails for agentic AI on Kubernetes and GKE
Google announced Agent Sandbox at KubeCon NA 2025, a new Kubernetes primitive designed for secure execution of AI agents. Built on gVisor and Kata Containers, it provides kernel-level isolation for agentic AI workloads that execute code and use computer terminals. On GKE, it offers sub-second latency through pre-warmed sandbox pools (90% improvement over cold starts) and introduces Pod Snapshots for checkpoint/restore capabilities, reducing startup times from minutes to seconds for both CPU and GPU workloads.
NVIDIA Wins Every MLPerf Training v5.1 Benchmark
NVIDIA announces that it won every benchmark in MLPerf Training v5.1, the latest round of industry-standard AI training performance tests. The article emphasizes that training more capable AI models requires breakthroughs across multiple hardware and software components including GPUs, CPUs, networking, and system architectures. The results showcase NVIDIA’s Blackwell architecture performance in AI training workloads.
🤔 Sceptical
Turns Out AI Is Not Good at Database Transaction Scheduling
A research article from UC Berkeley’s ADRS group examines the effectiveness of AI approaches for database transaction scheduling. The article appears to present findings that AI methods are not performing well at this specific database optimization task, challenging assumptions about AI’s capabilities in systems-level optimization problems.
Upwork study shows AI agents excel with human partners but fail independently
Upwork released peer-reviewed research evaluating AI agents (GPT-5, Claude Sonnet 4, Gemini 2.5 Pro) on 300+ real freelance projects. AI agents working independently showed poor completion rates on even simple tasks, but when paired with human experts providing just 20 minutes of feedback, completion rates improved by up to 70%. The study challenges both AI replacement fears and autonomous agent hype, suggesting the future involves human-AI collaboration rather than full automation.
Only 9% of developers think AI code can be used without human oversight, BairesDev survey reveals
BairesDev’s Q4 2025 Dev Barometer survey of 501 developers and 19 project managers reveals that only 9% of developers trust AI-generated code enough to use without human oversight, while 56% consider it ‘somewhat reliable.’ Despite this caution, 65% of senior developers expect AI to redefine their roles by 2026, with 74% anticipating a shift from hands-on coding to solution design and architecture. The survey shows developers are saving approximately 8 hours per week using AI tools for code scaffolding and unit tests, but concerns exist about reduced entry-level opportunities potentially creating future talent shortages.
Court rules that OpenAI violated German copyright law, orders it to pay damages
A German court has ruled that OpenAI violated German copyright law by training ChatGPT’s language models on licensed musical works without obtaining proper permission. The court has ordered OpenAI to pay damages as a result of this infringement.
The circular money problem at the heart of AI’s biggest deals
SoftBank and OpenAI announced a 50-50 joint venture called ‘Crystal Intelligence’ to sell enterprise AI tools in Japan. However, the deal raises concerns about circular financing, as SoftBank is simultaneously a major investor in OpenAI. The article questions whether such arrangements create genuine economic value or merely circulate money between related parties without producing real growth.
Closing Thoughts
The novelty phase is officially over—this week’s shift toward explainability, security, and human-in-the-loop validation signals AI’s transition from shiny new toy to infrastructure that needs guardrails. Meanwhile, open-source models are nipping at proprietary heels, multimodal capabilities continue their relentless expansion, and the datacenter gold rush spans from Silicon Valley to Stuttgart. The convergence of group chat features across all major providers tells us exactly where this is heading: AI assistants are about to become permanent members of every team meeting, whether we asked for that or not.
See you next week, where I’ll be writing this from a group chat with three LLMs who’ve volunteered to “help” with my workflow. YAI 👋
Disclaimer: I use AI to help aggregate and process the news. I do my best to cross-check facts and sources, but misinformation may still slip through. Always do your own research and apply critical thinking—with anything you consume these days, AI-generated or otherwise.