Detecting and Controlling Sycophancy with Cascading Linear Features
arXiv:2606.26155v1 Announce Type: new Abstract: Interpreting and controlling model behaviors through activation steering methods requires many pairs of contrastive samples that...
Life After Benchmark Saturation: A Case Study of CORE-Bench
arXiv:2606.26158v1 Announce Type: new Abstract: When a benchmark's accuracy saturates, it is often retired and replaced with a more challenging version. We show that this approa...
AlgoEvolve: LLM-driven Meta-evolution of Algorithmic Trading Programs
arXiv:2606.26173v1 Announce Type: new Abstract: Recent work shows that Large Language Models (LLMs) can act as semantic mutation operators for the evolutionary discovery of prog...
OpenAI frontier models and Codex are now available on AWS
OpenAI frontier models and Codex are now generally available on AWS, giving enterprises a new path to build with OpenAI through the AWS environments, controls, and procurement w...
Databricks brings GPT-5.5 to enterprise agent workflows
Databricks uses GPT-5.5 for enterprise agent workflows after the model set a new state of the art on the OfficeQA Pro benchmark.
Introducing GPT-5.4 mini and nano
GPT-5.4 mini and nano are smaller, faster versions of GPT-5.4 optimized for coding, tool use, multimodal reasoning, and high-volume API and sub-agent workloads.
CyberSecEval 2 - A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models
CyberSecEval 2 - A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models
Introducing the Gemini 2.5 Computer Use model
Available in preview via the API, our Computer Use model is a specialized model built on Gemini 2.5 Pro’s capabilities to power agents that can interact with user interfaces.
MobileGuard: A Mobile-Native Governance Framework for Agentic AI
Article URL: https://zenodo.org/records/20970167 Comments URL: https://news.ycombinator.com/item?id=48701972 Points: 1 # Comments: 0
Is it agentic enough? Benchmarking open models on your own tooling
Is it agentic enough? Benchmarking open models on your own tooling
A New Framework for Evaluating Voice Agents (EVA)
A New Framework for Evaluating Voice Agents (EVA)
Everyone feared AI taking over; the real danger is AI serving just the few
Everyone feared AI would enslave humanity; but it looks like the real fight is stopping governments and Big Tech from enslaving AI for the benefit of the few. Amid the newly ann...
Gemma Scope 2: helping the AI safety community deepen understanding of complex language model behavior
Open interpretability tools for language models are now available across the entire Gemma 3 family with the release of Gemma Scope 2.
Strengthening our Frontier Safety Framework
We’re strengthening the Frontier Safety Framework (FSF) to help identify and mitigate severe risks from advanced AI models.
What Happens When AI Agents Refuse to Work Until They're Paid
Article URL: https://blog.owulveryck.info/2026/06/25/from-isolated-agents-to-agentic-mesh-orchestrating-sdlc-with-a2a-and-ap2.html Comments URL: https://news.ycombinator.com/ite...
Agentic Analysis for Agentic Infrastructure: An LLM-Powered Pipeline for Comparative Governance of DAO and Corporate AI Protocols
arXiv:2606.26203v1 Announce Type: new Abstract: As AI agent protocols proliferate, the governance structures shaping their interoperability standards remain empirically underexa...
Knowledge-augmented Agentic AI for Mental Health Medication Information Seeking
arXiv:2606.26205v1 Announce Type: new Abstract: Patients increasingly seek medication information online, yet safety knowledge for psychiatric drugs is split between regulatory...
Governing Actions, Not Agents: Institutional Attestation as a Governance Model for Autonomous AI Systems
arXiv:2606.26298v1 Announce Type: new Abstract: Autonomous AI agents may begin to perform consequential, irreversible actions such as clinical prescribing and production softwar...
COrigami: An AI Pipeline for Co-Designing Flat-Foldable Visually Recognisable Origami
arXiv:2606.26299v1 Announce Type: new Abstract: While generative AI has achieved remarkable success in solving problems with verifiable solutions, generating physical art that s...
The Verification Horizon: No Silver Bullet for Coding Agent Rewards
arXiv:2606.26300v1 Announce Type: new Abstract: A classical intuition holds that verifying a solution is easier than producing one. For today's coding agents, this intuition is...