TODAY · 20 SIGNALS · Last Update: 2026-07-02 23:06
#01

Constructive Alignment: Governing Preference Dynamics in Human-AI Interaction

arXiv:2607.00001v1 (announce type: new). Abstract: Most approaches to AI alignment treat human preferences as fixed targets to be inferred and optimized. This assumption conflicts...

arXiv AI /
#02

Bounded Morality: Defining the Space of Moral Computation

arXiv:2607.00002v1 (announce type: new). Abstract: Moral cognition has traditionally been modeled as adherence to fixed ethical theories--deontology, consequentialism, virtue ethic...

arXiv AI /
#03

Making Failure Safe: A Constrained, Verifiable Agent Framework for Open-Web Data Collection

arXiv:2607.00035v1 (announce type: new). Abstract: LLMs and agents can generate web scrapers from natural-language requirements, but direct generation remains unreliable because of...

arXiv AI /
#04

Benchmarking Frontier LLMs on Arabic Cultural and Sociolinguistic Knowledge: A Cross-Evaluation Framework with Human SME Ground Truth

arXiv:2607.00139v1 (announce type: new). Abstract: The cost of human expert evaluation is a principal bottleneck to deploying language models in specialized, high-stakes domains. T...

arXiv Computation and Language /
#05

ALEE: Any-Language Evaluation of Embeddings via English-Centric Minimal Pairs

arXiv:2607.00171v1 (announce type: new). Abstract: Text embeddings are standard for semantic similarity tasks, yet their evaluation remains an open challenge. Current benchmarks ar...

arXiv Computation and Language /
#06

TRACE: State-Aware Query Processing over Temporal Evidence Graphs for Conversational Data

arXiv:2607.00339v1 (announce type: new). Abstract: Conversational data is increasingly used as a persistent source of user state for long-running assistants and AI agents. However,...

arXiv Computation and Language /
#07

ScarfBench: Benchmarking AI Agents for Enterprise Java Framework Migration

Hugging Face Blog /
#08

OpenAI frontier models and Codex are now available on AWS

OpenAI frontier models and Codex are now generally available on AWS, giving enterprises a new path to build with OpenAI through the AWS environments, controls, and procurement w...

OpenAI News /
#09

Databricks brings GPT-5.5 to enterprise agent workflows

Databricks uses GPT-5.5 for enterprise agent workflows after the model set a new state of the art on the OfficeQA Pro benchmark.

OpenAI News /
#10

Introducing GPT-5.4 mini and nano

GPT-5.4 mini and nano are smaller, faster versions of GPT-5.4 optimized for coding, tool use, multimodal reasoning, and high-volume API and sub-agent workloads.

OpenAI News /
#11

CyberSecEval 2 - A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models

Hugging Face Blog /
#12

Introducing the Gemini 2.5 Computer Use model

Available in preview via the API, our Computer Use model is a specialized model built on Gemini 2.5 Pro’s capabilities to power agents that can interact with user interfaces.

Google DeepMind Blog /
#13

AI Agent Qubitz

Article URL: https://github.com/Gabrieliam42/AI-Agent-Qubitz | Comments URL: https://news.ycombinator.com/item?id=48767232 | Points: 1 | Comments: 0

Hacker News AI /
#14

Is it agentic enough? Benchmarking open models on your own tooling

Hugging Face Blog /
#15

Gemma Scope 2: helping the AI safety community deepen understanding of complex language model behavior

Open interpretability tools for language models are now available across the entire Gemma 3 family with the release of Gemma Scope 2.

Google DeepMind Blog /
#16

Strengthening our Frontier Safety Framework

We’re strengthening the Frontier Safety Framework (FSF) to help identify and mitigate severe risks from advanced AI models.

Google DeepMind Blog /
#17

Show HN: Fastest Enterprise AI Gateway

Article URL: https://github.com/maximhq/bifrost | Comments URL: https://news.ycombinator.com/item?id=48767003 | Points: 2 | Comments: 0

Hacker News AI /
#18

Show HN: Mirrors – test AI agent changes by replaying real production traces

Article URL: https://www.runmirrors.com/ | Comments URL: https://news.ycombinator.com/item?id=48768200 | Points: 3 | Comments: 0

Hacker News AI /
#19

RareDxR1: Autonomous Medical Reasoning for Rare Disease Diagnosis Beyond Human Annotation

arXiv:2607.00147v1 (announce type: new). Abstract: Rare disease differential diagnosis is a critical yet arduous clinical task, requiring physicians to identify precise phenotypes...

arXiv AI /
#20

Mnemosyne: Agentic Transaction Processing for Validating and Repairing AI-generated Workflows

arXiv:2607.00269v1 (announce type: new). Abstract: LLMs, solvers, and agent teams increasingly generate workflow actions, repairs, and plans, but a generated action may be syntacti...

arXiv AI /