TODAY · 20 SIGNALS · Last Update: 2026-07-03 23:03
#01

Auto-FL-Research: Agentic Search for Federated Learning Algorithms

arXiv:2607.01366v1 Announce Type: new Abstract: Federated learning (FL) research often depends on many small but consequential algorithmic choices: optimizer variants, server ag...

arXiv AI /
#02

Agent4cs: A Multi-agent System for Code Summarization in Large Hierarchical Codebases

arXiv:2607.01425v1 Announce Type: new Abstract: Understanding large, complex codebases, especially those with obfuscated structures and incomplete documentation, remains a signi...

arXiv AI /
#03

Beyond Next-Token Prediction: An RLVR Proof of Concept for Tool-Use Agents on Atlassian Workflows

arXiv:2607.01465v1 Announce Type: new Abstract: Large language models are trained to predict the next token, not to act inside a specific API. In niche enterprise SaaS workflows...

arXiv AI /
#04

RuleChef: Grounding LLM Task Knowledge in Human-Editable Rules

arXiv:2607.01293v1 Announce Type: new Abstract: We present RuleChef, a framework that uses large language models (LLMs) to generate executable rules for NLP tasks such as text c...

arXiv Computation and Language /
#05

RusFinChain: A Russian Benchmark for Verifiable Chain-of-Thought Reasoning in Finance with Fuzzy-Aligned Evaluation

arXiv:2607.01388v1 Announce Type: new Abstract: Multi-step symbolic reasoning is essential for robust financial analysis, yet most benchmarks neglect intermediate reasoning step...

arXiv Computation and Language /
#06

FaithMed: Training LLMs For Faithful Evidence-Based Medical Reasoning

arXiv:2607.01440v1 Announce Type: new Abstract: Faithful reasoning is essential in medicine, where clinical decisions require transparent justification grounded in reliable evid...

arXiv Computation and Language /
#07

OpenAI frontier models and Codex are now available on AWS

OpenAI frontier models and Codex are now generally available on AWS, giving enterprises a new path to build with OpenAI through the AWS environments, controls, and procurement w...

OpenAI News /
#08

Databricks brings GPT-5.5 to enterprise agent workflows

Databricks uses GPT-5.5 for enterprise agent workflows after the model set a new state of the art on the OfficeQA Pro benchmark.

OpenAI News /
#09

Introducing GPT-5.4 mini and nano

GPT-5.4 mini and nano are smaller, faster versions of GPT-5.4 optimized for coding, tool use, multimodal reasoning, and high-volume API and sub-agent workloads.

OpenAI News /
#10

CyberSecEval 2 - A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models

Hugging Face Blog /
#11

ScarfBench: Benchmarking AI Agents for Enterprise Java Framework Migration

Hugging Face Blog /
#12

Introducing the Gemini 2.5 Computer Use model

Available in preview via the API, our Computer Use model is a specialized model built on Gemini 2.5 Pro’s capabilities to power agents that can interact with user interfaces.

Google DeepMind Blog /
#13

OpenCode, Pi, and Goose: Three Layers of the AI Agent Stack

Article URL: https://gist.github.com/AIMOWAY/bd8007c8f834a9bc83c71e3178239d75 · Comments URL: https://news.ycombinator.com/item?id=48779685 · Points: 2 · Comments: 0

Hacker News AI /
#14

Is it agentic enough? Benchmarking open models on your own tooling

Hugging Face Blog /
#15

Gemma Scope 2: helping the AI safety community deepen understanding of complex language model behavior

Open interpretability tools for language models are now available across the entire Gemma 3 family with the release of Gemma Scope 2.

Google DeepMind Blog /
#16

Strengthening our Frontier Safety Framework

We’re strengthening the Frontier Safety Framework (FSF) to help identify and mitigate severe risks from advanced AI models.

Google DeepMind Blog /
#17

The Termi Protocol: Watch AI Coding Agents Build in 3D

Article URL: https://termiprotocol.com/ · Comments URL: https://news.ycombinator.com/item?id=48780405 · Points: 1 · Comments: 1

Hacker News AI /
#18

Show HN: Durable AI agents without the workflow engine

Article URL: https://www.noworkflows.dev/ · Comments URL: https://news.ycombinator.com/item?id=48780400 · Points: 3 · Comments: 0

Hacker News AI /
#19

Procedural Memory Distillation: Online Reflection for Self-Improving Language Models

arXiv:2607.01480v1 Announce Type: new Abstract: Reinforcement learning with verifiable rewards (RLVR), along with recent self-distillation variants such as SDPO, evaluates each r...

arXiv AI /
#20

Janus: a Playground for User-Involved Agentic Permission Management

arXiv:2607.01510v1 Announce Type: new Abstract: AI agents that autonomously execute tool calls on a user's behalf raise pressing questions about permission management: what role...

arXiv AI /