AI News Daily Digest (26-06-20)

Diffusion Language Models: An Experimental Analysis

A new head-to-head study benchmarks eight diffusion language models across reasoning, coding, translation, knowledge, and structured tasks, explicitly weighing generation quality against compute efficiency. The authors show that inference-time knobs such as denoising steps, context length, and parallel unmasking can dominate outcomes, forcing clear trade-offs between performance and deployment cost.

Read the full article here

Hidden Anchors in Multi-Agent LLM Deliberation

The paper models multi-agent LLM deliberation as a closed-loop dynamical system in which each agent has a “hidden anchor” belief that continually pulls on its opinion. It demonstrates how the recovered anchors explain a consensus-avoiding behavior, in which confidence in the correct answer can climb beyond the convex hull of initial beliefs, and it provides a spectrum-based test for when deliberation is truly anchor-driven.

Read the full article here
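
As a rough illustration of how an anchor can pull opinions outside the convex hull of the initial beliefs, here is a minimal sketch. It assumes a Friedkin-Johnsen-style update, x <- (1 - pull) * W x + pull * anchor; the paper's actual dynamics, anchor-recovery procedure, and spectral test are not reproduced here, and every name and number below is hypothetical.

```python
def deliberate(initial, anchors, weights, pull, rounds):
    """Iterate x <- (1 - pull) * W x + pull * anchor for a fixed number of rounds.

    Assumed update rule (Friedkin-Johnsen style), not taken from the paper.
    """
    x = list(initial)
    n = len(x)
    for _ in range(rounds):
        # Each agent first averages the opinions it hears, using its row of W.
        mixed = [sum(weights[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Then its hidden anchor pulls the averaged opinion toward itself.
        x = [(1 - pull[i]) * mixed[i] + pull[i] * anchors[i] for i in range(n)]
    return x


# Hypothetical numbers: stated confidences start in [0.40, 0.60],
# but every hidden anchor sits above that range.
initial = [0.40, 0.50, 0.60]
anchors = [0.90, 0.85, 0.95]
weights = [[1 / 3] * 3 for _ in range(3)]  # uniform listening weights
pull = [0.3, 0.3, 0.3]                     # strength of each anchor

final = deliberate(initial, anchors, weights, pull, rounds=50)
print(final)
```

With pure averaging (pull set to zero) the opinions could never leave the interval spanned by the initial values; in this toy setup the anchors carry every agent above the highest starting confidence.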

Measuring Curriculum Alignment across Topical Coverage, Competency, and Cognitive Depth

Researchers build a human-in-the-loop pipeline to measure how well an undergraduate CS program covers official curriculum guidelines, then track how coverage changes between CS2013 and CS2023. The study finds that overall coverage stays near-constant (~50%), but competency-depth expectations rise under the newer standard, revealing structural gaps and guideline evolution rather than “program drift.”

Read the full article here

ITNet: A Learnable Integral Transform That Subsumes Convolution, Attention, and Recurrence

This work proposes the Integral Transform Network (ITNet), arguing that today’s architectural families—convolution, attention, and recurrence—are special cases of a single learnable operator. With efficient kernel fusion and scalable approximation methods, the authors show one unified architecture can match or exceed specialized baselines across vision and language benchmarks.

Read the full article here

Uncertainty Decomposition for Clarification Seeking in LLM Agents

Instead of treating uncertainty as one monolithic signal, the paper decomposes it into action confidence and request uncertainty to decide when an agent should ask clarifying questions. Across clarification-augmented WebShop and ALFWorld benchmarks, the approach boosts clarification F1 substantially and generalizes across multiple LLM backbones without logprob sampling or extra training.

Read the full article here
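
To make the two-signal idea concrete, the following is a minimal sketch of a clarification gate. The threshold rule, the default thresholds, and the names `UncertaintyEstimate` and `should_ask` are assumptions made for illustration; the summary does not say how the paper estimates or combines the two signals.

```python
from dataclasses import dataclass


@dataclass
class UncertaintyEstimate:
    action_confidence: float    # how sure the agent is about its next action, in [0, 1]
    request_uncertainty: float  # how underspecified the user request looks, in [0, 1]


def should_ask(est, min_action_conf=0.6, max_request_unc=0.5):
    """Hypothetical gate: ask when the request looks underspecified
    or the agent is not confident in its next action."""
    return (est.request_uncertainty > max_request_unc
            or est.action_confidence < min_action_conf)


print(should_ask(UncertaintyEstimate(action_confidence=0.9, request_uncertainty=0.1)))
print(should_ask(UncertaintyEstimate(action_confidence=0.9, request_uncertainty=0.8)))
```

Keeping the two signals separate lets the agent distinguish "I know what to do, but the request is ambiguous" from "the request is clear, but I am unsure how to act," which a single uncertainty score would blur together.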

Emergent Alignment

The authors introduce an online alignment technique that adds a “conscience step,” in which the LLM reviews its own reasoning and outputs, and then uses DPO to steer training away from unethical behavior. They report a pathway to “Emergent Alignment,” aiming to make self-correction work even under previously observed emergent-misalignment scenarios.

Read the full article here

Deontic Policies for Runtime Governance of Agentic AI Systems

As agentic systems move from prototypes into real enterprises, this work tackles governance beyond simple allow/deny rules by using deontic policy constructs (obligations, dispensations, and conflict resolution). The proposed AgenticRei runtime evaluates an OWL-based policy language outside the LLM to constrain both tool use and inter-agent messaging in a way common policy engines can’t express.

Read the full article here

LLM Doesn’t Know What It Doesn’t Know: Detecting Epistemic Blind Spots via Cross-Model Attribution Divergence on Clinical Tabular Data

Using cross-model attribution divergence, the paper examines whether LLMs recognize the limits of their knowledge on structured clinical prediction tasks. It finds verbalized confidence is largely epistemically vacuous, while interventions based on few-shot and feature evidence plus a cross-model calibrator can replace vague confidence with patient-specific reliability estimates.

Read the full article here

REVEAL++: Differentiable Phenotypic Grouping for Vision-Language Retinal Modeling of Alzheimer’s Disease Risk

REVEAL++ replaces hard phenotypic grouping with a differentiable “soft multi-positive” contrastive learning formulation that weights similarity continuously from both retinal and risk-profile embeddings. On UK Biobank data, this continuous phenotypic structure yields consistent gains over discrete grouping and standard vision-language baselines for incident Alzheimer’s risk prediction.

Read the full article here
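
For readers unfamiliar with soft multi-positive contrastive objectives, here is a minimal sketch of one common form: a cross-entropy between the softmax over similarity logits and a normalized row of continuous positive weights. This is an assumed generic formulation, not the REVEAL++ loss itself; how the paper derives its weights from retinal and risk-profile embeddings is not shown, and the function name and inputs are hypothetical.

```python
import math


def soft_multi_positive_loss(logits, weights):
    """Mean over anchors of -sum_j w_ij * log softmax(logits_i)_j,
    where each row of weights is normalized to sum to 1."""
    total = 0.0
    for row_logits, row_w in zip(logits, weights):
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(row_logits)
        log_z = m + math.log(sum(math.exp(v - m) for v in row_logits))
        w_sum = sum(row_w)
        total += -sum((w / w_sum) * (v - log_z)
                      for v, w in zip(row_logits, row_w))
    return total / len(logits)


# Hypothetical similarity logits for two anchors against three candidates,
# with continuous weights saying how "positive" each candidate is.
logits = [[2.0, 1.0, -1.0], [0.5, 1.5, 0.0]]
weights = [[1.0, 0.6, 0.0], [0.2, 1.0, 0.1]]
print(soft_multi_positive_loss(logits, weights))
```

When each weight row is one-hot, this reduces to the standard single-positive contrastive (InfoNCE-style) loss, which is the sense in which hard grouping is a special case of the soft version.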

DeXposure-Claw: An Agentic System for DeFi Risk Supervision

This paper introduces DeXposure-Claw, a regulator-aligned agentic supervision system that routes LLM decisions through forecast-grounded evidence rather than freeform reasoning. A time-series exposure model produces typed alerts and scenario evidence, while confidence and data-health gates restrict escalation. The system is paired with a benchmark designed to quantify false-intervention rates against ground-truth losses.

Read the full article here

Barret Zoph is out at OpenAI again after just five months

The Verge reports that Barret Zoph, OpenAI’s enterprise AI sales leader who returned in mid-January, has departed again only five months later. The move follows OpenAI’s recent enterprise push and its effort to refocus priorities ahead of an expected IPO.

Read the full article here

Luca Guadagnino’s film about Sam Altman has been dropped by Amazon MGM

Amazon MGM has reportedly dropped Luca Guadagnino’s film “Artificial,” which chronicles the five-day rollercoaster of Sam Altman’s termination and reinstatement. The studio says the movie may be better served with a different release partner as it continues working on the project.

Read the full article here

A startup claims it broke through a bottleneck that’s holding back LLMs

Technology Review covers Subquadratic, a startup that exited stealth claiming to have solved a long-standing mathematical bottleneck limiting aspects of LLM progress. The article notes that the technical details were initially thin, but the company has begun sharing supporting evidence to answer skeptics.

Read the full article here