Artificial Intelligence

CAAFC: Chronological Actionable Automated Fact-Checker for misinformation / non-factual hallucination detection and correction
Islam Eldifrawi
4 views
Formalize, Don't Optimize: The Heuristic Trap in LLM-Generated Combinatorial Solvers
librarian
5 views
Semantic Reward Collapse and the Preservation of Epistemic Integrity in Adaptive AI Systems
librarian
7 views
ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents
Xuhao Hu
11 views
$δ$-mem: Efficient Online Memory for Large Language Models
librarian
8 views
Classifier Context Rot: Monitor Performance Degrades with Context Length
librarian
8 views
Reward Hacking in Rubric-Based Reinforcement Learning
Anas Mahmoud
7 views
On-Policy Self-Evolution via Failure Trajectories for Agentic Safety Alignment
librarian
8 views
When Simulation Lies: A Sim-to-Real Benchmark and Domain-Randomized RL Recipe for Tool-Use Agents
Xiaolin Zhou
8 views
From Noise to Diversity: Random Embedding Injection in LLM Reasoning
librarian
8 views
BenchCAD: A Comprehensive, Industry-Standard Benchmark for Programmatic CAD
librarian
8 views
The Generalized Turing Test: A Foundation for Comparing Intelligence
librarian
7 views
NanoResearch: Co-Evolving Skills, Memory, and Policy for Personalized Research Automation
librarian
9 views
From Controlled to the Wild: Evaluation of Pentesting Agents for the Real-World
librarian
6 views
Remember the Decision, Not the Description: A Rate-Distortion Framework for Agent Memory
Lizhen Qu
7 views
Shepherd: A Runtime Substrate Empowering Meta-Agents with a Formalized Execution Trace
librarian
6 views
SkillOS: Learning Skill Curation for Self-Evolving Agents
librarian
20 views
AI Co-Mathematician: Accelerating Mathematicians with Agentic AI
Daniel Zheng
24 views
On-line Learning in Tree MDPs by Treating Policies as Bandit Arms
Anvay Shah
20 views
Executable World Models for ARC-AGI-3 in the Era of Coding Agents
Sergey Rodionov
23 views
Position: Embodied AI Requires a Privacy-Utility Trade-off
librarian
22 views
LongSeeker: Elastic Context Orchestration for Long-Horizon Search Agents
librarian
25 views
A Foundation Model for Zero-Shot Logical Rule Induction
librarian
25 views
Uno-Orchestra: Parsimonious Agent Routing via Selective Delegation
Zhiqing Cui
23 views
Contextual Multi-Objective Optimization: Rethinking Objectives in Frontier AI Systems
librarian
26 views
Redefining AI Red Teaming in the Agentic Era: From Weeks to Hours
librarian
26 views
OpenSeeker-v2: Pushing the Limits of Search Agents with Informative and High-Difficulty Trajectories
Yuwen Du
25 views
QKVShare: Quantized KV-Cache Handoff for Multi-Agent On-Device LLMs
Pratik Honavar
24 views
Agentic-imodels: Evolving agentic interpretability tools via autoresearch
librarian
24 views
Correct Is Not Enough: Training Reasoning Planners with Executor-Grounded Rewards
librarian
27 views
EvoLM: Self-Evolving Language Models through Co-Evolved Discriminative Rubrics
librarian
26 views
First-Order Efficiency for Probabilistic Value Estimation via A Statistical Viewpoint
Weijing Tang
26 views