Artificial Intelligence

Discovering Novel LLM Experts via Task-Capability Coevolution
librarian
0 views
An Axiomatic Benchmark for Evaluation of Scientific Novelty Metrics
librarian
0 views
IG-Search: Step-Level Information Gain Rewards for Search-Augmented Reasoning
librarian
0 views
Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents
librarian
5 views
TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration
librarian
18 views
GeoAgentBench: A Dynamic Execution Benchmark for Tool-Augmented Agents in Spatial Analysis
librarian
15 views
AI-Assisted Peer Review at Scale: The AAAI-26 AI Review Pilot
Joydeep Biswas
17 views
Hierarchical Reinforcement Learning with Runtime Safety Shielding for Power Grid Operation
librarian
14 views
BEAM: Bi-level Memory-adaptive Algorithmic Evolution for LLM-Powered Heuristic Design
librarian
9 views
Cycle-Consistent Search: Question Reconstructability as a Proxy Reward for Search Agent Training
librarian
9 views
DocSeeker: Structured Visual Reasoning with Evidence Grounding for Long Document Understanding
librarian
10 views
Transferable Expertise for Autonomous Agents via Real-World Case-Based Learning
librarian
9 views
RePAIR: Interactive Machine Unlearning through Prompt-Aware Model Repair
librarian
10 views
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time
Haozhe Wang
13 views
Context Kubernetes: Declarative Orchestration of Enterprise Knowledge for Agentic AI Systems
Charafeddine Mouzouni
12 views
Retrieval Is Not Enough: Why Organizational AI Needs Epistemic Infrastructure
librarian
11 views
GenTac: Generative Modeling and Forecasting of Soccer Tactics
Weidi Xie
10 views
Detecting Safety Violations Across Many Agent Traces
librarian
8 views
VeriSim: A Configurable Framework for Evaluating Medical AI Under Realistic Patient Noise
Sina Mansouri
7 views
Tracing the Roots: A Multi-Agent Framework for Uncovering Data Lineage in Post-Training LLMs
librarian
8 views
From Perception to Planning: Evolving Ego-Centric Task-Oriented Spatiotemporal Reasoning via Curriculum Learning
librarian
8 views
Agent^2 RL-Bench: Can LLM Agents Engineer Agentic RL Post-Training?
librarian
8 views
From Safety Risk to Design Principle: Peer-Preservation in Multi-Agent LLM Systems and Its Implications for Orchestrated Democratic Discourse Analysis
librarian
32 views
Ads in AI Chatbots? An Analysis of How Large Language Models Navigate Conflicts of Interest
Addison Wu
16 views
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver
librarian
61 views
KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation
Zhengxi Lu
17 views
SUPERNOVA: Eliciting General Reasoning in LLMs with Reinforcement Learning on Natural Instructions
librarian
15 views
Activation Steering for Aligned Open-ended Generation without Sacrificing Coherence
librarian
15 views
From Phenomenological Fitting to Endogenous Deduction: A Paradigm Leap via Meta-Principle Physics Architecture
Helong Hu
15 views
Aligning Agents via Planning: A Benchmark for Trajectory-Level Reward Modeling
librarian
16 views
U-CECE: A Universal Multi-Resolution Framework for Conceptual Counterfactual Explanations
librarian
12 views
EVGeoQA: Benchmarking LLMs on Dynamic, Multi-Objective Geo-Spatial Exploration
librarian
17 views