Artificial Intelligence

Cliff Tokens: Identifying Single-Token Failure Triggers in LLM Mathematical Reasoning
librarian
0 views
AI Snitches Get Glitches: Towards Evading Agentic Surveillance
Hyejun Jeong
0 views
Confidence Sequences for Online Statistical Model Checking of Markov Decision Processes
librarian
0 views
Decentralised AI Training and Inference with BlockTrain
librarian
1 view
World Models in Pieces: Structural Certification for General Agents
Tongxin Li
4 views
OpenThoughts-Agent: Data Recipes for Agentic Models
librarian
4 views
A specialized reasoning large language model for accelerating rare disease diagnosis: a randomized AI physician assistance trial
librarian
4 views
ReM-MoA: Reasoning Memory Sustains Mixture-of-Agents Scaling
Heng Ping
3 views
VeriEvol: Scaling Multimodal Mathematical Reasoning via Verifiable Evol-Instruct
Haoling Li
9 views
The Topology of Ill-Posed Questions: Persistent Homology for Detection and Steering in LLMs
Guangyu Jiang
7 views
SPIRAL: Learning to Search and Aggregate
librarian
7 views
PlanBench-XL: Evaluating Long-Horizon Planning of LLM Tool-Use Agents in Large-Scale Tool Ecosystems
librarian
8 views
Self-Evolving Cognitive Framework via Causal World Modeling for Embodied Scientific Intelligence
Yi Yu
6 views
Governance Decay: How Context Compaction Silently Erases Safety Constraints in Long-Horizon LLM Agents
librarian
6 views
PaperClaw: Harnessing Agents for Autonomous Research and Human-in-the-Loop Refinement
librarian
6 views
How Do Instructions Shape Speech? Cross-Attention Attribution for Style-Captioned Text-to-Speech
Nityanand Mathur
25 views
Multi-LCB: Extending LiveCodeBench to Multiple Programming Languages
librarian
24 views
DeepSWIP: Quotient-WMC Counterfactuals for Neural Probabilistic Logic Programs
librarian
27 views
Lagrange: An Open-Vocabulary, Energy-Based Sparse Framework for Generalized End-to-End Driving
librarian
26 views
Rethinking Shrinkage Bias in LLM FP4 Pretraining: Geometric Origin, Systemic Impact, and UFP4 Recipe
Unknown
16 views
LedgerAgent: Structured State for Policy-Adherent Tool-Calling Agents
Md Nayem Uddin
27 views
Toward Calibrated Mixture-of-Experts Under Distribution Shift
librarian
23 views
QMFOL: Benchmarking Large Language Model Reasoning via Quantifiable Monadic First-Order Logic Test Case Generation
librarian
13 views
Navigating Unreliable Parametric and Contextual Knowledge: Explicit Knowledge Conflict Resolution for LLM Inference
librarian
12 views
SoftSkill: Behavioral Compression for Contextual Adaptation
librarian
15 views
Beyond Safe Data: Pretraining-Stage Alignment with Regular Safety Reflection
Jinhan Li
23 views
ARIADNE: Agnostic Routing for Inference-time Adapter DyNamic sElection
Enrico Cassano
27 views
User as Engram: Internalizing Per-User Memory as Local Parametric Edits
librarian
21 views
NeSyCat Torch: A Differentiable Tensor Implementation of Categorical Semantics for Neurosymbolic Learning
librarian
18 views
Rethinking Reward Supervision: Rubric-Conditioned Self-Distillation
librarian
17 views
EvolveNav: Proactive Preflection and Self-Evolving Memory for Zero-Shot Object Goal Navigation
librarian
28 views
The Stanford EDGAR Filings Dataset: Reconstructing U.S. Corporate and Financial Disclosures into Layout-Faithful and Token-Efficient Pretraining Data
Nick Bettencourt
32 views