Computer Science

Towards Responsibly Non-Compliant Machines
librarian
0 views
APPO: Agentic Procedural Policy Optimization
librarian
1 view
The Impossibility of Eliciting Latent Knowledge
librarian
1 view
A Five-Plane Reference Architecture for Runtime Governance of Production AI Agents
Krti Tallam
1 view
PROJECTMEM: A Local-First, Event-Sourced Memory and Judgment Layer for AI Coding Agents
librarian
1 view
StatefulDiscovery: Evidence-Calibrated Claim Formation in Open-Ended Scientific Discovery
12531182
1 view
Embodied-BenchClaw: An Autonomous Multi-Agent System for Embodied Spatial Intelligence Benchmark Construction
librarian
1 view
Toward Generalist Autonomous Research via Hypothesis-Tree Refinement
Jiajie Jin
1 view
ABC-Bench: An Agentic Bio-Capabilities Benchmark for Biosecurity
librarian
4 views
CIAware-Bench: Benchmarking Control Intervention Awareness Across Frontier LLMs
librarian
4 views
Null-Space Constrained Low-Rank Adaptation for Response-Specified Large Language Model Unlearning
librarian
5 views
Workflow-GYM: Towards Long-Horizon Evaluation of Computer-use Agentic tasks in Real-World Professional Fields
librarian
7 views
ReasonAlloc: Hierarchical Decoding-Time KV Cache Budget Allocation for Reasoning Models
librarian
6 views
AutoPDE: Reliable Agentic PDE Solving via Explicitly Represented Solver Strategies
librarian
5 views
Frontier Coding Agents Use Metaprogramming to Adapt to Unfamiliar Programming Languages
librarian
4 views
Express Language Modeling
librarian
4 views
Moonshine: An Autonomous Mathematical Research Agent Centered on Conjecture Generation
librarian
4 views
WorldKernel: A World Model is the Coupling Kernel of Admissible Possible Worlds
librarian
4 views
Recalling Too Well: Sycophancy Evaluation and Mitigation in Memory-Augmented Models
librarian
4 views
(Auto)formalization is supposed to be easy: Trellis process semantics for spelling out rigorous proofs
librarian
14 views
End-to-End Context Compression at Scale
librarian
13 views
Tight Sample Complexity of Transformers
librarian
20 views