Transforming the cytokine literature into a resource for experimental analysis and discovery
Oesinghaus, L.; Park, M.; Shao, R.; Koh, P. W.; Seelig, G.
Abstract
Cytokine biology is dispersed across hundreds of thousands of publications, making it difficult to use systematically when interpreting new experiments. Large language models (LLMs) can assist with focused literature interpretation, but ad hoc retrieval remains incomplete and unreliable. We present the Cytokine Effect Database (CytED), a framework for interfacing user-supplied experimental datasets with literature knowledge at scale. CytED uses a multi-step LLM pipeline to generate over a million cytokine-cell type-effect triples from 110,000 full-text publications, with annotations for experimental context and directional changes in genes, pathways, and cellular processes. This structure enables quantitative comparison between observed perturbation responses and prior literature across cytokines, cell types, and experimental contexts. Applied to in vitro IL-10 stimulation of PBMCs, CytED identifies unexpected pro-inflammatory features in monocytes and systematic in vivo-in vitro differences in cytotoxicity responses in CD8+ T cells. CytED infers cytokine signaling, distinguishes primary from secondary cytokine effects, and guides the design of combinatorial perturbation screens. Together, CytED establishes a general paradigm for converting unstructured domain literature into analytical tools that bridge literature and experiment.
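To make the idea of comparing an experiment against directional literature triples concrete, the following is a minimal sketch and not CytED's actual schema or scoring method. The `EffectTriple` fields, the vote-based consensus, and the sign-agreement score are all assumptions chosen for illustration, and the records and fold changes are invented toy values with placeholder gene names.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EffectTriple:
    """Hypothetical record: one literature-derived effect with its context."""
    cytokine: str
    cell_type: str
    target: str      # gene, pathway, or cellular process
    direction: int   # +1 = reported increase, -1 = reported decrease
    context: str     # e.g. "in vitro" or "in vivo"


def literature_consensus(triples, cytokine, cell_type):
    """Sum directional votes per target for one cytokine and cell type."""
    votes = defaultdict(int)
    for t in triples:
        if t.cytokine == cytokine and t.cell_type == cell_type:
            votes[t.target] += t.direction
    return dict(votes)


def concordance(observed, consensus):
    """Fraction of targets whose observed sign matches the literature vote.

    Targets with no literature votes, a tied vote, or a zero observed
    change are skipped. Returns None if nothing is comparable.
    """
    agree = total = 0
    for target, log_fold_change in observed.items():
        vote = consensus.get(target, 0)
        if vote == 0 or log_fold_change == 0:
            continue
        total += 1
        if (log_fold_change > 0) == (vote > 0):
            agree += 1
    return agree / total if total else None


# Toy data, invented for illustration only (not taken from CytED).
triples = [
    EffectTriple("IL-10", "monocyte", "GENE_A", -1, "in vitro"),
    EffectTriple("IL-10", "monocyte", "GENE_A", -1, "in vivo"),
    EffectTriple("IL-10", "monocyte", "GENE_B", +1, "in vitro"),
    EffectTriple("IL-10", "CD8+ T cell", "GENE_A", +1, "in vitro"),
]
observed = {"GENE_A": 0.8, "GENE_B": 1.2, "GENE_C": -0.5}

consensus = literature_consensus(triples, "IL-10", "monocyte")
score = concordance(observed, consensus)
```

In this toy case `GENE_A` moves against the literature consensus and `GENE_B` moves with it, so half of the comparable targets agree. A discordant target like `GENE_A` is the kind of signal that would flag an unexpected response for closer inspection. Filtering triples by `context` before building the consensus would be one simple way to look at in vivo and in vitro evidence separately.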