Abstract
The innate immune system orchestrates the earliest responses to infection and plays a central role in inflammation and antiviral defense. Small molecules capable of modulating innate immune pathways hold considerable therapeutic potential, yet their discovery remains constrained by the vastness of chemical space and the nonlinear, context-dependent nature of immune signaling. Traditional high-throughput screening identifies such immunomodulators inefficiently, while rational design is hindered by incomplete mechanistic understanding and the limited predictive power of existing computational tools. This dissertation addresses these challenges by developing data-driven frameworks that integrate machine learning with wet-lab experimentation to discover and design small-molecule modulators of innate immune responses more efficiently, with broader implications for vaccine development and immunomodulatory therapeutics.
In Chapter 2, we introduce a machine learning–guided high-throughput screening framework, motivated by the inefficiency of brute-force exploration of large chemical spaces for innate immune modulation. By integrating active learning and deep representation learning with experimental reporter assays, this approach enables efficient discovery of small-molecule immunomodulators acting downstream of pattern recognition receptor signaling. The resulting framework uncovers diverse and highly potent modulators of the NF-κB and IRF pathways while also yielding interpretable chemical design rules, demonstrating both practical impact and mechanistic insight.
In Chapter 3, we apply data-driven molecular discovery to the identification of small-molecule agonists of the STING pathway. Graph neural network models trained on large-scale experimental screening data are used to perform virtual screening across expansive chemical libraries, enabling prioritization of novel candidate agonists and systematic identification of enriched structural motifs. This work illustrates how graph-based deep learning can generalize across chemical scaffolds and accelerate discovery in therapeutically important innate immune pathways.
In Chapter 4, we move beyond single-agent discovery and investigate combinatorial control of innate immune signaling through co-delivery of small-molecule immunomodulators. By combining statistical modeling with systematic pairwise stimulation experiments, we show that dual signaling can produce synergistic, interpolated, and finely tunable immune response landscapes that are inaccessible to individual molecules alone. These results establish compositional immunomodulation as a scalable and programmable strategy for precision immune control.
In Chapter 5, we explore methodological advances in molecular machine learning that support and generalize data-driven discovery efforts. We introduce physically motivated attention biases for transformer architectures that encode interatomic structure through simple power-law relationships. This work demonstrates that incorporating structural biases can improve molecular property prediction while maintaining computational efficiency and interpretability, informing the design of scalable learning models for chemical and biological applications.
Together, these computational and experimental innovations establish a unified strategy for accelerating the discovery of small-molecule modulators of innate immunity. The findings deepen our understanding of how individual molecular structures and molecular combinations shape innate immune signaling, and they provide broadly applicable tools for immunoengineering, vaccine adjuvant design, and antiviral therapeutic development.