
Details

Title: What Is Progress? Knowledge Aggregation, Living Textbooks, and the Automation of Scientific Discovery
Date: June 20, 2026, 12:00 - 14:00 EDT
Summary: Our collective knowledge infrastructure — the textbooks, professional training resources, and literature syntheses that define what professionals across disciplines believe to be true — is quietly accruing a structural liability. Compounded confirmation bias, stacked citation-by-citation into the foundations of formal knowledge, means that breakthroughs can take decades to reach the classrooms, clinical workflows, and decision-making frameworks where they matter most. Meanwhile, the deepest friction is rarely acknowledged: before any field can build meaningful consensus on "why" or "how" a phenomenon occurs, it must first establish honest, consolidated agreement on "what" has actually been observed. That prior step is routinely skipped, assumed, or fragmented across siloed literatures that never cross-pollinate.

This talk introduces a framework called "Knowledge Aggregation" — with two distinct but complementary ambitions. The first is descriptive transparency: algorithmically mapping what has been said, measured, and documented across a problem space, without imposing causal interpretation or narrative. The second traces the boundary between empirical observation and explanatory claim, building systems that can separate the "what" from the "why/how" — because consensus on mechanism cannot be meaningfully constructed until consensus on phenomenon is first established.

Both ambitions are now within reach. By composing tools already at our disposal — large language models, classical NLP pipelines, public data repositories, and engineering-grade automation frameworks — it becomes possible to model knowledge itself, rather than merely imitate individual experts. One concrete expression of this is automating the writing of living textbooks: compressing the lag from bleeding-edge discovery, through replicated evidence, all the way to professional training resources. But the deeper aspiration reaches further — toward automating the discovery of scientific insights that have never previously been conceived, by systematically surfacing hypothesis combinations that no single siloed researcher would have had the cross-disciplinary vantage point to even ask. Drawing on ongoing systems biology and computational research — with ME/CFS research demoed as a use case for what siloed, fragmented knowledge infrastructure costs in practice — this talk maps the conceptual architecture, the real-world friction, and the data science toolkit for building it.
Speaker: A systems biologist at heart, Sam focuses his biomedical research on interactions and connections in biology rather than on a single domain of expertise. He wears many hats and collects skill sets across disciplines, with degree studies and industry experience spanning Chemical Engineering (BSc), Bioinformatics (MSc), Systems and Synthetic Biology (M2), Biomedical Sciences (MSc), and beyond. More important to him than any niche or field of work are the synergistic approaches that allow us to move beyond reductionism. The notion that a question can allow for only one answer is inherently reductionist. Resisting the norms in science and engineering that can become overly reductive, he serves as Principal Investigator of Research for DMV Petri Dish (a 501(c)(3) non-profit local to the DMV region), where he embraces computational frameworks that aid scale-up and automation, not only for processes with established workflows but also for ambitions that have never previously been thought possible. Sam is passionate about the synergy of computational biology fused with wet lab validation: one can build a beautiful knowledge base in the theoretical sense, then test whether a computational prediction actually holds up in the real world. Translational modeling becomes possible once biological experiment design is iteratively looped with computational model design, optimization, and analysis, so that each round yields a better wet lab experiment, followed by a better computational model, back and forth until science is done!
