Center for Decoding the Universe Quarterly Forum Recap - Winter 2025

Thank you to all the presenters, volunteers, and attendees who joined us for the Center for Decoding the Universe Winter Forum! Enjoy the recorded sessions on Zoom (Stanford access only). Recaps by Sydney Erickson, Benjamin Dodge, Ioana Ciuca, and Susan Clark.

Summary Session 1: Discovery and Inference from Multi-Modal Data

The first session, Discovery & Inference from Multi-Modal Data, brought together experts from engineering, computer science, and astrophysics. Ellen Kuhl introduced her work on automated model discovery for soft materials, including biomaterials ranging from the soft eye lens to artificial meat. She explained that traditional neural networks lack interpretability and struggle with extrapolation, motivating the development of physics-informed constitutive neural networks. By hardwiring physical constraints such as symmetry and incompressibility into the network and using specialized activation functions like logarithms and exponentials, her group can derive strain-energy functions and compute stresses. The method yields superior fits to stress-strain data and produces interpretable models that can be compared with traditional formulations, paving the way for realistic 3D simulations of complex structures like the brain and even enabling forward simulations with tools like Abaqus. To the key question of “How do you argue these models are interpretable?”, Ellen responded that the models remain describable because only a finite number of terms and functions are activated.
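
To make the idea concrete, here is a minimal sketch of such a constitutive network, assuming a simple one-invariant formulation with an illustrative three-term energy library; the names and terms here are our own simplifications, not the group's actual code:

```python
# Sketch of a physics-informed constitutive neural network: a small
# library of physically motivated terms with non-negative weights maps
# a deformation invariant to a strain energy; stress follows by autodiff.
import torch

class ConstitutiveNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.w = torch.nn.Parameter(torch.rand(3))  # one weight per candidate term

    def energy(self, I1):
        x = I1 - 3.0                       # zero energy in the undeformed state (I1 = 3)
        terms = torch.stack([
            x,                             # linear (neo-Hookean-like) term
            x ** 2,                        # quadratic term
            torch.exp(x) - 1.0,            # exponential activation, common for soft tissue
        ])
        # softplus keeps the learned weights positive, one of the hardwired constraints
        return torch.nn.functional.softplus(self.w) @ terms

model = ConstitutiveNet()
I1 = torch.tensor(3.5, requires_grad=True)      # first invariant of a sample deformation
psi = model.energy(I1)
stress = torch.autograd.grad(psi, I1)[0]        # stress = d(psi)/d(invariant), up to kinematic factors
print(float(psi), float(stress))
```

Because the fitted model is just a short list of weighted, named terms, it can be read off and compared with classical constitutive laws, which is what makes the approach interpretable.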

Maria Elena Monzani shifted the focus to the elusive search for dark matter. Her group faces the challenge of detecting a handful of events per year amidst overwhelming background noise: roughly one billion dark matter particles are expected to cross the detector volume every second, yet only a handful of interactions per year are expected to be detectable. To tackle this challenge, her research group employs unsupervised learning techniques: by clustering data points in both detector and waveform spaces and applying dimensionality reduction, their algorithm effectively isolates background “accidental coincidences,” revealing previously undetected signatures.
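
A toy version of such a pipeline, with assumed feature dimensions and clustering parameters, might look like the following, where dense clusters would correspond to pile-ups of accidental coincidences and outliers become candidates for closer inspection:

```python
# Illustrative unsupervised pipeline: standardize per-event waveform
# summaries, reduce dimensionality, then cluster to separate background
# populations from rare outliers. All sizes and thresholds are assumptions.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
waveform_features = rng.normal(size=(5000, 40))   # stand-in for real waveform summaries

X = StandardScaler().fit_transform(waveform_features)
X2 = PCA(n_components=2).fit_transform(X)          # dimensionality reduction

labels = DBSCAN(eps=0.5, min_samples=20).fit_predict(X2)
# Points labeled -1 fall outside every dense cluster and are the
# anomalies one would inspect as potential signal.
print(np.unique(labels, return_counts=True))
```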

Dalya Baron then presented a grand challenge in astronomy: creating a unified observational picture of nearby galaxies across vast wavelength and energy scales. With billions of galaxies observed yet only a fraction available in high resolution, her work focuses on integrating data that spans from detailed images to complex spectral cubes. Each galaxy may offer between ten and a thousand images, each containing tens of thousands to millions of independent pixels. Dalya emphasized that to decode the physics of the universe, it is first necessary to encode everything we know.

Steven Dillman finished the session by exploring representation learning for time-domain high-energy astrophysics using data from the Chandra X-ray Observatory. By developing multi-modal models that learn a shared embedding space for both spectra and images, exemplified by AstroCLIP, his work aims to build a general astronomical foundation model. Such a model facilitates the discovery of time-domain events that might otherwise remain hidden in vast archives. A fascinating result of this approach was the identification of a new extragalactic fast X-ray transient in the Large Magellanic Cloud.
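
The core of a CLIP-style shared embedding space can be sketched in a few lines. The toy encoders and batch below are illustrative assumptions; the actual AstroCLIP architecture and training setup are far richer:

```python
# Two encoders map images and spectra into one embedding space; the
# symmetric InfoNCE loss used by CLIP pulls matching pairs together.
import torch
import torch.nn.functional as F

image_encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64 * 64, 128))
spectrum_encoder = torch.nn.Linear(1000, 128)

images = torch.randn(32, 64, 64)         # batch of image cutouts (toy data)
spectra = torch.randn(32, 1000)          # matching spectra for the same objects

z_img = F.normalize(image_encoder(images), dim=-1)
z_spec = F.normalize(spectrum_encoder(spectra), dim=-1)

logits = z_img @ z_spec.T / 0.07         # temperature-scaled cosine similarities
targets = torch.arange(32)               # the i-th image matches the i-th spectrum
loss = (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets)) / 2
print(float(loss))
```

Once trained, either encoder can embed an archival observation, so nearest-neighbor searches in the shared space can surface unusual time-domain events without per-task labels.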

Summary Session 2: Beyond 2D—The World Around Us and the Universe

The second session, Beyond 2D: The World Around Us and the Universe, discussed how to create three-dimensional and even four-dimensional representations of the scenes around us. Jiajun Wu opened the session by drawing attention to the hidden physical processes that generate familiar 2D images: light interacting with geometry, material, and texture, a process that leaves behind no direct 3D or physical annotations. His group addresses this data gap by building a “physical decoder,” a differentiable simulator that de-renders a single image into its intrinsic properties. When asked how this could benefit real-world robotics, Jiajun emphasized that annotated robot dynamics (such as torque data) are scarce, making it crucial to rely on physically grounded scene representations for more data-efficient manipulation.
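
As a loose illustration of the de-rendering idea, one can pose it as analysis-by-synthesis: assume a differentiable forward model and recover intrinsic properties by gradient descent through it. The toy Lambertian renderer below is our simplification, not the actual physical decoder:

```python
# Toy de-rendering: fit albedo and light direction so that a
# differentiable renderer reproduces a single observed image.
import torch

def render(albedo, light):
    # Lambertian-style shading on a fixed, flat normal map (toy geometry).
    normals = torch.ones(16, 16, 3) / 3 ** 0.5
    shading = (normals @ light).clamp(min=0.0)
    return albedo * shading

true_albedo = torch.rand(16, 16)
true_light = torch.tensor([0.0, 0.0, 1.0])
observed = render(true_albedo, true_light)     # the single input image

albedo = torch.full((16, 16), 0.5, requires_grad=True)
light = torch.tensor([0.1, 0.1, 0.9], requires_grad=True)
opt = torch.optim.Adam([albedo, light], lr=0.05)
for _ in range(500):
    opt.zero_grad()
    loss = ((render(albedo, light) - observed) ** 2).mean()
    loss.backward()                            # gradients flow through the renderer
    opt.step()
print(float(loss))                             # small residual: intrinsics explain the image
```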

Tom Abel followed with C4Universe, a digital twin project that integrates simulations and observations into a coherent 3D model of the cosmos. Despite billions of observable galaxies, each is only partially captured by instruments like Chandra and the Hubble Space Telescope. By leveraging priors, including assumptions about geometric symmetry, Tom aims to create 3D renditions that bridge gaps in resolution and completeness. Asked which priors matter most, he pointed out that every modeling choice, from computational grids to physical assumptions, sparks valuable debate.

Koven Yu then discussed everyday fluid phenomena, presenting a method to reconstruct and predict 3D fluid dynamics from a single 2D video. The underlying challenge is that one viewpoint captures only a fraction of the volumetric flow, making the reconstruction ambiguous. Koven’s differentiable rendering pipeline ensures physically plausible results by enforcing the Navier-Stokes equations and other physics-based constraints. In response to a question about whether pre-trained multi-view models could replace physics-based priors, he clarified that both are essential: pre-trained networks help synthesize alternative viewpoints, but physical constraints remain critical for ensuring realism in the reconstructed flow.
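
One common way such a constraint enters a reconstruction pipeline, sketched below with assumed grid sizes and loss weights, is as a differentiable penalty on violations of incompressibility, the divergence-free condition that incompressible Navier-Stokes flow must satisfy:

```python
# Physics prior as a loss term: penalize div(u) != 0 on the reconstructed
# 3D velocity field, alongside a rendering loss against the observed video.
import torch

velocity = torch.randn(3, 32, 32, 32, requires_grad=True)  # (u, v, w) on a 3D grid

def divergence(v):
    # Central finite differences along each axis, cropped to a common interior.
    du = (v[0, 2:, :, :] - v[0, :-2, :, :]) / 2
    dv = (v[1, :, 2:, :] - v[1, :, :-2, :]) / 2
    dw = (v[2, :, :, 2:] - v[2, :, :, :-2]) / 2
    return du[:, 1:-1, 1:-1] + dv[1:-1, :, 1:-1] + dw[1:-1, 1:-1, :]

rendering_loss = torch.tensor(0.0)   # placeholder: compare rendered views to video frames
physics_loss = divergence(velocity).pow(2).mean()
total = rendering_loss + 0.1 * physics_loss
total.backward()                     # gradients flow back into the velocity field
```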

Finally, Rahul Mysore Venkatesh introduced counterfactual world modeling, a strategy that harnesses large-scale video to learn how scenes evolve through time. By heavily masking future frames, his model focuses on how objects and motions transform rather than on predicting the next frame pixel by pixel. Through interventions on specific patches, like shifting a patch of a tissue box, the system uncovers emergent segmentation and dynamic relations. When asked if chain-of-thought language models could be integrated, Rahul revealed that he is experimenting with such approaches to deepen the causal understanding these models can acquire.
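
The recipe can be caricatured as masked prediction plus intervention. In the sketch below, a tiny convolutional predictor stands in for a large video model, and all shapes and mask ratios are illustrative assumptions:

```python
# Counterfactual world modeling, schematically: predict the next frame
# from heavy masking, then intervene on a patch and compare predictions
# to see which pixels respond together.
import torch

predictor = torch.nn.Conv2d(2, 1, kernel_size=3, padding=1)  # stand-in for a video model

frame_t = torch.rand(1, 1, 32, 32)                 # current frame
frame_next = torch.rand(1, 1, 32, 32)              # next frame (toy data)
mask = (torch.rand(1, 1, 32, 32) > 0.9).float()    # reveal only ~10% of the next frame
next_masked = frame_next * mask

pred = predictor(torch.cat([frame_t, next_masked], dim=1))

# Counterfactual intervention: shift one revealed patch and re-predict.
shifted = next_masked.clone()
shifted[..., 8:12, 8:12] = next_masked[..., 4:8, 4:8]
pred_cf = predictor(torch.cat([frame_t, shifted], dim=1))

effect = (pred - pred_cf).abs()   # pixels that move together hint at object segments
print(effect.mean().item())
```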

The session stirred a series of thought-provoking questions: As more fields adopt physically informed priors, will we reach a point where such simulators become standard for everything from robotics to galactic modeling? Can digital twins of the universe truly mirror its complexity, or do our priors inevitably shape our view? Moreover, as machine learning blurs the lines between representation and simulation, how do we preserve the essential human insight needed to push scientific boundaries further?

Summary Session 3: Agentic AI for Science

The last session of the Center for Decoding the Universe's latest Quarterly Forum explored the evolving role of artificial intelligence in scientific discovery. Researchers from various fields shared insights into how AI is evolving from a mere tool into a teammate in the scientific process.

James Zou opened the session with a bold vision: a "virtual lab" where AI agents work alongside human researchers, each with distinct expertise. He illustrated this with an example of an AI-driven research meeting. In this scenario, a principal investigator AI set an objective, an immunologist AI proposed the use of nanobodies, and a critic AI flagged concerns about the lack of empirical data. Remarkably, all communication and code were generated by AI agents, who leveraged tools like AlphaFold and Rosetta to design nanobody candidates for real-world testing. When asked by Risa Wechsler how these specialist AI agents are trained, Zou explained that they retrieve academic papers and textbooks and undergo fine-tuning on the collected materials.
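
The meeting loop of such a virtual lab can be sketched schematically. Here, `ask_llm`, the role prompts, and the control flow are hypothetical stand-ins for illustration, not Zou's implementation:

```python
# Schematic "virtual lab" meeting: role-conditioned agents take turns
# contributing to a shared transcript under a PI-set objective.
ROLES = {
    "PI": "You set the objective and synthesize the team's proposals.",
    "Immunologist": "You propose concrete designs, e.g. nanobody candidates.",
    "Critic": "You flag missing empirical evidence and weak reasoning.",
}

def ask_llm(persona: str, transcript: list[str]) -> str:
    # Placeholder: swap in a real chat-completion call, with `persona`
    # as the system prompt and `transcript` as the conversation so far.
    return f"(reply conditioned on {len(transcript)} prior turns)"

def run_meeting(objective: str, rounds: int = 3) -> list[str]:
    transcript = [f"PI: Objective for this meeting: {objective}"]
    for _ in range(rounds):
        for role, persona in ROLES.items():
            transcript.append(f"{role}: {ask_llm(persona, transcript)}")
    return transcript

for line in run_meeting("Design nanobody candidates for a new antigen"):
    print(line)
```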

Sihan (Sandy) Yuan and Ioana (Jo) Ciuca turned the discussion to astronomy, emphasizing AI’s necessity in handling the overwhelming volume of data from upcoming projects like the Rubin Observatory, which will generate 20 terabytes of data each night. They introduced the concept of an "astronomy agent," an AI that processes data, formulates hypotheses, and analyzes results. But how creative can AI truly be? Jo drew an analogy to the famous move 37 in AlphaGo’s match against Lee Sedol—an unconventional, seemingly creative move that astonished human players. This sparked a deeper question: What will be the role of human scientists in an AI-driven future? Will we focus on guiding AI and communicating results, or will we continue to engage with the hands-on details of research?

Christine Ye addressed another crucial aspect of AI in science: interpretability. While AI models uncover structure in massive datasets, their true potential is unlocked when we understand their inner workings. She demonstrated how sparse autoencoders trained on scientific paper embeddings can reveal meaningful conceptual “families.” By manipulating these internal representations, she suggested, we might one day direct AI models toward specific goals—perhaps even guiding them to challenge or refine scientific claims.
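
A minimal sparse autoencoder of this kind, with illustrative sizes and penalty weight, can be written down directly; the final lines hint at the kind of “steering” she described:

```python
# Overcomplete sparse autoencoder on paper embeddings: an L1 penalty on
# the (non-negative) codes encourages each feature to fire for one concept.
import torch

d_embed, d_feat = 384, 2048                    # more features than embedding dimensions
encoder = torch.nn.Linear(d_embed, d_feat)
decoder = torch.nn.Linear(d_feat, d_embed, bias=False)
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

embeddings = torch.randn(256, d_embed)         # stand-in for real paper embeddings

for _ in range(200):
    codes = torch.relu(encoder(embeddings))    # sparse, non-negative activations
    recon = decoder(codes)
    loss = (recon - embeddings).pow(2).mean() + 1e-3 * codes.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# "Steering": amplify one learned feature and decode, nudging the
# embedding toward whatever concept family that feature represents.
codes = codes.detach()
codes[:, 0] += 5.0
steered = decoder(codes)
print(steered.shape)
```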

Finally, Yijia Shao tackled the broader implications of AI agents in scientific collaboration. She framed the discussion with a simple fill-in-the-blank: "Agent __ Human". Should AI assist, replace, or complement human researchers? Fully autonomous AI still struggles with even basic tasks, reinforcing the view that the best AI agents are those that reduce workload and information overload. Shao showcased a chatbot capable of both independent operation and proactive engagement with human users, emphasizing an interaction model based on asynchronous collaboration rather than simple turn-taking. She left the audience with a thought-provoking question: What if AI not only assisted humans but debated them? Could scientific discovery be accelerated by structured disagreements between human and AI researchers?

The session highlighted AI’s expanding role in science—not just as a computational tool, but as a collaborator, sparking new ideas, challenging assumptions, and reshaping the way we explore the universe.
