New Awards and Appointments to Inspire Data Science Research

October 4, 2018

The Stanford Data Science Initiative (SDSI) is proud to introduce the recipients of seed research awards and the inaugural cohort of Data Science Scholars. The awards support outstanding and inspiring work across Stanford in which data science methods are being used to help understand and address critical scientific and societal challenges.

These awards follow two calls for proposals and applications that drew a combined 220 applications representing 61 unique departments in all seven schools at Stanford. We are thrilled to see the tremendous interest and excitement for data-intensive research present in every corner of campus, and we will continue to work to help our colleagues excel in advancing their ambitious goals.

Funding for these awards is provided by the generous support of SDSI’s corporate members and the School of Medicine. We hope and expect that these initial efforts will lead to further opportunities to support and catalyze the incredible data-intensive work being pioneered in all departments.


Seed funding provides researchers with valuable resources that can enable new discoveries. Through this call, we support ambitious research projects where the development of new data science methods may help to solve a scientific or societal challenge and prove to be transformative to one or more domains.

Fall 2018 awards have been made to:

Bayesian Convolutional Neural Networks for Cosmological Inference

Risa Wechsler (Physics), Sean McLaughlin (Physics), Laurence Levasseur (Physics)

Central California Decadal Groundwater Monitoring and Imaging From Ballistic Rayleigh Wave Dispersion

Gregory Beroza (Geophysics), Lise Marie Christelle Retailleau (Geophysics), Aurelien Mordret

Harnessing Data Science to Answer Questions About Diversity and Creativity

Daniel McFarland (Education), Daniel Jurafsky (Linguistics & Computer Science), Bas Hofstra (Education), Londa Schiebinger (History), James Zou (Biomedical Data Science)

Integrating Remote-Sensing and Socioeconomic Microdata to Understand Forest Carbon Loss – and How to Halt It

Gretchen Daily (Biology), Rebecca Chaplin-Kramer (Natural Capital Project)

Mapping Ottoman Epirus

Ali Yaycioglu (History), Antonis Hadjikyriacou (Center for Spatial and Textual Analysis), Erik Steiner, Celena Allen (Center for Spatial and Textual Analysis)

Online Platforms: The Welfare Consequences of Exploration

Mohsen Bayati (Operations, Information & Technology), Ramesh Johari (Management Science)

Representation Learning of Disease Trajectories to Predict Opioid Addiction Risk

James Zou (Biomedical Data Science), Abubakar Abid (Electrical Engineering), Suzanne Tamang (Medicine), Ayin Vala (Foundation for Precision Health), Xiang Zhu (Statistics)

Towards Fair Machine Learning

Omer Reingold (Computer Science), Michael Kim (Computer Science)

Using Computer Vision to Measure Neighborhood Visible Urban Conditions Affecting Health

Jackelyn Hwang (Sociology), Nikhil Naik

Using Network Science to Study Children’s Conceptual Development

Michael Frank (Psychology), Abdellah Fourtassi (Psychology)




The inaugural cohort of Data Science Scholars is a diverse group of PhD students and postdocs from all parts of Stanford who are using and developing data science methods in their research. They have a keen interest in solving problems while exchanging knowledge with others. One primary goal of the program is to create a community of data science researchers who represent a wide array of disciplines and who can share methods and applications while creating a stimulating, innovative, and supportive environment. Thanks to the generosity of our partners in the Center for Spatial and Textual Analysis (CESTA), the Scholars program will initially be based within their workspace in Wallenberg Hall.

The Fall 2018 Stanford Data Science Scholars include:

Management Science & Engineering

Research in Computational Social Science with a dual focus on analytical tools and social science knowledge. Developing and applying methodological tools to analyze social networks and creating guidelines for how to use these datasets in crafting public policy and business decisions.

Applied Physics

Research in applying data-driven methods to physical sciences problems. Developing a machine learning model that can predict previously unknown two-dimensional materials. Aiming not just to elucidate the properties of some materials but to list all possible materials with a particular property, which has not yet been achieved in computational materials science.

Computer Science

Developing algorithms for efficient learning and inference in probabilistic models that can scale to large and complex datasets. Particularly motivated by the potential impact of this core research on applications in computational sustainability.

Earth Systems Science

Using a range of data analysis techniques to identify the causes and impacts of droughts, including clustering analysis in time and space, time series analysis, and econometric analysis. Aiming to intelligently combine different datasets with powerful algorithms to provide insights on climate risks around the world that can help inform sustainable development efforts.


Focusing on challenges unique to the humanities in data science: machine learning applications and interpretations of humanities data, and ethics and bias in digital archives. Systematically examining the problem of scale and data quality in data-driven humanities research and exploring how data science visualizations and test results can be integrated into a linear storytelling format.


Developing new statistical and machine learning tools for problems inspired by bioengineering applications, including new methods for monitoring disease progression from sparse data (e.g., clinical visits) and algorithms for extracting biomechanical biomarkers from mobile phone videos.

Political Science

Focusing on the role of partisanship in American politics, and specifically on how partisans’ antipathy towards members of the other party shapes their political and nonpolitical behavior, using internet search data and web browsing data.


Studying ways by which ambitious data science can be done painlessly. Started the open-source software project ClusterJob (CJ), an experiment management system (EMS) that allows data scientists to conduct million-CPU-hour experiments painlessly and reproducibly on remote clusters either in-house or in the cloud.


Studying the effects of Internet exposure on affective political outcomes using online experiments, text analysis, and passive mobile phone data collection. Focal points for research include news and hostility in the United States and Myanmar.

Computer Science

Developing data-driven approaches for policy decision-making, especially in developing countries. Focusing on advancing causal inference methods by leveraging recent progress in machine learning, with the goal of building efficient, fair, and robust statistical frameworks that are accessible to policymakers.


Expanding data science learning opportunities, including reproducible analysis with R. Working on a curriculum of videos, articles, and exercises for teaching data science in R. Developed two R packages, purrrplus and combineR, to solve common data analysis pain points.

Biomedical Informatics

Developing new analytical methods for important biological applications that can build upon the foundations of human health, including assessment of cardiovascular function using data from wearable sensors. Using machine learning methods to detect abnormal heart rhythms with cost-effective sensors.