The Stanford Data Science Initiative (SDSI) is proud to introduce the recipients of seed research awards and the inaugural cohort of Data Science Scholars. The awards support outstanding and inspiring work across Stanford in which data science methods are used to help understand and address critical scientific and societal challenges.
These awards follow two calls for proposals and applications that resulted in a combined 220 applications representing 61 unique departments in all seven schools at Stanford. We are thrilled to see the tremendous interest in and excitement for data-intensive research in every corner of campus, and we will continue to work to help our colleagues excel in advancing their ambitious goals.
Funding for these awards is provided by the generous support of SDSI’s corporate members and the School of Medicine. We hope and expect that these initial efforts will lead to further opportunities to support and catalyze the incredible data-intensive work being pioneered in all departments.
Seed funding provides researchers with valuable resources that can spark new discoveries. Through this call, we support ambitious research projects in which the development of new data science methods may help to solve a scientific or societal challenge and prove transformative to one or more domains.
Fall 2018 awards have been made to:
Risa Wechsler (Physics), Sean McLaughlin (Physics), Laurence Levasseur (Physics)
Gregory Beroza (Geophysics), Lise Marie Christelle Retailleau (Geophysics), Aurelien Mordret
Daniel McFarland (Education), Daniel Jurafsky (Linguistics & Computer Science), Bas Hofstra (Education), Londa Schiebinger (History), James Zou (Biomedical Data Science)
Gretchen Daily (Biology), Rebecca Chaplin-Kramer (Natural Capital Project)
Ali Yaycioglu (History), Antonis Hadjikyriacou (Center for Spatial and Textual Analysis), Erik Steiner, Celena Allen (Center for Spatial and Textual Analysis)
Mohsen Bayati (Operations, Information & Technology), Ramesh Johari (Management Science)
James Zou (Biomedical Data Science), Abubakar Abid (Electrical Engineering), Suzanne Tamang (Medicine), Ayin Vala (Foundation for Precision Health), Xiang Zhu (Statistics)
Omer Reingold (Computer Science), Michael Kim (Computer Science)
Jackelyn Hwang (Sociology), Nikhil Naik
Michael Frank (Psychology), Abdellah Fourtassi (Psychology)
The inaugural cohort of Data Science Scholars is a diverse group of PhD students and postdocs from all parts of Stanford who are using and developing data science methods in their research. They share a keen interest in solving problems and in exchanging knowledge with others. One primary goal of the program is to create a community of data science researchers who represent a wide array of disciplines and who can share methods and applications while creating a stimulating, innovative, and supportive environment. Thanks to the generosity of our partners in the Center for Spatial and Textual Analysis (CESTA), the Scholars program will initially be based within their workspace in Wallenberg Hall.
The Fall 2018 Stanford Data Science Scholars include:
Management Science & Engineering
Research in Computational Social Science with a dual focus on analytical tools and social science knowledge. Developing and applying methodological tools to analyze social networks and creating guidelines for how to use these datasets in crafting public policy and business decisions.
Research in applying data-driven methods to physical sciences problems. Developing a machine learning model that can predict previously unknown two-dimensional materials. Aiming not just to elucidate the properties of some materials but to list all possible materials with a particular property, which has not yet been realized in computational materials science.
Developing algorithms for efficient learning and inference in probabilistic models that can scale to large and complex datasets. Particularly motivated by the potential impact of this core research on applications in computational sustainability.
Earth Systems Science
Using a range of data analysis techniques to identify the causes and impacts of droughts, including clustering analysis in time and space, time series analysis, and econometric analysis. Aiming to intelligently combine different datasets with powerful algorithms to provide insights on climate risks around the world that can help inform sustainable development efforts.
Focusing on challenges unique to the humanities in data science: machine learning applications and interpretations of humanities data and ethics and bias in digital archives. Systematically examining the problem of scale and data quality in data-driven humanities research and exploring how data science visualizations and test results can integrate into a linear storytelling format.
Developing new statistical and machine learning tools for problems inspired by bioengineering applications, including new methods for monitoring disease progression from sparse data (e.g., clinical visits) and algorithms for extracting biomechanical biomarkers from mobile phone videos.
Focusing on the role of political partisanship in American politics, and specifically on how partisans’ antipathy towards members of the other party shapes their political and nonpolitical behavior, using internet search data and web browsing data.
Studying ways by which ambitious data science can be done painlessly. Started the open-source software project ClusterJob (CJ), an experiment management system (EMS) that allows data scientists to conduct million-CPU-hour experiments painlessly and reproducibly on remote clusters either in-house or in the cloud.
Studying the effects of Internet exposure on affective political outcomes using online experiments, text analysis, and passive mobile phone data collection. Focal points for research include news and hostility in the United States and Myanmar.
Developing data-driven approaches for policy decision-making, especially in developing countries. Focusing on advancing causal inference methods by leveraging recent progress in machine learning, with the goal of building efficient, fair, and robust statistical frameworks that are accessible to policymakers.
Expanding data science learning opportunities, including reproducible analysis with R. Working on a curriculum of videos, articles, and exercises for teaching data science in R. Developed two R packages, purrrplus and combineR, to solve common data analysis pain points.
Developing new analytical methods for important biological applications that can build upon the foundations of human health, including assessment of cardiovascular function using data from wearable sensors. Using machine learning methods to detect abnormal heart rhythms through cost-effective sensors.