Data Science Computation Platform

Stanford has long been a leader in the development, analysis, and use of data-intensive methods. Modern scientific breakthroughs and discoveries in almost every field require massive computational resources to explore novel ideas and paradigms at scales that have thus far been the sole purview of industry.

GPU-Centric Cluster

Emmanuel Candès: Welcome to the Future of Stanford Data Science—Conference Keynote

To empower faculty whose research depends on such high-powered computation—and to attract and retain the most talented students, scholars, and faculty—Stanford is making a substantial investment in a large, high-performance, GPU-centric cluster. As envisioned, the infrastructure will offer select investigators the ability to build, analyze, and use large-scale models.

Research Data Scientists

To maximize the utility of the system, Stanford University is investing in research data scientists who can: 

  • Design and optimize tools to get the best system performance. 
  • Work with research groups and students to educate them about techniques and methods to optimize usage. 
  • Provide software tools and operating workflows so groups can easily follow the best open science practices. 

The team will also help Stanford researchers match jobs to the appropriate resources based on job characteristics: data-dominated vs. computation-dominated, heterogeneous vs. homogeneous node requirements, CPU-bound vs. GPU-bound, and scale (e.g., some jobs may benefit from resources available only at a National Computing Center). The research data scientists will be skilled in efficiently performing large-scale simulations and machine-learning tasks, and will possess other specialized skills.

Open science practices (organizing, preserving, and sharing data, metadata, results, and computational methods), as mandated by NIH, NSF, and OSTP, are best integrated into the standard workflow from the start. Such practices become especially important should a job move from our system to other systems, such as national infrastructure. We are investing in a team of people, including systems engineers, research data scientists, and open science engineers, whose contributions will be worthy of publication authorship.