Marlowe – Stanford’s GPU-Based Computational Instrument

Stanford has long been a leader in the development, analysis, and use of data-intensive methods. Modern scientific breakthroughs and discoveries in almost every field require massive computational resources to explore novel ideas and paradigms at scales that have thus far been the sole purview of industry.
GPU-Based Computational Instrument
To empower faculty whose research depends on such high-powered computation—and to attract and retain the most talented students, scholars, and faculty—Stanford is making a substantial investment in a large, high-performance, GPU-based computational instrument called Marlowe. As envisioned, the infrastructure and the Research Data Science team will offer investigators the ability to build, analyze, and use large-scale models for new types of scientific discoveries.
Learn more in this Stanford Report feature.
Research Data Scientists
To maximize Marlowe's utility, Stanford University is investing in research data scientists who can:
- Design and optimize tools to get the best system performance.
- Work with research groups and students to educate them about techniques and methods to optimize usage.
- Provide software tools and operating workflows so groups can easily follow the best open science practices.

The team will also help Stanford researchers match jobs to the appropriate resources based on job characteristics: data-dominated vs. computation-dominated, heterogeneous vs. homogeneous node requirements, CPU-bound vs. GPU-bound, and scale (e.g., some jobs may benefit from resources available only at a National Computing Center). The research data scientists will be skilled in efficiently performing large-scale simulations and machine-learning tasks, and will possess other specialized skills.
Open science practices (organizing, preserving, and sharing data, metadata, results, and computational methods), as mandated by NIH, NSF, and OSTP, are best integrated into the standard workflow right from the start. Such practices become especially important should a job move from our system to other systems, such as national infrastructure.
We are investing in a team of people, including systems engineers, research data scientists, and open science engineers, whose contributions will be worthy of publication authorship.
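The job-matching criteria described above (data- vs. computation-dominated, heterogeneous vs. homogeneous nodes, CPU- vs. GPU-bound, and scale) could be sketched as a simple decision rule. This is an illustrative sketch only, not the team's actual process: the `JobProfile` fields, the `suggest_resource` function, the ordering of the checks, and the "other campus resource" label are all hypothetical. The only figure taken from this page is Marlowe's 31-node size.

```python
from dataclasses import dataclass

# Taken from the Overview on this page; everything else below is hypothetical.
MARLOWE_NODES = 31


@dataclass
class JobProfile:
    """Hypothetical description of a job, using the characteristics named above."""
    gpu_bound: bool            # True if GPU-bound, False if CPU-bound
    data_dominated: bool       # True if data-dominated, False if computation-dominated
    heterogeneous_nodes: bool  # True if the job needs a mix of node types
    nodes_needed: int          # scale of the job, in nodes


def suggest_resource(job: JobProfile) -> str:
    """Return an illustrative suggestion; the rules are assumptions, not policy."""
    if job.nodes_needed > MARLOWE_NODES:
        # Larger than the whole instrument: may need national-scale resources.
        return "national computing center"
    if job.heterogeneous_nodes or not job.gpu_bound:
        # Assumption: Marlowe's identical GPU nodes suit homogeneous, GPU-bound work.
        return "other campus resource"
    if job.data_dominated:
        return "Marlowe (plan data staging first)"
    return "Marlowe"


if __name__ == "__main__":
    training_run = JobProfile(gpu_bound=True, data_dominated=False,
                              heterogeneous_nodes=False, nodes_needed=4)
    print(suggest_resource(training_run))
```

In practice, such a decision would presumably be made in conversation with the research data scientists rather than by a fixed rule; the sketch only shows how the listed characteristics might feed into it.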
Subscribe to the SDS newsletter to receive timely updates about Marlowe.
Overview
Marlowe is an NVIDIA DGX H100 SuperPOD, built using NVIDIA’s reference architecture and designed to deliver cutting-edge computational performance. It comprises 31 NVIDIA DGX H100 nodes, collectively providing 248 NVIDIA H100 GPUs, and 2.5 PB of high-performance DDN Lustre storage.
Node Overview
Each NVIDIA DGX H100 node includes:
- GPU: 8x NVIDIA H100 80GB GPUs
- CPU: 2x Intel Xeon Platinum 8480C CPUs (112 cores/node)
- Memory: 2TB of RAM
- NVSwitch: 4x NVSwitch chips linking the GPUs over NVLink, providing up to 900 GB/s GPU-to-GPU bandwidth
- Node-local Storage: 30TB NVMe
- Networking: 8x 400Gbps NDR InfiniBand connections, providing up to 3.2Tbps bandwidth
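The per-node figures above can be multiplied out to check the cluster-wide totals. The sketch below does that arithmetic, using only numbers stated on this page. The constant and function names are illustrative, and the totals are nominal hardware capacities, not measured or user-available figures.

```python
# Per-node figures as listed in the node overview above.
NODES = 31
GPUS_PER_NODE = 8
GPU_MEMORY_GB = 80
CPU_CORES_PER_NODE = 112
RAM_TB_PER_NODE = 2
NVME_TB_PER_NODE = 30
IB_LINKS_PER_NODE = 8
IB_LINK_GBPS = 400


def cluster_totals() -> dict:
    """Multiply the per-node specifications out to nominal cluster-wide totals."""
    total_gpus = NODES * GPUS_PER_NODE
    return {
        "gpus": total_gpus,
        "gpu_memory_gb": total_gpus * GPU_MEMORY_GB,
        "cpu_cores": NODES * CPU_CORES_PER_NODE,
        "ram_tb": NODES * RAM_TB_PER_NODE,
        "node_local_nvme_tb": NODES * NVME_TB_PER_NODE,
        "infiniband_gbps_per_node": IB_LINKS_PER_NODE * IB_LINK_GBPS,
    }


if __name__ == "__main__":
    for name, value in cluster_totals().items():
        print(f"{name}: {value}")
```

The GPU total (31 × 8 = 248) and the per-node InfiniBand bandwidth (8 × 400 Gbps = 3,200 Gbps, i.e. 3.2 Tbps) agree with the figures quoted in the text.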
