When we seek medical help, we hope that our physician has treated other patients with the same ailment, so that they will readily recognize our symptoms and provide the most effective treatment.
We also hope that they will acknowledge our individuality and learn enough about us to avoid contraindications and misdiagnoses. The dramatic growth in our data-gathering capacity over the last decades has profoundly changed how doctors can identify “patients like us” and understand our individuality.
Recent advances in technology give us an exquisitely detailed view of the molecular basis of the human body. We can sequence the DNA of individual patients, or even of single cells from a tumor. We can distinguish and count the different cell types in a biopsy. We can estimate how many different viruses our body has been exposed to. High-resolution images give us dynamic views of the activation of different areas of the brain and of their connectivity.
Our data-gathering capacity has also increased outside laboratories. Through cell phones, we can track a subject’s location, movements and social interactions. Pollution, precipitation, temperature and other environmental variables are recorded by large numbers of distributed sensors. Portable fitness devices measure heart rate and activity continuously. Records of searches and purchases on the web tell us about the interests of individuals, how they spend their time, and what products they consume. Electronic medical records provide databases rich with patient information. And direct-to-consumer testing has produced a wealth of information that individuals are often willing to contribute to scientific research.
The richness and sophistication of these data sets, combined with the power of machine learning, give hope that we can use these resources to understand the biology behind human diseases and, in parallel, improve diagnosis and treatment. But there are many challenges to overcome. For example, accessing and linking different data sources (owned by separate entities and subject to privacy rules) is a non-trivial endeavor that requires sophisticated computational solutions. Moreover, the margin of error we are willing to tolerate for a prediction rule is much lower in medicine than in other settings: there is a profound difference between recommending the wrong medicine and recommending the wrong ad.
Examples of faculty working in the area include Euan Ashley, Chiara Sabatti, Robert Tibshirani, Lu Tian, Manny Rivas and Jure Leskovec.