
Genomics: Wanted: More Data, the Dirtier the Better

Introversion

The computational immunologist Purvesh Khatri embraces messy data as a way to capture the messiness of disease. As a result, he’s making elusive genomic discoveries.

Quanta Magazine said:
To distill a clear message from growing piles of unruly genomics data, researchers often turn to meta-analysis — a tried-and-true statistical procedure for combining data from multiple studies. But the studies that a meta-analysis might mine for answers can diverge endlessly. Some enroll only men, others only children. Some are done in one country, others across a region like Europe. Some focus on milder forms of a disease, others on more advanced cases. Even if statistical methods can compensate for these kinds of variations, studies rarely use the same protocols and instruments to collect the data, or the same software to analyze it. Researchers performing meta-analyses go to untold lengths trying to clean up the hodgepodge of data to control for these confounding factors.

Purvesh Khatri, a computational immunologist at Stanford University, thinks they’re going about it all wrong. His approach to genomic discovery calls for scouring public repositories for data collected at different hospitals on different populations with different methods — the messier the data, the better. “We start with dirty data,” he says. “If a signal sticks around despite the heterogeneity of the samples, you can bet you’ve actually found something.”

This strategy seems too easy, but in Khatri’s hands, it works. Analyzing troves of public data, Khatri and colleagues have uncovered signature genes that could allow clinicians to detect life-threatening infections that cause sepsis, classify infections as bacterial or viral, and tell if someone has a specific disease such as tuberculosis, dengue or malaria. Last year Khatri and two other scientists launched a company to develop a device for measuring these gene signatures at a patient’s bedside. In short, they’re deciphering the host immune response and turning key genes into diagnostics.

Over the past year Khatri discussed his ideas with Quanta Magazine over the phone, by email and from his whiteboard-lined Stanford office. An edited and condensed version of the conversations follows.

What turned you on to biology?

I left India and came to the U.S. in the “fix the Y2K bug” rush with plans to get a master’s in computer science and become a software engineer. Months after arriving at Wayne State University in Detroit I realized that writing software for the rest of my life was going to be really boring. I joined a lab working on neural networks.

But then my adviser switched to bioinformatics and said he’d pay my tuition if I switched with him. I was a poor Indian grad student. I thought, “You’re going to pay my salary? I’ll do whatever you are doing.” That’s how I moved into biology.

...