
This is a great question! In short, there is a network effect with our platform. Each sample we sequence contributes to a database of novel bacteria, viruses, and fungi that we can use to discover new microbes.



David put it nicely (and more succinctly) in another reply: "In short, there is a network effect with our platform. Each sample we sequence contributes to a database of novel bacteria, viruses, and fungi that we can use to discover new microbes." I would just add that discoveries also apply to new associations that we can include in future reports.

The genetics lab I used to work for spent a lot of time focusing on the microbiome and microbiota. You would be amazed at how much they affect, and I won't be surprised to see more papers similar to this one coming out as sequencing costs drop.

In bioinformatics, though, the challenge is going to be sifting through tons of data more than the sequencing tech itself. (imho; I'm just a sysadmin)


Thanks for sending this! One aspect of our test we're excited about is the ability to profile viruses, fungi, and bacteriophages, since we use shotgun whole-genome sequencing on the microbes (vs. a technology like 16S that only profiles a single bacterial gene).

Good question. We're targeting analysis of "next generation" DNA sequencing data, as is commonly applied in research settings today. Clinicians and researchers are beginning to use sequencing to try to identify bacteria, viruses, or other organisms in complex samples (either from patients or from environmental sources).

These complex “metagenomic” samples can be processed using our tool (we've set up a demo on the submitted link), in order to understand a sample's composition, or to see if there are relevant pathogens in the sample.
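
To make "understanding a sample's composition" concrete: once each read has been assigned a taxon by a classifier (the classifier itself is out of scope here; the labels below are made up for illustration), the composition is essentially a relative-abundance table. A minimal sketch:

```python
from collections import Counter

# Hypothetical per-read taxonomic assignments from a classifier.
read_assignments = [
    "E. coli", "E. coli", "S. aureus", "E. coli",
    "unclassified", "S. aureus", "E. coli",
]

counts = Counter(read_assignments)
total = sum(counts.values())

# Relative abundance: fraction of reads assigned to each taxon.
composition = {taxon: n / total for taxon, n in counts.items()}
for taxon, frac in sorted(composition.items(), key=lambda kv: -kv[1]):
    print(f"{taxon}: {frac:.1%}")
```

Real tools weight this by genome size and classification confidence, but the output has the same shape: taxa and their fractions, which is also where you'd flag relevant pathogens.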


Microbiomics and Metagenomics.

Wherever you look, there's opportunity for sampling and sequencing - in the soil, inside livestock, on hospital floors - that could lead to novel life-saving or industrial applications or at worst, a better understanding of the natural world.

To do this at scale takes capital and resources - difficult for many labs, but not a problem for YC Research.


It helps you sequence a genome. On the website they give examples like sequencing beer yeast.

I think the more interesting current application of sequencing is pathogen detection. Pathogens (mostly) have smaller and simpler genomes, and tracking them and their features improves epidemiology and the choice of treatments.

This is the real-world use-case! We're using it to accelerate the discovery and elucidation of novel or unknown gene function in previously unstudied fungal strains.

I won't be surprised if many new discoveries come out of the dataset uBiome is constructing, simply because no one has ever collected such a dataset before, and certainly not at this scale.

That's the thing: We sequence a lot of microbials already. Some, it's actually hard NOT to sequence.

Say, for instance, that you are sequencing an insect. To do that, you need at least a part of the insect. When you sequence it, you won't just find that insect's DNA in there, but DNA from viruses and bacteria that live in that insect. The same thing will happen if you are sequencing from a plant, or a human.

Contamination from other sources is so common that, after getting a bunch of reads from a large organism, it's pretty much mandatory to compare them against a reference of the same species and against a DNA database of microbials, removing the reads that hit a contaminant so that the assembly we produce represents the organism correctly.

Other times, we just look for said microbials specifically. Imagine I want to know the bacteria that grow in the roots of a wheat plant. I could try to culture them all in a lab, and if something doesn't grow, I lose it. Or I could sequence the root, take out everything that actually looks like wheat, and try to assemble bacteria out of the rest of the DNA.
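
The "take out everything that looks like wheat" step can be sketched in a few lines. Assume we have a set of k-mers from a host (wheat) reference; any read sharing most of its k-mers with the host is discarded, and the remainder goes to microbial assembly. Real pipelines use proper aligners (e.g. BWA or Bowtie2) against a full reference genome; this k-mer overlap check is only an illustration of the idea:

```python
def kmers(seq, k=4):
    """All overlapping substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def filter_host_reads(reads, host_kmers, k=4, threshold=0.5):
    """Keep reads whose k-mer overlap with the host is below threshold."""
    kept = []
    for read in reads:
        rk = kmers(read, k)
        overlap = len(rk & host_kmers) / len(rk) if rk else 0.0
        if overlap < threshold:
            kept.append(read)  # candidate microbial read
    return kept

# Toy data: a "host" sequence plus a mix of host-like and foreign reads.
host_kmers = kmers("ACGTACGTACGTACGT", k=4)
reads = ["ACGTACGTAC",    # all k-mers match host -> dropped
         "TTTTGGGGCCCC",  # no host k-mers -> kept
         "ACGTTTTTGGGG"]  # mostly foreign -> kept

microbial = filter_host_reads(reads, host_kmers)
```

Everything surviving the filter is what you'd feed into a metagenomic assembler to try to reconstruct the bacteria living in the root.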


We're an all-inclusive service provider (currently), meaning that researchers simply send their samples in and get insights back. We take care of the DNA extraction, sample QC, library prep, sequencing, bioinformatics, and data storage.

With a couple of our clients we're testing sequencing as a service, where we set up monthly contracts and they get the exact same service, protocols, etc. with record turnaround time.

Our client base includes organizations like Children's National Hospital, UCSD, Pfizer, and the USDA. Nothing proprietary just yet (except it being really hard), but we're using profits from services revenue to develop IP-driven gene pipelines and informatics: basically specific gene panels and software paired together. We're one of two commercial companies in the country offering HLA typing, which is the same kind of integrated pipeline as described above.

Additionally, we're starting to passively develop high-resolution data sets around specific novel variants, such as HLA types, to ingest into a database at some point and accelerate research.

The end goal is to virtualize the entire lab so that anyone can start research, whether academic or pharmaceutical, from a home office. This means building a software layer that allows them to directly create, design, and implement protocols that interface with the equipment in the lab. So instead of building a $1M lab, you just log in online and run your experiments.

You can think of it as what AWS did to the data center.


I was a sysadmin at a genetics company for a while, and learned quite a bit while there. This is a very accurate article on the current state of things.

I have two main takeaways for anyone curious about this:

1) Sequencing costs are getting lower, yes, but computation and data complexity is growing as scientists want to analyse more and more, especially those smart enough to realize the microbiome is really where it's at for a holistic approach. That equals massive amounts of data: we're talking petabytes over time, with 200 GB+ of data generated per day for just a few sequencers.
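
To put that 200 GB/day figure in perspective, a back-of-envelope estimate (decimal units, constant throughput assumed):

```python
# Rough data-accumulation estimate for a small sequencing facility.
gb_per_day = 200        # figure quoted above for "a few sequencers"
days_per_year = 365

tb_per_year = gb_per_day * days_per_year / 1000   # 73.0 TB/year
print(f"~{tb_per_year:.0f} TB/year")

# At that rate, a petabyte accumulates in roughly:
years_to_pb = 1000 / tb_per_year
print(f"~{years_to_pb:.1f} years to 1 PB at a constant rate")
```

And that's before throughput growth: as sequencers get faster and labs add machines, the petabyte mark arrives much sooner than a constant-rate estimate suggests.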

2) The end goal, I think, is that eventually there's going to be a sequencer in every hospital, catching tons of diseases before the patient even experiences any symptoms. It's going to be great for healthcare, and not just clinically: healthcare also has tons of money floating around.

I think the really interesting developments will be between the sequencing machine manufacturers. Illumina, Roche 454, and Ion Torrent are all doing great things, and keeping the competition in innovation mode is great for the industry.

I can't wait to see what comes of it all, plus my non-compete is finally over. There is a huge gap to be bridged between IT and the scientists that I don't think very many companies have handled well.


You summed up my concerns better than I could.

I remember when uBiome launched out of UCSF. I remember thinking “ok, they’ll sequence your microbiome and then...what exactly?”


Can anyone describe how/where this actually fits into Meta's businesses?

"As a test case, they decided to wield their model on a database of bulk-sequenced ‘metagenomic’ DNA from environmental sources including soil, seawater, the human gut, skin and other microbial habitats."

Meta has a lot of data, but I'm unaware of them having a presence in the environmental/medical diagnostics industry, which is where I assume this would be applied.

Perhaps they are going for a Bell Labs kind of structure?


cfontes – Yes, that's definitely another use case of the underlying technology, though please note that the database loaded for the demo only includes archaeal, bacterial, and viral genomes (so you'd have a very hard time getting a hit on sugarcane sequences!).

(This is nkrumm's and my project)


But you only have one genome, so you will only ever see a very small subset of that 7 million (I’m assuming that’s how it works, I’ve never used the service). Now you have access to 7 million records at the same time, which is much more powerful in terms of what you can do with that data.

Most of the fragments (sorry, I don't have a %) could be mapped to a genome. In cases where a fragment could not be mapped, it's more likely because of degradation than novel biology.

That said, we did identify some novel microbial genomes. It's not clear, though, whether they are actually ancient vs. contamination in the last 100 years or so.


This is the result one would expect. Scientists (until recently) have only been able to sequence species which can be cultured in the laboratory (you need massive amounts of DNA for sequencing). But in fact, more than 90 percent of all microbial species cannot be cultured in the lab and hence (until recently) could not be sequenced and stayed unknown. However, in the past few years, "next next generation sequencing" (that's what I like to call it) techniques emerged, and we are now able to sequence nearly everything. The umbrella terms "metagenomics" and "single-cell sequencing" are often used for such new methods, which have huuuge potential in many, many fields. Basically, the new methods eliminate the culturing step and instead use novel techniques for amplifying DNA from as little as a single cell.

Sure, all the time. Network and I/O are the biggest blockers for sequence analysis. Any organization that is working with thousands of genomes probably has its own compute resources. I know of at least one organization that is currently sending thousands of genomes to the cloud for analysis, so it's certainly feasible to some extent.
