Every environment has an associated microbiome (or ‘microbiota’ if you’re feeling picky), from our homes to the soil we walk on and the seas we swim in. Each of us has our own unique microbiome too, with approximately 3 microbial cells to every human cell on and in our bodies (which is a bit freaky if you think about it too hard!). Even when we compare the microbiota of different body areas, we find that each ‘compartment’ has its own distinct population. If you’ve ever wondered how scientists actually work this out, wonder no more! Read on for a quick guide to microbiome analysis. If you’re thinking, hang on, but what the f*ck actually is the microbiome, I’ve written a short blog on that too!

Sample collection

To analyse the microbiome, we first need to collect a sample from that environment, and how we do that depends on what we are interested in. In all cases, it’s important to know that the sample has not been contaminated with stuff from other environments along the way. If we want to study the microbiome of – for example – a dog’s ear, then we need to make sure that there’s no chance that we might also pick up bugs from the dog’s tongue (should he be particularly friendly and want to lick us), from his coat, from the owner, from the owner’s other pets, from the owner’s house, from the transport between sampling and the lab, etc. So it’s sterile (super-clean) swabs and tubes all the way, and careful handling. I collect all of my samples using sterile double-guarded swabs; these are similar to cotton buds but with two sheaths around them to limit contamination. The swab is opened up to actually collect the sample and then closed off again to make sure that nothing else gets in.

I’m interested to know what the microbiome is like in the female reproductive tract – you can fill in your own details about how we actually collect the samples for that. Another area of interest is the gut microbiome – in that case we collect faecal (poo) samples as a representation of what’s going on up the business end, so to speak. This is often the easiest but least glamorous part of the job!

Isolating microbes

In the olden days (i.e. when I was doing my PhD) we used to determine what bacteria were present in a sample by streaking our samples onto agar plates made with specific growth media.

Citrobacter on XL agar by Iqbal Osman CC-BY 2.0

The bugs would grow and then we could identify and count them by looking at them down a microscope. The problem with this method is that some bacteria that grow perfectly happily in their natural environment simply refuse to grow in the lab. As a result, all the data collected by the “grow ‘em and count ‘em” method is biased towards bacteria that transport well and grow well under lab conditions. Nowadays, we usually skip this part entirely and instead extract all of the DNA from our samples using specially designed extraction kits. We can then identify what bacteria we have by analysing the DNA that we have collected. This is a much more accurate way of working. Since the advent of DNA techniques, it’s been estimated that old-school culture methods were only able to account for around 1% of the bacteria within a sample! We were actually missing 99% of the bugs that were present in an environment because they refused to grow for us!

DNA analysis

So once we have our DNA, how do we analyse it to determine what bacterial species are present? Well, the first thing we do is put it onto a machine (a sequencer) that reads out the DNA sequence for us. There are several different types of machine that we can use – all of them are incredibly complicated but effectively what we get out at the end is millions of short stretches of letters that represent the sequence of the DNA from the samples we collected. We then need to work out what bugs those sequences came from.

The best way to visualise this is to think of books, and to do this best, we’re going to take a short detour. Let’s imagine that we’ve been sent to investigate a murder. We know that there’s some connection between the murder and the contents of the library of the murdered man (yes, this is contrived, but stick with it – it will help). If we know what books he had on his shelf, we can solve the mystery. However, the murderer has ripped all of the books into small pieces before leaving and has burnt all the covers. How can we identify what books were there?

Well, if we take the short pieces of book that we have available, and get hold of copies of all the books that might have been in the library, then we can search through those books looking for matches. Does this phrase that we found on this piece of paper occur in this book? Yes it does. Maybe that book was in the library. If we get enough matches with enough distinct phrases (not the “if”, “and”, “the” phrases, the distinctive phrases) then we can be sure that the particular book was there. We have the murderer banged to rights!

So, back to DNA. Every living organism contains within it the information for making it do its thing encoded into DNA which is stored inside the cell in long strings called chromosomes. You can think of this as the instruction manual for making the organism. The instruction sets for making each organism are unique to that particular organism. Over time, scientists have built up collections of DNA that we know come from particular organisms and collected them as “reference genomes”. These are the books in our library.

We take all the short stretches of DNA that we get from the machines (these are our ripped-up pieces of book) and we then look through all the possible books to find possible matches. If we find a good match, then that shows us that the organism that the instruction manual belongs to must have been in our sample. But it gets better than that. If we find many more distinctive phrases from organism A’s manual than from organism B’s, we can also work out that there was more of organism A than of organism B. By matching and counting millions and millions of DNA fragments (words and phrases) against all our reference genomes (instruction manuals) we can work out not only what was in our sample, but roughly how much of each there was.
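If you like to see ideas as code, here is a minimal sketch of the matching-and-counting step. It is a toy under loud assumptions: the two “reference genomes” and the reads are invented, the names `REFERENCES` and `count_matches` are mine, and it only accepts exact matches. Real tools work with genomes millions of letters long and tolerate small mismatches, so treat this as an illustration of the idea rather than how any particular pipeline does it.

```python
from collections import Counter

# Invented toy "reference genomes" (the books in our library).
# Real reference genomes are millions of letters long.
REFERENCES = {
    "organism_A": "ATGCGTACGTTAGCCGATAGGCTA",
    "organism_B": "TTGACCGGTAACCTGAGGTCAATC",
}


def count_matches(reads, references):
    """Count reads that occur, letter for letter, in exactly one reference."""
    counts = Counter()
    for read in reads:
        hits = [name for name, genome in references.items() if read in genome]
        # Keep only the distinctive phrases: a read found in no book, or in
        # more than one book, tells us nothing about which book was present.
        if len(hits) == 1:
            counts[hits[0]] += 1
    return counts


# Invented short reads (the ripped-up pieces of book).
reads = ["GCGTACGT", "CCGATAGG", "TAGCCGAT", "CCGGTAAC", "GGGGGGGG"]
counts = count_matches(reads, REFERENCES)
print(counts)
```

Three of the reads match only organism A’s book, one matches only organism B’s, and the last one matches nothing and is ignored, so the tally suggests more of A than of B in this made-up sample.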

And then it gets even better still. The instruction manuals for building two different cars will have many more similarities between them than they will with the instruction manual for building a computer. We might find fragments of books that are similar to both of these car instruction manuals but that don’t match either of them perfectly. In that case, we can probably conclude that there was some kind of car instruction manual in our original library, even though we don’t know the exact make or model. We can do the same thing with the DNA – often we can identify a certain category or set of bacteria, fungi or viruses within our sample without knowing the exact kind. This can be really helpful information.
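The “some kind of car manual” reasoning can be sketched in a few lines. The idea: if a fragment looks like several reference organisms, report the most specific group that all of them share. The lineages below are heavily simplified (only four levels), and `LINEAGES` and `most_specific_shared_group` are names I have made up for illustration; real classifiers use far more detailed family trees and scoring.

```python
# Simplified, illustrative lineages, listed from broad to specific.
LINEAGES = {
    "Escherichia coli": [
        "Bacteria", "Enterobacteriaceae", "Escherichia", "Escherichia coli"],
    "Citrobacter freundii": [
        "Bacteria", "Enterobacteriaceae", "Citrobacter", "Citrobacter freundii"],
    "Lactobacillus crispatus": [
        "Bacteria", "Lactobacillaceae", "Lactobacillus", "Lactobacillus crispatus"],
}


def most_specific_shared_group(matched_organisms, lineages):
    """Return the narrowest group common to every matched organism."""
    paths = [lineages[name] for name in matched_organisms]
    shared = None
    # Walk down the lineages level by level until the organisms disagree.
    for levels in zip(*paths):
        if len(set(levels)) != 1:
            break
        shared = levels[0]
    return shared


# A fragment that resembles two "car manuals" but neither exactly.
group = most_specific_shared_group(
    ["Escherichia coli", "Citrobacter freundii"], LINEAGES)
print(group)
```

Here the fragment resembles two organisms from the same family, so the most we can honestly say is that something from that family was present, which is the make-or-model-unknown situation.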

Comparing the results

Once we have our list of what’s present in our samples, we can then use bioinformatics and statistical software to compare the bacterial populations of a specific environment (e.g. the gut) between groups to identify similarities and differences. That’s how we are able to describe how the gut microbiome of a breastfed baby differs from that of its formula-fed peers.
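To give a flavour of what “comparing” can mean in practice, here is a small sketch. It converts raw counts into proportions (so that samples with different amounts of DNA can be compared fairly) and then calculates Bray-Curtis dissimilarity, one commonly used measure where 0 means two samples look identical and 1 means they share nothing. The counts and taxon names are invented, the function names are my own, and this is just one of many possible measures, not necessarily the one any given study uses.

```python
def relative_abundance(counts):
    """Convert raw counts per taxon into proportions that sum to 1."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: 0 = identical samples, 1 = nothing shared."""
    names = set(a) | set(b)
    difference = sum(abs(a.get(n, 0) - b.get(n, 0)) for n in names)
    total = sum(a.get(n, 0) + b.get(n, 0) for n in names)
    return difference / total if total else 0.0


# Invented read counts for two hypothetical samples.
sample_x = relative_abundance({"taxon_1": 80, "taxon_2": 20})
sample_y = relative_abundance({"taxon_1": 20, "taxon_2": 60, "taxon_3": 20})

dissimilarity = bray_curtis(sample_x, sample_y)
print(round(dissimilarity, 2))
```

With many samples per group, a table of these pairwise values is the sort of thing the statistical software then works on to ask whether samples within a group resemble each other more than they resemble samples from the other group.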

As with other methodologies, there are limitations to what microbiome analysis can tell us, and potential for bias to be introduced at each stage of the process [1]. Scientists work hard to control these limitations but it’s important to be aware of them when communicating results.

I hope you agree it’s pretty fascinating stuff and that we have much to learn from this type of analysis. If you’re interested in getting to know a bit more about the microbiome then check out Bik’s Picks on Microbiome Digest; it’s a great place to start.



1. Pollock J, Glendinning L, Wisedchanwet T, Watson M. (2018). The madness of microbiome: Attempting to find consensus “best practice” for 16S microbiome studies. Applied and Environmental Microbiology. doi:10.1128/AEM.02627-17.


Thank you to my co-author Dr Andy Law and to Dr Jolinda Pollock, Prof Mick Watson and Dr Samantha Lycett for helping us translate the complexities of microbiome analysis into easy-to-digest pieces!

Header image ‘DNA-Extraction’ by Col Ford and Natasha de Vere CC-BY 2.0.