While people disagree about whether species exist (and not just in bacteria), it is plain that the space of all possible genomes is not evenly populated. In short, clusters of related bacteria associated with named species certainly do exist. These clusters persist even in the face of extensive recombination. Using the sequences of housekeeping loci (such as those used in MLST), we can construct trees showing that the fringes of such species clusters are ‘fuzzy’, because some isolates carry sequence typical of two different species.

How this clustering arose and how it is maintained are not clear. In previous work with Christophe Fraser of Imperial College London, we showed the impact of recombination on species clusters, and specifically that at high enough rates, recombination can prevent the budding of daughter clusters.

The recombination rates in question are quite similar to those observed in nature. We are presently extending this work to include the accessory genome, that is, those loci that are not present in all strains. This work was done with Dr Pekka Marttinen (then a postdoc, now faculty at Aalto University in Finland) and Professor Jukka Corander.

We are also interested in the possibility that the spread of genetic information between strains and species reflects ecological opportunity, and so may decline as niche overlap decreases. We have empirical evidence for such a scenario in Campylobacter jejuni (see population genomics). We are investigating whether this explains the existence of ‘satellite species’, such as the nontypeable pneumococci that we recently described as distinct and divergent in core and accessory gene content.