Metagenomic Analysis of Crohn’s Disease Patients Identifies Changes in the Virome and Microbiome Related to Disease Status and Therapy, and Detects Potential Interactions and Biomarkers.

Pérez-Brocal V, García-López R, Nos P, Beltrán B, Moret I, Moya A.
  • *Genomics and Health Area, Fundación para el Fomento de la Investigación Sanitaria y Biomédica de la Comunidad Valenciana (FISABIO)-Salud Pública, Valencia, Spain; †Institut Cavanilles de Biodiversitat i Biologia Evolutiva, Universitat de València, Paterna, Spain; ‡CIBER en Epidemiología y Salud Pública (CIBERESP), Madrid, Spain; §Servicio de Medicina Digestiva, Hospital Universitari i Politècnic La Fe, Valencia, Spain; ‖CIBER en Enfermedades Hepáticas y Digestivas (CIBEREHD), Madrid, Spain; and ¶Servicio de Medicina Digestiva, Instituto de Investigación Sanitaria La Fe, Valencia, Spain.


BACKGROUND: The aim of this study was to survey the bacterial and viral communities in different types of samples from patients with Crohn’s disease (CD) at different stages of the disease, and to relate their distribution to the origin and progression of this disorder.

METHODS: A total of 42 fecal samples and 15 biopsies from 20 patients with CD and 20 healthy control individuals were collected for bacterial 16S rRNA gene profiling and DNA/RNA virome metagenomic analysis through 454 pyrosequencing. Their composition, abundance, and diversity were analyzed, and comparisons of disease status, patient status, and sample origin were used to determine statistical differences between the groups.

RESULTS: Bacterial composition and relative abundance in new-onset patients with CD differed markedly from control individuals. Individual variability and sample origin had a stronger impact on viral communities than the disease, contrary to what was observed for bacterial populations, although increased numbers of overrepresented viruses were observed in feces from patients with CD. Correlation-based networks were constructed to show potential relations between bacteria and between those and viruses.

CONCLUSIONS: The bacterial community reflects the disease status of individuals more accurately than its viral counterpart. However, numerous viral biomarkers specifically associated with CD were identified. Because viruses can modulate bacterial communities, the correlation networks between both communities constitute a step forward in unraveling their interactions under normal and CD conditions.

PMID: 26313691



Crohn’s disease (CD) is a chronic, relapsing, transmural inflammation that affects the gastrointestinal tract but also has extraintestinal manifestations. Patients affected by CD suffer a noticeable reduction in their quality of life. There is no medical or surgical cure for CD, and the target of the existing therapies is to induce and maintain remission of the disease1. CD etiology is poorly understood; proposed causes include genetic susceptibility (e.g. involving genes implicated in the autophagy pathway) and environmental factors (diet, habits, gut microbiota, etc.) that can trigger the inflammatory response2,3.

In this study, we used a metagenomic approach based on high-throughput sequencing to analyze two of the microbial communities associated with CD. One of these, the gastrointestinal bacterial community (i.e. microbiome), has been the object of several studies, but with inconclusive and controversial results4,5. In comparison, gut viruses (i.e. virome) have received much less attention until recently and, therefore, their interactions with the enteric bacteria have been overlooked6,7.

Our aim was to study both communities in diverse types of samples (feces and biopsies) from a healthy control group as well as from a group of active CD patients at different stages of the disease. We wanted to understand their distribution in order to relate it to the origin and progression of the symptoms of the disease.

We collected 42 fecal samples and 15 biopsies from 20 CD patients and 20 healthy control individuals. Bacterial 16S rRNA gene profiles and DNA/RNA viromes were obtained from the same set of samples by next-generation sequencing on the 454 pyrosequencing platform. All the CD patients had active symptoms and were classified as new-onset patients (those recently diagnosed with CD and having received no specific treatment for the disease at the time of sample collection), as patients under treatment, or as patients sampled before or during surgery.

We obtained 843,540 and 2,152,590 raw reads for the 57 bacterial and 57 viral samples, respectively. Approximately 67.8% and 17.7% of them matched bacterial and viral hits from the databases, respectively, with the remaining reads corresponding to other origins, e.g. human or not matching any known organism.
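As a quick check of what these percentages mean in absolute terms, the short sketch below converts the reported totals into approximate read counts. The totals and fractions come from the text; because the percentages are approximate, the resulting counts are only estimates, and the function names are illustrative.

```python
# Raw read totals and approximate matched fractions, as reported in the text.
RAW_READS = {"bacterial": 843_540, "viral": 2_152_590}
MATCHED_FRACTION = {"bacterial": 0.678, "viral": 0.177}
N_SAMPLES = 57  # samples sequenced for each of the two data sets


def approx_matched_reads(dataset: str) -> int:
    """Estimate how many raw reads matched a database hit (rounded)."""
    return round(RAW_READS[dataset] * MATCHED_FRACTION[dataset])


def mean_raw_reads_per_sample(dataset: str) -> float:
    """Average number of raw reads per sample for one data set."""
    return RAW_READS[dataset] / N_SAMPLES


for name in RAW_READS:
    print(name, approx_matched_reads(name), round(mean_raw_reads_per_sample(name)))
```

Under these figures, roughly 572,000 bacterial reads and 381,000 viral reads had database matches, so the viral data set yielded fewer assignable reads despite its larger raw total.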

In order to compare the effect of different variables on the microbiome and virome distribution, samples were grouped according to the disease status of the subjects (healthy individuals vs CD patients), patient status (new onset, under treatment, and pre-surgery), sample origin (feces vs biopsies), and treatment with steroids and/or immunosuppressors.

We analyzed, separately for bacteria and viruses, the composition, abundance, richness and diversity of the samples at different taxonomy levels.
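To make the terms concrete, the sketch below computes richness and one common diversity measure from a vector of per-taxon counts. The text does not say which diversity index was used, so the Shannon index here is an assumption, and the counts are invented for illustration.

```python
import math


def richness(counts):
    """Number of taxa observed at least once in a sample."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical counts per taxon in two samples (not data from the study).
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]

print(richness(even_sample), round(shannon(even_sample), 3))
print(richness(skewed_sample), round(shannon(skewed_sample), 3))
```

Both hypothetical samples have the same richness, but the evenly distributed one has the higher Shannon value, which is why richness and diversity are reported separately.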

The composition of bacterial operational taxonomic units (OTUs) and viral species is illustrated as Venn diagrams (Figure 1). We identified more distinct bacterial OTUs but fewer viruses in fecal samples from the control group than in those from the CD patients, and within the latter, feces contained more distinct bacteria and viruses than biopsies.

We found 17 bacterial phyla in biopsies and 12 in feces (Figure 2). The same phyla were the most abundant in feces and in biopsies, but with different relative weights. In the control group and in CD patients considered as a whole, the ratios between Bacteroidetes and Firmicutes were inverse, but with large variations among subgroups of CD patients. In feces from CD patients, Proteobacteria and Actinobacteria were detected at higher frequencies and Tenericutes at lower proportions. The phylum Fusobacteria was identified only in CD patients.

As for viruses (Figure 3), DNA viruses comprised prophages, bacteriophages and eukaryotic viruses. Bacteriophages emerged as dominant in feces while eukaryotic viruses were relatively more abundant in biopsies. RNA plant viruses were predominant in feces, but not in biopsies, where other viruses were detected with higher frequency.

We also analyzed statistical differences between groups of samples to identify possible sources of variability (Figure 4). Bacteria differed most significantly between control and CD individuals, including the new-onset ones, and between feces and biopsies, and to a lesser extent between groups defined by treatment with immunosuppressors and/or steroids. Viruses also differed by sample origin and, to some extent, between CD and control individuals (but not for the new-onset ones) and by treatment with steroids and/or immunosuppressors in stool samples.
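The figure legends name adonis as the nonparametric method behind these comparisons. As a rough illustration of the idea, the sketch below implements a simplified one-factor permutational analysis of variance on a distance matrix. The Bray-Curtis distance, the number of permutations, and the sample profiles are all assumptions made for the example, not details taken from the study, and this is not a substitute for the adonis implementation itself.

```python
import random
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance profiles."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def distance_matrix(profiles):
    """Symmetric matrix of pairwise Bray-Curtis dissimilarities."""
    n = len(profiles)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = bray_curtis(profiles[i], profiles[j])
    return dist


def pseudo_f(dist, labels):
    """Pseudo-F statistic: among-group vs within-group sums of squared distances."""
    n = len(labels)
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    a = len(groups)
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = sum(
        sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
        for members in groups.values()
    )
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(profiles, labels, n_perm=999, seed=0):
    """Return the observed pseudo-F and a permutation p-value."""
    rng = random.Random(seed)
    dist = distance_matrix(profiles)
    f_obs = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between labels and profiles
        if pseudo_f(dist, shuffled) >= f_obs:
            hits += 1
    return f_obs, (hits + 1) / (n_perm + 1)


# Hypothetical counts for three taxa in eight samples (not data from the study).
profiles = [
    [50, 30, 20], [48, 32, 20], [52, 28, 20], [49, 31, 20],  # "control"
    [20, 30, 50], [22, 28, 50], [18, 32, 50], [21, 29, 50],  # "CD"
]
labels = ["control"] * 4 + ["CD"] * 4

f_obs, p_value = permanova(profiles, labels)
print(f_obs, p_value)
```

A large pseudo-F with a small permutation p-value indicates that samples are more similar within groups than between them, which is the sense in which the bacterial groupings above "differed more significantly" than the viral ones.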

We also searched for potential biomarkers (i.e. features, in this case bacteria and viruses, that are differentially represented in one group of samples compared with another). In bacteria (Figure 5), those differing between feces from the control group and CD patients included significantly higher frequencies of members of the Actinobacteria, Gammaproteobacteria and Fusobacteria in the case of disease, but lower frequencies of Tenericutes and the majority of Clostridiales (with some exceptions). Bacteroidetes showed representatives with significantly higher and lower frequencies in similar proportions.

As for viruses (Figure 6), those showing higher frequency in feces from CD patients were less numerous than those from the control group. However, we found more viruses with a higher abundance in new-onset patients with CD, even more in active patients and especially before surgery. Feces from the subgroups of patients with CD showed few relevant viral features. Markers were more frequent in feces than in biopsies. Comparisons by treatment with steroids and/or immunosuppressors resulted in few differential viruses.

Finally, we integrated the bacterial and viral abundances to establish basic correlation networks in order to identify positive and negative correlations among bacteria and between bacteria and bacteriophages. These networks could serve as a starting point for future analyses of interactions within the gut of different types of patients with CD.
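One simple way to build such a network is to correlate the abundance of every pair of features across samples and keep the pairs whose correlation is strong enough. The sketch below does this with Spearman rank correlation and a fixed cutoff. The choice of Spearman, the 0.8 threshold, the feature names, and the abundance values are all assumptions for illustration, since the text does not specify them.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def correlation_edges(abundances, threshold=0.8):
    """Return (feature_a, feature_b, rho) for pairs with |rho| >= threshold."""
    edges = []
    for a, b in combinations(sorted(abundances), 2):
        rho = spearman(abundances[a], abundances[b])
        if abs(rho) >= threshold:
            edges.append((a, b, rho))
    return edges


# Hypothetical abundances across five samples (not data from the study).
abundances = {
    "taxon_A": [1, 2, 3, 4, 5],
    "taxon_B": [5, 4, 3, 2, 1],
    "taxon_C": [3, 1, 4, 1, 5],
    "phage_X": [10, 20, 30, 40, 50],
}

for a, b, rho in correlation_edges(abundances):
    sign = "positive" if rho > 0 else "negative"
    print(f"{a} -- {b}: {sign} (rho = {rho:.2f})")
```

Positive edges link features that rise and fall together across samples, and negative edges link features that move in opposite directions. Neither kind establishes a direct interaction, which is consistent with treating the networks as a starting point for further analysis.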



  • Our results indicate that bacterial communities may reflect the disease status of the patients more accurately than viruses, which show higher individual sample variability.
  • Bacteria show the most significant differences between the Control group and the patients with CD, particularly the new-onset ones. This may be an indicator of a dysregulated environment associated with inflammation. After therapy begins, our data suggest a loss of bacterial taxa in feces and biopsies.
  • The main differences observed in the potential bacterial biomarkers between the Control group and the new-onset CD group suggest that the condition per se may be, at least in part, associated with changes in the bacterial community, regardless of its stage, although the treatment must have a major impact on the microbiota in actively treated patients.
  • As for the virome, the grouping of samples shows lower statistical values compared with those of bacteria. In viruses, unlike in bacteria, variation between feces from healthy individuals and new-onset CD patients was not significant.
  • Despite that, more viral biomarkers were detected in feces from patients with CD, showing that uneven distribution of certain viruses was observed regardless of therapy, and thus it may be linked to the inflammatory process.
  • New-onset CD samples displayed higher diversity of viral taxa than healthy individuals, but this diversity seems to be drastically reduced by the treatment. Thus, viral increments linked to the first appearance of symptoms of CD before the beginning of the therapy may be relevant.
  • No individual viruses can be categorically assigned as etiological factors of CD. However, a larger number of viruses were more frequent in CD patients than in healthy controls, and viruses modulate the bacterial populations. Together, these observations make the study of the interactions between both communities an essential step to understand more complex indirect relationships between viruses and CD.



  1. Baumgart DC, Sandborn WJ. Inflammatory bowel disease: clinical aspects and established and evolving therapies. Lancet 2007;369:1641–57.
  2. Abdullah M, Syam AF, Simadibrata M et al. New insights on the pathomechanisms of inflammatory bowel disease. J Dig Dis 2013;14:455–62.
  3. Manichanh C, Borruel N, Casellas F et al. The gut microbiota in IBD. Nat Rev Gastroenterol Hepatol 2012;9:599–608.
  4. Sartor RB. Microbial influences in inflammatory bowel diseases. Gastroenterology 2008;134:577–94.
  5. Cadwell K, Patel KK, Maloney NS et al. Virus-plus-susceptibility gene interaction determines Crohn’s disease gene Atg16L1 phenotypes in intestine. Cell 2010;141:1135–45.
  6. Khan RR, Lawson AD, Minnich LL et al. Gastrointestinal norovirus infection associated with exacerbation of inflammatory bowel disease. J Pediatr Gastroenterol Nutr 2009;48:328–33.



Vicente Pérez Brocal, Ph.D.

Research Fellow at the Genomics and Health Area, Foundation for the Promotion of Health and Biomedical Research of Valencia Region (FISABIO).

Avda. de Catalunya 21, 46020,

Valencia (Spain).



Figure legends:

Figure 1. Relative abundance of bacteria by group of samples based on the 16S rRNA gene, at phylum taxonomic level.

Sample group code: CDFa: CD patients’ fecal samples, new onset; CDFb: CD patients’ fecal samples, active; CDFc: CD patients’ fecal samples, active before surgery; CDBa: CD patients’ endoscopic biopsies, new onset; CDBc: CD patients’ surgical biopsies, active; CDF: CD patients’ fecal samples; CDB: CD patients’ biopsies.

Figure 2. Relative abundance of DNA viruses (top) and RNA viruses (bottom) by group of samples, at family taxonomic level.

Sample group code: CDFa: CD patients’ fecal samples, new onset; CDFb: CD patients’ fecal samples, active; CDFc: CD patients’ fecal samples, active before surgery; CDBa: CD patients’ endoscopic biopsies, new onset; CDBc: CD patients’ surgical biopsies, active; CDF: CD patients’ fecal samples; CDB: CD patients’ biopsies.

Figure 3. Comparisons between categories based on the nonparametric statistical method adonis, in bacteria and viruses.

Figure 4. Bacterial biomarkers or distinctive bacteria between feces from the Control individuals (n=20) and CD patients (n=22). The taxonomic level is preceded by a prefix: none: species, g__: genus, f__: family, o__: order and c__: class.

Figure 5. Number of viral biomarkers in the samples by groups.

Sample group code: Administration of steroids / immunosuppressors: NN: No-No, NY: No-Yes, YN: Yes-No; YY: Yes-Yes.