WORKFLOW

1. Before sampling

Study design, networking, and paperwork

1

Study design

If you are planning a new project that involves collecting biological samples from animals, consider developing it within the framework of the EHI. Likewise, if you are already sampling animals for research or training purposes and can collect biological samples that may contribute to existing projects, we encourage you to contact us and request sampling kits. Participate in the EHI and make the most of your field work!

2

Network and sampling strategy

If you want to participate in the EHI, contact us. We anticipate three main types of participation, although we are open to discussing all forms of collaboration. Just let us know what you are willing to provide to the initiative and what you want to get from the EHI. We will then formalise the collaboration in the EHI Participation agreement.

3

Capture and sample export paperwork

Let's work together to make sure all required permits are in place before the sampling campaign!

2. In the field

Respectful animal handling and correct sample collection and storage

1

Sample collection and preservation

Familiarise yourself with the EHI sampling requirements. Samples must be frozen within 14 days, and the time elapsed and the storage conditions before freezing must be recorded. Samples must be collected as aseptically as possible, because the laboratory procedures used in the EHI (shotgun sequencing) are very sensitive to contamination. Inform yourself about the sampling options.

2

Metadata collection

The EHI follows the data reporting standard developed by the Genomic Standards Consortium (GSC), designed for accurate reporting of contextual information for samples associated with genomic and metagenomic sequencing. We request four types of metadata per sample: sampling event, host animal, environment, and sample. Metadata collection sheets and instructions can be found here.
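
As a rough illustration of how the four metadata types could be kept together for one sample, here is a minimal Python sketch. The class name, the example field names and the example values are all hypothetical and are not the columns of the EHI metadata sheets.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class SampleMetadata:
    """One record per sample, grouped into the four requested metadata types.

    The keys stored inside each block are hypothetical placeholders.
    """
    sampling_event: dict = field(default_factory=dict)
    host_animal: dict = field(default_factory=dict)
    environment: dict = field(default_factory=dict)
    sample: dict = field(default_factory=dict)

    def missing_blocks(self):
        """Return the names of the metadata blocks that are still empty."""
        return [name for name, block in asdict(self).items() if not block]


# Made-up example values, for illustration only.
record = SampleMetadata(
    sampling_event={"date": "2024-05-01"},
    host_animal={"species": "Apodemus sylvaticus"},
)
print(record.missing_blocks())
```

A check like `missing_blocks` can be run before shipment to spot samples whose records are incomplete.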

3

Sample shipment

After the sampling is done, the samples need to be shipped to the EHI in Denmark. We will assist you as much as possible with the required paperwork, the boxing, and the shipping of the samples.

3. In the lab

Secure sample storage and high-quality sample processing
The complete EHI laboratory workflow is detailed here

1

Sample homogenisation and digestion

to break down the complex matrix of the sample.

2

DNA extraction

to isolate DNA molecules from the rest of the organic material in the mixture.

3

DNA shearing

to achieve desired molecule sizes for optimal short-read sequencing.

4

Sequencing library preparation

to convert fragmented DNA molecules into a format that is compatible with the sequencing platform.

5

Sequencing library indexing

to amplify the library using primers containing unique identifiers.

6

Sequencing pool generation

to create a single sequencing pool containing multiple libraries in desired proportions.
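
The arithmetic behind pooling libraries in desired proportions can be sketched as follows. This is a minimal example, assuming library concentrations are known in nM and the pool is defined by a target total amount in fmol. The function name, parameters and example numbers are hypothetical and do not describe the EHI laboratory protocol.

```python
def pooling_volumes(concentrations_nm, proportions, total_fmol):
    """Volume in microlitres of each library to add to the pool.

    1 nM equals 1 fmol per microlitre, so volume = amount / concentration.
    `proportions` must sum to 1.
    """
    if abs(sum(proportions.values()) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return {
        lib: proportions[lib] * total_fmol / concentrations_nm[lib]
        for lib in proportions
    }


# Made-up example: two libraries pooled in equal molar shares.
volumes = pooling_volumes(
    concentrations_nm={"lib_a": 10.0, "lib_b": 5.0},
    proportions={"lib_a": 0.5, "lib_b": 0.5},
    total_fmol=100.0,
)
print(volumes)
```

The more dilute library contributes a larger volume, so that both end up with the same molar share of the pool.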

4. On the server

Efficient data processing and sound result generation
The bioinformatic steps are detailed here

1

Data preprocessing

The bioinformatic pipeline begins by assessing and filtering raw sequencing data to remove low-quality reads, adapters, and contaminants. This step ensures the data’s reliability and quality. Host and non-host data are then split: the metagenomic fraction is separated from the host fraction by mapping the reads against a reference host genome.
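
The logic of the two preprocessing steps can be sketched in plain Python. This is only a toy illustration: in practice dedicated quality-control and read-mapping tools do this work. The quality threshold, the function names, and the idea of receiving the host-mapped read identifiers as a ready-made set are assumptions made for the example.

```python
def mean_phred(quality_string, offset=33):
    """Mean Phred score of a read from its FASTQ quality string (Phred+33)."""
    return sum(ord(c) - offset for c in quality_string) / len(quality_string)


def quality_filter(reads, min_mean_quality=20):
    """Keep reads whose mean quality reaches the (assumed) threshold.

    `reads` is an iterable of (read_id, sequence, quality_string) tuples.
    """
    return [r for r in reads if r[2] and mean_phred(r[2]) >= min_mean_quality]


def split_host(reads, host_mapped_ids):
    """Split reads into host and non-host fractions.

    `host_mapped_ids` stands in for the output of mapping against the
    reference host genome: the ids of reads that aligned to the host.
    """
    host = [r for r in reads if r[0] in host_mapped_ids]
    non_host = [r for r in reads if r[0] not in host_mapped_ids]
    return host, non_host


# Made-up reads: 'I' encodes Phred 40, '!' encodes Phred 0.
raw = [("r1", "ACGT", "IIII"), ("r2", "ACGT", "!!!!"), ("r3", "ACGT", "IIII")]
clean = quality_filter(raw)
host_reads, metagenomic_reads = split_host(clean, host_mapped_ids={"r1"})
print(len(clean), len(host_reads), len(metagenomic_reads))
```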

2

Assembly and binning

Next is the metagenomic assembly: the remaining non-host reads are assembled into contigs or scaffolds using metagenomic assembly software. This step results in a set of contigs representing the genetic material of the microbial community. The assembled contigs are then binned: clustered into metagenome-assembled genomes (MAGs) based on sequence composition, coverage, and other characteristics.
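
To give an idea of what "sequence composition" means in binning, the sketch below computes a tetranucleotide frequency profile for a contig. Profiles of this kind, together with coverage, are typical inputs for clustering contigs into bins. It is a simplified illustration and not the binner used in the EHI pipeline: real tools usually also merge reverse-complement k-mers and combine several signals.

```python
from collections import Counter
from itertools import product

# All 256 possible 4-mers over the DNA alphabet, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(sequence):
    """Return the relative frequency of each 4-mer in `sequence`.

    Windows containing characters other than A, C, G, T (for example N)
    are ignored.
    """
    seq = sequence.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


# Made-up contig: the windows are ACGT, CGTA, GTAC, TACG, ACGT.
profile = tetranucleotide_frequencies("ACGTACGT")
print(profile[KMERS.index("ACGT")])
```

Contigs from the same genome tend to have similar profiles, which is why composition is informative for binning.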

3

Bin annotation

The MAGs are annotated to determine their taxonomic identity and functional potential.

4

Dereplication and mapping

MAGs are dereplicated to remove redundancy and retain only unique genomic representatives. This step ensures that each MAG represents a distinct microbial population or genome. Finally, to understand the composition of the gut microbiota in individual samples, the pre-processed reads from each sample are mapped against the dereplicated MAG catalogue. This step quantifies the abundance of each MAG in each sample, allowing for a comprehensive view of the microbial community composition.
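
The idea of dereplication can be sketched as a greedy selection of representatives. This minimal example assumes that pairwise similarities between MAGs (for instance average nucleotide identity) and a quality score per MAG have already been computed. The 0.98 threshold, the function name and the example values are illustrative assumptions, not EHI settings.

```python
def dereplicate(quality, similarity, threshold=0.98):
    """Greedy dereplication.

    Visit MAGs from highest to lowest quality and keep a MAG only if it is
    less similar than `threshold` to every representative kept so far.
    `similarity` maps frozenset({mag_a, mag_b}) to a value between 0 and 1;
    missing pairs are treated as dissimilar.
    """
    representatives = []
    for mag in sorted(quality, key=quality.get, reverse=True):
        if all(
            similarity.get(frozenset((mag, rep)), 0.0) < threshold
            for rep in representatives
        ):
            representatives.append(mag)
    return representatives


# Made-up example: mag1 and mag2 are near-identical, mag3 is distinct.
quality = {"mag1": 95.0, "mag2": 90.0, "mag3": 80.0}
similarity = {
    frozenset(("mag1", "mag2")): 0.99,
    frozenset(("mag1", "mag3")): 0.80,
    frozenset(("mag2", "mag3")): 0.82,
}
catalogue = dereplicate(quality, similarity)
print(catalogue)
```

Here the lower-quality member of the near-identical pair is dropped, and the remaining MAGs form the catalogue against which reads are mapped.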


Bioinformatic hologenomic approaches require reference genome sequences of host species to be available in order to split host DNA from metagenomic DNA and generate whole genome sequence profiles of analysed individuals. 


5. Analysis, statistics, and archiving

The full EHI data analysis workflow is detailed here

1

Data summary

Generating general statistics to obtain an overview of the data, producing general MAG statistics, and investigating the geographic distribution of the samples.

2

MAG catalogue

Exploring the characteristics of the MAG catalogue generated through the EHI pipeline, and analysing the MAG phylogeny, quality, and functional attributes. Using the MAG functional annotation, it is possible to ordinate prokaryotic genomes in a two-dimensional space. In doing so, one can assess how close any group of bacteria are in functional terms, or how functionally diverse the members of a given phylum can be.

3

Sequencing assessment

Investigating the distribution of reads across samples and the estimated vs. mapped prokaryotic fraction, in order to estimate whether a prokaryotic community has been properly represented, or whether further sequencing is required.
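
The comparison between the estimated and the mapped prokaryotic fraction can be sketched as below. This is a minimal example, assuming the estimated prokaryotic fraction of a sample is supplied by an external estimate. The function name and the tolerance of 0.1 are hypothetical choices for illustration.

```python
def sequencing_assessment(mapped_reads, total_reads,
                          estimated_prokaryotic_fraction, tolerance=0.1):
    """Compare the fraction of reads mapped to the MAG catalogue with the
    estimated prokaryotic fraction of the same sample.

    A large positive gap suggests that part of the prokaryotic community
    is not represented in the catalogue.
    """
    mapped_fraction = mapped_reads / total_reads
    gap = estimated_prokaryotic_fraction - mapped_fraction
    return {
        "mapped_fraction": mapped_fraction,
        "unrepresented_fraction": max(gap, 0.0),
        "well_represented": gap <= tolerance,
    }


# Made-up numbers for one sample.
report = sequencing_assessment(
    mapped_reads=700, total_reads=1000, estimated_prokaryotic_fraction=0.75
)
print(report)
```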

4

Count data

Transformation and visualisation of the quantitative MAG information. This includes minimum coverage filtering, to minimise artificial inflation of diversity, and genome size normalisation, to account for genome size biases: read counts are normalised by a factor that scales the read numbers according to the size of each genome relative to the average genome size in the dataset. Finally, a count table is generated to visualise the relative MAG abundances per sample.
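
The three count transformations can be sketched for a single sample as follows. This is a minimal example: the minimum covered fraction of 0.3, the function names and the example values are assumptions made for illustration, and the normalisation factor is taken here as the average genome size divided by the size of each genome.

```python
def filter_by_coverage(counts, covered_fraction, min_fraction=0.3):
    """Set to zero the counts of MAGs whose genome is covered below
    `min_fraction` (an assumed threshold) in this sample."""
    return {
        mag: (n if covered_fraction[mag] >= min_fraction else 0)
        for mag, n in counts.items()
    }


def normalise_by_genome_size(counts, genome_sizes):
    """Scale read counts by average genome size / genome size, so that
    large genomes do not appear more abundant only because they recruit
    more reads."""
    mean_size = sum(genome_sizes[mag] for mag in counts) / len(counts)
    return {mag: n * mean_size / genome_sizes[mag] for mag, n in counts.items()}


def relative_abundance(counts):
    """Convert counts to proportions that sum to 1 (all zero if empty)."""
    total = sum(counts.values())
    return {mag: (n / total if total else 0.0) for mag, n in counts.items()}


# Made-up read counts, covered fractions and genome sizes for one sample.
counts = {"mag1": 300, "mag2": 100, "mag3": 50}
covered = {"mag1": 0.9, "mag2": 0.8, "mag3": 0.1}
sizes = {"mag1": 6_000_000, "mag2": 2_000_000, "mag3": 4_000_000}

filtered = filter_by_coverage(counts, covered)
normalised = normalise_by_genome_size(filtered, sizes)
abundances = relative_abundance(normalised)
print(abundances)
```

In this example mag1 recruits three times as many reads as mag2 only because its genome is three times larger. After normalisation both have the same relative abundance, and the poorly covered mag3 is excluded.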

5

Taxonomic composition

Exploration of the taxonomic characteristics of the MAG catalogue across samples.

6

Diversity analyses

Performing alpha and beta diversity analyses of the count data generated through the EHI pipeline.
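
As an illustration of the two kinds of diversity analysis, the sketch below computes one commonly used alpha diversity metric (the Shannon index) and one commonly used beta diversity metric (Bray-Curtis dissimilarity) from MAG abundances. These metrics are chosen here only as familiar examples. The source does not state which metrics the EHI analysis workflow uses.

```python
import math


def shannon_index(abundances):
    """Alpha diversity of one sample: Shannon index with natural logarithm."""
    total = sum(abundances)
    if total == 0:
        return 0.0
    return -sum(
        (a / total) * math.log(a / total) for a in abundances if a > 0
    )


def bray_curtis(sample_a, sample_b):
    """Beta diversity between two samples: Bray-Curtis dissimilarity.

    Both inputs list abundances for the same MAGs in the same order.
    0 means identical composition, 1 means no shared MAGs.
    """
    numerator = sum(abs(x - y) for x, y in zip(sample_a, sample_b))
    denominator = sum(sample_a) + sum(sample_b)
    return numerator / denominator if denominator else 0.0


# Made-up abundance vectors over the same two MAGs.
print(shannon_index([1, 1]))
print(bray_curtis([1, 0], [0, 1]))
```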

6. In the end

FAIR data, open source publications, high-quality dissemination and inspiring outreach

Address

Center for Evolutionary Hologenomics, GLOBE Institute
University of Copenhagen
Øster Farimagsgade 5, 7
1353 Copenhagen K, Denmark

Contact

Coordinator: Antton Alberdi, PhD
Email: ehi@sund.ku.dk