ALLBIO training workshop – Marseille, June 23-25, 2014

Analyzing thousands of bacterial genomes: gene annotation, metabolism, regulation

Practical info
Date: June 23-25, 2014
Location: Hotel Novotel Marseille Vieux Port
Participants: ~20



This training workshop is funded by the ALLBIO project, whose goal is to connect the bioinformatics community with biologists whose needs in fields other than human genomics (microbial, plant, livestock) are not yet covered.


Description: Sequencing technologies now allow microbiologists to sequence, at moderate cost, not only the genome of their favourite species but also a few dozen related species or a collection of strains, and even to characterize inter-individual variation. The Ensembl Genomes resource contains over 10,000 completely sequenced genomes. Data on this scale requires new bioinformatics resources suited to annotating, querying and analysing many sequenced bacterial genomes in parallel.

This training workshop will show how to combine several specialized bioinformatics resources to extract information about bacterial genes and their annotations (Ensembl Genomes), metabolism (MICROSCOPE), and regulation (RSAT). The course will be oriented towards comparative genomics, and will give a perspective on the advances that can be expected from the massive reduction in sequencing costs brought by Next Generation Sequencing. The course will mainly rely on user-friendly Web interfaces. It will also include a basic introduction to programmatic access to bioinformatics resources, showing through simple, well-documented examples how to automate data collection and extract information from multiple data sets.
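
As an illustration of what automated data collection can look like, here is a minimal Python sketch that sends one request per genome to a Web service and gathers the JSON answers in a single dictionary. The base URL, the `/genomes/<name>` path layout and the function names are all hypothetical placeholders, not the actual interface of Ensembl Genomes, MICROSCOPE or RSAT; the real endpoints must be taken from each resource's documentation.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Hypothetical service root and path layout (assumption): replace with the
# documented endpoints of the resource you actually query.
BASE_URL = "https://example.org/api"


def build_url(genome_name, base_url=BASE_URL):
    """Return the (hypothetical) URL describing one genome."""
    return f"{base_url}/genomes/{quote(genome_name)}"


def fetch_json(url):
    """Download a URL and decode its JSON body."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=30) as response:
        return json.load(response)


def collect(genome_names, fetch=fetch_json):
    """Query the service once per genome and gather the answers in a dict."""
    return {name: fetch(build_url(name)) for name in genome_names}
```

Passing the download function as the `fetch` argument keeps the loop independent of any particular service, so the same `collect` call can be tried out with a stand-in function before pointing it at a real server.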

Target audience

This course is aimed at microbiologists who need to use bioinformatics resources to analyse multiple genomes.


The course does not require any computing skills. A basic knowledge of the Linux environment is welcome, but not required.


This training course will include

  1. Short talks illustrating the functionalities offered by bioinformatics resources suited for the analysis of multiple bacterial genomes.
  2. Practicals based on concrete study cases:
    • collecting multiple genomes from bacteria;
    • comparing annotations between different species;
    • extracting metabolic information for selected genomes;
    • metabolic projection: coverage of reference pathways by the enzymes found in a genome;
    • phylogenetic profiling: detecting gene co-occurrences across genomes;
    • phylogenetic footprints: detection of conserved cis-regulatory elements.
  3. Practicals on programmatic use of the tools:
    • extracting data for multiple genomes;
    • combining data from diverse sources.
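
To make two of the study cases above concrete, the following Python sketch computes a metabolic projection (the fraction of a reference pathway's enzymes found in a genome) and a simple phylogenetic-profile comparison. It is an illustration under stated assumptions, not the method implemented in MICROSCOPE or RSAT: enzymes are assumed to be identified by EC numbers, profiles are assumed to be sets of genome names, and the Jaccard index is used here as a simple stand-in for whatever co-occurrence score the tools apply. All identifiers and the toy data are invented for the example.

```python
from itertools import combinations


def pathway_coverage(pathway_enzymes, genome_enzymes):
    """Fraction of a reference pathway's enzymes that are found in a genome."""
    pathway = set(pathway_enzymes)
    if not pathway:
        return 0.0
    return len(pathway & set(genome_enzymes)) / len(pathway)


def cooccurrence(profile_a, profile_b):
    """Jaccard index between two profiles (sets of genomes carrying a gene)."""
    a, b = set(profile_a), set(profile_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cooccurring_pairs(profiles, threshold=0.8):
    """List gene pairs whose profiles overlap at or above the threshold."""
    pairs = []
    for gene_a, gene_b in combinations(sorted(profiles), 2):
        score = cooccurrence(profiles[gene_a], profiles[gene_b])
        if score >= threshold:
            pairs.append((gene_a, gene_b, score))
    return pairs


# Toy data (invented): three of the four pathway enzymes occur in the genome.
pathway = {"2.7.1.1", "5.3.1.9", "2.7.1.11", "4.1.2.13"}
genome = {"2.7.1.1", "5.3.1.9", "4.1.2.13", "1.1.1.1"}
coverage = pathway_coverage(pathway, genome)  # 0.75

# Toy profiles (invented): geneA and geneB occur in exactly the same genomes.
profiles = {
    "geneA": {"g1", "g2", "g3"},
    "geneB": {"g1", "g2", "g3"},
    "geneC": {"g4"},
}
pairs = cooccurring_pairs(profiles)  # [("geneA", "geneB", 1.0)]
```

Genes that are consistently present or absent together across many genomes are candidates for a functional link, which is the idea behind comparing profiles pairwise.
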
Computer environment

Participants who wish to work in their familiar environment are invited to bring their own laptop. We will rent additional computers for participants who do not have one.


Dan Staines | European Bioinformatics Institute (EBI), UK | Ensembl Genomes
François LeFevre, Eugeni Belda, Claudine Médigue | Genoscope (CEA), France | MICROSCOPE
Jacques van Helden, Denis Puthier | Aix-Marseille Université (AMU), France | Regulatory Sequence Analysis Tools (RSAT)