Gene set enrichment analysis: performance evaluation and usage guidelines

UMMS Affiliation

Department of Biochemistry and Molecular Pharmacology; Program in Bioinformatics and Integrative Biology



Medical Subject Headings

Algorithms; Computational Biology; Databases, Genetic; Gene Expression; Guidelines as Topic; Humans


Disciplines

Bioinformatics | Computational Biology | Molecular Biology | Systems Biology


Abstract

A central goal of biology is to understand and describe the molecular basis of plasticity: the sets of genes that are combinatorially selected by exogenous and endogenous environmental changes, and the relations among those genes. The most viable current approach to this problem is to determine whether sets of genes are connected by some common theme, for example whether genes from the same pathway are overrepresented among those whose differential expression in response to a perturbation is most pronounced. Many methods exist for this task, and their results vary considerably, but all of them fall within a common framework consisting of a few basic components. We critically review these components, suggest best practices for carrying out each step, and propose a voting method to meet the challenge of assessing different methods on a large number of experimental data sets in the absence of a gold standard.
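To make the notion of overrepresentation concrete, the sketch below computes a one-sided hypergeometric tail probability, one common way to ask whether a gene set overlaps a list of differentially expressed genes more than chance would predict. It is an illustration only, not the voting method or any specific procedure evaluated in the article; the function name, parameter names, and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_size, selected_size, overlap):
    """One-sided over-representation p-value (illustrative sketch).

    Returns P(X >= overlap), where X is the number of gene-set members
    found when selected_size genes are drawn without replacement from a
    universe of universe_size genes, set_size of which belong to the set.
    Assumes 0 <= overlap and all sizes are consistent with the universe.
    """
    max_overlap = min(set_size, selected_size)
    total = comb(universe_size, selected_size)
    # Sum the hypergeometric probability mass from `overlap` upward.
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, selected_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 measured genes, a 100-gene pathway,
# 500 genes called differentially expressed, 10 of them in the pathway.
p_value = hypergeom_enrichment_p(20000, 100, 500, 10)
print(f"over-representation p-value: {p_value:.3g}")
```

In practice such a test would be repeated for many gene sets and followed by a multiple-testing correction; the choice of gene universe and of the cutoff that defines the selected list strongly affects the result.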

Rights and Permissions

Citation: Brief Bioinform. 2012 May;13(3):281-91. doi: 10.1093/bib/bbr049. Epub 2011 Sep 7.
