About ExAC

The Exome Aggregation Consortium (ExAC) is a coalition of investigators seeking to aggregate and harmonize exome sequencing data from a variety of large-scale sequencing projects, and to make summary data available for the wider scientific community.

The data set provided on this website spans 61,486 unrelated individuals sequenced as part of various disease-specific and population genetic studies. We have removed individuals affected by severe pediatric disease, so this data set should serve as a useful reference set of allele frequencies for severe disease studies. All of the raw data from these projects have been reprocessed through the same pipeline, and jointly variant-called to increase consistency across projects.

A list of ExAC Principal Investigators and groups that have contributed data to the current release is available below.

All data here are released publicly for the benefit of the wider biomedical community. Now that the ExAC flagship paper has been published, there are no publication restrictions on these data. Please cite the ExAC paper for any use of these data.

The data are available under the ODC Open Database License (ODbL): you are free to share and modify the ExAC data so long as you attribute any public use of the database, or works produced from the database; keep the resulting data sets open; and offer your shared or adapted version of the data set under the same license.

The aggregation and release of summary data from the exomes collected by the Exome Aggregation Consortium has been approved by the Partners IRB (protocol 2013P001339, “Large-scale aggregation of human genomic data”).

For bug reports, please file an issue on GitHub.

ExAC Principal Investigators

  • Daniel MacArthur
  • David Altshuler
  • Diego Ardissino
  • Michael Boehnke
  • Mark Daly
  • John Danesh
  • Roberto Elosua
  • Jose Florez
  • Gad Getz
  • Christina Hultman
  • Sekar Kathiresan
  • Markku Laakso
  • Steven McCarroll
  • Mark McCarthy
  • Dermot McGovern
  • Ruth McPherson
  • Benjamin Neale
  • Aarno Palotie
  • Shaun Purcell
  • Danish Saleheen
  • Jeremiah Scharf
  • Pamela Sklar
  • Patrick Sullivan
  • Jaakko Tuomilehto
  • Hugh Watkins
  • James Wilson

Contributing projects

  • 1000 Genomes
  • Bulgarian Trios
  • Finland-United States Investigation of NIDDM Genetics (FUSION)
  • GoT2D
  • Inflammatory Bowel Disease
  • METabolic Syndrome In Men (METSIM)
  • Jackson Heart Study
  • Myocardial Infarction Genetics Consortium:
    • Italian Atherosclerosis, Thrombosis, and Vascular Biology Working Group
    • Ottawa Genomics Heart Study
    • Pakistan Risk of Myocardial Infarction Study (PROMIS)
    • Precocious Coronary Artery Disease Study (PROCARDIS)
    • Registre Gironi del COR (REGICOR)
  • NHLBI-GO Exome Sequencing Project (ESP)
  • National Institute of Mental Health (NIMH) Controls
  • Sequencing in Suomi (SISu)
  • Swedish Schizophrenia & Bipolar Studies
  • Schizophrenia Trios from Taiwan
  • The Cancer Genome Atlas (TCGA)
  • Tourette Syndrome Association International Consortium for Genomics (TSAICG)

Production team

  • Monkol Lek
  • Fengmei Zhao
  • Ryan Poplin
  • Eric Banks
  • Timothy Fennell

Analysis team

  • Monkol Lek
  • Kaitlin Samocha
  • Konrad Karczewski
  • Eric Minikel
  • James Ware
  • Anne O'Donnell Luria
  • Andrew Hill
  • Beryl Cummings
  • Daniel Birnbaum
  • Taru Tukiainen
  • Laramie Duncan
  • Karol Estrada
  • Menachem Fromer
  • Adam Kiezun
  • Mitja Kurki
  • Ron Do
  • Pradeep Natarajan
  • Gina Peloso
  • Hong-Hee Won

Website team

  • Konrad Karczewski
  • Brett Thomas
  • Daniel Birnbaum
  • Ben Weisburd

Ethics team

  • Stacey Donnelly
  • Andrea Saltzman
  • Namrata Gupta

Broad Genomics Platform

  • Stacey Gabriel

Many thanks to the Genomics Platform both for generating much of the exome data displayed here and for providing the computing resources required for this analysis.


Funding

  • NIGMS R01 GM104371 (PI: MacArthur)
  • NIDDK U54 DK105566 (PIs: MacArthur and Neale)