The Exome Aggregation Consortium (ExAC) is a coalition of investigators seeking to aggregate and harmonize exome sequencing data from a variety of large-scale sequencing projects, and to make summary data available for the wider scientific community.
The data set provided on this website spans 61,486 unrelated individuals sequenced as part of various disease-specific and population genetic studies. We have removed individuals affected by severe pediatric disease, so this data set should serve as a useful reference set of allele frequencies for severe disease studies. All of the raw data from these projects have been reprocessed through the same pipeline, and jointly variant-called to increase consistency across projects.
A list of ExAC Principal Investigators and groups that have contributed data to the current release is available below.
All data here are released publicly for the benefit of the wider biomedical community. Now that the ExAC flagship paper has been published, there are no publication restrictions on these data. Please cite the ExAC paper for any use of these data.
The data are available under the ODC Open Database License (ODbL) (summary available here): you are free to share and modify the ExAC data so long as you attribute any public use of the database, or works produced from the database; keep the resulting data sets open; and offer your shared or adapted version of the data set under the same license.
The aggregation and release of summary data from the exomes collected by the Exome Aggregation Consortium has been approved by the Partners IRB (protocol 2013P001339, “Large-scale aggregation of human genomic data”).
For bug reports, please file an issue on GitHub.