All files are gzipped, tab-delimited plain-text files. Some older .gz files appear to be double-compressed when downloaded with Firefox; if so, apply decompression twice.
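As a minimal sketch of handling the double-compression case, the Python below strips gzip layers until the payload no longer starts with the gzip magic bytes. The function names `gunzip_fully` and `read_rows` and the `max_layers` cap are illustrative assumptions, not part of any distributed tool.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def gunzip_fully(data: bytes, max_layers: int = 3) -> bytes:
    """Decompress repeatedly while the payload still looks gzipped."""
    layers = 0
    while data[:2] == GZIP_MAGIC and layers < max_layers:
        data = gzip.decompress(data)
        layers += 1
    return data


def read_rows(data: bytes):
    """Return the tab-delimited rows of a (possibly double-)gzipped file."""
    text = gunzip_fully(data).decode("utf-8")
    return [line.split("\t") for line in text.splitlines() if line]
```

For a file on disk, `read_rows(open(path, "rb").read())` would apply the same logic, where `path` points to whichever file was downloaded.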
GRCh38 / hg38
- Manifest with mapping information
(MSA,
EPICv2,
EPIC+,
EPIC,
HM450)
Column header: (header description)
CpG_chrm |
CpG_beg |
CpG_end |
address_A |
address_B |
target |
nextBase |
channel |
Probe_ID |
mapFlag_A |
mapChrm_A |
mapPos_A |
mapQ_A |
mapCigar_A |
AlleleA_ProbeSeq |
mapNM_A |
mapAS_A |
mapYD_A |
mapFlag_B |
mapChrm_B |
mapPos_B |
mapQ_B |
mapCigar_B |
AlleleB_ProbeSeq |
mapNM_B |
mapAS_B |
mapYD_B |
type |
Previous versions:
(EPIC,
HM450,
HM27,
header description)
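The manifest columns listed above can be read by name rather than by position. The following sketch builds a lookup from Probe_ID to the CpG coordinates. It assumes the first line of the file is the header row and that unmapped probes carry NA in the coordinate columns; both are assumptions, and `index_manifest` is a hypothetical helper name.

```python
import csv
import io


def index_manifest(tsv_text: str) -> dict:
    """Map Probe_ID -> (CpG_chrm, CpG_beg, CpG_end) from manifest text."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    index = {}
    for row in reader:
        chrm, beg, end = row["CpG_chrm"], row["CpG_beg"], row["CpG_end"]
        # Assumption: probes without a mapped CpG use NA in these columns.
        if "NA" in (chrm, beg, end):
            continue
        index[row["Probe_ID"]] = (chrm, int(beg), int(end))
    return index
```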
- Mask information
(MSA,
EPICv2,
EPIC+)
Column header: (header description),
see M_general column for the recommended masking
Probe_ID |
mask |
maskUniq |
M_general |
Previous versions:
(EPIC,
HM450,
HM27,
header description)
Previous population-specific SNP masking:
(EPIC,
HM450,
HM27,
header description)
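To illustrate how the recommended M_general column might be applied, the sketch below collects the probes flagged in that column and blanks their values in a table of beta values. It assumes the column stores TRUE/FALSE strings and that the file starts with a header row; `recommended_mask` and `apply_mask` are hypothetical names.

```python
import csv
import io


def recommended_mask(tsv_text: str, column: str = "M_general") -> set:
    """Return the Probe_IDs whose mask column is TRUE (assumed encoding)."""
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    return {
        row["Probe_ID"]
        for row in reader
        if row[column].strip().upper() == "TRUE"
    }


def apply_mask(betas: dict, masked: set) -> dict:
    """Replace the beta value of every masked probe with None."""
    return {pid: (None if pid in masked else v) for pid, v in betas.items()}
```

Passing a different `column` name selects another mask column from the same file.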
- Trait associations
(MSA)
Column header:
Probe_ID |
Trait_Associations |
Header description:
(2) Trait_Associations: a comma-delimited string of trait associations. EWAS
hits have the following format:
EWAS_hit:[trait]:[PMID]:[q-value of association]. NA is used when the info is missing.
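The format described above can be parsed as in the following sketch. It assumes that trait names contain no commas, that the PMID and q-value are the last two colon-separated fields, and that a whole field of NA means no associations; the names `EwasHit` and `parse_trait_associations` are hypothetical.

```python
from typing import List, NamedTuple, Optional


class EwasHit(NamedTuple):
    trait: Optional[str]
    pmid: Optional[str]
    q_value: Optional[float]


def _na(value: str) -> Optional[str]:
    """Translate the NA placeholder (or an empty string) to None."""
    return None if value in ("", "NA") else value


def parse_trait_associations(field: str) -> List[EwasHit]:
    """Parse a Trait_Associations string into EWAS hit records."""
    hits = []
    if field in ("", "NA"):
        return hits
    for item in field.split(","):
        item = item.strip()
        if not item.startswith("EWAS_hit:"):
            continue  # only EWAS hits have a documented format
        # Split from the right so a colon inside the trait name is kept.
        parts = item[len("EWAS_hit:"):].rsplit(":", 2)
        if len(parts) != 3:
            continue
        trait, pmid, q = (_na(p) for p in parts)
        try:
            q_value = float(q) if q is not None else None
        except ValueError:
            q_value = None
        hits.append(EwasHit(trait, pmid, q_value))
    return hits
```

For example, the made-up string `EWAS_hit:smoking:12345678:1e-05` would yield one record with trait `smoking`, PMID `12345678`, and q-value `1e-05`.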
- Gene annotation (GENCODEv41) and promoters
(MSA,
EPICv2)
Column header: (header description)
CpG_chrm |
CpG_beg |
CpG_end |
probe_strand |
Probe_ID |
genesUniq |
geneNames |
transcriptTypes |
transcriptIDs |
distToTSS |
Previous versions:
GENCODEv36
(EPICv2,
EPIC,
HM450,
HM27)
GENCODEv22
(EPIC,
HM450,
HM27)
- Functional annotations (CGI, enhancer, TF binding, chromatin state, ...)
(MSA,
EPICv2,
EPIC,
HM450,
HM27)
Column header: (header description)
- SNP annotations
(MSA,
EPICv2,
HM450)
Column header: (header description)
chrm |
beg |
end |
strand |
rs |
designType |
U |
REF |
ALT |
Probe_ID |
This is for converting probes to genotype VCFs.
Previous version: rs probes
(EPIC,
header description)
Previous version: channel-switching Infinium-I probes
(EPIC,
header description)
- Bisulfite non-uniqueness of 3'-subsequence of 10-50 bases in length
(EPIC,
HM450,
HM27)
Column header: (header description)
probeID |
copy_10 |
copy_11 |
copy_12 |
copy_13 |
... |
copy_49 |
copy_50 |
GRCh37 / hg19
(Links to all archived platforms)
Other Genomes
(Click here if you are using arrays on non-target species.)
(see here for working with the mouse array)
GRCm38 / mm10
- Manifest with mapping information
(MM285)
Column header: (header description)
CpG_chrm |
CpG_beg |
CpG_end |
address_A |
address_B |
target |
nextBase |
channel |
Probe_ID |
mapFlag_A |
mapChrm_A |
mapPos_A |
mapQ_A |
mapCigar_A |
AlleleA_ProbeSeq |
mapNM_A |
mapAS_A |
mapYD_A |
mapFlag_B |
mapChrm_B |
mapPos_B |
mapQ_B |
mapCigar_B |
AlleleB_ProbeSeq |
mapNM_B |
mapAS_B |
mapYD_B |
type |
Note: This annotation is based on the design paper (N=296,070), which describes a slightly different probe set that corrects the Illumina A2 manifest (N=287,692).
See Manifest comparison for details. Sesame default preprocessing is based on the N=296,070 version.
Previous versions:
Illumina A2 manifest (MM285) (N=287,692)
LEGX manifest (MM285,
header description)
- Mouse array design groups
(MM285)
Column header:
Header description:
(2) design: contains the annotation of the probes. For the syntenic EPIC probe mapping, search for EPIC prefix in the design column.
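The EPIC-prefix search in the design column can be sketched as below. The column header for this file is not listed here, so the probe ID column name (`Probe_ID`) is an assumption, as is the reading that "EPIC prefix" means the design value starts with `EPIC`; `syntenic_epic_probes` is a hypothetical helper.

```python
def syntenic_epic_probes(rows, id_column="Probe_ID", design_column="design"):
    """Return IDs of probes whose design value starts with 'EPIC'.

    `rows` is an iterable of dicts keyed by column name, for example the
    output of csv.DictReader on the design-group file.
    """
    return [
        row[id_column]
        for row in rows
        if row.get(design_column, "").startswith("EPIC")
    ]
```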
- Mask information
(MM285)
Column header: (header description),
see M_general column for the recommended masking
Probe_ID |
mask |
maskUniq |
M_general |
Header description:
(2) mask: a boolean indicating whether probes are recommended to be masked in data preprocessing. Probes masked by default include control probes, multi-mapping probes (mapQ < 30 for either allele A or B), and non-informative probes (uk probes).
Previous versions:
Illumina A2 manifest (MM285),
LEGX manifest (MM285)
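The default masking criteria described for the MM285 mask column can be approximated from the manifest columns, as in this sketch. It is not the code used to produce the mask file. It assumes that control and uk probes are recognizable by the Probe_ID prefixes `ctl` and `uk`, and that an NA mapQ means the allele is absent and should be ignored; `is_default_masked` is a hypothetical name.

```python
def is_default_masked(row, min_mapq=30, control_prefix="ctl", uk_prefix="uk"):
    """Approximate the default mask rule for one manifest row (a dict)."""
    pid = row["Probe_ID"]
    # Assumption: control and non-informative probes are identified by prefix.
    if pid.startswith(control_prefix) or pid.startswith(uk_prefix):
        return True
    for col in ("mapQ_A", "mapQ_B"):
        value = row.get(col, "NA")
        if value in ("", "NA"):
            continue  # assumption: NA means this allele has no mapping entry
        if int(value) < min_mapq:
            return True  # multi-mapping: mapQ below threshold for either allele
    return False
```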
- Gene annotation (GENCODEvM25) and promoters
(MM285)
Column header: (header description)
CpG_chrm |
CpG_beg |
CpG_end |
probe_strand |
probeID |
genesUniq |
geneNames |
transcriptTypes |
transcriptIDs |
distToTSS |
Previous versions:
Illumina A2 manifest (MM285)
- Functional annotations (CGI, enhancer, TF binding, chromatin state, ...)
(MM285)
Column header: (header description)
Note: This annotation is based on the design paper (N=296,070), which describes a slightly different probe set that corrects the Illumina A2 manifest (N=287,692).
See Manifest comparison for details. Sesame default preprocessing is based on the N=296,070 version.
- The Illustration of the new array ID system
The mouse array employs an improved ID system on top of the traditional "cg" numbers. The new ID system uniquely specifies the design details and is designed to accommodate more flexible probe designs, such as replicates and opposite-strand designs.
- ChromHMM annotation of 66 samples
(MM285)
Column header: (header description)
chrm |
beg |
end |
probeID |
ENCFF005IEW_forebrain_embryonic_12.5 |
ENCFF014LBF_hindbrain_postnatal_0 |
ENCFF023ETX_liver_embryonic_14.5 |
ENCFF065PNO_midbrain_embryonic_16.5 |
ENCFF072LNA_liver_embryonic_11.5 |
... |
- Imprinting ICR/DMR annotation
(MM285, also see ICR annotation, and a comparison of different ICR/DMR evidences).
- PhastCons Evolutionary Conservation
(MM285)
- PhyloP Evolutionary Conservation
(MM285)
GRCm39 / mm39
- Manifest with mapping information
(MM285)
Column header: (header description)
CpG_chrm |
CpG_beg |
CpG_end |
address_A |
address_B |
target |
nextBase |
channel |
Probe_ID |
mapFlag_A |
mapChrm_A |
mapPos_A |
mapQ_A |
mapCigar_A |
AlleleA_ProbeSeq |
mapNM_A |
mapAS_A |
mapYD_A |
mapFlag_B |
mapChrm_B |
mapPos_B |
mapQ_B |
mapCigar_B |
AlleleB_ProbeSeq |
mapNM_B |
mapAS_B |
mapYD_B |
type |
Other genomes
- 310 species manifest from Ensembl v101
(MM285)
Column header: (header description)
CpG_chrm |
CpG_beg |
CpG_end |
address_A |
address_B |
target |
nextBase |
channel |
Probe_ID |
mapFlag_A |
mapChrm_A |
mapPos_A |
mapQ_A |
mapCigar_A |
AlleleA_ProbeSeq |
mapNM_A |
mapAS_A |
mapYD_A |
mapFlag_B |
mapChrm_B |
mapPos_B |
mapQ_B |
mapCigar_B |
AlleleB_ProbeSeq |
mapNM_B |
mapAS_B |
mapYD_B |
type |
GRCh38 / hg38
- Manifest with mapping information
(Mammal40)
Column header: (header description)
CpG_chrm |
CpG_beg |
CpG_end |
address_A |
address_B |
target |
nextBase |
channel |
Probe_ID |
mapFlag_A |
mapChrm_A |
mapPos_A |
mapQ_A |
mapCigar_A |
AlleleA_ProbeSeq |
mapNM_A |
mapAS_A |
mapYD_A |
mapFlag_B |
mapChrm_B |
mapPos_B |
mapQ_B |
mapCigar_B |
AlleleB_ProbeSeq |
mapNM_B |
mapAS_B |
mapYD_B |
type |
Previous versions:
Old manifest (Mammal40,
header description)
- KnowYourCG annotations (CGI, enhancer, TF binding, chromatin state, ...)
(Mammal40)
Column header: (header description)
Other genomes
- 310 species manifest from Ensembl v101
(Mammal40)
Column header: (header description)
CpG_chrm |
CpG_beg |
CpG_end |
address_A |
address_B |
target |
nextBase |
channel |
Probe_ID |
mapFlag_A |
mapChrm_A |
mapPos_A |
mapQ_A |
mapCigar_A |
AlleleA_ProbeSeq |
mapNM_A |
mapAS_A |
mapYD_A |
mapFlag_B |
mapChrm_B |
mapPos_B |
mapQ_B |
mapCigar_B |
AlleleB_ProbeSeq |
mapNM_B |
mapAS_B |
mapYD_B |
type |
Update Mar-06-2024:
- Added Methylation Screening Array (MSA) manifest and annotations
Update Apr-27-2023:
- Added KYCG annotation files links
Update Nov-10-2022:
- Added EPICv2 manifest and annotations
Update Sep-03-2022:
- Updated Ensembl manifests, mouse manifests and human annotation
- Updated the documentation of column headers
Update Jun-15-2021:
- Updated gene annotation to GENCODE v36 to be consistent with GDC. The CGI column is updated to contain CGI only when CGIposition is not NA.
- Updated mouse array gene model
Update Apr-14-2021:
- Updated gene annotation to GENCODE v37
Update Jul-4-2020:
- More detailed annotation to SNP and channel-switching probes
Release Sep-9-2018:
- Added gene_HGNC column for gene symbols corrected for HGNC compatibility
- Updated SNP masking with dbSNP build 151
- All maskings were redone on hg19. Previously hg19 manifest file was "borrowing" hg38 maskings.
- CCS probes have shrunk in numbers based on AF filtering (>1%).
- See Updates Summary for more details.
Release Aug-8-2018:
- Fix to MASK_mapping: some mappings with NM>0 are now masked as well
- Addition of NM columns and gene annotation to the manifest file.
- Previously, masking of mapping issues was merged from hg19 and hg38. Now the masking on hg19 and hg38 is rebuilt entirely independently. This change causes a small decrease in the number of masked probes, and the added NM-based masking (included in MASK_mapping) causes a small increase.
- See Updates Summary for more details.
Release Mar-4-2018:
- Fix to hg19 decoy mapping inconsistency.
Release Jan-4-2018:
- Updated strand information.
Release Nov-23-2017:
- Changed to RDS from RData following R's recommendation.
- Update to missing value representation.
- Update to table headers.
- Added gene annotation in short forms.
- Added mapping to genome both including and excluding alt-chromosomes.
- Made consistent naming of columns with underscores.
Release Mar-13-2017:
- Fix to relative position in SNP masking of type-I probes.
Human MSA - Goldberg et al. MSA: scalable DNA methylation screening BeadChip for high-throughput trait association studies, bioRxiv 2024
Human EPICv2 - Kaur and Lee et al. Comprehensive evaluation of the Infinium human MethylationEPIC v2 BeadChip, Epigenetics Communications 2023
Mammalian and nonhuman species - Ding et al. Comparative epigenome analysis using Infinium DNA methylation BeadChips, Briefings in Bioinformatics 2023
Mouse MM285 - Zhou W et al. DNA methylation dynamics and dysregulation delineated by high-throughput profiling in the mouse, Cell Genomics 2022
Human EPIC, HM450 - Zhou W, Laird PW and Shen H, Comprehensive characterization, annotation and innovative use of Infinium DNA Methylation BeadChip probes, Nucleic Acids Research 2017
Questions regarding this annotation can be addressed to wanding.zhou@pennmedicine.upenn.edu.