All files are gzipped, tab-delimited plain-text files. Some older .gz files appear to be double-compressed when downloaded with Firefox; if so, apply decompression twice.
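As a minimal sketch of handling the double-compression case, the Python below strips gzip layers until the gzip magic bytes are gone and then parses the tab-delimited text. The function names `read_gz_text` and `read_manifest` and the `max_layers` cap are illustrative assumptions, not part of any released tool.

```python
import csv
import gzip
import io

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def read_gz_text(raw: bytes, max_layers: int = 3) -> str:
    """Strip up to max_layers gzip layers from raw bytes, then decode as UTF-8.

    Plain (uncompressed) input is returned unchanged apart from decoding.
    """
    layers = 0
    while raw[:2] == GZIP_MAGIC and layers < max_layers:
        raw = gzip.decompress(raw)
        layers += 1
    return raw.decode("utf-8")


def read_manifest(path: str) -> list:
    """Read a (possibly double-gzipped) tab-delimited file into a list of dicts."""
    with open(path, "rb") as fh:
        text = read_gz_text(fh.read())
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))
```

Checking the magic bytes rather than the file extension means the same reader works whether the download ended up compressed once or twice.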
GRCh38 / hg38
- Basic manifest with mapping information
(EPICv2, EPIC+, EPIC, HM450)
Column header: (header description)
CpG_chrm | CpG_beg | CpG_end | address_A | address_B | target | nextBase | channel | Probe_ID | mapFlag_A | mapChrm_A | mapPos_A | mapQ_A | mapCigar_A | AlleleA_ProbeSeq | mapNM_A | mapAS_A | mapYD_A | mapFlag_B | mapChrm_B | mapPos_B | mapQ_B | mapCigar_B | AlleleB_ProbeSeq | mapNM_B | mapAS_B | mapYD_B | type
Previous versions:
(EPIC, HM450, HM27, header description)
- Mask information
(EPICv2, EPIC+)
Column header: (header description),
see the MASK_general column for the recommended masking
Probe_ID | mask | maskUniq | MASK_general
Previous versions:
(EPIC, HM450, HM27, header description)
Previous population-specific SNP masking:
(EPIC, HM450, HM27, header description)
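One way the recommended masking might be applied is sketched below in Python: collect the probe IDs flagged in MASK_general and drop them from a table of beta values. This assumes MASK_general is encoded as TRUE/FALSE text, which should be checked against the header description; the helper names and the probe IDs in the usage note are hypothetical.

```python
import csv
import io

# Assumption: flagged probes carry one of these values in the mask column.
TRUTHY = {"TRUE", "T", "1"}


def masked_probe_ids(mask_tsv: str, column: str = "MASK_general") -> set:
    """Return the Probe_IDs whose mask column is flagged in a tab-delimited mask table."""
    reader = csv.DictReader(io.StringIO(mask_tsv), delimiter="\t")
    return {
        row["Probe_ID"]
        for row in reader
        if (row[column] or "").strip().upper() in TRUTHY
    }


def drop_masked(betas: dict, masked: set) -> dict:
    """Keep only the measurements whose probe is not in the masked set."""
    return {probe: value for probe, value in betas.items() if probe not in masked}
```

For example, with a mask table that flags only a hypothetical probe `cg_a`, `drop_masked({"cg_a": 0.1, "cg_b": 0.9}, masked_probe_ids(tsv))` keeps just `cg_b`.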
- Gene annotation (GENCODEv41), promoter, and CpG island
(EPICv2)
Column header: (header description)
CpG_chrm | CpG_beg | CpG_end | probe_strand | Probe_ID | genesUniq | geneNames | transcriptTypes | transcriptIDs | distToTSS | CGI | CGIposition
Previous versions:
GENCODEv36 (EPIC, HM450, HM27)
GENCODEv22 (EPIC, HM450, HM27)
- SeSAMe annotations (CGI, enhancer, TF binding, chromatin state, ...)
(EPICv2, EPIC+)
Column header: (header description)
- Converting SNP probes to VCF format - rs probes
(EPIC)
Column header: (header description)
chrm | beg | end | rs | designType | U | REF | ALT
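The rs-probe columns above map fairly directly onto VCF fields. The Python sketch below emits a minimal VCF body from rows keyed by those column names. It assumes `beg` is a 0-based, BED-style start (so VCF POS is `beg + 1`); this is not stated on this page and should be checked against the header description. QUAL, FILTER and INFO are left as `.`, and the function name and the record in the usage note are hypothetical.

```python
def rs_rows_to_vcf(rows, zero_based_beg=True):
    """Render rows with chrm, beg, rs, REF and ALT keys as minimal VCF text.

    zero_based_beg=True assumes BED-style coordinates (an assumption, see lead-in).
    """
    lines = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    ]
    for row in rows:
        pos = int(row["beg"]) + (1 if zero_based_beg else 0)
        fields = [row["chrm"], str(pos), row["rs"], row["REF"], row["ALT"], ".", ".", "."]
        lines.append("\t".join(fields))
    return "\n".join(lines) + "\n"
```

A row such as `{"chrm": "chr1", "beg": "99", "rs": "rs_example", "REF": "A", "ALT": "G"}` would become a record at position 100 on chr1. Genotype columns are deliberately omitted; calling genotypes from probe intensities is outside the scope of this sketch.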
- Converting SNP probes to VCF format - channel-switching Infinium-I probes
(EPIC)
Column header: (header description)
chrm | beg | end | strand | cg | designType | In-band | REF | ALT | rs
- Bisulfite non-uniqueness of 3'-subsequence of 10-50 bases in length
(EPIC, HM450, HM27)
Column header: (header description)
probeID | copy_10 | copy_11 | copy_12 | copy_13 | ... | copy_49 | copy_50
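For the 3'-subsequence non-uniqueness table above, one possible summary is the shortest subsequence length at which a probe becomes unique. The Python sketch below assumes each `copy_N` column holds the number of genomic matches of the probe's N-base 3'-subsequence, which is an interpretation of the column names rather than something documented here. It scans whichever `copy_N` columns are present, and the helper name is hypothetical.

```python
import re


def min_unique_length(row):
    """Return the smallest N with copy_N == 1, or None if no length is unique.

    Columns that do not match copy_<digits>, or that are empty/NA, are ignored.
    """
    counts = []
    for key, value in row.items():
        match = re.fullmatch(r"copy_(\d+)", key)
        if match and value not in (None, "", "NA"):
            counts.append((int(match.group(1)), int(value)))
    for length, copies in sorted(counts):
        if copies == 1:
            return length
    return None
```

Discovering the columns by pattern avoids hard-coding the full 10-50 range, so the helper also works on a subset of the table.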
GRCh37 / hg19
(Links to all archived platforms)
Other Genomes
- 310 species manifest from Ensembl v101
(EPICv2, EPIC, HM450, HM27)
Column header: (header description)
CpG_chrm | CpG_beg | CpG_end | address_A | address_B | target | nextBase | channel | Probe_ID | mapFlag_A | mapChrm_A | mapPos_A | mapQ_A | mapCigar_A | AlleleA_ProbeSeq | mapNM_A | mapAS_A | mapYD_A | mapFlag_B | mapChrm_B | mapPos_B | mapQ_B | mapCigar_B | AlleleB_ProbeSeq | mapNM_B | mapAS_B | mapYD_B | type
- Mouse mm10/GRCm38 mapping
(EPIC, HM450, HM27)
Column header: (header description)
probeID | chrm_A | beg_A | flag_A | mapQ_A | cigar_A | chrm_B | beg_B | flag_B | mapQ_B | cigarB
- Mouse 3'-subsequence copy number based on mm10/GRCm38
(EPIC, HM450, HM27)
Column header: (header description)
probeID | copy_10 | copy_11 | copy_12 | copy_13 | ... | copy_49 | copy_50
(see here for working with the mouse array)
GRCm38 / mm10
- Mapping information
(MM285)
Column header: (header description)
CpG_chrm | CpG_beg | CpG_end | address_A | address_B | target | nextBase | channel | Probe_ID | mapFlag_A | mapChrm_A | mapPos_A | mapQ_A | mapCigar_A | AlleleA_ProbeSeq | mapNM_A | mapAS_A | mapYD_A | mapFlag_B | mapChrm_B | mapPos_B | mapQ_B | mapCigar_B | AlleleB_ProbeSeq | mapNM_B | mapAS_B | mapYD_B | type
Previous versions:
Old manifest (MM285, header description)
- Mouse array design groups
(MM285)
Column header:
Header description:
(2) design: contains the annotation of the probes. For the syntenic EPIC probe mapping, search for EPIC prefix in the design column.
- Mask information
(MM285)
Column header: (header description),
see MASK_general column for the recommended masking
Probe_ID | mask | maskUniq | MASK_general
Previous versions:
(MM285)
(2) mask: a boolean indicating whether the probe is recommended to be masked in data preprocessing. Probes masked by default include control probes, multi-mapping probes (mapQ < 30 for either allele A or B), and non-informative probes (uk probes).
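The multi-mapping criterion (mapQ < 30 for either allele A or B) can be sketched against the mapQ_A and mapQ_B columns of the mapping manifest. The Python below is a hedged illustration, not the code used to build the mask: it assumes a missing or NA mapQ (for example, allele B of a probe that has no B address) does not count as a failure, and the function name is hypothetical.

```python
def is_multi_mapping(row, min_mapq=30):
    """Flag a manifest row when either allele maps with mapQ below min_mapq.

    Assumption: empty or NA mapQ values are treated as 'no evidence' and skipped.
    """
    def below_threshold(value):
        value = (value or "").strip()
        if value in ("", "NA"):
            return False
        return int(value) < min_mapq

    return below_threshold(row.get("mapQ_A")) or below_threshold(row.get("mapQ_B"))
```

Under these assumptions a row with mapQ_A of 60 and mapQ_B of 12 is flagged, while a row with mapQ_A of 60 and mapQ_B of NA is not.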
- Gene annotation (GENCODEvM25), promoter, and CpG island
(MM285)
Column header: (header description)
CpG_chrm | CpG_beg | CpG_end | probe_strand | probeID | genesUniq | geneNames | transcriptTypes | transcriptIDs | distToTSS | CGI | CGIposition
- SeSAMe annotations (CGI, enhancer, TF binding, chromatin state, ...)
(MM285)
Column header: (header description)
- The Illustration of the new array ID system
The mouse array employs an improved ID system on top of the traditional "cg" numbers. The new ID system uniquely specifies the design details and is designed to accommodate more flexible probe designs, such as replicates and opposite-strand designs.
- ChromHMM annotation of 66 samples
(MM285)
Column header: (header description)
chrm | beg | end | probeID | ENCFF005IEW_forebrain_embryonic_12.5 | ENCFF014LBF_hindbrain_postnatal_0 | ENCFF023ETX_liver_embryonic_14.5 | ENCFF065PNO_midbrain_embryonic_16.5 | ENCFF072LNA_liver_embryonic_11.5 | ...
- Imprinting ICR/DMR annotation
(MM285; also see the ICR annotation and a comparison of different ICR/DMR evidence).
- PhastCons Evolutionary Conservation
(MM285)
- PhyloP Evolutionary Conservation
(MM285)
GRCm39 / mm39
- Mapping information
(MM285)
Column header: (header description)
CpG_chrm | CpG_beg | CpG_end | address_A | address_B | target | nextBase | channel | Probe_ID | mapFlag_A | mapChrm_A | mapPos_A | mapQ_A | mapCigar_A | AlleleA_ProbeSeq | mapNM_A | mapAS_A | mapYD_A | mapFlag_B | mapChrm_B | mapPos_B | mapQ_B | mapCigar_B | AlleleB_ProbeSeq | mapNM_B | mapAS_B | mapYD_B | type
Other genomes
- 310 species manifest from Ensembl v101
(MM285)
Column header: (header description)
CpG_chrm | CpG_beg | CpG_end | address_A | address_B | target | nextBase | channel | Probe_ID | mapFlag_A | mapChrm_A | mapPos_A | mapQ_A | mapCigar_A | AlleleA_ProbeSeq | mapNM_A | mapAS_A | mapYD_A | mapFlag_B | mapChrm_B | mapPos_B | mapQ_B | mapCigar_B | AlleleB_ProbeSeq | mapNM_B | mapAS_B | mapYD_B | type
GRCh38 / hg38
- Mapping information
(Mammal40)
Column header: (header description)
CpG_chrm | CpG_beg | CpG_end | address_A | address_B | target | nextBase | channel | Probe_ID | mapFlag_A | mapChrm_A | mapPos_A | mapQ_A | mapCigar_A | AlleleA_ProbeSeq | mapNM_A | mapAS_A | mapYD_A | mapFlag_B | mapChrm_B | mapPos_B | mapQ_B | mapCigar_B | AlleleB_ProbeSeq | mapNM_B | mapAS_B | mapYD_B | type
Previous versions:
Old manifest (Mammal40, header description)
Other genomes
- 310 species manifest from Ensembl v101
(Mammal40)
Column header: (header description)
CpG_chrm | CpG_beg | CpG_end | address_A | address_B | target | nextBase | channel | Probe_ID | mapFlag_A | mapChrm_A | mapPos_A | mapQ_A | mapCigar_A | AlleleA_ProbeSeq | mapNM_A | mapAS_A | mapYD_A | mapFlag_B | mapChrm_B | mapPos_B | mapQ_B | mapCigar_B | AlleleB_ProbeSeq | mapNM_B | mapAS_B | mapYD_B | type
Update Nov-10-2022:
- Added EPICv2 manifest and annotations
Update Sep-03-2022:
- Updated Ensembl manifests, mouse manifests, and human annotation
- Updated the documentation of column headers
Update Jun-15-2021:
- updated gene annotation to GENCODE v36 to be consistent with GDC. CGI column is updated to contain CGI only when CGIposition is not NA.
- updated mouse array gene model
Update Apr-14-2021:
- updated gene annotation to GENCODE v37
Update Jul-4-2020:
- More detailed annotation to SNP and channel-switching probes
Release Sep-9-2018:
- Added gene_HGNC column for gene symbols corrected for HGNC compatibility
- Updated SNP masking with dbSNP build 151
- All maskings were redone on hg19. Previously, the hg19 manifest file was "borrowing" hg38 maskings.
- CCS probes have shrunk in numbers based on AF filtering (>1%).
- See Updates Summary for more details.
Release Aug-8-2018:
- Fix to MASK_mapping: some mappings with NM>0 are now masked as well
- Addition of NM columns and gene annotation to the manifest file.
- Previously, masking of mapping issues was merged from hg19 and hg38. Masking on hg19 and hg38 is now rebuilt entirely independently. This change slightly decreases the number of masked probes, while the added NM-based masking (included in MASK_mapping) slightly increases it.
- See Updates Summary for more details.
Release Mar-4-2018:
- Fix to hg19 decoy mapping inconsistency.
Release Jan-4-2018:
- Updated strand information.
Release Nov-23-2017:
- Changed to RDS from RData following R's recommendation.
- Update to missing value representation.
- Update to table headers.
- Added gene annotation in short forms.
- Added mapping to genome both including and excluding alt-chromosomes.
- Made consistent naming of columns with underscores.
Release Mar-13-2017:
- Fix to relative position in SNP masking of type-I probes.
Human Array - Zhou W, Laird PW and Shen H, Comprehensive characterization, annotation and innovative use of Infinium DNA Methylation BeadChip probes, Nucleic Acids Research 2017
Mouse Array - Zhou W et al. DNA methylation dynamics and dysregulation delineated by high-throughput profiling in the mouse, Cell Genomics 2022
Questions regarding this annotation can be addressed to wanding.zhou@pennmedicine.upenn.edu.