CpG_chrm CpG_beg CpG_end probe_strand Probe_ID address_A address_B channel designType nextBase nextBaseRef probeType orientation probeCpGcnt context35 probeBeg probeEnd ProbeSeq_A ProbeSeq_B gene gene_HGNC chrm_A beg_A flag_A mapQ_A cigar_A NM_A chrm_B beg_B flag_B mapQ_B cigar_B NM_B wDecoy_chrm_A wDecoy_beg_A wDecoy_mapQ_A wDecoy_cigar_A wDecoy_NM_A wDecoy_chrm_B wDecoy_beg_B wDecoy_mapQ_B wDecoy_cigar_B wDecoy_NM_B posMatch MASK.mapping MASK.typeINextBaseSwitch MASK.rmsk15 MASK.sub25.copy MASK.sub30.copy MASK.sub35.copy MASK.sub40.copy MASK.snp5.common MASK.snp5.GMAF1p MASK.extBase MASK.general

(1-3) CpG_chrm, CpG_beg, CpG_end: the location of the target. CpG_beg is a 0-based coordinate and CpG_end is a 1-based coordinate, so CpG_end minus CpG_beg gives the span of the target. The span should be 2 nucleotides for CpG probes, or 1 nucleotide for CpH and SNP probes. Some erroneous CpH probe coordinates in the manufacturer's manifest have been corrected.
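The coordinate convention above can be checked per row. The following is a minimal sketch, assuming each row has already been parsed into a dictionary keyed by the column names in the header and that the probe is mapped (numeric coordinates). The function name target_span_ok and the sample values are hypothetical and are not taken from the manifest.

```python
# Expected target span per probeType, as described above:
# 2 nucleotides for CpG probes, 1 for CpH and SNP probes.
EXPECTED_SPAN = {"cg": 2, "ch": 1, "rs": 1}

def target_span_ok(row):
    """Return True if CpG_end - CpG_beg matches the span expected for probeType.

    CpG_beg is 0-based and CpG_end is 1-based, so the difference is the
    number of nucleotides covered by the target.
    """
    span = int(row["CpG_end"]) - int(row["CpG_beg"])
    return span == EXPECTED_SPAN[row["probeType"]]

# Hypothetical rows; the coordinates are made up for illustration.
rows = [
    {"CpG_beg": "1000", "CpG_end": "1002", "probeType": "cg"},
    {"CpG_beg": "2000", "CpG_end": "2001", "probeType": "rs"},
]
for r in rows:
    print(r["probeType"], target_span_ok(r))
```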
(4) probe_strand: the strand of the actual probe, consistent with the "orientation" column. '+' is used for all up-probes, positioned at smaller coordinates than the target CpG, and '-' for all down-probes, positioned at greater coordinates than the target CpG. '*' is used for unmapped probes.
(5) Probe_ID: Probe ID.
(6-7) address_A, address_B: addresses of probe A and B on the chip designated by the original manifest.
(8) channel: "Both" for type II probes and "Grn"/"Red" for type I probes.
(9) designType: either "I" or "II".
(10) nextBase: the actual extension base (on the probe strand) after bisulfite conversion ("A", "C" or "T"). For unmapped probes, this is the extension base labeled in the original manifest.
(11) nextBaseRef: the extension base (on the hybridized/template DNA) before bisulfite conversion ("A", "C", "G" or "T"). Unmapped probes have "NA".
(12) probeType: either "cg", "ch" or "rs".
(13) orientation: either "up" or "down", specifying whether the probe is positioned upstream (in smaller coordinates) or downstream (in greater coordinates) of the target.
Note that by design, probes positioned upstream (in smaller coordinates) are always on the Watson strand and probes positioned downstream (in greater coordinates) are always on the Crick strand.
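The relationship between "orientation" and "probe_strand" described above can be expressed as a small consistency check. This is a sketch only: the function name strand_matches_orientation is hypothetical, rows are assumed to be dictionaries keyed by column name, and unmapped probes (probe_strand '*') are simply skipped because the text does not say what "orientation" holds for them.

```python
# Up-probes lie on the Watson ('+') strand, down-probes on the Crick ('-') strand.
STRAND_FOR_ORIENTATION = {"up": "+", "down": "-"}

def strand_matches_orientation(row):
    """Return True if probe_strand agrees with orientation.

    Unmapped probes are marked '*' and are treated as consistent here,
    since there is no mapped strand to compare against (an assumption).
    """
    if row["probe_strand"] == "*":
        return True
    return STRAND_FOR_ORIENTATION.get(row["orientation"]) == row["probe_strand"]

# Hypothetical rows for illustration.
print(strand_matches_orientation({"orientation": "up", "probe_strand": "+"}))
print(strand_matches_orientation({"orientation": "down", "probe_strand": "+"}))
```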
(14) probeCpGcnt: the number of additional CpGs in the probe (not counting the interrogated CpG).
(15) context35: the number of CpGs in the [-35bp, +35bp] window.
(16-17) probeBeg, probeEnd: the mapped start and end positions of the probe, which is always 50bp long.
(18-19) ProbeSeq_A, ProbeSeq_B: the probe sequence for allele A and B.
(20) gene: ";"-separated list of gene annotations (unique and alphabetically sorted). Gene models follow GENCODE version 22 (hg38).
(21) gene_HGNC: ";"-separated list of gene annotations (unique and alphabetically sorted). Genes are checked using HGNChelper for compatibility with HGNC. Gene models follow GENCODE version 22 (hg38).
(22-27) chrm_A, beg_A, flag_A, mapQ_A, cigar_A, NM_A: the mapping info for probe A excluding decoy chromosomes. mapQ is the mapping quality score, ranging from 0 to 60, with 60 being the best.
(28-33) chrm_B, beg_B, flag_B, mapQ_B, cigar_B, NM_B: the mapping info for probe B excluding decoy chromosomes, like above.
(34-39) wDecoy_chrm_A, wDecoy_beg_A, wDecoy_flag_A, wDecoy_mapQ_A, wDecoy_cigar_A, wDecoy_NM_A: the mapping info for probe A including decoy chromosomes.
(40-45) wDecoy_chrm_B, wDecoy_beg_B, wDecoy_flag_B, wDecoy_mapQ_B, wDecoy_cigar_B, wDecoy_NM_B: the mapping info for probe B including decoy chromosomes.
(46) posMatch: whether the mapping matches the original manifest. It only applies to hg19 and will be NA under hg38.
(47) MASK.mapping: whether the probe is masked for mapping reasons. Retained probes should have a high-quality mapping (mapQ >= 40 on the 0-60 scale) that is consistent with the designed MAPINFO and contains no INDELs. For type I probes, this must hold for both probe A and probe B.
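The retention criteria for MASK.mapping can be approximated from the mapping columns. The sketch below is not the procedure used to generate the column; it only illustrates two of the stated criteria (mapQ >= 40 and no INDELs in the CIGAR string, applied to both probes for type I) and omits the comparison against the designed MAPINFO, which requires the original manifest. The function names alignment_ok and mapping_retained are hypothetical, and non-numeric mapQ values (for example "NA") are assumed to mean the probe did not map.

```python
def alignment_ok(mapq, cigar, min_mapq=40):
    """Return True for a high-quality alignment without insertions/deletions."""
    try:
        quality = int(mapq)
    except (TypeError, ValueError):
        # Non-numeric mapQ (e.g. "NA") is treated as unmapped: an assumption.
        return False
    cigar = str(cigar)
    # 'I' and 'D' are the CIGAR operations for insertions and deletions.
    return quality >= min_mapq and "I" not in cigar and "D" not in cigar

def mapping_retained(row):
    """Check probe A, and also probe B when the design is type I."""
    ok = alignment_ok(row["mapQ_A"], row["cigar_A"])
    if row["designType"] == "I":
        ok = ok and alignment_ok(row["mapQ_B"], row["cigar_B"])
    return ok

# Hypothetical type II row: only probe A is evaluated.
row = {"designType": "II", "mapQ_A": "60", "cigar_A": "50M",
       "mapQ_B": "NA", "cigar_B": "NA"}
print(mapping_retained(row))
```

A probe for which mapping_retained returns False would be a candidate for masking under these two criteria alone; the actual MASK.mapping column may differ because it also accounts for position consistency.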
(48) MASK.typeINextBaseSwitch: whether the probe has a SNP in the extension base that causes a color channel switch from the official annotation (described as color-channel-switching, or CCS SNP in the reference). These probes should be processed differently than designed (by summing up both color channels instead of just the annotated color channel).
(49) MASK.rmsk15: whether the 15bp 3'-subsequence of the probe overlaps with repeat masker. This MASK is NOT recommended.
(50-53) MASK.sub25.copy, MASK.sub30.copy, MASK.sub35.copy, MASK.sub40.copy: whether the 25bp, 30bp, 35bp and 40bp 3'-subsequence of the probe is non-unique.
(54) MASK.snp5.common: whether the 5bp 3'-subsequence (including the extension base for type II) overlaps with any of the common SNPs from dbSNP (global MAF can be under 1%).
(55) MASK.snp5.GMAF1p: whether the 5bp 3'-subsequence (including the extension base for type II) overlaps with any SNP with global MAF >1%.
(56) MASK.extBase: probes masked for extension base inconsistent with specified color channel (type-I) or CpG (type-II) based on mapping.
(57) MASK.general: the recommended general purpose masking merged from "MASK.sub30.copy", "MASK.mapping", "MASK.extBase", "MASK.typeINextBaseSwitch" and "MASK.snp5.GMAF1p".
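The merge described for MASK.general can be sketched as a logical OR over the five component columns. Two assumptions are made here that the text does not state: that "merged" means a probe is masked if any component mask is set, and that mask values are stored as the strings "TRUE" and "FALSE". The function names is_true and general_mask are hypothetical.

```python
# Component columns named in the description of MASK.general.
GENERAL_COMPONENTS = [
    "MASK.sub30.copy",
    "MASK.mapping",
    "MASK.extBase",
    "MASK.typeINextBaseSwitch",
    "MASK.snp5.GMAF1p",
]

def is_true(value):
    """Interpret a mask cell; assumes "TRUE"/"FALSE" text (case-insensitive)."""
    return str(value).strip().upper() == "TRUE"

def general_mask(row):
    """Return True if any of the component masks is set for this probe."""
    return any(is_true(row[column]) for column in GENERAL_COMPONENTS)

# Hypothetical row: only the SNP mask is set, so the probe is masked.
row = {column: "FALSE" for column in GENERAL_COMPONENTS}
row["MASK.snp5.GMAF1p"] = "TRUE"
print(general_mask(row))
```

Comparing the result against the shipped MASK.general column is a simple way to test whether these assumptions hold for a given release of the file.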