Methylize Walkthrough

[1]:
#Install joblib module for parallelization
import sys
!conda install --yes --prefix {sys.prefix} joblib
Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

[2]:
import methylize
import methylcheck
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.stats.multitest import multipletests

Differentially Methylated Position Analysis with Binary Phenotypes

DMPs are probes where methylation levels differ between two groups of samples, while DMRs are genomic regions where DNA methylation levels differ between two groups of samples.

Beta or m-values can be used, but the format needs to be samples as rows and probes as columns.

[3]:
betas, meta = methylcheck.load_both('../data/asthma') #load in the beta values and metadata
betas = betas.T
print(meta.shape)
meta.head()
INFO:methylcheck.load_processed:Found several meta_data files; attempting to match each with its respective beta_values files in same folders.
WARNING:methylcheck.load_processed:Columns in sample sheet meta data files does not match for these files and cannot be combined:['../data/asthma/sample_sheet_meta_data.pkl', '../data/asthma/GPL13534/GSE157651_GPL13534_meta_data.pkl']
INFO:methylcheck.load_processed:Multiple meta_data found. Only loading the first file.
INFO:methylcheck.load_processed:Loading 40 samples.
Files: 100%|██████████| 1/1 [00:00<00:00,  8.94it/s]
INFO:methylcheck.load_processed:loaded data (485512, 40) from 1 pickled files (0.118s)
INFO:methylcheck.load_processed:Transposed data and reordered meta_data so sample ordering matches.
INFO:methylcheck.load_processed:meta.Sample_IDs match data.index (OK)
(40, 19)
[3]:
tissue disease smoking_status passage Sex age description Sample_ID Sentrix_ID Sentrix_Position Sample_Group Sample_Name Sample_Plate Sample_Type Sub_Type Sample_Well Pool_ID GSM_ID Control
20 Airway Fibroblast Healthy Ex 4 M 65 fun2norm normalized average beta 9976861129_R03C01 9976861129 R03C01 None HAF091 None Unknown None None None GSM4772065 False
38 Parenchymal Fibroblast Asthmatic Non 4 F 19 fun2norm normalized average beta 9976861129_R04C02 9976861129 R04C02 None 7294_P4_LG None Unknown None None None GSM4772083 False
37 Parenchymal Fibroblast Asthmatic Non 4 F 14 fun2norm normalized average beta 9976861129_R05C01 9976861129 R05C01 None 7291_lg_p4 None Unknown None None None GSM4772082 False
32 Parenchymal Fibroblast Asthmatic Current 4 F 15 fun2norm normalized average beta 9976861129_R06C02 9976861129 R06C02 None 7188_lg_p4_R None Unknown None None None GSM4772077 False
23 Airway Fibroblast Asthmatic Non 4 F 8 fun2norm normalized average beta 9976861137_R06C01 9976861137 R06C01 None 7016_BR_P4 None Unknown None None None GSM4772068 False

Get binary phenotype

The phenotypes can be anything list-like (i.e. a list, numpy array, or Pandas series).

When using Logistic Regression DMP, use phenotype data that is binary (has only 2 classes). In this example, we will take disease state as our phenotype because there are only 2 classes, Asthmatic and Healthy. The DMP step will automatically convert your string phenotype to 0 and 1 if there are not in that format already.

[4]:
pheno_data = meta.disease
print(pheno_data.shape)
dummies = pd.get_dummies(meta, columns=['disease'])[['disease_Asthmatic', 'disease_Healthy']]
healthy = dummies.disease_Healthy.sum()
asthma = dummies.disease_Asthmatic.sum()
fig1, ax1 = plt.subplots()
ax1.pie((healthy, asthma), labels=['Healthy', 'Asthma'], autopct='%1.1f%%',
        shadow=True, startangle=90)
plt.show()
(40,)
../_images/docs_methylize_tutorial_8_1.png

DMP Function

This function searches for individual differentially methylated positions/probes (DMPs) by regressing the methylation beta or M-value for each sample at a given genomic location against the phenotype data for those samples. In this first example, we are using the argument regression_method="logistic" because we are using binary phenotypes.

impute argument:

  • Default: ‘delete’ probes if any samples have missing data for that probe.
  • True or ‘auto’: if <30 samples, deletes rows; if >=30 samples, uses average.
  • False: don’t impute and throw an error if NaNs present ‘average’ - use the average of probe values in this batch ‘delete’ - drop probes if NaNs are present in any sample
  • ‘fast’ - use adjacent sample probe value instead of average (much faster but less precise)

Note: if you want to analyze every probe in your betas dataframe, remove the .sample(N). What this does is randomly samples N probes from your dataframe and decreases the time it takes to complete this analysis.

[5]:
test_results = methylize.diff_meth_pos(betas.sample(20000, axis=1), pheno_data, regression_method="logistic", export=False)
print(test_results.shape)
test_results.head()
WARNING:methylize.diff_meth_pos:Dropped 2902 probes with missing values from all samples
All samples with the phenotype (Asthmatic) were assigned a value of 0 and all samples with the phenotype (Healthy) were assigned a value of 1 for the logistic regression analysis.
(17098, 6)
[5]:
Coefficient StandardError PValue 95%CI_lower 95%CI_upper FDR_QValue
cg22048228 -0.320912 0.654584 0.623955 -0.962050 1.603874 0.999978
cg19366543 0.048680 1.217084 0.968095 -2.434121 2.336761 0.999978
cg12408293 -0.070025 1.835602 0.969570 -3.527689 3.667739 0.999978
cg08640790 -0.043054 2.412119 0.985759 -4.684612 4.770720 0.999978
cg02835494 0.027930 1.860959 0.988025 -3.675343 3.619483 0.999978

Differentially Methylated Positions Analysis with Continuous Numeric Phenotypes

Get Continuous Numeric Phenotype

In order to find DMPs against a numeric continuous phenotype, the linear regression method must be used.

In this example, we are using age as the continuous numeric phenotype.

[6]:
pheno_data = meta.age.astype(int)
print(pheno_data.shape)
(pheno_data.sort_values()).hist()
plt.xlabel('Age (years)'); plt.ylabel('Frequency')
(40,)
[6]:
Text(0, 0.5, 'Frequency')
../_images/docs_methylize_tutorial_16_2.png

DMP Function

In this second example, we are using the argument regression_method="linear" because we are using continuous numeric phenotypes.

[7]:
test_results2 = methylize.diff_meth_pos(betas, pheno_data, regression_method="linear", export=False)
print(test_results2.shape)
test_results2.head()
WARNING:methylize.diff_meth_pos:Dropped 71139 probes with missing values from all samples
(414373, 7)
[7]:
Coefficient StandardError PValue 95%CI_lower 95%CI_upper Rsquared FDR_QValue
cg03555227 0.004583 0.000380 1.453304e-14 0.890321 0.890629 0.792946 6.022099e-09
cg16867657 0.008122 0.000718 9.888059e-14 0.877810 0.878454 0.771117 2.048672e-08
cg11169102 -0.004757 0.000460 1.323437e-12 -0.859258 -0.858785 0.737918 1.550767e-07
cg16909962 0.009223 0.000903 1.871221e-12 0.855763 0.856707 0.733140 1.550767e-07
cg23045594 0.005227 0.000511 1.856802e-12 0.856031 0.856566 0.733247 1.550767e-07

To BED

This function converts the stats dataframe from the DMP function into BED format. For more information regarding this function and its arguments, please view the API Reference.

[8]:
bed = methylize.to_BED(test_results2, manifest_or_array_type='450k', save=False, filename='test_bed.bed')
bed.head()
INFO:methylprep.files.manifests:Reading manifest file: HumanMethylation450k_15017482_v3.csv
[8]:
chrom chromStart chromEnd pvalue name
0 1 69591.0 69641.0 0.060267 cg21870274
1 1 864703.0 864753.0 0.164362 cg08258224
2 1 870161.0 870211.0 0.473560 cg16619049
3 1 877159.0 877209.0 0.267582 cg18147296
4 1 898803.0 898853.0 0.283629 cg13938959

Manhattan Plots

After we run the DMP analysis, we can move on to visualizing the results. One way to do this is a Manhattan plot.

A Manhattan plot has probe position on the x-axis (grouped and colored by chromosome) and -log(p-value) on the y-axis. Because of the scale on the y-axis, the greater the probe is on the y-axis, the more significant that probe is to the phenotype that was used. It is common to have a cutoff on Manhattan plots where we set our α, which commonly is 0.05. Other common α are 0.01 and 0.001. The lower your α is, the more significant probes are that are above that cufoff on the Manhattan plot. The cutoff is for the p-values, meaning the probe is statistically significant if the p-value < α.

Because there are a high number of probes, a p-value correction is needed to prevent false positives. The Manhattan plot function automatically applies a Bonferroni correction to the cutoff to lower the α (increase the dotted cutoff line on the plot) to a more conservative value. The post-test correction is controlled by the adjust argument, where "bonferroni" is the default, and other options are "fdr" for an FDR correction, and None for no post-test correction to keep the cutoff the same as is entered in the cutoff argument.

Manhattan Plot for Binary Phenotypes

[18]:
methylize.manhattan_plot(test_results, cutoff=0.05, palette='default', save=False, array_type='450k', adjust=None)
INFO:methylprep.files.manifests:Reading manifest file: HumanMethylation450k_15017482_v3.csv
Total probes to plot: 17098
01 1692 | 02 1214 | 03 903 | 04 691 | 05 900 | 06 1275 | 07 1029 | 08 755 | 09 357 | 10 825 | 11 1086 | 12 830 | 13 396 | 14 477 | 15 538 | 16 747 | 17 1058 | 18 221 | 19 904 | 20 392 | 21 144 | 22 304 | X 360
../_images/docs_methylize_tutorial_25_2.png

Find Significant Probes

To see which probes are significant (above the cutoff line in the Manhattan plot), filter the results dataframe by the p-value. Because we set post_test=None in the arguments, this means the p-value cutoff that was set is correct, and there was no correction to this value. The next DMR example will have a p-value correction.

[10]:
interesting_probes = test_results[test_results['PValue'] <= 0.05]
interesting_probes
[10]:
Coefficient StandardError PValue 95%CI_lower 95%CI_upper FDR_QValue minuslog10pvalue chromosome MAPINFO
cg11703729 -1.403577 0.708667 0.047638 0.014614 2.792539 0.999978 1.322049 10 CHR-13809615.0
cg01096199 -1.518270 0.754115 0.044082 0.040232 2.996308 0.999978 1.355743 14 CHR-89724701.0
cg14543255 -1.834690 0.796496 0.021254 0.273586 3.395794 0.999978 1.672569 17 CHR-9002458.0
cg26494916 1.621125 0.733396 0.027075 -3.058554 -0.183696 0.999978 1.567434 05 CHR-149453321.0
cg27196131 1.991505 0.985877 0.043380 -3.923787 -0.059222 0.999978 1.362712 04 CHR-187176529.0
cg07351758 1.814658 0.810237 0.025113 -3.402694 -0.226623 0.999978 1.600105 03 CHR-59766624.0
cg23103043 1.830804 0.707215 0.009632 -3.216920 -0.444688 0.999978 2.016264 17 CHR-71650582.0
cg25314266 1.448918 0.706773 0.040360 -2.834168 -0.063668 0.999978 1.394050 04 CHR-76188042.0
cg07538039 1.761092 0.897471 0.049730 -3.520103 -0.002081 0.999978 1.303385 04 CHR-15374511.0
cg21664636 -1.624543 0.698487 0.020029 0.255533 2.993553 0.999978 1.698338 03 CHR-70383513.0

Manhattan Plot for Continuous Numeric Phenotypes

For this example, there will be a Benferoni post-test correction. The adjusted p-value cutoff is displayed on the horizontal cutoff line as well as printed above the plot. Please note that this value is the -log(p-value) of the p-value cutoff.

[11]:
methylize.manhattan_plot(test_results2, cutoff=0.05, palette='default', save=False, array_type='450k', verbose=True)
INFO:methylprep.files.manifests:Reading manifest file: HumanMethylation450k_15017482_v3.csv
Total probes to plot: 414373
01 40528 | 02 30059 | 03 21877 | 04 17328 | 05 20851 | 06 30620 | 07 24564 | 08 17814 | 09 8501 | 10 20863 | 11 25272 | 12 21229 | 13 10370 | 14 12956 | 15 12921 | 16 18404 | 17 24069 | 18 5215 | 19 21756 | 20 9259 | 21 3329 | 22 7167 | X 9419 | Y 2
p-value cutoff adjusted: 1.3010299956639813 ==[ Bonferoni ]==> 6.9184214452654755
../_images/docs_methylize_tutorial_31_2.png

Find Significant Probes

For this Manhattan plot, there has been a p-value cuttoff correction, meaning that the cutoff line is at a more conservative value (lower p-value, higher cutoff line) to account for false positives. Because of this correction, when filtering the results dataframe to find significant probes, you need to filter by the Bonferoni (or FDR) adjust p-value to find the probes above the cutoff line on the Manhattan plot.

[12]:
adjusted = multipletests(test_results2.PValue, alpha=0.05)
pvalue_cutoff_y = -np.log10(adjusted[3])
interesting_probes2 = test_results2[test_results2['minuslog10pvalue'] >= pvalue_cutoff_y] #bonferoni correction for cutoff
interesting_probes2
[12]:
Coefficient StandardError PValue 95%CI_lower 95%CI_upper Rsquared FDR_QValue minuslog10pvalue chromosome MAPINFO
cg03555227 0.004583 0.000380 1.453304e-14 0.890321 0.890629 0.792946 6.022099e-09 13.837644 05 CHR-170862066.0
cg16867657 0.008122 0.000718 9.888059e-14 0.877810 0.878454 0.771117 2.048672e-08 13.004889 06 CHR-11044644.0
cg11169102 -0.004757 0.000460 1.323437e-12 -0.859258 -0.858785 0.737918 1.550767e-07 11.878297 05 CHR-170858836.0
cg16909962 0.009223 0.000903 1.871221e-12 0.855763 0.856707 0.733140 1.550767e-07 11.727875 01 CHR-229270964.0
cg23045594 0.005227 0.000511 1.856802e-12 0.856031 0.856566 0.733247 1.550767e-07 11.731235 02 CHR-71276753.0
... ... ... ... ... ... ... ... ... ... ...
cg04091914 -0.003923 0.000604 1.185223e-07 -0.726027 -0.724906 0.526302 6.947979e-05 6.926200 02 CHR-227686822.0
cg25874108 -0.002309 0.000355 1.190858e-07 -0.725717 -0.725057 0.526186 6.947979e-05 6.924140 17 CHR-74896813.0
cg01235463 -0.002696 0.000415 1.188893e-07 -0.725800 -0.725029 0.526227 6.947979e-05 6.924857 10 CHR-91154018.0
cg11126313 -0.001694 0.000261 1.199705e-07 -0.725505 -0.725020 0.526006 6.972307e-05 6.920925 03 CHR-46242771.0
cg04161137 -0.003023 0.000465 1.199026e-07 -0.725704 -0.724839 0.526020 6.972307e-05 6.921172 01 CHR-22647687.0

713 rows × 10 columns

Volcano Plot

Logistic Regression Volcano Plots are under construction

Below shows how to create a Volcano plot using the linear regression DMR analysis.

Positive correlation (hypermethylated) in red. Negative correclation (hypomethylated) in blue

A higher |Regression (beta) coefficient| means that that specific probe is more significant in predicting if the probe is methylated (positive) vs not methylated (negative). The non-gray probes are the probes that are statistically significant, and have a higher absolute value of the regression coefficient as the cutoff. The red probes further to the right on the plot are hypermethylated with increase in age, and the opposite is true for the blue probes.

Useful Arguments:

  • alpha: Default: 0.05 alpha level The significance level that will be used to highlight the most significant adjusted p-values (FDR Q-values) on the plot. This is the horizontal cutoff line you see on the plot.
  • cutoff: Default: No cutoff format: a list or tuple with two numbers for (min, max) If specified in kwargs, will exclude values within this range of regression coefficients from being “significant” and put dotted vertical lines on chart. These are the vertical cutoff lines you see on the plot. These cutoffs are dependent on the study and up to the researcher in choosing which cutoff is the most desirable.
[20]:
methylize.volcano_plot(test_results2, fontsize=16, cutoff=(-0.0005,0.0005), alpha=0.05, save=False)
Excluded 269865 probes outside of the specified beta coefficient range: (-0.0005, 0.0005)
../_images/docs_methylize_tutorial_38_1.png

Find Significant Probes

[21]:
interesting_probes3 = test_results2[(test_results2['FDR_QValue'] <= 0.05) & (np.abs(test_results2['Coefficient']) > 0.0005)]
interesting_probes3
[21]:
Coefficient StandardError PValue 95%CI_lower 95%CI_upper Rsquared FDR_QValue minuslog10pvalue chromosome MAPINFO
cg03555227 0.004583 0.000380 1.453304e-14 0.890321 0.890629 0.792946 6.022099e-09 13.837644 05 CHR-170862066.0
cg16867657 0.008122 0.000718 9.888059e-14 0.877810 0.878454 0.771117 2.048672e-08 13.004889 06 CHR-11044644.0
cg11169102 -0.004757 0.000460 1.323437e-12 -0.859258 -0.858785 0.737918 1.550767e-07 11.878297 05 CHR-170858836.0
cg16909962 0.009223 0.000903 1.871221e-12 0.855763 0.856707 0.733140 1.550767e-07 11.727875 01 CHR-229270964.0
cg23045594 0.005227 0.000511 1.856802e-12 0.856031 0.856566 0.733247 1.550767e-07 11.731235 02 CHR-71276753.0
... ... ... ... ... ... ... ... ... ... ...
cg18227216 -0.000972 0.000331 5.680062e-03 -0.430024 -0.428965 0.184466 4.997593e-02 2.245647 17 CHR-10225251.0
cg16297435 0.001548 0.000528 5.681080e-03 0.428642 0.430330 0.184459 4.997954e-02 2.245569 06 CHR-43060558.0
cg05515570 -0.000716 0.000244 5.681038e-03 -0.429877 -0.429096 0.184459 4.997954e-02 2.245572 10 CHR-28125056.0
cg13346820 -0.001955 0.000667 5.680922e-03 -0.430553 -0.428421 0.184460 4.997954e-02 2.245581 05 CHR-865203.0
cg12773197 -0.001329 0.000453 5.681300e-03 -0.430209 -0.428760 0.184457 4.998041e-02 2.245552 13 CHR-111586326.0

43781 rows × 10 columns

Differentiated Methylized Regions (DMR) Analysis

DMRs are genomic regions where DNA methylation levels differ between two groups of samples, while DMPs are probes where methylation levels differ between two groups of samples. DMR looks at clusters or adjacent probes, and if the whole cluster shows the same direction of effect between groups, then it is more significant (likely to be meaningful).

We have already ran methylize.diff_meth_pos with the continuous numeric phenotypes, and the statistics for that fuction is stored in the dataframe test_restuls2. This is what we will use in the DMR function below.

This following step should create these files: * dmr.acf.txt

  • dmr.args.txt
  • dmr.fdr.bed.gz
  • dmr.manhattan.png
  • dmr.regions-p.bed.gz
  • dmr.slk.bed.gz
  • dmr_regions.csv
  • dmr_regions_genes.csv
  • dmr_stats.csv
  • stats.bed

This could run for a while. This function compares all of the probes and clusters CpG probes that show a difference together if they are close to each other in the genomic sequence.

There are many adjustible parameters for the following function, and you can refer to the API Reference for more information.

[15]:
files_created = methylize.diff_meth_regions(test_results2, '450k', prefix='../data/asthma/dmr/')
INFO:methylprep.files.manifests:Reading manifest file: HumanMethylation450k_15017482_v3.csv
INFO:methylprep.files.manifests:Reading manifest file: HumanMethylation450k_15017482_v3.csv
/Users/jaredmeyers/opt/anaconda3/lib/python3.8/site-packages/methylize/diff_meth_regions.py:150: DtypeWarning: Columns (0) have mixed types.Specify dtype option on import or set low_memory=False.
  results = _pipeline(kw['col_num'], kw['step'], kw['dist'],
Calculating ACF out to: 453
with 17 lags: [1, 31, 61, 91, 121, 151, 181, 211, 241, 271, 301, 331, 361, 391, 421, 451, 481]
18581032 bases used as coverage for sidak correction
INFO:methylize.diff_meth_regions:wrote: ../data/asthma/dmr/.regions-p.bed.gz, (regions with corrected-p < 0.05: 29)
/var/folders/3k/vrkjxj8s7l1gq4x6n7ymqncw0000gn/T/ipykernel_10952/1491901786.py:1: DtypeWarning: Columns (0) have mixed types.Specify dtype option on import or set low_memory=False.
  files_created = methylize.diff_meth_regions(test_results2, '450k', prefix='../data/asthma/dmr/')
INFO:methylize.genome_browser:Loaded 4802 CpG regions from ../data/asthma/dmr/_regions.csv.
INFO:methylize.genome_browser:Using cached `refGene`: /Users/jaredmeyers/opt/anaconda3/lib/python3.8/site-packages/methylize/data/refGene.pkl with (135634) genes
Mapping genes: 100%|██████████| 135634/135634 [00:30<00:00, 4381.01it/s]
INFO:methylize.genome_browser:Wrote ../data/asthma/dmr/_regions_genes.csv

Many of the files created by this function can be useful when using other tools. The file with the main summary of what was just done is found in dmr_regions_genes.csv.

[16]:
regions_genes = pd.read_csv('../data/asthma/dmr_regions_genes.csv')
regions_genes.head()
[16]:
Unnamed: 0 chrom chromStart chromEnd min_p n_probes z_p z_sidak_p name genes distances descriptions
0 0 1 22289407 22289457 0.03571 1 0.000094 0.7979 cg02053477 NaN NaN NaN
1 1 1 34757646 34757696 0.03110 1 0.000054 0.6018 cg13170235 SNRPC 142 Homo sapiens small nuclear ribonucleoprotein p...
2 2 1 34761106 34761156 0.03401 1 0.000081 0.7500 cg15750705 NaN NaN NaN
3 3 1 202858152 202858202 0.03110 1 0.000038 0.4790 cg15569630 NaN NaN NaN
4 4 10 22334619 22334669 0.03110 1 0.000038 0.4790 cg13327545 NaN NaN NaN

This shows clusters of CpG probes that were significantly different and annotes these clusters with mne or more nearby genes using the UCSV Genome Browswer database.

Distances are the number of base-pairs that separate the start of the CpG probe and the start of the coding sequence of the gene.

Only the rows with signicicant p-values will have an annotation.

Adopted from the combined-pvalues package by Brend Pedersen et al, 2013: Comb-p: software for combining, analyzing, grouping and correcting spatially correlated P-values doi: 10.1093/bioinformatics/bts545

More information on DMR on this page

Fetching Genes

This function annotates the DMR region output file using the UCSC Genome Browser database as a reference to what genes are nearby. For more information on this function and the specific argument, please refer to the API Reference

This function can now accept either the filepath or the DMR dataframe output.

[17]:
reg = pd.read_csv('../data/asthma/dmr_regions.csv')
genes = methylize.fetch_genes('../data/asthma/dmr_regions.csv')
genes.head(12)
INFO:methylize.genome_browser:Loaded 60 CpG regions from ../data/asthma/dmr_regions.csv.
INFO:methylize.genome_browser:Using cached `refGene`: /Users/jaredmeyers/opt/anaconda3/lib/python3.8/site-packages/methylize/data/refGene.pkl with (135634) genes
Mapping genes: 100%|██████████| 135634/135634 [00:28<00:00, 4768.33it/s]
INFO:methylize.genome_browser:Wrote ../data/asthma/dmr_regions_genes.csv
[17]:
chrom chromStart chromEnd min_p n_probes z_p z_sidak_p name genes distances descriptions
0 1 22289407 22289457 0.03571 1 0.000094 0.7979 cg02053477
1 1 34757646 34757696 0.03110 1 0.000054 0.6018 cg13170235 SNRPC 142 Homo sapiens small nuclear ribonucleoprotein p...
2 1 34761106 34761156 0.03401 1 0.000081 0.7500 cg15750705
3 1 202858152 202858202 0.03110 1 0.000038 0.4790 cg15569630
4 10 22334619 22334669 0.03110 1 0.000038 0.4790 cg13327545
5 11 2170376 2170426 0.03110 1 0.000038 0.4790 cg11911653 VARS2 -165,-135 Homo sapiens valyl-tRNA synthetase 2, mitochon...
6 11 2170376 2170426 0.03110 1 0.000038 0.4790 cg14714364 VARS2 -165,-135 Homo sapiens valyl-tRNA synthetase 2, mitochon...
7 11 2346840 2346890 0.04207 1 0.000138 0.9041 cg05532869
8 11 61755269 61755319 0.04150 1 0.000133 0.8968 cg11179625 MYRF -108 Homo sapiens myelin regulatory factor (MYRF), ...
9 11 69403865 69403915 0.03110 1 0.000068 0.6847 cg05582286 TACR2 -37 Homo sapiens tachykinin receptor 2 (TACR2), mR...
10 11 114063572 114063622 0.03110 1 0.000054 0.6018 cg10096321
11 11 117861576 117861626 0.03110 1 0.000038 0.4790 cg14848289