Release History

v1.1.0

Results from diff_meth_pos were not accurate in prior versions; the regression optimization has been fixed.
The diff_meth_pos function kwargs changed to provide more flexibility in how the model is optimized.
- Added support for covariates in logistic regression. Provide a dataframe containing both the phenotype and the covariates, and specify which columns hold the phenotype and which hold covariates. The function rearranges and normalizes the data to suit the model.
- Use the new ‘solver’ kwarg in diff_meth_pos to specify which form of linear or logistic regression to run. There are two flavors of each, and both give nearly identical results.
- Auto-detects logistic or linear regression from the input: if the phenotype is non-numeric and has exactly two distinct values, it assumes logistic.
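The auto-detection rule in the last bullet can be sketched as follows. This is a minimal illustration of the stated rule, not methylize's actual code: the function name `guess_regression_kind` is hypothetical, and raising an error for non-numeric phenotypes with more than two values is an assumption made here for the sketch.

```python
from numbers import Number


def guess_regression_kind(phenotype):
    """Hypothetical helper: choose 'logistic' or 'linear' from phenotype values."""
    values = list(phenotype)
    # bool is a Number subclass in Python, so treat it as non-numeric here.
    all_numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in values
    )
    if all_numeric:
        return "linear"
    # Non-numeric labels with exactly two distinct values: a binary outcome.
    if len(set(values)) == 2:
        return "logistic"
    # Assumption for this sketch: anything else is ambiguous, so refuse to guess.
    raise ValueError("non-numeric phenotype must have exactly two distinct values")


kind = guess_regression_kind(["case", "control", "case", "control"])
```

With labels such as "case" and "control" the helper returns "logistic"; a list of floats returns "linear".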
Upgraded manhattan and volcano plots with many more options. Default settings should mirror most R EWAS packages now, with a “suggestive” and “significant” threshold line on manhattan plots.
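As a rough illustration of how the two threshold lines relate to p-values, the sketch below converts cutoffs to the -log10 scale used on a manhattan plot's y-axis and labels a probe accordingly. The cutoff values (1e-5 and 5e-8) and the function names are assumptions chosen for the example; they are not taken from methylize and may differ from its defaults.

```python
import math

# Assumed, illustrative cutoffs; not methylize's documented defaults.
SUGGESTIVE_P = 1e-5
SIGNIFICANT_P = 5e-8


def threshold_height(p_value):
    """Return the y-axis height (-log10 p) at which a threshold line is drawn."""
    return -math.log10(p_value)


def classify_probe(p_value, suggestive=SUGGESTIVE_P, significant=SIGNIFICANT_P):
    """Label a probe by the strictest threshold its p-value passes."""
    if p_value <= significant:
        return "significant"
    if p_value <= suggestive:
        return "suggestive"
    return "not significant"


suggestive_line = threshold_height(SUGGESTIVE_P)
significant_line = threshold_height(SIGNIFICANT_P)
```

Probes plotted above the higher line would count as "significant", and those between the two lines as "suggestive".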
Unit test coverage added.

Fixed the option to use differentially methylated regions (DMR) with a cached local copy of the UCSC database (via fetch_genes) without using the internet. Previously, it would still contact the internet database even if the user told it not to.
Added testing via GitHub Actions, and increased speed.
Updated documentation.

Fixed bug in fetch_genes() from UCSC browser; the function now accepts either the filepath or the DMR dataframe output.

Added a differentially methylated regions (DMR) function that takes the output of the diff_meth_pos (DMP) function.
- DMP maps differences to chromosomes; DMR maps differences to specific genomic loci, and requires more processing.
- Upgraded methylprep manifests to support both old and new genomic build mappings for all array types. In general, you can supply a keyword argument (genome_build='OLD') to change from the new build back to the old one.
- Illumina 27k arrays are still not supported, but mouse, epic, epic+, and 450k ARE supported. (Genome annotation won’t work with mouse array, only human builds.)
- DMP integrates the combined-pvalues package (https://pubmed.ncbi.nlm.nih.gov/22954632/)
- DMP integrates with UCSC Genome (refGene) and annotates the genes near CpG regions.
- Annotation includes column(s) showing the tissue-specific expression levels of relevant genes (e.g. filter=blood). This function is also available with extended options as methylize.filter_genes().
- Provides output BED and CSV files for export into other genomic analysis tools.
methylize.to_BED will convert the diff_meth_pos() stats output into a standard BED file (a tab-separated CSV format with standardized, ordered column names).
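To show what a BED file looks like, here is a minimal standard-library sketch that writes tab-separated rows in the conventional BED column order (chrom, chromStart, chromEnd, name, score). It does not call methylize.to_BED and is not its implementation; the helper name `rows_to_bed`, the probe names, and the coordinates are made up for illustration, and the sketch writes no header line.

```python
import csv
import io


def rows_to_bed(rows):
    """Serialize (chrom, start, end, name, score) tuples as tab-separated BED text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for chrom, start, end, name, score in rows:
        # BED coordinates are 0-based for the start and exclusive for the end.
        writer.writerow([chrom, start, end, name, score])
    return buffer.getvalue()


# Hypothetical probe positions and scores, for illustration only.
example_rows = [
    ("chr1", 10524, 10525, "probe_A", 0.002),
    ("chr2", 20100, 20101, "probe_B", 0.04),
]
bed_text = rows_to_bed(example_rows)
```

Each output line is one region, which is the layout genome browsers and most interval tools expect.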

Fixed bug where methylize could not find a data file in path, causing ImportError.
Improved the diff_meth_pos() function and added support for all array types. The user must now specify the array_type when calling the function: the input data are stats, not probe betas, so the array type cannot be inferred from them.