.. Pyllelic documentation master file, created by sphinx-quickstart on Sat Feb 20 10:57:56 2021. You can adapt this file completely to your liking, but it should at least contain the root `toctree` directive. Welcome to Pyllelic's documentation! ==================================== .. raw:: html `pyllelic `__: a tool for detection of allelic-specific methylation variation in bisulfite DNA sequencing files. .. toctree:: :maxdepth: 2 :caption: Contents: :hidden: index Quickstart ~~~~~~~~~~ Run an interactive sample pyllelic environment in your web browser using `mybinder.org `__: .. raw:: html Demo gif ~~~~~~~~ .. raw:: html

pyllelic demo gif

Dependencies and Installation ============================= Source code for pyllelic is available on `Github `__ Using Conda (preferred) ~~~~~~~~~~~~~~~~~~~~~~~ Create a new conda environment using python 3.8: Easiest: .. code:: bash # Get environment.yml file from this repo curl -L https://github.com/Paradoxdruid/pyllelic/blob/master/environment.yml?raw=true > env.yml # Create and activate conda environment conda env create --file=env.yml conda activate pyllelic .. raw:: html
.. raw:: html or more explict step by step instructions .. raw:: html .. code:: bash conda create --name pyllelic python=3.8 conda activate pyllelic conda config --env --add channels conda-forge conda config --env --add channels bioconda conda config --env --add channels paradoxdruid conda install pyllelic # Optional but usual use case: conda install notebook jupyter_contrib_nbextensions ipywidgets .. raw:: html
Docker container ~~~~~~~~~~~~~~~~ .. code:: bash docker pull ghcr.io/paradoxdruid/pyllelic:latest PyPi installation ~~~~~~~~~~~~~~~~~ .. raw:: html
.. raw:: html PyPi instructions .. raw:: html This will require independent installation of samtools, bowtie2, and bismark packages. .. code:: bash # PyPi python3 -m pip install pyllelic # or Github python3 -m pip install git+https://github.com/Paradoxdruid/pyllelic.git .. raw:: html
Quickstart ========== Example exploratory use in jupyter notebook ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Set up files: .. code:: python from pyllelic import process from pathlib import Path # Retrieve promoter genomic sequence of region to analyze process.retrieve_seq("tert_genome.txt", chrom="chr5", start=1293000, end=1296000) # Download a reference genome and bisulfite sequencing data # Genome data from, e.g. http://hgdownload.soe.ucsc.edu/goldenPath/hg19 # Fastq data from, e.g. http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeHaibMethylRrbs/ genome = Path("/{your_directory}/{genome_file_directory}") fastq = Path("/{your_directory}/{your_fastq_file.fastq.gz}") # Use bismark tool to prepare bisulfite genome and align fastq to bam file process.prepare_genome(genome) # can optionally give path to bowtie2 if not in PATH process.bismark(genome, fastq) # Sort and index the resultant bam file bamfile = Path("/{your_directory}/{bam_filename}.bam") process.sort_bam(bamfile) process.index_bam(bamfile.parent / f"{bamfile.stem}_sorted.bam") Run pyllelic: .. code:: python from pyllelic import pyllelic config = pyllelic.configure( # Specify file and directory locations base_path="/home/jovyan/assets/", prom_file="tert_genome.txt", prom_start="1293200", prom_end="1296000", chrom="5", offset=1293000, # start position of retrieved promoter sequence # viz_backend="plotly", # fname_pattern=r"^[a-zA-Z]+_([a-zA-Z0-9]+)_.+bam$", # test_dir="test", # results_dir="results", ) files_set = pyllelic.make_list_of_bam_files(config) # finds bam files # Run pyllelic; make take some time depending on number of bam files data = pyllelic.pyllelic(config=config, files_set=files_set) positions = data.positions cell_types = data.cell_types means_df = data.means # mean methylation of reads modes_df = data.modes # mode methylation of reads diff_df = data.diffs # difference mean - mode of reads individual_data = data.individual_data # read methylation values data.save("output.xlsx") # save methylation results data.save_pickle("my_run.pickle") # save data object for later analysis data.write_means_modes_diffs(filename="Run1_") # write output data files data.histogram("CELL_LINE", "POSITION") # visualize data for a point data.heatmap(min_values=1) # methylation level heatmap data.reads_graph() # individual methylated / unmethylated reads graph data.quma_results["CELL_LINE"] # see summary data for a cell line -------------- Function Reference ================== .. toctree:: :maxdepth: 2 pyllelic Authors ======= This software is developed as academic software by `Dr. Andrew J. Bonham `__ at the `Metropolitan State University of Denver `__. It is licensed under the GPL v3.0. This software incorporates implementation from `QUMA `__, licensed under the GPL v3.0.