From ASVs to metabolic pathways with PICRUSt2
Resources
https://github.com/picrust/picrust2/wiki
https://github.com/picrust/picrust2/wiki/PICRUSt2-Tutorial-(v2.1.4-beta)
Be aware of the limitations of PICRUSt2, which are outlined on the wiki pages above.
The code shown below can be executed on a high-performance computer (HPC) with PICRUSt2 installed, or on any home computer with PICRUSt2 installed and adequate CPU power and RAM. All commands are collected in a .txt file that is queued for execution on the HPC.
Filtering the QIIME 2 feature table
Contamination
qiime taxa filter-table \
--i-table feature_table.qza \
--i-taxonomy taxonomy_silva.qza \
--p-exclude mitochondria,chloroplast \
--o-filtered-table feature_table_no-mitoch_chloropl.qza
Minimum total frequency of 10 reads and presence in at least 2 samples
qiime feature-table filter-features \
--i-table feature_table_no-mitoch_chloropl.qza \
--p-min-samples 2 --p-min-frequency 10 \
--o-filtered-table feature_table_filtered.qza
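To make the two filtering criteria concrete, here is a minimal sketch that applies the same rule to a plain tab-separated table (features in rows, samples in columns). It is only an illustration and not how QIIME 2 implements the filter; the function name filter_features_tsv and the TSV layout are assumptions made for this example.

```shell
# Illustrative only: keep features whose summed count across samples is at
# least min_freq and that are observed (count > 0) in at least min_samples.
# Usage: filter_features_tsv TABLE.tsv MIN_FREQ MIN_SAMPLES
# (hypothetical helper, not part of QIIME 2 or PICRUSt2)
filter_features_tsv() {
    awk -F'\t' -v min_freq="$2" -v min_samples="$3" '
        NR == 1 { print; next }              # pass the header row through
        {
            total = 0; present = 0
            for (i = 2; i <= NF; i++) {      # column 1 holds the feature ID
                total += $i
                if ($i > 0) present++
            }
            if (total >= min_freq && present >= min_samples) print
        }' "$1"
}
```

With a minimum frequency of 10 and a minimum of 2 samples, a feature with counts 5, 5, 0 is kept, while one with counts 20, 0, 0 (only one sample) or 3, 3, 3 (total of 9) is dropped.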
Filter the representative sequences to match the filtered feature table
qiime feature-table filter-seqs \
--i-data sample_rep_seqs.qza \
--i-table feature_table_filtered.qza \
--o-filtered-data sample_rep_seqs-filtered.qza
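Export QIIME 2 sequences and feature table
Export the representative sequences to a FASTA file and the feature table to a BIOM file for use with PICRUSt2, which was installed in a conda (Python) environment. The inputs are the sequences and table filtered in QIIME 2 to a minimum frequency of 10 and a minimum of 2 samples.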
qiime tools export \
--input-path sample_rep_seqs-filtered.qza \
--output-path exports_for_picrust
qiime tools export \
--input-path feature_table_filtered.qza \
--output-path exports_for_picrust
This produces a folder called exports_for_picrust containing a feature-table.biom file and a dna-sequences.fasta file, which are the inputs for PICRUSt2.
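Before starting PICRUSt2, it can help to confirm that both exported files are present and that the FASTA file holds the expected number of sequences. The following is a minimal sketch using standard shell tools; the function names count_fasta_records and check_picrust_inputs are hypothetical helpers, not part of QIIME 2 or PICRUSt2.

```shell
# Count FASTA records by counting header lines (lines starting with ">").
count_fasta_records() {
    grep -c '^>' "$1"
}

# Succeed only if both expected PICRUSt2 input files exist and are non-empty.
# Usage: check_picrust_inputs exports_for_picrust
check_picrust_inputs() {
    dir="$1"
    [ -s "$dir/dna-sequences.fasta" ] && [ -s "$dir/feature-table.biom" ]
}
```

For example, check_picrust_inputs exports_for_picrust && count_fasta_records exports_for_picrust/dna-sequences.fasta prints the sequence count only when both files are in place. That count should match the number of features retained in the filtered table.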
PICRUSt2 pipeline
This is the core of PICRUSt2: it runs all of the steps and outputs the predicted EC and KO metagenomes and the MetaCyc (default) pathway abundances. It is run with PICRUSt2 installed in a conda environment, using the default nearest-sequenced taxon index (NSTI) cutoff.
picrust2_pipeline.py -s dna-sequences.fasta \
--stratified -i feature-table.biom -o picrust2_out_pipeline -p 4
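Once the pipeline has finished, a quick sanity check is to count how many rows (for example, pathways) an output table contains. The sketch below assumes the output tables are gzipped, tab-separated files with one header line; the helper name count_table_rows and the example path in the comment are assumptions and may differ between PICRUSt2 versions.

```shell
# Print the number of data rows (header excluded) in a gzipped TSV table.
# Hypothetical helper, not part of PICRUSt2.
count_table_rows() {
    gzip -cd "$1" | tail -n +2 | wc -l | tr -d ' '
}

# Example call, assuming the pipeline wrote this file:
# count_table_rows picrust2_out_pipeline/pathways_out/path_abun_unstrat.tsv.gz
```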
KEGG pathways
To get pathway abundances from the KO predictions instead of the EC predictions, use the command below. Be aware that the KEGG pathway mapping file dates from 2011.
## the input must be the predicted KO metagenome table, not a QIIME 2 .qza artifact
pathway_pipeline.py -i KO_metagenome_out/pred_metagenome_unstrat.tsv.gz -o picrust \
--no_regroup \
--map ~/miniconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/pathway_mapfiles/KEGG_pathways_to_KO.tsv
## or use the stratified KO file to get the pathway abundances from the KO predictions
## (the same caveat applies: the mapping is from 2011, see the wiki links above)
pathway_pipeline.py -i KO_metagenome_out/pred_metagenome_contrib.tsv -o KEGG_pathways_out \
--no_regroup \
--map ~/miniconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/pathway_mapfiles/KEGG_pathways_to_KO.tsv
For more information, see the PICRUSt2 wiki links above.