From ASVs to metabolic pathways with PICRUSt2

Resources

https://github.com/picrust/picrust2/wiki
https://github.com/picrust/picrust2/wiki/PICRUSt2-Tutorial-(v2.1.4-beta)

Be aware of the limitations of PICRUSt2, which are outlined on the wiki pages above.

The code shown below can be run on a high-performance computing (HPC) cluster or on any home computer with decent CPU power and RAM, provided PICRUSt2 is installed. All commands are collected in a .txt file that is queued for execution on the HPC.
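
As a minimal sketch of that workflow, assuming the commands are saved in a plain-text file, the file can be run with bash so that execution stops at the first failing step. The helper name run_command_file and the file name picrust2_commands.txt are hypothetical; on a cluster this call would normally sit inside a job script for whichever scheduler is in use, which is not specified here.

```shell
#!/usr/bin/env bash
# run_command_file is a hypothetical helper, not part of QIIME 2 or PICRUSt2.
# It runs every command in a plain-text file and stops at the first failure.
run_command_file() {
    local cmd_file="$1"
    if [ ! -r "$cmd_file" ]; then
        echo "Cannot read command file: $cmd_file" >&2
        return 1
    fi
    # -e: exit on the first command that fails
    # -u: treat unset variables as errors
    # -o pipefail: a pipeline fails if any of its stages fails
    bash -e -u -o pipefail "$cmd_file"
}

# Example (hypothetical file name):
# run_command_file picrust2_commands.txt
```

Stopping on the first error matters here because each step consumes the output of the previous one.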

Filtering the QIIME 2 feature table

Contamination

qiime taxa filter-table \
  --i-table feature_table.qza \
  --i-taxonomy taxonomy_silva.qza \
  --p-exclude mitochondria,chloroplast \
  --o-filtered-table feature_table_no-mitoch_chloropl.qza

Minimum total read frequency of 10 and presence in at least 2 samples

qiime feature-table filter-features \
  --i-table feature_table_no-mitoch_chloropl.qza \
  --p-min-samples 2 --p-min-frequency 10 \
  --o-filtered-table feature_table_filtered.qza
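
To make the two thresholds concrete, the sketch below applies the same rule to a plain tab-separated count table with awk: a feature is kept if its counts summed over all samples reach the minimum frequency and it is non-zero in at least the minimum number of samples. This only illustrates the filtering rule and is not how QIIME 2 implements it. The function name filter_features_tsv and the table layout (header row, feature ID in column 1, one count column per sample) are assumptions.

```shell
#!/usr/bin/env bash
# filter_features_tsv is a hypothetical helper for illustration only.
# Input: TSV with a header row, feature ID in column 1, one count column per sample.
# Usage: filter_features_tsv table.tsv MIN_FREQUENCY MIN_SAMPLES
filter_features_tsv() {
    local table="$1" min_freq="$2" min_samples="$3"
    awk -F'\t' -v min_freq="$min_freq" -v min_samples="$min_samples" '
        NR == 1 { print; next }          # keep the header row
        {
            total = 0; present = 0
            for (i = 2; i <= NF; i++) {
                total += $i              # total frequency across all samples
                if ($i > 0) present++    # number of samples containing the feature
            }
            if (total >= min_freq && present >= min_samples) print
        }
    ' "$table"
}

# Example (hypothetical file name):
# filter_features_tsv counts.tsv 10 2
```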

Filter the representative sequences to match the filtered feature table

qiime feature-table filter-seqs \
  --i-data sample_rep_seqs.qza \
  --i-table feature_table_filtered.qza \
  --o-filtered-data sample_rep_seqs-filtered.qza

Export QIIME 2 sequences and feature table

Export the sequences to FASTA (.fna) and the feature table to a BIOM file for use with PICRUSt2, which was installed in a conda (Python) environment. The exported files are the ones filtered in QIIME 2 to a minimum frequency of 10 and a minimum of 2 samples.

qiime tools export \
  --input-path sample_rep_seqs-filtered.qza \
  --output-path exports_for_picrust

qiime tools export \
  --input-path feature_table_filtered.qza \
  --output-path exports_for_picrust

This produces a folder called exports_for_picrust containing feature-table.biom and dna-sequences.fasta, which are the inputs for PICRUSt2.
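
Before starting a long PICRUSt2 run it can help to confirm that both exported files exist and are not empty. The following is a minimal sketch; check_picrust_inputs is a hypothetical helper name, and the file names are the two mentioned above.

```shell
#!/usr/bin/env bash
# check_picrust_inputs is a hypothetical helper, not part of QIIME 2 or PICRUSt2.
# It verifies that the export folder holds both input files and reports
# how many sequences the FASTA file contains.
check_picrust_inputs() {
    local dir="$1" f status=0
    for f in feature-table.biom dna-sequences.fasta; do
        if [ ! -s "$dir/$f" ]; then
            echo "Missing or empty: $dir/$f" >&2
            status=1
        fi
    done
    if [ "$status" -eq 0 ]; then
        # Each FASTA record starts with a ">" header line.
        echo "sequences: $(grep -c '^>' "$dir/dna-sequences.fasta")"
    fi
    return "$status"
}

# Example:
# check_picrust_inputs exports_for_picrust
```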

PICRUSt2 pipeline

This is the core of PICRUSt2: it runs all of the steps and outputs the predicted enzyme (EC) and KO metagenomes and the MetaCyc pathway abundances (MetaCyc is the default pathway database). It is run with PICRUSt2 installed in a conda environment, using the default nearest-sequenced taxon index (NSTI) cutoff.

picrust2_pipeline.py -s dna-sequences.fasta \
  --stratified -i feature-table.biom -o picrust2_out_pipeline -p 4
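
Because the pipeline relies on an NSTI cutoff, it can be useful to see how many ASVs have a high NSTI value. The sketch below counts rows of a tab-separated table whose value in a named column exceeds a threshold. The helper count_above_threshold is hypothetical, and the file name, the column name metadata_NSTI and the threshold of 2 in the example are assumptions that should be checked against the actual output folder and PICRUSt2 version.

```shell
#!/usr/bin/env bash
# count_above_threshold is a hypothetical helper for illustration only.
# It reads a TSV with a header row on stdin and counts the rows whose value
# in the named column is greater than the threshold.
# Usage: count_above_threshold COLUMN_NAME THRESHOLD < table.tsv
count_above_threshold() {
    local column="$1" threshold="$2"
    awk -F'\t' -v column="$column" -v threshold="$threshold" '
        NR == 1 {
            # Locate the requested column by its header name.
            for (i = 1; i <= NF; i++) if ($i == column) col = i
            if (!col) { print "column not found: " column > "/dev/stderr"; exit 1 }
            next
        }
        ($col + 0) > (threshold + 0) { n++ }
        END { if (col) print n + 0 }
    '
}

# Example, assuming the NSTI table and its column are named as below
# (check both against your own output folder):
# gzip -dc picrust2_out_pipeline/marker_predicted_and_nsti.tsv.gz |
#     count_above_threshold metadata_NSTI 2
```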

KEGG pathways

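To get pathway abundances from the KO predictions instead of the EC predictions, run pathway_pipeline.py on the KO metagenome table with the KEGG mapfile. The input is the KO abundance table written by the pipeline (here the unstratified table; newer PICRUSt2 releases write it gzipped as .tsv.gz), not a .qza artifact. Be aware that the KEGG mapping dates from 2011.

pathway_pipeline.py -i KO_metagenome_out/pred_metagenome_unstrat.tsv -o picrust \
  --no_regroup \
  --map ~/miniconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/pathway_mapfiles/KEGG_pathways_to_KO.tsv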

## Or use the stratified KO file; the same caveat about the 2011 mapping applies (see the wiki links above)
pathway_pipeline.py -i KO_metagenome_out/pred_metagenome_contrib.tsv -o KEGG_pathways_out \
  --no_regroup \
  --map ~/miniconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/pathway_mapfiles/KEGG_pathways_to_KO.tsv
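
The mapfile path above contains the Python version of the conda environment (python3.6), so it will differ between installations. As a hedged sketch, the file can be located by searching the environment directory instead of hard-coding the path; find_kegg_mapfile is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# find_kegg_mapfile is a hypothetical helper, not part of PICRUSt2.
# It searches a conda environment directory for the KEGG mapfile and prints
# the first match, so the Python version does not need to be hard-coded.
find_kegg_mapfile() {
    local env_dir="$1" hit
    hit=$(find "$env_dir" -type f -name 'KEGG_pathways_to_KO.tsv' 2>/dev/null | head -n 1)
    if [ -z "$hit" ]; then
        echo "KEGG_pathways_to_KO.tsv not found under $env_dir" >&2
        return 1
    fi
    echo "$hit"
}

# Example (environment location as used in the commands above):
# MAPFILE=$(find_kegg_mapfile ~/miniconda3/envs/picrust2)
# pathway_pipeline.py -i KO_metagenome_out/pred_metagenome_contrib.tsv \
#   -o KEGG_pathways_out --no_regroup --map "$MAPFILE"
```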

For more information, see the PICRUSt2 wiki links above.