Pf
Tuesday 16 March 2021
Recipe for the PF MN CSCS calculations
First, it appears that conda is not available.
This answer (https://askubuntu.com/a/1062621) solved the problem:
export PATH=~/anaconda3/bin:$PATH
Need to reinstall the latest QIIME version:
sudo wget https://data.qiime2.org/distro/core/qiime2-2021.2-py36-linux-conda.yml
source activate qiime2-2021.2
wget / curl are not working (most likely a proxy issue on x2go)
We use peak_list_formatter_cscs.py to fetch the GNPS job and properly format the quantification tables
biom convert -i for_biom.tsv -o for_biom.biom --table-type="OTU table" --to-hdf5
Then convert the .biom feature table to a .qza feature table:
qiime tools import --type 'FeatureTable[Frequency]' --input-path for_biom.biom --output-path feature_table.qza
To compute the chemical structural and compositional dissimilarity metric for all pairs of samples in your feature table type:
nohup time qiime cscs cscs --p-css-edges networkedges_selfloop --i-features quantification_table --p-cosine-threshold 0.7 --p-normalization --p-cpus 40 --o-distance-matrix cscs_distance_matrix.qza
time qiime cscs cscs --p-css-edges networkedges_selfloop/31a1340378cd46d7b9f5ebf8afbb2565..selfloop --i-features feature_table.qza --p-cosine-threshold 0.7 --p-normalization --p-cpus 40 --o-distance-matrix cscs_distance_matrix.qza
nohup bash -c 'time qiime cscs cscs --p-css-edges networkedges_selfloop/31a1340378cd46d7b9f5ebf8afbb2565..selfloop --i-features feature_table.qza --p-cosine-threshold 0.7 --p-normalization --p-cpus 40 --o-distance-matrix cscs_distance_matrix.qza'
nohup bash -c 'time qiime cscs cscs --p-css-edges networkedges_selfloop/31a1340378cd46d7b9f5ebf8afbb2565..selfloop --i-features feature_table.qza --p-cosine-threshold 0.7 --p-normalization --p-no-weighted --p-cpus 40 --o-distance-matrix cscs_distance_matrix_unweighed.qza'
nohup bash -c 'time qiime cscs cscs --p-css-edges networkedges_selfloop/3466497461974198a9ab8c9463d05b53..selfloop --i-features feature_table.qza --p-cosine-threshold 0.7 --p-normalization --p-no-weighted --p-cpus 40 --o-distance-matrix cscs_distance_matrix_unweighed.qza'
nohup bash -c 'time qiime cscs cscs --p-css-edges networkedges_selfloop/3466497461974198a9ab8c9463d05b53..selfloop --i-features feature_table.qza --p-cosine-threshold 0.7 --p-normalization --p-cpus 40 --o-distance-matrix cscs_distance_matrix.qza'
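The weighted and unweighted runs above differ only in one flag and in the output name, so the argument list can be generated instead of copied by hand. The following is a minimal sketch: the helper name build_cscs_command is hypothetical, and the flags, paths and output names are taken from the commands above. It only prints the commands; passing the list to subprocess.run would be one way to launch them.

```python
import shlex


def build_cscs_command(edges_path, features_path, weighted, output_path,
                       cosine_threshold=0.7, cpus=40):
    """Return the argument list for one `qiime cscs cscs` run.

    Hypothetical helper: it only assembles the flags already used in these notes.
    """
    return [
        "qiime", "cscs", "cscs",
        "--p-css-edges", edges_path,
        "--i-features", features_path,
        "--p-cosine-threshold", str(cosine_threshold),
        "--p-normalization",
        # The only difference between the two runs.
        "--p-weighted" if weighted else "--p-no-weighted",
        "--p-cpus", str(cpus),
        "--o-distance-matrix", output_path,
    ]


edges = "networkedges_selfloop/3466497461974198a9ab8c9463d05b53..selfloop"
runs = [
    (True, "cscs_distance_matrix.qza"),
    (False, "cscs_distance_matrix_unweighed.qza"),
]
for weighted, output in runs:
    command = build_cscs_command(edges, "feature_table.qza", weighted, output)
    # Print a shell-quoted version for inspection before launching anything.
    print(shlex.join(command))
```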
Visualize the chemical structural and compositional dissimilarity in interactive PCoA space
To create PCoAs from the chemical structural and compositional dissimilarity matrix type:
qiime diversity pcoa --i-distance-matrix cscs_distance_matrix.qza --o-pcoa cscs_PCoA.qza
To create an interactive ordination plot of the above created PCoA with integrated sample metadata, prepare a metadata file. You can find a metadata file for this example dataset within the Example folder. Make sure that the Sample IDs provided in the metadata file correspond to the Sample IDs in your distance_matrix.qza file. Then type:
Uploading the previously generated metadata:
rsync -rvz -e 'ssh' --progress ./PF_metadata_qiime.tsv allardp@x2go.epgl-geneve.org:
qiime emperor plot --i-pcoa cscs_PCoA.qza --m-metadata-file PF_metadata_qiime.tsv --o-visualization cscs_PCoA.qzv
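A mismatch between the Sample IDs of the metadata file and those of the distance matrix is easy to miss, so it can be worth checking before plotting. The sketch below assumes that the distance matrix has already been exported to TSV (first column holding the sample IDs, first row being the header) and that the metadata TSV has its IDs in the first column. The function names are hypothetical.

```python
import csv


def read_first_column(path):
    """Return the first-column values of a TSV file, skipping the header row.

    Rows starting with '#' after the header (for example a '#q2:types' line)
    are ignored.
    """
    with open(path, newline="") as handle:
        rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    return [row[0] for row in rows[1:] if not row[0].startswith("#")]


def compare_sample_ids(metadata_path, distance_matrix_path):
    """Report the IDs that are present in one file but not in the other."""
    metadata_ids = set(read_first_column(metadata_path))
    matrix_ids = set(read_first_column(distance_matrix_path))
    return {
        "missing_from_metadata": sorted(matrix_ids - metadata_ids),
        "missing_from_matrix": sorted(metadata_ids - matrix_ids),
    }
```

Two empty lists mean that both files describe the same set of samples.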
To visualize the interactive PCoA type:
qiime tools view cscs_PCoA.qzv
--p-normalization / --p-no-normalization  Perform Total Ion Current Normalization (TIC) on the feature table [default: True]
--p-weighted / --p-no-weighted  Weight CSCS by feature intensity [default: True]
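To make the two options concrete, here is a toy illustration of what they do to a feature table, assuming that TIC normalization divides each intensity by the total of its sample and that the unweighted mode reduces intensities to presence/absence. It uses plain dictionaries rather than the plugin's own data structures, and the function names are hypothetical.

```python
def tic_normalize(table):
    """Divide every intensity by its sample total so that each sample sums to 1."""
    normalized = {}
    for sample, intensities in table.items():
        total = sum(intensities.values())
        normalized[sample] = {
            feature: (value / total if total else 0.0)
            for feature, value in intensities.items()
        }
    return normalized


def presence_absence(table):
    """Replace every non-zero intensity by 1.0 and every zero by 0.0."""
    return {
        sample: {feature: (1.0 if value > 0 else 0.0)
                 for feature, value in intensities.items()}
        for sample, intensities in table.items()
    }


toy_table = {
    "sample_A": {"feature_1": 30.0, "feature_2": 10.0},
    "sample_B": {"feature_1": 0.0, "feature_2": 5.0},
}
normalized = tic_normalize(toy_table)      # --p-normalization
unweighted = presence_absence(normalized)  # --p-no-weighted
```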
Qemistree dataset
https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=044e981ff0d84246ae5c91ef3db643a8
Update
Monday 29 November 2021
Reiterating qiime-cscs calculations for the MEMO paper. Installing qiime on the beast cluster
When trying to install with the latest QIIME version (qiime2-2021.11), the qiime2-cscs plugin (https://github.com/madeleineernst/q2-cscs) does not appear. I thus switched back to qiime2-2021.2
I used this script to fetch and format data from switch https://github.com/mandelbrot-project/qiime-empress-formatter/blob/main/src/python/peak_list_formatter_cscs_memo.py
From here on, restarting with the recipe above (will modify or update when needed)
We then convert using biom
(Note that BIOM needs to be installed in an env with conda > 3.8)
biom convert -i feature_table_for_biom.tsv -o biom_feature_table.biom --table-type="OTU table" --to-hdf5
I get the following error:
'biom.exception.TableException: Duplicate observation IDs'
Apparently this error arose because the feature IDs were kept as the index but not exported to the TSV table. Make sure they are exported.
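A quick way to catch this before running biom convert is to look for repeated values in the first column of the TSV. The sketch below assumes a tab-separated file with one header row and the feature IDs expected in the first column; the function name is hypothetical. If the ID column was not exported, the first column holds intensities instead, and repeated values such as 0.0 show up as duplicates.

```python
import csv
from collections import Counter


def find_duplicate_ids(tsv_path):
    """Return the first-column values that occur more than once (header skipped)."""
    with open(tsv_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # skip the header row
        counts = Counter(row[0] for row in reader if row)
    return sorted(value for value, count in counts.items() if count > 1)
```

An empty list means that the first column is usable as a set of unique observation IDs.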
I then activate the qiime2-2021.2 env
qiime tools import --type 'FeatureTable[Frequency]' --input-path biom_feature_table.biom --output-path feature_table.qza
We now launch the command and explicitly specify all options
nohup bash -c 'time qiime cscs cscs --p-css-edges networkedges_selfloop/31a1340378cd46d7b9f5ebf8afbb2565..selfloop --i-features feature_table.qza --p-cosine-threshold 0.7 --p-normalization --p-weighted --p-cpus 40 --o-distance-matrix pf_gnps3_cscs_distance_matrix_weighted.qza'
nohup bash -c 'time qiime cscs cscs --p-css-edges networkedges_selfloop/31a1340378cd46d7b9f5ebf8afbb2565..selfloop --i-features feature_table.qza --p-cosine-threshold 0.7 --p-normalization --p-no-weighted --p-cpus 40 --o-distance-matrix pf_gnps3_cscs_distance_matrix_unweighted.qza'
Here I get an error
Plugin error from cscs:
'function' object has no attribute 'ids'
Debug info has been saved to /tmp/qiime2-q2cli-err-d8fmgz9a.log
real 0m23.731s
user 0m8.089s
sys 0m3.754s
This was fixed by editing the file
nano ./anaconda3/envs/qiime2-2021.2/lib/python3.6/site-packages/q2_cscs/q2_cscs.py
and adding () to pa:
if normalization:
    features = features.norm(axis='sample', inplace=False)
if weighted == False:
    features = features.pa()
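The reason the missing parentheses produce this kind of error can be shown with a toy class: without (), the variable holds the method itself rather than the table it would return, so any later attribute lookup such as .ids fails. This is only an illustration with a hypothetical ToyTable class, not the biom implementation, and the exact wording of the message may differ from the one in the log above.

```python
class ToyTable:
    """Hypothetical stand-in for a feature table, for illustration only."""

    def __init__(self, sample_ids):
        self._sample_ids = list(sample_ids)

    def ids(self):
        return list(self._sample_ids)

    def pa(self):
        # Returns a new table, as a presence/absence conversion would.
        return ToyTable(self._sample_ids)


def try_ids(obj):
    """Call obj.ids() and return either the IDs or the error text."""
    try:
        return obj.ids()
    except AttributeError as error:
        return f"AttributeError: {error}"


table = ToyTable(["s1", "s2"])
not_called = table.pa   # missing (): this is the method object, not a table
called = table.pa()     # with (): this is a table

print(try_ids(called))
print(try_ids(not_called))
```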
To get the tsv
The resulting qza artifacts can then be renamed to .zip and extracted to fetch the corresponding distance matrix in tsv format.
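Since a .qza artifact is a zip archive, the extraction can also be scripted without renaming the file. The sketch below assumes that the distance matrix is stored inside the archive under a path ending in data/distance-matrix.tsv (an assumption about the artifact layout, to be checked against the actual archive content); the function name is hypothetical.

```python
import zipfile
from pathlib import Path


def extract_distance_matrix(qza_path, output_tsv):
    """Copy the distance matrix TSV out of a .qza archive.

    Assumes exactly one archive member ends with 'data/distance-matrix.tsv'.
    """
    with zipfile.ZipFile(qza_path) as archive:
        matches = [name for name in archive.namelist()
                   if name.endswith("data/distance-matrix.tsv")]
        if len(matches) != 1:
            raise ValueError(
                f"expected one distance matrix in {qza_path}, found {len(matches)}"
            )
        output = Path(output_tsv)
        output.write_bytes(archive.read(matches[0]))
    return output
```

For example, extract_distance_matrix("cscs_distance_matrix.qza", "cscs_distance_matrix.tsv") would write the TSV next to the artifact.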
qiime diversity pcoa --i-distance-matrix qeemistree_set_cscs_distance_matrix_norm_weighted.qza --o-pcoa PCOA_qeemistree_set_cscs_distance_matrix_norm_weighted.qza
qiime emperor plot --i-pcoa PCOA_qeemistree_set_cscs_distance_matrix_norm_weighted.qza --m-metadata-file metadata_table/metadata_table-00000.tsv --p-ignore-missing-samples --o-visualization PCOA_qeemistree_set_cscs_distance_matrix_norm_weighted_viz.qzv
qiime emperor plot --i-pcoa PCOA_qeemistree_set_cscs_distance_matrix_norm_unweighted.qza --m-metadata-file metadata_table-00000.tsv --p-ignore-missing-samples --o-visualization PCOA_qeemistree_set_cscs_distance_matrix_norm_unweighted_viz.qzv
qiime emperor plot --i-pcoa PCOA_qeemistree_set_cscs_distance_matrix_nonorm_weighted.qza --m-metadata-file metadata_table-00000.tsv --p-ignore-missing-samples --o-visualization PCOA_qeemistree_set_cscs_distance_matrix_nonorm_weighted_viz.qzv
qiime emperor plot --i-pcoa PCOA_qeemistree_set_cscs_distance_matrix_nonorm_unweighted.qza --m-metadata-file metadata_table-00000.tsv --p-ignore-missing-samples --o-visualization PCOA_qeemistree_set_cscs_distance_matrix_nonorm_unweighted_viz.qzv