Last edited by João Sequeira Mar 01, 2021

Partial runs

You may not want to run the entire MOSCA workflow. Below are examples of tasks that are better handled by running parts of MOSCA separately. The commands assume you have installed MOSCA as instructed.

Preprocess NGS reads

MOSCA's preprocessing script can be used standalone, as it automatically downloads all resources required.

python ~/anaconda3/envs/mosca/share/MOSCA/scripts/preprocess.py -i {your input reads (e.g. mg_R1.fq,mg_R2.fq)} -t {number of threads} -o {output directory} -adaptdir {resources directory}/adapters -rrnadbs {resources directory}/rRNA_databases -d {data_type (either "dna" or "mrna")} -rd {resources directory} -n --minlen {minimum length of reads to keep} --avgqual {minimum average quality of reads to keep}
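
As an illustration, the placeholders in the command above could be filled in as in the following dry run, which only prints the resulting command. Every value here (file names, directory names, thread count, length and quality thresholds) is a hypothetical choice made for the example, not a recommended setting; the flags themselves are copied unchanged from the command above.

```shell
#!/bin/sh
# Dry run: build the preprocess.py command from placeholder values and print it.
# All values below are hypothetical; replace them with your own.
MOSCA_SCRIPTS="$HOME/anaconda3/envs/mosca/share/MOSCA/scripts"
READS="mg_R1.fq,mg_R2.fq"     # paired-end input, comma-separated
THREADS=8
OUT_DIR="mosca_output"        # hypothetical output directory
RESOURCES="mosca_resources"   # hypothetical resources directory
DATA_TYPE="dna"               # either "dna" or "mrna"
MIN_LEN=100                   # hypothetical minimum read length
AVG_QUAL=20                   # hypothetical minimum average quality

# Flags are copied as-is from the command shown above.
PREPROCESS_CMD="python $MOSCA_SCRIPTS/preprocess.py -i $READS -t $THREADS -o $OUT_DIR -adaptdir $RESOURCES/adapters -rrnadbs $RESOURCES/rRNA_databases -d $DATA_TYPE -rd $RESOURCES -n --minlen $MIN_LEN --avgqual $AVG_QUAL"

# Print the command so it can be checked before it is run.
echo "$PREPROCESS_CMD"
```

Once the printed line looks right, it can be pasted into a terminal where the mosca environment is active.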

Run MOSCA without replicates

MOSCA's differential expression analysis module requires replicates. Analysis without replicates is still possible by bypassing that module and running the remaining steps individually:

  1. First, preprocess your datasets as explained above.
  2. Join your reads by sample by running, for each pair of "forward" and "reverse" files, the following commands:
cat {forward_file} >> {output}/Preprocess/{sample}_forward.fastq
cat {reverse_file} >> {output}/Preprocess/{sample}_reverse.fastq
  3. Perform assembly by running the following for each sample:
python ~/anaconda3/envs/mosca/share/MOSCA/scripts/assembly.py -r {output}/Preprocess/{sample}_forward.fastq,{output}/Preprocess/{sample}_reverse.fastq -t {threads} -o {output}/Assembly/{sample} -a {assembler (either "metaspades" or "megahit")} -m {max_memory}
  4. Optionally, perform binning by running the following for each sample:
python ~/anaconda3/envs/mosca/share/MOSCA/scripts/binning.py -c {output}/Assembly/{sample}/contigs.fasta -t {threads} -o {output}/Binning/{sample} -r {output}/Preprocess/{sample}_forward.fastq,{output}/Preprocess/{sample}_reverse.fastq -mset {markerset (either "107" or "40")}
  5. Perform gene calling and annotation over the contigs by running the following for each sample:
python ~/anaconda3/envs/mosca/share/MOSCA/scripts/annotation.py -i {output}/Assembly/{sample}/contigs.fasta -t {threads} -o {output}/Annotation/{sample} -em {error_model} -db {path/to/diamond_database.(fasta/dmnd)} -mts {diamond_max_target_seqs} --assembled
  6. Run UPIMAPI for each sample:
upimapi.py -i {output}/Annotation/{sample}/aligned.blast -o {output}/Annotation/uniprotinfo --blast --full-id
  7. Run reCOGnizer for each sample:
recognizer.py -f {output}/Annotation/{sample}/fgs.faa -t {threads} -o {output}/Annotation/{sample} -rd {path/to/resources_directory} --remove-spaces
  8. Run quantification once, for all samples together:
python ~/anaconda3/envs/mosca/share/MOSCA/scripts/quantification_analyser.py -e {path/to/experiments_file} -t {threads} -o {output} -if {input_format_of_experiments_file ("excel" or "tsv")}
  9. Join all information:
python ~/anaconda3/envs/mosca/share/MOSCA/scripts/join_information.py -e {path/to/experiments_file} -t {threads} -o {output} -if {input_format_of_experiments_file ("excel" or "tsv")} -nm {normalization_method ("TMM" or "RLE")}
  10. Run KEGGCharter:
kegg_charter.py -f {output}/MOSCA_Entry_Report.xlsx -o {output}/KEGG_maps -mm {metabolic_maps comma-separated (e.g. 00030,00680,...)} -gcol {mg_names comma-separated} -tcol {mt_names comma-separated} -tc 'Taxonomic lineage ({taxa_level})' -not {number_of_taxa} -keggc 'Cross-reference (KEGG)'
  11. Run final reporting:
python ~/anaconda3/envs/mosca/share/MOSCA/scripts/report.py -e {path/to/experiments_file} -o {output} -ldir ~/anaconda3/envs/mosca/share/MOSCA/resources -if {input_format_of_experiments_file ("excel" or "tsv")}
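
When a sample is spread over several read files, the read-joining step of the list above can be wrapped in a small helper. The following is a minimal sketch, not part of MOSCA: the function name `join_reads` and its argument layout are assumptions made for this example. Forward reads are written to `{sample}_forward.fastq` and reverse reads to `{sample}_reverse.fastq`, the names the assembly command expects.

```shell
#!/bin/sh
# join_reads SAMPLE OUTPUT "FORWARD_FILES" "REVERSE_FILES"
# Hypothetical helper (not shipped with MOSCA). File lists are space-separated,
# so paths containing spaces are not supported by this sketch.
join_reads() {
    jr_sample=$1
    jr_output=$2
    jr_forward=$3
    jr_reverse=$4

    mkdir -p "$jr_output/Preprocess"

    # Start from empty files so that re-running does not duplicate reads.
    : > "$jr_output/Preprocess/${jr_sample}_forward.fastq"
    : > "$jr_output/Preprocess/${jr_sample}_reverse.fastq"

    # Append every forward file, in the order given.
    for jr_file in $jr_forward; do
        cat "$jr_file" >> "$jr_output/Preprocess/${jr_sample}_forward.fastq"
    done

    # Append every reverse file, in the same order as the forward files.
    for jr_file in $jr_reverse; do
        cat "$jr_file" >> "$jr_output/Preprocess/${jr_sample}_reverse.fastq"
    done
}

# Example call with hypothetical file names:
# join_reads sample1 mosca_output "a_R1.fq b_R1.fq" "a_R2.fq b_R2.fq"
```

Unlike the plain `cat ... >>` commands, the helper empties the two target files first; drop those two lines if you prefer to append to files that already exist. Forward and reverse files should be listed in the same order so that read pairs stay aligned.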