João Sequeira · dbea63c6
--- a/Parameters-of-MOSCA.md
+++ b/Parameters-of-MOSCA.md
+## Base arguments for running MOSCA
+
+MOSCA reads its input from a config file in either JSON or YAML format.
+This repo includes a [config file](https://github.com/iquasere/MOSCA/blob/development/config/config.json),
+which can be passed to MOSCA as follows:
+```
+python mosca.py --configfile config.json
+```
+The config file allows customizing MOSCA's workflow but, for the convenience of users, many typical decisions in metagenomics (MG) and
+metatranscriptomics (MT) workflows are already automated. Customization is therefore limited to steps that are not yet well established
+in the field of MG (e.g. assembling data into contigs is still a controversial step that may lose information).
+
+The options available in the config file, and their accepted values, are listed below:
+
+|            Parameter           |                            Options                           | Required | Description                                                                                                                                                                                                                   |
+|:------------------------------:|:------------------------------------------------------------:|:--------:|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+|             output             |                            String                            |    Yes   | Name of folder where MOSCA's results will be stored (if it doesn't exist, it will be created)                                                                                                                                 |
+|             threads            |                              Int                             |    Yes   | Number of maximum threads for MOSCA to use                                                                                                                                                                                    |
+|           experiments          |                            String                            |    Yes   | Name of TSV file with information on samples/files/conditions                                                                                                                                                                 |
+| trimmomatic_adapters_directory |                            String                            |    Yes   | Name of folder containing adapters for Trimmomatic's ADAPTER REMOVAL preprocessing tool                                                                                                                                       |
+|    rrna_databases_directory    |                            String                            |    Yes   | Name of folder containing rRNA databases to use as reference for rRNA removal with SortMeRNA                                                                                                                                  |
+|            assembler           |                      metaspades, megahit                     |    Yes   | Name of assembler to use for iterative co-assembly of MG data                                                                                                                                                                 |
+|            markerset           |                            40, 107                           |    Yes   | Name of markerset to use for completeness/contamination estimation with CheckM over the contigs obtained with MaxBin2                                                                                                         |
+|           error_model          | sanger_5, sanger_10, 454_10, 454_30, illumina_5, illumina_10 |    No    | Name of file to use as the error model for gene calling with FragGeneScan. Use sanger, 454 or illumina for Sanger, pyrosequencing or Illumina reads, respectively. Leave empty if assembly was performed.                     |
+|        diamond_database        |                            String                            |    Yes   | Name of FASTA or DMND (DIAMOND formatted database) file to use as input for annotation with DIAMOND                                                                                                                           |
+|        download_uniprot        |                          TRUE, FALSE                         |    Yes   | Whether to download UniProtKB (SwissProt + TrEMBL). If TRUE, it is downloaded to the folder indicated in diamond_database                                                                                                     |
+|     diamond_max_target_seqs    |                              Int                             |    Yes   | Number of matches to report for each protein from annotation with DIAMOND                                                                                                                                                     |
+| recognizer_databases_directory |                            String                            |    Yes   | Name of folder containing the resources for reCOGnizer annotation. If those are not present in the folder, they will be downloaded                                                                                            |
+|      normalization_method      |                           TMM, RLE                           |    Yes   | Method to use for normalization                                                                                                                                                                                               |
+|        keggcharter_maps        |            Comma-separated list of KEGG maps' IDs            |    No    | If empty, KEGGCharter will use the default prokaryotic maps. These metabolic maps will have MG information represented in them, and gene expression if MT data is available                                                   |
+|     keggcharter_taxa_level     |  SPECIES, GENUS, FAMILY, ORDER, CLASS, PHYLUM, SUPERKINGDOM  |    Yes   | The taxonomic level to represent with KEGGCharter. If above SPECIES, KEGGCharter will represent group information and represent it as such for each taxonomic group                                                           |
+|   keggcharter_number_of_taxa   |                     Int, ideally under 11                    |    Yes   | How many of the most abundant taxa should be represented with KEGGCharter                                                                                                                                                     |
+|    reporter_lists_directory    |                            String                            |    Yes   | Name of folder containing lists for reporter module of MOSCA                                                                                                                                                                  |
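+
+To make the table above more concrete, here is a minimal sketch of what a YAML config might look like. The keys are the parameter names from the table; every path and number is a hypothetical placeholder, and writing "empty" as an empty string is an assumption. The linked config file remains the reference for the exact layout MOSCA expects.
+
+```yaml
+# Hypothetical example values: adjust every path and number to your own setup.
+output: mosca_output
+threads: 8
+experiments: experiments.tsv
+trimmomatic_adapters_directory: resources/adapters
+rrna_databases_directory: resources/rRNA_databases
+assembler: metaspades
+markerset: 40
+error_model: ""              # left empty, assuming assembly is performed
+diamond_database: resources/uniprot.fasta
+download_uniprot: FALSE
+diamond_max_target_seqs: 1
+recognizer_databases_directory: resources/recognizer
+normalization_method: TMM
+keggcharter_maps: ""         # empty: KEGGCharter uses its default prokaryotic maps
+keggcharter_taxa_level: GENUS
+keggcharter_number_of_taxa: 10
+reporter_lists_directory: resources/reporter_lists
+```
+
+Saved under a name such as `config.yaml` (a hypothetical file name), this would presumably be passed to `--configfile` in the same way as the JSON file.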