Commit f8ea1354 authored by Vítor Pereira

UPDATE

parent 507a116c
.ipynb_checkpoints
\ No newline at end of file
.ipynb_checkpoints
logs
\ No newline at end of file
%% Cell type:markdown id: tags:
# Strain Optimization in MEWpy
This notebook exemplifies how MEWpy may be used in strain optimization tasks.
Our goal is to increase the production of succinate in E. coli under anaerobic conditions.
%% Cell type:code id: tags:
``` python
from cobra.io.sbml import read_sbml_model
model = read_sbml_model("data/e_coli_core.xml.gz")
```
%% Cell type:markdown id: tags:
## Optimization problem
The optimization problem requires the definition of one or more objective functions.
In the next example, the objectives are the maximization of the biomass-product coupled yield (BPCY) and the maximization of the flux of the targeted product. MEWpy offers other optimization objectives; please refer to the documentation.
%% Cell type:code id: tags:
``` python
from mewpy.optimization.evaluation import BPCY, TargetFlux
objs = [
    TargetFlux("EX_succ_e"),
    BPCY('BIOMASS_Ecoli_core_w_GAM', "EX_succ_e")
]
```
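To make the second objective concrete, the sketch below computes a biomass-product coupled yield from a dictionary of fluxes, using the common definition (biomass flux times product flux, divided by substrate uptake). The function name `bpcy`, the `uptake_id` argument, and the flux values are illustrative assumptions; MEWpy's own `BPCY` evaluator may normalize differently.

```python
def bpcy(fluxes, biomass_id, product_id, uptake_id):
    """Biomass-product coupled yield: (biomass * product) / substrate uptake."""
    uptake = abs(fluxes[uptake_id])
    if uptake == 0:
        return 0.0  # no substrate consumed, so no yield can be attributed
    return fluxes[biomass_id] * fluxes[product_id] / uptake


# Hypothetical flux values, for illustration only
example_fluxes = {
    "BIOMASS_Ecoli_core_w_GAM": 0.2,
    "EX_succ_e": 5.0,
    "EX_glc__D_e": -10.0,  # negative exchange flux means uptake
}
score = bpcy(example_fluxes, "BIOMASS_Ecoli_core_w_GAM", "EX_succ_e", "EX_glc__D_e")
```

A solution with high product flux but no growth scores zero, which is why this objective favors growth-coupled designs.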
%% Cell type:markdown id: tags:
We can now define the growth medium and select the type of modification.
We will begin by performing reaction deletion (RKO) by selecting the appropriate problem instance, RKOProblem.
%% Cell type:code id: tags:
``` python
from mewpy.problems import RKOProblem
anaerobic = {'EX_o2_e': (0, 0)}
problem = RKOProblem(model, fevaluation=objs, envcond=anaerobic)
```
%% Cell type:markdown id: tags:
Other optimization strategies may be used:
* Reaction under or over expression: *ROUProblem*
* Gene deletion: *GKOProblem*
* Gene under or over expression: *GOUProblem*
A problem may include other parameters, such as the maximum number of genetic modifications, a list of specific targets, or a list of non-targets.
%% Cell type:markdown id: tags:
## Optimization algorithm
An optimization engine needs to be instantiated to solve the problem. MEWpy uses Evolutionary Algorithms (EAs) for this task.
EAs are algorithms that mimic the Darwinian evolutionary process, where a population of solutions evolves generation after generation. In this example, we set a maximum of 20 generations.
%% Cell type:code id: tags:
``` python
from mewpy.optimization import EA
ea = EA(problem, max_generations=20)
```
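The generational loop described above can be sketched in plain Python. This is a toy, single-objective, elitist EA over sets of reaction identifiers, not the multi-objective algorithms MEWpy actually runs; `evolve`, `toy_fitness`, the `BENEFICIAL` set, and all parameter names are hypothetical and exist only for illustration.

```python
import random


def evolve(candidates, fitness, max_size=3, pop_size=20, max_generations=20, seed=0):
    """Toy elitist EA over sets of reaction identifiers (illustration only)."""
    rng = random.Random(seed)

    def rank(solution):
        # Order by fitness, breaking ties deterministically
        return (fitness(solution), sorted(solution))

    def mutate(solution):
        genes = set(solution)
        if len(genes) > 1 and rng.random() < 0.5:
            genes.remove(rng.choice(sorted(genes)))  # drop one modification
        elif len(genes) < max_size:
            genes.add(rng.choice(candidates))        # add one modification
        return frozenset(genes)

    # Initial population: random sets of 1..max_size candidate reactions
    population = [frozenset(rng.sample(candidates, rng.randint(1, max_size)))
                  for _ in range(pop_size)]
    for _ in range(max_generations):
        offspring = [mutate(s) for s in population]
        # Elitist survivor selection: keep the best of parents and offspring
        population = sorted(set(population + offspring), key=rank, reverse=True)[:pop_size]
    return max(population, key=rank)


# Hypothetical toy fitness: reward deletions from an arbitrary "beneficial" set
BENEFICIAL = {"GND", "PYK", "ME2"}


def toy_fitness(solution):
    return len(solution & BENEFICIAL) - 0.1 * len(solution - BENEFICIAL)


reactions = ["GND", "PYK", "ME2", "PFK", "TPI", "ENO", "PGI", "FBA"]
best = evolve(reactions, toy_fitness)
```

Because survivor selection keeps the best of parents and offspring, the best fitness in the population never decreases from one generation to the next.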
%% Cell type:markdown id: tags:
The EA accepts additional parameters, such as the choice of a specific algorithm (e.g. SPEA2, NSGAII, NSGAIII, GDE3, etc.), multiprocessing options, initial seeding, etc.
To start the optimization process, invoke the run method:
%% Cell type:code id: tags:
``` python
solutions = ea.run()
```
%% Cell type:markdown id: tags:
We can now list the set of solutions:
%% Cell type:code id: tags:
``` python
df=ea.dataframe()
df
```
%% Cell type:markdown id: tags:
Or view the best solutions in the objective space, the so-called Pareto front:
%% Cell type:code id: tags:
``` python
ea.plot()
```
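For readers unfamiliar with the term, a Pareto front is the set of solutions that no other solution beats on every objective at once. The sketch below filters a list of objective tuples down to that set, assuming all objectives are maximized; the function names and the scores are illustrative and are not part of MEWpy.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates (maximization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (target flux, BPCY) pairs
scores = [(1.0, 5.0), (2.0, 4.0), (3.0, 3.0), (2.0, 2.0), (1.0, 1.0)]
front = pareto_front(scores)
```

Here `(2.0, 2.0)` and `(1.0, 1.0)` are dropped because `(2.0, 4.0)` is at least as good on both objectives.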
%% Cell type:markdown id: tags:
We can save the solutions to file:
%% Cell type:code id: tags:
``` python
df.to_csv('solutions.csv')
```
%% Cell type:markdown id: tags:
or perform additional analysis on the solutions by retrieving the simulator used during the optimization.
%% Cell type:code id: tags:
``` python
sim = problem.simulator
```
%% Cell type:markdown id: tags:
## Working with solutions
Let us select the first solution:
%% Cell type:code id: tags:
``` python
solution = solutions[0]
solution
```
%% Cell type:markdown id: tags:
The solution is converted into metabolic constraints to be applied to the model. We can access these constraints using *solution.constraints*:
%% Cell type:code id: tags:
``` python
solution.constraints
```
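As a rough illustration of what such a conversion involves for reaction deletions, each deleted reaction can be mapped to a flux fixed at zero. The helper `knockouts_to_constraints` is hypothetical, and the reaction identifiers are arbitrary examples; MEWpy's internal representation may differ.

```python
def knockouts_to_constraints(reaction_ids):
    """Map each deleted reaction to a flux fixed at zero."""
    return {rxn: 0 for rxn in reaction_ids}


# Arbitrary example deletions
constraints = knockouts_to_constraints(["GND", "PYK", "ME2"])
```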
%% Cell type:markdown id: tags:
We may run phenotypic simulations with the solution using those constraints. Note that there is no need to redefine the medium, as the environmental conditions are persistent in the simulator instance.
%% Cell type:code id: tags:
``` python
res=sim.simulate(constraints=solution.constraints,method='pFBA')
res
```
%% Cell type:code id: tags:
``` python
res.dataframe
```
%% Cell type:code id: tags:
``` python
res.find('succ|BIOMASS')
```
%% Cell type:code id: tags:
``` python
%matplotlib inline
from mewpy.visualization.envelope import plot_flux_envelope
plot_flux_envelope(sim,'BIOMASS_Ecoli_core_w_GAM','EX_succ_e',constraints = solution.constraints)
```
%% Cell type:code id: tags:
``` python
```
%% Cell type:markdown id: tags:
# MEWpy Simulation
This notebook exemplifies how MEWpy may be used for phenotype simulation.
%% Cell type:markdown id: tags:
Models can be loaded using MEW, REFRAMED, or COBRApy.
Load a model using REFRAMED:
%% Cell type:code id: tags:
``` python
from reframed.io.sbml import load_cbmodel
model = load_cbmodel('data/e_coli_core.xml.gz')
```
%% Cell type:markdown id: tags:
or using COBRApy:
%% Cell type:code id: tags:
``` python
from cobra.io import read_sbml_model
model = read_sbml_model('data/e_coli_core.xml.gz')
```
%% Cell type:markdown id: tags:
A simulator object provides a common interface to perform the main phenotype analysis tasks. The *get_simulator* function returns a simulator, a wrapper for the provided model. The simulator interface remains the same regardless of whether the model was loaded using REFRAMED or COBRApy. This simplifies the use of both environments and eases the management of future changes and deprecations in their APIs.
%% Cell type:code id: tags:
``` python
from mewpy.simulation import get_simulator
simul = get_simulator(model)
```
%% Cell type:markdown id: tags:
The simulator offers a wide API that enables basic tasks, such as listing metabolites, reactions, genes, compartments, uptake reactions, and transport reactions:
%% Cell type:code id: tags:
``` python
# first 10 metabolites
simul.metabolites[:10]
```
%% Cell type:code id: tags:
``` python
# first 10 reactions
simul.reactions[:10]
```
%% Cell type:code id: tags:
``` python
# first 10 genes
simul.genes[:10]
```
%% Cell type:code id: tags:
``` python
simul.compartments
```
%% Cell type:code id: tags:
``` python
simul.get_uptake_reactions()
```
%% Cell type:code id: tags:
``` python
simul.get_transport_reactions()
```
%% Cell type:markdown id: tags:
A simulator may also be instantiated with environmental conditions that will be applied during phenotype simulations. In the next example, glucose consumption is limited to 10 mmol/gDW/h under anaerobic conditions.
%% Cell type:code id: tags:
``` python
envcond = {'EX_glc__D_e': (-10.0, 100000.0),
'EX_o2_e':(0,0)}
simul = get_simulator(model,envcond=envcond)
```
%% Cell type:markdown id: tags:
All phenotype simulations will consider the imposed environmental conditions, and as such they only need to be set once. Also, these conditions do not persistently alter the model, which can be reused with a different simulator instance.
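One way to picture this non-destructive behavior is as a layering of bounds: the medium overrides the model bounds, and per-simulation constraints override both, all on a copy. The sketch below is a conceptual illustration with a hypothetical `effective_bounds` helper and made-up bounds, not a description of MEWpy's internals.

```python
def effective_bounds(model_bounds, envcond=None, constraints=None):
    """Layer medium and simulation-specific constraints over the model bounds.

    Returns a new dict; the inputs are left untouched.
    """
    merged = dict(model_bounds)
    merged.update(envcond or {})
    merged.update(constraints or {})
    return merged


# Hypothetical bounds for three reactions
model_bounds = {
    "EX_glc__D_e": (-10.0, 1000.0),
    "EX_o2_e": (-1000.0, 1000.0),
    "PYK": (0.0, 1000.0),
}
envcond = {"EX_o2_e": (0, 0)}  # anaerobic medium
bounds = effective_bounds(model_bounds, envcond, {"PYK": (0, 0)})
```

After the call, `model_bounds` still holds the original oxygen bounds, so the same model can be paired with a different medium.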
%% Cell type:markdown id: tags:
## Phenotype simulation
Phenotype simulations are run on the simulator instance with the `simulate` method.
%% Cell type:code id: tags:
``` python
# FBA
result = simul.simulate()
# or
result = simul.simulate(method='FBA')
result
```
%% Cell type:markdown id: tags:
Flux Balance Analysis (FBA) can be run without identifying any method, or by passing 'FBA' as the method parameter. Other phenotype simulation methods may also be run using one of the identifiers:
- Flux Balance Analysis: `method = 'FBA'`
- Parsimonious FBA: `method = 'pFBA'`
- Minimization of Metabolic Adjustment: `method = 'MOMA'`
- Linear MOMA: `method = 'lMOMA'`
- Regulatory on/off minimization of metabolic flux: `method = 'ROOM'`
%% Cell type:code id: tags:
``` python
# pFBA
result = simul.simulate(method = 'pFBA')
result
```
%% Cell type:markdown id: tags:
## Reaction fluxes
The phenotype simulation result object, besides containing the objective value and solver status, also includes the reaction fluxes in the form of a dictionary:
%% Cell type:code id: tags:
``` python
result.fluxes
```
%% Cell type:markdown id: tags:
or as a table (dataframe):
%% Cell type:code id: tags:
``` python
result.dataframe
```
%% Cell type:markdown id: tags:
Individual reaction flux values can be obtained from the dictionary representation. For example, the *triose-phosphate isomerase* (TPI) reaction flux can be obtained from the previous pFBA simulation using the reaction identifier:
%% Cell type:code id: tags:
``` python
result.fluxes['TPI']
```
%% Cell type:markdown id: tags:
To retrieve the net conversion equation, invoke the `get_net_conversion` method. Exchange reactions with no flux are not shown.
%% Cell type:code id: tags:
``` python
result.get_net_conversion()
```
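The idea behind a net conversion can be sketched from exchange fluxes alone: negative fluxes are substrates consumed, positive fluxes are products secreted, and zero fluxes are left out. The helper below is a hypothetical illustration, not MEWpy's implementation; the flux values are made up and the metabolite names are simply derived by stripping an assumed `EX_` prefix.

```python
def net_conversion(exchange_fluxes, tol=1e-9):
    """Build 'substrates --> products' from exchange fluxes, skipping zero fluxes."""
    consumed, produced = [], []
    for rxn, flux in sorted(exchange_fluxes.items()):
        if abs(flux) <= tol:
            continue  # exchange reactions with no flux are not shown
        name = rxn[len("EX_"):] if rxn.startswith("EX_") else rxn
        term = f"{abs(flux):g} {name}"
        (consumed if flux < 0 else produced).append(term)
    return " + ".join(consumed) + " --> " + " + ".join(produced)


# Hypothetical exchange fluxes, not taken from a real simulation
example = {"EX_glc__D_e": -10.0, "EX_o2_e": 0.0, "EX_etoh_e": 8.5, "EX_co2_e": 9.0}
equation = net_conversion(example)
```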
%% Cell type:markdown id: tags:
## Retrieving and setting the model objective
When running FBA or pFBA phenotype simulations, the simulation objective is, by default, the model objective, which can be inspected through the simulator.
%% Cell type:code id: tags:
``` python
simul.objective
```
%% Cell type:markdown id: tags:
The simulator may also be used to change the model objective, for example, to optimize the ATP maintenance requirement (ATPM):
%% Cell type:code id: tags:
``` python
simul.objective = 'ATPM'
```
%% Cell type:code id: tags:
``` python
simul.objective
```
%% Cell type:code id: tags:
``` python
simul.objective='BIOMASS_Ecoli_core_w_GAM'
```
%% Cell type:markdown id: tags:
## Adding additional constraints to phenotype simulations
Simulations may include additional metabolic constraints on reaction fluxes. From the previous pFBA simulation one can observe that the organism does not produce succinate:
%% Cell type:code id: tags:
``` python
result.fluxes['EX_succ_e']
```
%% Cell type:markdown id: tags:
Additional constraints may be added to the model so that the organism starts to produce succinate. We may alter, for example, the *phosphogluconate dehydrogenase* reaction bounds, among others, starting by verifying its initial bounds:
%% Cell type:code id: tags:
``` python
simul.get_reaction_bounds('GND')
```
%% Cell type:code id: tags:
``` python
constraints = {'GND': 0, # deletion
'PYK': 0, # deletion
'ME2': 0, # deletion
}
# run a pFBA simulation taking the new constraints into account
result = simul.simulate(method='pFBA',constraints=constraints)
result.fluxes['EX_succ_e']
```
%% Cell type:markdown id: tags:
Note that the modifications are not persistently applied to the model; they only exist during the simulation.
We also need to verify that the organism continues to grow:
%% Cell type:code id: tags:
``` python
result.fluxes['BIOMASS_Ecoli_core_w_GAM']
```
%% Cell type:markdown id: tags:
We can also plot the production envelope:
%% Cell type:code id: tags:
``` python
%matplotlib inline
from mewpy.visualization.envelope import plot_flux_envelope
plot_flux_envelope(simul,'BIOMASS_Ecoli_core_w_GAM','EX_succ_e',constraints = constraints)
```
%% Cell type:markdown id: tags:
The `simulate` method includes additional parameters, such as the optimization direction. For a full description please refer to the module documentation.
%% Cell type:markdown id: tags:
## Flux Variability Analysis
The simulator interface also allows performing Flux Variability Analysis (FVA), here for succinate:
%% Cell type:code id: tags:
``` python
# returns a dictionary
simul.FVA(reactions=['EX_succ_e'])
```
%% Cell type:code id: tags:
``` python
# or a data frame
simul.FVA(reactions=['EX_succ_e'],format='df')
```
%% Cell type:markdown id: tags:
By default, MEWpy sets the model objective fraction to 90%; however, this fraction may be altered. For example, one might want to require only 10% of the optimal growth:
%% Cell type:code id: tags:
``` python
simul.FVA(reactions=['EX_succ_e'],obj_frac=0.1)
```
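To see why the objective fraction matters, consider a deliberately simplified toy model (an assumption made only for this illustration) in which biomass and product compete linearly for one substrate. Requiring a fraction of the optimal growth reserves part of the substrate, and FVA reports how much room is left for the product. `toy_fva` is a hypothetical helper, not a MEWpy function.

```python
def toy_fva(substrate=10.0, obj_frac=0.9):
    """Product flux range in a toy model where biomass + product <= substrate."""
    max_growth = substrate                 # all substrate routed to biomass
    min_growth = obj_frac * max_growth     # growth required by the objective fraction
    return (0.0, substrate - min_growth)   # what remains is available for product


strict = toy_fva(obj_frac=0.9)   # most substrate is reserved for growth
relaxed = toy_fva(obj_frac=0.1)  # far more substrate may go to product
```

Lowering the fraction widens the reported flux range, because less of the optimum has to be preserved.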
%% Cell type:markdown id: tags:
The FVA simulations are run considering the defined environmental conditions. Additional constraints may be added, or changed, such as the ones previously used to increase succinate production.
%% Cell type:code id: tags:
``` python
simul.FVA(reactions=['EX_succ_e'], constraints=constraints)
```
%% Cell type:markdown id: tags:
COBRApy users may have noticed that this same task would have required many more lines of code if using the COBRApy API directly.
%% Cell type:markdown id: tags:
## Genes and reactions essentiality
Gene and reaction essentiality tests identify, respectively, the list of genes and reactions whose deletion would prevent the organism from growing.
%% Cell type:code id: tags:
``` python
simul.essential_reactions()
```
%% Cell type:code id: tags:
``` python
simul.essential_genes()
```
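Conceptually, a reaction essentiality test deletes one reaction at a time and checks whether growth survives. The sketch below applies that loop to a hypothetical three-reaction toy network; the `essential_reactions` helper, the `toy_growth` function, and the threshold of 1% of wild-type growth are assumptions for illustration and may not match MEWpy's defaults.

```python
def essential_reactions(reactions, growth_fn, min_growth=0.01):
    """Reactions whose single deletion drops growth below a fraction of wild type."""
    wild_type = growth_fn(frozenset())
    return [r for r in reactions
            if growth_fn(frozenset([r])) < min_growth * wild_type]


def toy_growth(knockouts):
    """Toy network: A1 and A2 are redundant routes, B has no alternative."""
    route_a = "A1" not in knockouts or "A2" not in knockouts
    route_b = "B" not in knockouts
    return 1.0 if route_a and route_b else 0.0


essential = essential_reactions(["A1", "A2", "B"], toy_growth)
```

Only `B` is reported: deleting `A1` or `A2` alone leaves the other route available.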
%% Cell type:markdown id: tags:
For more options and methods please refer to the MEWpy documentation.
......@@ -15,7 +15,13 @@ To run these tutorials you need a Python 3.6+ environment. If you don't have one
pip install mewpy cplex escher
```
Please note that this will install the free version of [CPLEX](https://www.ibm.com/analytics/cplex-optimizer), which is limited to the simulation of small models. To install the full version you should obtain an academic license.
Please note that this will install the free version of [CPLEX](https://www.ibm.com/analytics/cplex-optimizer), which is limited to the simulation of small models. To install the full version you should obtain an academic license. The exercises cannot be run with the community CPLEX.
To run optimization on [ETFL](https://github.com/EPFL-LCSB/etfl) models you first need to install the ETFL package.
### Running the examples
......
%% Cell type:markdown id: tags:
## Module 3 session 2
# Exercise 1
Nutrient and proteome limited growth.
%% Cell type:code id: tags:
``` python
%matplotlib inline
from reframed.io.sbml import load_cbmodel
import matplotlib.pyplot as plt
from mewpy.simulation import get_simulator
```
%% Cell type:markdown id: tags:
Start by loading the E. coli model iJO1366.
You may use COBRApy or REFRAMED.
%% Cell type:code id: tags:
``` python
model = load_cbmodel('data/iJO1366.xml',flavor='cobra')
simul = get_simulator(model)
```
%% Cell type:markdown id: tags:
Create medium conditions where all nutrients, except glucose, are unlimited. This medium roughly approximates what could be found in batch cultures.
%% Cell type:code id: tags:
``` python
GLC = 'R_EX_glc__D_e'
BIOMASS = "R_BIOMASS_Ec_iJO1366_WT_53p95M"
envcond = {rxn:(-10000,10000) for rxn in simul.get_uptake_reactions() if rxn!=GLC}
```
%% Cell type:markdown id: tags:
Redefine the simulation environment considering this medium.
%% Cell type:code id: tags:
``` python
simul = get_simulator(model,envcond=envcond)
simul.objective = BIOMASS
```
%% Cell type:markdown id: tags:
The function below plots the glucose-dependent growth for a given simulation environment.
%% Cell type:code id: tags:
``` python
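def plot_growth(simul, title=''):
    x_values = []
    y_values = []
    for x in range(0, 100, 5):
        # limit the glucose uptake rate to x
        c = {GLC: (-x, 1000)}
        s = simul.simulate(constraints=c)
        try:
            b = s.fluxes[BIOMASS]
        except Exception:
            # no usable fluxes (e.g. infeasible simulation): treat as no growth
            b = 0
        x_values.append(x)
        y_values.append(b)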