Tutorial
The easiest way to use Crispulator is via the command-line config file. If this format is too constraining, the Custom Simulations section has a detailed walkthrough of writing a custom simulation in which each step can be modified as needed.
Graphical overview
The simulation is laid out in the following manner:
Getting started
First, navigate to the Crispulator directory.
You can find the directory by running
$ julia -e 'import Crispulator; println(normpath(pathof(Crispulator), "..", ".."))'
There should be a YAML file called example_config.yml. Open it in a text editor; it should look like this:
# This is an example configuration file. Whitespace is important.

# Settings pertaining to the library design
library:
    genome:
        num-genes: 500
        num-guides-per-gene: 5
        frac-increasing-genes: 0.02 # fraction of genes with a positive phenotype
        frac-decreasing-genes: 0.1 # fraction of genes with a negative phenotype
    guides:
        crispr-type: CRISPRn # either CRISPRi or CRISPRn
        frac-high-quality: 0.9 # fraction of high quality guides
        mean-high-quality-kd: 0.85 # mean knockdown by a high quality guide (CRISPRi only)
screen:
    type: facs # either facs or growth
    num-runs: 10 # how many independent runs
    representation: # integer value, how much larger are samples than the library
        - transfection: 100
        - selection: 100
        - sequencing: 100
    # screen-type specific parameters
    bin-size: 0.25 # size of tail to sample from, must be between 0 and 0.5 (FACS only)
    std-noise: 1 # (FACS only)
    num-bottlenecks: 10 # (Growth only)
This gives access to most dials in the simulation. If something is missing, see Custom Simulations.
Now, let's remove all genes that have a positive phenotype by changing the value of frac-increasing-genes on line 8 to 0.0:
frac-increasing-genes: 0.0 # fraction of genes with a positive phenotype
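The same edit can also be made non-interactively, which is convenient when preparing several variants of a config. The following is a minimal sketch, assuming a POSIX shell with sed. It works on a scratch file holding only the genome block from the config above, and the file names config.yml and no_increasing.yml are placeholders, not names that Crispulator requires.

```shell
set -eu

# Scratch directory so the real example_config.yml is left untouched.
workdir=$(mktemp -d)

# Excerpt of the genome block from the example config (placeholder file name).
cat > "$workdir/config.yml" <<'EOF'
library:
    genome:
        num-genes: 500
        num-guides-per-gene: 5
        frac-increasing-genes: 0.02 # fraction of genes with a positive phenotype
        frac-decreasing-genes: 0.1 # fraction of genes with a negative phenotype
EOF

# Replace only the numeric value; indentation and the trailing comment are kept.
sed 's/\(frac-increasing-genes:\) [0-9.]*/\1 0.0/' \
    "$workdir/config.yml" > "$workdir/no_increasing.yml"

# Print the edited line to confirm the change.
grep 'frac-increasing-genes' "$workdir/no_increasing.yml"
```

Writing the result to a new file rather than editing in place keeps the original config available for comparison.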
Running simulation
Now, we can actually run the code by executing the following command
julia run.jl config example_config.yml test_output
Here config tells CRISPulator to use the provided config example_config.yml, and test_output is the directory where the results will be saved. This directory will be created if it doesn't exist.
The output should look like
[ Info: Activating simulation environment
Activating project at `~/work/Crispulator.jl/Crispulator.jl`
[ Info: Instantiating environment
Activating project at `~/work/Crispulator.jl/Crispulator.jl`
[ Info: Loading simulation framework
[ Info: Directory test_output does not exist, attempting to create
[ Info: Using 1 thread(s)
[ Info: Parsing config
[ Info: Running config
[ Info: Analyzing results
[ Info: Saving results in test_output
Quick results:
##############
Venn score = 0.992, 95% conf int (0.979, 1.006)
AUPRC score = 0.917, 95% conf int (0.885, 0.949)
SNR score = 4.055 +/- 0.382
The test_output/ directory should now be populated with all the files:
counts.svg
results_table.csv
volcano.svg
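When runs are scripted, it can help to confirm that all three files were written before moving on. The following is a small sketch, assuming a POSIX shell. The function name check_outputs is hypothetical and not part of Crispulator; it only tests for the file names listed above.

```shell
# Report any expected output file that is missing from a results directory.
# Returns 0 when all three files are present, 1 otherwise.
check_outputs() {
    dir=$1
    missing=0
    for f in counts.svg results_table.csv volcano.svg; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return "$missing"
}

# test_output is the directory name used in the run above.
if check_outputs test_output; then
    echo "all expected files are present"
else
    echo "the run did not produce every expected file"
fi
```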
Output
The folder contains one of the raw count scatterplots (counts.svg, left) and a volcano plot of mean log2 fold change versus significance of each gene (volcano.svg, right).
It also has a useful table that contains all the summary statistic information.
Row | method | measure | genetype | std_score | mean_score | conf_max | conf_min | n |
---|---|---|---|---|---|---|---|---|
| | String7 | String7 | String15 | Float64 | Float64 | Float64 | Float64 | Int64 |
1 | venn | inc | sigmoidal | NaN | NaN | NaN | NaN | 10 |
2 | auprc | inc | sigmoidal | NaN | NaN | NaN | NaN | 10 |
3 | venn | dec | sigmoidal | 0.0 | 1.0 | 1.0 | 1.0 | 10 |
4 | auprc | dec | sigmoidal | 0.0856376 | 0.928485 | 0.998353 | 0.858616 | 10 |
5 | venn | incdec | sigmoidal | 0.0 | 1.0 | 1.0 | 1.0 | 10 |
6 | auprc | incdec | sigmoidal | 0.114481 | 0.879854 | 0.973255 | 0.786452 | 10 |
7 | venn | inc | linear | NaN | NaN | NaN | NaN | 10 |
8 | auprc | inc | linear | NaN | NaN | NaN | NaN | 10 |
9 | venn | dec | linear | 0.0150585 | 0.995238 | 1.00752 | 0.982952 | 10 |
10 | auprc | dec | linear | 0.0272137 | 0.951162 | 0.973365 | 0.92896 | 10 |
The table below describes each column
Column Name | Meaning |
---|---|
method | Which summary statistic was used (e.g. Crispulator.auprc) |
measure | Whether the score is only for increasing genes (inc), decreasing genes (dec), or both (incdec). Shows separately which type of gene the screen can accurately detect. |
genetype | Whether the score is for linear, sigmoidal, or all genes (see Crispulator.KDPhenotypeRelationship). Helps determine whether CRISPRn or CRISPRi is better for this design. |
mean_score | Average score |
std_score | Standard deviation in scores |
conf_max | Upper limit of 95% confidence interval |
conf_min | Lower limit of 95% confidence interval |
n | Number of independent replicates |
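The columns described above can be pulled out of results_table.csv with standard command-line tools. The sketch below assumes a POSIX shell with awk, and it also assumes the file is plain comma-separated text with a header row of these column names, which this tutorial does not confirm. The sample rows are copied from the table above into a scratch file so that the example is self-contained.

```shell
set -eu

workdir=$(mktemp -d)

# Sample rows copied from the results table; the comma-separated layout with
# a header row is an assumption about how results_table.csv is written.
cat > "$workdir/results_table.csv" <<'EOF'
method,measure,genetype,std_score,mean_score,conf_max,conf_min,n
venn,inc,sigmoidal,NaN,NaN,NaN,NaN,10
auprc,inc,sigmoidal,NaN,NaN,NaN,NaN,10
venn,dec,sigmoidal,0.0,1.0,1.0,1.0,10
auprc,dec,sigmoidal,0.0856376,0.928485,0.998353,0.858616,10
auprc,dec,linear,0.0272137,0.951162,0.973365,0.92896,10
EOF

# Look columns up by header name so the script does not depend on column
# order, keep only AUPRC rows, and skip rows whose mean score is NaN.
summary=$(awk -F, '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    $col["method"] == "auprc" && $col["mean_score"] != "NaN" {
        printf "%s/%s: %.3f (%.3f, %.3f)\n",
            $col["measure"], $col["genetype"],
            $col["mean_score"], $col["conf_min"], $col["conf_max"]
    }
' "$workdir/results_table.csv")

printf '%s\n' "$summary"
```

For the sample rows this prints one line per remaining AUPRC row, giving the measure and gene type followed by the mean score and its confidence interval rounded to three decimals.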
Experiments from the paper
This repository also includes a collection of experiments that were run for the paper. You can view the full list by running
julia run.jl ls
They are located in the exps/ directory and also listed here for convenience:
Experiment File |
---|
compare_methods.jl |
facs_binning.jl |
facs_binning_snr.jl |
gen_plots.jl |
growth_bottleneck_snr.jl |
growth_bottlenecks.jl |
growth_representation.jl |
growth_sensitivity_library.jl |
scan_rep_space.jl |
You can run them as follows:
julia run.jl exp growth_sensitivity_library.jl output.csv
where the simulation result will be saved to output.csv.
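To queue several experiments, the command can be wrapped in a loop. The sketch below assumes a POSIX shell and only prints the commands it would run, so they can be checked before launching anything. The function name print_exp_commands, the output directory exp_results, and the convention of naming each CSV after its experiment file are all illustrative choices, not something run.jl requires.

```shell
# Dry run: print one "julia run.jl exp" command per experiment file.
# The first argument is the output directory; the rest are experiment files.
print_exp_commands() {
    outdir=$1
    shift
    for exp in "$@"; do
        # Strip the .jl suffix to build the CSV name,
        # e.g. exp_results/growth_bottlenecks.csv
        echo "julia run.jl exp $exp $outdir/${exp%.jl}.csv"
    done
}

print_exp_commands exp_results growth_bottlenecks.jl growth_representation.jl
```

Piping the printed lines to sh would run them one after another. It is safest to create the output directory first, since this tutorial does not say whether run.jl creates it for experiment output.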
Many of the experiments are quite computationally expensive, so I recommend using Multiprocessing to speed them up.