Tutorial

The easiest way to use Crispulator is via the command line with a config file. If this format is too constraining, the Custom Simulations page has a detailed walkthrough of writing a custom simulation in which each step can be modified as needed.

Graphical overview

The simulation is laid out in the following manner:

Getting started

First, navigate to the Crispulator directory.

Tip

You can find the directory by running

$ julia -e 'import Crispulator; println(normpath(pathof(Crispulator), "..", ".."))'

There should be a YAML file called example_config.yml. Open it in a text editor; it should look like this:

# This is an example configuration file. Whitespace is important.

# Settings pertaining to the library design
library:
    genome:
        num-genes: 500
        num-guides-per-gene: 5
        frac-increasing-genes: 0.02 # fraction of genes with a positive phenotype
        frac-decreasing-genes: 0.1 # fraction of genes with a negative phenotype

    guides:
        crispr-type: CRISPRn # either CRISPRi or CRISPRn
        frac-high-quality: 0.9 # fraction of high quality guides
        mean-high-quality-kd: 0.85 # mean knockdown by a high quality guide (CRISPRi only)

screen:
    type: facs # either facs or growth
    num-runs: 10 # how many independent runs

    representation: # integer value, how much larger are samples than the library
        - transfection: 100
        - selection: 100
        - sequencing: 100

# screen-type specific parameters

    bin-size: 0.25 # size of tail to sample from, must be between 0 and 0.5 (FACS only)
    std-noise: 1 # (FACS only)
    num-bottlenecks: 10 # (Growth only)

This file gives access to most of the dials in the simulation; if something is missing, see Custom Simulations.
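The comments in the config file state a few constraints on the values (allowed screen types, allowed CRISPR types, the range for bin-size). As a minimal sketch, assuming the file has already been parsed into a nested dictionary, these constraints could be checked before starting a long run. The helper name check_config is hypothetical and not part of Crispulator, and the source does not say whether the endpoints of the bin-size range are allowed; this sketch accepts them.

```python
# Values copied from example_config.yml, written as the nested dict a YAML
# parser would produce (the parsing step itself is omitted here).
example = {
    "library": {
        "genome": {
            "num-genes": 500,
            "num-guides-per-gene": 5,
            "frac-increasing-genes": 0.02,
            "frac-decreasing-genes": 0.1,
        },
        "guides": {
            "crispr-type": "CRISPRn",
            "frac-high-quality": 0.9,
            "mean-high-quality-kd": 0.85,
        },
    },
    "screen": {
        "type": "facs",
        "num-runs": 10,
        "representation": [
            {"transfection": 100},
            {"selection": 100},
            {"sequencing": 100},
        ],
        "bin-size": 0.25,
        "std-noise": 1,
        "num-bottlenecks": 10,
    },
}


def check_config(cfg):
    """Hypothetical helper: list constraint violations in a config dict."""
    problems = []
    genome = cfg["library"]["genome"]
    guides = cfg["library"]["guides"]
    screen = cfg["screen"]

    # Fractions of genes or guides must lie in [0, 1].
    fractions = {
        "frac-increasing-genes": genome["frac-increasing-genes"],
        "frac-decreasing-genes": genome["frac-decreasing-genes"],
        "frac-high-quality": guides["frac-high-quality"],
    }
    for key, value in fractions.items():
        if not 0.0 <= value <= 1.0:
            problems.append(f"{key} must be between 0 and 1, got {value}")

    if guides["crispr-type"] not in ("CRISPRi", "CRISPRn"):
        problems.append("crispr-type must be CRISPRi or CRISPRn")
    if screen["type"] not in ("facs", "growth"):
        problems.append("type must be facs or growth")

    # representation is a list of one-entry mappings with integer values.
    for entry in screen["representation"]:
        for stage, value in entry.items():
            if not isinstance(value, int):
                problems.append(f"representation for {stage} must be an integer")

    # bin-size only matters for FACS screens; both endpoints are accepted here.
    if screen["type"] == "facs" and not 0.0 <= screen["bin-size"] <= 0.5:
        problems.append("bin-size must be between 0 and 0.5")
    return problems


print(check_config(example))  # the example config passes, so this prints []
```

Crispulator may apply its own checks when it parses the file; this sketch only mirrors what the comments in the example state.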

Now, let's remove all genes that have a positive phenotype by setting frac-increasing-genes (line 8 of the file) to 0.0:

        frac-increasing-genes: 0.0 # fraction of genes with a positive phenotype

Running simulation

Now we can run the simulation by executing the following command:

julia run.jl config example_config.yml test_output

Tip

Here, config tells Crispulator to use the provided config file, example_config.yml, and test_output is the directory where the results will be saved. This directory will be created if it doesn't exist.

The output should look like

[ Info: Activating simulation environment
  Activating project at `~/work/Crispulator.jl/Crispulator.jl`
[ Info: Instantiating environment
  Activating project at `~/work/Crispulator.jl/Crispulator.jl`
[ Info: Loading simulation framework
[ Info: Directory test_output does not exist, attempting to create
[ Info: Using 1 thread(s)
[ Info: Parsing config
[ Info: Running config
[ Info: Analyzing results
[ Info: Saving results in test_output

Quick results:
##############
Venn score = 0.992, 95% conf int (0.979, 1.006)
AUPRC score = 0.917, 95% conf int (0.885, 0.949)
SNR score = 4.055 +/- 0.382

The test_output/ directory should now contain the following files:

counts.svg
results_table.csv
volcano.svg

Output

The folder contains one of the raw count scatterplots (counts.svg, left) and a volcano plot of mean log2 fold change versus significance for each gene (volcano.svg, right).

It also has a useful table that contains all the summary statistic information.

10×8 DataFrame
 Row │ method   measure  genetype   std_score  mean_score  conf_max  conf_min  n
     │ String7  String7  String15   Float64    Float64     Float64   Float64   Int64
─────┼───────────────────────────────────────────────────────────────────────────────
   1 │ venn     inc      sigmoidal  NaN        NaN         NaN       NaN       10
   2 │ auprc    inc      sigmoidal  NaN        NaN         NaN       NaN       10
   3 │ venn     dec      sigmoidal  0.0        1.0         1.0       1.0       10
   4 │ auprc    dec      sigmoidal  0.0856376  0.928485    0.998353  0.858616  10
   5 │ venn     incdec   sigmoidal  0.0        1.0         1.0       1.0       10
   6 │ auprc    incdec   sigmoidal  0.114481   0.879854    0.973255  0.786452  10
   7 │ venn     inc      linear     NaN        NaN         NaN       NaN       10
   8 │ auprc    inc      linear     NaN        NaN         NaN       NaN       10
   9 │ venn     dec      linear     0.0150585  0.995238    1.00752   0.982952  10
  10 │ auprc    dec      linear     0.0272137  0.951162    0.973365  0.92896   10

The table below describes each column

Column Name   Meaning
method        Which summary statistic was used (e.g. Crispulator.auprc)
measure       Whether the score is only for increasing genes (inc), decreasing (dec) or both (incdec). Allows independent evaluation on which type of genes the screen can accurately evaluate.
genetype      Whether the score is for linear, sigmoidal, or all genes (see Crispulator.KDPhenotypeRelationship). Helps determine if CRISPRn or CRISPRi is better for this design.
mean_score    Average score
std_score     Standard deviation in scores
conf_max      Upper limit of 95% confidence interval
conf_min      Lower limit of 95% confidence interval
n             Number of independent replicates
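To work with the table outside Julia, it can be read with any CSV reader. The sketch below uses Python's standard csv module on a few rows copied from the table above and drops rows whose mean_score is NaN. It assumes that results_table.csv starts with a header line carrying the column names shown; the function name load_scores is hypothetical.

```python
import csv
import io
import math

# A few rows copied from the table above, written as CSV. The exact header
# line of results_table.csv is an assumption based on the column names shown.
SAMPLE = """method,measure,genetype,std_score,mean_score,conf_max,conf_min,n
venn,inc,sigmoidal,NaN,NaN,NaN,NaN,10
auprc,dec,sigmoidal,0.0856376,0.928485,0.998353,0.858616,10
auprc,incdec,sigmoidal,0.114481,0.879854,0.973255,0.786452,10
auprc,dec,linear,0.0272137,0.951162,0.973365,0.92896,10
"""


def load_scores(handle):
    """Hypothetical helper: parse rows and skip those with an undefined mean."""
    rows = []
    for record in csv.DictReader(handle):
        mean = float(record["mean_score"])  # float("NaN") parses to nan
        if math.isnan(mean):
            continue
        rows.append(
            {
                "method": record["method"],
                "measure": record["measure"],
                "genetype": record["genetype"],
                "mean_score": mean,
                "conf_min": float(record["conf_min"]),
                "conf_max": float(record["conf_max"]),
                "n": int(record["n"]),
            }
        )
    return rows


scores = load_scores(io.StringIO(SAMPLE))
for row in scores:
    print(row["method"], row["measure"], row["genetype"], row["mean_score"])
```

For a real run, the same function can be given an open file instead, for example `open("test_output/results_table.csv", newline="")`. The NaN rows in the table are all for the inc measure, presumably because frac-increasing-genes was set to 0.0 and there are no increasing genes left to score.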

Experiments from the paper

This repository also includes a collection of experiments that were run for the paper. You can view the full list by running

julia run.jl ls

They are located in the exps/ directory and also listed here for convenience:

Experiment File
compare_methods.jl
facs_binning.jl
facs_binning_snr.jl
gen_plots.jl
growth_bottleneck_snr.jl
growth_bottlenecks.jl
growth_representation.jl
growth_sensitivity_library.jl
scan_rep_space.jl

You can run them as follows:

julia run.jl exp growth_sensitivity_library.jl output.csv

where the simulation result will be saved to output.csv.
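When several experiments need to be run, the command lines can be generated rather than typed by hand. The following sketch only builds and prints the commands in the form shown above; it does not start Julia. The output directory name results and the choice of one CSV per experiment are assumptions made for illustration.

```python
import shlex
from pathlib import Path

# Two experiment files taken from the list above.
EXPERIMENTS = ["growth_bottlenecks.jl", "growth_representation.jl"]


def build_commands(experiments, out_dir="results"):
    """Return one `julia run.jl exp <file> <output>` argument list per experiment."""
    commands = []
    for name in experiments:
        # Name each output after its experiment, e.g. results/growth_bottlenecks.csv
        output = Path(out_dir) / (Path(name).stem + ".csv")
        commands.append(["julia", "run.jl", "exp", name, output.as_posix()])
    return commands


for command in build_commands(EXPERIMENTS):
    print(shlex.join(command))
```

Each argument list can be passed to `subprocess.run(command, check=True)` from the Crispulator directory to actually run it. Whether run.jl creates a missing output directory for experiments is not stated here, so create it beforehand if in doubt.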

Warning

Many of the experiments are computationally expensive, so I recommend using Multiprocessing to speed them up.