Tutorials · BioSimulator.jl

Accessing simulation data

A single run generates a SamplePath object which is a recursive array provided by RecursiveArrayTools.jl.

sample_path = simulate(state, model, SortingDirect(), tfinal = 100.0, save_points = 0:25:100.0)

sample_path

t: 5-element Array{Float64,1}:
   0.0
  25.0
  50.0
  75.0
 100.0
x: 5-element Array{Array{Int64,1},1}:
 [10, 0, 0, 0, 0]   
 [1, 9, 1, 16, 58]  
 [0, 10, 0, 17, 141]
 [0, 10, 0, 18, 143]
 [0, 10, 0, 17, 142]

Access the state at the first time point:

sample_path[1]

5-element Array{Int64,1}:
 10
  0
  0
  0
  0

Access the third component at the second time point:

sample_path[3,2]

Access entire history for the third and fourth components:

sample_path[[3,4], :]

2×5 Array{Int64,2}:
 0   1   0   0   0
 0  16  17  18  17

Running multiple simulations generates an Ensemble, a collection of SamplePath objects; that is, Vector{SamplePath}. These objects also support indexing:

# ensemble of 10 trajectories
ensemble = [simulate(state, model, SortingDirect(), tfinal = 100.0, save_points = 0:25:100.0) for _ in 1:10];

10-element Array{SamplePath{Int64,2,Array{Array{Int64,1},1},Array{Float64,1}},1}:
 t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [3, 7, 0, 7, 53], [0, 10, 0, 10, 48], [0, 10, 0, 15, 104], [1, 9, 0, 12, 105]]  
 t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [4, 6, 0, 5, 14], [5, 5, 0, 3, 15], [4, 6, 0, 5, 12], [6, 4, 0, 7, 11]]         
 t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [3, 7, 0, 8, 31], [0, 10, 1, 15, 176], [1, 9, 0, 24, 184], [0, 10, 0, 15, 186]] 
 t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [2, 8, 0, 6, 30], [2, 8, 0, 3, 31], [2, 8, 0, 10, 26], [3, 7, 0, 7, 28]]        
 t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [3, 7, 1, 4, 22], [1, 9, 0, 15, 106], [0, 10, 0, 14, 129], [1, 9, 0, 17, 124]]  
 t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [3, 7, 0, 11, 57], [1, 9, 0, 9, 55], [1, 9, 0, 10, 53], [2, 8, 0, 11, 53]]      
 t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [1, 9, 0, 11, 57], [0, 10, 0, 14, 112], [1, 9, 0, 15, 112], [0, 10, 0, 12, 111]]
 t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [10, 0, 1, 0, 0], [1, 9, 2, 18, 193], [2, 8, 0, 28, 255], [0, 10, 0, 23, 252]]  
 t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [2, 8, 1, 6, 25], [1, 9, 0, 10, 39], [3, 7, 0, 10, 39], [1, 9, 0, 5, 39]]       
 t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [3, 7, 0, 7, 15], [2, 8, 0, 4, 16], [6, 4, 0, 11, 33], [0, 10, 1, 14, 163]]

Retrieve an individual sample path:

ensemble[2]

t: 5-element Array{Float64,1}:
   0.0
  25.0
  50.0
  75.0
 100.0
x: 5-element Array{Array{Int64,1},1}:
 [10, 0, 0, 0, 0]
 [4, 6, 0, 5, 14]
 [5, 5, 0, 3, 15]
 [4, 6, 0, 5, 12]
 [6, 4, 0, 7, 11]

Index into a SamplePath:

ensemble[2][[3,4], :]

2×5 Array{Int64,2}:
 0  0  0  0  0
 0  5  3  5  7

Both SamplePath and Ensemble implement the IterableTables.jl interface. This means that they act as data sources that can be converted into any supported data sink. For example, you can convert an Ensemble into a DataFrame:

using DataFrames

DataFrame(ensemble)

50 rows × 7 columns

	trial	t	X1	X2	X3	X4	X5
	Int64	Float64	Int64	Int64	Int64	Int64	Int64
1	1	0.0	10	0	0	0	0
2	1	25.0	3	7	0	7	53
3	1	50.0	0	10	0	10	48
4	1	75.0	0	10	0	15	104
5	1	100.0	1	9	0	12	105
6	2	0.0	10	0	0	0	0
7	2	25.0	4	6	0	5	14
8	2	50.0	5	5	0	3	15
9	2	75.0	4	6	0	5	12
10	2	100.0	6	4	0	7	11
11	3	0.0	10	0	0	0	0
12	3	25.0	3	7	0	8	31
13	3	50.0	0	10	1	15	176
14	3	75.0	1	9	0	24	184
15	3	100.0	0	10	0	15	186
16	4	0.0	10	0	0	0	0
17	4	25.0	2	8	0	6	30
18	4	50.0	2	8	0	3	31
19	4	75.0	2	8	0	10	26
20	4	100.0	3	7	0	7	28
21	5	0.0	10	0	0	0	0
22	5	25.0	3	7	1	4	22
23	5	50.0	1	9	0	15	106
24	5	75.0	0	10	0	14	129
25	5	100.0	1	9	0	17	124
26	6	0.0	10	0	0	0	0
27	6	25.0	3	7	0	11	57
28	6	50.0	1	9	0	9	55
29	6	75.0	1	9	0	10	53
30	6	100.0	2	8	0	11	53
31	7	0.0	10	0	0	0	0
32	7	25.0	1	9	0	11	57
33	7	50.0	0	10	0	14	112
34	7	75.0	1	9	0	15	112
35	7	100.0	0	10	0	12	111
36	8	0.0	10	0	0	0	0
37	8	25.0	10	0	1	0	0
38	8	50.0	1	9	2	18	193
39	8	75.0	2	8	0	28	255
40	8	100.0	0	10	0	23	252
41	9	0.0	10	0	0	0	0
42	9	25.0	2	8	1	6	25
43	9	50.0	1	9	0	10	39
44	9	75.0	3	7	0	10	39
45	9	100.0	1	9	0	5	39
46	10	0.0	10	0	0	0	0
47	10	25.0	3	7	0	7	15
48	10	50.0	2	8	0	4	16
49	10	75.0	6	4	0	11	33
50	10	100.0	0	10	1	14	163

Each component of the state vector appears as a column, along with trajectory trial and timestamp t columns. If the conversion does not produce the expected result, one may be able to force the correct behavior using the tablefy function:

import BioSimulator: tablefy

DataFrame(tablefy(sample_path)) # DataFrame(sample_path) is currently incorrect

5 rows × 6 columns

	t	X1	X2	X3	X4	X5
	Float64	Int64	Int64	Int64	Int64	Int64
1	0.0	10	0	0	0	0
2	25.0	1	9	1	16	58
3	50.0	0	10	0	17	141
4	75.0	0	10	0	18	143
5	100.0	0	10	0	17	142

Saving simulation results

Because SamplePath and Ensemble support iteration, you can save simulation data directly using Julia's I/O interface. Alternatively, you can use existing packages such as CSV.jl:

using CSV

CSV.write(file, result, delim = '\t')

Obtaining summary statistics

Summary statistics for ensembles are supported directly:

using Statistics

mean(ensemble)
std(result)
var(result)

Customizing output

It is possible to customize the data stored in a SamplePath with the keyword argument save_function in simulate. This function must accept exactly three arguments:

function myfunction(simulator, state, model)
    # your code here
end

It has access to data stored in a simulator object, the process's state, and the underlying model. This customized function should return the data to be stored in a SamplePath. Data can be any object, including custom Julia types. As an example, the function save_state is one of the default options that simply saves a copy of the system state at a given time. Note the use of annotations to dispatch on the type of state

# this gets called for well-mixed simulations
save_state(simulator, state::Vector{T}, model) where T <: Int = copy(state)

# this gets called for interacting particle systems
save_state(simulator, state::Lattice, model) = Configuration(state)

Another option is save_rates which copies the rates for each possible jump

save_rates(simulator, state, model) = copy(jump_rates(simulator))

BioSimulator provides the following high-level interface for accessing data:

BioSimulator.cumulative_intensity — Function

cumulative_intensity(simulator::ExactSimulator)

Retrieve the cumulative intensity of the process.

BioSimulator.jump_rates — Function

jump_rates(simulator::ExactSimulator)

Retrieve the current rates for each jump.

If the cache type is HasRates(), then rate[j] is the rate for reaction channel j. Otherwise if the cache type is HasSums(), then rate[j] is the cumulative sum(rate[1:j]).

BioSimulator.next_jump_index — Function

next_jump_index(simulator::ExactSimulator)

Retrieve the index of the next jump.

BioSimulator.next_jump_time — Function

next_jump_time(simulator::ExactSimulator)

Retrieve the time to the next jump.

Tip

Avoid heavy computations inside a save function. It is usually better to store the data needed and do the work outside the simulation loop. Benchmark different implementations to see what approach works best for your application.
Some data, such as jump_rates(simulator) and state are stored as numerical arrays. These objects are mutable so you should generally return copies rather than the object itself.
The generated SamplePath object will support multi-dimensional indexing provided its data is stored as an array. For example, if we set save_function = save_rates, then trajectory[k,j] will return rate[j] at time trajectory.t[k].

Accessing simulation data

Working with DataFrames

Saving simulation results

Obtaining summary statistics

Customizing output

Working with `DataFrames`