Accessing simulation data
A single run generates a SamplePath
object which is a recursive array provided by RecursiveArrayTools.jl.
sample_path = simulate(state, model, SortingDirect(), tfinal = 100.0, save_points = 0:25:100.0)
sample_path
t: 5-element Array{Float64,1}:
0.0
25.0
50.0
75.0
100.0
x: 5-element Array{Array{Int64,1},1}:
[10, 0, 0, 0, 0]
[1, 9, 1, 16, 58]
[0, 10, 0, 17, 141]
[0, 10, 0, 18, 143]
[0, 10, 0, 17, 142]
Access the state at the first time point:
sample_path[1]
5-element Array{Int64,1}:
10
0
0
0
0
Access the third component at the second time point:
sample_path[3,2]
1
Access entire history for the third and fourth components:
sample_path[[3,4], :]
2×5 Array{Int64,2}:
0 1 0 0 0
0 16 17 18 17
Running multiple simulations generates an Ensemble
, a collection of SamplePath
objects; that is, Vector{SamplePath}
. These objects also support indexing:
# ensemble of 10 trajectories
ensemble = [simulate(state, model, SortingDirect(), tfinal = 100.0, save_points = 0:25:100.0) for _ in 1:10];
10-element Array{SamplePath{Int64,2,Array{Array{Int64,1},1},Array{Float64,1}},1}:
t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [3, 7, 0, 7, 53], [0, 10, 0, 10, 48], [0, 10, 0, 15, 104], [1, 9, 0, 12, 105]]
t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [4, 6, 0, 5, 14], [5, 5, 0, 3, 15], [4, 6, 0, 5, 12], [6, 4, 0, 7, 11]]
t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [3, 7, 0, 8, 31], [0, 10, 1, 15, 176], [1, 9, 0, 24, 184], [0, 10, 0, 15, 186]]
t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [2, 8, 0, 6, 30], [2, 8, 0, 3, 31], [2, 8, 0, 10, 26], [3, 7, 0, 7, 28]]
t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [3, 7, 1, 4, 22], [1, 9, 0, 15, 106], [0, 10, 0, 14, 129], [1, 9, 0, 17, 124]]
t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [3, 7, 0, 11, 57], [1, 9, 0, 9, 55], [1, 9, 0, 10, 53], [2, 8, 0, 11, 53]]
t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [1, 9, 0, 11, 57], [0, 10, 0, 14, 112], [1, 9, 0, 15, 112], [0, 10, 0, 12, 111]]
t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [10, 0, 1, 0, 0], [1, 9, 2, 18, 193], [2, 8, 0, 28, 255], [0, 10, 0, 23, 252]]
t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [2, 8, 1, 6, 25], [1, 9, 0, 10, 39], [3, 7, 0, 10, 39], [1, 9, 0, 5, 39]]
t: [0.0, 25.0, 50.0, 75.0, 100.0]
x: Array{Int64,1}[[10, 0, 0, 0, 0], [3, 7, 0, 7, 15], [2, 8, 0, 4, 16], [6, 4, 0, 11, 33], [0, 10, 1, 14, 163]]
Retrieve an individual sample path:
ensemble[2]
t: 5-element Array{Float64,1}:
0.0
25.0
50.0
75.0
100.0
x: 5-element Array{Array{Int64,1},1}:
[10, 0, 0, 0, 0]
[4, 6, 0, 5, 14]
[5, 5, 0, 3, 15]
[4, 6, 0, 5, 12]
[6, 4, 0, 7, 11]
Index into a SamplePath:
ensemble[2][[3,4], :]
2×5 Array{Int64,2}:
0 0 0 0 0
0 5 3 5 7
Working with DataFrames
Both SamplePath
and Ensemble
implement the IterableTables.jl interface. This means that they act as data sources that can be converted into any supported data sink. For example, you can convert an Ensemble
into a DataFrame
:
using DataFrames
DataFrame(ensemble)
trial | t | X1 | X2 | X3 | X4 | X5 | |
---|---|---|---|---|---|---|---|
Int64 | Float64 | Int64 | Int64 | Int64 | Int64 | Int64 | |
1 | 1 | 0.0 | 10 | 0 | 0 | 0 | 0 |
2 | 1 | 25.0 | 3 | 7 | 0 | 7 | 53 |
3 | 1 | 50.0 | 0 | 10 | 0 | 10 | 48 |
4 | 1 | 75.0 | 0 | 10 | 0 | 15 | 104 |
5 | 1 | 100.0 | 1 | 9 | 0 | 12 | 105 |
6 | 2 | 0.0 | 10 | 0 | 0 | 0 | 0 |
7 | 2 | 25.0 | 4 | 6 | 0 | 5 | 14 |
8 | 2 | 50.0 | 5 | 5 | 0 | 3 | 15 |
9 | 2 | 75.0 | 4 | 6 | 0 | 5 | 12 |
10 | 2 | 100.0 | 6 | 4 | 0 | 7 | 11 |
11 | 3 | 0.0 | 10 | 0 | 0 | 0 | 0 |
12 | 3 | 25.0 | 3 | 7 | 0 | 8 | 31 |
13 | 3 | 50.0 | 0 | 10 | 1 | 15 | 176 |
14 | 3 | 75.0 | 1 | 9 | 0 | 24 | 184 |
15 | 3 | 100.0 | 0 | 10 | 0 | 15 | 186 |
16 | 4 | 0.0 | 10 | 0 | 0 | 0 | 0 |
17 | 4 | 25.0 | 2 | 8 | 0 | 6 | 30 |
18 | 4 | 50.0 | 2 | 8 | 0 | 3 | 31 |
19 | 4 | 75.0 | 2 | 8 | 0 | 10 | 26 |
20 | 4 | 100.0 | 3 | 7 | 0 | 7 | 28 |
21 | 5 | 0.0 | 10 | 0 | 0 | 0 | 0 |
22 | 5 | 25.0 | 3 | 7 | 1 | 4 | 22 |
23 | 5 | 50.0 | 1 | 9 | 0 | 15 | 106 |
24 | 5 | 75.0 | 0 | 10 | 0 | 14 | 129 |
25 | 5 | 100.0 | 1 | 9 | 0 | 17 | 124 |
26 | 6 | 0.0 | 10 | 0 | 0 | 0 | 0 |
27 | 6 | 25.0 | 3 | 7 | 0 | 11 | 57 |
28 | 6 | 50.0 | 1 | 9 | 0 | 9 | 55 |
29 | 6 | 75.0 | 1 | 9 | 0 | 10 | 53 |
30 | 6 | 100.0 | 2 | 8 | 0 | 11 | 53 |
31 | 7 | 0.0 | 10 | 0 | 0 | 0 | 0 |
32 | 7 | 25.0 | 1 | 9 | 0 | 11 | 57 |
33 | 7 | 50.0 | 0 | 10 | 0 | 14 | 112 |
34 | 7 | 75.0 | 1 | 9 | 0 | 15 | 112 |
35 | 7 | 100.0 | 0 | 10 | 0 | 12 | 111 |
36 | 8 | 0.0 | 10 | 0 | 0 | 0 | 0 |
37 | 8 | 25.0 | 10 | 0 | 1 | 0 | 0 |
38 | 8 | 50.0 | 1 | 9 | 2 | 18 | 193 |
39 | 8 | 75.0 | 2 | 8 | 0 | 28 | 255 |
40 | 8 | 100.0 | 0 | 10 | 0 | 23 | 252 |
41 | 9 | 0.0 | 10 | 0 | 0 | 0 | 0 |
42 | 9 | 25.0 | 2 | 8 | 1 | 6 | 25 |
43 | 9 | 50.0 | 1 | 9 | 0 | 10 | 39 |
44 | 9 | 75.0 | 3 | 7 | 0 | 10 | 39 |
45 | 9 | 100.0 | 1 | 9 | 0 | 5 | 39 |
46 | 10 | 0.0 | 10 | 0 | 0 | 0 | 0 |
47 | 10 | 25.0 | 3 | 7 | 0 | 7 | 15 |
48 | 10 | 50.0 | 2 | 8 | 0 | 4 | 16 |
49 | 10 | 75.0 | 6 | 4 | 0 | 11 | 33 |
50 | 10 | 100.0 | 0 | 10 | 1 | 14 | 163 |
Each component of the state vector appears as a column, along with trajectory trial
and timestamp t
columns. If the conversion does not produce the expected result, one may be able to force the correct behavior using the tablefy
function:
import BioSimulator: tablefy
DataFrame(tablefy(sample_path)) # DataFrame(sample_path) is currently incorrect
t | X1 | X2 | X3 | X4 | X5 | |
---|---|---|---|---|---|---|
Float64 | Int64 | Int64 | Int64 | Int64 | Int64 | |
1 | 0.0 | 10 | 0 | 0 | 0 | 0 |
2 | 25.0 | 1 | 9 | 1 | 16 | 58 |
3 | 50.0 | 0 | 10 | 0 | 17 | 141 |
4 | 75.0 | 0 | 10 | 0 | 18 | 143 |
5 | 100.0 | 0 | 10 | 0 | 17 | 142 |
Saving simulation results
Because SamplePath
and Ensemble
support iteration, you can save simulation data directly using Julia's I/O interface. Alternatively, you can use existing packages such as CSV.jl:
using CSV
CSV.write(file, result, delim = '\t')
Obtaining summary statistics
Summary statistics for ensembles are supported directly:
using Statistics
mean(ensemble)
std(result)
var(result)
Customizing output
It is possible to customize the data stored in a SamplePath
with the keyword argument save_function
in simulate
. This function must accept exactly three arguments:
function myfunction(simulator, state, model)
# your code here
end
It has access to data stored in a simulator
object, the process's state
, and the underlying model
. This customized function should return the data to be stored in a SamplePath
. Data can be any object, including custom Julia types. As an example, the function save_state
is one of the default options that simply saves a copy of the system state at a given time. Note the use of annotations to dispatch on the type of state
# this gets called for well-mixed simulations
save_state(simulator, state::Vector{T}, model) where T <: Int = copy(state)
# this gets called for interacting particle systems
save_state(simulator, state::Lattice, model) = Configuration(state)
Another option is save_rates
which copies the rates for each possible jump
save_rates(simulator, state, model) = copy(jump_rates(simulator))
BioSimulator provides the following high-level interface for accessing data:
BioSimulator.cumulative_intensity
— Functioncumulative_intensity(simulator::ExactSimulator)
Retrieve the cumulative intensity of the process.
BioSimulator.jump_rates
— Functionjump_rates(simulator::ExactSimulator)
Retrieve the current rates for each jump.
If the cache type is HasRates()
, then rate[j]
is the rate for reaction channel j
. Otherwise if the cache type is HasSums()
, then rate[j]
is the cumulative sum(rate[1:j])
.
BioSimulator.next_jump_index
— Functionnext_jump_index(simulator::ExactSimulator)
Retrieve the index of the next jump.
BioSimulator.next_jump_time
— Functionnext_jump_time(simulator::ExactSimulator)
Retrieve the time to the next jump.
Avoid heavy computations inside a save function. It is usually better to store the data needed and do the work outside the simulation loop. Benchmark different implementations to see what approach works best for your application.
Some data, such as
jump_rates(simulator)
andstate
are stored as numerical arrays. These objects are mutable so you should generally return copies rather than the object itself.The generated
SamplePath
object will support multi-dimensional indexing provided its data is stored as an array. For example, if we setsave_function = save_rates
, thentrajectory[k,j]
will returnrate[j]
at timetrajectory.t[k]
.