Large-scale, real-world problem with steady-state data
This page describes the model selection problem available in the PEtab Select test suite, case 0009.
It reproduces the histone H4 acetylation model selection problem of Blasi et al. (2016) [Blasi2016]. This involves a model space of \(2^{32} \approx 4.3\) billion models. The model is formulated as a ordinary differential equation system based on biochemical reactions. The dataset is real experimental data, measured at steady-state. Briefly, the model selection problem is to identify which subset of 32 parameters should exist in the model to explain the data.
The files for this model selection problem are available at test_cases/0009.
Biological background
Histones are the proteins around which DNA is wrapped, and histone N-terminal “tails” have post-translational modifications that regulate chromatin. The N-terminal tail of histone H4 has four lysine (K) residues that can each be acetylated: K5, K8, K12, and K16.
Because each of the four sites is either acetylated or not, there are \(2^4 = 16\) possible acetylation patterns, called motifs, ranging from the fully unacetylated state to the fully acetylated state . The research question in [Blasi2016] is to determine how the acetylation patterns are generated: do the acetylation reactions all occur at the same rate regardless of the current acetylation motif, or are some of them motif-specific?
As described below, there are a total of 32 acetylation reactions. The original publication [Blasi2016] found that the “best” model had 7 of these reactions that occur at a motif-specific rate, and the rest occur at a shared rate.
The mathematical model
The model is a reaction network over the 16 acetylation motifs of the
histone H4 tail. Each motif is a chemical species \(x_m\), and a reaction
adds (acetylation) or removes (deacetylation) a single acetyl group, connecting
motifs that differ at exactly one site. Hence, there are 32 acetylation reactions
and 32 deacetylation reactions. The dynamics use mass-action
kinetics, and the measured abundances are assumed to be at steady state
(acetylation/deacetylation are much faster than the cell cycle); in the PEtab
problem this is encoded by measurements at time = inf.
The 16 motifs (state variables)
The state variables x_<motif> grouped by the number of acetylated lysines (of K5, K8, K12, K16) are:
zero acetylated lysines:
x_0acone acetylated lysine:
x_k05,x_k08,x_k12,x_k16two acetylated lysines:
x_k05k08,x_k05k12,x_k05k16,x_k08k12,x_k08k16,x_k12k16three acetylated lysines:
x_k05k08k12,x_k05k08k16,x_k05k12k16,x_k08k12k16four acetylated lysines:
x_4ac
The 32 acetylation reactions
Each acetylation reaction converts one motif \(p\) into another motif \(q\), \(p \to q\), at rate \(a_{p\to q}\, a_b\, x_p\),
where \(a_b\) is the basal (shared) acetylation rate and \(a_{p\to q} \equiv\) a_<p>_<q> is a factor that represents the research question. This factor is either fixed to one for reactions that occur with the shared basal rate constant, or the factor is estimated, to change the rate constant to be motif-specific.
Each
acetylation reaction also has an associated deacetylation reaction \(q \to p\) with rate
\(d\, x_q\), where \(d \equiv\) da_b is a single, shared basal deacetylation rate constant (fixed to 1).
The acetylation reactions are explicitly:
x_0ac -> x_k05 (rate a_0ac_k05 * a_b * x_0ac)
x_0ac -> x_k08 (rate a_0ac_k08 * a_b * x_0ac)
x_0ac -> x_k12 (rate a_0ac_k12 * a_b * x_0ac)
x_0ac -> x_k16 (rate a_0ac_k16 * a_b * x_0ac)
x_k05 -> x_k05k08 (rate a_k05_k05k08 * a_b * x_k05)
x_k05 -> x_k05k12 (rate a_k05_k05k12 * a_b * x_k05)
x_k05 -> x_k05k16 (rate a_k05_k05k16 * a_b * x_k05)
x_k08 -> x_k05k08 (rate a_k08_k05k08 * a_b * x_k08)
x_k08 -> x_k08k12 (rate a_k08_k08k12 * a_b * x_k08)
x_k08 -> x_k08k16 (rate a_k08_k08k16 * a_b * x_k08)
x_k12 -> x_k05k12 (rate a_k12_k05k12 * a_b * x_k12)
x_k12 -> x_k08k12 (rate a_k12_k08k12 * a_b * x_k12)
x_k12 -> x_k12k16 (rate a_k12_k12k16 * a_b * x_k12)
x_k16 -> x_k05k16 (rate a_k16_k05k16 * a_b * x_k16)
x_k16 -> x_k08k16 (rate a_k16_k08k16 * a_b * x_k16)
x_k16 -> x_k12k16 (rate a_k16_k12k16 * a_b * x_k16)
x_k05k08 -> x_k05k08k12 (rate a_k05k08_k05k08k12 * a_b * x_k05k08)
x_k05k08 -> x_k05k08k16 (rate a_k05k08_k05k08k16 * a_b * x_k05k08)
x_k05k12 -> x_k05k08k12 (rate a_k05k12_k05k08k12 * a_b * x_k05k12)
x_k05k12 -> x_k05k12k16 (rate a_k05k12_k05k12k16 * a_b * x_k05k12)
x_k05k16 -> x_k05k08k16 (rate a_k05k16_k05k08k16 * a_b * x_k05k16)
x_k05k16 -> x_k05k12k16 (rate a_k05k16_k05k12k16 * a_b * x_k05k16)
x_k08k12 -> x_k05k08k12 (rate a_k08k12_k05k08k12 * a_b * x_k08k12)
x_k08k12 -> x_k08k12k16 (rate a_k08k12_k08k12k16 * a_b * x_k08k12)
x_k08k16 -> x_k05k08k16 (rate a_k08k16_k05k08k16 * a_b * x_k08k16)
x_k08k16 -> x_k08k12k16 (rate a_k08k16_k08k12k16 * a_b * x_k08k16)
x_k12k16 -> x_k05k12k16 (rate a_k12k16_k05k12k16 * a_b * x_k12k16)
x_k12k16 -> x_k08k12k16 (rate a_k12k16_k08k12k16 * a_b * x_k12k16)
x_k05k08k12 -> x_4ac (rate a_k05k08k12_4ac * a_b * x_k05k08k12)
x_k05k08k16 -> x_4ac (rate a_k05k08k16_4ac * a_b * x_k05k08k16)
x_k05k12k16 -> x_4ac (rate a_k05k12k16_4ac * a_b * x_k05k12k16)
x_k08k12k16 -> x_4ac (rate a_k08k12k16_4ac * a_b * x_k08k12k16)
The 32 reverse deacetylations occur at rate e.g. da_b * x_k05 for k05 -> 0ac. These are not listed explicitly but are present for every acetylation reaction above.
The ordinary differential equations
Based on these reactions and rates, the ODE system is therefore
The observation model
Each observable is the steady-state abundance of one motif,
for the 15 observed motifs, \(m \in \mathcal{O}\). The observed set \(\mathcal{O}\) excludes the x_k05k16 motif because it was below the quantification limit of the experiment.
The measured are assumed to follow a log-normal (multiplicative) noise distribution:
where \(\bar{y}_{m,r}\) is replicate \(r\) of the measurement of motif
\(m\), and \(\sigma =\) sigma_ is the noise parameter (fixed to
1).
The likelihood function
The likelihood of the estimated parameters \(\theta\) (the basal rate \(a_b\) and the estimated motif-specific factors) is
and the corresponding negative log-likelihood, which PEtab Select uses to compute model selection criteria, is
where \(R_m\) is the number of replicates of motif \(m\).
Experimental data
The model is fitted to the relative abundances of the H4 acetylation motifs measured in Drosophila melanogaster Kc cells. The data come from the quantitative mass-spectrometry study of Feller et al. (2015), as used by Blasi et al. [Blasi2016]:
Briefly, histone H4 was extracted from wild-type Kc cells. The relative abundances of the H4 N-terminal acetylation motifs were quantified by liquid chromatography–mass spectrometry (LC–MS). The measurements represent the steady-state distribution of motifs.
Of the 16 motifs, one (K5K16) lies below the quantification limit and is
not used; hence, the PEtab problem has 15 observables (one per measured motif)
and 252 measurements in total.
The model selection problem
There is one hypothesis per acetylation reaction: its rate is either the shared
basal rate (\(a_{p\to q}\) fixed to 1) or an estimated
motif-specific rate (\(a_{p\to q}\) estimated). With 32
reactions, the model space therefore contains
\(2^{32} \approx 4.3\) billion models. The task is to identify the subset of
reactions that require a motif-specific rate constant to explain the data. The published best model
had seven motif-specific reactions [Blasi2016].
This PEtab Select formulation differs from the original publication in three ways in that it omits 11 additional models that were considered in the original publication. These 11 models add negligible computational cost to the model selection problem and were not amongst the best models in the original publication, so are ignored here. Furthermore, we use the FAMoS search method as a general strategy, instead of the highly-tailored problem-specific approach used in the original publication that enabled them to use the brute-force method.
The PEtab Select files
PEtab Select problem YAML
The problem file specifies the criterion, search method, model space file and additional arguments for the search method.
format_version: beta_1
criterion: AICc
method: famos
model_space_files:
- model_space.tsv
candidate_space_arguments:
critical_parameter_sets: []
consecutive_laterals: true
predecessor_model: predecessor_model.yaml
summary_tsv: output/summary.tsv
swap_parameter_sets:
- - a_0ac_k05
- a_0ac_k08
- a_0ac_k12
- a_0ac_k16
- a_k05_k05k08
- a_k05_k05k12
- a_k05_k05k16
- a_k05k08_k05k08k12
- a_k05k08_k05k08k16
- a_k05k08k12_4ac
- a_k05k08k16_4ac
- a_k05k12_k05k08k12
- a_k05k12_k05k12k16
- a_k05k12k16_4ac
- a_k05k16_k05k08k16
- a_k05k16_k05k12k16
- a_k08_k05k08
- a_k08_k08k12
- a_k08_k08k16
- a_k08k12_k05k08k12
- a_k08k12_k08k12k16
- a_k08k12k16_4ac
- a_k08k16_k05k08k16
- a_k08k16_k08k12k16
- a_k12_k05k12
- a_k12_k08k12
- a_k12_k12k16
- a_k12k16_k05k12k16
- a_k12k16_k08k12k16
- a_k16_k05k16
- a_k16_k08k16
- a_k16_k12k16
The candidate_space_arguments configure FAMoS:
predecessor_model: the initial model the search starts from (see below).critical_parameter_sets: empty here — no reaction is forced to always be motif-specific.swap_parameter_sets: a single set containing all 32 reaction parameters. FAMoS lateral (swap) moves exchange an estimated parameter for a fixed one; restricting swaps to within a set means any motif-specific reaction may be swapped for any other.consecutive_laterals: true: keep performing lateral moves while they keep improving the model.summary_tsv: where to write a summary of the search status and history.
Model space
The model space is a single subspace M with 32 parameter columns, one
per acetylation reaction. Every column has the value 1.0;estimate, meaning
each reaction can be either fixed to the basal rate (1.0) or estimated
(motif-specific). This concisely encodes all ~4.3 billion
models in a single row (the table is wide, with one column per reaction):
model_subspace_id |
model_subspace_petab_yaml |
a_0ac_k05 |
a_0ac_k08 |
a_0ac_k12 |
a_0ac_k16 |
a_k05_k05k08 |
a_k05_k05k12 |
a_k05_k05k16 |
a_k08_k05k08 |
a_k08_k08k12 |
a_k08_k08k16 |
a_k12_k05k12 |
a_k12_k08k12 |
a_k12_k12k16 |
a_k16_k05k16 |
a_k16_k08k16 |
a_k16_k12k16 |
a_k05k08_k05k08k12 |
a_k05k08_k05k08k16 |
a_k05k12_k05k08k12 |
a_k05k12_k05k12k16 |
a_k05k16_k05k08k16 |
a_k05k16_k05k12k16 |
a_k08k12_k05k08k12 |
a_k08k12_k08k12k16 |
a_k08k16_k05k08k16 |
a_k08k16_k08k12k16 |
a_k12k16_k05k12k16 |
a_k12k16_k08k12k16 |
a_k05k08k12_4ac |
a_k05k08k16_4ac |
a_k05k12k16_4ac |
a_k08k12k16_4ac |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
M |
petab/petab_problem.yaml |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
1.0;estimate |
The PEtab problem (petab/)
The model space references a standard PEtab problem that defines the superset model (all reactions present):
model.xml: the SBML model with the 16 motif species and the acetylation / deacetylation reactions.parameters.tsv: the 32 reaction-rate factorsa_<reaction>(log10scale, bounds[1e-3, 1e3]), the basal acetylation ratea_b, the deacetylation reference rateda_b(fixed to1), and the noise parametersigma_.observables.tsv: 15 observables, one per measured motif, and the log-normal noise distribution.measurements.tsv: the 252 steady-state (time = inf) abundance measurements.conditions.tsv: a single dummy condition.
Predecessor (initial) model
predecessor_model.yaml is the model the FAMoS search is initialised from, in the PEtab Select model YAML file format. It
is a specific starting model, with 15 motif-specific reactions, that was found
to reproducibly lead to the best model.
Without it, solving the problem from scratch requires on the order of 100
randomly initialised FAMoS searches.
criteria:
NLLH: inf
estimated_parameters: {}
model_hash: M-01000001111110010011010011010001
model_id: M-01000001111110010011010011010001
model_subspace_id: M
model_subspace_indices:
- 0
- 1
- 0
- 0
- 0
- 0
- 0
- 1
- 1
- 1
- 1
- 1
- 1
- 0
- 0
- 1
- 0
- 0
- 1
- 1
- 0
- 1
- 0
- 0
- 1
- 1
- 0
- 1
- 0
- 0
- 0
- 1
parameters:
a_0ac_k05: 1
a_0ac_k08: estimate
a_0ac_k12: 1
a_0ac_k16: 1
a_k05_k05k08: 1
a_k05_k05k12: 1
a_k05_k05k16: 1
a_k05k08_k05k08k12: 1
a_k05k08_k05k08k16: 1
a_k05k08k12_4ac: 1
a_k05k08k16_4ac: 1
a_k05k12_k05k08k12: estimate
a_k05k12_k05k12k16: estimate
a_k05k12k16_4ac: 1
a_k05k16_k05k08k16: 1
a_k05k16_k05k12k16: estimate
a_k08_k05k08: estimate
a_k08_k08k12: estimate
a_k08_k08k16: estimate
a_k08k12_k05k08k12: 1
a_k08k12_k08k12k16: 1
a_k08k12k16_4ac: estimate
a_k08k16_k05k08k16: estimate
a_k08k16_k08k12k16: estimate
a_k12_k05k12: estimate
a_k12_k08k12: estimate
a_k12_k12k16: estimate
a_k12k16_k05k12k16: 1
a_k12k16_k08k12k16: estimate
a_k16_k05k16: 1
a_k16_k08k16: 1
a_k16_k12k16: estimate
model_subspace_petab_yaml: petab/petab_problem.yaml
predecessor_model_hash: virtual_initial_model-
Expected results
expected.yaml is the expected selected model, in the PEtab Select model YAML file format. It has the seven
motif-specific reactions of the published best model
(a_0ac_k08, a_k05_k05k12, a_k12_k05k12, a_k16_k12k16,
a_k05k12_k05k08k12, a_k12k16_k08k12k16, a_k08k12k16_4ac) plus the
basal rate a_b — eight estimated parameters in total — with
AICc ≈ -1708.1.
model_subspace_id: M
model_subspace_indices:
- 0
- 1
- 0
- 0
- 0
- 1
- 0
- 0
- 0
- 0
- 1
- 0
- 0
- 0
- 0
- 1
- 0
- 0
- 1
- 0
- 0
- 0
- 0
- 0
- 0
- 0
- 0
- 1
- 0
- 0
- 0
- 1
criteria:
NLLH: -862.3517925313981
AICc: -1708.1109924702037
model_hash: M-01000100001000010010000000010001
model_subspace_petab_yaml: petab/petab_problem.yaml
estimated_parameters:
a_0ac_k08: 0.40850355273291267
a_k05_k05k12: 30.888150959586138
a_k12_k05k12: 8.267845459216893
a_k16_k12k16: 10.424629099941777
a_k05k12_k05k08k12: 4.872747603868694
a_k12k16_k08k12k16: 33.03769174387633
a_k08k12k16_4ac: 53.80106471593421
a_b: 0.06675819571287103
iteration: 11
model_id: M-01000100001000010010000000010001
parameters:
a_0ac_k05: 1
a_0ac_k08: estimate
a_0ac_k12: 1
a_0ac_k16: 1
a_k05_k05k08: 1
a_k05_k05k12: estimate
a_k05_k05k16: 1
a_k08_k05k08: 1
a_k08_k08k12: 1
a_k08_k08k16: 1
a_k12_k05k12: estimate
a_k12_k08k12: 1
a_k12_k12k16: 1
a_k16_k05k16: 1
a_k16_k08k16: 1
a_k16_k12k16: estimate
a_k05k08_k05k08k12: 1
a_k05k08_k05k08k16: 1
a_k05k12_k05k08k12: estimate
a_k05k12_k05k12k16: 1
a_k05k16_k05k08k16: 1
a_k05k16_k05k12k16: 1
a_k08k12_k05k08k12: 1
a_k08k12_k08k12k16: 1
a_k08k16_k05k08k16: 1
a_k08k16_k08k12k16: 1
a_k12k16_k05k12k16: 1
a_k12k16_k08k12k16: estimate
a_k05k08k12_4ac: 1
a_k05k08k16_4ac: 1
a_k05k12k16_4ac: 1
a_k08k12k16_4ac: estimate
predecessor_model_hash: M-01000100001010010010000000010001
expected_summary.tsv is the FAMoS search trajectory, one row per iteration,
showing how the method switches between forward, backward, and
lateral moves and how the criterion improves at each step:
method |
# candidates |
predecessor change |
current model criterion |
current model |
candidate changes |
|---|---|---|---|---|---|
forward |
17 |
[] |
inf |
[‘a_0ac_k08’, ‘a_b’, ‘a_k05k12_k05k08k12’, ‘a_k05k12_k05k12k16’, ‘a_k05k16_k05k12k16’, ‘a_k08_k05k08’, ‘a_k08_k08k12’, ‘a_k08_k08k16’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k08k16_k08k12k16’, ‘a_k12_k05k12’, ‘a_k12_k08k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’] |
[[‘a_0ac_k05’], [‘a_0ac_k12’], [‘a_0ac_k16’], [‘a_k05_k05k08’], [‘a_k05_k05k12’], [‘a_k05_k05k16’], [‘a_k05k08_k05k08k12’], [‘a_k05k08_k05k08k16’], [‘a_k05k08k12_4ac’], [‘a_k05k08k16_4ac’], [‘a_k05k12k16_4ac’], [‘a_k05k16_k05k08k16’], [‘a_k08k12_k05k08k12’], [‘a_k08k12_k08k12k16’], [‘a_k12k16_k05k12k16’], [‘a_k16_k05k16’], [‘a_k16_k08k16’]] |
forward |
16 |
[‘a_k05_k05k12’] |
-1688.4806604640246 |
[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k05k12_k05k12k16’, ‘a_k05k16_k05k12k16’, ‘a_k08_k05k08’, ‘a_k08_k08k12’, ‘a_k08_k08k16’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k08k16_k08k12k16’, ‘a_k12_k05k12’, ‘a_k12_k08k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’] |
[[‘a_0ac_k05’], [‘a_0ac_k12’], [‘a_0ac_k16’], [‘a_k05_k05k08’], [‘a_k05_k05k16’], [‘a_k05k08_k05k08k12’], [‘a_k05k08_k05k08k16’], [‘a_k05k08k12_4ac’], [‘a_k05k08k16_4ac’], [‘a_k05k12k16_4ac’], [‘a_k05k16_k05k08k16’], [‘a_k08k12_k05k08k12’], [‘a_k08k12_k08k12k16’], [‘a_k12k16_k05k12k16’], [‘a_k16_k05k16’], [‘a_k16_k08k16’]] |
backward |
16 |
[] |
-1688.4806604640246 |
[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k05k12_k05k12k16’, ‘a_k05k16_k05k12k16’, ‘a_k08_k05k08’, ‘a_k08_k08k12’, ‘a_k08_k08k16’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k08k16_k08k12k16’, ‘a_k12_k05k12’, ‘a_k12_k08k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’] |
[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k05k12_k05k12k16’], [‘a_k05k16_k05k12k16’], [‘a_k08_k05k08’], [‘a_k08_k08k12’], [‘a_k08_k08k16’], [‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’], [‘a_k08k16_k08k12k16’], [‘a_k12_k05k12’], [‘a_k12_k08k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]] |
backward |
15 |
[‘a_k08_k08k16’] |
-1690.7812030034663 |
[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k05k12_k05k12k16’, ‘a_k05k16_k05k12k16’, ‘a_k08_k05k08’, ‘a_k08_k08k12’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k08k16_k08k12k16’, ‘a_k12_k05k12’, ‘a_k12_k08k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’] |
[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k05k12_k05k12k16’], [‘a_k05k16_k05k12k16’], [‘a_k08_k05k08’], [‘a_k08_k08k12’], [‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’], [‘a_k08k16_k08k12k16’], [‘a_k12_k05k12’], [‘a_k12_k08k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]] |
backward |
14 |
[‘a_k05k16_k05k12k16’] |
-1693.0621897450123 |
[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k05k12_k05k12k16’, ‘a_k08_k05k08’, ‘a_k08_k08k12’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k08k16_k08k12k16’, ‘a_k12_k05k12’, ‘a_k12_k08k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’] |
[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k05k12_k05k12k16’], [‘a_k08_k05k08’], [‘a_k08_k08k12’], [‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’], [‘a_k08k16_k08k12k16’], [‘a_k12_k05k12’], [‘a_k12_k08k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]] |
backward |
13 |
[‘a_k08_k08k12’] |
-1695.323741885311 |
[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k05k12_k05k12k16’, ‘a_k08_k05k08’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k08k16_k08k12k16’, ‘a_k12_k05k12’, ‘a_k12_k08k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’] |
[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k05k12_k05k12k16’], [‘a_k08_k05k08’], [‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’], [‘a_k08k16_k08k12k16’], [‘a_k12_k05k12’], [‘a_k12_k08k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]] |
backward |
12 |
[‘a_k05k12_k05k12k16’] |
-1697.5623869086814 |
[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08_k05k08’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k08k16_k08k12k16’, ‘a_k12_k05k12’, ‘a_k12_k08k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’] |
[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k08_k05k08’], [‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’], [‘a_k08k16_k08k12k16’], [‘a_k12_k05k12’], [‘a_k12_k08k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]] |
backward |
11 |
[‘a_k12_k08k12’] |
-1699.7744801865597 |
[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08_k05k08’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k08k16_k08k12k16’, ‘a_k12_k05k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’] |
[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k08_k05k08’], [‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’], [‘a_k08k16_k08k12k16’], [‘a_k12_k05k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]] |
backward |
10 |
[‘a_k08k16_k08k12k16’] |
-1701.9477568144061 |
[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08_k05k08’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k12_k05k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’] |
[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k08_k05k08’], [‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’], [‘a_k12_k05k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]] |
backward |
9 |
[‘a_k08_k05k08’] |
-1704.0989381094057 |
[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k12_k05k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’] |
[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’], [‘a_k12_k05k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]] |
backward |
8 |
[‘a_k08k16_k05k08k16’] |
-1706.240107896267 |
[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08k12k16_4ac’, ‘a_k12_k05k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’] |
[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k08k12k16_4ac’], [‘a_k12_k05k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]] |
backward |
7 |
[‘a_k12_k12k16’] |
-1708.110992470788 |
[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08k12k16_4ac’, ‘a_k12_k05k12’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’] |
[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k08k12k16_4ac’], [‘a_k12_k05k12’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]] |
forward |
23 |
[] |
-1708.110992470788 |
[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08k12k16_4ac’, ‘a_k12_k05k12’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’] |
[[‘a_0ac_k05’], [‘a_0ac_k12’], [‘a_0ac_k16’], [‘a_k05_k05k08’], [‘a_k05_k05k16’], [‘a_k05k08_k05k08k12’], [‘a_k05k08_k05k08k16’], [‘a_k05k08k12_4ac’], [‘a_k05k08k16_4ac’], [‘a_k05k12_k05k12k16’], [‘a_k05k12k16_4ac’], [‘a_k05k16_k05k08k16’], [‘a_k05k16_k05k12k16’], [‘a_k08_k05k08’], [‘a_k08_k08k12’], [‘a_k08_k08k16’], [‘a_k08k12_k05k08k12’], [‘a_k08k12_k08k12k16’], [‘a_k08k16_k08k12k16’], [‘a_k12_k08k12’], [‘a_k12k16_k05k12k16’], [‘a_k16_k05k16’], [‘a_k16_k08k16’]] |
lateral |
168 |
[] |
-1708.110992470788 |
[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08k12k16_4ac’, ‘a_k12_k05k12’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’] |
[[‘a_0ac_k05’, ‘a_0ac_k08’], [‘a_0ac_k05’, ‘a_k05_k05k12’], [‘a_0ac_k05’, ‘a_k05k12_k05k08k12’], [‘a_0ac_k05’, ‘a_k08k12k16_4ac’], [‘a_0ac_k05’, ‘a_k12k16_k08k12k16’], [‘a_0ac_k08’, ‘a_0ac_k12’], [‘a_0ac_k08’, ‘a_k05_k05k08’], [‘a_0ac_k08’, ‘a_k05k08_k05k08k12’], [‘a_0ac_k08’, ‘a_k05k16_k05k12k16’], [‘a_0ac_k08’, ‘a_k08_k05k08’], [‘a_0ac_k08’, ‘a_k08_k08k12’], [‘a_0ac_k08’, ‘a_k08_k08k16’], [‘a_0ac_k08’, ‘a_k08k16_k08k12k16’], [‘a_0ac_k08’, ‘a_k12k16_k05k12k16’], [‘a_0ac_k08’, ‘a_k16_k05k16’], [‘a_0ac_k08’, ‘a_k16_k08k16’], [‘a_0ac_k12’, ‘a_k05k12_k05k08k12’], [‘a_0ac_k12’, ‘a_k12k16_k08k12k16’], [‘a_0ac_k16’, ‘a_0ac_k08’], [‘a_0ac_k16’, ‘a_k05_k05k12’], [‘a_0ac_k16’, ‘a_k05k12_k05k08k12’], [‘a_0ac_k16’, ‘a_k08k12k16_4ac’], [‘a_0ac_k16’, ‘a_k12_k05k12’], [‘a_0ac_k16’, ‘a_k12k16_k08k12k16’], [‘a_k05_k05k08’, ‘a_k05k12_k05k08k12’], [‘a_k05_k05k08’, ‘a_k12k16_k08k12k16’], [‘a_k05_k05k12’, ‘a_0ac_k12’], [‘a_k05_k05k12’, ‘a_k05_k05k08’], [‘a_k05_k05k12’, ‘a_k05k08_k05k08k12’], [‘a_k05_k05k12’, ‘a_k05k16_k05k12k16’], [‘a_k05_k05k12’, ‘a_k08_k05k08’], [‘a_k05_k05k12’, ‘a_k08_k08k12’], [‘a_k05_k05k12’, ‘a_k08_k08k16’], [‘a_k05_k05k12’, ‘a_k08k16_k08k12k16’], [‘a_k05_k05k12’, ‘a_k12k16_k05k12k16’], [‘a_k05_k05k12’, ‘a_k16_k05k16’], [‘a_k05_k05k12’, ‘a_k16_k08k16’], [‘a_k05_k05k16’, ‘a_0ac_k08’], [‘a_k05_k05k16’, ‘a_k05_k05k12’], [‘a_k05_k05k16’, ‘a_k05k12_k05k08k12’], [‘a_k05_k05k16’, ‘a_k08k12k16_4ac’], [‘a_k05_k05k16’, ‘a_k12_k05k12’], [‘a_k05_k05k16’, ‘a_k12k16_k08k12k16’], [‘a_k05k08_k05k08k12’, ‘a_k05k12_k05k08k12’], [‘a_k05k08_k05k08k12’, ‘a_k08k12k16_4ac’], [‘a_k05k08_k05k08k12’, ‘a_k12k16_k08k12k16’], [‘a_k05k08_k05k08k16’, ‘a_0ac_k08’], [‘a_k05k08_k05k08k16’, ‘a_k05_k05k12’], [‘a_k05k08_k05k08k16’, ‘a_k05k12_k05k08k12’], [‘a_k05k08_k05k08k16’, ‘a_k08k12k16_4ac’], [‘a_k05k08_k05k08k16’, ‘a_k12_k05k12’], [‘a_k05k08_k05k08k16’, ‘a_k12k16_k08k12k16’], [‘a_k05k08k12_4ac’, ‘a_0ac_k08’], [‘a_k05k08k12_4ac’, ‘a_k05_k05k12’], [‘a_k05k08k12_4ac’, ‘a_k05k12_k05k08k12’], [‘a_k05k08k12_4ac’, ‘a_k08k12k16_4ac’], [‘a_k05k08k12_4ac’, ‘a_k12_k05k12’], [‘a_k05k08k12_4ac’, ‘a_k12k16_k08k12k16’], [‘a_k05k08k12_4ac’, ‘a_k16_k12k16’], [‘a_k05k08k16_4ac’, ‘a_0ac_k08’], [‘a_k05k08k16_4ac’, ‘a_k05_k05k12’], [‘a_k05k08k16_4ac’, ‘a_k05k12_k05k08k12’], [‘a_k05k08k16_4ac’, ‘a_k08k12k16_4ac’], [‘a_k05k08k16_4ac’, ‘a_k12k16_k08k12k16’], [‘a_k05k12_k05k08k12’, ‘a_k08_k05k08’], [‘a_k05k12_k05k08k12’, ‘a_k08_k08k16’], [‘a_k05k12_k05k12k16’, ‘a_0ac_k08’], [‘a_k05k12_k05k12k16’, ‘a_k05_k05k12’], [‘a_k05k12_k05k12k16’, ‘a_k05k12_k05k08k12’], [‘a_k05k12_k05k12k16’, ‘a_k08k12k16_4ac’], [‘a_k05k12_k05k12k16’, ‘a_k12_k05k12’], [‘a_k05k12_k05k12k16’, ‘a_k12k16_k08k12k16’], [‘a_k05k12k16_4ac’, ‘a_0ac_k08’], [‘a_k05k12k16_4ac’, ‘a_k05_k05k12’], [‘a_k05k12k16_4ac’, ‘a_k05k12_k05k08k12’], [‘a_k05k12k16_4ac’, ‘a_k08k12k16_4ac’], [‘a_k05k12k16_4ac’, ‘a_k12_k05k12’], [‘a_k05k12k16_4ac’, ‘a_k12k16_k08k12k16’], [‘a_k05k12k16_4ac’, ‘a_k16_k12k16’], [‘a_k05k16_k05k08k16’, ‘a_0ac_k08’], [‘a_k05k16_k05k08k16’, ‘a_k05_k05k12’], [‘a_k05k16_k05k08k16’, ‘a_k05k12_k05k08k12’], [‘a_k05k16_k05k08k16’, ‘a_k08k12k16_4ac’], [‘a_k05k16_k05k08k16’, ‘a_k12_k05k12’], [‘a_k05k16_k05k08k16’, ‘a_k12k16_k08k12k16’], [‘a_k05k16_k05k08k16’, ‘a_k16_k12k16’], [‘a_k05k16_k05k12k16’, ‘a_k05k12_k05k08k12’], [‘a_k05k16_k05k12k16’, ‘a_k08k12k16_4ac’], [‘a_k05k16_k05k12k16’, ‘a_k12k16_k08k12k16’], [‘a_k08_k08k12’, ‘a_k05k12_k05k08k12’], [‘a_k08_k08k12’, ‘a_k12k16_k08k12k16’], [‘a_k08k12_k05k08k12’, ‘a_0ac_k08’], [‘a_k08k12_k05k08k12’, ‘a_k05_k05k12’], [‘a_k08k12_k05k08k12’, ‘a_k05k12_k05k08k12’], [‘a_k08k12_k05k08k12’, ‘a_k08k12k16_4ac’], [‘a_k08k12_k05k08k12’, ‘a_k12k16_k08k12k16’], [‘a_k08k12_k08k12k16’, ‘a_0ac_k08’], [‘a_k08k12_k08k12k16’, ‘a_k05_k05k12’], [‘a_k08k12_k08k12k16’, ‘a_k05k12_k05k08k12’], [‘a_k08k12_k08k12k16’, ‘a_k08k12k16_4ac’], [‘a_k08k12_k08k12k16’, ‘a_k12k16_k08k12k16’], [‘a_k08k12k16_4ac’, ‘a_0ac_k12’], [‘a_k08k12k16_4ac’, ‘a_k05_k05k08’], [‘a_k08k12k16_4ac’, ‘a_k08_k05k08’], [‘a_k08k12k16_4ac’, ‘a_k08_k08k12’], [‘a_k08k12k16_4ac’, ‘a_k08_k08k16’], [‘a_k08k12k16_4ac’, ‘a_k08k16_k08k12k16’], [‘a_k08k12k16_4ac’, ‘a_k12k16_k05k12k16’], [‘a_k08k12k16_4ac’, ‘a_k16_k05k16’], [‘a_k08k12k16_4ac’, ‘a_k16_k08k16’], [‘a_k08k16_k05k08k16’, ‘a_0ac_k08’], [‘a_k08k16_k05k08k16’, ‘a_k05_k05k12’], [‘a_k08k16_k05k08k16’, ‘a_k05k12_k05k08k12’], [‘a_k08k16_k05k08k16’, ‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’, ‘a_k12k16_k08k12k16’], [‘a_k08k16_k08k12k16’, ‘a_k05k12_k05k08k12’], [‘a_k12_k05k12’, ‘a_0ac_k05’], [‘a_k12_k05k12’, ‘a_0ac_k12’], [‘a_k12_k05k12’, ‘a_k05_k05k08’], [‘a_k12_k05k12’, ‘a_k05k08_k05k08k12’], [‘a_k12_k05k12’, ‘a_k05k08k16_4ac’], [‘a_k12_k05k12’, ‘a_k05k16_k05k12k16’], [‘a_k12_k05k12’, ‘a_k08_k05k08’], [‘a_k12_k05k12’, ‘a_k08_k08k12’], [‘a_k12_k05k12’, ‘a_k08_k08k16’], [‘a_k12_k05k12’, ‘a_k08k12_k05k08k12’], [‘a_k12_k05k12’, ‘a_k08k12_k08k12k16’], [‘a_k12_k05k12’, ‘a_k08k16_k05k08k16’], [‘a_k12_k05k12’, ‘a_k08k16_k08k12k16’], [‘a_k12_k05k12’, ‘a_k12k16_k05k12k16’], [‘a_k12_k05k12’, ‘a_k16_k05k16’], [‘a_k12_k05k12’, ‘a_k16_k08k16’], [‘a_k12_k08k12’, ‘a_0ac_k08’], [‘a_k12_k08k12’, ‘a_k05_k05k12’], [‘a_k12_k08k12’, ‘a_k05k12_k05k08k12’], [‘a_k12_k08k12’, ‘a_k08k12k16_4ac’], [‘a_k12_k08k12’, ‘a_k12_k05k12’], [‘a_k12_k08k12’, ‘a_k12k16_k08k12k16’], [‘a_k12k16_k05k12k16’, ‘a_k05k12_k05k08k12’], [‘a_k12k16_k05k12k16’, ‘a_k12k16_k08k12k16’], [‘a_k12k16_k08k12k16’, ‘a_k08_k05k08’], [‘a_k12k16_k08k12k16’, ‘a_k08_k08k16’], [‘a_k12k16_k08k12k16’, ‘a_k08k16_k08k12k16’], [‘a_k16_k05k16’, ‘a_k05k12_k05k08k12’], [‘a_k16_k05k16’, ‘a_k12k16_k08k12k16’], [‘a_k16_k08k16’, ‘a_k05k12_k05k08k12’], [‘a_k16_k08k16’, ‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’, ‘a_0ac_k05’], [‘a_k16_k12k16’, ‘a_0ac_k12’], [‘a_k16_k12k16’, ‘a_0ac_k16’], [‘a_k16_k12k16’, ‘a_k05_k05k08’], [‘a_k16_k12k16’, ‘a_k05_k05k16’], [‘a_k16_k12k16’, ‘a_k05k08_k05k08k12’], [‘a_k16_k12k16’, ‘a_k05k08_k05k08k16’], [‘a_k16_k12k16’, ‘a_k05k08k16_4ac’], [‘a_k16_k12k16’, ‘a_k05k12_k05k12k16’], [‘a_k16_k12k16’, ‘a_k05k16_k05k12k16’], [‘a_k16_k12k16’, ‘a_k08_k05k08’], [‘a_k16_k12k16’, ‘a_k08_k08k12’], [‘a_k16_k12k16’, ‘a_k08_k08k16’], [‘a_k16_k12k16’, ‘a_k08k12_k05k08k12’], [‘a_k16_k12k16’, ‘a_k08k12_k08k12k16’], [‘a_k16_k12k16’, ‘a_k08k16_k05k08k16’], [‘a_k16_k12k16’, ‘a_k08k16_k08k12k16’], [‘a_k16_k12k16’, ‘a_k12_k08k12’], [‘a_k16_k12k16’, ‘a_k12k16_k05k12k16’], [‘a_k16_k12k16’, ‘a_k16_k05k16’], [‘a_k16_k12k16’, ‘a_k16_k08k16’]] |
lateral |
0 |
[] |
-1708.110992470788 |
[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08k12k16_4ac’, ‘a_k12_k05k12’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’] |
[] |
Why this problem is challenging
This problem is difficult to solve. The search space is large (~4.3 billion models); hence, many models have criterion values that differ by less than numerical noise, so the ranking of near-optimal models is effectively non-deterministic across machines and tolerances. Plain forward or backward selection mostly fails to reach the optimum.
The FAMoS method reproducibly converges to models with markedly better criterion values. Multi-start FAMoS searches recover the published best model while calibrating only a small fraction (~0.002 %) of the full model space, making this large-scale problem computationally feasible.
Because of the numerical-noise sensitivity noted above, when running this test
case you should expect to obtain a similar (but not necessarily identical)
expected_summary.tsv (a few rows may be reordered, or the path through model space
may differ). However, the improvement in criterion value over consecutive rows should be conserved, and the same select model should be found.