Large-scale, real-world problem with steady-state data

This page describes the model selection problem available in the PEtab Select test suite, case 0009.

It reproduces the histone H4 acetylation model selection problem of Blasi et al. (2016) [Blasi2016]. This involves a model space of \(2^{32} \approx 4.3\) billion models. The model is formulated as a ordinary differential equation system based on biochemical reactions. The dataset is real experimental data, measured at steady-state. Briefly, the model selection problem is to identify which subset of 32 parameters should exist in the model to explain the data.

The files for this model selection problem are available at test_cases/0009.

Biological background

Histones are the proteins around which DNA is wrapped, and histone N-terminal “tails” have post-translational modifications that regulate chromatin. The N-terminal tail of histone H4 has four lysine (K) residues that can each be acetylated: K5, K8, K12, and K16.

Because each of the four sites is either acetylated or not, there are \(2^4 = 16\) possible acetylation patterns, called motifs, ranging from the fully unacetylated state to the fully acetylated state . The research question in [Blasi2016] is to determine how the acetylation patterns are generated: do the acetylation reactions all occur at the same rate regardless of the current acetylation motif, or are some of them motif-specific?

As described below, there are a total of 32 acetylation reactions. The original publication [Blasi2016] found that the “best” model had 7 of these reactions that occur at a motif-specific rate, and the rest occur at a shared rate.

The mathematical model

The model is a reaction network over the 16 acetylation motifs of the histone H4 tail. Each motif is a chemical species \(x_m\), and a reaction adds (acetylation) or removes (deacetylation) a single acetyl group, connecting motifs that differ at exactly one site. Hence, there are 32 acetylation reactions and 32 deacetylation reactions. The dynamics use mass-action kinetics, and the measured abundances are assumed to be at steady state (acetylation/deacetylation are much faster than the cell cycle); in the PEtab problem this is encoded by measurements at time = inf.

The 16 motifs (state variables)

The state variables x_<motif> grouped by the number of acetylated lysines (of K5, K8, K12, K16) are:

  • zero acetylated lysines: x_0ac

  • one acetylated lysine: x_k05, x_k08, x_k12, x_k16

  • two acetylated lysines: x_k05k08, x_k05k12, x_k05k16, x_k08k12, x_k08k16, x_k12k16

  • three acetylated lysines: x_k05k08k12, x_k05k08k16, x_k05k12k16, x_k08k12k16

  • four acetylated lysines: x_4ac

The 32 acetylation reactions

Each acetylation reaction converts one motif \(p\) into another motif \(q\), \(p \to q\), at rate \(a_{p\to q}\, a_b\, x_p\), where \(a_b\) is the basal (shared) acetylation rate and \(a_{p\to q} \equiv\) a_<p>_<q> is a factor that represents the research question. This factor is either fixed to one for reactions that occur with the shared basal rate constant, or the factor is estimated, to change the rate constant to be motif-specific. Each acetylation reaction also has an associated deacetylation reaction \(q \to p\) with rate \(d\, x_q\), where \(d \equiv\) da_b is a single, shared basal deacetylation rate constant (fixed to 1).

The acetylation reactions are explicitly:

      x_0ac -> x_k05            (rate  a_0ac_k05 * a_b * x_0ac)
      x_0ac -> x_k08            (rate  a_0ac_k08 * a_b * x_0ac)
      x_0ac -> x_k12            (rate  a_0ac_k12 * a_b * x_0ac)
      x_0ac -> x_k16            (rate  a_0ac_k16 * a_b * x_0ac)
      x_k05 -> x_k05k08         (rate  a_k05_k05k08 * a_b * x_k05)
      x_k05 -> x_k05k12         (rate  a_k05_k05k12 * a_b * x_k05)
      x_k05 -> x_k05k16         (rate  a_k05_k05k16 * a_b * x_k05)
      x_k08 -> x_k05k08         (rate  a_k08_k05k08 * a_b * x_k08)
      x_k08 -> x_k08k12         (rate  a_k08_k08k12 * a_b * x_k08)
      x_k08 -> x_k08k16         (rate  a_k08_k08k16 * a_b * x_k08)
      x_k12 -> x_k05k12         (rate  a_k12_k05k12 * a_b * x_k12)
      x_k12 -> x_k08k12         (rate  a_k12_k08k12 * a_b * x_k12)
      x_k12 -> x_k12k16         (rate  a_k12_k12k16 * a_b * x_k12)
      x_k16 -> x_k05k16         (rate  a_k16_k05k16 * a_b * x_k16)
      x_k16 -> x_k08k16         (rate  a_k16_k08k16 * a_b * x_k16)
      x_k16 -> x_k12k16         (rate  a_k16_k12k16 * a_b * x_k16)
   x_k05k08 -> x_k05k08k12      (rate  a_k05k08_k05k08k12 * a_b * x_k05k08)
   x_k05k08 -> x_k05k08k16      (rate  a_k05k08_k05k08k16 * a_b * x_k05k08)
   x_k05k12 -> x_k05k08k12      (rate  a_k05k12_k05k08k12 * a_b * x_k05k12)
   x_k05k12 -> x_k05k12k16      (rate  a_k05k12_k05k12k16 * a_b * x_k05k12)
   x_k05k16 -> x_k05k08k16      (rate  a_k05k16_k05k08k16 * a_b * x_k05k16)
   x_k05k16 -> x_k05k12k16      (rate  a_k05k16_k05k12k16 * a_b * x_k05k16)
   x_k08k12 -> x_k05k08k12      (rate  a_k08k12_k05k08k12 * a_b * x_k08k12)
   x_k08k12 -> x_k08k12k16      (rate  a_k08k12_k08k12k16 * a_b * x_k08k12)
   x_k08k16 -> x_k05k08k16      (rate  a_k08k16_k05k08k16 * a_b * x_k08k16)
   x_k08k16 -> x_k08k12k16      (rate  a_k08k16_k08k12k16 * a_b * x_k08k16)
   x_k12k16 -> x_k05k12k16      (rate  a_k12k16_k05k12k16 * a_b * x_k12k16)
   x_k12k16 -> x_k08k12k16      (rate  a_k12k16_k08k12k16 * a_b * x_k12k16)
x_k05k08k12 -> x_4ac            (rate  a_k05k08k12_4ac * a_b * x_k05k08k12)
x_k05k08k16 -> x_4ac            (rate  a_k05k08k16_4ac * a_b * x_k05k08k16)
x_k05k12k16 -> x_4ac            (rate  a_k05k12k16_4ac * a_b * x_k05k12k16)
x_k08k12k16 -> x_4ac            (rate  a_k08k12k16_4ac * a_b * x_k08k12k16)

The 32 reverse deacetylations occur at rate e.g. da_b * x_k05 for k05 -> 0ac. These are not listed explicitly but are present for every acetylation reaction above.

The ordinary differential equations

Based on these reactions and rates, the ODE system is therefore

\[\begin{split}\frac{\mathrm{d}x_{\mathrm{0ac}}}{\mathrm{d}t} &= d\left(x_{\mathrm{k05}} + x_{\mathrm{k08}} + x_{\mathrm{k12}} + x_{\mathrm{k16}}\right) - a_b\left(a_{\mathrm{0ac}\to\mathrm{k05}} + a_{\mathrm{0ac}\to\mathrm{k08}} + a_{\mathrm{0ac}\to\mathrm{k12}} + a_{\mathrm{0ac}\to\mathrm{k16}}\right)x_{\mathrm{0ac}} \\ \frac{\mathrm{d}x_{\mathrm{k05}}}{\mathrm{d}t} &= a_b\,a_{\mathrm{0ac}\to\mathrm{k05}} x_{\mathrm{0ac}} + d\left(x_{\mathrm{k05k08}} + x_{\mathrm{k05k12}} + x_{\mathrm{k05k16}}\right) - a_b\left(a_{\mathrm{k05}\to\mathrm{k05k08}} + a_{\mathrm{k05}\to\mathrm{k05k12}} + a_{\mathrm{k05}\to\mathrm{k05k16}}\right)x_{\mathrm{k05}} - d\,x_{\mathrm{k05}} \\ \frac{\mathrm{d}x_{\mathrm{k08}}}{\mathrm{d}t} &= a_b\,a_{\mathrm{0ac}\to\mathrm{k08}} x_{\mathrm{0ac}} + d\left(x_{\mathrm{k05k08}} + x_{\mathrm{k08k12}} + x_{\mathrm{k08k16}}\right) - a_b\left(a_{\mathrm{k08}\to\mathrm{k05k08}} + a_{\mathrm{k08}\to\mathrm{k08k12}} + a_{\mathrm{k08}\to\mathrm{k08k16}}\right)x_{\mathrm{k08}} - d\,x_{\mathrm{k08}} \\ \frac{\mathrm{d}x_{\mathrm{k12}}}{\mathrm{d}t} &= a_b\,a_{\mathrm{0ac}\to\mathrm{k12}} x_{\mathrm{0ac}} + d\left(x_{\mathrm{k05k12}} + x_{\mathrm{k08k12}} + x_{\mathrm{k12k16}}\right) - a_b\left(a_{\mathrm{k12}\to\mathrm{k05k12}} + a_{\mathrm{k12}\to\mathrm{k08k12}} + a_{\mathrm{k12}\to\mathrm{k12k16}}\right)x_{\mathrm{k12}} - d\,x_{\mathrm{k12}} \\ \frac{\mathrm{d}x_{\mathrm{k16}}}{\mathrm{d}t} &= a_b\,a_{\mathrm{0ac}\to\mathrm{k16}} x_{\mathrm{0ac}} + d\left(x_{\mathrm{k05k16}} + x_{\mathrm{k08k16}} + x_{\mathrm{k12k16}}\right) - a_b\left(a_{\mathrm{k16}\to\mathrm{k05k16}} + a_{\mathrm{k16}\to\mathrm{k08k16}} + a_{\mathrm{k16}\to\mathrm{k12k16}}\right)x_{\mathrm{k16}} - d\,x_{\mathrm{k16}} \\ \frac{\mathrm{d}x_{\mathrm{k05k08}}}{\mathrm{d}t} &= a_b\left(a_{\mathrm{k05}\to\mathrm{k05k08}} x_{\mathrm{k05}} + a_{\mathrm{k08}\to\mathrm{k05k08}} x_{\mathrm{k08}}\right) + d\left(x_{\mathrm{k05k08k12}} + x_{\mathrm{k05k08k16}}\right) - a_b\left(a_{\mathrm{k05k08}\to\mathrm{k05k08k12}} + a_{\mathrm{k05k08}\to\mathrm{k05k08k16}}\right)x_{\mathrm{k05k08}} - 2\,d\,x_{\mathrm{k05k08}} \\ \frac{\mathrm{d}x_{\mathrm{k05k12}}}{\mathrm{d}t} &= a_b\left(a_{\mathrm{k05}\to\mathrm{k05k12}} x_{\mathrm{k05}} + a_{\mathrm{k12}\to\mathrm{k05k12}} x_{\mathrm{k12}}\right) + d\left(x_{\mathrm{k05k08k12}} + x_{\mathrm{k05k12k16}}\right) - a_b\left(a_{\mathrm{k05k12}\to\mathrm{k05k08k12}} + a_{\mathrm{k05k12}\to\mathrm{k05k12k16}}\right)x_{\mathrm{k05k12}} - 2\,d\,x_{\mathrm{k05k12}} \\ \frac{\mathrm{d}x_{\mathrm{k05k16}}}{\mathrm{d}t} &= a_b\left(a_{\mathrm{k05}\to\mathrm{k05k16}} x_{\mathrm{k05}} + a_{\mathrm{k16}\to\mathrm{k05k16}} x_{\mathrm{k16}}\right) + d\left(x_{\mathrm{k05k08k16}} + x_{\mathrm{k05k12k16}}\right) - a_b\left(a_{\mathrm{k05k16}\to\mathrm{k05k08k16}} + a_{\mathrm{k05k16}\to\mathrm{k05k12k16}}\right)x_{\mathrm{k05k16}} - 2\,d\,x_{\mathrm{k05k16}} \\ \frac{\mathrm{d}x_{\mathrm{k08k12}}}{\mathrm{d}t} &= a_b\left(a_{\mathrm{k08}\to\mathrm{k08k12}} x_{\mathrm{k08}} + a_{\mathrm{k12}\to\mathrm{k08k12}} x_{\mathrm{k12}}\right) + d\left(x_{\mathrm{k05k08k12}} + x_{\mathrm{k08k12k16}}\right) - a_b\left(a_{\mathrm{k08k12}\to\mathrm{k05k08k12}} + a_{\mathrm{k08k12}\to\mathrm{k08k12k16}}\right)x_{\mathrm{k08k12}} - 2\,d\,x_{\mathrm{k08k12}} \\ \frac{\mathrm{d}x_{\mathrm{k08k16}}}{\mathrm{d}t} &= a_b\left(a_{\mathrm{k08}\to\mathrm{k08k16}} x_{\mathrm{k08}} + a_{\mathrm{k16}\to\mathrm{k08k16}} x_{\mathrm{k16}}\right) + d\left(x_{\mathrm{k05k08k16}} + x_{\mathrm{k08k12k16}}\right) - a_b\left(a_{\mathrm{k08k16}\to\mathrm{k05k08k16}} + a_{\mathrm{k08k16}\to\mathrm{k08k12k16}}\right)x_{\mathrm{k08k16}} - 2\,d\,x_{\mathrm{k08k16}} \\ \frac{\mathrm{d}x_{\mathrm{k12k16}}}{\mathrm{d}t} &= a_b\left(a_{\mathrm{k12}\to\mathrm{k12k16}} x_{\mathrm{k12}} + a_{\mathrm{k16}\to\mathrm{k12k16}} x_{\mathrm{k16}}\right) + d\left(x_{\mathrm{k05k12k16}} + x_{\mathrm{k08k12k16}}\right) - a_b\left(a_{\mathrm{k12k16}\to\mathrm{k05k12k16}} + a_{\mathrm{k12k16}\to\mathrm{k08k12k16}}\right)x_{\mathrm{k12k16}} - 2\,d\,x_{\mathrm{k12k16}} \\ \frac{\mathrm{d}x_{\mathrm{k05k08k12}}}{\mathrm{d}t} &= a_b\left(a_{\mathrm{k05k08}\to\mathrm{k05k08k12}} x_{\mathrm{k05k08}} + a_{\mathrm{k05k12}\to\mathrm{k05k08k12}} x_{\mathrm{k05k12}} + a_{\mathrm{k08k12}\to\mathrm{k05k08k12}} x_{\mathrm{k08k12}}\right) + d\,x_{\mathrm{4ac}} - a_b\,a_{\mathrm{k05k08k12}\to\mathrm{4ac}}x_{\mathrm{k05k08k12}} - 3\,d\,x_{\mathrm{k05k08k12}} \\ \frac{\mathrm{d}x_{\mathrm{k05k08k16}}}{\mathrm{d}t} &= a_b\left(a_{\mathrm{k05k08}\to\mathrm{k05k08k16}} x_{\mathrm{k05k08}} + a_{\mathrm{k05k16}\to\mathrm{k05k08k16}} x_{\mathrm{k05k16}} + a_{\mathrm{k08k16}\to\mathrm{k05k08k16}} x_{\mathrm{k08k16}}\right) + d\,x_{\mathrm{4ac}} - a_b\,a_{\mathrm{k05k08k16}\to\mathrm{4ac}}x_{\mathrm{k05k08k16}} - 3\,d\,x_{\mathrm{k05k08k16}} \\ \frac{\mathrm{d}x_{\mathrm{k05k12k16}}}{\mathrm{d}t} &= a_b\left(a_{\mathrm{k05k12}\to\mathrm{k05k12k16}} x_{\mathrm{k05k12}} + a_{\mathrm{k05k16}\to\mathrm{k05k12k16}} x_{\mathrm{k05k16}} + a_{\mathrm{k12k16}\to\mathrm{k05k12k16}} x_{\mathrm{k12k16}}\right) + d\,x_{\mathrm{4ac}} - a_b\,a_{\mathrm{k05k12k16}\to\mathrm{4ac}}x_{\mathrm{k05k12k16}} - 3\,d\,x_{\mathrm{k05k12k16}} \\ \frac{\mathrm{d}x_{\mathrm{k08k12k16}}}{\mathrm{d}t} &= a_b\left(a_{\mathrm{k08k12}\to\mathrm{k08k12k16}} x_{\mathrm{k08k12}} + a_{\mathrm{k08k16}\to\mathrm{k08k12k16}} x_{\mathrm{k08k16}} + a_{\mathrm{k12k16}\to\mathrm{k08k12k16}} x_{\mathrm{k12k16}}\right) + d\,x_{\mathrm{4ac}} - a_b\,a_{\mathrm{k08k12k16}\to\mathrm{4ac}}x_{\mathrm{k08k12k16}} - 3\,d\,x_{\mathrm{k08k12k16}} \\ \frac{\mathrm{d}x_{\mathrm{4ac}}}{\mathrm{d}t} &= a_b\left(a_{\mathrm{k05k08k12}\to\mathrm{4ac}} x_{\mathrm{k05k08k12}} + a_{\mathrm{k05k08k16}\to\mathrm{4ac}} x_{\mathrm{k05k08k16}} + a_{\mathrm{k05k12k16}\to\mathrm{4ac}} x_{\mathrm{k05k12k16}} + a_{\mathrm{k08k12k16}\to\mathrm{4ac}} x_{\mathrm{k08k12k16}}\right) - 4\,d\,x_{\mathrm{4ac}}\end{split}\]

The observation model

Each observable is the steady-state abundance of one motif,

\[y_m(\theta) = x_m^{\mathrm{ss}}(\theta),\]

for the 15 observed motifs, \(m \in \mathcal{O}\). The observed set \(\mathcal{O}\) excludes the x_k05k16 motif because it was below the quantification limit of the experiment. The measured are assumed to follow a log-normal (multiplicative) noise distribution:

\[\ln \bar{y}_{m,r} = \ln y_m(\theta) + \varepsilon_{m,r}, \qquad \varepsilon_{m,r} \sim \mathcal{N}(0, \sigma^2),\]

where \(\bar{y}_{m,r}\) is replicate \(r\) of the measurement of motif \(m\), and \(\sigma =\) sigma_ is the noise parameter (fixed to 1).

The likelihood function

The likelihood of the estimated parameters \(\theta\) (the basal rate \(a_b\) and the estimated motif-specific factors) is

\[\mathcal{L}(\theta) = \prod_{m \in \mathcal{O}} \prod_{r=1}^{R_m} \frac{1}{\bar{y}_{m,r}\, \sigma \sqrt{2\pi}} \exp\!\left( -\frac{\left(\ln \bar{y}_{m,r} - \ln y_m(\theta)\right)^2}{2\sigma^2} \right),\]

and the corresponding negative log-likelihood, which PEtab Select uses to compute model selection criteria, is

\[\mathrm{NLLH}(\theta) = \sum_{m \in \mathcal{O}} \sum_{r=1}^{R_m} \left[ \ln\!\left(\bar{y}_{m,r}\, \sigma \sqrt{2\pi}\right) + \frac{\left(\ln \bar{y}_{m,r} - \ln y_m(\theta)\right)^2}{2\sigma^2} \right],\]

where \(R_m\) is the number of replicates of motif \(m\).

Experimental data

The model is fitted to the relative abundances of the H4 acetylation motifs measured in Drosophila melanogaster Kc cells. The data come from the quantitative mass-spectrometry study of Feller et al. (2015), as used by Blasi et al. [Blasi2016]:

Briefly, histone H4 was extracted from wild-type Kc cells. The relative abundances of the H4 N-terminal acetylation motifs were quantified by liquid chromatography–mass spectrometry (LC–MS). The measurements represent the steady-state distribution of motifs.

Of the 16 motifs, one (K5K16) lies below the quantification limit and is not used; hence, the PEtab problem has 15 observables (one per measured motif) and 252 measurements in total.

The model selection problem

There is one hypothesis per acetylation reaction: its rate is either the shared basal rate (\(a_{p\to q}\) fixed to 1) or an estimated motif-specific rate (\(a_{p\to q}\) estimated). With 32 reactions, the model space therefore contains \(2^{32} \approx 4.3\) billion models. The task is to identify the subset of reactions that require a motif-specific rate constant to explain the data. The published best model had seven motif-specific reactions [Blasi2016].

This PEtab Select formulation differs from the original publication in three ways in that it omits 11 additional models that were considered in the original publication. These 11 models add negligible computational cost to the model selection problem and were not amongst the best models in the original publication, so are ignored here. Furthermore, we use the FAMoS search method as a general strategy, instead of the highly-tailored problem-specific approach used in the original publication that enabled them to use the brute-force method.

The PEtab Select files

PEtab Select problem YAML

The problem file specifies the criterion, search method, model space file and additional arguments for the search method.

format_version: beta_1
criterion: AICc
method: famos
model_space_files:
- model_space.tsv
candidate_space_arguments:
  critical_parameter_sets: []
  consecutive_laterals: true
  predecessor_model: predecessor_model.yaml
  summary_tsv: output/summary.tsv
  swap_parameter_sets:
  - - a_0ac_k05
    - a_0ac_k08
    - a_0ac_k12
    - a_0ac_k16
    - a_k05_k05k08
    - a_k05_k05k12
    - a_k05_k05k16
    - a_k05k08_k05k08k12
    - a_k05k08_k05k08k16
    - a_k05k08k12_4ac
    - a_k05k08k16_4ac
    - a_k05k12_k05k08k12
    - a_k05k12_k05k12k16
    - a_k05k12k16_4ac
    - a_k05k16_k05k08k16
    - a_k05k16_k05k12k16
    - a_k08_k05k08
    - a_k08_k08k12
    - a_k08_k08k16
    - a_k08k12_k05k08k12
    - a_k08k12_k08k12k16
    - a_k08k12k16_4ac
    - a_k08k16_k05k08k16
    - a_k08k16_k08k12k16
    - a_k12_k05k12
    - a_k12_k08k12
    - a_k12_k12k16
    - a_k12k16_k05k12k16
    - a_k12k16_k08k12k16
    - a_k16_k05k16
    - a_k16_k08k16
    - a_k16_k12k16

The candidate_space_arguments configure FAMoS:

  • predecessor_model: the initial model the search starts from (see below).

  • critical_parameter_sets: empty here — no reaction is forced to always be motif-specific.

  • swap_parameter_sets: a single set containing all 32 reaction parameters. FAMoS lateral (swap) moves exchange an estimated parameter for a fixed one; restricting swaps to within a set means any motif-specific reaction may be swapped for any other.

  • consecutive_laterals: true: keep performing lateral moves while they keep improving the model.

  • summary_tsv: where to write a summary of the search status and history.

Model space

The model space is a single subspace M with 32 parameter columns, one per acetylation reaction. Every column has the value 1.0;estimate, meaning each reaction can be either fixed to the basal rate (1.0) or estimated (motif-specific). This concisely encodes all ~4.3 billion models in a single row (the table is wide, with one column per reaction):

model_subspace_id

model_subspace_petab_yaml

a_0ac_k05

a_0ac_k08

a_0ac_k12

a_0ac_k16

a_k05_k05k08

a_k05_k05k12

a_k05_k05k16

a_k08_k05k08

a_k08_k08k12

a_k08_k08k16

a_k12_k05k12

a_k12_k08k12

a_k12_k12k16

a_k16_k05k16

a_k16_k08k16

a_k16_k12k16

a_k05k08_k05k08k12

a_k05k08_k05k08k16

a_k05k12_k05k08k12

a_k05k12_k05k12k16

a_k05k16_k05k08k16

a_k05k16_k05k12k16

a_k08k12_k05k08k12

a_k08k12_k08k12k16

a_k08k16_k05k08k16

a_k08k16_k08k12k16

a_k12k16_k05k12k16

a_k12k16_k08k12k16

a_k05k08k12_4ac

a_k05k08k16_4ac

a_k05k12k16_4ac

a_k08k12k16_4ac

M

petab/petab_problem.yaml

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

1.0;estimate

The PEtab problem (petab/)

The model space references a standard PEtab problem that defines the superset model (all reactions present):

  • model.xml: the SBML model with the 16 motif species and the acetylation / deacetylation reactions.

  • parameters.tsv: the 32 reaction-rate factors a_<reaction> (log10 scale, bounds [1e-3, 1e3]), the basal acetylation rate a_b, the deacetylation reference rate da_b (fixed to 1), and the noise parameter sigma_.

  • observables.tsv: 15 observables, one per measured motif, and the log-normal noise distribution.

  • measurements.tsv: the 252 steady-state (time = inf) abundance measurements.

  • conditions.tsv: a single dummy condition.

Predecessor (initial) model

predecessor_model.yaml is the model the FAMoS search is initialised from, in the PEtab Select model YAML file format. It is a specific starting model, with 15 motif-specific reactions, that was found to reproducibly lead to the best model. Without it, solving the problem from scratch requires on the order of 100 randomly initialised FAMoS searches.

criteria:
  NLLH: inf
estimated_parameters: {}
model_hash: M-01000001111110010011010011010001
model_id: M-01000001111110010011010011010001
model_subspace_id: M
model_subspace_indices:
- 0
- 1
- 0
- 0
- 0
- 0
- 0
- 1
- 1
- 1
- 1
- 1
- 1
- 0
- 0
- 1
- 0
- 0
- 1
- 1
- 0
- 1
- 0
- 0
- 1
- 1
- 0
- 1
- 0
- 0
- 0
- 1
parameters:
  a_0ac_k05: 1
  a_0ac_k08: estimate
  a_0ac_k12: 1
  a_0ac_k16: 1
  a_k05_k05k08: 1
  a_k05_k05k12: 1
  a_k05_k05k16: 1
  a_k05k08_k05k08k12: 1
  a_k05k08_k05k08k16: 1
  a_k05k08k12_4ac: 1
  a_k05k08k16_4ac: 1
  a_k05k12_k05k08k12: estimate
  a_k05k12_k05k12k16: estimate
  a_k05k12k16_4ac: 1
  a_k05k16_k05k08k16: 1
  a_k05k16_k05k12k16: estimate
  a_k08_k05k08: estimate
  a_k08_k08k12: estimate
  a_k08_k08k16: estimate
  a_k08k12_k05k08k12: 1
  a_k08k12_k08k12k16: 1
  a_k08k12k16_4ac: estimate
  a_k08k16_k05k08k16: estimate
  a_k08k16_k08k12k16: estimate
  a_k12_k05k12: estimate
  a_k12_k08k12: estimate
  a_k12_k12k16: estimate
  a_k12k16_k05k12k16: 1
  a_k12k16_k08k12k16: estimate
  a_k16_k05k16: 1
  a_k16_k08k16: 1
  a_k16_k12k16: estimate
model_subspace_petab_yaml: petab/petab_problem.yaml
predecessor_model_hash: virtual_initial_model-

Expected results

expected.yaml is the expected selected model, in the PEtab Select model YAML file format. It has the seven motif-specific reactions of the published best model (a_0ac_k08, a_k05_k05k12, a_k12_k05k12, a_k16_k12k16, a_k05k12_k05k08k12, a_k12k16_k08k12k16, a_k08k12k16_4ac) plus the basal rate a_b — eight estimated parameters in total — with AICc -1708.1.

model_subspace_id: M
model_subspace_indices:
- 0
- 1
- 0
- 0
- 0
- 1
- 0
- 0
- 0
- 0
- 1
- 0
- 0
- 0
- 0
- 1
- 0
- 0
- 1
- 0
- 0
- 0
- 0
- 0
- 0
- 0
- 0
- 1
- 0
- 0
- 0
- 1
criteria:
  NLLH: -862.3517925313981
  AICc: -1708.1109924702037
model_hash: M-01000100001000010010000000010001
model_subspace_petab_yaml: petab/petab_problem.yaml
estimated_parameters:
  a_0ac_k08: 0.40850355273291267
  a_k05_k05k12: 30.888150959586138
  a_k12_k05k12: 8.267845459216893
  a_k16_k12k16: 10.424629099941777
  a_k05k12_k05k08k12: 4.872747603868694
  a_k12k16_k08k12k16: 33.03769174387633
  a_k08k12k16_4ac: 53.80106471593421
  a_b: 0.06675819571287103
iteration: 11
model_id: M-01000100001000010010000000010001
parameters:
  a_0ac_k05: 1
  a_0ac_k08: estimate
  a_0ac_k12: 1
  a_0ac_k16: 1
  a_k05_k05k08: 1
  a_k05_k05k12: estimate
  a_k05_k05k16: 1
  a_k08_k05k08: 1
  a_k08_k08k12: 1
  a_k08_k08k16: 1
  a_k12_k05k12: estimate
  a_k12_k08k12: 1
  a_k12_k12k16: 1
  a_k16_k05k16: 1
  a_k16_k08k16: 1
  a_k16_k12k16: estimate
  a_k05k08_k05k08k12: 1
  a_k05k08_k05k08k16: 1
  a_k05k12_k05k08k12: estimate
  a_k05k12_k05k12k16: 1
  a_k05k16_k05k08k16: 1
  a_k05k16_k05k12k16: 1
  a_k08k12_k05k08k12: 1
  a_k08k12_k08k12k16: 1
  a_k08k16_k05k08k16: 1
  a_k08k16_k08k12k16: 1
  a_k12k16_k05k12k16: 1
  a_k12k16_k08k12k16: estimate
  a_k05k08k12_4ac: 1
  a_k05k08k16_4ac: 1
  a_k05k12k16_4ac: 1
  a_k08k12k16_4ac: estimate
predecessor_model_hash: M-01000100001010010010000000010001

expected_summary.tsv is the FAMoS search trajectory, one row per iteration, showing how the method switches between forward, backward, and lateral moves and how the criterion improves at each step:

method

# candidates

predecessor change

current model criterion

current model

candidate changes

forward

17

[]

inf

[‘a_0ac_k08’, ‘a_b’, ‘a_k05k12_k05k08k12’, ‘a_k05k12_k05k12k16’, ‘a_k05k16_k05k12k16’, ‘a_k08_k05k08’, ‘a_k08_k08k12’, ‘a_k08_k08k16’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k08k16_k08k12k16’, ‘a_k12_k05k12’, ‘a_k12_k08k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’]

[[‘a_0ac_k05’], [‘a_0ac_k12’], [‘a_0ac_k16’], [‘a_k05_k05k08’], [‘a_k05_k05k12’], [‘a_k05_k05k16’], [‘a_k05k08_k05k08k12’], [‘a_k05k08_k05k08k16’], [‘a_k05k08k12_4ac’], [‘a_k05k08k16_4ac’], [‘a_k05k12k16_4ac’], [‘a_k05k16_k05k08k16’], [‘a_k08k12_k05k08k12’], [‘a_k08k12_k08k12k16’], [‘a_k12k16_k05k12k16’], [‘a_k16_k05k16’], [‘a_k16_k08k16’]]

forward

16

[‘a_k05_k05k12’]

-1688.4806604640246

[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k05k12_k05k12k16’, ‘a_k05k16_k05k12k16’, ‘a_k08_k05k08’, ‘a_k08_k08k12’, ‘a_k08_k08k16’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k08k16_k08k12k16’, ‘a_k12_k05k12’, ‘a_k12_k08k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’]

[[‘a_0ac_k05’], [‘a_0ac_k12’], [‘a_0ac_k16’], [‘a_k05_k05k08’], [‘a_k05_k05k16’], [‘a_k05k08_k05k08k12’], [‘a_k05k08_k05k08k16’], [‘a_k05k08k12_4ac’], [‘a_k05k08k16_4ac’], [‘a_k05k12k16_4ac’], [‘a_k05k16_k05k08k16’], [‘a_k08k12_k05k08k12’], [‘a_k08k12_k08k12k16’], [‘a_k12k16_k05k12k16’], [‘a_k16_k05k16’], [‘a_k16_k08k16’]]

backward

16

[]

-1688.4806604640246

[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k05k12_k05k12k16’, ‘a_k05k16_k05k12k16’, ‘a_k08_k05k08’, ‘a_k08_k08k12’, ‘a_k08_k08k16’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k08k16_k08k12k16’, ‘a_k12_k05k12’, ‘a_k12_k08k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’]

[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k05k12_k05k12k16’], [‘a_k05k16_k05k12k16’], [‘a_k08_k05k08’], [‘a_k08_k08k12’], [‘a_k08_k08k16’], [‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’], [‘a_k08k16_k08k12k16’], [‘a_k12_k05k12’], [‘a_k12_k08k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]]

backward

15

[‘a_k08_k08k16’]

-1690.7812030034663

[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k05k12_k05k12k16’, ‘a_k05k16_k05k12k16’, ‘a_k08_k05k08’, ‘a_k08_k08k12’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k08k16_k08k12k16’, ‘a_k12_k05k12’, ‘a_k12_k08k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’]

[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k05k12_k05k12k16’], [‘a_k05k16_k05k12k16’], [‘a_k08_k05k08’], [‘a_k08_k08k12’], [‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’], [‘a_k08k16_k08k12k16’], [‘a_k12_k05k12’], [‘a_k12_k08k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]]

backward

14

[‘a_k05k16_k05k12k16’]

-1693.0621897450123

[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k05k12_k05k12k16’, ‘a_k08_k05k08’, ‘a_k08_k08k12’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k08k16_k08k12k16’, ‘a_k12_k05k12’, ‘a_k12_k08k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’]

[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k05k12_k05k12k16’], [‘a_k08_k05k08’], [‘a_k08_k08k12’], [‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’], [‘a_k08k16_k08k12k16’], [‘a_k12_k05k12’], [‘a_k12_k08k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]]

backward

13

[‘a_k08_k08k12’]

-1695.323741885311

[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k05k12_k05k12k16’, ‘a_k08_k05k08’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k08k16_k08k12k16’, ‘a_k12_k05k12’, ‘a_k12_k08k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’]

[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k05k12_k05k12k16’], [‘a_k08_k05k08’], [‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’], [‘a_k08k16_k08k12k16’], [‘a_k12_k05k12’], [‘a_k12_k08k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]]

backward

12

[‘a_k05k12_k05k12k16’]

-1697.5623869086814

[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08_k05k08’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k08k16_k08k12k16’, ‘a_k12_k05k12’, ‘a_k12_k08k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’]

[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k08_k05k08’], [‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’], [‘a_k08k16_k08k12k16’], [‘a_k12_k05k12’], [‘a_k12_k08k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]]

backward

11

[‘a_k12_k08k12’]

-1699.7744801865597

[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08_k05k08’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k08k16_k08k12k16’, ‘a_k12_k05k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’]

[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k08_k05k08’], [‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’], [‘a_k08k16_k08k12k16’], [‘a_k12_k05k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]]

backward

10

[‘a_k08k16_k08k12k16’]

-1701.9477568144061

[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08_k05k08’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k12_k05k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’]

[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k08_k05k08’], [‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’], [‘a_k12_k05k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]]

backward

9

[‘a_k08_k05k08’]

-1704.0989381094057

[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08k12k16_4ac’, ‘a_k08k16_k05k08k16’, ‘a_k12_k05k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’]

[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’], [‘a_k12_k05k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]]

backward

8

[‘a_k08k16_k05k08k16’]

-1706.240107896267

[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08k12k16_4ac’, ‘a_k12_k05k12’, ‘a_k12_k12k16’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’]

[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k08k12k16_4ac’], [‘a_k12_k05k12’], [‘a_k12_k12k16’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]]

backward

7

[‘a_k12_k12k16’]

-1708.110992470788

[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08k12k16_4ac’, ‘a_k12_k05k12’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’]

[[‘a_0ac_k08’], [‘a_k05_k05k12’], [‘a_k05k12_k05k08k12’], [‘a_k08k12k16_4ac’], [‘a_k12_k05k12’], [‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’]]

forward

23

[]

-1708.110992470788

[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08k12k16_4ac’, ‘a_k12_k05k12’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’]

[[‘a_0ac_k05’], [‘a_0ac_k12’], [‘a_0ac_k16’], [‘a_k05_k05k08’], [‘a_k05_k05k16’], [‘a_k05k08_k05k08k12’], [‘a_k05k08_k05k08k16’], [‘a_k05k08k12_4ac’], [‘a_k05k08k16_4ac’], [‘a_k05k12_k05k12k16’], [‘a_k05k12k16_4ac’], [‘a_k05k16_k05k08k16’], [‘a_k05k16_k05k12k16’], [‘a_k08_k05k08’], [‘a_k08_k08k12’], [‘a_k08_k08k16’], [‘a_k08k12_k05k08k12’], [‘a_k08k12_k08k12k16’], [‘a_k08k16_k08k12k16’], [‘a_k12_k08k12’], [‘a_k12k16_k05k12k16’], [‘a_k16_k05k16’], [‘a_k16_k08k16’]]

lateral

168

[]

-1708.110992470788

[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08k12k16_4ac’, ‘a_k12_k05k12’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’]

[[‘a_0ac_k05’, ‘a_0ac_k08’], [‘a_0ac_k05’, ‘a_k05_k05k12’], [‘a_0ac_k05’, ‘a_k05k12_k05k08k12’], [‘a_0ac_k05’, ‘a_k08k12k16_4ac’], [‘a_0ac_k05’, ‘a_k12k16_k08k12k16’], [‘a_0ac_k08’, ‘a_0ac_k12’], [‘a_0ac_k08’, ‘a_k05_k05k08’], [‘a_0ac_k08’, ‘a_k05k08_k05k08k12’], [‘a_0ac_k08’, ‘a_k05k16_k05k12k16’], [‘a_0ac_k08’, ‘a_k08_k05k08’], [‘a_0ac_k08’, ‘a_k08_k08k12’], [‘a_0ac_k08’, ‘a_k08_k08k16’], [‘a_0ac_k08’, ‘a_k08k16_k08k12k16’], [‘a_0ac_k08’, ‘a_k12k16_k05k12k16’], [‘a_0ac_k08’, ‘a_k16_k05k16’], [‘a_0ac_k08’, ‘a_k16_k08k16’], [‘a_0ac_k12’, ‘a_k05k12_k05k08k12’], [‘a_0ac_k12’, ‘a_k12k16_k08k12k16’], [‘a_0ac_k16’, ‘a_0ac_k08’], [‘a_0ac_k16’, ‘a_k05_k05k12’], [‘a_0ac_k16’, ‘a_k05k12_k05k08k12’], [‘a_0ac_k16’, ‘a_k08k12k16_4ac’], [‘a_0ac_k16’, ‘a_k12_k05k12’], [‘a_0ac_k16’, ‘a_k12k16_k08k12k16’], [‘a_k05_k05k08’, ‘a_k05k12_k05k08k12’], [‘a_k05_k05k08’, ‘a_k12k16_k08k12k16’], [‘a_k05_k05k12’, ‘a_0ac_k12’], [‘a_k05_k05k12’, ‘a_k05_k05k08’], [‘a_k05_k05k12’, ‘a_k05k08_k05k08k12’], [‘a_k05_k05k12’, ‘a_k05k16_k05k12k16’], [‘a_k05_k05k12’, ‘a_k08_k05k08’], [‘a_k05_k05k12’, ‘a_k08_k08k12’], [‘a_k05_k05k12’, ‘a_k08_k08k16’], [‘a_k05_k05k12’, ‘a_k08k16_k08k12k16’], [‘a_k05_k05k12’, ‘a_k12k16_k05k12k16’], [‘a_k05_k05k12’, ‘a_k16_k05k16’], [‘a_k05_k05k12’, ‘a_k16_k08k16’], [‘a_k05_k05k16’, ‘a_0ac_k08’], [‘a_k05_k05k16’, ‘a_k05_k05k12’], [‘a_k05_k05k16’, ‘a_k05k12_k05k08k12’], [‘a_k05_k05k16’, ‘a_k08k12k16_4ac’], [‘a_k05_k05k16’, ‘a_k12_k05k12’], [‘a_k05_k05k16’, ‘a_k12k16_k08k12k16’], [‘a_k05k08_k05k08k12’, ‘a_k05k12_k05k08k12’], [‘a_k05k08_k05k08k12’, ‘a_k08k12k16_4ac’], [‘a_k05k08_k05k08k12’, ‘a_k12k16_k08k12k16’], [‘a_k05k08_k05k08k16’, ‘a_0ac_k08’], [‘a_k05k08_k05k08k16’, ‘a_k05_k05k12’], [‘a_k05k08_k05k08k16’, ‘a_k05k12_k05k08k12’], [‘a_k05k08_k05k08k16’, ‘a_k08k12k16_4ac’], [‘a_k05k08_k05k08k16’, ‘a_k12_k05k12’], [‘a_k05k08_k05k08k16’, ‘a_k12k16_k08k12k16’], [‘a_k05k08k12_4ac’, ‘a_0ac_k08’], [‘a_k05k08k12_4ac’, ‘a_k05_k05k12’], [‘a_k05k08k12_4ac’, ‘a_k05k12_k05k08k12’], [‘a_k05k08k12_4ac’, ‘a_k08k12k16_4ac’], [‘a_k05k08k12_4ac’, ‘a_k12_k05k12’], [‘a_k05k08k12_4ac’, ‘a_k12k16_k08k12k16’], [‘a_k05k08k12_4ac’, ‘a_k16_k12k16’], [‘a_k05k08k16_4ac’, ‘a_0ac_k08’], [‘a_k05k08k16_4ac’, ‘a_k05_k05k12’], [‘a_k05k08k16_4ac’, ‘a_k05k12_k05k08k12’], [‘a_k05k08k16_4ac’, ‘a_k08k12k16_4ac’], [‘a_k05k08k16_4ac’, ‘a_k12k16_k08k12k16’], [‘a_k05k12_k05k08k12’, ‘a_k08_k05k08’], [‘a_k05k12_k05k08k12’, ‘a_k08_k08k16’], [‘a_k05k12_k05k12k16’, ‘a_0ac_k08’], [‘a_k05k12_k05k12k16’, ‘a_k05_k05k12’], [‘a_k05k12_k05k12k16’, ‘a_k05k12_k05k08k12’], [‘a_k05k12_k05k12k16’, ‘a_k08k12k16_4ac’], [‘a_k05k12_k05k12k16’, ‘a_k12_k05k12’], [‘a_k05k12_k05k12k16’, ‘a_k12k16_k08k12k16’], [‘a_k05k12k16_4ac’, ‘a_0ac_k08’], [‘a_k05k12k16_4ac’, ‘a_k05_k05k12’], [‘a_k05k12k16_4ac’, ‘a_k05k12_k05k08k12’], [‘a_k05k12k16_4ac’, ‘a_k08k12k16_4ac’], [‘a_k05k12k16_4ac’, ‘a_k12_k05k12’], [‘a_k05k12k16_4ac’, ‘a_k12k16_k08k12k16’], [‘a_k05k12k16_4ac’, ‘a_k16_k12k16’], [‘a_k05k16_k05k08k16’, ‘a_0ac_k08’], [‘a_k05k16_k05k08k16’, ‘a_k05_k05k12’], [‘a_k05k16_k05k08k16’, ‘a_k05k12_k05k08k12’], [‘a_k05k16_k05k08k16’, ‘a_k08k12k16_4ac’], [‘a_k05k16_k05k08k16’, ‘a_k12_k05k12’], [‘a_k05k16_k05k08k16’, ‘a_k12k16_k08k12k16’], [‘a_k05k16_k05k08k16’, ‘a_k16_k12k16’], [‘a_k05k16_k05k12k16’, ‘a_k05k12_k05k08k12’], [‘a_k05k16_k05k12k16’, ‘a_k08k12k16_4ac’], [‘a_k05k16_k05k12k16’, ‘a_k12k16_k08k12k16’], [‘a_k08_k08k12’, ‘a_k05k12_k05k08k12’], [‘a_k08_k08k12’, ‘a_k12k16_k08k12k16’], [‘a_k08k12_k05k08k12’, ‘a_0ac_k08’], [‘a_k08k12_k05k08k12’, ‘a_k05_k05k12’], [‘a_k08k12_k05k08k12’, ‘a_k05k12_k05k08k12’], [‘a_k08k12_k05k08k12’, ‘a_k08k12k16_4ac’], [‘a_k08k12_k05k08k12’, ‘a_k12k16_k08k12k16’], [‘a_k08k12_k08k12k16’, ‘a_0ac_k08’], [‘a_k08k12_k08k12k16’, ‘a_k05_k05k12’], [‘a_k08k12_k08k12k16’, ‘a_k05k12_k05k08k12’], [‘a_k08k12_k08k12k16’, ‘a_k08k12k16_4ac’], [‘a_k08k12_k08k12k16’, ‘a_k12k16_k08k12k16’], [‘a_k08k12k16_4ac’, ‘a_0ac_k12’], [‘a_k08k12k16_4ac’, ‘a_k05_k05k08’], [‘a_k08k12k16_4ac’, ‘a_k08_k05k08’], [‘a_k08k12k16_4ac’, ‘a_k08_k08k12’], [‘a_k08k12k16_4ac’, ‘a_k08_k08k16’], [‘a_k08k12k16_4ac’, ‘a_k08k16_k08k12k16’], [‘a_k08k12k16_4ac’, ‘a_k12k16_k05k12k16’], [‘a_k08k12k16_4ac’, ‘a_k16_k05k16’], [‘a_k08k12k16_4ac’, ‘a_k16_k08k16’], [‘a_k08k16_k05k08k16’, ‘a_0ac_k08’], [‘a_k08k16_k05k08k16’, ‘a_k05_k05k12’], [‘a_k08k16_k05k08k16’, ‘a_k05k12_k05k08k12’], [‘a_k08k16_k05k08k16’, ‘a_k08k12k16_4ac’], [‘a_k08k16_k05k08k16’, ‘a_k12k16_k08k12k16’], [‘a_k08k16_k08k12k16’, ‘a_k05k12_k05k08k12’], [‘a_k12_k05k12’, ‘a_0ac_k05’], [‘a_k12_k05k12’, ‘a_0ac_k12’], [‘a_k12_k05k12’, ‘a_k05_k05k08’], [‘a_k12_k05k12’, ‘a_k05k08_k05k08k12’], [‘a_k12_k05k12’, ‘a_k05k08k16_4ac’], [‘a_k12_k05k12’, ‘a_k05k16_k05k12k16’], [‘a_k12_k05k12’, ‘a_k08_k05k08’], [‘a_k12_k05k12’, ‘a_k08_k08k12’], [‘a_k12_k05k12’, ‘a_k08_k08k16’], [‘a_k12_k05k12’, ‘a_k08k12_k05k08k12’], [‘a_k12_k05k12’, ‘a_k08k12_k08k12k16’], [‘a_k12_k05k12’, ‘a_k08k16_k05k08k16’], [‘a_k12_k05k12’, ‘a_k08k16_k08k12k16’], [‘a_k12_k05k12’, ‘a_k12k16_k05k12k16’], [‘a_k12_k05k12’, ‘a_k16_k05k16’], [‘a_k12_k05k12’, ‘a_k16_k08k16’], [‘a_k12_k08k12’, ‘a_0ac_k08’], [‘a_k12_k08k12’, ‘a_k05_k05k12’], [‘a_k12_k08k12’, ‘a_k05k12_k05k08k12’], [‘a_k12_k08k12’, ‘a_k08k12k16_4ac’], [‘a_k12_k08k12’, ‘a_k12_k05k12’], [‘a_k12_k08k12’, ‘a_k12k16_k08k12k16’], [‘a_k12k16_k05k12k16’, ‘a_k05k12_k05k08k12’], [‘a_k12k16_k05k12k16’, ‘a_k12k16_k08k12k16’], [‘a_k12k16_k08k12k16’, ‘a_k08_k05k08’], [‘a_k12k16_k08k12k16’, ‘a_k08_k08k16’], [‘a_k12k16_k08k12k16’, ‘a_k08k16_k08k12k16’], [‘a_k16_k05k16’, ‘a_k05k12_k05k08k12’], [‘a_k16_k05k16’, ‘a_k12k16_k08k12k16’], [‘a_k16_k08k16’, ‘a_k05k12_k05k08k12’], [‘a_k16_k08k16’, ‘a_k12k16_k08k12k16’], [‘a_k16_k12k16’, ‘a_0ac_k05’], [‘a_k16_k12k16’, ‘a_0ac_k12’], [‘a_k16_k12k16’, ‘a_0ac_k16’], [‘a_k16_k12k16’, ‘a_k05_k05k08’], [‘a_k16_k12k16’, ‘a_k05_k05k16’], [‘a_k16_k12k16’, ‘a_k05k08_k05k08k12’], [‘a_k16_k12k16’, ‘a_k05k08_k05k08k16’], [‘a_k16_k12k16’, ‘a_k05k08k16_4ac’], [‘a_k16_k12k16’, ‘a_k05k12_k05k12k16’], [‘a_k16_k12k16’, ‘a_k05k16_k05k12k16’], [‘a_k16_k12k16’, ‘a_k08_k05k08’], [‘a_k16_k12k16’, ‘a_k08_k08k12’], [‘a_k16_k12k16’, ‘a_k08_k08k16’], [‘a_k16_k12k16’, ‘a_k08k12_k05k08k12’], [‘a_k16_k12k16’, ‘a_k08k12_k08k12k16’], [‘a_k16_k12k16’, ‘a_k08k16_k05k08k16’], [‘a_k16_k12k16’, ‘a_k08k16_k08k12k16’], [‘a_k16_k12k16’, ‘a_k12_k08k12’], [‘a_k16_k12k16’, ‘a_k12k16_k05k12k16’], [‘a_k16_k12k16’, ‘a_k16_k05k16’], [‘a_k16_k12k16’, ‘a_k16_k08k16’]]

lateral

0

[]

-1708.110992470788

[‘a_0ac_k08’, ‘a_b’, ‘a_k05_k05k12’, ‘a_k05k12_k05k08k12’, ‘a_k08k12k16_4ac’, ‘a_k12_k05k12’, ‘a_k12k16_k08k12k16’, ‘a_k16_k12k16’]

[]

Why this problem is challenging

This problem is difficult to solve. The search space is large (~4.3 billion models); hence, many models have criterion values that differ by less than numerical noise, so the ranking of near-optimal models is effectively non-deterministic across machines and tolerances. Plain forward or backward selection mostly fails to reach the optimum.

The FAMoS method reproducibly converges to models with markedly better criterion values. Multi-start FAMoS searches recover the published best model while calibrating only a small fraction (~0.002 %) of the full model space, making this large-scale problem computationally feasible.

Because of the numerical-noise sensitivity noted above, when running this test case you should expect to obtain a similar (but not necessarily identical) expected_summary.tsv (a few rows may be reordered, or the path through model space may differ). However, the improvement in criterion value over consecutive rows should be conserved, and the same select model should be found.

References

[Blasi2016] (1,2,3,4,5)

Blasi T, Feller C, Feigelman J, Hasenauer J, Imhof A, Theis FJ, Becker PB, Marr C. Combinatorial Histone Acetylation Patterns Are Generated by Motif-Specific Reactions. Cell Systems, 2016, 2(1):49–58. https://doi.org/10.1016/j.cels.2016.01.002