Example usage with Python 3

This notebook demonstrates usage of petab_select to perform forward selection in a Python 3 script.

Problem setup with initial model

Dependencies are imported, a model selection problem is loaded from its specification files, and some helper functions are defined.

[1]:
import petab_select
from petab_select import ForwardCandidateSpace, Model

# Load the PEtab Select problem.
select_problem = petab_select.Problem.from_yaml(
    'model_selection/petab_select_problem.yaml'
)
# Fake criterion values as a surrogate for a model calibration tool.
fake_criterion = {
    'M1_0': 200,
    'M1_1': 150,
    'M1_2': 140,
    'M1_3': 130,
    'M1_4': -40,
    'M1_5': -70,
    'M1_6': -110,
    'M1_7': 50,
}


def print_model(model: Model) -> None:
    """Helper method to view model attributes."""
    print(
        f"""\
Model subspace ID: {model.model_subspace_id}
PEtab YAML location: {model.petab_yaml}
Custom model parameters: {model.parameters}
Model hash: {model.get_hash()}
Model ID: {model.model_id}
{select_problem.criterion}: {model.get_criterion(select_problem.criterion, compute=False)}
"""
    )


def calibrate(model: Model, fake_criterion=fake_criterion) -> None:
    """Set model criterion values to fake values that could be the output of a calibration tool.

    Each model subspace in this problem contains only one model, so a model-specific criterion can
    be indexed by the model subspace ID.
    """
    model.set_criterion(
        select_problem.criterion, fake_criterion[model.model_subspace_id]
    )


# Calibrated and newly calibrated models should be tracked between iterations.
calibrated_models = {}
newly_calibrated_models = {}

print(
    f"""Information about the model selection problem.

YAML path: {select_problem.yaml_path}
Method: {select_problem.method}
Criterion: {select_problem.criterion}
"""
)
Information about the model selection problem.

YAML path: model_selection/petab_select_problem.yaml
Method: forward
Criterion: AIC

First iteration

Neighbors of the initial model in the model space are identified for testing. Here, no initial model is specified. If the algorithm requires an initial model, PEtab Select can automatically use a virtual initial model, provided one is defined for the method. For example, the virtual initial model defaults to a model with no parameters estimated for the forward method, and to a model with all parameters estimated for the backward method.
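To make the idea of a virtual initial model concrete, here is a minimal sketch in plain Python. It does not call petab_select; the function name `virtual_initial_parameters` is hypothetical, the `'estimate'` marker is borrowed from the model output in this notebook, and the fixed value `0` is only a placeholder.

```python
ESTIMATE = 'estimate'


def virtual_initial_parameters(parameter_ids, method):
    """Return a toy parameterization for a virtual initial model.

    forward: nothing is estimated, so every forward step can add parameters.
    backward: everything is estimated, so every backward step can remove them.
    """
    if method == 'forward':
        # Only the fixed/estimated status matters here; 0 is a placeholder value.
        return {parameter_id: 0 for parameter_id in parameter_ids}
    if method == 'backward':
        return {parameter_id: ESTIMATE for parameter_id in parameter_ids}
    raise ValueError(f'No virtual initial model defined for method: {method}')


forward_start = virtual_initial_parameters(['k1', 'k2', 'k3'], 'forward')
backward_start = virtual_initial_parameters(['k1', 'k2', 'k3'], 'backward')
print(forward_start)   # {'k1': 0, 'k2': 0, 'k3': 0}
print(backward_start)  # {'k1': 'estimate', 'k2': 'estimate', 'k3': 'estimate'}
```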

The model candidate space is set up with the initial model. The model space is then used to find neighbors of the initial model. The candidate space is used to calculate distances between models, and to determine whether a candidate model represents a valid move in model space.

The in-built ForwardCandidateSpace uses the following properties to identify candidate models:

- previously estimated parameters must not be fixed;
- the number of estimated parameters must increase; and
- the increase in the number of estimated parameters must be minimal.
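The three properties can be illustrated with a small, self-contained sketch. This is not the ForwardCandidateSpace implementation; `estimated` and `forward_candidates` are hypothetical helpers, and the parameterizations are copied from the model output shown in this notebook.

```python
ESTIMATE = 'estimate'

# Parameterizations as printed for each model subspace in this notebook.
MODEL_SPACE = {
    'M1_0': {'k1': 0, 'k2': 0, 'k3': 0},
    'M1_1': {'k1': 0.2, 'k2': 0.1, 'k3': ESTIMATE},
    'M1_2': {'k1': 0.2, 'k2': ESTIMATE, 'k3': 0},
    'M1_3': {'k1': ESTIMATE, 'k2': 0.1, 'k3': 0},
    'M1_4': {'k1': 0.2, 'k2': ESTIMATE, 'k3': ESTIMATE},
    'M1_5': {'k1': ESTIMATE, 'k2': 0.1, 'k3': ESTIMATE},
    'M1_6': {'k1': ESTIMATE, 'k2': ESTIMATE, 'k3': 0},
    'M1_7': {'k1': ESTIMATE, 'k2': ESTIMATE, 'k3': ESTIMATE},
}


def estimated(parameters):
    """Return the set of parameter IDs that are estimated."""
    return {pid for pid, value in parameters.items() if value == ESTIMATE}


def forward_candidates(predecessor_id, model_space=MODEL_SPACE):
    """Return the IDs of valid forward moves from the predecessor model."""
    predecessor = estimated(model_space[predecessor_id])
    steps = {}
    for model_id, parameters in model_space.items():
        candidate = estimated(parameters)
        # A strict superset keeps every previously estimated parameter
        # (property 1) and estimates at least one more (property 2).
        if predecessor < candidate:
            steps[model_id] = len(candidate) - len(predecessor)
    if not steps:
        return []
    # Property 3: keep only the smallest possible increase.
    smallest = min(steps.values())
    return sorted(m for m, step in steps.items() if step == smallest)


print(forward_candidates('M1_0'))  # ['M1_1', 'M1_2', 'M1_3']
print(forward_candidates('M1_3'))  # ['M1_5', 'M1_6']
print(forward_candidates('M1_7'))  # []
```

Note that M1_4 is not a forward move from M1_3, because it would fix k1 again.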

The model space keeps a history of identified neighbors, such that subsequent calls ignore previously identified neighbors. This can be disabled by calling petab_select.ModelSpace.search(..., exclude=False), or the history can be cleared with petab_select.ModelSpace.reset().
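The effect of such a history can be sketched with a toy class. `ToyModelSpace` is hypothetical and only mimics the behavior described above; in particular, the choice that `exclude=False` neither filters nor records models is an assumption of this sketch, not a statement about petab_select.

```python
class ToyModelSpace:
    """Toy model space that remembers which models it has already returned."""

    def __init__(self, model_ids):
        self.model_ids = list(model_ids)
        self.history = set()

    def search(self, is_candidate, exclude=True):
        """Return accepted models, skipping previously returned ones by default."""
        found = [
            model_id
            for model_id in self.model_ids
            if is_candidate(model_id)
            and not (exclude and model_id in self.history)
        ]
        if exclude:
            # Remember these models so that later searches skip them.
            self.history.update(found)
        return found

    def reset(self):
        """Forget all previously returned models."""
        self.history.clear()


def accept_all(model_id):
    return True


space = ToyModelSpace(['M1_1', 'M1_2', 'M1_3'])
first = space.search(accept_all)                 # all three models
second = space.search(accept_all)                # nothing new is left
unfiltered = space.search(accept_all, exclude=False)  # history is ignored
space.reset()
after_reset = space.search(accept_all)           # all three models again
print(first, second, unfiltered, after_reset)
```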

[2]:
candidate_space = petab_select.ui.candidates(problem=select_problem)

Model IDs default to the model hash, which is generated from hashing the model subspace ID and model parameterization.
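As a rough illustration of how an identifier can be derived from the model subspace ID and the parameterization, consider the sketch below. It is an assumption-laden toy: `toy_model_hash`, the JSON serialization, and the choice of BLAKE2b are illustrative only, and the result will not reproduce the hashes printed in this notebook.

```python
import hashlib
import json


def toy_model_hash(model_subspace_id, parameters):
    """Hash a model's subspace ID together with its parameterization."""
    # Sorting the keys makes the hash independent of dictionary order.
    payload = json.dumps(
        {'model_subspace_id': model_subspace_id, 'parameters': parameters},
        sort_keys=True,
    )
    return hashlib.blake2b(payload.encode('utf-8')).hexdigest()


hash_a = toy_model_hash('M1_3', {'k1': 'estimate', 'k2': 0.1, 'k3': 0})
hash_b = toy_model_hash('M1_3', {'k3': 0, 'k2': 0.1, 'k1': 'estimate'})
hash_c = toy_model_hash('M1_2', {'k1': 0.2, 'k2': 'estimate', 'k3': 0})
print(hash_a == hash_b)  # True: same model, different key order
print(hash_a == hash_c)  # False: different model
```

The useful property is that two models with the same subspace ID and parameterization map to the same ID, which is what allows calibrated models to be tracked in a dictionary keyed by hash.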

Here, the model identified is a model with all possible parameters fixed. This is because the default virtual initial model has the same parameterization, and the closest model in the “real” model subspace also has this parameterization. If the initial model were from the “real” model subspace, then candidate models would be true forward steps in the subspace (e.g. an increase in the number of estimated parameters).

Each of the candidate models includes information that should be sufficient for model calibration with any suitable tool that supports PEtab.

NB: the petab_yaml refers to the original PEtab problem, which would need to be customized with the model parameters to obtain the actual candidate model.
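What this customization might look like can be sketched on a simplified parameter table. The column names `parameterId`, `nominalValue` and `estimate` follow the PEtab parameter table, but the rows below are hypothetical and `apply_model_parameters` is an illustrative helper, not part of petab_select.

```python
ESTIMATE = 'estimate'


def apply_model_parameters(parameter_rows, model_parameters):
    """Return a copy of the parameter rows customized for one candidate model."""
    customized = []
    for row in parameter_rows:
        row = dict(row)  # copy, so the original problem is left untouched
        value = model_parameters.get(row['parameterId'])
        if value == ESTIMATE:
            row['estimate'] = 1
        elif value is not None:
            # A numeric value means the parameter is fixed to that value.
            row['estimate'] = 0
            row['nominalValue'] = value
        customized.append(row)
    return customized


# Hypothetical rows of the original PEtab parameter table.
original_rows = [
    {'parameterId': 'k1', 'nominalValue': 0.2, 'estimate': 1},
    {'parameterId': 'k2', 'nominalValue': 0.1, 'estimate': 1},
    {'parameterId': 'k3', 'nominalValue': 0.0, 'estimate': 1},
]
# Custom model parameters of M1_3, as printed in this notebook.
m1_3_rows = apply_model_parameters(
    original_rows, {'k1': 'estimate', 'k2': 0.1, 'k3': 0}
)
for row in m1_3_rows:
    print(row)
```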

[3]:
for candidate_model in candidate_space.models:
    print_model(candidate_model)
Model subspace ID: M1_0
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 0, 'k2': 0, 'k3': 0}
Model hash: 65f94e223024ef684fe3e1a3aa2a54cc3ffd08895fbe4539512522d49d703ceda483aff4aa207b160dc358c458b76b25d88fbd94cacfc78bd0c70f4a46a42191
Model ID: 65f94e223024ef684fe3e1a3aa2a54cc3ffd08895fbe4539512522d49d703ceda483aff4aa207b160dc358c458b76b25d88fbd94cacfc78bd0c70f4a46a42191
AIC: None

At this point, a model calibration tool is used to find the best of the test models, according to some criterion. PEtab Select can select the best model from a collection of models that provide a value for this criterion, or a specific model can be supplied. Here, PEtab Select will be used to select the best model from multiple models. At the end of the following iterations, a specific model will be provided.
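For a criterion such as AIC, where lower values are better, choosing the best model from a collection amounts to taking a minimum. The following sketch shows this with the fake criterion values defined earlier; `best_by_criterion` is a hypothetical helper and does not reflect how PEtab Select implements the comparison.

```python
def best_by_criterion(criterion_values):
    """Return the model ID with the smallest criterion value (lower is better)."""
    if not criterion_values:
        raise ValueError('At least one calibrated model is required.')
    return min(criterion_values, key=criterion_values.get)


# Fake AIC values of three models from the problem setup.
example_values = {'M1_1': 150, 'M1_2': 140, 'M1_3': 130}
example_best = best_by_criterion(example_values)
print(example_best)  # M1_3
```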

[4]:
# Set fake criterion values that might be the output of a model calibration tool.
for candidate_model in candidate_space.models:
    calibrate(candidate_model)

newly_calibrated_models = {
    model.get_hash(): model for model in candidate_space.models
}
calibrated_models.update(newly_calibrated_models)

select_problem.exclude_models(newly_calibrated_models.values())
[5]:
local_best_model = select_problem.get_best(newly_calibrated_models.values())
print_model(local_best_model)
Model subspace ID: M1_0
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 0, 'k2': 0, 'k3': 0}
Model hash: 65f94e223024ef684fe3e1a3aa2a54cc3ffd08895fbe4539512522d49d703ceda483aff4aa207b160dc358c458b76b25d88fbd94cacfc78bd0c70f4a46a42191
Model ID: 65f94e223024ef684fe3e1a3aa2a54cc3ffd08895fbe4539512522d49d703ceda483aff4aa207b160dc358c458b76b25d88fbd94cacfc78bd0c70f4a46a42191
AIC: 200

Second iteration

The process then repeats.

The chosen model is used as the predecessor model, such that neighboring models are identified with respect to the chosen model.
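The repeated cycle of proposing candidates, calibrating them, and choosing the next predecessor can be summarized in a plain-Python sketch. It uses the fake criterion values and the estimated parameters of each model from this notebook, but none of the petab_select API; `forward_neighbors` and `forward_selection` are hypothetical helpers. As in this notebook, the loop only stops once no further candidates exist.

```python
# Estimated parameters of each model, as printed in this notebook.
ESTIMATED = {
    'M1_0': frozenset(),
    'M1_1': frozenset({'k3'}),
    'M1_2': frozenset({'k2'}),
    'M1_3': frozenset({'k1'}),
    'M1_4': frozenset({'k2', 'k3'}),
    'M1_5': frozenset({'k1', 'k3'}),
    'M1_6': frozenset({'k1', 'k2'}),
    'M1_7': frozenset({'k1', 'k2', 'k3'}),
}
# Fake criterion values from the problem setup.
FAKE_AIC = {
    'M1_0': 200, 'M1_1': 150, 'M1_2': 140, 'M1_3': 130,
    'M1_4': -40, 'M1_5': -70, 'M1_6': -110, 'M1_7': 50,
}


def forward_neighbors(predecessor_id):
    """Models that add the fewest estimated parameters to the predecessor."""
    base = ESTIMATED[predecessor_id]
    steps = {
        model_id: len(est - base)
        for model_id, est in ESTIMATED.items()
        if base < est  # strict superset of the estimated parameters
    }
    if not steps:
        return []
    smallest = min(steps.values())
    return sorted(m for m, step in steps.items() if step == smallest)


def forward_selection(initial_id):
    """Iterate until no candidates remain; return the path and the best model."""
    calibrated = {initial_id: FAKE_AIC[initial_id]}
    predecessor = initial_id
    path = [initial_id]
    while True:
        candidates = forward_neighbors(predecessor)
        if not candidates:
            break
        # "Calibrate" the candidates by looking up their fake criterion values.
        newly_calibrated = {m: FAKE_AIC[m] for m in candidates}
        calibrated.update(newly_calibrated)
        # The best new model becomes the predecessor of the next iteration.
        predecessor = min(newly_calibrated, key=newly_calibrated.get)
        path.append(predecessor)
    best = min(calibrated, key=calibrated.get)
    return path, best, calibrated


path, best, calibrated = forward_selection('M1_0')
print(path)  # ['M1_0', 'M1_3', 'M1_6', 'M1_7']
print(best)  # M1_6
```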

[6]:
petab_select.ui.candidates(
    problem=select_problem,
    candidate_space=candidate_space,
    newly_calibrated_models=newly_calibrated_models,
);
[7]:
for candidate_model in candidate_space.models:
    print_model(candidate_model)
Model subspace ID: M1_1
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 0.2, 'k2': 0.1, 'k3': 'estimate'}
Model hash: 112c344171a01874a0b400640c2e0f72f2924b91712966cb868bf53b6d8ce2d09bb8e56f52b5aaca506a64754629147047646ea0c0cf568d76e74df2c5e2487a
Model ID: 112c344171a01874a0b400640c2e0f72f2924b91712966cb868bf53b6d8ce2d09bb8e56f52b5aaca506a64754629147047646ea0c0cf568d76e74df2c5e2487a
AIC: None

Model subspace ID: M1_2
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 0.2, 'k2': 'estimate', 'k3': 0}
Model hash: df2e1cd0744275a154036b1e1b09eaa67a76f4c08615b3e36849e3eaddcb05d1ccaedb62d148abcc41579314b2e8bec2871a8f925e3d53b90c0a4c6e9ea098ab
Model ID: df2e1cd0744275a154036b1e1b09eaa67a76f4c08615b3e36849e3eaddcb05d1ccaedb62d148abcc41579314b2e8bec2871a8f925e3d53b90c0a4c6e9ea098ab
AIC: None

Model subspace ID: M1_3
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 'estimate', 'k2': 0.1, 'k3': 0}
Model hash: b7584bfd6f35206dfe32fa0143e53cea808faf965e0c0547bf6ee1cdce7a75cd3ff0aa2bcb1faa27625166454f83e3fcac52cdf43b28e8186fff9a01ac3f8006
Model ID: b7584bfd6f35206dfe32fa0143e53cea808faf965e0c0547bf6ee1cdce7a75cd3ff0aa2bcb1faa27625166454f83e3fcac52cdf43b28e8186fff9a01ac3f8006
AIC: None

[8]:
# Set fake criterion values that might be the output of a model calibration tool.
for candidate_model in candidate_space.models:
    calibrate(candidate_model)

newly_calibrated_models = {
    model.get_hash(): model for model in candidate_space.models
}
calibrated_models.update(newly_calibrated_models)

select_problem.exclude_models(newly_calibrated_models.values())
[9]:
local_best_model = select_problem.get_best(newly_calibrated_models.values())
print_model(local_best_model)
Model subspace ID: M1_3
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 'estimate', 'k2': 0.1, 'k3': 0}
Model hash: b7584bfd6f35206dfe32fa0143e53cea808faf965e0c0547bf6ee1cdce7a75cd3ff0aa2bcb1faa27625166454f83e3fcac52cdf43b28e8186fff9a01ac3f8006
Model ID: b7584bfd6f35206dfe32fa0143e53cea808faf965e0c0547bf6ee1cdce7a75cd3ff0aa2bcb1faa27625166454f83e3fcac52cdf43b28e8186fff9a01ac3f8006
AIC: 130

Third iteration

[10]:
petab_select.ui.candidates(
    problem=select_problem,
    candidate_space=candidate_space,
    newly_calibrated_models=newly_calibrated_models,
);
[11]:
for candidate_model in candidate_space.models:
    print_model(candidate_model)
Model subspace ID: M1_5
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 'estimate', 'k2': 0.1, 'k3': 'estimate'}
Model hash: de4a2f17d8b0228a31d7451631cf3662d0ecf4dc7738ab6ca3d1de65e817844c9c1df806ec9daf81644b9c10f00185dc8c8de880d9db23a98acadb817f5d481c
Model ID: de4a2f17d8b0228a31d7451631cf3662d0ecf4dc7738ab6ca3d1de65e817844c9c1df806ec9daf81644b9c10f00185dc8c8de880d9db23a98acadb817f5d481c
AIC: None

Model subspace ID: M1_6
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 'estimate', 'k2': 'estimate', 'k3': 0}
Model hash: db8700c079c8347123adc89b7f5112256c4aaebd2af0f6e32e7582f398b2c1e5e85e588cdcc56bab054c001b96a9b42b02174266927f879d7f78e8ac5d2c33e6
Model ID: db8700c079c8347123adc89b7f5112256c4aaebd2af0f6e32e7582f398b2c1e5e85e588cdcc56bab054c001b96a9b42b02174266927f879d7f78e8ac5d2c33e6
AIC: None

[12]:
# Set fake criterion values that might be the output of a model calibration tool.
for candidate_model in candidate_space.models:
    calibrate(candidate_model)

newly_calibrated_models = {
    model.get_hash(): model for model in candidate_space.models
}
calibrated_models.update(newly_calibrated_models)

select_problem.exclude_models(newly_calibrated_models.values())
[13]:
local_best_model = select_problem.get_best(newly_calibrated_models.values())
print_model(local_best_model)
Model subspace ID: M1_6
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 'estimate', 'k2': 'estimate', 'k3': 0}
Model hash: db8700c079c8347123adc89b7f5112256c4aaebd2af0f6e32e7582f398b2c1e5e85e588cdcc56bab054c001b96a9b42b02174266927f879d7f78e8ac5d2c33e6
Model ID: db8700c079c8347123adc89b7f5112256c4aaebd2af0f6e32e7582f398b2c1e5e85e588cdcc56bab054c001b96a9b42b02174266927f879d7f78e8ac5d2c33e6
AIC: -110

Fourth iteration

[14]:
petab_select.ui.candidates(
    problem=select_problem,
    candidate_space=candidate_space,
    newly_calibrated_models=newly_calibrated_models,
);
[15]:
for candidate_model in candidate_space.models:
    print_model(candidate_model)
Model subspace ID: M1_7
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 'estimate', 'k2': 'estimate', 'k3': 'estimate'}
Model hash: 7c105406ec11716473939a0bbb5281066c1014b54e2480ba126030f5c18a597a27a2ca9247aa60d8262f488165079d1c9e040f9d712ec4e19c2d2122a586f3e5
Model ID: 7c105406ec11716473939a0bbb5281066c1014b54e2480ba126030f5c18a597a27a2ca9247aa60d8262f488165079d1c9e040f9d712ec4e19c2d2122a586f3e5
AIC: None

[16]:
# Set fake criterion values that might be the output of a model calibration tool.
for candidate_model in candidate_space.models:
    calibrate(candidate_model)

newly_calibrated_models = {
    model.get_hash(): model for model in candidate_space.models
}
calibrated_models.update(newly_calibrated_models)

select_problem.exclude_models(newly_calibrated_models.values())
[17]:
local_best_model = select_problem.get_best(newly_calibrated_models.values())
print_model(local_best_model)
Model subspace ID: M1_7
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 'estimate', 'k2': 'estimate', 'k3': 'estimate'}
Model hash: 7c105406ec11716473939a0bbb5281066c1014b54e2480ba126030f5c18a597a27a2ca9247aa60d8262f488165079d1c9e040f9d712ec4e19c2d2122a586f3e5
Model ID: 7c105406ec11716473939a0bbb5281066c1014b54e2480ba126030f5c18a597a27a2ca9247aa60d8262f488165079d1c9e040f9d712ec4e19c2d2122a586f3e5
AIC: 50

Fifth iteration

[18]:
petab_select.ui.candidates(
    problem=select_problem,
    candidate_space=candidate_space,
);

The M1_7 model is the most complex model in the model space (all parameters in the space are estimated), so no valid neighbors are identified for the forward selection method.

[19]:
print(f'Number of candidate models: {len(candidate_space.models)}.')
Number of candidate models: 0.

At this point, the results of the model calibration tool for the different models can be used to select the best model.

[20]:
best_model = select_problem.get_best(calibrated_models.values())
print_model(best_model)
Model subspace ID: M1_6
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 'estimate', 'k2': 'estimate', 'k3': 0}
Model hash: db8700c079c8347123adc89b7f5112256c4aaebd2af0f6e32e7582f398b2c1e5e85e588cdcc56bab054c001b96a9b42b02174266927f879d7f78e8ac5d2c33e6
Model ID: db8700c079c8347123adc89b7f5112256c4aaebd2af0f6e32e7582f398b2c1e5e85e588cdcc56bab054c001b96a9b42b02174266927f879d7f78e8ac5d2c33e6
AIC: -110

Sixth iteration

Note that additional, uncalibrated models can remain in the model space after a single run of the forward algorithm terminates. These additional models can be identified with the brute-force method.
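Conceptually, this step finds every model in the model space that has not been calibrated yet. A minimal sketch, assuming the eight model subspaces of this problem and the seven models calibrated during the forward search (`uncalibrated_models` is a hypothetical helper, not a petab_select function):

```python
def uncalibrated_models(all_model_ids, calibrated_ids):
    """Return the models in the space that have not been calibrated yet."""
    calibrated_ids = set(calibrated_ids)
    return [m for m in all_model_ids if m not in calibrated_ids]


all_model_ids = [f'M1_{index}' for index in range(8)]
# Models calibrated during the forward search in this notebook.
visited = ['M1_0', 'M1_1', 'M1_2', 'M1_3', 'M1_5', 'M1_6', 'M1_7']
remaining = uncalibrated_models(all_model_ids, visited)
print(remaining)  # ['M1_4']
```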

[21]:
candidate_space = petab_select.BruteForceCandidateSpace()
petab_select.ui.candidates(
    problem=select_problem,
    candidate_space=candidate_space,
    calibrated_models=calibrated_models,
);
[22]:
for candidate_model in candidate_space.models:
    print_model(candidate_model)
Model subspace ID: M1_4
PEtab YAML location: model_selection/petab_problem.yaml
Custom model parameters: {'k1': 0.2, 'k2': 'estimate', 'k3': 'estimate'}
Model hash: 38c95dd428b3e31da6969a50db4a1ccbcefe6d8824617d27ec2360e57d55647a25f3fdd45e5f0270786698606cbe496cd94be9495986dade4d1f1d166a4bf911
Model ID: 38c95dd428b3e31da6969a50db4a1ccbcefe6d8824617d27ec2360e57d55647a25f3fdd45e5f0270786698606cbe496cd94be9495986dade4d1f1d166a4bf911
AIC: None