{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "6c1bc86d",
   "metadata": {},
   "source": [
    "# Example usage with Python 3\n",
    "This notebook demonstrates usage of `petab_select` to perform forward selection in a Python 3 script."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9e42ea66",
   "metadata": {},
   "source": [
    "## Problem setup with initial model"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "427cbd52",
   "metadata": {},
   "source": [
    "Dependencies are imported. A model selection problem is loaded from the specification files. Some helper methods are defined."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "eab391ee",
   "metadata": {},
   "outputs": [],
   "source": [
    "import petab_select\n",
    "from petab_select import Model\n",
    "from petab_select.constants import (\n",
    "    CANDIDATE_SPACE,\n",
    "    MODELS,\n",
    "    UNCALIBRATED_MODELS,\n",
    ")\n",
    "\n",
    "BOLD_TEXT = \"\\033[1m\"\n",
    "NORMAL_TEXT = \"\\033[0m\"\n",
    "\n",
    "# Load the PEtab Select problem.\n",
    "select_problem = petab_select.Problem.from_yaml(\n",
    "    \"model_selection/petab_select_problem.yaml\"\n",
    ")\n",
    "# Fake criterion values as a surrogate for a model calibration tool.\n",
    "fake_criterion = {\n",
    "    \"M1_0\": 200,\n",
    "    \"M1_1\": 150,\n",
    "    \"M1_2\": 140,\n",
    "    \"M1_3\": 130,\n",
    "    \"M1_4\": -40,\n",
    "    \"M1_5\": -70,\n",
    "    \"M1_6\": -110,\n",
    "    \"M1_7\": 50,\n",
    "}\n",
    "\n",
    "\n",
    "def print_model(model: Model) -> None:\n",
    "    \"\"\"Helper method to view model attributes.\"\"\"\n",
    "    print(\n",
    "        f\"\"\"\\\n",
    "Model subspace ID: {model.model_subspace_id}\n",
    "PEtab YAML location: {model.model_subspace_petab_yaml}\n",
    "Custom model parameters: {model.parameters}\n",
    "Model hash: {model.hash}\n",
    "Model ID: {model.model_id}\n",
    "{select_problem.criterion}: {model.get_criterion(select_problem.criterion, compute=False)}\n",
    "Model was calibrated in iteration: {model.iteration}\n",
    "\"\"\"\n",
    "    )\n",
    "\n",
    "\n",
    "def calibrate(model: Model, fake_criterion=fake_criterion) -> None:\n",
    "    \"\"\"Set model criterion values to fake values that could be the output of a calibration tool.\n",
    "\n",
    "    Each model subspace in this problem contains only one model, so a model-specific criterion can\n",
    "    be indexed by the model subspace ID.\n",
    "    \"\"\"\n",
    "    model.set_criterion(\n",
    "        select_problem.criterion, fake_criterion[model.model_subspace_id]\n",
    "    )\n",
    "\n",
    "\n",
    "print(\"Information about the model selection problem:\")\n",
    "print(select_problem)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "88172819",
   "metadata": {},
   "source": [
    "## First iteration\n",
    "\n",
    "Neighbors of the initial predecessor model in the model space are identified for testing. Here, no initial predecessor model is specified. If an initial predecessor model is required for the algorithm, PEtab Select can automatically use the `VIRTUAL_INITIAL_MODEL`. With the forward and backward methods, the virtual initial model defaults to a model with no parameters estimated, and all parameters estimated, respectively."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "080d08d4",
   "metadata": {},
   "source": [
    "The model space is then used to find neighbors to the initial model. A candidate space is used to calculate distances between models, and whether a candidate model represents a valid move in model space.\n",
    "\n",
    "The built-in `ForwardCandidateSpace` uses the following properties to identify candidate models:\n",
    "\n",
    "- previously estimated parameters must remain estimated;\n",
    "- the number of estimated parameters must increase; and\n",
    "- this increase must be minimal.\n",
    "\n",
    "The model space keeps a history of calibrated models, such that subsequent calls ignore previously identified neighbors. This can be disabled by changing usage to `petab_select.ModelSpace.search(..., exclude=False)`, or reset to forget all history with `petab_select.ModelSpace.reset()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f0f327ad",
   "metadata": {},
   "outputs": [],
   "source": [
    "iteration = petab_select.ui.start_iteration(problem=select_problem)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b9245832",
   "metadata": {},
   "source": [
    "Model IDs default to the model hash, which is generated from the model subspace ID, the model parameterization, and the \"PEtab hash\". The PEtab hash is generated from the location of the PEtab problem YAML file, and the nominal values and list of estimated parameters from the model's PEtab parameter table.\n",
    "\n",
    "Here, the model identified is a model with all possible parameters fixed, because it matches the virtual initial model. If the initial model was from the \"real\" model subspace, then candidate models would be true forward steps in the subspace (e.g. an increase in the number of estimated parameters)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6996aed0",
   "metadata": {},
   "source": [
    "Each of the candidate models includes information that should be sufficient for model calibration with any suitable tool that supports PEtab.\n",
    "\n",
    "NB: the `petab_yaml` is for the original PEtab problem, and would need to be customized by `parameters` to be the actual candidate model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "edefa697",
   "metadata": {},
   "outputs": [],
   "source": [
    "for candidate_model in iteration[UNCALIBRATED_MODELS]:\n",
    "    print_model(candidate_model)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e7104888",
   "metadata": {},
   "source": [
    "At this point, a model calibration tool is used to find the best of the test models, according to some criterion. PEtab select can select the best model from a collection of models that provide a value for this criterion, or a specific model can be supplied. Here, PEtab Select will be used to select the best model from multiple models. At the end of the following iterations, a specific model will be provided."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0f027ef2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set fake criterion values that might be the output of a model calibration tool.\n",
    "for candidate_model in iteration[UNCALIBRATED_MODELS]:\n",
    "    calibrate(candidate_model)\n",
    "\n",
    "iteration_results = petab_select.ui.end_iteration(\n",
    "    problem=select_problem,\n",
    "    candidate_space=iteration[CANDIDATE_SPACE],\n",
    "    calibrated_models=iteration[UNCALIBRATED_MODELS],\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1c51dd49",
   "metadata": {},
   "outputs": [],
   "source": [
    "local_best_model = petab_select.ui.get_best(\n",
    "    problem=select_problem, models=iteration_results[MODELS]\n",
    ")\n",
    "print_model(local_best_model)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "031b2eaf",
   "metadata": {},
   "source": [
    "## Second iteration\n",
    "The process then repeats."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d5b167b4",
   "metadata": {},
   "source": [
    "The chosen model is used as the predecessor model, such that neighboring models are identified with respect to the chosen model. Here, we define a dummy calibration tool that performs all parts of the model selection iteration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b15c30ea",
   "metadata": {},
   "outputs": [],
   "source": [
    "def dummy_calibration_tool(\n",
    "    problem: petab_select.Problem,\n",
    "    candidate_space: petab_select.CandidateSpace = None,\n",
    "):\n",
    "    # Initialize iteration\n",
    "    iteration = petab_select.ui.start_iteration(\n",
    "        problem=problem,\n",
    "        candidate_space=candidate_space,\n",
    "    )\n",
    "\n",
    "    # \"Calibrate\": set fake criterion values that might be the output of a model calibration tool.\n",
    "    for candidate_model in iteration[UNCALIBRATED_MODELS]:\n",
    "        calibrate(candidate_model)\n",
    "\n",
    "    # Finalize iteration\n",
    "    iteration_results = petab_select.ui.end_iteration(\n",
    "        problem=select_problem,\n",
    "        candidate_space=iteration[CANDIDATE_SPACE],\n",
    "        calibrated_models=iteration[UNCALIBRATED_MODELS],\n",
    "    )\n",
    "\n",
    "    return iteration_results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5b6969ca",
   "metadata": {},
   "outputs": [],
   "source": [
    "iteration_results = dummy_calibration_tool(\n",
    "    problem=select_problem, candidate_space=iteration_results[CANDIDATE_SPACE]\n",
    ")\n",
    "local_best_model = petab_select.ui.get_best(\n",
    "    problem=select_problem, models=iteration_results[MODELS]\n",
    ")\n",
    "\n",
    "for candidate_model in iteration_results[MODELS]:\n",
    "    if candidate_model.hash == local_best_model.hash:\n",
    "        print(BOLD_TEXT + \"BEST MODEL OF CURRENT ITERATION\" + NORMAL_TEXT)\n",
    "    print_model(candidate_model)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ab3e8077",
   "metadata": {},
   "source": [
    "## Third iteration"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6d3468d3",
   "metadata": {},
   "outputs": [],
   "source": [
    "iteration_results = dummy_calibration_tool(\n",
    "    problem=select_problem, candidate_space=iteration_results[CANDIDATE_SPACE]\n",
    ")\n",
    "local_best_model = petab_select.ui.get_best(\n",
    "    problem=select_problem, models=iteration_results[MODELS]\n",
    ")\n",
    "\n",
    "for candidate_model in iteration_results[MODELS]:\n",
    "    if candidate_model.hash == local_best_model.hash:\n",
    "        print(BOLD_TEXT + \"BEST MODEL OF CURRENT ITERATION\" + NORMAL_TEXT)\n",
    "    print_model(candidate_model)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4dca7bff",
   "metadata": {},
   "source": [
    "## Fourth iteration"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9f9c438c",
   "metadata": {},
   "outputs": [],
   "source": [
    "iteration_results = dummy_calibration_tool(\n",
    "    problem=select_problem, candidate_space=iteration_results[CANDIDATE_SPACE]\n",
    ")\n",
    "local_best_model = petab_select.ui.get_best(\n",
    "    problem=select_problem, models=iteration_results[MODELS]\n",
    ")\n",
    "\n",
    "for candidate_model in iteration_results[MODELS]:\n",
    "    if candidate_model.hash == local_best_model.hash:\n",
    "        print(BOLD_TEXT + \"BEST MODEL OF CURRENT ITERATION\" + NORMAL_TEXT)\n",
    "    print_model(candidate_model)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "44320e4e",
   "metadata": {},
   "source": [
    "## Fifth iteration"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "30344b30",
   "metadata": {},
   "outputs": [],
   "source": [
    "iteration_results = dummy_calibration_tool(\n",
    "    problem=select_problem, candidate_space=iteration_results[CANDIDATE_SPACE]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "23317d8d",
   "metadata": {},
   "source": [
    "The `M1_7` model is the most complex model in the model space (all parameters in the space are estimated), so no valid neighbors are identified for the forward selection method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7843fcb6",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(f\"Number of candidate models: {len(iteration_results[MODELS])}.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c2d1090e",
   "metadata": {},
   "source": [
    "At this point, the results of the model calibration tool for the different models can be used to select the best model. You can collect all calibrated models from `iteration_results`. Alternatively, you can access the `CandidateSpace.calibrated_models` attribute."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "219d27e4",
   "metadata": {},
   "outputs": [],
   "source": [
    "best_model = petab_select.ui.get_best(\n",
    "    problem=select_problem,\n",
    "    models=iteration_results[CANDIDATE_SPACE].calibrated_models,\n",
    ")\n",
    "print_model(best_model)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "65d1f895",
   "metadata": {},
   "source": [
    "## Sixth iteration\n",
    "Note that there can exist additional, uncalibrated models in the model space, after a single forward algorithm terminates. These additional models can be identified with the brute-force method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cacda13d",
   "metadata": {},
   "outputs": [],
   "source": [
    "candidate_space = petab_select.BruteForceCandidateSpace(\n",
    "    criterion=select_problem.criterion\n",
    ")\n",
    "# candidate_space.calibrated_models = iteration_results[CANDIDATE_SPACE].calibrated_models\n",
    "iteration = petab_select.ui.start_iteration(\n",
    "    problem=select_problem,\n",
    "    candidate_space=candidate_space,\n",
    ");"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7440cc69",
   "metadata": {},
   "outputs": [],
   "source": [
    "for candidate_model in iteration[UNCALIBRATED_MODELS]:\n",
    "    calibrate(candidate_model)\n",
    "    print_model(candidate_model)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "924ee6af-f72a-455d-8985-13a6f00a14a1",
   "metadata": {},
   "source": [
    "All calibrated models across all iterations can be retrieved via `Problem.state.models`, which is set by `ui.end_iteration`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "10b62dd7-a3c3-420e-a88d-9ac44e367145",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Add latest model to all models\n",
    "petab_select.ui.end_iteration(\n",
    "    problem=select_problem,\n",
    "    candidate_space=iteration[CANDIDATE_SPACE],\n",
    "    calibrated_models=iteration[UNCALIBRATED_MODELS],\n",
    ");"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "68d1f89a-31a2-4f96-8fe0-82d1ad130832",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Print all models\n",
    "select_problem.state.models.df"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}