mlwiz.evaluation
evaluation.config
- class mlwiz.evaluation.config.Config(config_dict: dict)
Bases: object
Simple class to manage the configuration dictionary as a Python object with fields.
- Parameters:
config_dict (dict) – the configuration dictionary
- get(key: str, default: object | None = None) object
Returns the value associated with the key if it is present in the dictionary, otherwise the specified default value
- Parameters:
key (str) – the key to look up in the dictionary
default (object) – the default object
- Returns:
a value from the dictionary
- items() list
Invokes the items() method of the configuration dictionary
- Returns:
a list of (key, value) pairs
- keys() set
Invokes the keys() method of the configuration dictionary
- Returns:
the set of keys in the dictionary
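The behaviour described above can be pictured with a small stand-in class. This is only a sketch, assuming that dictionary keys are exposed as attributes; `ConfigSketch` and its internals are hypothetical and are not the mlwiz implementation.

```python
class ConfigSketch:
    """Hypothetical stand-in for a dictionary-backed configuration object."""

    def __init__(self, config_dict: dict):
        # Keep the original dictionary and mirror each key as an attribute
        # (assumption: keys are valid Python identifiers).
        self._config_dict = config_dict
        for key, value in config_dict.items():
            setattr(self, key, value)

    def get(self, key: str, default: object = None) -> object:
        # Delegate to dict.get: value if present, otherwise the default.
        return self._config_dict.get(key, default)

    def keys(self):
        # Delegate to the keys() method of the underlying dictionary.
        return self._config_dict.keys()

    def items(self):
        # Delegate to the items() method of the underlying dictionary.
        return self._config_dict.items()


cfg = ConfigSketch({"lr": 0.01, "epochs": 10})
print(cfg.lr)                   # attribute-style access
print(cfg.get("dropout", 0.5))  # missing key falls back to the default
```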
evaluation.evaluator
- class mlwiz.evaluation.evaluator.RiskAssesser(outer_folds: int, inner_folds: int, experiment_class: Callable[[...], Experiment], exp_path: str, splits_filepath: str, model_configs: Grid | RandomSearch, risk_assessment_training_runs: int, model_selection_training_runs: int, higher_is_better: bool, gpus_per_task: float, base_seed: int = 42, training_timeout_seconds: int = -1)
Bases: object
Class implementing a K-Fold technique to do Risk Assessment (estimate of the true generalization performance) and K-Fold Model Selection (select the best hyper-parameters for each external fold).
- Parameters:
outer_folds (int) – The number K of outer TEST folds. You should have generated the splits accordingly
inner_folds (int) – The number K of inner VALIDATION folds. You should have generated the splits accordingly
experiment_class (Callable[…, Experiment]) – the experiment class to be instantiated
exp_path (str) – The folder in which to store all results
splits_filepath (str) – The splits filepath with additional meta information
model_configs (Union[Grid, RandomSearch]) – an object storing all possible model configurations, e.g., config.base.Grid
risk_assessment_training_runs (int) – number of final training runs to mitigate bad initializations
model_selection_training_runs (int) – number of training runs to mitigate bad initializations at model selection time
higher_is_better (bool) – whether the best model for each external fold should be selected by higher or lower score values
gpus_per_task (float) – Number of GPUs to assign to each experiment. Can be < 1.
base_seed (int) – Seed used to generate experiment seeds. Used to replicate results. Default is 42
training_timeout_seconds (int) – optional timeout limit per experiment in seconds
- _request_termination()
Signals all workers and the UI to terminate gracefully.
- compute_best_hyperparameters(folder: str, outer_k: int, no_configurations: int, skip_config_ids: List[int])
Chooses the best hyper-parameters configuration using the proper validation mean score.
- Parameters:
folder (str) – the model selection folder associated with outer fold k
outer_k (int) – the current outer fold to consider. Used for Telegram updates
no_configurations (int) – number of possible configurations
skip_config_ids – list of configuration ids to skip
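As a rough illustration of selecting a configuration by mean validation score, the sketch below works on an in-memory dictionary rather than on the model selection folder. The function name `pick_best_config`, the shape of `validation_scores`, and the 1-based configuration ids are assumptions made for the example; they do not describe how mlwiz stores or reads results.

```python
from statistics import mean
from typing import Dict, List, Optional


def pick_best_config(
    validation_scores: Dict[int, List[float]],
    higher_is_better: bool,
    skip_config_ids: Optional[List[int]] = None,
) -> int:
    """Return the id of the configuration with the best mean validation score."""
    skip = set(skip_config_ids or [])
    # Average the scores of every configuration that is not skipped.
    candidates = {
        config_id: mean(scores)
        for config_id, scores in validation_scores.items()
        if config_id not in skip
    }
    if not candidates:
        raise ValueError("no configurations left after skipping")
    # Pick the maximum or the minimum depending on the metric direction.
    chooser = max if higher_is_better else min
    return chooser(candidates, key=candidates.get)


# Hypothetical validation scores, one list per configuration id.
scores = {1: [0.80, 0.82], 2: [0.90, 0.88], 3: [0.85, 0.85]}
best = pick_best_config(scores, higher_is_better=True, skip_config_ids=[2])
print(best)
```

Skipped ids are removed before the comparison, so a configuration that is still running never wins by default.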
- compute_final_runs_score_per_fold(outer_k: int)
Computes the average scores for the final runs of a specific outer fold
- Parameters:
outer_k (int) – id of the outer fold from 0 to K-1
- compute_risk_assessment_result()
Aggregates the results of the outer folds and computes the Training and Test mean/std
- model_selection(kfold_folder: str, outer_k: int, debug: bool, execute_config_id: int | None, skip_config_ids: List[int])
Performs model selection.
- Parameters:
kfold_folder – The root folder for model selection
outer_k – the current outer fold to consider
debug – if True, sequential execution is performed and logs are printed to screen
execute_config_id – if debug mode is enabled, it will prioritize the execution of this configuration. It assumes indices start from 1. Use this to debug specific configurations.
skip_config_ids – if provided, the listed configurations will not be considered for model selection. Use it, for instance, when a run is taking too long to execute and you decide it is not worth waiting for.
- process_config_results_across_inner_folds(config_folder: str, config: Config)
Averages the results for each configuration across inner folds and stores it into a file.
- Parameters:
config_folder (str)
config (Config) – the configuration object
- process_model_selection_runs(inner_fold_exp_folder: str, inner_k: int)
Computes the average performance of the training runs for a specific configuration and a specific inner fold split
- Parameters:
inner_fold_exp_folder (str) – an inner fold experiment folder of a specific configuration
inner_k (int) – the inner fold id
- risk_assessment(debug: bool, execute_config_id: int | None = None, skip_config_ids: List[int] | None = None)
Performs risk assessment to evaluate the performances of a model.
- Parameters:
debug – if True, sequential execution is performed and logs are printed to screen
execute_config_id – if debug mode is enabled, it will prioritize the execution of this configuration for each model selection procedure. It assumes indices start from 1. Use this to debug specific configurations.
skip_config_ids – if provided, the listed configurations will not be considered for model selection. Use it, for instance, when a run is taking too long to execute and you decide it is not worth waiting for.
- run_final_model(outer_k: int, debug: bool)
Performs the final runs once the best model for outer fold outer_k has been chosen.
- Parameters:
outer_k (int) – the current outer fold to consider
debug (bool) – if True, sequential execution is performed and logs are printed to screen
- wait_configs(skip_config_ids: List[int]) bool
Waits for configurations to terminate and updates the state of the progress manager
- Returns:
True if all runs completed successfully, False otherwise.
- Return type:
bool
- mlwiz.evaluation.evaluator._mean_std_ci(values: numpy.ndarray) Tuple[float, float, float]
Computes mean, std, and 95% confidence interval for the provided values.
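The documentation does not say how the interval is derived, so the following is only a sketch under explicit assumptions: the sample standard deviation (n - 1 denominator), a normal approximation with z = 1.96, and the third returned value being the half-width of the interval. The name `mean_std_ci_sketch` is hypothetical, and the library's own helper may use a different estimator.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def mean_std_ci_sketch(values: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean, sample std, half-width of an approximate 95% CI)."""
    n = len(values)
    m = mean(values)
    # With a single value there is no spread to estimate.
    if n < 2:
        return m, 0.0, 0.0
    s = stdev(values)  # sample standard deviation (assumption)
    ci = 1.96 * s / math.sqrt(n)  # normal approximation (assumption)
    return m, s, ci


m, s, ci = mean_std_ci_sketch([1.0, 2.0, 3.0])
print(f"{m:.3f} +/- {ci:.3f} (std {s:.3f})")
```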
- mlwiz.evaluation.evaluator._push_progress_update(progress_actor, payload: dict)
Safely forwards progress updates to the shared actor.
- mlwiz.evaluation.evaluator.extract_and_sum_elapsed_seconds(file_path)
- mlwiz.evaluation.evaluator.run_test(experiment_class: Callable[[...], Experiment], dataset_getter: Callable[[...], DataProvider], best_config: dict, outer_k: int, run_id: int, final_run_exp_path: str, final_run_torch_path: str, exp_seed: int, training_timeout_seconds: int, logger: Logger, progress_actor=None) Tuple[int, int, float]
Ray job that performs a risk assessment run and returns bookkeeping information for the progress manager.
- Parameters:
experiment_class (Callable[…, Experiment]) – the class of the experiment to instantiate
dataset_getter (Callable[…, DataProvider]) – the class of the data provider to instantiate
best_config (dict) – the best configuration to use for this specific outer fold
run_id (int) – the id of the final run (for bookkeeping reasons)
final_run_exp_path (str) – path of the experiment root folder
final_run_torch_path (str) – path where to store the results of the experiment
exp_seed (int) – seed of the experiment
training_timeout_seconds (int) – timeout for the experiment in seconds
logger (Logger) – a logger to log information in the appropriate file
- Returns:
a tuple with outer fold id, final run id, and time elapsed
- mlwiz.evaluation.evaluator.run_valid(experiment_class: Callable[[...], Experiment], dataset_getter: Callable[[...], DataProvider], config: dict, config_id: int, run_id: int, fold_exp_folder: str, fold_results_torch_path: str, exp_seed: int, training_timeout_seconds: int, logger: Logger, progress_actor=None) Tuple[int, int, int, int, float]
Ray job that performs a model selection run and returns bookkeeping information for the progress manager.
- Parameters:
experiment_class (Callable[…, Experiment]) – the class of the experiment to instantiate
dataset_getter (Callable[…, DataProvider]) – the class of the data provider to instantiate
config (dict) – the configuration of this specific experiment
config_id (int) – the id of the configuration (for bookkeeping reasons)
run_id (int) – the id of the training run (for bookkeeping reasons)
fold_exp_folder (str) – path of the experiment root folder
fold_results_torch_path (str) – path where to store the results of the experiment
exp_seed (int) – seed of the experiment
training_timeout_seconds (int) – timeout for the experiment in seconds
logger (Logger) – a logger to log information in the appropriate file
- Returns:
a tuple with outer fold id, inner fold id, config id, run id, and time elapsed
- mlwiz.evaluation.evaluator.send_telegram_update(bot_token: str, bot_chat_ID: str, bot_message: str)
Sends a message using Telegram APIs. Markdown can be used.
- Parameters:
bot_token (str) – token of the user’s bot
bot_chat_ID (str) – identifier of the chat where to write the message
bot_message (str) – the message to be sent
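For orientation, the three parameters map onto the `sendMessage` method of the Telegram Bot API. The sketch below only builds the request URL and sends nothing; `build_telegram_url` is a hypothetical helper, and which HTTP client the library uses to issue the request is not stated here.

```python
from urllib.parse import urlencode


def build_telegram_url(bot_token: str, bot_chat_ID: str, bot_message: str) -> str:
    """Build a Telegram Bot API sendMessage URL with Markdown parsing enabled."""
    query = urlencode(
        {
            "chat_id": bot_chat_ID,
            "parse_mode": "Markdown",
            "text": bot_message,  # urlencode escapes spaces and symbols
        }
    )
    return f"https://api.telegram.org/bot{bot_token}/sendMessage?{query}"


# Placeholder token and chat id, for illustration only.
url = build_telegram_url("TOKEN", "12345", "fold 0 *done*")
print(url)
```

Issuing a GET request to the resulting URL with any HTTP client would deliver the message to the chat.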
evaluation.grid
- class mlwiz.evaluation.grid.Grid(configs_dict: dict)
Bases: object
Class that implements grid-search. It computes all possible configurations starting from a suitable config file.
- Parameters:
configs_dict (dict) – the configuration dictionary specifying the different configurations to try
- _gen_configs() List[dict]
Takes a dictionary of key:list pairs and computes all possible combinations.
- Returns:
A list of all possible configurations in the form of dictionaries
- _gen_helper(cfgs_dict: dict) dict
Helper generator that yields one possible configuration at a time.
- _list_helper(values: object) object
Recursively parses lists of possible options for a given hyper-parameter.
- property exp_name: str
Computes the name of the root folder
- Returns:
the name of the root folder, formed as EXP-NAME_DATASET-NAME
- property num_configs: int
Computes the number of configurations to try during model selection
- Returns:
the number of configurations
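The expansion of key:list pairs into all combinations can be sketched with `itertools.product`. This is a simplified illustration that assumes a flat dictionary whose values are lists of options; the nested options that `_list_helper` is described as parsing are not covered, and `gen_grid_configs` is a hypothetical name.

```python
from itertools import product
from typing import Any, Dict, Iterator, List


def gen_grid_configs(configs_dict: Dict[str, List[Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one dictionary per combination of the listed options."""
    keys = list(configs_dict)
    # product iterates over the Cartesian product of the option lists.
    for combination in product(*(configs_dict[key] for key in keys)):
        yield dict(zip(keys, combination))


# Hypothetical hyper-parameter grid.
grid = {"lr": [0.1, 0.01], "hidden_units": [32, 64], "optimizer": ["adam"]}
configs = list(gen_grid_configs(grid))
print(len(configs))  # number of configurations to try
```

The number of configurations is the product of the list lengths, which grows quickly as options are added.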
evaluation.random_search
- class mlwiz.evaluation.random_search.RandomSearch(configs_dict: dict)
Bases: Grid
Class that implements random-search. It computes all possible configurations starting from a suitable config file.
- Parameters:
configs_dict (dict) – the configuration dictionary specifying the different configurations to try
- _dict_helper(configs: dict)
Recursively parses a dictionary
- Returns:
A dictionary
- _gen_helper(cfgs_dict: dict) Iterator[Dict[str, Any]]
Takes a dictionary of key:list pairs and computes all possible combinations.
- Returns:
A list of all possible configurations in the form of dictionaries
- _sampler_helper(configs: dict)
Samples one or more hyper-parameters and returns them as a dictionary
- Returns:
A dictionary
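The general idea of sampling configurations, as opposed to enumerating them, can be sketched as follows. The search-space format (each key mapped to a list of choices), the uniform sampling, and the names `sample_configs` and `num_samples` are assumptions made for this example; the sampler syntax accepted by the actual config file is not described in this section.

```python
import random
from typing import Any, Dict, List


def sample_configs(
    space: Dict[str, List[Any]], num_samples: int, seed: int = 42
) -> List[Dict[str, Any]]:
    """Draw num_samples configurations, picking one option per key uniformly."""
    rng = random.Random(seed)  # local generator so results are reproducible
    return [
        {key: rng.choice(options) for key, options in space.items()}
        for _ in range(num_samples)
    ]


# Hypothetical search space.
space = {"lr": [0.1, 0.01, 0.001], "dropout": [0.0, 0.5]}
samples = sample_configs(space, num_samples=5)
for config in samples:
    print(config)
```

Fixing the seed makes the sampled configurations repeatable across runs.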