Environment

coleman4hcs.environment

Environment for agent-bandit interactions in Coleman4HCS.

This module contains the Environment class, which simulates the agents' interactions and collects results during the reinforcement learning process. The environment is where agents learn from their interactions with the bandit and make better decisions over time.

The module also contains the following features:

  • Mechanism to reset the environment and the agents' memory.
  • Ability to run a single experiment or multiple experiments.
  • Support for both simple and contextual agents.
  • Periodic saving of experiments via the checkpoint store.
  • Facilities for creating and storing results obtained during experiments.
  • Support for scenarios with variants, commonly found in Highly Configurable Systems (HCS).
  • Helper methods for loading and saving experiment states for recovery purposes.

Classes:

| Name | Description |
| --- | --- |
| Environment | Represents the learning environment where agents interact with bandits. |

Environment

The environment class that simulates the agent's interactions and collects results.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| agents | list | List of agent instances participating in the simulation. | required |
| scenario_provider | object | The scenario provider supplying test case data. | required |
| evaluation_metric | EvaluationMetric | The evaluation metric used to assess prioritization performance. | required |
| results_config | dict or None | Configuration for the results sink (from config.toml [results]). When None, a NullSink is used. | None |
| checkpoint_config | dict or None | Configuration for checkpoints (from config.toml [checkpoint]). When None, a NullCheckpointStore is used. | None |
| telemetry_config | dict or None | Configuration for telemetry (from config.toml [telemetry]). When None, a NoOpTelemetry is used. | None |
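
The None defaults described above read like a null-object pattern: an omitted configuration yields a do-nothing component instead of an error. The sketch below illustrates that idea only. The NullSink class here is a local stand-in, and ListSink, make_sink, and the "enabled" key are hypothetical names invented for the example, not the library's implementation.

```python
class NullSink:
    """Local stand-in for a sink that silently discards every record."""

    def write(self, record):
        pass  # intentionally does nothing


class ListSink:
    """Hypothetical sink that keeps records in memory."""

    def __init__(self, config):
        self.config = config
        self.records = []

    def write(self, record):
        self.records.append(record)


def make_sink(results_config=None):
    """Return a working sink when a config is given, else the null object."""
    if results_config is None:
        return NullSink()
    return ListSink(results_config)


sink_default = make_sink()
sink_configured = make_sink({"enabled": True})  # "enabled" is an illustrative key
sink_default.write({"step": 1})     # discarded
sink_configured.write({"step": 1})  # stored in memory
```

Because both objects expose the same write method, calling code never has to check whether results are enabled.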

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| agents | list | List of agent instances. |
| scenario_provider | object | The scenario provider instance. |
| evaluation_metric | EvaluationMetric | The evaluation metric instance. |
| monitor | MonitorCollector | Monitor for collecting feedback during the process. |
| variant_monitors | dict | Dictionary of monitors for each variant. |
| checkpoint_store | CheckpointStore | Store for saving/loading experiment checkpoints. |
| telemetry | Telemetry or NoOpTelemetry | Telemetry facade for metrics and traces. |

__init__

__init__(agents, scenario_provider, evaluation_metric, results_config=None, checkpoint_config=None, telemetry_config=None, runtime_metadata=None)

Initialize the Environment.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| agents | list | List of agent instances participating in the simulation. | required |
| scenario_provider | object | The scenario provider supplying test case data. | required |
| evaluation_metric | EvaluationMetric | The evaluation metric used to assess prioritization performance. | required |
| results_config | dict or None | Results configuration dict. | None |
| checkpoint_config | dict or None | Checkpoint configuration dict. | None |
| telemetry_config | dict or None | Telemetry configuration dict. | None |

load_experiment

load_experiment(experiment)

Load a backup of the experiment from the checkpoint store.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| experiment | int | The current experiment number. | required |

Returns:

| Type | Description |
| --- | --- |
| CheckpointPayload or None | The loaded checkpoint payload, or None if not found. |
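
Since load_experiment may return None, callers need to handle the no-checkpoint case before resuming. The following is a minimal sketch of that check. InMemoryCheckpointStore, first_step_to_run, and the "t" key in the payload are hypothetical and only illustrate the None contract; the real CheckpointPayload layout is not described here.

```python
class InMemoryCheckpointStore:
    """Hypothetical stand-in for a checkpoint store keyed by experiment number."""

    def __init__(self):
        self._payloads = {}

    def save(self, experiment, payload):
        self._payloads[experiment] = payload

    def load(self, experiment):
        # Mirrors the documented contract: None when nothing was saved.
        return self._payloads.get(experiment)


def first_step_to_run(store, experiment):
    """Return the step to resume from, or 0 when no checkpoint exists."""
    payload = store.load(experiment)
    if payload is None:
        return 0
    # Assumption: the payload records the last completed step under "t".
    return payload["t"] + 1


store = InMemoryCheckpointStore()
store.save(experiment=1, payload={"t": 41})
resume_existing = first_step_to_run(store, experiment=1)
resume_missing = first_step_to_run(store, experiment=2)
```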

reset

reset()

Reset the environment for a new simulation.

reset_agents_memory

reset_agents_memory()

Reset all agents' memory to an initial state.

run

run(experiments=1, trials=100, bandit_type=EvaluationMetricBandit, restore=True)

Execute a simulation over multiple experiments.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| experiments | int | Number of experiments to run. | 1 |
| trials | int | Maximum number of scenarios to analyze. | 100 |
| bandit_type | type | The bandit class to use. | EvaluationMetricBandit |
| restore | bool | Whether to restore the experiment from a checkpoint if it fails. | True |
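
The signatures of run and run_single suggest that run repeats run_single once per experiment. The sketch below shows that relationship under stated assumptions: run_all and fake_run_single are hypothetical helpers, and numbering experiments from 0 is a guess, not confirmed behavior.

```python
def run_all(run_single, experiments=1, trials=100, restore=True):
    """Call run_single once per experiment and collect what it returns."""
    outcomes = []
    # Assumption: experiments are numbered from 0; the real numbering may differ.
    for experiment in range(experiments):
        outcomes.append(run_single(experiment, trials=trials, restore=restore))
    return outcomes


def fake_run_single(experiment, trials=100, restore=True):
    """Hypothetical stand-in that only reports how it was called."""
    return (experiment, trials, restore)


calls = run_all(fake_run_single, experiments=3, trials=10)
```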

run_prioritization

run_prioritization(agent, bandit, bandit_duration, experiment, t, virtual_scenario)

Run the prioritization process for a given agent and scenario.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| agent | Agent | The agent that is being used for the prioritization. | required |
| bandit | Bandit | The bandit mechanism used for choosing actions. | required |
| bandit_duration | float | Time taken by the bandit process. | required |
| experiment | int | The current experiment number. | required |
| t | int | The current step or iteration of the simulation. | required |
| virtual_scenario | VirtualScenario | The virtual scenario being considered. | required |

Returns:

| Type | Description |
| --- | --- |
| tuple | A tuple containing the chosen action, the ending time, the experiment name, and the starting time. |
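
The returned tuple lists the ending time before the starting time, which is easy to unpack in the wrong order. The sketch below unpacks a tuple in the documented order and derives the elapsed time. fake_run_prioritization, the test case names, and the experiment name are hypothetical placeholders.

```python
import time


def fake_run_prioritization():
    """Hypothetical stand-in returning values in the documented order."""
    start = time.perf_counter()
    action = ["test_b", "test_a", "test_c"]  # illustrative test case names
    end = time.perf_counter()
    return action, end, "experiment_1", start


# Unpack in the documented order: action, end, experiment name, start.
action, end, exp_name, start = fake_run_prioritization()
elapsed = end - start
```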

run_prioritization_hcs

run_prioritization_hcs(agent, action, avail_time_ratio, bandit_duration, end, exp_name, experiment, start, t, virtual_scenario)

Run the prioritization process for a given agent and HCS scenario.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| agent | Agent | The agent that is being used for the prioritization. | required |
| action | list of str | The action chosen by the agent. | required |
| avail_time_ratio | float | The available time ratio for the experiment. | required |
| bandit_duration | float | Time taken by the bandit process. | required |
| end | float | The ending time of the process. | required |
| exp_name | str | The name of the experiment. | required |
| experiment | int | The current experiment number. | required |
| start | float | The starting time of the process. | required |
| t | int | The current step or iteration of the simulation. | required |
| virtual_scenario | VirtualHCSScenario | The virtual HCS scenario being considered. | required |

run_single

run_single(experiment, trials=100, bandit_type=EvaluationMetricBandit, restore=True)

Execute a single simulation experiment.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| experiment | int | The current experiment number. | required |
| trials | int | Maximum number of scenarios to analyze. | 100 |
| bandit_type | type | The bandit class to use. | EvaluationMetricBandit |
| restore | bool | Whether to restore the experiment from a checkpoint if it fails (e.g., after a power outage). | True |

save_experiment

save_experiment(experiment, t, bandit)

Save a checkpoint for the experiment.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| experiment | int | The current experiment number. | required |
| t | int | The current step or iteration of the simulation. | required |
| bandit | Bandit | The current bandit being used in the simulation. | required |

Raises:

| Type | Description |
| --- | --- |
| Exception | If there is an error saving the experiment. |

save_periodically

save_periodically(restore, t, experiment, bandit, interval=None)

Save the experiment periodically based on a predefined interval.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| restore | bool | Flag to indicate if the experiment should be restored. | required |
| t | int | The current step or iteration of the simulation. | required |
| experiment | int | The current experiment number. | required |
| bandit | Bandit | The current bandit being used in the simulation. | required |
| interval | int or None | The interval at which the experiment should be saved. When None, uses the configured checkpoint interval. | None |
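
One plausible reading of the interval logic is sketched below: an explicit interval overrides the configured one, and a checkpoint is written only on steps that are a multiple of the effective interval. resolve_interval, should_save, and the rule that saving is skipped when restore is False are assumptions made for illustration, not the library's confirmed behavior.

```python
def resolve_interval(interval, configured_interval):
    """Use the explicit interval, falling back to the configured one."""
    return configured_interval if interval is None else interval


def should_save(restore, t, interval, configured_interval):
    """Decide whether step t is a checkpoint step."""
    effective = resolve_interval(interval, configured_interval)
    # Assumption: saving is skipped when restore is off or the interval is not positive.
    if not restore or effective <= 0:
        return False
    return t % effective == 0


# Steps 1..10 with a configured interval of 5 and no explicit override.
saved_steps = [t for t in range(1, 11) if should_save(True, t, None, 5)]
# The same steps with an explicit interval of 2 overriding the configured 5.
overridden_steps = [t for t in range(1, 11) if should_save(True, t, 2, 5)]
```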

set_runtime_metadata

set_runtime_metadata(runtime_metadata=None)

Set execution-scoped metadata used for telemetry and persisted results.

store_experiment

store_experiment()

Flush and persist the results collected during the experiment.