
Agent

coleman4hcs.agent

Agent classes for the Coleman4HCS framework.

This module provides an abstract representation of an agent in the Coleman4HCS framework.

An Agent represents an entity that interacts with the environment to perform test case prioritization. The agent uses a policy to decide on an action (i.e., a prioritized list of test cases) and then observes the environment to receive a reward. The agent updates its internal state or knowledge based on the reward, allowing it to improve its decisions over time.

Classes:

  • Agent: Base class for agents. Defines common methods and properties all agents should have.
  • RewardAgent: An agent that learns using a reward function. Inherits from Agent.
  • ContextualAgent: Extends the RewardAgent to incorporate contextual information for decision-making.
  • RewardSlidingWindowAgent: An agent that learns using a sliding window mechanism and a reward function. Inherits from RewardAgent.
  • SlidingWindowContextualAgent: Combines the sliding window mechanism with contextual information. Inherits from ContextualAgent.

Notes

Common attributes across agent types:

  • policy: The policy used by the agent to choose an action.
  • bandit: An instance of the Bandit class that the agent interacts with.
  • actions: A DataFrame that tracks the agent's actions and their respective outcomes.
  • last_prioritization: Stores the last action chosen by the agent.
  • t: Represents the time or the number of steps the agent has taken.
  • context_features: (For contextual agents) Contains the features of the context.
  • history: (For sliding window agents) Maintains a history of actions taken by the agent.
  • window_size: (For sliding window agents) Determines the size of the sliding window.

Agent

An agent that selects one of a set of actions at each time step.

The action is chosen using a strategy based on the history of prior actions and outcome observations.

Parameters:

  • policy (object, required): The policy used by the agent to choose an action. For instance, FRRMAB.
  • bandit (Bandit, default None): The bandit instance the agent interacts with.

Attributes:

  • policy (object): The policy used by the agent to choose an action.
  • bandit (Bandit or None): The bandit instance the agent interacts with.
  • last_prioritization (list or None): The last action (test case ordering) chosen by the agent.
  • t (int): The number of steps the agent has taken.
  • actions (DataFrame): A DataFrame tracking the agent's actions and their respective outcomes.
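The choose/observe cycle described above can be sketched with a toy agent. This is a minimal illustration of the interaction pattern, not the coleman4hcs implementation: the class, the greedy ordering policy, and the reward values are all assumptions.

```python
# Minimal sketch of the choose/observe cycle (illustrative, not the
# coleman4hcs API). A toy policy orders tests by their value estimates.

class ToyAgent:
    def __init__(self, actions):
        # One value estimate (Q) and attempt counter (k) per test case.
        self.q = {a: 0.0 for a in actions}
        self.k = {a: 0 for a in actions}
        self.last_prioritization = None
        self.t = 0

    def choose(self):
        # Toy policy: order test cases by their current value estimate.
        self.last_prioritization = sorted(self.q, key=self.q.get, reverse=True)
        return self.last_prioritization

    def observe(self, rewards):
        # Running-mean update of each action's estimate from its reward.
        for a, r in zip(self.last_prioritization, rewards):
            self.k[a] += 1
            self.q[a] += (r - self.q[a]) / self.k[a]
        self.t += 1

agent = ToyAgent(["t1", "t2", "t3"])
agent.choose()                  # initial prioritization (all estimates tie)
agent.observe([1.0, 0.0, 0.5])  # rewards, e.g. derived from observed failures
```

After one cycle the agent's estimates already reorder the next prioritization toward the tests that were rewarded most.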

__init__

__init__(policy, bandit=None)

Initialize the Agent.

Parameters:

  • policy (object, required): The policy used by the agent to choose an action. For instance, FRRMAB.
  • bandit (Bandit, default None): The bandit instance the agent interacts with.

__str__

__str__()

Return a string representation of the agent.

Returns:

  • str: String representation of the agent's policy.

add_action

add_action(action)

Add a new action if it does not already exist.

Parameters:

  • action (str, required): The name of the action (test case) to add.

choose

choose()

Choose an action using the agent's policy.

An action is the prioritized test suite.

Returns:

  • list of str: List of test cases in ascending order of priority.

observe

observe(reward)

Update Q action-value estimates.

Uses the update rule: Q(a) <- Q(a) + 1/(k+1) * (r(a) - Q(a))

Parameters:

  • reward (array-like, required): The reward values for each action in the last prioritization.
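The update rule above is an incremental running mean: after k observations, Q(a) equals the average of the rewards seen for that action so far. A standalone check (not framework code):

```python
def q_update(q, k, r):
    """One application of Q(a) <- Q(a) + 1/(k+1) * (r(a) - Q(a))."""
    return q + (r - q) / (k + 1)

# Applying the rule over a reward sequence yields its running mean.
q = 0.0
for k, r in enumerate([1.0, 0.0, 1.0, 1.0]):
    q = q_update(q, k, r)
print(q)  # 0.75, the mean of the four rewards
```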

reset

reset()

Reset the agent's memory to an initial state.

update_action_attempts

update_action_attempts()

Update action counter k -> k+1.

A weight is applied to counterbalance the order of choice, since every test is selected in each prioritization.
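Because every test is selected each cycle, a flat k -> k+1 increment would treat first and last positions alike. One way to counterbalance the order of choice is a position-dependent weight; the rank-based decay below is an assumed illustration, not the framework's exact weighting:

```python
def update_action_attempts(k, prioritization):
    """Add a position-dependent weight to each action's attempt counter
    (earlier positions weigh more). The weighting scheme is illustrative."""
    n = len(prioritization)
    for pos, action in enumerate(prioritization):
        k[action] = k.get(action, 0.0) + (n - pos) / n
    return k

k = update_action_attempts({}, ["t1", "t2", "t3"])  # t1 first, t3 last
```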

update_actions

update_actions(actions)

Update the agent's action set.

This method performs several tasks:

  1. Removes actions that are no longer available.
  2. Identifies and adds new actions that were not previously in the agent's set.
  3. Notifies the agent's policy of the new actions.

Parameters:

  • actions (list of str, required): List of available actions.
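The remove/add behaviour can be sketched with plain dicts. The framework keeps these statistics in a pandas DataFrame, and the key names below are assumptions for illustration:

```python
def sync_actions(stats, available):
    """Drop actions that are no longer available and register new ones
    with zeroed statistics (illustrative sketch of the steps above)."""
    kept = {a: s for a, s in stats.items() if a in available}
    for a in available:
        kept.setdefault(a, {"ActionAttempts": 0, "ValueEstimates": 0.0})
    return kept

stats = {"t1": {"ActionAttempts": 3, "ValueEstimates": 0.5},
         "t2": {"ActionAttempts": 1, "ValueEstimates": 0.2}}
stats = sync_actions(stats, ["t2", "t3"])  # t1 removed, t3 added fresh
```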

update_bandit

update_bandit(bandit)

Update the agent's associated bandit.

This method sets the agent's bandit to the provided instance and then updates the agent's action set based on the arms available in the new bandit.

Parameters:

  • bandit (Bandit, required): The new bandit instance to be associated with the agent.

ContextualAgent

Bases: RewardAgent

An agent that learns using a reward function and contextual information.

The contextual information can be chosen by the user to guide decision-making.

Parameters:

  • policy (object, required): The policy used by the agent to choose an action.
  • reward_function (object, required): The reward function used by the agent to evaluate outcomes.

Attributes:

  • context_features (object or None): The features of the current context.
  • features (object or None): The features used for decision-making.

__init__

__init__(policy, reward_function)

Initialize the ContextualAgent.

Parameters:

  • policy (object, required): The policy used by the agent to choose an action.
  • reward_function (object, required): The reward function used by the agent to evaluate outcomes.

__str__

__str__()

Return a string representation of the contextual agent.

Returns:

  • str: String representation of the agent's policy.

choose

choose()

Choose an action using the agent's policy.

An action is the prioritized test suite.

Returns:

  • list of str: List of test cases in ascending order of priority.

update_actions

update_actions(actions)

Update the set of available actions based on the current context.

This method adjusts the agent's possible actions according to the current context: the available actions may change with the state of the environment or other contextual information, and this keeps the agent's action set up to date.

Parameters:

  • actions (list of str, required): List of available action names.

update_bandit

update_bandit(bandit)

Update the internal bandit instance used by the agent.

This method updates the agent's internal bandit to the provided instance. This can be useful when the agent needs to adapt to changes in the environment or when the bandit's state changes over time.

Parameters:

  • bandit (Bandit, required): The new bandit instance to be used by the agent.

update_context

update_context(context_features)

Update the agent's current context information.

The context provides additional information that can help the agent in making decisions. This might include external factors or environmental states that could influence the agent's strategy.

Parameters:

  • context_features (object, required): A collection or DataFrame containing the contextual information.

update_features

update_features(features)

Update the features used by the agent for decision-making.

Features represent specific characteristics or properties of data that the agent uses to make its decisions.

Parameters:

  • features (list, required): A list or collection of features.
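How context can steer the prioritization is easiest to see with a toy linear score. The bonus scheme below is an assumption for illustration, not the scoring used by any coleman4hcs policy:

```python
def contextual_priorities(base_q, context_bonus):
    """Rank tests by base estimate plus a per-test context bonus,
    best first. Purely illustrative scoring."""
    scores = {a: q + context_bonus.get(a, 0.0) for a, q in base_q.items()}
    return sorted(scores, key=scores.get, reverse=True)

order = contextual_priorities(
    {"t1": 0.2, "t2": 0.8},
    {"t1": 1.0},  # e.g. t1 exercises recently changed code
)
```

Without the context bonus t2 would rank first; the contextual signal promotes t1.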

RewardAgent

Bases: Agent

An agent that learns using a reward function.

Parameters:

  • policy (object, required): The policy used by the agent to choose an action.
  • reward_function (object, required): The reward function used by the agent to evaluate outcomes.

Attributes:

  • reward_function (object): The reward function used by the agent.
  • last_reward (float): The last reward received by the agent.

__init__

__init__(policy, reward_function)

Initialize the RewardAgent.

Parameters:

  • policy (object, required): The policy used by the agent to choose an action.
  • reward_function (object, required): The reward function used by the agent to evaluate outcomes.

get_reward_function

get_reward_function()

Retrieve the reward function associated with the agent.

Returns:

  • object: The reward function of the agent.

observe

observe(reward)

Observe the reward and update value estimates.

Parameters:

  • reward (EvaluationMetric, required): The reward (result) obtained by the evaluation metric.

RewardSlidingWindowAgent

Bases: RewardAgent

An agent that learns using a sliding window and a reward function.

Parameters:

  • policy (object, required): The policy used by the agent to choose an action.
  • reward_function (object, required): The reward function used by the agent.
  • window_size (int, required): The size of the sliding window.

Attributes:

  • window_size (int): The size of the sliding window.
  • history (DataFrame): History of actions taken by the agent.

__init__

__init__(policy, reward_function, window_size)

Initialize the RewardSlidingWindowAgent.

Parameters:

  • policy (object, required): The policy used by the agent to choose an action.
  • reward_function (object, required): The reward function used by the agent.
  • window_size (int, required): The size of the sliding window.

__str__

__str__()

Return a string representation of the sliding window agent.

Returns:

  • str: String representation including policy and window size.

observe

observe(reward)

Observe the reward and update value estimates using the sliding window.

Parameters:

  • reward (EvaluationMetric, required): The reward (result) obtained by the evaluation metric.

update_history

update_history()

Update the agent's history of actions and outcomes.

Adds the current action and its outcome to the agent's history. If the length of the history exceeds the window size, the oldest entries are removed to maintain the specified window size.
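The trimming behaviour is exactly what `collections.deque` with a `maxlen` provides. The framework itself keeps the history in a DataFrame; this is only a behavioural sketch with illustrative names:

```python
from collections import deque

class SlidingHistory:
    """Fixed-size action/outcome history: appending beyond window_size
    silently evicts the oldest entry, as described above."""
    def __init__(self, window_size):
        self.entries = deque(maxlen=window_size)

    def add(self, action, outcome):
        self.entries.append((action, outcome))

history = SlidingHistory(window_size=2)
for action, outcome in [("t1", 0.0), ("t2", 1.0), ("t3", 0.5)]:
    history.add(action, outcome)
# only the two most recent entries remain
```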

SlidingWindowContextualAgent

Bases: ContextualAgent

An agent that learns using a reward function, contextual information, and a sliding window.

Combines contextual decision-making with a sliding window mechanism.

Parameters:

  • policy (object, required): The policy used by the agent to choose an action.
  • reward_function (object, required): The reward function used by the agent.
  • window_size (int, required): The size of the sliding window.

Attributes:

  • window_size (int): The size of the sliding window.
  • context_features (object or None): The features of the current context.
  • features (object or None): The features used for decision-making.
  • history (DataFrame): History of actions taken by the agent.

__init__

__init__(policy, reward_function, window_size)

Initialize the SlidingWindowContextualAgent.

Parameters:

  • policy (object, required): The policy used by the agent to choose an action.
  • reward_function (object, required): The reward function used by the agent.
  • window_size (int, required): The size of the sliding window.

__str__

__str__()

Return a string representation of the sliding window contextual agent.

Returns:

  • str: String representation including policy and window size.

observe

observe(reward)

Observe the reward and update value estimates using the sliding window.

Parameters:

  • reward (EvaluationMetric, required): The reward (result) obtained by the evaluation metric.

update_history

update_history()

Update the agent's history of actions and outcomes.

Adds the current action and its outcome to the agent's history. If the length of the history exceeds the window size, the oldest entries are removed to maintain the specified window size.