Agent¶
coleman4hcs.agent ¶
Agent classes for the Coleman4HCS framework.
This module provides an abstract representation of an agent in the Coleman4HCS framework.
An Agent represents an entity that interacts with the environment to perform
test case prioritization. The agent uses a policy to decide on an action (i.e.,
a prioritized list of test cases) and then observes the environment to receive a reward.
The agent updates its internal state or knowledge based on the reward, allowing it to
improve its decisions over time.
Classes:

| Name | Description |
|---|---|
| Agent | Base class for agents. Defines common methods and properties all agents should have. |
| RewardAgent | An agent that learns using a reward function. Inherits from Agent. |
| ContextualAgent | Extends the RewardAgent with contextual information to guide decision-making. |
| RewardSlidingWindowAgent | An agent that learns using a sliding window mechanism and a reward function. Inherits from RewardAgent. |
| SlidingWindowContextualAgent | Combines the sliding window mechanism with contextual information. Inherits from ContextualAgent. |
Notes
Common attributes across agent types:
- policy: The policy used by the agent to choose an action.
- bandit: An instance of the Bandit class that the agent interacts with.
- actions: A DataFrame that tracks the agent's actions and their respective outcomes.
- last_prioritization: Stores the last action chosen by the agent.
- t: Represents the time or the number of steps the agent has taken.
- context_features: (For contextual agents) Contains the features of the context.
- history: (For sliding window agents) Maintains a history of actions taken by the agent.
- window_size: (For sliding window agents) Determines the size of the sliding window.
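The choose/observe cycle described above can be sketched with a minimal, self-contained stand-in. `MiniAgent` and `DummyPolicy` are hypothetical simplifications, not the framework's classes: the real framework supplies concrete policies (e.g. FRRMAB), a Bandit instance, and reward functions, and tracks actions in a DataFrame rather than plain dicts.

```python
class DummyPolicy:
    """Hypothetical policy: orders actions by current value estimate, highest first."""
    def choose_all(self, q_values):
        return sorted(q_values, key=q_values.get, reverse=True)


class MiniAgent:
    """Toy stand-in for the Agent choose/observe cycle."""
    def __init__(self, policy):
        self.policy = policy
        self.q = {}   # action -> value estimate
        self.k = {}   # action -> number of times observed
        self.t = 0
        self.last_prioritization = None

    def add_action(self, action):
        if action not in self.q:
            self.q[action] = 0.0
            self.k[action] = 0

    def choose(self):
        # An "action" is a full prioritized list of test cases.
        self.last_prioritization = self.policy.choose_all(self.q)
        return self.last_prioritization

    def observe(self, rewards):
        # Incremental mean update: Q(a) <- Q(a) + 1/(k+1) * (r(a) - Q(a))
        for action, r in zip(self.last_prioritization, rewards):
            self.k[action] += 1
            self.q[action] += (r - self.q[action]) / self.k[action]
        self.t += 1


agent = MiniAgent(DummyPolicy())
for tc in ["t1", "t2", "t3"]:
    agent.add_action(tc)

order = agent.choose()          # prioritized list of test cases
agent.observe([1.0, 0.0, 0.5])  # per-action rewards from the evaluation metric
```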
Agent ¶
An agent that selects one of a set of actions at each time step.
The action is chosen using a strategy based on the history of prior actions and outcome observations.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| policy | object | The policy used by the agent to choose an action. For instance, FRRMAB. | required |
| bandit | Bandit | The bandit instance the agent interacts with. | None |
Attributes:

| Name | Type | Description |
|---|---|---|
| policy | object | The policy used by the agent to choose an action. |
| bandit | Bandit or None | The bandit instance the agent interacts with. |
| last_prioritization | list or None | The last action (test case ordering) chosen by the agent. |
| t | int | The number of steps the agent has taken. |
| actions | DataFrame | A DataFrame tracking the agent's actions and their respective outcomes. |
__init__ ¶
Initialize the Agent.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| policy | object | The policy used by the agent to choose an action. For instance, FRRMAB. | required |
| bandit | Bandit | The bandit instance the agent interacts with. | None |
__str__ ¶
Return a string representation of the agent.
Returns:

| Type | Description |
|---|---|
| str | String representation of the agent's policy. |
add_action ¶
Add a new action if it does not already exist.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| action | str | The name of the action (test case) to add. | required |
choose ¶
Choose an action using the agent's policy.
An action is the prioritized test suite.
Returns:

| Type | Description |
|---|---|
| list of str | List of test cases in ascending order of priority. |
observe ¶
Update Q action-value estimates.
Uses the update rule: Q(a) <- Q(a) + 1/(k+1) * (r(a) - Q(a))
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| reward | array-like | The reward values for each action in the last prioritization. | required |
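The update rule above is the standard incremental mean: after k+1 observations, Q(a) equals the average of all rewards seen for action a. A quick sketch (the `update_q` helper is illustrative, not part of the framework's API):

```python
def update_q(q, k, r):
    """One step of Q(a) <- Q(a) + 1/(k+1) * (r(a) - Q(a))."""
    return q + (r - q) / (k + 1)


rewards = [1.0, 0.0, 0.5, 0.5]
q, k = 0.0, 0
for r in rewards:
    q = update_q(q, k, r)
    k += 1

# After processing all rewards, q equals their plain average.
assert abs(q - sum(rewards) / len(rewards)) < 1e-12
```

Because each step only needs the previous estimate and the count, the agent never has to store the full reward history to maintain an exact running mean.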
update_action_attempts ¶
Update action counter k -> k+1.
A weight is given to counterbalance the order of choice, since all tests are selected.
update_actions ¶
Update the agent's action set.
This method performs several tasks:

1. Removes actions that are no longer available.
2. Identifies and adds new actions that were not previously in the agent's set.
3. Notifies the agent's policy of the new actions.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| actions | list of str | List of available actions. | required |
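The three steps can be sketched as follows. This is a simplification under stated assumptions: the real agent stores actions in a pandas DataFrame, and the `add_actions` notification hook on the policy stub is a hypothetical name, not the framework's actual interface.

```python
class PolicyStub:
    """Hypothetical policy that records which new actions it was told about."""
    def __init__(self):
        self.notified = []

    def add_actions(self, new_actions):  # hypothetical notification hook
        self.notified.extend(new_actions)


def update_actions(current, available, policy):
    """Sync the agent's action set with the currently available actions."""
    available = set(available)
    kept = current & available        # 1. drop actions no longer available
    new = available - current         # 2. identify newly appeared actions
    policy.add_actions(sorted(new))   # 3. notify the policy of new actions
    return kept | new


policy = PolicyStub()
actions = update_actions({"t1", "t2"}, ["t2", "t3"], policy)
```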
update_bandit ¶
Update the agent's associated bandit.
This method sets the agent's bandit to the provided instance and then updates the agent's action set based on the arms available in the new bandit.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| bandit | Bandit | The new bandit instance to be associated with the agent. | required |
ContextualAgent ¶
Bases: RewardAgent
An agent that learns using a reward function and contextual information.
The contextual information can be chosen by the user to guide decision-making.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| policy | object | The policy used by the agent to choose an action. | required |
| reward_function | object | The reward function used by the agent to evaluate outcomes. | required |
Attributes:

| Name | Type | Description |
|---|---|---|
| context_features | object or None | The features of the current context. |
| features | object or None | The features used for decision-making. |
__init__ ¶
Initialize the ContextualAgent.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| policy | object | The policy used by the agent to choose an action. | required |
| reward_function | object | The reward function used by the agent to evaluate outcomes. | required |
__str__ ¶
Return a string representation of the contextual agent.
Returns:

| Type | Description |
|---|---|
| str | String representation of the agent's policy. |
choose ¶
Choose an action using the agent's policy.
An action is the prioritized test suite.
Returns:

| Type | Description |
|---|---|
| list of str | List of test cases in ascending order of priority. |
update_actions ¶
Update the set of available actions based on the current context.
This method adjusts the agent's possible actions based on the current context. In some situations, the available actions might change based on the state of the environment or other contextual information. This method ensures that the agent always has an up-to-date set of actions to choose from.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| actions | list of str | List of available action names. | required |
update_bandit ¶
Update the internal bandit instance used by the agent.
This method updates the agent's internal bandit to the provided instance. This can be useful when the agent needs to adapt to changes in the environment or when the bandit's state changes over time.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| bandit | Bandit | The new bandit instance to be used by the agent. | required |
update_context ¶
Update the agent's current context information.
The context provides additional information that can help the agent in making decisions. This might include external factors or environmental states that could influence the agent's strategy.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| context_features | object | A collection or DataFrame containing the contextual information. | required |
update_features ¶
Update the features used by the agent for decision-making.
Features represent specific characteristics or properties of data that the agent uses to make its decisions.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| features | list | A list or collection of features. | required |
RewardAgent ¶
Bases: Agent
An agent that learns using a reward function.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| policy | object | The policy used by the agent to choose an action. | required |
| reward_function | object | The reward function used by the agent to evaluate outcomes. | required |
Attributes:

| Name | Type | Description |
|---|---|---|
| reward_function | object | The reward function used by the agent. |
| last_reward | float | The last reward received by the agent. |
__init__ ¶
Initialize the RewardAgent.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| policy | object | The policy used by the agent to choose an action. | required |
| reward_function | object | The reward function used by the agent to evaluate outcomes. | required |
get_reward_function ¶
Retrieve the reward function associated with the agent.
Returns:

| Type | Description |
|---|---|
| object | The reward function of the agent. |
observe ¶
Observe the reward and update value estimates.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| reward | EvaluationMetric | The reward (result) obtained by the evaluation metric. | required |
RewardSlidingWindowAgent ¶
Bases: RewardAgent
An agent that learns using a sliding window and a reward function.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| policy | object | The policy used by the agent to choose an action. | required |
| reward_function | object | The reward function used by the agent. | required |
| window_size | int | The size of the sliding window. | required |
Attributes:

| Name | Type | Description |
|---|---|---|
| window_size | int | The size of the sliding window. |
| history | DataFrame | History of actions taken by the agent. |
__init__ ¶
Initialize the RewardSlidingWindowAgent.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| policy | object | The policy used by the agent to choose an action. | required |
| reward_function | object | The reward function used by the agent. | required |
| window_size | int | The size of the sliding window. | required |
__str__ ¶
Return a string representation of the sliding window agent.
Returns:

| Type | Description |
|---|---|
| str | String representation including policy and window size. |
observe ¶
Observe the reward and update value estimates using the sliding window.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| reward | EvaluationMetric | The reward (result) obtained by the evaluation metric. | required |
update_history ¶
Update the agent's history of actions and outcomes.
Adds the current action and its outcome to the agent's history. If the length of the history exceeds the window size, the oldest entries are removed to maintain the specified window size.
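The trimming behaviour can be sketched with a pandas DataFrame, as the docs indicate for `history`. The column names and the standalone `update_history` helper are assumptions for illustration; the real method operates on the agent's own state.

```python
import pandas as pd


def update_history(history, entry, window_size):
    """Append one row and keep only the most recent `window_size` rows."""
    history = pd.concat([history, pd.DataFrame([entry])], ignore_index=True)
    return history.tail(window_size).reset_index(drop=True)


# Start with one observation, then push four more through a window of size 3.
history = pd.DataFrame([{"action": "t0", "reward": 0.0}])
for i in range(1, 5):
    history = update_history(
        history, {"action": f"t{i}", "reward": 0.1 * i}, window_size=3
    )
```

After the loop, only the three most recent entries (t2, t3, t4) remain; older rows are discarded as the window slides forward.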
SlidingWindowContextualAgent ¶
Bases: ContextualAgent
An agent that learns using a reward function, contextual information, and a sliding window.
Combines contextual decision-making with a sliding window mechanism.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| policy | object | The policy used by the agent to choose an action. | required |
| reward_function | object | The reward function used by the agent. | required |
| window_size | int | The size of the sliding window. | required |
Attributes:

| Name | Type | Description |
|---|---|---|
| window_size | int | The size of the sliding window. |
| context_features | object or None | The features of the current context. |
| features | object or None | The features used for decision-making. |
| history | DataFrame | History of actions taken by the agent. |
__init__ ¶
Initialize the SlidingWindowContextualAgent.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| policy | object | The policy used by the agent to choose an action. | required |
| reward_function | object | The reward function used by the agent. | required |
| window_size | int | The size of the sliding window. | required |
__str__ ¶
Return a string representation of the sliding window contextual agent.
Returns:

| Type | Description |
|---|---|
| str | String representation including policy and window size. |
observe ¶
Observe the reward and update value estimates using the sliding window.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| reward | EvaluationMetric | The reward (result) obtained by the evaluation metric. | required |
update_history ¶
Update the agent's history of actions and outcomes.
Adds the current action and its outcome to the agent's history. If the length of the history exceeds the window size, the oldest entries are removed to maintain the specified window size.