Reward¶
Coleman supports reward formulations inspired by the test case prioritization (TCP), information-retrieval ranking, and cost-aware literature:
- RNFail (binary fault signal)
- TimeRank (failure-aware order-sensitive reward)
- ReciprocalRank (inverse-rank gain)
- TopKRNFail (prefix-constrained binary reward; precision-style)
- DiscountedFailure (DCG-like logarithmic discount)
- APFDc (cost-aware reward using execution time and detected failure positions)
Literature references:
- Spieker, H.; Gotlieb, A.; Marijan, D.; Mossige, M. (2017). Reinforcement Learning for Automatic Test Case Prioritization and Selection in Continuous Integration. ISSTA.
- Jarvelin, K.; Kekalainen, J. (2002). Cumulated gain-based evaluation of IR techniques. ACM TOIS.
- Rothermel, G.; Untch, R. H.; Chu, C.; Harrold, M. J. (2001). Prioritizing test cases for regression testing. IEEE TSE.
- Elbaum, S.; Malishevsky, A. G.; Rothermel, G. (2002). Test case prioritization: a family of empirical studies. IEEE TSE.
- Manning, C. D.; Raghavan, P.; Schutze, H. (2008). Introduction to Information Retrieval. Cambridge University Press.
coleman.reward ¶
Reward functions for bandit-based test case prioritization.
Defines reward functions for agents in a multi-armed bandit framework in the context of software testing. These reward functions help agents to prioritize software test cases based on various strategies.
The module provides an abstract base class Reward that serves as a blueprint for all
reward functions. Derived classes implement specific reward strategies based on the number
of failures and the order of test cases.
Classes:
| Name | Description |
|---|---|
| `Reward` | An abstract base class that defines the structure and interface of a reward function. |
| `TimeRankReward` | A reward function that considers the order of test cases and the number of failures. |
| `RNFailReward` | A reward function that rewards based on the number of failures associated with test cases. |
| `ReciprocalRankReward` | A reward that gives inverse-rank gain to failing tests. |
| `TopKRNFailReward` | A binary top-k reward that estimates the failure rate among the first k tests. |
| `DiscountedFailureReward` | A logarithmically discounted rank gain for failing tests (DCG-like). |
Notes
Reward functions are essential components of the bandit-based test case prioritization framework: they guide agents toward better decisions about which test cases to prioritize. For the reward functions to work correctly, the evaluation metric must provide the details they rely on, such as detection ranks.
References
- Spieker, H.; Gotlieb, A.; Marijan, D.; Mossige, M. (2017). Reinforcement Learning for Automatic Test Case Prioritization and Selection in Continuous Integration. ISSTA.
- Jarvelin, K.; Kekalainen, J. (2002). Cumulated gain-based evaluation of IR techniques. ACM TOIS, 20(4), 422-446.
APFDcReward ¶
Bases: Reward
Reward failing tests by their APFDc contribution.
The reward for each failing test at rank r is its APFDc contribution,
normalized by total execution cost and the total number of failing tests.
Non-failing tests receive 0.
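The description above can be read as splitting the standard APFDc sum into one term per failing test. The following is a minimal sketch of that reading, not Coleman's implementation: the function name `apfdc_rewards` and its plain-list inputs (`costs`, `detection_ranks`) are hypothetical stand-ins for what the evaluation metric provides, and it assumes unit fault severities and one fault per failing test.

```python
def apfdc_rewards(costs, detection_ranks):
    """Return one reward per position in the prioritization.

    costs: execution cost of each test, in prioritization order (assumed input).
    detection_ranks: 1-indexed ranks of the failing tests (assumed input).
    """
    n_tests = len(costs)
    total_cost = sum(costs)
    failing = set(detection_ranks)
    # Without failures or measurable cost there is nothing to reward.
    if not failing or total_cost <= 0:
        return [0.0] * n_tests

    rewards = [0.0] * n_tests
    remaining_cost = total_cost  # cost of the tests from the current rank to the end
    for rank in range(1, n_tests + 1):
        cost = costs[rank - 1]
        if rank in failing:
            # APFDc numerator term: cost from this rank onward minus half its own cost,
            # normalized by total cost and by the number of failing tests.
            rewards[rank - 1] = (remaining_cost - 0.5 * cost) / (total_cost * len(failing))
        remaining_cost -= cost
    return rewards


rewards = apfdc_rewards(costs=[2.0, 1.0, 1.0], detection_ranks=[1, 3])
print(rewards)       # [0.375, 0.0, 0.0625]
print(sum(rewards))  # 0.4375
```

Under these assumptions the rewards sum to the APFDc value of the ordering, so an early, cheap-to-reach failure earns a larger share than a late one.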
evaluate ¶
Evaluate APFDc-style rewards using stored test execution costs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `reward` | `EvaluationMetric` | Evaluation metric containing detection ranks and testcase costs. | required |
| `last_prioritization` | list of str | Test case names in prioritization order. | required |
Returns:
| Type | Description |
|---|---|
| list of float | Reward values aligned with the prioritization order. |
DiscountedFailureReward ¶
Bases: Reward
Reward failures with logarithmic discount by rank.
For each failing test at rank r (1-indexed), the reward is:
$$\mathrm{gain}(r) = \frac{1}{\log_2(r + 1)}$$
Non-failing tests receive 0.
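As a minimal sketch of the discount above (not Coleman's implementation), the snippet below applies the gain to a list of positions. The function name `discounted_failure_rewards` and the plain inputs `n_tests` and `detection_ranks` are hypothetical; in the library these values would come from the evaluation metric.

```python
import math


def discounted_failure_rewards(n_tests, detection_ranks):
    """Return one reward per rank; failing ranks get 1 / log2(r + 1)."""
    failing = set(detection_ranks)  # 1-indexed ranks of failing tests (assumed input)
    return [
        1.0 / math.log2(rank + 1) if rank in failing else 0.0
        for rank in range(1, n_tests + 1)
    ]


print(discounted_failure_rewards(5, [1, 3]))  # [1.0, 0.0, 0.5, 0.0, 0.0]
```

A failure at rank 1 earns the full gain of 1, while one at rank 3 earns 0.5, so the penalty for late detection grows slowly compared with a reciprocal-rank scheme.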
evaluate ¶
Evaluate discounted rewards for failing positions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `reward` | `EvaluationMetric` | Evaluation metric containing detection ranks. | required |
| `last_prioritization` | list of str | Test case names in prioritization order. | required |
Returns:
| Type | Description |
|---|---|
| list of float | Discounted reward values aligned with the prioritization order. |
RNFailReward ¶
Bases: Reward
Reward Based on Failures (RNFail).
This reward function assigns each test case t' in T' a binary reward based on its outcome: 1 if t' failed, 0 otherwise.
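The binary rule can be sketched in a few lines. This is an illustration under assumptions rather than the library's code: `rnfail_rewards` is a hypothetical helper, and `detection_ranks` is assumed to hold the 1-indexed ranks of failing tests within `last_prioritization`.

```python
def rnfail_rewards(last_prioritization, detection_ranks):
    """Return 1.0 for each failing position and 0.0 for every other position."""
    failing = set(detection_ranks)  # 1-indexed ranks of failing tests (assumed input)
    return [
        1.0 if rank in failing else 0.0
        for rank, _name in enumerate(last_prioritization, start=1)
    ]


print(rnfail_rewards(["t3", "t1", "t2"], [2]))  # [0.0, 1.0, 0.0]
```

The reward ignores where a failure sits in the schedule, which is what separates it from the rank-sensitive rewards in this module.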
__str__ ¶
Return a string representation of the reward function.
Returns:
| Type | Description |
|---|---|
| str | The reward function name. |
evaluate ¶
Evaluate rewards based on failures.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `reward` | `EvaluationMetric` | Evaluation metric containing detection ranks and scheduled test cases. | required |
| `last_prioritization` | list of str | Test case names in prioritization order. | required |

Returns:
| Type | Description |
|---|---|
| list of float | List of rewards for each test case in the prioritization. |
get_name ¶
Return the identifier of the reward function.
Returns:
| Type | Description |
|---|---|
| str | The reward function identifier. |
ReciprocalRankReward ¶
Bases: Reward
Reciprocal-Rank reward.
Rewards failing tests by the inverse of their rank, following a classic information-retrieval signal that strongly favors earlier detections.
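A minimal sketch of the inverse-rank rule follows, assuming 1-indexed ranks. The helper name `reciprocal_rank_rewards` and its plain inputs are hypothetical and are not taken from Coleman's source.

```python
def reciprocal_rank_rewards(n_tests, detection_ranks):
    """Return one reward per rank; a failing test at rank r gets 1 / r."""
    failing = set(detection_ranks)  # 1-indexed ranks of failing tests (assumed input)
    return [
        1.0 / rank if rank in failing else 0.0
        for rank in range(1, n_tests + 1)
    ]


print(reciprocal_rank_rewards(4, [2, 4]))  # [0.0, 0.5, 0.0, 0.25]
```

Moving a failure from rank 2 to rank 4 halves its reward, which illustrates how strongly this signal favors earlier detections.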
evaluate ¶
Evaluate rewards based on reciprocal failing ranks.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `reward` | `EvaluationMetric` | Evaluation metric containing detection ranks. | required |
| `last_prioritization` | list of str | Test case names in prioritization order. | required |
Returns:
| Type | Description |
|---|---|
| list of float | Reciprocal-rank reward values aligned with the prioritization order. |
Reward ¶
Bases: ABC
Abstract base class for reward functions.
A reward function is used by the agent in the observe method to evaluate bandit results and return a reward.
evaluate `abstractmethod` ¶
Evaluate a bandit result and return a reward.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `reward` | `EvaluationMetric` | The evaluation metric result. | required |
| `last_prioritization` | list of str | The last prioritized test suite list. | required |

Returns:
| Type | Description |
|---|---|
| list of float | The computed rewards for each test case. |
get_name ¶
Retrieve the name or identifier of the reward function.
Returns:
| Type | Description |
|---|---|
| str | The name or identifier of the reward function. |
TimeRankReward ¶
Bases: Reward
Time-ranked Reward (TimeRank).
This reward function explicitly includes the order of test cases and rewards each test case based on its rank in the test schedule and whether it failed. As a good schedule executes failing test cases early, every passed test case reduces the schedule's quality if it precedes a failing test case. Each test case is rewarded by the total number of failed test cases; for failed test cases it is the same as reward function 'RNFailReward'. For passed test cases, the reward is further decreased by the number of failed test cases ranked after the passed test case to penalize scheduling passing test cases early.
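The snippet below follows the wording of this description literally: failed tests receive the total number of failures, and passed tests receive that total minus the number of failures ranked after them. It is a sketch under assumptions, not Coleman's implementation, which may normalize or otherwise differ. The name `timerank_rewards` and its plain inputs are hypothetical.

```python
def timerank_rewards(n_tests, detection_ranks):
    """Return one unnormalized TimeRank-style reward per rank."""
    failing = set(detection_ranks)  # 1-indexed ranks of failing tests (assumed input)
    total_failed = len(failing)
    rewards = []
    failed_seen = 0  # failures at or before the current rank
    for rank in range(1, n_tests + 1):
        if rank in failing:
            failed_seen += 1
            rewards.append(float(total_failed))
        else:
            # Penalize a passing test by the failures scheduled after it.
            failed_after = total_failed - failed_seen
            rewards.append(float(total_failed - failed_after))
    return rewards


print(timerank_rewards(4, [2, 4]))  # [0.0, 2.0, 1.0, 2.0]
```

In this example the passing test at rank 1 precedes both failures and receives 0, while the passing test at rank 3 precedes only one failure and receives 1.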
__str__ ¶
Return a string representation of the reward function.
Returns:
| Type | Description |
|---|---|
| str | The reward function name. |
evaluate ¶
Evaluate rewards based on the prioritization rank of test cases.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `reward` | `EvaluationMetric` | The evaluation metric containing detection ranks and scheduled test cases. | required |
| `last_prioritization` | list of str | The list of test case names in the prioritization order. | required |

Returns:
| Type | Description |
|---|---|
| list of float | A list of rewards for each test case in the prioritization. |
get_name ¶
Return the identifier of the reward function.
Returns:
| Type | Description |
|---|---|
| str | The reward function identifier. |
TopKRNFailReward ¶
Bases: Reward
Top-k binary failure reward.
Considers only whether a test failed (a binary signal) within the first `top_k` positions. Each failing test within the top-k prefix receives `1 / k_eff`, where `k_eff = min(top_k, len(last_prioritization))`. The rewards over the selected prefix therefore sum to the fraction of failing tests in the top-k.
When `use_time_budget` is enabled, the prefix is also capped by the number of tests scheduled by the metric under the active time budget.
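The prefix rule can be sketched as follows. This is an illustration under assumptions, not the library's code: `topk_rnfail_rewards` and its plain inputs are hypothetical, and the optional time-budget cap is deliberately left out.

```python
def topk_rnfail_rewards(n_tests, detection_ranks, top_k):
    """Return 1 / k_eff for failing tests inside the top-k prefix, else 0.0."""
    k_eff = min(top_k, n_tests)
    if k_eff <= 0:
        return [0.0] * n_tests
    failing = set(detection_ranks)  # 1-indexed ranks of failing tests (assumed input)
    return [
        1.0 / k_eff if rank <= k_eff and rank in failing else 0.0
        for rank in range(1, n_tests + 1)
    ]


rewards = topk_rnfail_rewards(n_tests=5, detection_ranks=[1, 4, 5], top_k=4)
print(rewards)       # [0.25, 0.0, 0.0, 0.25, 0.0]
print(sum(rewards))  # 0.5
```

Two of the first four tests fail, so the rewards sum to 0.5; the failure at rank 5 falls outside the prefix and earns nothing.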
evaluate ¶
Evaluate top-k binary rewards.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `reward` | `EvaluationMetric` | Evaluation metric containing detection ranks. | required |
| `last_prioritization` | list of str | Test case names in prioritization order. | required |
Returns:
| Type | Description |
|---|---|
| list of float | Reward vector aligned with the prioritization order. |