Reward¶

Coleman supports reward formulations inspired by the test case prioritization (TCP), ranking, and cost-aware literature:

RNFail (binary fault signal)
TimeRank (failure-aware order-sensitive reward)
ReciprocalRank (inverse-rank gain)
TopKRNFail (prefix-constrained binary reward; precision-style)
DiscountedFailure (DCG-like logarithmic discount)
APFDc (cost-aware reward using execution time and detected failure positions)

Literature references:

Spieker, H.; Gotlieb, A.; Marijan, D.; Mossige, M. (2017). Reinforcement Learning for Automatic Test Case Prioritization and Selection in Continuous Integration. ISSTA.
Jarvelin, K.; Kekalainen, J. (2002). Cumulated gain-based evaluation of IR techniques. ACM TOIS.
Rothermel, G.; Untch, R. H.; Chu, C.; Harrold, M. J. (2001). Prioritizing test cases for regression testing. IEEE TSE.
Elbaum, S.; Malishevsky, A. G.; Rothermel, G. (2002). Test case prioritization: a family of empirical studies. IEEE TSE.
Manning, C. D.; Raghavan, P.; Schutze, H. (2008). Introduction to Information Retrieval. Cambridge University Press.

coleman.reward ¶

Reward functions for bandit-based test case prioritization.

Defines reward functions for agents in a multi-armed bandit framework in the context of software testing. These reward functions help agents to prioritize software test cases based on various strategies.

The module provides an abstract base class Reward that serves as a blueprint for all reward functions. Derived classes implement specific reward strategies based on the number of failures and the order of test cases.

Classes:

Name	Description
`Reward`	An abstract base class that defines the structure and interface of a reward function.
`TimeRankReward`	A reward function that considers the order of test cases and the number of failures.
`RNFailReward`	A reward function that rewards based on the number of failures associated with test cases.
`ReciprocalRankReward`	A reward that gives inverse-rank gain to failing tests.
`TopKRNFailReward`	A binary top-k reward that estimates failure rate among first k tests.
`DiscountedFailureReward`	A logarithmically discounted rank gain for failing tests (DCG-like).
`APFDcReward`	A cost-aware reward that credits failing tests by their APFDc contribution.

Notes

Reward functions are essential components of the bandit-based test case prioritization framework. They guide agents to make better decisions about which test cases to prioritize. The evaluation metric passed to a reward function must provide the details that function needs, such as detection ranks (and test case costs for `APFDcReward`); otherwise the rewards cannot be computed correctly.

References

Spieker, H.; Gotlieb, A.; Marijan, D.; Mossige, M. (2017). Reinforcement Learning for Automatic Test Case Prioritization and Selection in Continuous Integration. ISSTA.
Jarvelin, K.; Kekalainen, J. (2002). Cumulated gain-based evaluation of IR techniques. ACM TOIS, 20(4), 422-446.

APFDcReward ¶

Bases: Reward

Reward failing tests by their APFDc contribution.

The reward for each failing test at rank r is its APFDc contribution, normalized by total execution cost and the total number of failing tests. Non-failing tests receive 0.
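The sketch below illustrates one common reading of the APFDc contribution, assuming unit fault severity and that each failing test reveals one fault. The function name `apfdc_contributions` and its arguments are hypothetical and are not part of `coleman.reward`; the library's exact normalization may differ.

```python
def apfdc_contributions(costs, failing_ranks):
    """Per-position APFDc contributions under unit fault severity (assumed).

    costs: execution cost of each test, in prioritization order.
    failing_ranks: 1-indexed ranks of the failing tests.
    """
    total_cost = sum(costs)
    n_fail = len(failing_ranks)
    rewards = [0.0] * len(costs)
    if total_cost <= 0 or n_fail == 0:
        return rewards  # nothing to reward
    for r in failing_ranks:
        # Cost from the failing test to the end of the schedule,
        # minus half of the failing test's own cost.
        remaining = sum(costs[r - 1:])
        rewards[r - 1] = (remaining - 0.5 * costs[r - 1]) / (total_cost * n_fail)
    return rewards


rewards = apfdc_contributions([2.0, 1.0, 1.0], [1, 3])
print(rewards)
```

Under these assumptions the per-test rewards sum to the APFDc value of the whole schedule, so moving a failing test earlier (or in front of expensive tests) increases its reward.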

__str__ ¶

__str__()

Return a string representation of the reward function.

evaluate ¶

evaluate(reward, last_prioritization)

Evaluate APFDc-style rewards using stored test execution costs.

Parameters:

Name	Type	Description	Default
`reward`	`EvaluationMetric`	Evaluation metric containing detection ranks and testcase costs.	required
`last_prioritization`	`list of str`	Test case names in prioritization order.	required

Returns:

Type	Description
`list of float`	Reward values aligned with `last_prioritization`.

get_name ¶

get_name()

Return the identifier of the reward function.

DiscountedFailureReward ¶

Bases: Reward

Reward failures with logarithmic discount by rank.

For each failing test at rank r (1-indexed), the reward is:

$$\mathrm{gain}(r) = \frac{1}{\log_2(r + 1)}$$

Non-failing tests receive 0.
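As a minimal sketch of the discount above, the following standalone function applies the gain to a list of failing ranks. The name `discounted_failure_rewards` and its arguments are hypothetical; it does not call the library.

```python
import math


def discounted_failure_rewards(n_tests, failing_ranks):
    """Return 1 / log2(r + 1) at each failing rank r (1-indexed), else 0."""
    rewards = [0.0] * n_tests
    for r in failing_ranks:
        rewards[r - 1] = 1.0 / math.log2(r + 1)
    return rewards


rewards = discounted_failure_rewards(4, [1, 3])
print(rewards)  # [1.0, 0.0, 0.5, 0.0]
```

A failure at rank 1 earns the full gain of 1, while a failure at rank 3 earns 1 / log2(4) = 0.5, so the penalty for late detection grows slowly with rank.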

__str__ ¶

__str__()

Return a string representation of the reward function.

evaluate ¶

evaluate(reward, last_prioritization)

Evaluate discounted rewards for failing positions.

Parameters:

Name	Type	Description	Default
`reward`	`EvaluationMetric`	Evaluation metric containing detection ranks.	required
`last_prioritization`	`list of str`	Test case names in prioritization order.	required

Returns:

Type	Description
`list of float`	Discounted reward values aligned with `last_prioritization`.

get_name ¶

get_name()

Return the identifier of the reward function.

RNFailReward ¶

Bases: Reward

Reward Based on Failures (RNFail).

This reward function assigns each test case t' in T' a binary signal based on its failures: 1 if t' failed, 0 otherwise.
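The binary signal can be sketched in a few lines. The helper `rnfail_rewards` and the `failed` set are hypothetical names used only for illustration.

```python
def rnfail_rewards(last_prioritization, failed):
    """Return 1.0 for each test that failed and 0.0 otherwise.

    last_prioritization: test case names in prioritization order.
    failed: set of names of the tests that failed.
    """
    return [1.0 if name in failed else 0.0 for name in last_prioritization]


rewards = rnfail_rewards(["t3", "t1", "t2"], {"t1"})
print(rewards)  # [0.0, 1.0, 0.0]
```

Note that the position of a failing test does not change its reward here; only the failure itself is signaled.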

__str__ ¶

__str__()

Return a string representation of the reward function.

Returns:

Type	Description
`str`	The reward function name.

evaluate ¶

evaluate(reward, last_prioritization)

Evaluate rewards based on failures.

Parameters:

Name	Type	Description	Default
`reward`	`EvaluationMetric`	Evaluation metric containing detection ranks and scheduled test cases.	required
`last_prioritization`	`list of str`	Test case names in prioritization order.	required

Returns:

Type	Description
`list of float`	List of rewards for each test case in the prioritization.

get_name ¶

get_name()

Return the identifier of the reward function.

Returns:

Type	Description
`str`	The reward function identifier.

ReciprocalRankReward ¶

Bases: Reward

Reciprocal-Rank reward.

Rewards failing tests by the inverse of their rank, following a classic information-retrieval signal that strongly favors earlier detections.
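A minimal sketch of the inverse-rank gain, assuming 1-indexed ranks and a reward of 0 for passing tests. The function name `reciprocal_rank_rewards` is hypothetical.

```python
def reciprocal_rank_rewards(n_tests, failing_ranks):
    """Return 1 / r at each failing rank r (1-indexed), else 0."""
    rewards = [0.0] * n_tests
    for r in failing_ranks:
        rewards[r - 1] = 1.0 / r
    return rewards


rewards = reciprocal_rank_rewards(4, [1, 4])
print(rewards)  # [1.0, 0.0, 0.0, 0.25]
```

Compared with a logarithmic discount, 1 / r drops much faster, which is why this signal favors early detections so strongly.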

__str__ ¶

__str__()

Return a string representation of the reward function.

evaluate ¶

evaluate(reward, last_prioritization)

Evaluate rewards based on reciprocal failing ranks.

Parameters:

Name	Type	Description	Default
`reward`	`EvaluationMetric`	Evaluation metric containing detection ranks.	required
`last_prioritization`	`list of str`	Test case names in prioritization order.	required

Returns:

Type	Description
`list of float`	Reciprocal-rank reward values aligned with `last_prioritization`.

get_name ¶

get_name()

Return the identifier of the reward function.

Reward ¶

Bases: ABC

Abstract base class for reward functions.

A reward function is used by the agent in the observe method to evaluate bandit results and return a reward.
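To illustrate the shape of this interface, the sketch below defines stand-in classes rather than importing the library. `RewardSketch`, `MetricSketch`, `FailureFlagSketch`, the `detection_ranks` attribute name, and the default `get_name` behavior are all assumptions made for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class MetricSketch:
    """Hypothetical stand-in for EvaluationMetric; attribute name is assumed."""
    detection_ranks: list = field(default_factory=list)  # 1-indexed failing ranks


class RewardSketch(ABC):
    """Stand-in mirroring the documented interface, not the library class."""

    @abstractmethod
    def evaluate(self, reward, last_prioritization):
        """Return one float per entry of last_prioritization."""

    def get_name(self):
        # Assumed default; the real class may define this differently.
        return type(self).__name__


class FailureFlagSketch(RewardSketch):
    """Example subclass: 1.0 at failing ranks, 0.0 elsewhere."""

    def evaluate(self, reward, last_prioritization):
        failing = set(reward.detection_ranks)
        return [1.0 if rank in failing else 0.0
                for rank in range(1, len(last_prioritization) + 1)]


rewards = FailureFlagSketch().evaluate(MetricSketch([2]), ["a", "b", "c"])
print(rewards)  # [0.0, 1.0, 0.0]
```

The point of the sketch is the contract: `evaluate` receives the metric and the prioritized names, and returns a reward list of the same length and order.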

evaluate `abstractmethod` ¶

evaluate(reward, last_prioritization)

Evaluate a bandit result and return a reward.

Parameters:

Name	Type	Description	Default
`reward`	`EvaluationMetric`	The evaluation metric result.	required
`last_prioritization`	`list of str`	The last prioritized test suite list.	required

Returns:

Type	Description
`list of float`	The computed rewards for each test case.

get_name ¶

get_name()

Retrieve the name or identifier of the reward function.

Returns:

Type	Description
`str`	The name or identifier of the reward function.

TimeRankReward ¶

Bases: Reward

Time-ranked Reward (TimeRank).

This reward function explicitly accounts for the order of test cases: each test case is rewarded based on its rank in the test schedule and on whether it failed. Because a good schedule executes failing test cases early, every passed test case that precedes a failing one lowers the schedule's quality. Each test case is rewarded by the total number of failed test cases; for failed test cases this is the same as the 'RNFailReward' reward function. For passed test cases, the reward is further decreased by the number of failed test cases ranked after the passed test case, which penalizes scheduling passing test cases early.
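The rule can be sketched as a literal, unnormalized reading of the description. The function `timerank_rewards` is hypothetical, and the library may scale or normalize these values differently.

```python
def timerank_rewards(n_tests, failing_ranks):
    """Literal reading of TimeRank (unnormalized; an assumption).

    Failed tests get the total number of failures. Passed tests get that
    total minus the number of failures ranked after them.
    """
    failing = set(failing_ranks)
    total_failed = len(failing)
    rewards = []
    for rank in range(1, n_tests + 1):
        if rank in failing:
            rewards.append(float(total_failed))
        else:
            failures_after = sum(1 for r in failing if r > rank)
            rewards.append(float(total_failed - failures_after))
    return rewards


rewards = timerank_rewards(4, [2, 4])
print(rewards)  # [0.0, 2.0, 1.0, 2.0]
```

In this example the passing test at rank 1 precedes both failures and receives 0, while the passing test at rank 3 precedes only one failure and receives 1.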

__str__ ¶

__str__()

Return a string representation of the reward function.

Returns:

Type	Description
`str`	The reward function name.

evaluate ¶

evaluate(reward, last_prioritization)

Evaluate rewards based on the prioritization rank of test cases.

Parameters:

Name	Type	Description	Default
`reward`	`EvaluationMetric`	The evaluation metric containing detection ranks and scheduled test cases.	required
`last_prioritization`	`list of str`	The list of test case names in the prioritization order.	required

Returns:

Type	Description
`list of float`	A list of rewards for each test case in the prioritization.

get_name ¶

get_name()

Return the identifier of the reward function.

Returns:

Type	Description
`str`	The reward function identifier.

TopKRNFailReward ¶

Bases: Reward

Top-k binary failure reward.

Considers only whether a test failed (binary signal) inside the first top_k positions. Each failing test within top-k receives 1 / k_eff, where k_eff = min(top_k, len(last_prioritization)). The reward sum over the selected prefix is therefore the fraction of failing tests in the top-k prefix.

When use_time_budget is enabled, the prefix is also capped by the number of tests scheduled by the metric under the active time budget.
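A minimal sketch of the prefix rule follows. The function `topk_rnfail_rewards` and its `scheduled` argument are hypothetical; in particular, shrinking k_eff to the scheduled count under a time budget is an assumption about how the cap is applied.

```python
def topk_rnfail_rewards(n_tests, failing_ranks, top_k=6, scheduled=None):
    """Give 1 / k_eff to each failing test inside the top-k prefix.

    scheduled: optional number of tests that fit the time budget
    (assumed to cap the prefix and the denominator alike).
    """
    k_eff = min(top_k, n_tests)
    if scheduled is not None:
        k_eff = min(k_eff, scheduled)
    rewards = [0.0] * n_tests
    if k_eff <= 0:
        return rewards  # empty prefix, nothing to reward
    for r in failing_ranks:
        if r <= k_eff:
            rewards[r - 1] = 1.0 / k_eff
    return rewards


rewards = topk_rnfail_rewards(8, [1, 3, 7], top_k=4)
print(rewards, sum(rewards))  # two of the first four tests fail, so the sum is 0.5
```

The failure at rank 7 falls outside the prefix and earns nothing, so the agent is only credited for failures it places near the top.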

__init__ ¶

__init__(top_k=6, use_time_budget=False)

Initialize the reward with a top-k cutoff.

__str__ ¶

__str__()

Return a string representation of the reward function.

evaluate ¶

evaluate(reward, last_prioritization)

Evaluate top-k binary rewards.

Parameters:

Name	Type	Description	Default
`reward`	`EvaluationMetric`	Evaluation metric containing detection ranks.	required
`last_prioritization`	`list of str`	Test case names in prioritization order.	required

Returns:

Type	Description
`list of float`	Reward vector aligned to `last_prioritization`.

get_name ¶

get_name()

Return the identifier of the reward function.
