Evaluation¶
coleman4hcs.evaluation ¶
coleman4hcs.evaluation - Evaluation Metrics for COLEMAN.
This module provides classes and methods to evaluate the performance of the COLEMAN framework in the context of test case prioritization. Effectiveness is measured with NAPFD (Normalized Average Percentage of Faults Detected), computed from either error counts or test verdicts.
Classes:
| Name | Description |
|---|---|
| EvaluationMetric | Base class for all evaluation metrics. Defines basic attributes and methods used across all metrics. |
| NAPFDMetric | Implements the NAPFD metric based on error counts. |
| NAPFDVerdictMetric | Implements the NAPFD metric based on test verdicts (e.g., pass/fail). |
Notes
The evaluate method in EvaluationMetric is abstract and must be overridden in
child classes. Ensure that the reset method is called at the beginning of each
evaluation to reset metric values.
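The pattern described in the note above can be sketched as follows. This is a minimal illustration, not the library's code: `SketchMetric`, `CountFailuresMetric`, and the `"NumErrors"` key are hypothetical names chosen for the example, and only a few of the documented attributes are modelled.

```python
class SketchMetric:
    """Hypothetical stand-in for a metric base class (not the library's code)."""

    def __init__(self):
        self.available_time = 0.0
        self.reset()

    def reset(self):
        # Clear per-evaluation state so results from a previous suite do not leak.
        self.scheduled_testcases = []
        self.detection_ranks = []
        self.detected_failures = 0

    def evaluate(self, test_suite):
        # Abstract by convention: child classes must override this method.
        raise NotImplementedError("Subclasses must implement evaluate().")


class CountFailuresMetric(SketchMetric):
    """Toy subclass: records the ranks of test cases that report errors."""

    def evaluate(self, test_suite, error_key="NumErrors"):
        self.reset()  # start every evaluation from a clean state
        for rank, test_case in enumerate(test_suite, start=1):
            self.scheduled_testcases.append(test_case)
            if test_case[error_key] > 0:
                self.detection_ranks.append(rank)
                self.detected_failures += test_case[error_key]


metric = CountFailuresMetric()
metric.evaluate([{"NumErrors": 0}, {"NumErrors": 2}])
# The second call starts from a clean state because evaluate() calls reset().
metric.evaluate([{"NumErrors": 1}, {"NumErrors": 0}, {"NumErrors": 3}])
```

Without the `reset()` call, the ranks and failure counts from the first suite would carry over into the second evaluation.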
EvaluationMetric ¶
Base class for evaluation metrics.
Attributes:
| Name | Type | Description |
|---|---|---|
| available_time | float | The time available for test execution. |
| scheduled_testcases | list | Test cases that were scheduled for execution. |
| unscheduled_testcases | list | Test cases that were not scheduled. |
| detection_ranks | list | Ranks at which failures were detected. |
| detection_ranks_time | list | Durations of failure-detecting test cases. |
| detection_ranks_failures | list | Failure counts at each detection rank. |
| ttf | int | Time to Fail (rank value). |
| ttf_duration | float | Time spent until the first test case fails. |
| fitness | float | APFD or NAPFD value. |
| cost | float | APFDc value. |
| detected_failures | int | Number of detected failures. |
| undetected_failures | int | Number of undetected failures. |
| recall | float | Recall metric value. |
| avg_precision | float | Average precision metric value. |
evaluate ¶
Evaluate the test suite.
This is an abstract method and must be implemented in child classes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| test_suite | list of dict | Test suite to evaluate. | required |
Raises:
| Type | Description |
|---|---|
| NotImplementedError | If not implemented in a child class. |
process_test_suite ¶
Process the test suite and return the per-test costs, the total failure count, and the number of failed test cases.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| test_suite | list of dict | Test suite to process. | required |
| error_key | str | Key to determine the error in the test suite. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| costs | list | List of durations for each test case. |
| total_failure_count | int | Total number of failures detected. |
| total_failed_tests | int | Total number of test cases that failed. |
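To make the three return values concrete, here is a minimal sketch of how they could be derived from a list of test case dictionaries. It is an assumption-laden illustration rather than the actual method: `process_suite_sketch`, the `"Duration"` key, and the `"NumErrors"` key are hypothetical, and the sketch ignores the available-time budget and the scheduling attributes that the real method may also update.

```python
def process_suite_sketch(test_suite, error_key, duration_key="Duration"):
    """Collect per-test costs and failure totals from a list of dicts.

    Hypothetical helper: the key names are assumptions for illustration.
    """
    costs = []
    total_failure_count = 0  # sum of the error values over all test cases
    total_failed_tests = 0   # number of test cases with at least one error
    for test_case in test_suite:
        costs.append(test_case[duration_key])
        errors = test_case[error_key]
        total_failure_count += errors
        if errors > 0:
            total_failed_tests += 1
    return costs, total_failure_count, total_failed_tests


suite = [
    {"Duration": 2.0, "NumErrors": 0},
    {"Duration": 1.5, "NumErrors": 3},
    {"Duration": 0.5, "NumErrors": 1},
]
costs, failures, failed_tests = process_suite_sketch(suite, "NumErrors")
```

In this example the failure count (4) exceeds the number of failed test cases (2) because a single test case can report several errors.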
set_default_metrics ¶
Set the default values for NAPFD and APFDc metrics.
This method is called when there are no detected failures in the test suite, ensuring that the metric attributes are appropriately initialized.
Notes
This method updates the instance's attributes directly and does not return any value.
update_available_time ¶
Update the available time for the metric.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| available_time | float | Time available for the metric. | required |
NAPFDMetric ¶
Bases: EvaluationMetric
Normalized Average Percentage of Faults Detected (NAPFD) Metric.
Based on error counts.
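For orientation, NAPFD is commonly defined as p - sum(TF_i) / (m * n) + p / (2 * n), where n is the number of test cases, m the total number of faults, p the fraction of faults detected, and TF_i the rank of the test case that reveals fault i. The sketch below applies that formula to inputs shaped like the `detection_ranks` and `detection_ranks_failures` attributes. The function name `napfd_sketch` is hypothetical, and weighting each rank by its failure count is an assumption about how error counts enter the formula, not a statement about the library's implementation.

```python
def napfd_sketch(detection_ranks, detection_failures, total_failures, n_tests):
    """NAPFD from detection ranks, each weighted by the failures found there.

    Hypothetical helper; the weighting by failure count is an assumption.
    """
    if total_failures <= 0 or n_tests <= 0:
        raise ValueError("NAPFD needs at least one failure and one test case")
    # p: fraction of all failures that the scheduled test cases detected.
    p = sum(detection_failures) / total_failures
    # Each detecting rank contributes once per failure it revealed.
    weighted_ranks = sum(
        rank * count for rank, count in zip(detection_ranks, detection_failures)
    )
    return p - weighted_ranks / (total_failures * n_tests) + p / (2 * n_tests)


# Five test cases, four failures in total; ranks 1 and 3 reveal 1 and 2 of
# them, and one failure stays undetected (e.g. its test was not scheduled).
value = napfd_sketch([1, 3], [1, 2], total_failures=4, n_tests=5)
```

When every failure is detected, p equals 1 and the expression reduces to the classic APFD formula.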
__str__ ¶
Return a string representation of the metric.
Returns:
| Type | Description |
|---|---|
| str | The metric name. |
compute_metrics ¶
Compute NAPFD and APFDc metrics.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| costs | list | A list containing the costs (e.g., execution time) for each test case. | required |
| total_failure_count | int | Total number of failures detected across all test cases. | required |
| total_failed_tests | int | Total number of test cases that failed. | required |
| no_testcases | int | Total number of test cases in the test suite. | required |
Notes
This method updates the instance's attributes directly and does not return any value.
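The cost-aware counterpart, APFDc, replaces ranks with execution costs. The sketch below follows the commonly cited definition with unit fault severities: for each failure, take the cost of the detecting test case and everything after it, subtract half the detecting test case's own cost, then normalize by the total cost times the number of failures. `apfdc_sketch` is a hypothetical name, and the library's `compute_metrics` may differ in details such as how undetected failures are handled.

```python
def apfdc_sketch(costs, detection_ranks, detection_failures, total_failures):
    """Cost-cognizant APFD with unit fault severities.

    Hypothetical helper; ranks are 1-based positions in the prioritized order.
    """
    total_cost = sum(costs)
    if total_cost <= 0 or total_failures <= 0:
        raise ValueError("APFDc needs a positive total cost and failure count")
    numerator = 0.0
    for rank, count in zip(detection_ranks, detection_failures):
        # Cost from the detecting test case through the end of the suite.
        remaining = sum(costs[rank - 1:])
        # Half of the detecting test's own cost is credited back.
        numerator += count * (remaining - 0.5 * costs[rank - 1])
    return numerator / (total_cost * total_failures)


# Three test cases costing 2.0, 1.0 and 1.0; the only failure is found at rank 2.
cost_value = apfdc_sketch([2.0, 1.0, 1.0], [2], [1], total_failures=1)
```

With equal costs for all test cases and every failure detected, this value coincides with APFD, which is a quick sanity check for any implementation.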
evaluate ¶
Evaluate the test suite using the NAPFD metric.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| test_suite | list of dict | Test suite to evaluate. | required |
NAPFDVerdictMetric ¶
Bases: EvaluationMetric
Normalized Average Percentage of Faults Detected (NAPFD) Metric based on Verdict.
__str__ ¶
Return a string representation of the metric.
Returns:
| Type | Description |
|---|---|
| str | The metric name. |
compute_metrics ¶
Compute NAPFD and APFDc metrics based on test verdicts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| costs | list | A list containing the costs (e.g., execution time) for each test case. | required |
| total_failure_count | int | Total number of test cases that failed. | required |
| no_testcases | int | Total number of test cases in the test suite. | required |
Notes
This method updates the instance's attributes directly and does not return any value.
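With verdicts, each failing test case counts as exactly one failure, so the failure total equals the number of failed test cases. A minimal sketch of that variant follows. The function name `napfd_from_verdicts`, the encoding of a failed verdict as `1`, and the `n_scheduled` cut-off are all assumptions made for the example, not the library's API.

```python
def napfd_from_verdicts(verdicts, n_scheduled):
    """NAPFD where every failing test case counts as one failure.

    Hypothetical helper. `verdicts` lists 0 (pass) or 1 (fail) in prioritized
    order; only the first `n_scheduled` test cases are assumed to be executed.
    """
    n_tests = len(verdicts)
    total_failed = sum(verdicts)
    if total_failed == 0:
        # The documented set_default_metrics covers this case in the library;
        # the sketch simply refuses to compute a value.
        raise ValueError("no failing test cases in the suite")
    # 1-based ranks of the executed test cases that failed.
    ranks = [i for i, v in enumerate(verdicts[:n_scheduled], start=1) if v == 1]
    p = len(ranks) / total_failed
    return p - sum(ranks) / (total_failed * n_tests) + p / (2 * n_tests)


# Five test cases, three of which fail; only the first three are executed,
# so the failure at rank 5 goes undetected.
verdict_value = napfd_from_verdicts([1, 0, 1, 0, 1], n_scheduled=3)
```

Here two of three failing test cases are detected, at ranks 1 and 3, giving 2/3 - 4/15 + 1/15 = 7/15.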
evaluate ¶
Evaluate the test suite using the NAPFD Verdict metric.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| test_suite | list of dict | Test suite to evaluate. | required |