
Evaluation

coleman4hcs.evaluation

coleman4hcs.evaluation - Evaluation Metrics for COLEMAN.

This module provides classes and methods to evaluate the performance of the COLEMAN framework in the context of test case prioritization. Metrics such as NAPFD (Normalized Average Percentage of Faults Detected), computed from either error counts or test verdicts, can be used to measure the effectiveness of a prioritization.

Classes:

| Name | Description |
| --- | --- |
| `EvaluationMetric` | Base class for all evaluation metrics. Defines basic attributes and methods used across all metrics. |
| `NAPFDMetric` | Implements the NAPFD metric based on error counts. |
| `NAPFDVerdictMetric` | Implements the NAPFD metric based on test verdicts (e.g., pass/fail). |

Notes

The evaluate method in EvaluationMetric is abstract and should be overridden in child classes. Ensure that the reset method is called at the beginning of each evaluation to reset metric values.
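This contract can be sketched as follows. The attribute set is abridged, and the `CountingMetric` subclass and its `errors` field are hypothetical, included purely to show the override-and-reset pattern; this is not the actual coleman4hcs implementation.

```python
class EvaluationMetric:
    """Illustrative base class: evaluate() is abstract, reset() restores defaults."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Reset metric attributes to their default values (abridged set).
        self.scheduled_testcases = []
        self.unscheduled_testcases = []
        self.detection_ranks = []
        self.fitness = 0.0
        self.cost = 0.0

    def evaluate(self, test_suite):
        raise NotImplementedError("Must be implemented in a child class")


class CountingMetric(EvaluationMetric):
    """Toy subclass: fitness = fraction of failing test cases."""

    def evaluate(self, test_suite):
        self.reset()  # reset at the start of each evaluation
        failed = sum(1 for tc in test_suite if tc.get("errors", 0) > 0)
        self.fitness = failed / len(test_suite) if test_suite else 0.0
```

For example, `CountingMetric().evaluate([{"errors": 1}, {"errors": 0}])` leaves `fitness` at 0.5, while calling `evaluate` on the base class raises `NotImplementedError`.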

EvaluationMetric

Base class for evaluation metrics.

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `available_time` | `float` | The time available for test execution. |
| `scheduled_testcases` | `list` | Test cases that were scheduled for execution. |
| `unscheduled_testcases` | `list` | Test cases that were not scheduled. |
| `detection_ranks` | `list` | Ranks at which failures were detected. |
| `detection_ranks_time` | `list` | Durations of the failure-detecting test cases. |
| `detection_ranks_failures` | `list` | Failure counts at each detection rank. |
| `ttf` | `int` | Time to Fail (rank value). |
| `ttf_duration` | `float` | Time spent until the first test case fails. |
| `fitness` | `float` | APFD or NAPFD value. |
| `cost` | `float` | APFDc value. |
| `detected_failures` | `int` | Number of detected failures. |
| `undetected_failures` | `int` | Number of undetected failures. |
| `recall` | `float` | Recall metric value. |
| `avg_precision` | `float` | Average precision metric value. |

__init__

__init__()

Initialize the EvaluationMetric.

evaluate

evaluate(test_suite)

Evaluate the test suite.

This is an abstract method and must be implemented in child classes.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `test_suite` | `list of dict` | Test suite to evaluate. | *required* |

Raises:

| Type | Description |
| --- | --- |
| `NotImplementedError` | If not implemented in a child class. |

process_test_suite

process_test_suite(test_suite, error_key)

Process the test suite and return the costs and total failure count.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `test_suite` | `list of dict` | Test suite to process. | *required* |
| `error_key` | `str` | Key used to read the error information from each test case record. | *required* |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `costs` | `list` | List of durations for each test case. |
| `total_failure_count` | `int` | Total number of failures detected. |
| `total_failed_tests` | `int` | Total number of test cases that failed. |
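A plausible sketch of this processing step, assuming each test case record carries a `Duration` field and a numeric error count under `error_key` (the actual field names in COLEMAN may differ):

```python
def process_test_suite(test_suite, error_key):
    """Illustrative sketch: collect per-test durations and failure totals."""
    costs = []
    total_failure_count = 0
    total_failed_tests = 0
    for record in test_suite:
        costs.append(record["Duration"])       # cost = execution duration
        errors = record[error_key]             # e.g., number of errors found
        total_failure_count += errors
        if errors > 0:
            total_failed_tests += 1            # a test fails if it found any error
    return costs, total_failure_count, total_failed_tests
```

For example, a suite `[{"Duration": 2.0, "NumErrors": 3}, {"Duration": 1.0, "NumErrors": 0}]` processed with `error_key="NumErrors"` yields `([2.0, 1.0], 3, 1)`.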

reset

reset()

Reset all attributes to their default values.

set_default_metrics

set_default_metrics()

Set the default values for NAPFD and APFDc metrics.

This method is called when there are no detected failures in the test suite, ensuring that the metric attributes are appropriately initialized.

Notes

This method updates the instance's attributes directly and does not return any value.

update_available_time

update_available_time(available_time)

Update the available time for the metric.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `available_time` | `float` | Time available for the metric. | *required* |

NAPFDMetric

Bases: EvaluationMetric

Normalized Average Percentage of Faults Detected (NAPFD) Metric.

Based on error counts.

__str__

__str__()

Return a string representation of the metric.

Returns:

| Type | Description |
| --- | --- |
| `str` | The metric name. |

compute_metrics

compute_metrics(costs, total_failure_count, total_failed_tests, no_testcases)

Compute NAPFD and APFDc metrics.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `costs` | `list` | A list containing the costs (e.g., execution time) for each test case. | *required* |
| `total_failure_count` | `int` | Total number of failures detected across all test cases. | *required* |
| `total_failed_tests` | `int` | Total number of test cases that failed. | *required* |
| `no_testcases` | `int` | Total number of test cases in the test suite. | *required* |
Notes

This method updates the instance's attributes directly and does not return any value.
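The computation can be sketched with one common formulation of these metrics; the exact formulas and default values used by COLEMAN may differ in detail. Here `detection_ranks` holds the 1-based positions of the failure-detecting test cases, each assumed to expose one fault, and the function names are illustrative:

```python
def napfd(detection_ranks, total_failure_count, no_testcases):
    """Sketch of NAPFD: penalizes late detection, normalized so that
    faults left undetected (e.g., cut off by a time budget) lower the score."""
    if total_failure_count == 0:
        return 1.0  # assumed default when the suite contains no faults
    p = len(detection_ranks) / total_failure_count  # fraction of faults detected
    n, m = no_testcases, total_failure_count
    return p - sum(detection_ranks) / (n * m) + p / (2 * n)


def apfdc(costs, detection_ranks, total_failure_count):
    """Sketch of cost-cognizant APFDc over per-test costs (durations)."""
    total_cost = sum(costs)
    if total_failure_count == 0 or total_cost == 0:
        return 1.0  # assumed default for degenerate suites
    numerator = sum(sum(costs[rank - 1:]) - 0.5 * costs[rank - 1]
                    for rank in detection_ranks)
    return numerator / (total_cost * total_failure_count)
```

With uniform costs APFDc reduces to plain APFD: four unit-cost tests with a single fault found at rank 2 give NAPFD = APFDc = 0.625.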

evaluate

evaluate(test_suite)

Evaluate the test suite using the NAPFD metric.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `test_suite` | `list of dict` | Test suite to evaluate. | *required* |

NAPFDVerdictMetric

Bases: EvaluationMetric

Normalized Average Percentage of Faults Detected (NAPFD) Metric based on Verdict.

__str__

__str__()

Return a string representation of the metric.

Returns:

| Type | Description |
| --- | --- |
| `str` | The metric name. |

compute_metrics

compute_metrics(costs, total_failure_count, no_testcases)

Compute NAPFD and APFDc metrics based on test verdicts.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `costs` | `list` | A list containing the costs (e.g., execution time) for each test case. | *required* |
| `total_failure_count` | `int` | Total number of test cases that failed. | *required* |
| `no_testcases` | `int` | Total number of test cases in the test suite. | *required* |
Notes

This method updates the instance's attributes directly and does not return any value.
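Because a verdict marks a whole test case as pass or fail, each failing test contributes exactly one failure, which is why this variant takes no separate `total_failed_tests` argument. A minimal sketch of the verdict-based counting, assuming hypothetical `Duration` and `Verdict` fields where a verdict of 1 means fail:

```python
def verdict_counts(test_suite, verdict_key="Verdict"):
    """Illustrative sketch: with pass/fail verdicts, the failure count
    and the failed-test count coincide. Field names are assumptions."""
    costs = [tc["Duration"] for tc in test_suite]
    total_failure_count = sum(1 for tc in test_suite if tc[verdict_key] == 1)
    return costs, total_failure_count
```

For example, `verdict_counts([{"Duration": 1.0, "Verdict": 1}, {"Duration": 2.0, "Verdict": 0}])` returns `([1.0, 2.0], 1)`.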

evaluate

evaluate(test_suite)

Evaluate the test suite using the NAPFD Verdict metric.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `test_suite` | `list of dict` | Test suite to evaluate. | *required* |