The term "Level Of Confidence" (LOC) describes the percentage of similarly constructed tests in which the true mean (accuracy) of the system being tested falls within a specified range of values[1] around the accuracy measured in that test.
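The definition can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: each test set is treated as an independent pass/fail trial with a fixed true accuracy, and the function name `coverage` and all parameter values are hypothetical, not taken from any particular test protocol.

```python
import random

def coverage(true_accuracy, samples_per_test, margin, num_tests, seed=0):
    """Fraction of simulated tests whose measured accuracy lies within
    +/- margin of the (assumed known) true accuracy."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    hits = 0
    for _ in range(num_tests):
        # Assumption: every sample is an independent pass/fail trial.
        correct = sum(rng.random() < true_accuracy
                      for _ in range(samples_per_test))
        measured = correct / samples_per_test
        if abs(measured - true_accuracy) <= margin:
            hits += 1
    return hits / num_tests

# Same margin, different test sizes: the larger tests capture the
# true accuracy in a higher fraction of instances.
small = coverage(0.95, 100, 0.02, 500)
large = coverage(0.95, 1000, 0.02, 500)
print(f"100 samples per test:  {small:.1%}")
print(f"1000 samples per test: {large:.1%}")
```

The fraction returned plays the role of the LOC for the chosen range: with the range held fixed, larger tests yield a higher fraction.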
Put another way, common sense suggests that as you perform more and more tests on a system (in this case, for accuracy), you become increasingly confident in predicting the result of the next test.[2]
If a biometric-based matching system has reasonable levels of consistency and repeatability, successive accuracy test result scores will tend to cluster within a progressively narrower range of values[3] as the number of tests increases.
There are (at least) two ways of establishing a reasonably accurate estimate of a system’s accuracy: (1) conduct one very large test, in terms of the number of “test sets” or “samples” used, and declare that the system’s accuracy is the measured accuracy[4] of the test; or (2) conduct many smaller tests and declare that the system’s accuracy lies somewhere within the range defined by the highest and lowest measured accuracy values obtained in these small tests.
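The two approaches can be sketched side by side. This is a minimal simulation, assuming independent pass/fail test sets and a known true accuracy (which a real evaluation would not have); the function names and the test sizes are hypothetical choices for illustration only.

```python
import random

def run_test(true_accuracy, num_test_sets, rng):
    """Simulate one test and return its measured accuracy."""
    correct = sum(rng.random() < true_accuracy for _ in range(num_test_sets))
    return correct / num_test_sets

def one_large_test(true_accuracy, num_test_sets, seed=0):
    """Method 1: a single large test, reported as one number."""
    return run_test(true_accuracy, num_test_sets, random.Random(seed))

def many_small_tests(true_accuracy, num_tests, sets_per_test, seed=0):
    """Method 2: several small tests, reported as (lowest, highest)."""
    rng = random.Random(seed)
    results = [run_test(true_accuracy, sets_per_test, rng)
               for _ in range(num_tests)]
    return min(results), max(results)

# Both methods use 5000 test sets in total.
point = one_large_test(0.95, 5000)
lo, hi = many_small_tests(0.95, 20, 250)
print(f"one large test:   {point:.3f}")
print(f"many small tests: {lo:.3f} to {hi:.3f}")
```

The first method produces a single figure with no indication of spread, while the second produces a range whose width depends on how many small tests are run and how large each one is.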
There are some problems associated with the two methods described above:
· What is a “very large test” in terms of the number of test sets used?
· How many “smaller tests” should be performed and how many test sets should be in each small test?
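One common way to put a rough number on "very large" is the normal-approximation sample-size formula for a proportion. The sketch below is an assumption on our part rather than the method used in this document: it assumes independent test sets, an expected accuracy guessed in advance, and z = 1.96 for roughly 95% confidence. The function name `required_samples` is hypothetical.

```python
import math

def required_samples(expected_accuracy, margin, z=1.96):
    """Smallest n for which z * sqrt(p * (1 - p) / n) <= margin,
    using the normal approximation to the binomial (an assumption)."""
    p = expected_accuracy
    return math.ceil(z * z * p * (1 - p) / (margin * margin))

# Expected accuracy of 95%, measured to within +/- 1 percentage point.
n_one_point = required_samples(0.95, 0.01)
# Halving the margin roughly quadruples the number of test sets.
n_half_point = required_samples(0.95, 0.005)
print(n_one_point, n_half_point)
```

Because the required number grows with the inverse square of the margin, tightening the range is far more expensive than it first appears, which is part of why the two questions above have no single answer.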