Risk management for analytical methods: Conciliating objectives of validation phase and routine decision rules.

Bruno Boulanger^a, Walthère Dewé^a, Aurélie Gilbert^b, Bernadette Govaerts^b,1 and Myriam Maumy^c

a Eli Lilly, European Early Phase Statistics, Belgium.
b Université Catholique de Louvain, Institut de Statistique, Belgium.
c Université Louis Pasteur, Laboratoire de Statistique, France.
SUMMARY
In industries that involve either chemistry or biology, analytical methods are the necessary eyes on all the material produced. If the quality of an analytical method is doubtful, then the whole set of decisions based on its measures is questionable. For those reasons, being able to assess the quality of an analytical method is far more than a statistical challenge; it is a matter of ethics and good business practice. The validity of an analytical method must be assessed at two levels. The "pre-study" validation aims at proving, by an appropriate set of designed experiments, that the method is able to achieve its objectives. The "in-study" validation is intended to verify, by inserting QC samples in routine runs, that the method remains valid over time. This paper discusses and compares two methods, based on the total error concept, to check the validity of a measurement method at the pre-study level. The first checks whether a tolerance interval on hypothetical future measurements lies within given acceptance limits; the second calculates the probability that a measurement lies within these limits and verifies whether it is greater than a given acceptance level. For the in-study validation, the paper assesses the properties of the s_n-λ rule recommended by the FDA. A crucial point is also to ensure that the decisions taken at the pre-study stage and in routine are compatible. More precisely, a laboratory should not see its method rejected in routine when it has been proven to be valid and remains so. This goal may be achieved by an appropriate choice of validation parameters at both pre- and in-study levels.
1 Correspondence to: Bernadette Govaerts, Institut de Statistique, 20 voie du roman pays, 1348 Louvain-la-Neuve, Belgium, Govaerts@stat.ucl.ac.be, Phone: +32 10 47.43.13.
hal-00141709, version 1 - 11 Jul 2007
Author manuscript, published in Chemometrics and Intelligent Laboratory Systems 86 (2007) 198-207. DOI: 10.1016/j.chemolab.2006.06.008
1. INTRODUCTION
In industries that involve either chemistry or biology, such as the pharmaceutical, chemical or food industries, analytical methods are the necessary eyes and ears on all the material produced or used. If the quality of an analytical method is doubtful, then the decisions, and possibly the products released, based on measures obtained with this procedure may become questionable. For those reasons, being able to assess the quality of an analytical method is far more than a statistical challenge; it is a matter of ethics and good business practice. Many regulatory documents have been released to address that issue, primarily ICH and FDA documents in the pharmaceutical industry [1,2,3]. The objective of validation is to give the laboratory, as well as the regulatory bodies, the guarantee that every single measurement performed in routine will be close enough to the unknown "true" value of the sample [4]. The conformity of a given analytical method to this objective is usually assessed in two stages [5, 6, 7, 8]. First, a "pre-study" phase is conducted to prove, on the basis of a designed experiment, that the method is able to deliver results of adequate quality. Then, at the routine level, the laboratory must verify that the analytical method remains valid over time and that each run performed provides trustworthy measures. This is usually achieved by inserting QC samples in the runs of unknown samples. At both stages, one then needs a way to quantify the quality of a measure in terms of its closeness to the "true" value of the property of interest. Traditionally, this is assessed by examining the two main performance criteria of an analytical method: the bias or "trueness" and the precision of the method. Both should be small enough and are usually quantified separately [3, 9, 10, 11]. This approach focuses on the method itself, assuming that if the method is "good" then the measures it provides are also "good". As already shown, this is not always the case [12]. The concept of "total error" [8, 13, 14, 15, 16, 17], however, puts the emphasis on the results themselves and tackles the problem globally by estimating the proportion π of measurements expected to lie within a fixed interval (±λ) around their true value. The correct underlying assumption behind this approach is that, if the results produced are "good", then the method that produces them is necessarily "good". This paper presents procedures to check the pre- and in-study validity of an analytical method based on this total error concept, i.e. by examining the quality of the results it produces. At the pre-study level, the validation procedure consists of measuring a given set of samples whose nominal values are known, arranged according to an adequately designed experiment. The design should be able to estimate measurement bias and precision at different nominal levels and, if necessary, provide a decomposition of the global precision into various variance components (repeatability, between-run and between-laboratory). Two statistical procedures are discussed to assess the method validity on the basis of such an experiment. The first consists in estimating a tolerance interval in which "future" measurements are expected to lie and verifying whether this interval is included in predefined acceptance limits. The second directly estimates the probability of obtaining a measurement within these acceptance limits and verifies whether this estimated probability is greater than a given minimal acceptance level, on the basis of the lower bound of a maximum likelihood confidence interval on this probability. Simulations have been conducted to study the laboratory and client risks for these two procedures; they show that the first method is particularly efficient when the measurement process is well centered and that
the probability approach gains power in reasonably biased situations, which are common in practice. In routine, budget and simplicity requirements usually lead to validation rules that do not fully protect both the client and the laboratory. This paper studies, in this context, the properties of the 4-6-15 rule recommended by the FDA [3], generalized here as the s_n-λ rule. It consists of inserting a set of n QC samples within the routine unknown samples and checking whether at least s of the measurements obtained on those samples are no more distant than λ from their nominal (true) value. Power functions of this generalized rule show that it is very difficult to protect simultaneously the client's and the laboratory's interests at a reasonable cost. It is also shown that the FDA 4-6-15 rule protects mainly the laboratory, a surprising result since the spirit of those regulations is precisely to protect the client. In the practical organization of an industrial laboratory, the pre- and in-study validation studies are often conducted separately by different persons, especially if the method is developed and validated in one place (e.g. a research laboratory) and used in routine in another (e.g. a production plant). The compatibility of the decisions taken at both stages is not obvious, and not even well understood by analysts. A laboratory that has declared a method valid in the pre-study phase would not appreciate (economically speaking) seeing its method rejected in routine while it is still valid. On the other hand, if a valid method is subject to a significant increase in total error, the in-study validation rule should be able to detect it rapidly. Conciliating pre- and in-study objectives is therefore crucial and may be achieved through an appropriate choice of validation rule parameters in order to align the associated risks. This paper is organized as follows: Section 2 gives a precise definition of method validation based on the concept of total error and introduces the related notation. Section 3 introduces two procedures for pre-study validation: the β-expectation tolerance interval approach and a maximum likelihood approach aimed at estimating the probability of being within the acceptance limits using the delta method. These two procedures are illustrated on a real example and their performances are then compared using simulations. Section 4 discusses the properties of the s_n-λ rule in terms of client and laboratory risks. Finally, Section 5 shows how to conciliate pre- and in-study validation parameters to attain coherent properties for the validation decisions.
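To make the power function of such an s_n-λ rule concrete, the following sketch (our own illustration, not code from the paper) computes the probability that a run is accepted as a binomial tail: if each QC measurement independently falls within ±λ of its nominal value with probability π (the quality level), a run passes when at least s of the n QCs do.

```python
from math import comb

def run_acceptance_probability(n: int, s: int, pi: float) -> float:
    """Probability that at least s of n independent QC samples fall
    within +/-lambda of their nominal value, when each one does so
    with probability pi (binomial upper tail)."""
    return sum(comb(n, k) * pi**k * (1 - pi)**(n - k) for k in range(s, n + 1))

# FDA 4-6-15 rule: n = 6 QC samples, at least s = 4 within +/-15%.
# Even a method with a modest quality level pi = 0.80 is accepted in
# roughly 90% of runs, illustrating how the rule favours the laboratory.
print(run_acceptance_probability(6, 4, 0.80))  # ~0.901
```

Evaluating this function over a grid of π values gives the power function of the rule discussed in Section 4.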
2. ANALYTICAL METHOD EVALUATION BASED ON TOTAL ERROR
The objective of a good analytical method is to be able to quantify accurately each of the unknown quantities that the laboratory will have to determine. In other terms, the analytical method is expected to give results whose difference from the unknown "true value" (µ_T) of the sample is smaller than an acceptance limit λ, i.e.:

−λ < X − µ_T < λ  ⇔  |X − µ_T| < λ
Two components may influence this difference: the bias or trueness of the method and its precision. As illustrated in Figure 1, a biased method provides results that deviate "in mean", i.e. systematically, from the true value µ_T: δ = E(X) − µ_T = µ − µ_T. The precision expresses how results vary around the mean value µ = E(X) when the measurement is repeated; let σ denote the standard deviation used to quantify this precision. A "good" analytical method should ideally give results close to the unknown true value of the sample, i.e. within some acceptance limits. This "closeness" is directly linked to the size of the bias δ and precision σ of the method.
[Figure 1: four panels showing the distribution of measurements X around the true value µ_T relative to the acceptance limits (−λ, +λ), with and without a bias δ.]
Figure 1 : Comparison of four possible validation situations
Classical method validation and quality control tools usually check the size of these two components separately (t and χ² tests in validation, or X̄-R control charts in routine), but this approach has the drawback that a very small value of one component may not compensate for a weakness of the other. The total error approach [4, 8, 13, 14, 15, 16, 17] tackles the problem globally, considering a procedure acceptable if it is "very likely" that the difference between each measurement X of a sample and its "true value" (µ_T) lies inside the acceptance limits [−λ, +λ] predefined by the analyst. The notion of "very likely" can be translated into the following probabilistic condition:

π = P(|X − µ_T| < λ) ≥ π_min
where π_min is called the acceptance level and π the quality level. The acceptance limit λ can be expressed either in absolute or in relative (%) terms. In the latter case, the condition is redefined as:

π = P( |X − µ_T| / µ_T < λ ) ≥ π_min
All results presented in this paper are applicable to both cases. Below, we will therefore use the first formulation without loss of generality. The value of λ must be chosen according to the intended use of the results. The objective is linked to the requirements usually admitted in practice (e.g. 1% or 2% on bulk, 5% on pharmaceutical specialties, 15% for biological samples, 30% for ligand-binding assays such as RIA or ELISA, etc.). The probability π_min must also be fixed by the analyst according to cost, consumer and analytical domain requirements. The key aspect is to ensure coherence between the π_min and λ values targeted in the pre-study and in-study phases; this issue is discussed in more detail in Section 5. Under a normality assumption for the measurement results, it is easy to establish the relationship between the quality level π, the bias δ (systematic error) and the precision σ (random error):

π = P(|X − µ_T| < λ) = P( (−λ − δ)/σ < Z < (λ − δ)/σ )
where Z is a standard normal random variable. This leads to the definition of the "acceptance region", i.e. the set of (δ, σ) pairs such that the quality level π is greater than π_min. Figure 2 shows, below the curves, the acceptance regions for various values of π_min (99%, 95%, 90%, 80% and 66.7%) when the acceptance limits are fixed to [−15%, +15%], as recommended by the FDA [3] for bioanalytical methods. Logically, as can be seen in Figure 1, the greater the variance of the measurements or the greater the bias, the less likely a measurement will fall within the acceptance limits.
Figure 2 : Acceptance region of analytical methods as a function of the method bias and precision when λ = 15%. Note that, in this graph, δ and σ must be interpreted as relative bias and relative standard deviation.
3. PRE-STUDY METHOD VALIDATION
Before an analytical method can be used in routine to qualify unknown samples, it is common practice to perform a more or less extensive set of experiments to evaluate whether the method will be able to achieve the objective stated above. These experiments are usually called "pre-study validation", as opposed to the "in-study validation" experiments. Since the bias δ and the precision σ, the intrinsic performance parameters of the analytical procedure, are unknown, experiments are required before using the method in routine to allow the user to obtain estimates of these quantities. These estimates of bias (δ̂) and standard deviation (σ̂) are intermediate but obligatory steps in evaluating whether the analytical procedure is likely to provide accurate measures of the unknown samples to be measured in routine. The objective of the pre-study validation phase is to evaluate whether, given the estimates δ̂ and σ̂ obtained, the proportion of measures of new unknown samples that will fall within the acceptance limits is greater than a predefined acceptance level, say π_min:
P(|X − µ_T| < λ) ≥ π_min      (1)
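As a simplified illustration of the first (tolerance interval) procedure described in the introduction — assuming a single-level design with independent replicates, and using a normal quantile as a large-sample stand-in for the Student t factor used in the actual β-expectation interval — one can estimate δ̂ and σ̂ from replicates of a sample with known nominal value and check whether the interval lies within the acceptance limits:

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def beta_expectation_interval(measurements, beta=0.95):
    """Approximate beta-expectation tolerance interval m +/- k*s with
    k = z_{(1+beta)/2} * sqrt(1 + 1/n).  The normal quantile replaces
    the Student t factor, so this is only a large-sample sketch."""
    n = len(measurements)
    m, s = mean(measurements), stdev(measurements)
    k = NormalDist().inv_cdf((1 + beta) / 2) * sqrt(1 + 1 / n)
    return m - k * s, m + k * s

def prestudy_valid(measurements, nominal, lam, beta=0.95):
    """Accept the method if the tolerance interval on future
    measurements lies within [nominal - lam, nominal + lam]."""
    lo, hi = beta_expectation_interval(measurements, beta)
    return nominal - lam <= lo and hi <= nominal + lam

# Hypothetical replicates of a QC sample with nominal value 100 and an
# absolute acceptance limit lam = 15 (illustrative data, not from the paper):
data = [98.0, 100.0, 102.0, 99.0, 101.0]
print(prestudy_valid(data, nominal=100.0, lam=15.0))  # True
```

In a real pre-study design, the standard deviation would combine repeatability and between-run variance components estimated from the designed experiment, and the t-based factor would be used; this sketch only conveys the logic of the decision rule.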