Assessment of ROC curves for inspection of random fields

Inspection of existing structures by non-destructive testing (NDT) techniques is not perfect, and it has become common practice to model their reliability in terms of probability of detection (PoD), probability of false alarm (PFA) and receiver operating characteristic (ROC) curves. These results are generally the main inputs needed by owners of structures to establish inspection, maintenance and repair (IMR) plans. PoD and PFA are deduced either from intercalibration of NDT tools or from modelling of the noise and the signal. In the latter case, when the noise and the signal depend on the location on the structure, PoD and PFA are spatially dependent. This paper presents how to define PoD and PFA when damage and detection are stochastic fields or spatially dependent. Corrosion of coastal structures in harbours is considered for illustration and ROC curves are deduced. Identification of probability density functions on polynomial chaos is shown to be more suitable than predefined probability distribution functions (pdf) for fitting the noise and signal-plus-noise distributions.


Introduction
The current challenge in maintaining a set of structures (harbours, bridges, etc.) is to find the optimum balance between the increasing number of deteriorating structures and the limited funds available for their upkeep [1][2][3][4][5][6][7]. The demolition and replacement of large engineering structures result in high economic and environmental costs, further increasing the need for efficient management plans to maintain these structures [8][9][10]. Reassessment of existing structures requires updating material properties. In many cases on-site inspections are needed, and in some cases visual inspection is not sufficient. For example, non-destructive testing (NDT) tools are required for the inspection of coastal and marine structures, where marine growth acts as a mask or where the underwater zone imposes harsh conditions on visual inspection. In these fields, the cost of inspection can be prohibitive and an accurate description of the on-site performance of NDT tools must be provided. During the last decade, the concepts of probability of detection (PoD), probability of false alarm (PFA) [11] and probability of indication [12][13][14] have proved suitable for risk-based inspection (RBI) [15][16][17][18] and for the management of networks [19]. They allow the cost/benefit of NDT tools to be introduced in a complete risk analysis. This paper focuses on their modelling in the case of inspection of structures. The probability of detection is closely linked to the level of the detection threshold and to the size of the defect. When the defect size is randomly distributed in space, the probability of detection is spatially dependent. The probability of false alarm can be defined once the distribution of the noise on measurements is known; this noise is due to the decision chain "physical measurement, decision on defect measurement, transfer of information" [20], to the harsh environment of inspection and to the complexity of the protocol (diver-operator link).
Then, for a set of defects and a given NDT tool and operator, couples (PoD; PFA), called receiver operating characteristics (ROC), can be assessed. ROC curves, which link the probability of detection and the probability of false alarm, are plotted mainly in the case of very harsh inspection conditions or when divers with different levels of experience are involved. We first review the basic definitions of PoD, PFA and ROC, with a focus on how these concepts are introduced as decision-aid tools. These definitions are then extended to the case of spatially dependent stochastic deterioration processes. Finally, to illustrate these concepts, the paper considers generalized corrosion of steel piles in a coastal area. First, more than 1000 measurements of residual steel thickness are analyzed. They were obtained with an ultrasonic NDT tool on a quay and are used to identify the probability distribution function that best fits the defect distribution.
Moreover, a particular inspection protocol allows the noise on measurements to be defined and assessed, and its spatial dependency to be discussed. Two models of noise are suggested: the first considers one independent random variable per inspection level; the second gathers data by area, which leads to one independent random variable per zone (tidal zone and underwater zone).

Theoretical background and basic concepts for PoD and PFA
The most common concept characterizing inspection tool performance is the probability of detection (PoD). Let $a_d$ be the minimal defect size below which it is assumed that no detection occurs. The parameter $a_d$ is called the detection threshold in the following. The probability of detection is then defined as:
$$\mathrm{PoD} = P(\hat{d} \geq a_d) \tag{1}$$
where $\hat{d}$ is the measured defect size. The detection threshold $a_d$ is a deterministic parameter or a random variable. When $a_d$ is deterministic, this definition implies that PoD is a monotonically decreasing function of $a_d$. Detection theory gives the theoretical background for defining PFA, given the probability density functions $f_{\mathrm{signal}}$ and $f_{\mathrm{noise}}$ of (signal+noise) and noise, respectively. Noise depends on environmental conditions, human interference and the nature of what is being measured. PoD and PFA then have the following expressions (2) and (3):
$$\mathrm{PoD} = \int_{a_d}^{+\infty} f_{\mathrm{signal}}(\hat{d})\, \mathrm{d}\hat{d} \tag{2}$$
$$\mathrm{PFA} = \int_{a_d}^{+\infty} f_{\mathrm{noise}}(\eta)\, \mathrm{d}\eta \tag{3}$$
Fig. 1 illustrates the probability density functions and the computation of PFA and PoD for a given detection threshold in the case where (signal+noise) and noise are normally distributed.
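As a minimal sketch of Eqs. (2) and (3) in the normally distributed case mentioned above, PoD and PFA can be computed as upper-tail probabilities. The function name `pod_pfa` and all numerical values below are hypothetical and chosen for illustration only; they are not taken from the measurements discussed in this paper.

```python
from statistics import NormalDist


def pod_pfa(threshold, signal_plus_noise, noise):
    """Return (PoD, PFA) for a deterministic detection threshold a_d.

    Both quantities are upper-tail probabilities of the corresponding
    distribution, evaluated at the threshold.
    """
    pod = 1.0 - signal_plus_noise.cdf(threshold)
    pfa = 1.0 - noise.cdf(threshold)
    return pod, pfa


# Hypothetical distributions (values in mm), for illustration only.
signal_plus_noise = NormalDist(mu=2.0, sigma=0.5)
noise = NormalDist(mu=0.0, sigma=0.5)

pod, pfa = pod_pfa(1.0, signal_plus_noise, noise)
print(f"PoD = {pod:.4f}, PFA = {pfa:.4f}")
```

Raising the threshold in this sketch lowers both PoD and PFA, which is the trade-off that the ROC curve summarizes.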

Building of receiver operating characteristic curve
For a given detection threshold, the couple (PFA, PoD) defines the NDT performance; this is the receiver operating characteristic (ROC). This couple can be considered as the coordinates of a point in the (PFA, PoD) plane. If $a_d$ takes values in the range $]-\infty; +\infty[$, this point belongs to a curve called the ROC curve. It is a parametric curve with parameter $a_d$ in Eqs. (2) and (3). From a theoretical point of view, it corresponds to a monotonically increasing function, always lying above the diagonal (PFA = PoD), whose first derivative is closely linked to the sensitivity of the receiver [21,22]. The diagonal line running from lower left to upper right (curve "PFA = PoD") is the line of no "performance", since in that case the inspection result is the same no matter what the observation is [13].
Looking for the best detection performance, the probability of detection should always take larger values than the probability of false alarm (low noise sensitivity). We then have $\mathrm{PoD} \geq \mathrm{PFA}$. When reading ROC curves, one must keep in mind that the probability of false alarm depends on the noise and the detection threshold only. It does not depend on the defect size, unless the noise itself depends on the defect size. The operator adjusts the device, for example, to detect smaller defects when the current adjustment does not give any signal. The probability of detection is a function of the detection threshold, the defect size and the noise. Thus, for a given detection threshold, the probability of false alarm is constant, but the probability of detection is an increasing function of the defect size. The ROC curve is a fundamental characteristic of the NDT tool performance for a given defect size. A perfect tool is represented by a ROC curve reduced to a single point with coordinates (PFA, PoD) = (0, 1). The distance between this "best performance point" and the ROC curve is a measure of the NDT ability [11]. Different theoretical ROC curves, each corresponding to a different signal/noise ratio of the NDT tool, are presented in [13].
When assessing a ROC curve, the challenge is then either to obtain discrete values of PFA and PoD in given conditions [23,24] or to model the (signal+noise) and noise distributions on site. We follow the second approach here. Note that the PFA is also called PFI (Probability of False Indication). Discrete values of PFI are generally expressed as a percentage of false indications over the inspected length [11,25]. Finally, as the number of samples is limited, authors generally provide confidence bounds, for example 90% PoD [23,26].
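The construction of a ROC curve as a parametric curve in the detection threshold, and the distance between the curve and the best performance point (0, 1), can be sketched as follows. This is an illustration under assumed normal distributions with hypothetical parameters; the helper names `roc_points` and `closest_to_best_point` are not from the source.

```python
from math import hypot
from statistics import NormalDist


def roc_points(signal_plus_noise, noise, thresholds):
    """Return a list of (a_d, PFA, PoD), one triple per detection threshold."""
    return [
        (a_d, 1.0 - noise.cdf(a_d), 1.0 - signal_plus_noise.cdf(a_d))
        for a_d in thresholds
    ]


def closest_to_best_point(points):
    """Return (distance, a_d) for the ROC point nearest to (PFA, PoD) = (0, 1)."""
    return min((hypot(pfa, 1.0 - pod), a_d) for a_d, pfa, pod in points)


# Hypothetical distributions (values in mm), for illustration only.
signal_plus_noise = NormalDist(mu=2.0, sigma=0.5)
noise = NormalDist(mu=0.0, sigma=0.5)
thresholds = [k * 0.05 for k in range(-40, 101)]  # from -2.0 mm to 5.0 mm

points = roc_points(signal_plus_noise, noise, thresholds)
distance, best_threshold = closest_to_best_point(points)
print(f"distance = {distance:.4f} at a_d = {best_threshold:.2f} mm")
```

In this symmetric example the closest point is reached midway between the two means; with real data the optimal threshold depends on the shapes of both distributions.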

Spatial dependency of PoD and PFA
In some cases, the performance of NDT tools depends on the location of the inspected point on the structure. As an illustration, let us consider the inspection of welded joints of offshore platforms with techniques such as MPI (Magnetic Particle Inspection): the probability of detection of a given crack at the hot-spot of a Y-joint (inside the circle on Fig. 3, left) is lower than the probability of detection of the same crack at the hot-spot of a T-joint (inside the circle on Fig. 3, right). This was observed during the ICON project (InterCalibration of Offshore Ndt) [23,27,28]. When the number of samples is limited, some PoD curves may fail to be monotonic because of statistical bias. Fig. 4 illustrates this case with the evolution of three PoD curves with crack size, obtained after inspection of the same samples by three inspection companies A, B and C during the ICON project. Inspectors from companies B and C encountered some difficulties with defects of length 100 mm because they were on Y-joints. When defects are located on welded joints, the corresponding PoD and PFA should be adapted according to, for instance, the access, the luminosity and the wave motion. When defects are continuous fields on the structure, PoD and PFA should be indexed by the coordinates $x$ of the inspected point. Here, we consider that the defect is produced by a deterioration mechanism indexed by space $x$ and time $t$, and can be modelled by a space-time stochastic process $d(x, t, \theta)$, where $\theta$ denotes the elementary event of an abstract probability space.

Definitions of PoD and PFA for stochastic deterioration model
After inspection with an NDT tool, the measurement of the defect $d(x, t, \theta)$ is $\hat{d}(x, t, \theta)$, the 'signal+noise' stochastic field. The noise $\eta(x, t, \theta)$ is then defined from these two stochastic processes by Eq. (4):
$$\eta(x, t, \theta) = \hat{d}(x, t, \theta) - d(x, t, \theta) \tag{4}$$
From Eqs. (2) and (3), PoD and PFA are thus functions indexed by $x$ and $t$, like ROC curves. Since the process $\hat{d}$ at a given time $t$ is assessed from inspection, the characterization of NDT tools by these curves requires knowledge of one of the other stochastic processes in Eq. (4): $d$ or $\eta$. Two situations can be considered: (1) The noise is known because it does not depend on the location of the NDT tool on the structure or because it is known on given areas of the structure. It is generally time invariant and zero mean. (2) The real size is known because it has been measured before on-site inspections, as in the ICON project [27], or because an assumption is made.
In both situations, the definition of continuous spatial functions requires the complete characterization of the stochastic processes by their marginal distributions and spatial covariance. In practice, almost all NDT tools give data at specific locations, and marginal distributions are thus obtained. Moreover, the distance between measurements is generally larger than the correlation length, and additional assumptions on the correlation structure of the stochastic processes are needed. Finally, note that knowledge of ageing laws for $d$ allows the time dependence of ROC curves to be defined. For corrosion processes, for example, several models are available [29][30][31].

Statistical approach in the case of repetitive tests with known bias
Starting from the results of a specific NDT test, we suppose that we have $n_r$ repeated NDT measurements at particular positions $x_j$ on the structure and given times $t_l$. We denote these measurements by $\{\hat{d}^{(i)}_{j,l}\}_{i=1}^{n_r}$ and consider them as $n_r$ outcomes of $\hat{d}(x_j, t_l, \theta)$. We consider that an outcome $d_{j,l}$ of the real size $d(x_j, t_l, \theta)$ is assessed through Eq. (5) from these $n_r$ repeated NDT measurements, which cover the whole set of noise sources, and from the bias $b$. If $b$ is a variable (space and time independent), it can be evaluated from a specific NDT test. If not, expert judgement and analysis of the inspection process can provide values or bounds for it.
Then we deduce $n_r$ outcomes $\eta^{(i)}_{j,l}$ of the noise $\eta(x_j, t_l, \theta)$ as follows:
$$\eta^{(i)}_{j,l} = \hat{d}^{(i)}_{j,l} - d_{j,l}, \quad i \in \{1, \ldots, n_r\} \tag{6}$$
Of course, NDT testing on a given structure gives only one outcome of the real size $d$ and $n_r$ outcomes of the noise $\eta$. In practice, some assumptions on the stochastic processes (stationarity, ergodicity, correlation length shorter than the distance between measurements) allow measurements at different locations and/or different times to be considered as independent outcomes of a random variable. The marginal distributions of the initial stochastic processes can thus be characterized from the unique available outcome. This will be illustrated in the following sections. With these measurements and the initial thickness of the pile reported on the design plans, it is possible to determine the measured loss of thickness $\hat{d}$, mainly due to corrosion, and thus the loss of thickness $d$ from Eq. (5).
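The two steps of this section can be sketched in a few lines. One plausible reading of Eq. (5), assumed here and not confirmed by the text, is that the real size is the mean of the repeated measurements corrected by the bias $b$; the noise outcomes then follow from Eq. (6). The function name and the readings are hypothetical.

```python
from statistics import fmean


def real_size_and_noise(measurements, bias=0.0):
    """Estimate the real size and the noise outcomes at one position and time.

    measurements: the n_r repeated NDT readings.
    bias: systematic bias b of the protocol.
    """
    # Assumed form of Eq. (5): bias-corrected mean of the repeated readings.
    real_size = fmean(measurements) - bias
    # Eq. (6): noise outcome = measured value - real size.
    noise = [m - real_size for m in measurements]
    return real_size, noise


# Hypothetical readings of loss of thickness (mm) at one location, n_r = 3.
readings = [2.1, 1.8, 2.4]
real_size, noise = real_size_and_noise(readings, bias=0.0)
print(real_size, noise)
```

Under this assumption and with $b = 0$, the noise outcomes at each location sum to zero by construction, which is consistent with a zero-mean noise model.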

Presentation of the studied structure and data analysis
The benchmark structure is a wharf located near the city of St. Nazaire, France (see Fig. 6). Nantes-St Nazaire harbour is the fourth biggest harbour in France. This wharf belongs to the category of on-pile wharves. Building steps and other technical information on these structures can be found in [32]. It is located in the estuary of the river Loire, between the towns of Nantes and Saint Nazaire. It is the second station of a container wharf with four stations. Being in a marine environment, the steel piles are susceptible to corrosion, and the assessment of the structural health of such a structure is therefore of great practical importance. In fact, the steel piles play two major roles: the first is a mechanical function and the second is the protection of the reinforced concrete inside against corrosion. The corrosion of these piles is generally uniform, and only this kind of damage is studied in this paper. Eighteen piles have been inspected. At a given height on a pile, the corrosion can be distributed non-uniformly around the pile. This is due to currents, wind and vortex shedding, which modify the distribution of dissolved oxygen and nutrients around the pile. The protocol must therefore be extended to assess whether this phenomenon occurs. To this aim, four areas (cardinal points) are selected around the pile at a given level (see Fig. 7). The vector of spatial indexes $x$ of the stochastic processes consists of the pile number, the vertical abscissa $z$ (reference level: mean sea level), oriented upwards, and the cardinal position (North, West, South, East).
For this benchmark structure, no correlation was found between the different cardinal positions around the pile for the loss of thickness. Moreover, no correlation was found between the different piles at a given height $z$. Under the usual hypotheses on stochastic processes (see Section 3.3), we can then consider the available data (for $d$, $\hat{d}$ and $\eta$) at different cardinal points and for different piles as independent outcomes of stochastic processes indexed by $z$ and having the same probabilistic characterization as the marginal stochastic processes. With an abuse of notation, we denote by $\hat{d}(z, \theta)$, $d(z, \theta)$ and $\eta(z, \theta)$ the respective stochastic fields "measured defect", "real defect" and "noise". As three measurements are made at each location on four generatrices of 18 piles (72 generatrices in total) at the same height $z_j$, 216 outcomes are available for $\hat{d}(z_j, \theta)$ and $\eta(z_j, \theta)$, and 72 outcomes are available for $d(z_j, \theta)$. Here the structure is inspected at six heights: $z_1 = +2$ m and $z_2 = +1$ m for the tidal zone, and $z_3 = +0.5$ m, $z_4 = 0$ m, $z_5 = -0.5$ m and $z_6 = -1$ m for the underwater zone.

Loss of thickness and noise modeling from assumption on the exact value
We denote the available outcomes of $\hat{d}(z_j, \theta)$ and $\eta(z_j, \theta)$ by $\hat{d}_j^{(i,k)}$ and $\eta_j^{(i,k)}$, with $(i, k) \in \{1, \ldots, n_r\} \times \{1, \ldots, n_p\}$. We denote the outcomes of $d(z_j, \theta)$ by $d_j^{(k)}$, with $k \in \{1, \ldots, n_p\}$. Here $n_r = 3$ and $n_p = 72$. Following Section 3.3, outcomes of $d$ and $\eta$ are deduced from outcomes of $\hat{d}$. From expert judgement (interviews with the diver and a corrosion specialist), the protocol does not introduce a systematic bias on the measurement, i.e. $b = 0$. The corresponding discrete model of the noise (defined at each height) is called model 1.

Loss of thickness and noise modelling considering a priori classical distributions
In order to fit the measured loss of thickness $\hat{d}$ and the noise $\eta$ at a given height $z_j$, we use three a priori classical distributions: a Normal distribution, a Generalized Extreme Value (GEV) distribution and a Student distribution. Their probability density functions are given in Table 1. We focus on two heights, $z_1 = +2$ m and $z_6 = -1$ m, which belong to the tidal and underwater zones, respectively. Figs. 8 and 9 show the fits with these theoretical probability density functions (pdf), and Tables 2 and 3 give the corresponding parameters for each one. Using the minimum of −log(likelihood) as the criterion, we see from Table 4 that the Student distribution gives the best fit for both loss of thickness and noise.
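The selection criterion used above, the minimum of −log(likelihood), can be sketched for two of the candidate families. The location-scale Student density written below is the standard textbook form and is an assumption here, since Table 1 is not reproduced; the samples and the Student parameters are hypothetical and not optimized.

```python
from math import lgamma, log, pi
from statistics import NormalDist, fmean, pstdev


def nll_normal(samples, mu, sigma):
    """-log(likelihood) of the samples under a Normal(mu, sigma) model."""
    dist = NormalDist(mu, sigma)
    return -sum(log(dist.pdf(x)) for x in samples)


def nll_student(samples, loc, scale, dof):
    """-log(likelihood) under a location-scale Student distribution."""
    const = (lgamma((dof + 1) / 2) - lgamma(dof / 2)
             - 0.5 * log(dof * pi) - log(scale))
    return -sum(
        const - (dof + 1) / 2 * log(1 + ((x - loc) / scale) ** 2 / dof)
        for x in samples
    )


# Hypothetical noise samples (mm) with two large deviations.
samples = [-0.9, -0.2, -0.1, 0.0, 0.0, 0.1, 0.2, 1.1]
mu, sigma = fmean(samples), pstdev(samples)  # maximum likelihood for the Normal

candidates = {
    "Normal": nll_normal(samples, mu, sigma),
    "Student (dof=3, hand-picked scale)": nll_student(samples, mu, 0.2, 3),
}
best = min(candidates, key=candidates.get)
print(candidates, "->", best)
```

A full comparison would maximize the likelihood over the parameters of each family before comparing the resulting values; the sketch only shows how the criterion itself is evaluated.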

Assessment of ROC points and curves with classical fitting
The sample size being large, Eqs. (2) and (3) provide the following rather good approximation of PoD and PFA at each inspected height:
$$\mathrm{PoD}(z_j) \approx \frac{\mathrm{Card}(A(z_j))}{n_p \times n_r}, \quad \text{with } A(z_j) = \{(i, k) \in I;\ \hat{d}_j^{(i,k)} \geq a_d\}$$
$$\mathrm{PFA}(z_j) \approx \frac{\mathrm{Card}(B(z_j))}{n_p \times n_r}, \quad \text{with } B(z_j) = \{(i, k) \in I;\ \eta_j^{(i,k)} \geq a_d\}$$
where $\mathrm{Card}(\cdot)$ denotes the cardinality of a set and $I = \{1, \ldots, n_r\} \times \{1, \ldots, n_p\}$. Considering model 1, these points have been calculated by fixing the detection threshold at arbitrary values with a step of 0.05 mm. By linking these points with line segments, we obtain ROC curves without any fitting: they are called experimental ROC curves in the following. Considering the classical distributions presented in the previous section, we can build the corresponding ROC curve for each selected pdf and compare it with the experimental ROC curves coming directly from the experimental data. Considering model 1 for noise, ROC points and curves are plotted in Figs. 10 and 11 for heights $z_1 = +2$ m and $z_6 = -1$ m. In order to quantify the performance of a non-destructive technique simply, we can take from the ROC curve the distance between the curve and the best performance point with coordinates (0, 1), as suggested in [11]: this distance, denoted $\delta$, characterizes the optimal efficiency of the NDT tool under specific conditions (detection threshold, conditions of inspection, etc.). Table 5 gives these distances $\delta$ and the corresponding optimal detection thresholds $a_d$. From these figures and this table, two remarks are relevant. First, we notice that inspections performed at $z_1 = +2$ m lead to a shorter distance $\delta$ than inspections performed at level $z_6 = -1$ m: in fact, we observe that all inspections performed in the tidal zone are more effective than inspections performed in the underwater zone. The diver being the same, the noise is mainly governed by the area to be inspected: in the tidal or splash zone the conditions are good, whereas in the underwater zone they are harsh.
Secondly, regarding the fitting with predefined distribution laws, it is clear that they do not lead to the best representation, especially at $z_6 = -1$ m. This way of modelling the loss of thickness and noise therefore does not represent the real performance of the NDT tool well and tends to underestimate its efficiency. This second remark raises the question of whether a better fit of the loss-of-thickness measurements and the corresponding noise can be obtained. Here, we use the decomposition on polynomial chaos and show that it is a useful tool for this purpose.
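The experimental ROC points of this section are obtained by counting outcomes at or above the threshold. A minimal sketch is given below, with hypothetical data in place of the 216 outcomes per height and with helper names that are not from the source.

```python
from math import hypot


def empirical_rate(outcomes, threshold):
    """Fraction of outcomes at or above the detection threshold."""
    return sum(1 for value in outcomes if value >= threshold) / len(outcomes)


def experimental_roc(measured, noise, thresholds):
    """Return (a_d, PFA, PoD) triples obtained by counting, without any fitting."""
    return [
        (a_d, empirical_rate(noise, a_d), empirical_rate(measured, a_d))
        for a_d in thresholds
    ]


# Hypothetical outcomes (mm) at one height; real samples would hold n_r * n_p values.
measured = [1.2, 1.5, 0.9, 2.0, 1.7, 1.1]
noise = [-0.2, 0.1, 0.0, 1.0, -0.1, -0.1]

# Detection thresholds with a step of 0.05 mm, as in the text.
thresholds = [round(k * 0.05, 2) for k in range(0, 61)]

roc = experimental_roc(measured, noise, thresholds)
distance, best_threshold = min((hypot(pfa, 1.0 - pod), a_d) for a_d, pfa, pod in roc)
print(f"distance = {distance:.3f} at a_d = {best_threshold:.2f} mm")
```

Because the curve is built from counts, it is piecewise constant between observed values, and several thresholds can give the same distance; the sketch simply reports one of them.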

Assessment of ROC curves from decomposition of loss of thickness and noise on polynomial chaos
The method of identification on a polynomial chaos (PC) decomposition and the corresponding algorithm are available in [34,35]. It relies on maximum likelihood estimation [31]. Let $X(\theta)$ be a second-order random variable to be identified from $N$ samples, denoted by $\{X^{(k)}\}_{k=1}^{N}$. The expansion of this random variable on the Hermite polynomial chaos reads:
$$X(\theta) \approx \sum_{i=0}^{p} X_i\, h_i(\xi(\theta))$$
where $\xi$ is a standard Gaussian random variable, $h_i$ is the normalized Hermite polynomial of degree $i$ and $p$ is the order of the polynomial chaos expansion. The aim of the identification procedure is to find the coefficients $X_i$ of the decomposition. We assume here that the mean and standard deviation are rather well estimated from the samples. Denoting by $\mu_{\mathrm{exp}}$ and $\sigma_{\mathrm{exp}}$ the mean and standard deviation obtained from the samples, the mean $\mu_X$ and standard deviation $\sigma_X$ of $X(\theta)$ are taken equal to $\mu_{\mathrm{exp}}$ and $\sigma_{\mathrm{exp}}$. Owing to the orthonormality of the Hermite polynomials, this constrains the decomposition to satisfy the following conditions on the coefficients:
$$X_0 = \mu_X, \qquad \sum_{i=1}^{p} X_i^2 = \sigma_X^2$$
Fig. 11. Comparison between ROC curves coming from predefined distributions and experimental data; $z_6 = -1$ m.
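The expansion on the Hermite polynomial chaos and the two moment conditions on its coefficients can be checked numerically. The sketch below assumes the probabilists' Hermite polynomials normalized by $\sqrt{i!}$, which is the usual orthonormal family for a standard Gaussian variable; the coefficients are hypothetical.

```python
import random
from math import factorial, sqrt
from statistics import fmean, pstdev


def normalized_hermite(order, x):
    """Values h_0(x), ..., h_order(x) of the normalized Hermite polynomials."""
    he = [1.0, x]  # probabilists' Hermite polynomials He_0 and He_1
    for n in range(1, order):
        he.append(x * he[n] - n * he[n - 1])  # He_{n+1} = x He_n - n He_{n-1}
    return [he[i] / sqrt(factorial(i)) for i in range(order + 1)]


def pc_eval(coeffs, xi):
    """Evaluate the PC expansion with coefficients X_0..X_p at a Gaussian draw xi."""
    basis = normalized_hermite(len(coeffs) - 1, xi)
    return sum(c * h for c, h in zip(coeffs, basis))


# Hypothetical coefficients: X_0 = 2.0, and X_1^2 + X_2^2 + X_3^2 = 0.25.
coeffs = [2.0, 0.3, 0.4, 0.0]

rng = random.Random(0)
draws = [pc_eval(coeffs, rng.gauss(0.0, 1.0)) for _ in range(100_000)]
print(f"mean = {fmean(draws):.3f}, std = {pstdev(draws):.3f}")
```

With these coefficients the sample mean should be close to $X_0 = 2.0$ and the sample standard deviation close to $\sqrt{0.25} = 0.5$, up to Monte Carlo error.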

Table 5
Distances $\delta$ between the ROC curve and the best performance point, and corresponding detection thresholds $a_d$, for ROC curves coming from classical fittings at levels $z_1 = +2$ m and $z_6 = -1$ m.

Letting $a_i = X_i / \sigma_X$, the PC decomposition can be rewritten as follows:
$$X(\theta) \approx \mu_X + \sigma_X \sum_{i=1}^{p} a_i\, h_i(\xi(\theta))$$
where the coefficients $a_i$ must satisfy:
$$\sum_{i=1}^{p} a_i^2 = 1$$
Denoting $\mathbf{a} = (a_1, \ldots, a_p)^T \in \mathbb{R}^p$, we denote by $p_X(\cdot; \mathbf{a})$ the probability density function of $X$, parametrized by $\mathbf{a}$, and introduce the likelihood function:
$$L(\mathbf{a}) = \prod_{k=1}^{N} p_X(X^{(k)}; \mathbf{a})$$
Maximum likelihood estimation is a popular statistical method for fitting a mathematical model to data. Here, it is used to fit a pdf. The aim is then to find the vector $\mathbf{a}$ that maximizes $L(\mathbf{a})$. The identification problem then reads: find $\mathbf{a}$ such that
$$\mathbf{a} = \arg\max_{\mathbf{b} \in \mathbb{R}^p,\ \|\mathbf{b}\| = 1} L(\mathbf{b})$$
This is an optimization on the unit hypersphere of $\mathbb{R}^p$. In practice, the hypersphere is characterized by angles $\phi_1, \ldots, \phi_{p-1}$, the coefficients $a_i$ being products of sines and cosines of these angles, with $\phi_{p-1} \in [0, 2\pi]$ and the other angles $\phi_i \in [0, \pi]$. The optimization problem is then reformulated as an unconstrained optimization problem on $\mathbb{R}^{p-1}$. It is solved by a two-step procedure: (1) a coarse localization of a minimum of the negative log-likelihood is found through a basic random search algorithm; (2) starting from this point, the Nelder-Mead simplex method is used [36].
This problem generally has several local minima (see Fig. 12), and it is convenient to repeat this two-step procedure several times in order to find the global solution (for instance, 10 times for a PC of degree $p = 2$ and 100 times for a PC of degree $p = 3$).
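The angular description of the unit hypersphere and the first, coarse step of the procedure can be sketched as follows. The standard hyperspherical coordinates used here are an assumption about the exact parametrization, the objective is a toy stand-in for the negative log-likelihood, and the Nelder-Mead refinement of step (2) is not included.

```python
import random
from math import cos, pi, sin, sqrt


def angles_to_unit_vector(angles):
    """Map p-1 angles to a point of the unit hypersphere of R^p."""
    coords, sin_product = [], 1.0
    for phi in angles:
        coords.append(sin_product * cos(phi))
        sin_product *= sin(phi)
    coords.append(sin_product)
    return coords


def coarse_random_search(objective, p, n_trials, rng):
    """Step (1): keep the best of n_trials random angle vectors (requires p >= 2)."""
    best = None
    for _ in range(n_trials):
        # All angles in [0, pi] except the last one, which spans [0, 2*pi].
        angles = [rng.uniform(0.0, pi) for _ in range(p - 2)]
        angles.append(rng.uniform(0.0, 2.0 * pi))
        value = objective(angles_to_unit_vector(angles))
        if best is None or value < best[0]:
            best = (value, angles)
    return best


# Toy objective (hypothetical): squared distance to a known unit vector.
target = [0.6, 0.0, 0.8]


def toy_objective(a):
    return sum((ai - ti) ** 2 for ai, ti in zip(a, target))


rng = random.Random(1)
best_value, best_angles = coarse_random_search(toy_objective, p=3, n_trials=2000, rng=rng)
best_a = angles_to_unit_vector(best_angles)
print(best_value, best_a, sqrt(sum(ai * ai for ai in best_a)))
```

Working in angles keeps every candidate on the hypersphere, so the constraint on the coefficients never has to be enforced explicitly; repeating the search from several seeds mirrors the restarts recommended for avoiding local minima.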
For our application, this identification on polynomial chaos is performed separately on the measured loss of thickness $\hat{d}$ and the noise $\eta$ at each height. Figs. 13 and 14 present the fitting results obtained with the method of identification on the PC decomposition for orders $p \in \{1, 2, 3\}$, at $z_1$ and $z_6$, respectively. Table 6 indicates the MLE for loss of thickness and noise: clearly, with a PC order $p = 3$ we get better fits than with classical distributions (Table 7). At a given height, the aim is to obtain the ROC curves, i.e. PoD and PFA, using the polynomial chaos decomposition of the loss of thickness $\hat{d}$ and the noise $\eta$. Denoting by $\{\hat{d}_{j,i}\}_{i=0}^{p}$ and $\{\eta_{j,i}\}_{i=0}^{p}$ the coefficients of the PC decomposition of $\hat{d}(z_j, \theta)$ and $\eta(z_j, \theta)$, we obtain the following approximation of PoD and PFA:
$$\mathrm{PoD}(z_j) \approx \int_{\mathbb{R}} \mathbf{1}_{[a_d, +\infty[}\Big(\sum_{i=0}^{p} \hat{d}_{j,i}\, h_i(x)\Big)\, \mathrm{d}p_\xi(x) \tag{19}$$
$$\mathrm{PFA}(z_j) \approx \int_{\mathbb{R}} \mathbf{1}_{[a_d, +\infty[}\Big(\sum_{i=0}^{p} \eta_{j,i}\, h_i(x)\Big)\, \mathrm{d}p_\xi(x) \tag{20}$$
where $p_\xi$ is the probability measure associated with the standard Gaussian random variable $\xi$. In practice, the integrals in Eqs. (19) and (20) are computed through Monte-Carlo simulations using $10^6$ samples. These quantities are independent of the study: they can be pre-processed once and for all and used for each application. As shown in Figs. 15 and 16, the ROC curves coming from polynomial chaos identification with $p = 3$ lead to a good estimation of the distance $\delta$ to the best performance point, compared with the experimental ROC curves. Fig. 17 presents all experimental ROC curves at the 6 levels and the corresponding ROC curves obtained by identification of loss of thickness and noise with the PC decomposition. Two families of curves appear. The first one is close to the best ROC point with coordinates (0, 1), and the second one is composed of less effective inspections. Note that this second family, in the colored area, gathers all the inspections performed in the underwater zone, for which inspection conditions are harsh.
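The Monte-Carlo evaluation of PoD and PFA from PC coefficients can be sketched as below. The coefficients are hypothetical and chosen as order-1 (Gaussian) cases so that the result can be checked against the closed-form normal tail; $10^5$ draws are used instead of $10^6$ to keep the example fast.

```python
import random
from math import factorial, sqrt


def normalized_hermite(order, x):
    """Values h_0(x), ..., h_order(x) of the normalized Hermite polynomials."""
    he = [1.0, x]
    for n in range(1, order):
        he.append(x * he[n] - n * he[n - 1])
    return [he[i] / sqrt(factorial(i)) for i in range(order + 1)]


def exceedance_probability(coeffs, threshold, basis_values):
    """Monte-Carlo estimate of P(sum_i coeffs[i] * h_i(xi) >= threshold)."""
    hits = sum(
        1 for basis in basis_values
        if sum(c * h for c, h in zip(coeffs, basis)) >= threshold
    )
    return hits / len(basis_values)


# Gaussian draws and basis values are computed once and reused for every
# set of coefficients and every threshold.
order = 3
rng = random.Random(42)
basis_values = [normalized_hermite(order, rng.gauss(0.0, 1.0)) for _ in range(100_000)]

# Hypothetical PC coefficients (mm): measured loss of thickness and noise.
measured_coeffs = [2.0, 0.5, 0.0, 0.0]
noise_coeffs = [0.0, 0.5, 0.0, 0.0]

a_d = 1.0
pod = exceedance_probability(measured_coeffs, a_d, basis_values)
pfa = exceedance_probability(noise_coeffs, a_d, basis_values)
print(f"PoD = {pod:.4f}, PFA = {pfa:.4f}")
```

Sweeping `a_d` over a grid with the same `basis_values` gives the whole ROC curve at the cost of one pass over the samples per threshold.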

Use of another noise modelling
In this section, we present another model for the noise $\eta$, called model 2. Building the sample of noise (see Eq. (6)) allows the scatter diagram of Fig. 18 to be plotted: it clearly shows that there is no correlation between the noise and the real size of the loss of thickness [33]. We thus suppose that the noise is mainly governed by the inspected area (tidal or underwater zone). We then consider that the noise is a piecewise homogeneous stochastic field depending on the location in the tidal or underwater zone. We also consider that samples at different heights in a given zone are independent outcomes of a random variable, which allows the marginal distribution of the noise in this zone to be characterized. The distributions of noise in the tidal and underwater zones are presented in Fig. 19. A total of 432 measurements were taken in the tidal zone and 864 in the underwater zone. We note that decompositions on a polynomial chaos of order $p = 3$ lead to the best fits for the two areas. Considering the real defect size at height $z_1 = +2$ m and using the previous decomposition of the noise for the tidal zone on the polynomial chaos of order $p = 3$, we generate three random samples of loss of thickness. Fig. 20 shows the identifications of these three samples on the polynomial chaos and compares them with the initial identification of the real loss of thickness at $z_1 = +2$ m. We observe that these new fits are very close to the initial one: this shows that model 2 seems acceptable for generating random noise. Finally, Fig. 21 presents the ROC curves coming from the identifications of these random samples and noise model 2. Except for sample 2, we observe that each new ROC curve is very close to the initial one: the reference ROC curve leads to a distance to the best performance point of $\delta = 0.052$, while the ROC curves coming from the random samples give $\delta_1 = 0.052$, $\delta_2 = 0.060$ and $\delta_3 = 0.053$, respectively. As the experimental ROC curve gives $\delta_{\mathrm{exp}} = 0.054$, these results seem good, with a maximum error below 10% apart from sample 2.
For the ROC curve calculated with sample 2, the gap comes from the identification: we have noticed that the tails of the distribution were different from the others. This raises the question of whether a criterion based on maximum likelihood is a good choice for fitting the tails of the distribution well.
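The comparison of distances quoted in this section can be made explicit with a short calculation. The error measure used below, the relative deviation from the experimental distance 0.054, is an assumption, since the text does not define the error; the distance values are those quoted above.

```python
# Distance to the best performance point from the experimental ROC curve.
delta_exp = 0.054

# Distances quoted in the text for the reference fit and the three random samples.
deltas = {
    "reference": 0.052,
    "sample 1": 0.052,
    "sample 2": 0.060,
    "sample 3": 0.053,
}

# Assumed error measure: relative deviation from the experimental distance.
relative_error = {
    name: abs(value - delta_exp) / delta_exp for name, value in deltas.items()
}

for name, err in relative_error.items():
    print(f"{name}: {100 * err:.1f}%")
```

Under this assumed measure, the reference curve and samples 1 and 3 stay below 4%, while sample 2 comes out at about 11%, consistent with it being singled out in the discussion.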

Conclusion
Concepts of PoD, PFA and ROC curves coming from detection theory are very useful tools for quantifying the quality of non-destructive techniques. Classically used for the inspection of cracks in offshore structures, they can also be applied to corrosion problems in the inspection of ships or corroded marine and coastal structures. ROC curves can easily be built in the discrete case, but their use in a risk-based inspection (RBI) analysis requires a continuous formulation of loss of thickness and noise. In this paper, loss of thickness and noise have been fitted with predefined probability density functions, but we have observed that classical distributions do not fit the data correctly. It is shown that the identification method based on polynomial chaos leads to better results for a chaos of order $p = 3$ and yields rather precise ROC curves compared with the experimental ROC points. This decomposition is also tractable for the Stochastic Finite Element Method [37]. Moreover, two models of noise have been proposed in this paper, both of which have led to good results compared with the experimental ROC curves. Thus, using these noise models, it would be possible to carry out an RBI analysis.
repair and of maintenance) (web site: http://www.medachs.u-bordeaux1.fr). The authors would like to thank the Harbour Authorities of Nantes St Nazaire for their technical support and expert judgement.