Scale-space for empty catheter segmentation in PCI fluoroscopic images

In this article, we present a method for empty guiding catheter segmentation in fluoroscopic X-ray images. Because the guiding catheter is a commonly visible landmark, its segmentation is an important and difficult building block for Percutaneous Coronary Intervention (PCI) procedure modeling. In a number of clinical situations, the catheter is empty and appears as a low-contrast structure with two parallel and partially disconnected edges. To segment it, we work on the level-set scale-space of the image, the min tree, to extract curve blobs. We then propose a novel structural scale-space, a hierarchy built on these curve blobs. The deep connected component, i.e. the cluster of curve blobs on this hierarchy, that maximizes the likelihood of being an empty catheter is retained as the final segmentation. We evaluate the performance of the algorithm on a database of 1250 fluoroscopic images from 6 patients. We obtain good qualitative and quantitative segmentation performance, with mean precision and recall of 80.48% and 63.04% respectively. We develop a novel structural scale-space to segment a structured object, the empty catheter, in challenging situations where the information content of the images is very sparse. Fully automatic empty catheter segmentation in X-ray fluoroscopic images is an important preliminary step in PCI procedure modeling, as it helps tag the arrival and removal locations of other interventional tools.

Keywords Segmentation · Percutaneous coronary intervention · Mathematical morphology · Modeling interventional processes · Guiding catheter

1 Introduction

In interventional cardiology, Percutaneous Coronary Intervention (PCI) procedures are performed with real-time streaming of X-ray images, most of them being low-dose X-ray images called fluoroscopic images. Physicians expect the behaviour of the imaging equipment to be continuously optimized, either to ensure optimum image quality with minimum dose delivery or to automatically process some sequences so as to enhance a detail of interest at a particular moment. PCI procedure modeling can help to improve the interaction of the clinician with the imaging equipment. This concept refers to determining the intention of the clinician along the procedure. For this purpose, continuous monitoring and labelling of the sequence is necessary. The key steps of the PCI procedure are: vessel diagnosis, guidewire navigation, stent positioning, stent deployment (balloon inflation), and stenting assessment. Getting this information directly from the human operator is not acceptable from a workflow point of view. We therefore aim at designing a family of image processing algorithms that identify the presence of different interventional tools in the images and link this information to high-level knowledge describing the steps of the procedure and the user expectations for each of them. This is a form of semantic analysis which is fundamentally different from traditional automatic X-ray exposure control, which combines user interactions and measures of the image statistics. In the field of operating theater monitoring and surgical process modeling for laparoscopic and cataract surgeries, similar pioneering work has been reported [9], [8]. In our case, such semantic information may also be used for automatic dose control.
Monitoring interventional tools such as the guiding catheter, EP catheter, guide wire tip, guide wire body, marker balls, balloon and stent is necessary to obtain this information. Thus, segmentation of these tools is a fundamental building block in such semantic analysis. Most of these tools are highly contrasted and relatively easy to segment. Milletari [13] showed that it is possible to segment EP catheters with a segmentation accuracy of 99.3%. Brost et al. [2] have also proposed efficient catheter tracking for electrophysiology procedures. Various works on the segmentation of pigtail catheters [10] and EP catheters [12,13] underline the importance of segmenting interventional tools and endovascular devices.
From a first analysis, we observe that the segmentation of the guiding catheter is of utmost importance. A guiding catheter is a tool that appears throughout the PCI procedure. It can contribute significant semantic information since it is the first tool to appear in the field of view, it stays fixed at the ostium for the rest of the procedure, and it is the conduit for all other tools and devices. Thus, its segmentation can help procedure modeling determine the events and phases of the procedure from the arrival and removal locations of other devices (guide wire, marker balls). We address here the empty catheter case, i.e. when the catheter is not filled with contrast media or a guidewire. Such an empty catheter appears in 20% to 30% of the images acquired during a PCI procedure, mostly in the first steps of the procedure, where the analysis must start. A filled catheter is a highly contrasted structure and is relatively easy to segment.
Interventional systems are provided with two different application modes, namely fluoroscopy and record mode (also called cine or graphy). Fluoroscopy mode is used for manoeuvring the interventional tools. In record mode, the system is set to deliver images with a quality sufficient to support the operator in assessing the vasculature. Record images are more contrasted and less noisy than fluoroscopic images. In fluoroscopy, the intensity of the X-ray beam, and hence the dose delivered, is limited by regulation. As reported in [4], the typical dose rate in record images is 6 to 10 times that of fluoroscopic images, which explains the significant difference in noise levels and contrast. Fig. 1 shows a record image and a fluoroscopic image taken a few seconds apart in the same setting, illustrating the difference in quality between these two imaging modes of interventional angiographic units. In order to perform continuous monitoring of the procedure, it is necessary to segment interventional tools in record as well as fluoroscopic images. However, the segmentation task is difficult due to the low contrast of fluoroscopic images. In this article, we address the task of empty catheter segmentation in fluoroscopic images in particular. In these images the empty catheter appears as a low-contrast structure with two parallel and partially disconnected edges, because it is just an empty tubular pipe made of a material with little radio-opacity. As the X-ray contrast of an object depends on both the radio-opacity of the material and its thickness, an empty catheter is mainly detectable on its boundaries, where the projective thickness is larger. Overall, the image signal can be characterized by a general geometric structure coming from the smooth curve of the catheter and by sparse information due to the limitations of low-dose X-ray imaging.
We devise a bottom-up approach for segmenting the empty catheter in fluoroscopic images. We first use the level-set scale-space, i.e., the hierarchy of all the level sets of the grayscale image, called the component tree, to extract curve blobs, which are small dark persistent regions that are potentially part of the empty catheter. These curve blobs are disconnected in the image space. We then propose a structural graph-based scale-space, in the form of a hierarchy (i.e., a tree), where these curve blobs are connected. We analyze this hierarchy to select the cluster of curve blobs that maximizes a likelihood score of being an empty catheter. While the first tree exhibits the deep structures of the critical points, the second tree puts forward the even deeper structures of interest, which we call deep connected components. To evaluate our work, we use a database of 1250 fluoroscopic images from 6 patients (retrospective use of collected patient images). The centerline of the catheter in this dataset was manually delineated by a trained observer to define the ground truth.
The major steps of our proposed algorithm are detailed in Section 3. The strategy retained for assessing the performance of this algorithm is discussed in Section 4. The qualitative and quantitative results are presented in Section 5. The main contributions are: (1) the proposal of a new notion of deep connected component, appearing in a second-order scale-space (Section 3); and (2) the assessment of the proposed method on a database of 1250 fluoroscopic images (Section 5).
2 Technical challenges: scale spaces and deep connected components

Classical techniques [5], which are mostly differential-based, do not work in this situation due to the weak contrast of empty catheters and the high noise level of fluoroscopic images. We decided to adopt an approach derived from the theory of scale-space. According to this theory, each structure in a scene is visible at a certain scale. Finding the right scale is a challenging issue that has been studied by many authors, primarily using the Gaussian scale-space. Lindeberg [11] studied the problem of linking local critical points (extrema and saddles) over scales, leading to the so-called scale-space primal sketch, which makes explicit the relation between structures at different scales. An important practical issue in this approach is the ability to attach a persistence measure to the structures, i.e., a measure of how long the structures survive during the evolution. In their seminal works, both Koenderink [7] and Witkin [18] propose to investigate the deep structure of an image, i.e., the structure of all levels of resolution simultaneously.
For signals of dimension 2 or greater, two drawbacks of the Gaussian scale-space are that, during the evolution, 1) structures evolve (change shape), and 2) critical points can be created. On the other hand, connected filters from mathematical morphology [15] can be seen as a non-linear scale-space: in such approaches, the image is transformed into an equivalent tree-based representation (tree of upper-level sets, of lower-level sets, or both), and attributes can be computed for each node of the tree. Selecting the nodes with a criterion based on these attributes makes it possible to study the evolution of the nodes of the tree, and in particular their persistence. By construction, during such an evolution, structures cannot change shape and no novel structure can be created. A formalization of such ideas in the context of image segmentation has been achieved by Guigues et al. [6]. The hierarchical data organization presented in this article has the main scale-space properties studied in [6]. An early attempt at using a hierarchical data organization for guidewire localization was made by Barbu et al. [1]. They use a marginal space learning based hierarchical model of curves (obtained from a low-level segment detector) to model complex free-form curves. The similarity with our approach is that both are bottom-up approaches with a low-level segment/blob detector as the first step. Although [1] does not show segmentation of empty catheters, a head-to-head comparison would be helpful, but neither the dataset nor the implementation has been made public. The reported computational times are close to ours. A major insight that we draw from such methods is that any hierarchical data organization has the main scale-space properties. As the algorithms for computing the trees are graph-based, these ideas can be extended to work on any graph, and not only on 2D/3D images (e.g. Xu et al. [19], [14]).
In this work, we first use the level-set scale space to identify curve blobs, which are small dark persistent regions that are potentially part of the empty catheter. We propose a novel structural graph-based scale-space, in the form of a hierarchy, i.e. a tree built on the curve blobs. We analyze this second tree with the very same techniques as the first one, and we retain the most persistent structures in this second scale-space as the final segmentation. If the first tree exhibits the deep structures of the critical points, the second tree puts forward the even deeper structures of interest, that we call deep connected components.

Methodology
Our bottom-up approach comprises two main parts: the first one aims at identifying small dark regions called curve blobs (Section 3.1) and the second one focuses on grouping them in order to retrieve the whole catheter (Section 3.2 and Section 3.3). Both steps are performed by analyzing a hierarchical structure, called a component tree [15].

Curve blobs extraction
We extract the curve blobs from the image using a component tree called the min tree. The min tree [15] structures the connected components of the lower-level sets of the grayscale image based on the inclusion relationship. A grayscale image $f$, when thresholded in increasing order at every possible gray level ranging from $h_{\min}$ to $h_{\max}$, yields a stack of nested (lower) level sets. Each level set can be partitioned into connected components when the image domain is structured as a pixel adjacency graph (we consider the 4-adjacency relation). In this setting, any two connected components $A$ and $B$ at two successive thresholds are either nested or disjoint, and we say that the connected component $A$ is the parent of $B$ whenever $B$ is nested in $A$. With this "parent" relation, the set of all connected components is a directed tree called the min tree of the image $f$.
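The construction above can be illustrated with a deliberately naive sketch. It thresholds a tiny image at every gray level present, labels the 4-connected components of each lower level set, and links each distinct component to the smallest component that strictly contains it. The function names `lower_level_components` and `build_min_tree` are hypothetical, and this brute-force version is only meant to make the definition concrete; it is not the efficient algorithm one would use on full-size images.

```python
from collections import deque


def lower_level_components(image, h):
    """Return the 4-connected components of the lower level set {p : image[p] <= h}."""
    rows, cols = len(image), len(image[0])
    seen = set()
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] > h or (r, c) in seen:
                continue
            # Breadth-first flood fill from an unvisited pixel of the level set.
            queue = deque([(r, c)])
            seen.add((r, c))
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and (ny, nx) not in seen and image[ny][nx] <= h:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            components.append(frozenset(pixels))
    return components


def build_min_tree(image):
    """Return (levels, parent): the lowest threshold at which each distinct
    component appears, and the parent of each component (None for the root)."""
    gray_levels = sorted({value for row in image for value in row})
    levels = {}
    for h in gray_levels:
        for component in lower_level_components(image, h):
            levels.setdefault(component, h)  # keep the first (lowest) level
    parent = {}
    for component in levels:
        # Components are nested or disjoint, so the strict supersets form a
        # chain and the parent is simply the smallest of them.
        supersets = [other for other in levels if component < other]
        parent[component] = min(supersets, key=len) if supersets else None
    return levels, parent


levels, parent = build_min_tree([[0, 1, 3, 2]])
```

On the one-row image used here, the pixel with value 0 is nested in the component formed by the two leftmost pixels, which is itself nested in the root component covering the whole image.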
The min tree considers only the dark connected regions of the image, and the curve blobs appear as regions at different levels of this tree (i.e., at different scales; see Fig. 2 for an illustration). The image $f$ on which the min tree is computed is obtained by applying a simple morphological dark top-hat to the input X-ray image, with a circular structuring element whose radius is the same as the radius of the catheter. Since curve blobs appear at different levels, two curve blobs might well be obtained from two distinct threshold values. The set of all curve blobs is then included in a non-local (i.e., spatially variable) threshold of the image, that is, a non-horizontal cut of the min tree. In order to obtain the curve blobs among the connected components, we design a criterion that selects them in this non-horizontal cut. We assign to every component in the min tree attributes characterizing its shape and structural properties. For curve blobs, we design a selection criterion based on area and elongation attributes. The area of a component is its number of pixels, whereas the elongation is given by $1 - l_{\min}/l_{\max}$, where $l_{\max}$ and $l_{\min}$ are the lengths of the axes of an ellipse optimally fitted to the component.
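As an illustration of the elongation attribute, the sketch below computes $1 - l_{\min}/l_{\max}$ for a set of pixels. It assumes that the fitted ellipse is the one defined by the second-order central moments of the component, so that the axis lengths are proportional to the square roots of the eigenvalues of the pixel covariance matrix; the text only states that the ellipse is optimally fitted, so this choice and the function name `elongation` are assumptions.

```python
import math


def elongation(pixels):
    """Elongation 1 - l_min / l_max of a component given as (row, col) pixels.

    Assumption: the ellipse is the moment-based one, whose axis lengths are
    proportional to the square roots of the covariance eigenvalues.
    """
    n = len(pixels)
    mean_y = sum(y for y, _ in pixels) / n
    mean_x = sum(x for _, x in pixels) / n
    var_y = sum((y - mean_y) ** 2 for y, _ in pixels) / n
    var_x = sum((x - mean_x) ** 2 for _, x in pixels) / n
    cov_xy = sum((y - mean_y) * (x - mean_x) for y, x in pixels) / n

    # Closed-form eigenvalues of the 2x2 symmetric covariance matrix.
    half_trace = (var_x + var_y) / 2.0
    root = math.sqrt(((var_x - var_y) / 2.0) ** 2 + cov_xy ** 2)
    eig_max = half_trace + root
    eig_min = max(half_trace - root, 0.0)  # guard against rounding below zero

    if eig_max == 0.0:
        return 0.0  # a single pixel has no preferred direction
    return 1.0 - math.sqrt(eig_min / eig_max)
```

With this definition, a compact square region has an elongation close to 0 and a one-pixel-wide straight segment has an elongation of 1.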
In order to select curve blobs, a straightforward idea is to select components whose area lies in a certain range and whose elongation attribute is large enough. Note that we are not interested in connected components that are too small, which might come from noise, nor in ones that are too large, which might correspond to filled catheters, pacing leads, or other large anatomical or interventional structures. To establish a proper selection criterion, we built a set of curve blobs belonging to empty catheters (taken from 14 images) and another set of curve blobs selected randomly in the same images at locations away from the marked empty catheter. By investigating the distribution of these two sets in the area-elongation space, we established a relevant criterion, defined by independent lower limits on area and on elongation and an upper limit on the weighted sum of area and elongation. All nodes satisfying this criterion are selected to form the set $C$ of curve blobs. Nested connected components may both satisfy the criterion, as they may depict the same region in the image. Based on the min tree structure, a filtering step is therefore performed to preserve only the element with the largest area among nested ones (using the inclusion relationship).

Fig. 3 Curve blobs clustering; from left to right: extracted curve blobs (see Fig. 2); connection of curve blobs on the structural scale-space; hierarchy of deep connected components (clusters of curve blobs at different scales); selected deep connected component.
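The selection rule and the filtering of nested components can be sketched as follows. All threshold values and coefficients are hypothetical parameters, since the text does not report the values obtained from the area-elongation analysis, and the names `is_curve_blob` and `keep_largest_nested` are introduced only for this illustration. The tree is given as a `parent` mapping such as the one produced when building the min tree.

```python
def is_curve_blob(area, elongation, min_area, min_elongation,
                  area_coeff, elongation_coeff, max_weighted_sum):
    """Independent lower limits on area and elongation, plus an upper limit
    on their weighted sum (all limits are hypothetical parameters)."""
    weighted_sum = area_coeff * area + elongation_coeff * elongation
    return (area >= min_area
            and elongation >= min_elongation
            and weighted_sum <= max_weighted_sum)


def keep_largest_nested(selected, parent):
    """Among selected nodes that are nested in one another, keep only the
    outermost one, which is the one with the largest area."""
    selected = set(selected)
    kept = []
    for node in selected:
        # Walk up the tree until a selected ancestor or the root is reached.
        ancestor = parent.get(node)
        while ancestor is not None and ancestor not in selected:
            ancestor = parent.get(ancestor)
        if ancestor is None:  # no selected ancestor: this node is outermost
            kept.append(node)
    return kept
```

Because a parent always contains its children in a min tree, keeping the nodes without a selected ancestor is equivalent to keeping the largest-area element of each nested chain.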

Curve blob clustering: deep connected components construction
This section presents the main idea of empty catheter detection, i.e. curve blob clustering in the structural scale-space. Fig. 3 gives an intuitive picture of this clustering. Some curve blobs extracted in the previous step are regions of the edges of the catheter, while others correspond to other anatomical and interventional structures or to noise. By analyzing a given curve blob individually, it is difficult to decide whether it is part of a catheter, because contextual information is missing. So we consider the curve blobs in a common space and define a weight for each pair of curve blobs, called the blob pair weight. The weight is defined to be small when the two considered curve blobs are likely to be part of an empty catheter. The rightmost image in Fig. 3 shows the selected cluster of curve blobs belonging to the empty catheter, where green curve blobs are connected by red edges that link blob pairs. We propose to build the blob pair weight by combining three elementary weights, each of them characterizing one aspect of the relation between two curve blobs:
- Spatial weight, $w_S$: given two curve blobs, the minimum Euclidean distance between the extremities of the blob axes is considered. It is essential because the intent of the hierarchy is to connect close curve blobs.
- Alignment weight, $w_A$: an empty catheter looks like a partially disconnected curvilinear structure. Thus, the blobs belonging to it shall be part of a smooth curvilinear structure. The alignment weight is designed to measure this property. Each blob is represented by its barycenter, enriched by a vector representing the orientation (or first moment) of the blob. The alignment weight of two blobs is then the minimum of the inner products of these orientation vectors and of the vectors joining the barycenters.
- Profile weight, $w_P$: it estimates the dissimilarity between the intensity profiles along a pair of blobs and the desired intensity profile along an ideal empty catheter. The intensity profile along a blob is a series of values $(i_d \mid d \in [-N, N])$, where $i_d$ is the average of the intensities of the pixels on the segment parallel to the blob axis located at (signed) distance $d$. The distance to the expected intensity profile of a blob is the sum of squared differences between the intensity profile of the blob and that of an ideal catheter. The profile weight of a blob pair is the mean of the distances to the expected profile of the two blobs.
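Of the three elementary weights, the profile weight has the most direct arithmetic definition, and a minimal sketch of it is given here. Profiles are assumed to be already sampled as lists of $2N + 1$ average intensities, and the expected profile of an ideal empty catheter is passed in as data; the function names are hypothetical.

```python
def distance_to_expected_profile(profile, expected):
    """Sum of squared differences between a blob profile and the ideal one."""
    if len(profile) != len(expected):
        raise ValueError("profiles must be sampled at the same distances d")
    return sum((p - e) ** 2 for p, e in zip(profile, expected))


def profile_weight(profile_1, profile_2, expected):
    """Mean of the two blobs' distances to the expected profile."""
    d1 = distance_to_expected_profile(profile_1, expected)
    d2 = distance_to_expected_profile(profile_2, expected)
    return 0.5 * (d1 + d2)
```

For example, with an expected profile `[1, 1, 1]`, the blob profiles `[1, 2, 3]` and `[1, 1, 1]` have distances 5 and 0, giving a profile weight of 2.5 before any sigmoidal rescaling.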
We transform the individual weights with a sigmoidal function so that the three weights lie in the same range. The blob pair weight $w_B(b_1, b_2)$ is computed for any two blobs $b_1$ and $b_2$ in $C$ by combining the three elementary weights, with parameters $\alpha$ and $\beta$ (equation 1), where $\tilde{w}_S$ and $\tilde{w}_P$ denote the weights transformed with the sigmoidal function.

The hierarchy of curve blobs is defined from the blob pair weight. Intuitively, a threshold on the blob pair weight gives a partition of $C$ into clusters of connected blobs which are "consistent" with respect to blob pair weights. More precisely, for a given threshold value $\lambda$, we build a curve blob graph $G_\lambda = (C, E_\lambda)$ where each vertex is a curve blob in $C$ and where two curve blobs are linked by an edge in $E_\lambda$ if their blob pair weight is below $\lambda$:

$$E_\lambda = \{\{b_1, b_2\} \subseteq C \mid b_1 \neq b_2,\ w_B(b_1, b_2) < \lambda\}.$$

Such a graph induces a partition $P_\lambda$ of the curve blobs into connected components, each element being referred to as a blob cluster at scale $\lambda$. The set of all blob clusters obtained at every possible scale is a hierarchy $H$ of partitions on the set of curve blobs, given by:

$$H = \bigcup_{\lambda} P_\lambda.$$

Indeed, $H$ is a hierarchy since any two blob clusters in $H$ are either nested or disjoint. Hence, this hierarchy can be managed as a tree structure where the parenthood relationship is given by the inclusion relationship on the set of clusters (more precisely, it is its Hasse diagram). Using the terminology of mathematical morphology, where a similar construction has been done for pixels with a different measure [3,16], we call this component tree the quasi-flat zone hierarchy of the blob pair weight. Any element of a partition at scale $\lambda$ is called a (quasi-flat) zone. Hence, in the context of this article, these zones are clusters of curve blobs, or deep connected components. This hierarchy is what we referred to previously as a structural scale-space, in which we look for the deep connected component corresponding to the empty catheter in the image.
Unlike the min tree, which is built directly over the image pixels, this hierarchy is built over the curve blobs: at every threshold value, the set of all curve blobs is partitioned, the elements of the partitions being clusters of curve blobs.
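The hierarchy of blob clusters can be sketched with a union-find structure, processing blob pairs by increasing weight in the manner of a minimum spanning tree construction. This is only one possible way to enumerate the clusters, not necessarily the implementation used in this work. Blobs are identified by integer indices, `pair_weights` is a hypothetical dictionary mapping an index pair to its blob pair weight, and each returned cluster is tagged with its birth weight, meaning that it belongs to the partition $P_\lambda$ for thresholds $\lambda$ strictly above that weight until it is absorbed by a larger cluster.

```python
from itertools import groupby


def quasi_flat_zone_hierarchy(num_blobs, pair_weights):
    """Return a list of (birth_weight, cluster) pairs, one per zone of the
    hierarchy. Singletons get a birth weight of minus infinity."""
    parent = list(range(num_blobs))
    members = {i: frozenset([i]) for i in range(num_blobs)}

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    hierarchy = [(float("-inf"), frozenset([i])) for i in range(num_blobs)]
    edges = sorted(pair_weights.items(), key=lambda item: item[1])
    # Edges with equal weight enter the graph at the same threshold, so they
    # are merged together before the resulting clusters are recorded.
    for weight, group in groupby(edges, key=lambda item: item[1]):
        touched = set()
        for (a, b), _ in group:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a
                members[root_a] = members[root_a] | members.pop(root_b)
                touched.discard(root_b)
                touched.add(root_a)
        for root in touched:
            hierarchy.append((weight, members[root]))
    return hierarchy


zones = quasi_flat_zone_hierarchy(
    4, {(0, 1): 0.1, (0, 2): 0.2, (1, 2): 0.3, (2, 3): 0.9})
```

In this small example the zones are the four singletons together with the clusters {0, 1}, {0, 1, 2} and {0, 1, 2, 3}; any two of them are nested or disjoint, which is the property that makes the set a hierarchy.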

Deep connected component selection
An empty catheter appears as a zone (deep connected component) in the partition of this quasi-flat zone hierarchy at some scale $\lambda$ in the structural scale-space. In order to analyze the zones in this space, we use a measure $L$ which maps any zone $Z$ to a positive real value $L(Z)$ that represents the likelihood of $Z$ being an empty catheter. For each zone $Z$, the measure $L(Z)$ depends on some attributes of $Z$. These attributes depend on several geometric and (time and space) continuity properties of $Z$, modeling the appearance of an empty catheter in fluoroscopic image sequences. The six zone attributes used in this study are explained below:
- Length: the length of a zone is determined by fitting a 3rd order polynomial curve to the curve blobs of the zone, each curve blob being represented by the center and end-points of its blob axis. Once the fitting is done, the segment of the curve is selected by mapping the blob points onto the fitted curve and determining the extremities of these mappings on the curve. The length is the arc length of the curve between these extremities.
- Fitting error: it is calculated as the average of the residual errors in the fitting above.
- Average distance to expected profile: the mean of the distances to the expected profile of all curve blobs of the zone.
- Proximity to expected scale of observation: an empty catheter is expected to appear as a zone at a certain scale $\lambda$. Therefore, we design an attribute measuring the proximity to this expected scale. The expected scale is observed from the image dataset.
- Proximity to image borders: this attribute is computed as the minimum of the distances of all the curve blobs in a zone to the border of the image.
- Temporal feedback: for each frame, we compute a feedback image with the score of the detected catheter in its bounding box and 0 elsewhere. We then sum the feedback images of the past 10 frames and take the average over all pixels of a zone. This value is the temporal feedback attribute of the zone.
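The temporal feedback attribute in the list above can be sketched as follows. Instead of storing full feedback images, the sketch keeps the bounding box and score of each of the last detections, which yields the same sum at any pixel. The class name `TemporalFeedback`, the inclusive `(row_min, col_min, row_max, col_max)` bounding-box convention and the representation of a zone as a list of pixels are assumptions made for the illustration.

```python
from collections import deque


class TemporalFeedback:
    """Accumulates the detections of the most recent frames."""

    def __init__(self, history=10):
        # Only the last `history` frames are kept; older ones are dropped.
        self.frames = deque(maxlen=history)

    def push(self, bbox, score):
        """Register the detection of one frame: inclusive bounding box
        (row_min, col_min, row_max, col_max) and its score."""
        self.frames.append((bbox, score))

    def value_at(self, row, col):
        """Sum over stored frames of the feedback image at one pixel:
        the score inside the bounding box and 0 elsewhere."""
        return sum(score
                   for (r0, c0, r1, c1), score in self.frames
                   if r0 <= row <= r1 and c0 <= col <= c1)

    def zone_attribute(self, zone_pixels):
        """Average of the summed feedback over all pixels of a zone."""
        total = sum(self.value_at(r, c) for r, c in zone_pixels)
        return total / len(zone_pixels)
```

A zone that overlaps the regions where the catheter was detected in recent frames receives a high attribute value, which favours temporally consistent detections.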
To homogenize the range of the attributes before combining them, we map each attribute to a homogenized score through a Gaussian function whose parameters are determined by analyzing the ground truth (see Section 4 for details on the ground truth). The measure $L$ is the product of the six homogenized scores. The resulting segmentation $S$ is the zone in the hierarchy $H$ that maximizes the likelihood score:

$$S = \arg\max_{Z \in H} L(Z).$$
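The scoring and selection step can be summarized by the following sketch. It assumes that each homogenized score is an unnormalized Gaussian of the attribute value, so that every score lies in (0, 1], and that each zone is described by a dictionary of attribute values; the attribute names, the `(mean, sigma)` parameters and the function names are all hypothetical.

```python
import math


def gaussian_score(value, mean, sigma):
    """Unnormalized Gaussian: equals 1 at the mean and decays away from it."""
    return math.exp(-0.5 * ((value - mean) / sigma) ** 2)


def likelihood(attributes, params):
    """Product of the homogenized scores of a zone.

    `params` maps an attribute name to its (mean, sigma) pair.
    """
    score = 1.0
    for name, (mean, sigma) in params.items():
        score *= gaussian_score(attributes[name], mean, sigma)
    return score


def select_zone(zones, params):
    """Return the zone of the hierarchy with the highest likelihood."""
    return max(zones, key=lambda zone: likelihood(zone, params))
```

Because the measure is a product, a zone must score reasonably well on every attribute to be selected; a single score close to zero is enough to rule a zone out.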

Segmentation Quality Evaluation
For evaluation, we use a database of clinical images and the ground truth annotated by experienced human observers with support of a semi-automatic software.

Ground truth construction
The catheter appears as a low-contrast tubular structure in X-ray images. We use the centerline of the catheter as the reference to evaluate the performance of our segmentation. Human operators use an internally developed, image-similarity-based, semi-automatic curve tracking software to mark, track and correct the centerline; the resulting curve forms the ground truth for empty catheters in fluoroscopic images along the temporal sequence. For each frame in a sequence, the centerline is a curve in 2D image space, which is then sampled as a series of equidistant points $C_{gt} = (g_1, \ldots, g_m)$.

Segmentation
The automatically detected empty catheter is a cluster of curve blobs. As described in the previous section, a polynomial curve is fitted to these blobs and then sampled as a series $C_{seg} = (s_1, \ldots, s_n)$. The right column in Fig. 5(a) shows the estimated centerline. In $C_{gt}$ and $C_{seg}$, the sampling distance between two consecutive points is one-fourth of the radius of the catheter.

Evaluation measures
In this work, we want to evaluate our ability to locate the empty catheter. We quantify the proximity between two objects: the curve marked as ground truth and the cluster of curve blobs. This proximity is then analyzed using the precision and recall formalism. Precision is defined as the fraction of the detected catheter that is correct. As explained in Fig. 4(a), a matched detection is denoted as a true positive, emphasizing the fact that the segmentation algorithm has indeed found the catheter. An unmatched detection is denoted as a false positive, because the detected catheter hypothesis is incorrect. Similarly, recall is the fraction of the reference centerline (ground truth) that is explained by the detected centerline. Fig. 4(b) shows the matched reference (true positive), i.e. the correctly retrieved ground truth points. Such centerline-based evaluation methods are employed for the evaluation of road extraction algorithms in photogrammetry and remote sensing [17]. More precisely, for each image, we quantify the proximity between the series $C_{gt} = (g_1, \ldots, g_m)$ of ground truth points and the series $C_{seg} = (s_1, \ldots, s_n)$ of points extracted from the segmentation. To this end, we consider the minimal distance from a point $x$ to a series of points $C = (c_1, \ldots, c_\ell)$, defined as $\delta(x, C) = \min\{d(x, c_i) \mid i \in \{1, \ldots, \ell\}\}$, where $d$ is the Euclidean distance. Based on this measure, a point $s_i$ of the segmented catheter $C_{seg}$ is considered as correctly classified (true positive) when $\delta(s_i, C_{gt}) \leq \eta$, and a point $g_i$ of the ground truth is considered as correctly retrieved when $\delta(g_i, C_{seg}) \leq \eta$. The value of $\eta$ is based on the standard diameter of an empty catheter in the image plane (here $\eta = 24$ pixels, i.e. 4.8 mm). Thus, for each image, we compute the precision and recall as the fraction of segmented points correctly classified and the fraction of ground truth points correctly retrieved, respectively.
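The evaluation measure described above translates almost directly into code. The sketch below takes the two sampled curves as lists of 2D points and the tolerance $\eta$ in pixels; returning 0 for an empty detection is an assumption, since the text does not specify how frames without a detection are scored, and the function names are hypothetical.

```python
import math


def min_distance(point, curve):
    """delta(x, C): smallest Euclidean distance from a point to a point series."""
    return min(math.dist(point, c) for c in curve)


def precision_recall(c_seg, c_gt, eta):
    """Precision and recall of a sampled detected centerline against a
    sampled ground truth centerline, with matching tolerance eta."""
    if not c_seg or not c_gt:
        return 0.0, 0.0  # assumption: an empty curve scores zero
    matched_seg = sum(1 for s in c_seg if min_distance(s, c_gt) <= eta)
    matched_gt = sum(1 for g in c_gt if min_distance(g, c_seg) <= eta)
    return matched_seg / len(c_seg), matched_gt / len(c_gt)


precision, recall = precision_recall(
    c_seg=[(1, 0), (1, 10), (50, 50)],
    c_gt=[(0, 0), (0, 10), (0, 20), (0, 30)],
    eta=2.0,
)
```

In this toy case two of the three detected points lie within the tolerance of the ground truth, and two of the four ground truth points are explained by the detection, so the precision is 2/3 and the recall is 1/2.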

Results
Dataset: We evaluate our empty catheter segmentation algorithm using a dataset of 1250 fluoroscopic images. These 1250 fluoroscopic images belong to 10 sequences taken from examinations of 6 patients. These images were acquired at a frame rate of 15 fps. The images of angioplasty exams considered here exhibit large variability due to the patients' body mass index (BMI), noise levels, different anatomical backgrounds, and the occasional presence of pacing leads, stents, staples and sternal wires (see Table 1).
A small set of 30 images from 4 sequences (A1, B1, B2, C1) was used to tune the parameters during the development of the full algorithm. Once the development was completed, we built a large database of images with ground truth. Optimizing the $\alpha$ and $\beta$ parameters of the blob pair weight function (refer to equation 1) on 650 images (instead of 30) from 4 sequences (A1, B2, C1 and D1) slightly improves the results (3.45% recall / 6.20% precision). Our evaluation measure and the ground truth are used for this optimization step.
Results and discussion: The segmentation of empty catheters in images of different qualities and with different anatomical and interventional contents is shown in Fig. 5. In Fig. 5(a) and 5(b), the input image is on the left, and the middle image shows it overlaid with the selected cluster of curve blobs. The image on the right shows the curve fitted to the selected cluster of blobs; this curve is considered as an estimate of the centerline of the catheter. Fig. 5(b) shows empty catheter segmentation in the presence of other elongated interventional and anatomical objects. Fig. 5(c) depicts the results from three different patients, illustrating the potential of this method: the empty catheter is detected in spite of the presence of other elongated objects such as pacing leads. These fluoroscopic images also have disturbing anatomical contents such as the spine. Indeed, some internal structures of the vertebral bodies may resemble a catheter because of their contrast and curvilinear appearance.

We assess two versions of our automatic algorithm, with and without temporal feedback, using the evaluation measure defined above. The mean precision and recall without temporal feedback are 62.40% and 55.84% respectively. With temporal feedback, the mean precision and recall improve to 80.48% and 63.04% respectively. In a detailed per-sequence analysis (Table 1), we notice that in a few sequences the performance of our algorithm is hampered by factors such as the patient's body mass index (BMI), the catheter appearing over the spine, which makes it less visible, and several sections of the catheter lying in the field of view (FOV), leading to multiple apparent catheters (e.g. left and middle images in Fig. 5(c)). Low precision and recall were observed in sequences B2, D1 and E2 (in Table 1) due to multiple sections of the same catheter in the FOV and high patient body mass index.
In sequence B2, our proposed algorithm fails to identify the desired section of the catheter, the one with the tip, among the two sections. Fig. 6(a) depicts a frame from sequence B2, where the ground truth is marked in green and the detected catheter is marked in red. In sequence E1, however, the undesired section of the catheter (without the tip) was above the spine, making it difficult to detect. Hence, the precision and recall for this sequence are 96.54% and 71.10% respectively, corresponding to a successful segmentation of the desired section of the catheter with the tip. In sequence D1 (refer to Fig. 6(b)), the algorithm fails due to a high patient BMI of 39.7, resulting in a high noise level and a very low-contrast catheter. We also noticed that the position and orientation of the gantry affect the image quality and the performance of our algorithm. In further experiments, we observe that the precision and recall are stable when the parameters of the edge weights are changed within a range of ±20%, which is encouraging regarding the robustness of the approach: it does not depend on a very precise parameter setting. The average execution time per image is 0.68 seconds on an Intel® Core™ i7-4810MQ CPU. The software has good potential for further optimization.

In this article, we studied the challenging problem of detecting and locating the empty catheter in fluoroscopic images. To achieve our goal, we developed a novel structural scale-space in the form of a hierarchy of deep connected components, one of them being selected as the empty catheter. Our experimental results are very encouraging, showing that it is indeed possible to locate the empty catheter with good precision in such noisy images. However, additional experiments on a larger dataset are required to further estimate the quality of the segmentation. These results also open the door to PCI procedure modeling, since the empty catheter is an important landmark in these images.
Indeed, using a similar strategy in the scale-space framework, we aim to simultaneously detect other landmarks, such as the guide wire tip, marker balls, or balloons, that appear more clearly in the images. Therefore, future work includes the segmentation of these objects by handling them in a common scale-space framework. These segmentations can contribute to PCI procedure modeling.

Compliance with ethical standards
Conflict of Interest: The authors declare that they have no conflict of interest.
Ethical approval: For this type of study formal consent is not required.