An innovative and automated method for characterizing wood defects on trunk surfaces using high-density 3D terrestrial LiDAR data

We designed a novel method allowing to automatically detect and measure defects on the surface of trunks including branches, branch scars, and epicormics from terrestrial LiDAR data by using only high-density 3D information. We could automatically detect and measure the defects with a diameter as small as 0.5 cm on either oak (Quercus petraea (Matt.) Liebl.) or beech (Fagus sylvatica L.) trees that display either rough or smooth bark. Ground-based light detection and ranging (LiDAR) technology describes standing trees with a high level of detail. This provides an opportunity to assess standing tree quality and to use this information in forest inventory. Assuming the availability of a very high level of detail, we could extract information about the surface defects, mainly inherited from past ramification and having a strong impact on wood quality. Within the general framework of the development of a computing method able to detect, identify, and quantify the defects on the trunk surface described from 3D data produced by a terrestrial LiDAR, this study focuses on the relevance of the whole process for two tree species with contrasted bark roughness (Quercus petraea (Matt.) Liebl. and Fagus sylvatica L.) in terms of detection, identification of the defects, and comparison with measurements performed manually on the bark surface. First, a segmentation algorithm detected singularities on the trunk surface. Next, a Random Forests machine learning algorithm identified the most probable defect type and allowed the elimination of false detections. Finally, we estimated the position, horizontal, and vertical dimensions of each defect from 3D data, and we compared them to those observed directly on the trunk by an operator. The defects were detected and classified with a high accuracy with an average F1\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${F}_{1}$$\end{document} score (harmonic mean of precision and recall) of 0.74. There were differences in computed and observed defect areas, but a much closer agreement for the number of defects. The information about the defects present on the trunk surface measured from terrestrial LiDAR data can be used in an automated procedure for grading standing trees or roundwoods.


Introduction
Grading standing trees is an essential task in forest inventory and pre-harvest assessment (Fonseca 2005). Besides global attributes such as volume and curvature, local defects are among the most critical factors influencing quality and valorization of wood. These defects can degrade the mechanical performance for structural use or the aesthetics for interior or furniture uses. In both cases, they depreciate the economic value. One method used in research for evaluating standing tree quality is to climb on the tree using a ladder and record all the defects and their attributes such as position, dimension, and type (Colin et al. 2010b), but for inventory purposes, a visual inspection from the ground is generally achieved (Jourez et al. 2010). For grading roundwood, X-ray tomography can be used to measure the knots and other defects inside a log. Although this approach provides the best performance, its main drawbacks are the requirement for felling the tree and high investment costs (Thomas et al. 2006).
LiDAR (light detection and ranging) can measure an object in 3D dimensions using a technique emitting a laser beam and receiving the reflected light by a sensor. The data provided by LiDAR is a point cloud, which contains the three spatial coordinates of the intercepted points and additional characteristics such as reflectance or RGB values.
In forestry, Terrestrial LiDAR Scanning (TLS) can provide unprecedented detail information of trees or forest plots. In recent decades, several applications have been developed to take advantage of the TLS for replacing conventional methods for measuring forest inventory attributes. The first step is, however, the isolation of every tree or stem in the point cloud. The attributes of each tree are then estimated such as DBH (Simonse et al. 2003;Henning and Radtke 2006;You et al. 2016;Pitkänen et al. 2019), tree height (Popescu et al. 2002;Srinivasan et al. 2015) or both DBH and height (Hopkinson et al. 2004;Maas et al. 2008;Huang et al. 2011;Moskal and Zheng 2012;Olofsson et al. 2014), wood volume or biomass (Holopainen et al. 2011;Dassot et al. 2012;Yu et al. 2013;Bienert et al. 2014), and the attributes of stem quality such as curvature (Liang et al. 2014;Noyer et al. 2019), taper (Thies et al. 2004;Norzahari et al. 2012), and ovality . Plot attributes can be calculated from the TLS such as wood volume (García et al. 2011;Kwak et al. 2014), stem density (Watt and Donoghue 2005;Brolly and Kiraly 2009), canopy measurements (Zande et al. 2008;Seidel et al. 2015;Cifuentes et al. 2017), and leaf area index or leaf area density (Moorthy et al. 2008;Béland et al. 2011;Bailey and Mahaffee 2017;Taheriazad et al. 2019).
TLS has been increasingly used for characterizing individual standing trees, thanks to the high density of 3-dimensional point data. In particular, TLS allows an accurate reconstruction of the tree bark surface. Several studies have recently investigated the grading of standing trees or roundwood using TLS data by detecting defects on the reconstruction of bark. Most approaches used the difference between the surface of the trunk and a reference surface resulting from a primitive shape fitting, such as circle (Thomas et al. 2006(Thomas et al. , 2010 or cylinder (Schütt et al. 2004;Kretschmer et al. 2013;Stängle et al. 2013). However, because the cross-sections of a log are often not perfectly round, the primitive shapes hardly fit the real log surface. The methods based on primitive shape fitting result thus in detecting only large defects with a diameter above 12.7 cm and protruding more than 2.5 cm from the log surface (Thomas et al. 2006). Depending on species and tree age, the bark may have different structures (smooth or furrowed) that can interfere with the detection of defects. Therefore, an automated process is challenging to design, especially for small defects.
The earlier methods of estimation of tree quality using TLS were mostly semi-automatic (Schütt et al. 2004;Kretschmer et al. 2013;Stängle et al. 2013) and limited to large branch scar defects. Schütt et al. (2004) presented a semi-automatic approach to detect and classify wood defects using both range and intensity information of TLS data. The potential defects were firstly segmented by the region-based segmentation of the greyscale image derived from the distance between 3D point cloud of the log and the fitted cylinders. A manual adjustment can be done to make the final decision about these potential defects after classification by a neural network. Thomas et al. (2006Thomas et al. ( , 2010 proposed a method to automatically detect severe defects on red oak and yellow poplar using a multi-scale analysis of contours from 2D images derived from the distance between 3D points describing the log and the fitted circles. More recently, Kretschmer et al. (2013) used a 3D approach based on a distance map of the difference between the trunk surface and the reference surface obtained by fitting cylinders. The pseudo coloring of the distance map was a visual aid to detect and measure the branch scars on European beech within a diameter range from 4 to 18.2 cm. Stängle et al. (2013) used a similar method to estimate the knotty core size in European beech. The comparison with the X-Ray scanning results led to an agreement of 62.5% between both grading methods.
While branch scars constitute indisputably one of the most common defects on trunks with a strong impact on the wood quality, other defects, such as burl and picot (Colin et al. 2010a), cannot be neglected as mentioned in the standards (Carpenter and Jones 1989;AFNOR 1997;Jourez et al. 2010). For example, epicormics consisting of burl, picot, and bud cluster may degrade the quality of oak by at least one class (Meadows and Burkhardt 2001).
During earlier research, we successfully developed algorithms to detect surface defects with dimensions as small as 0.5 cm (Nguyen et al. 2016a) and then classified them by using a machine learning approach (Nguyen et al. 2020a). The features used by Random Forests included species, 3D Hu moment invariants (Hu 1962), 3D dimensions, eigenvalues, and mean of the roughness of the defect point cloud. The results obtained on trunks of several tree species including oak, beech, wild cherry, pine, fir, and spruce were rather promising with an average F 1 score (the harmonic mean of precision and recall) of 0.86. The results showed robustness to the tree species features, such as furrowed bark, age, trunk diameter, and shape. These promising results allow us to go to a step further to estimate defect attributes.
Indeed, besides type, dimensions of the defects have an important impact on wood quality. Thus, an accurate measurement of defects is important to estimate their impact. Moreover, dimensions of branch scars provide a prediction of the knot trace inside the wood (Wernsdörfer et al. 2005).
In the context of tree quality assessment, the first objective of this paper was to propose a method for automatically measuring the dimensions of the defects after detection (Nguyen et al. 2016a) and identification (Nguyen et al. 2020a). The second objective was to verify the efficiency of the whole data processing. For this, we compared the defect attributes automatically estimated with field measurements achieved by a forest expert. The third and last objective was to assess the computing time in relation to the main algorithm parameters.

Method overview
We developed an automated method consisting in three steps. The first step "Detection" aims to quickly find the smallest possible candidate defects on the mesh of the trunk. A threshold-based segmentation algorithm was developed to detect the occurrence of a defect based on the difference between the reference distance and the distance of 3D points to the trunk centerline. The reference distance to the centerline here represents the distance from the point to the centerline of an idealized trunk which would have no defects and obtained from the distribution of its neighboring points belonging to the elongated patch along the trunk axis. This first step was tested on different tree species (Nguyen et al. 2016a). The second step "Identification" has two main objectives: (1) to remove the false positives coming from the first step and (2) to classify the candidate defects into different types by their biological nature such as branch, branch scar, burl, and small defect (Nguyen et al. 2020a). The last step consists of estimating the position and dimensions of the identified defects, and this last part of the processing chain is thoroughly discussed in this paper.

Data collection and preprocessing
The assessment of the proposed approach required the availability of 3D point clouds datasets describing the intricate details of the trunks' surface as well as manual measurements superimposed over the 3D point cloud coordinate system.

Manual measurement of defects on standing trees
The objective of the manual measurement was to create a reference value of each defect attribute: type, position, and dimension(s). A measurement protocol was defined to facilitate the identification of the defects on the reconstructed surface of the trunk. The defects with a diameter greater or equal to 0.5 cm were described prior to carrying out the scans in terms of dimension, type, and position. The position was taken in an (O,l,z) coordinate system where h is the height and l is the arc length from a reference line to the defect. A measuring tape carefully adjusted over the trunk surface that best fit the visual projection of the trunk's centerline approximated the central axis of the trunk. The measuring tape divided the scanned portion of the trunk into two equal parts. It also allowed measurements of z coordinate. Two small markers (ping-pong balls or small pieces of paper) at 1-m height, the origin of the coordinate system (O), and 5-m height were in the scans and used as markers of the reference line to compute coordinates of the automatically detected defect. The area under evaluation was from 1 to 5 m in trunk length and a half of tree circumference.
After each defect type identification, the position of the defect was measured in the (O,l,z) coordinate system (Fig. 1a). The z coordinate was the position along the reference line. The l coordinate was the horizontal arc length between the tape and the defect. The measurement was carried out on both sides of the tape between − 1 4 circumference (to the left hand) and 1 4 circumference (to the right hand) at breast height and with the height z ranging from 1 to 5 m from the ground.
The measurements of defect dimensions depend on defect type. For burl and small defects, the measurements were the defect width w and height h (Fig. 1b). The width w was measured by the arc length between the leftmost and rightmost points, assessed visually, of the intersection between the defect and the bark (see Fig. 1b). The height h was measured by the difference between the maximum and minimum z coordinates of the intersection between the defect and the bark. For branch scars, in addition to width w and height h of the scar, the Chinese mustache width ( w k ) and height ( h k ) were also measured by the same method used to measure w and h, respectively (Fig. 1c). For branch type, including sequential and epicormic branches, the branch diameter was measured at a short distance from the trunk after the junction between the branch and the trunk (Fig. 1d).
The validation of the automatic measurement of defects involved six trees including three oak trees and three beech trees. We used 401 defects on 17 other trees (Table 1) including seven sessile oaks (Quercus petraea (Matt.) Liebl.), five European beeches (Fagus sylvatica L.), two wild cherries (Prunus avium (L.) L.), one Scots pine (Pinus Sylvestris L.), and two silver fir (Abies alba Mill.) to train the classifier.

TLS data collection and preprocessing
A Faro Focus 3D X130 scanned the sampled trees in the Champenoux forest close to Nancy, France (48° 43′ 30.1″ N, 6° 20′ 52.1″ E). The database consists of point clouds of trees of two species: sessile oak (Quercus petraea (Matt.) Liebl.) and European beech (Fagus sylvatica L.). In order to detect the small defects, we used a high-resolution setting and the distance from the scanner to the trunk was approximately 3-4 m. The angular resolution was 0.018° on both horizontal and vertical directions. Each tree was scanned from two points of view towards the target side to describe half of the trunk circumference. Scans of each tree were coregistered in FARO SCENE software (Faro Technologies Inc., Lake Mary, FL) and exported to a unique file to recover the 3D view of one side of the trunk.
The point cloud was preprocessed to enhance the outcome of the process. The noise was reduced by using an algorithm based on Euclidian distance clustering (Nguyen et al. 2020a).
Then the point cloud with its reduced noise was processed in the Graphite software (https:// gforge. inria. fr/ frs/? group_ id= 1465) including the following steps: (1) smoothing of the point cloud, (2) meshing of the point cloud, and (3) smoothing the resulting mesh. Additional details regarding the process were presented in Nguyen et al. (2020a)). The data that support the findings of this study are openly available in (Nguyen et al. 2020b).

Detection
The detection of the defects is based on an enhanced version (Nguyen et al. 2020a) of an initial segmentation algorithm (Nguyen et al. 2016a). The outer part of a defect is typically a more protruded region compared with the surface of the local area surrounding it.
First, we computed the centerline of the trunk using an adaptation of the method proposed in Kerautret et al. (2016) based on the detections of voxels (a division of point cloud  into regular size cubes) where directions of interior normals to the surface converge. Second, based on the computed centerline, we segmented the branches by dividing the point cloud volume into equal angular cylindrical sectors (pie slices) with the centerline is the axis of cylinders. Each angular cylindrical sector had height equals to L, and angle equals to L R where R is the radius of the trunk. The branch segmentation algorithm then divided the point cloud into two disjoint sets: the trunk set T and the branch set B. By choosing an appropriate value of L (L was set between 50 and 100 mm in our experiment), we could ensure that in each cylindrical sector, there was at least one trunk point, which was assumed to be the closest point to the centerline. We added these points to the set T; other points were added to the set B. Points in the set B with a distance below √ 2L to at least one point in the set T were moved from set B to set T. At the end of the algorithm, we obtained the set T containing all trunk points and a small number of branch points which would be segmented in the next step and the set B containing only branch points.
Third, the detection of smaller protruding defects in the trunk set T was performed as follows: for each point P in the set T, a narrow longitudinal patch consisting of neighbors of P was defined. The reference distance to the centerline of a point P was computed from a least-squares linear regression linking the radius variation to longitudinal positions of all points belonging to the patch (Nguyen et al. 2016a). We then computed for each point the difference (denoted as δ) between the distance of the point to the centerline and the corresponding reference distance to the centerline. An automatic thresholding method (Rosin 2001) processed the histogram of this δ for the whole set of trunk points to obtain the points of potential defects corresponding to the δ greater than the threshold. The potential defects were obtained by using the Euclidean clustering method (Rusu and Cousins 2011) on the union of these potential points of defects and the points of branches. The minimum distance used for the clustering method was set to 10 mm.

Identification
The identification (or classification) of defects is a critical step because the impact on wood quality depends on the defect type. Furthermore, the relevant characteristics of the defect are type-dependent. For example, the branch diameter was measured by the distance to the branch centerline, while the burl dimension was measured by the arc length between the leftmost and rightmost points. In a previous research (Nguyen et al. 2020a), we described in detail a classification method based on the supervised machine learning Random Forests (Breiman 2001) using a feature vector including the following elements: (1) species of the tree, (2) ratio between the number of points of the defect and the volume of its bounding box, (3) the seven Hu moment invariants (Hu 1962), (4) defect width, (5) ratio between defects width and height, (6) ratio between defects width and thickness, (7) mean of the difference between the distance and the reference distance to the centerline of all points of the defect, (8) standard deviation of the difference between the distance and the reference distance to the centerline of all points of the defect, (9) ratio between the eigenvalue of the radial vector and the eigenvalue of the longitudinal vector, (10) ratio between the eigenvalue of the tangential vector and the eigenvalue of the longitudinal vector, and (11) angle between the radial vector and the principal axis of the tree.
Random Forests was trained by 401 defects on 17 trees ( Table 1). In addition to defect type, a bark class was introduced in Random Forests. The bark class represents bark zones with a roughness above the local average which are not associated with a defect. Indeed, no size criterion was introduced for detecting a defect in the previous segmentation algorithm for exhaustivity purpose. The consequence was the detection of putative defects corresponding to a bark portion and the existence of such a class allowed the rejection of such false positives in the most refined analysis performed during the classification step. Finally, the different classes are as follows: (1) branch including sequential and epicormic branches; (2) branch scar; (3) burl; (4) small defects including picot, sphaeroblast, bud and bud cluster with less than 6 buds; and (5) bark.
After training, Random Forests classified the detected defects. For each candidate defect, its feature vector was computed and then passed to the Random Forests. Random Forests consists of a set of Classification and Regression Trees (Breiman et al. 1984). Each tree makes a classification, or in other words, gives a vote for a class. Random Forests chooses the class which is voted by the majority of trees.

Automatic estimation of dimensions
We tried to mimic as accurately as possible the approach applied when we manually measured the defects on the standing tree. Consequently, the measurement method depended on the defect type. We needed to represent the defect on an (O,l,z) coordinate system similar to what we used on the field. For the computational purpose, defect corresponded to a set of points D = {P(r, , z)} in a cylindrical coordinate system (O,r, , z) that can be converted to (O,l,z) coordinate system where l is the arclength and is the product of and r, respectively, expressed in radians and metrics unit. Figure 2 describes the conversion of a defect point P from Cartesian to a cylindrical coordinate system. The detail of the conversion is as follows. The intersection of the centerline of the trunk and the plane containing the marker point M (marked by the small ball on the bark) and this perpendicular to the centerline defines the origin O. The radial distance r is the distance from P to the centerline. The azimuth θ is the angle between the radial vector ����� ⃗ P ′ P and ������ ⃗ OM with P′ on the centerline, and P′P being perpendicular to the centerline. The height z is the distance between P′ and O along the centerline.

Measurement of the position of the defect and the dimensions of burls, small defects, and branch scars
In the cylindrical coordinate system, the following procedure allowed measuring the width w and height h of burls and small defects and the Chinese mustache width w k and height w k of the branch scar.
• Compute the minimum distance to the centerline r min of all the defect points. • Initialize the set I as empty: This set is used to store the closest points of the defect to the bark surface to avoid the positive error measurement due to shoots and leaves.
• For each point P of the defect points: Add P to set I if r P − r min < r t , withr t = 10mm ; The threshold r t allows simulating the junction between the defect and the bark as we do it manually on the field. In other words, we measured only the closest points of the defect to the bark surface. • Find the bounding box of the set I, represented by two points P 1 and P 2 . • Compute the center of the bounding box of I: • Compute the radius of the trunk at the defect position r trunk by using the mode of the distance to the centerline of all the points in a trunk segment of 5 cm length with P C as the center of the segment. • The coordinate z of the defect is the coordinate z of P C . • The coordinate l of the defect is l = P C * r trunk • Compute the width of the defect w (or w k )

Measurement of branch scar dimension w and h
A branch scar results in different shapes such as ellipse or Chinese mustache. Since the above-mentioned algorithm was expected to infer the dimensions of the internal knot from the externally visible branch scar, the dimension of the scar that was critical for estimating the knot diameter was more difficult to measure when it has a Chinese mustache shape. We proposed a simple method based on the variation of the height occupied by a given defect along the periphery of the trunk to detect and measure the branch scar. The principle is that a Chinese mustache has a significantly smaller height to the scar. Thus, on a profile of height along the arc length, the boundaries between the Chinese mustache and the scar can be detected by significant small local minima. The algorithm works with both ellipse and Chinese mustache shapes. The algorithm is illustrated in Fig. 3 and is described as follows. First, the defect points were converted from a cylindrical coordinate system into (O,l,z) coordinate system with l = * r trunk . Then, in the (O,l,z) coordinate system, the height profile (h(l)) of the defect was computed as the difference between the maximum and minimum z values of the defect as a function of the arc length l ∶ h(l) = z max (l) − z min (l) . Next, the profile was smoothed by an average of neighbors with a window of size 7 mm. Then, the profile is clipped at the first left ( M 1 ) and right local minima ( M 2 ) which have a height of less than 40% (empirically chosen) of the maxima (d). Finally, the scar width was computed by the difference between the arclength of M 1 and

Estimation of branch diameter
The estimation of branch diameter occurred when the classification step returned a branch as defect type. A branch is different from the other defect types because the diameter is not measured at the junction of the branch and the trunk. We considered that branches have a tubular shape, and two measurement methods were developed because of the unsatisfactory results obtained by the first one. The first method was to fit a cylinder to a portion of the detected branch, and from our experience, the frequent flexuosity of the branch impedes the fitting of a cylinder, the frequent noise in 3D data at the branch junction has also an impact, and a plausible initial diameter value is a priori required to ensure the success of the fitting procedure. The second method was to compute the centerline of the branch and compute the mode of the distance to the centerline of points in a portion of the branch close to the trunk. This second method was more robust and insensitive to the flexuosity and did not require initialization by a diameter value.
The following procedure describes the method based on the centerline. First, a portion of the branch corresponding to a distance between 3 and 8 cm from the trunk was extracted. Second, the computation of the centerline for the portion resulted from the same method as for the centerline of the trunk. Third, the distances of all points in the portion to the centerline were computed. And finally, the diameter of the branch was taken as twice the mode of the distance to the centerline.

Implementation of the software
The input of the software is a mesh derived from TLS data of the trunk (the preprocess and creation of the mesh were presented in the section "Preprocessing" or in Nguyen et al. (2020a). The program was implemented in C++ language using several libraries: DGtal (DGtal) to represent 3D volumic data and for the computation of the trunk centerline, OpenCV (Bradski 2000) for the machine learning, GNU GSL (Gough 2009) for the linear regression, and PCL (Rusu and Cousins 2011) for normal computation and clustering. The final objective of the project was to provide a graphical tool for grading trunks or logs. The graphic environment was provided by the platform Computree (http:// compu tree. onf. fr), an open-source software dedicated to processing LiDAR data in forestry. The graphical interface is currently available as a plugin of Computree and tested in a Linux operation system. A faster alternative, command-line mode, is also available for batch processing of multiple meshes. The source code of the command-line mode and sample data are available at the following GitHub repository: https:// github. com/ vanth onguy en/ trunk defec tclas sific ation. Table 2 shows a comparison of the detection results of the segmentation and classification algorithms with the reference data. The segmentation algorithm could detect most defects on the oak and beech barks, even small defects such as picots and sphaeroblasts, the smallest diameter of detected defects being 0.5 cm. Overall, the segmentation algorithm detected 134 out of 141 defects in the meshes of six trunks. For each trunk, the missed defects (false negative) were at most 3. As expected, the result of the segmentation algorithm contained many false positives (false identification) resulting from miss-identified bark portions. The number of false positives was 270, which was reduced to 69 after the treatment by the classification algorithm. The classification, however, removed also 19 correct detected defects. Visually, the type of defects resulting from the classification corresponded well to the protrusion on meshes of trunk (Fig. 4). The evaluation criteria were the precision, recall, F 1 score, and the average F 1 (see Appendix for the definition of these criteria). These scores were calculated for each defect type (Fig. 5). The average F 1 was 0.74. There were obvious differences in the F 1 score between the different defect types: the branch type with an F 1 score of 1.0, following by the burl type with an F 1 score of 0.75. The algorithm performed less well than for the Branch scar and Small defect with F 1 scores of 0.53 and 0.56, respectively.

Defect measurement
The measurement results were compared with the reference value (manual measurement) using the absolute difference between the results and reference values (Fig. 6). Overall, the algorithm achieved a good result with the median of the difference close to zero both for defect width (10.1 mm) and height (5.9 mm). The overall coefficient of variation for the absolute difference between the automatic measurement and the manual measurement of the defect width ( cv w ) and height ( cv h ) was 145% and 144% respectively.

Branch scar
The Branch scar type had a relatively compact height difference and a small median of 2.9 mm. The difference of the Branch scars width was, however, more disperse and with a larger median of 16.5 mm. The distributions of the plots were slightly skewed on the positive side, but there is a considerable part of the plots with values below zero. The difference between the automatic and manual measurements of the Chinese mustache width had a considerable dispersion due to the weak relief of these parts in the defects making them very challenging to quantify. The coefficient of variation cv w and cv h were 134% and 139%, respectively. The coefficients of variation of the Chinese mustache's width and height were 101% and 122%, respectively.

Burl
Burls were affected by the largest differences between the estimated and the observed value of width and height with medians of 24.4 mm and 25.1 mm, respectively. The distributions were relatively compact but situated mostly over zero. The coefficient of variation cv w and cv h were both 95%.

Small defect
The small defect type exhibited the most compact plots and had the smallest median width and height of 5.2 mm and 3.9 mm, respectively. The distributions of the width difference and height difference were similar. The positive median and the large part of the plots of small defects were above  zero suggesting that the algorithm overestimated mostly the diameter of the small defects. The coefficient of variation cv w and cv h were 127% and 135%, respectively.
The absolute error is smallest for small defects and largest for branch scars. The coefficient of variation for small defects was, however, the largest due to its small mean (width mean of 18 mm and height mean of 13 mm).

Branch diameter measurement
Our test database mainly focused on naturally pruned parts of the trunks, where only two branches were measured: one in an oak tree and another on a beech tree. The estimated diameters of the branches were 8.3 mm (oak) and 20.3 mm (beech) compared with field measures of 8 mm and 20 mm, respectively. The reconstruction of the two branches is illustrated in Fig. 7.   Fig. 6 Violin plots of the differences between the estimated and the manual measurement of the defect dimensions for different defect types. For each plot, the dash lines show the quartiles and median with the latter accompanied by the number

Beech 102
Oak 411 a) b) Fig. 7 Illustration of branch measurement by computing its centerline. The reconstruction of the branch from the centerline and estimated diameter is shown in purple. The defects are highlighted according to their type: is Branch type including both sequential and epicormic branches. is Branch scar, is Burl, and is Small defect including bud cluster, sphaeroblast, and picot

Overall species effect
The distribution of the differences was also different between the two species (Fig. 8). The violin plot of the width of defects on oak was comparatively shorter which suggests that there was a good agreement between estimated and observed widths. The lower quartile was above 0 meaning that the algorithm overestimated the defect width for the oak trees. The violin plot of width on the beech trees was much taller which suggests that the agreement between estimated and observed widths was not as good as for oaks. The algorithm tended to underestimate large defects. The violin plots of the difference between the estimated and observed heights were shorter than the one of widths both on oak and beech. The coefficient of variation of the absolute difference between the estimated and the observed defect dimensions was smaller on oak than on beech trees: the cv w and cv h of oaks were respectively 104% and 105% compared with the cv w and cv h of beeches were respectively 147% and 165%. The absolute error of the measurement of defect height was smaller than that of defect width. The coefficient of variation of the height measurement was however smaller because most defects displayed smaller heights than widths.

Missing and false identifications
The classification algorithm missed 26 out of 141 defects in total. The algorithm missed only small defects (Fig. 9). The largest missing defects had a width of 40 mm and 73% of the missing defects had a width equal to or less than 20 mm. All missing defects had a height smaller than or equal to 30 mm (showed in the histogram as from 30 to 40 mm). The classification resulted in 69 false identifications (false positives). Most false positives were relatively small; 80% had a width that was less than 60 mm and a height that was less than 30 mm (Fig. 10).

Processing time
The running time was recorded by the "time" program provided in the GNU Linux operation system. The experiments were carried out on a laptop running GNU Linux with a 4th generation 2.7 GHz Intel Core i7 processor, 32 GB of RAM, and a Quadro K610M graphics card. Figure 11 shows the running time of the program for 6 trunks and different settings of the main parameters, which impact the running time of the different algorithmic steps. They are the number of points (vertices) in the meshes, the voxel size in the centerline computation, and the patch size in segmentation. If other parameters were fixed, the general trend is that the execution time increased when the number of points increased (Fig. 11a). However, the factors that had the strongest impact on execution time were the voxel size and the patch size of the segmentation algorithm. That was because the request for neighbors for the computation of reference points was done using a KD Tree (Friedman et al. 1977) with a complexity of O(n 2∕3 + k) with n is the number of points in the point cloud and k is the number of returned points. And this request was done for each non-empty voxel; thus, the algorithm is bounded by complexity of The execution time is almost inversely proportional to the square of the voxel size (Fig. 11b). As the voxel size increased from 1 to 3 mm, the execution time decreased from 6 to 10 times. The last main factor that had an impact on the execution time was the patch size. In our experiments, the execution time was proportional to the patch size. When we doubled both width and height, the execution time was almost doubled. In our experiments, the voxel size of 3 mm gave results that were as good as of the 1 mm voxel size but reduced the running time between 6 and 10 times (Fig. 11b). The choice of other parameters such as patch size and density depends on the characteristics of the tree and the defects on the trunk. In general, the patch size must be large enough to ensure the assessment of the local reference distance and thus be greater than the general defect height, a furrowed bark requires longer patches. The influence of these parameters on the performance of the segmentation was discussed in detail in Nguyen et al. (2016b).

Defect detection and identification
The automatic detection of defects on the surface of the trunk using TLS data is still a real challenge. Despite having at disposal an efficient method, many factors could impact the results such as species and age of the tree, global shape of the trunk, resolution, and quality of the scan. The species and age of the tree have a direct impact on the structure of the bark passing from smooth to furrowed as the tree ages or as we move from the upper to the bottom part of the trunk. Good quality and high-resolution data are required for the detection of small defects. In our experiment, for detecting defects with a diameter of 0.5 cm a resolution of at least 20 points per cm 2 was required. From the results, we can say that our algorithms had an excellent performance in detecting even small defects. The detection outcomes of the segmentation algorithm were very good with only 7 missed defects out of 141 in total. Two reasons enabled the good detection rate. The first is that we have a very good and robust estimation of the trunk centerline. The second reason comes from our patch-based approach to estimate the local reference distance to the centerline for each individual point in the mesh of the trunk. The efficiency of estimation allows detecting even slight modifications of bark roughness. The smallest diameter of defects detected by our algorithm was 0.5 cm outperforming all previous studies with size ranging from 7.5 cm (Thomas et al. 2006) to 2 cm . Moreover, our method takes all types of branching defects into account whereas the previous works focused only on branch scars. The detection results, however, contained 270 false negatives for 141 manually detected defects. The classification algorithm allowed removing 201 of these false negatives. The most critical challenge in the classification of the defects would be to reduce the number of false positives. For example, the precision of the classification of branch scars and small defects was low because of the false positives, in turn, due to the miss-classification of bark portions as branch scars (Fig. 5). They indeed directly affect the result of the assessment method. As discussed in Nguyen et al. (2020a), the performance of the classification might be improved by adding in the learning data set more defects having a large variability in shape, but also by defining several classes for a same biological defect taking size range into consideration for a better analysis of the performance.

Quantification
Since the defect size was estimated depending on the defect type, we will discuss first the performance of the defect size estimation by type. Second, we will discuss the robustness of our algorithms to species. Third, we will discuss the potential utilization of our results in grading standing trees and roundwoods. Finally, we will discuss some limits of our approach and the TLS technology.

Comparison between defect types
Branch scar Due to the particularity of branch scar shape, for inferring the knot size, we need the information about the scar seal width and height, and the Chinese mustache width and height. The scar width and height were estimated with a median of 16.5 mm and 2.9 mm respectively. Compared with the burl, the branch scar had a better distribution of the differences between the automatic and manual measurements. Nevertheless, two or more branch scars were rather frequently connected on a beech trunk. They were recognized by our program as one branch scar. This phenomenon denoted as under-segmentation led to a large error in the estimation of the scar dimension that appeared in the violin plots with a difference greater than 300 mm in scar width. This kind of under-segmentation could be detected by a more detailed analysis of the shape of the detected defect and future developments could be imagined solving this particularity. The performance of our algorithm on the measurement of scar height was similar to the result of Kretschmer et al. (2013) with 56% and 58% of measured scar height having a difference of less than 1 cm respectively.

Burl
In our experiments, burls were only present on oak trunks. Our algorithm often overestimated their diameters. The algorithm overestimated 95% of both width and height with medians of the error equal to 24.4 mm and 25.1 mm respectively. Our sample oak trees had a furrowed bark resulting in small bark portions included in the detected defects (Fig. 12). This difference between the human and machine visions of the relief resulted in detected defects were often larger than the defects observed on the field.

Small defect
The dimensions of small defects were also often overestimated with a rate of 73% for width and 76% for height. Because these defects have small dimensions, the relative measurement error was usually high. For example, the coefficient of variation of the absolute error between the manual and automatic measurement for width and height were 127% and 135%, respectively. However, from a wood quality point of view, the impact of these small defects is mainly related to their number independently of their actual size. For example, the standard NF EN 1316-1 (AFNOR 1997), treats a picot as a defect of diameter 5 mm.
Branch diameter The estimation of branch diameter was accurate, even though the result depends on the quality of the scan. The measurement of branch diameter was based on a reconstruction of the branch by a computation of the branch's centerline. The reconstruction was very robust (Fig. 7). The branch on the Beech 102 (Fig. 7a) was highly curved, and there was noise at the junction between the branch and the trunk. The reconstruction of the branch was close to the real surface and shape of the branch. The scan of the branch on the Oak 411 (Fig. 7b) was very noisy. The branch looked thus much larger and flatter than the real branch due to noisy contour. The algorithm estimated, however, a very good diameter of 8.3 mm compared with 8 mm from the manual measurement. Besides accuracy, the approach based on the branch's centerline had the advantage of having fixed parameters: the accumulation radius (see Nguyen et al. (2016a) for the details of this parameter), the skipped distance from the trunk bark, and the length to estimate the diameter.
It is also possible to estimate the insertion angle, i.e., the angle between the principal axis of the branch and the axis of the trunk, with the branch reconstruction approach using the centerline. This information could be used to distinguish between a sequential branch and an epicormic branch since the former is more fastigiated. This angle could be also used to predict the knot diameter and direction inside the trunk wood.

Comparison between oak and beech
Our results showed that the method can work on both oak and beech which are the most common broadleaved species in Europe. There were, however, differences between the two species (Fig. 8) due to the difference in the bark, global shape of the tree, and defect types. The oak characteristics are a furrowed bark with often epicormic defects such as burl and picot. The branch scars on oak, however, are difficult to be detected from external characteristics of the bark because the scar tends to vanish with time, evolving towards a normal aspect of rhytidome. On the opposite, for beech, the smooth bark reveals easily even the very small and weak relief defects. And consequently, it is also very easy to get false detections. These false detections are more difficult to remove in comparison with the false detections occurring on oaks because they are very similar to old scars of early pruned branches ridging the bark surface.
Beech bark often has anomalies due to Nectria (Nectria coccinea Desm.). This disease caused by fungi and is one of the most common beech bark diseases at least in Northern France. The irregularities of the bark associated with the Nectria disease are mostly superficial and have little or no impact on wood quality (McCullough et al. 2005). These irregularities, however, were often miss-identified as branch scars by our algorithms. It might be possible to remove these false identifications in the classification step by adding a dedicated class for damages due to Nectria.

Potential application to grading roundwoods and standing trees
The results obtained in this study show clearly that the accuracy of defect measurements with TLS technology can be used to assess the wood quality of standing trees or roundwoods. It turns out that the use of the information delivered by our algorithm, such as position, type, and dimension of Fig. 12 The difference between the manual and the automatic measurement of the burl diameter. The automatic measurement often leads to larger values than the manual one defects, makes the automated grading of roundwoods or standing trees possible by standards such as NF EN 1316-1 (AFNOR 1997) grading oak or beech logs and Jourez et al. (2010) for grading broad leaf standing trees. However, we observed that grading a class A log is rather challenging whether by an automatic algorithm or by an operator. That is because this class tolerates no defects on a beech log or just a small defect such as picot or sound knot, which has a diameter of less than 15 mm. An error of the algorithm or an operator could lead to a miss evaluation from class A to B or vice versa. For class B, these standards use the sum of the defect's diameters for grading oak and beech logs. However, as shown above, the algorithm gave results often larger than the results from a manual evaluation. For the sum of diameters, the automatic measurement is about 50% larger than the manual measurement on oak, which might lead to an under evaluation of quality compared with a manual evaluation, and must be corrected by improving the method or by inter-calibration like it is the rule when changing a measurement tool.
The method still has limits as the underestimation of branch scar edge due to the insufficiency of the relief and the under-segmentation of defects. The under-segmentation of defects may affect both the grading of standing trees and roundwood. For example, an under-segmentation might count two defects as one, which results in a class B log graded as a class A. Thus, further analysis is needed to detect under-segmentations and then correct them.
These results need to be improved through the different possibilities already mentioned. Furthermore, the two methods mentioned are based on the evaluation of external defects, the correctness of the assessment needs thus to be validated by a measurement of knot inside the wood, and the coupling between the outside and internal characteristics of the defect can be improved by building a learning database for classification making use of internal information coming from X-ray CT scanning.
An important point in grading log quality is the identification between a sound knot and an unsound knot. To our knowledge, there was no research about this aspect of using TLS. For a completed automatic procedure of grading log quality using TLS, a study of the feasibility of the identification of these two-knot types is still necessary.
The last aspect that we want to compare to a conventional procedure for grading standing trees such as Jourez et al. (2010) is the processing time. As showed in the section "Results," the required time for running our algorithms was small if we choose an optimized parameter, in particular, using the voxel size of 3 mm. The execution time for a nearly 2 million vertices mesh was from 2 to 3 min. However, if we count the time for acquiring and preprocessing the TLS data for a tree, this could be as long as 2 h. The overall time required to process a tree by using TLS data was longer than the field procedure used in research, with a duration around 30 min, and longer than the qualification duration of a tree by an expert which lasts a couple of minutes. Although the human vision is known to be better than the machine vision for an object recognition task, an operator, however, could make errors and inconsistent results depending on the emotion and working condition. Furthermore, digital information or its synthesis resulting from a computer program, however, often gives a reproducible result, and a significant advantage will be the availability of detailed information along the commercial and processing chain for traceability and optimization purposes. Moreover, with the future advance of technology, we could incrementally improve the accuracy and reduce the acquisition and processing times.

Limitations of TLS
The TLS has several limitations such as the occlusion by branches, moss, or ivy and the reduction of the spatial resolution in the high parts of tree trunks. A small cluster of moss can also look like a picot or a sphaeroblast leading to an increase in the number of detected defects. In this work, the choice of TLS data was justified as a reference method for getting detailed and accurate 3D information on standing trees, but the processing method is independent on the origin of the 3D cloud, and how can we speed up the 3D data acquisition keeping a relevant level of details, is the task we must and can address thanks to the present work. Furthermore, we only use geometrical information, in our approach, but other data, such as distance-normalized reflectance or color, could be used in addition to improving the classification.

Conclusion
In this paper, we presented the final evaluation of a method consisting of a suite of algorithms to measure the defects of the trunk surface from only 3D information contained in TLS data. This final evaluation depended on the success of the three different steps involved in the process. The detection algorithm was very performant despite it relies only on a geometrical analysis targeting local protuberance. This geometrical analysis was performed with high-density 3D data, and such density requires special conditions for the acquisition, and the question of the minimal defect size detectable with lower point densities is still opened. Nevertheless, the challenge achieved by this step is to be enough sensitive to avoid forgetting defects, the counterpart was the detection of many small singularities which are not defects but bark irregularities. This is the function of the Random Forests classification step to re-examine all detected singularities and to decide, which is the most probable class of defects it corresponds to. The classification was rather successful and failed mainly for small defects but as a supervised method it is highly dependent on the quality and quantity of data in the learning database, and some improvement of the current database could be looked by structuring it not only through the biological definition of the defects, but also through size criteria, and by additional data. Then, the last step looking for characterizing each defect according to its type by relevant dimensional characteristics gave encouraging results. Nevertheless, for some flat defects like old branch scars, some inconvenience is due to the definition of the defect borders coming from their different perceptions by the algorithm or by a human operator.
Despite possible improvements, the results presented in this paper demonstrates the theoretical feasibility of using detailed 3D information for looking for rather small defects complementing the trunk shape criteria to enhance the automation of grading standing trees or roundwood. Nevertheless, these results also questioned the role of the bark in industrial processes and the useful information about quality it contains.

Evaluation criteria for detection and classification
The criteria used to evaluate the performance of the detection and classification were the F 1 score (F-measure), which is the harmonic mean of the precision (P) and the recall (R). The F 1 score of the classification result for each type is computed from the precision and the recall as follows: where TP, FP, and FN are true positive, false positive, and false negative respectively. Their definition is as follows: • TP is the number of actual defects correctly classified as defect. • FP is the number of non-defects incorrectly classified as defect. • FN is the number of actual defects incorrectly classified as non-defect.
The overall F-measure ( F 1 ) is computed from the averaged precision ( P ) and recall ( R ) (Manning et al. 2008) by the following equations: where TP i and FP i are the true and false positive of class i (defect type i) respectively.