Comparison of two atmospheric correction methods for the classification of spaceborne urban hyperspectral data depending on the spatial resolution

ABSTRACT For remote-sensing applications such as spectral classification or identification, atmospheric correction is a very important pre-processing step, especially in complex urban environments where many phenomena alter the shape of the signal. The objective of this article is to compare the efficiency of two atmospheric correction algorithms, COCHISE (atmospheric COrrection Code for Hyperspectral Images of remote-sensing SEnsors) and an empirical method, on hyperspectral data for classification applications. Classification is carried out on several simulated spaceborne data sets with different spatial resolutions (from 1.6 to 9.6 m). Four classifiers are considered in the study: k-means, a Support Vector Machine (SVM), and a sun/shadow version of each of them, which processes sunlit and shadowed pixels separately. Results show that the most relevant atmospheric correction method for classification depends on the spatial resolution of the processed data set. While the empirical method performs better on high-resolution data sets (by up to 4%), its advantage fades as the spatial resolution becomes coarser, especially at the coarsest spatial resolution, where COCHISE can be 10% more accurate than the empirical method.


Introduction
During the last century, urban areas grew to such an extent that more than 50% of mankind now lives in cities (Chen et al. 2013). These areas are complex and dynamic ecosystems that consume a huge amount of energy and materials on a daily basis. In return, they produce an excessive volume of waste, heat, and pollutants, which represents a critical environmental issue. Consequently, numerous sustainable development programmes, which need an ever-increasing mass of information, have emerged. This need can be efficiently fulfilled by Earth observation technologies such as spaceborne remote sensors, which are able to gather quickly and recurrently a large quantity of image data usable in many applications: air quality control, ground cartography, material ageing monitoring, and vegetation biodiversity characterization. Indeed, optical remote sensing has proved to be a powerful tool for conducting urban studies. Multispectral high spatial resolution sensors are very efficient for the detection of urban objects and for the characterization of their size and shape. Agile sensors such as Pléiades (de Lussy et al. 2004) can even acquire images in stereoscopy and tri-stereoscopy in order to produce digital elevation models that characterize the three-dimensional structure of the ground. The high spatial resolution of these data sets makes it possible to perform object-based classification through segmentation and object feature (statistical, geometric, or contextual) processing, which makes urban mapping and planning applications (such as territorial urbanization study (Dupuy, Barbe, and Balestrat 2012) or vegetation spread characterization (Landry and Chakraborty 2009)) easier and more accurate.
However, multispectral imagery is limited with regard to spectral analysis. Its low spectral resolution does not allow discrimination among the large variety of urban materials. Hyperspectral imagery, which is characterized by a very high spectral resolution and whose spatial resolution tends to improve (Briottet et al. 2011), has proved to be a promising tool to overcome this limitation. Herold, Gardner, and Roberts (2003) showed that hyperspectral sensors were better at characterizing soils, materials, and vegetation than multispectral sensors. Platt and Goetz (2004) and Tan and Wang (2007), for their part, highlight the benefits of hyperspectral data sets for the classification of urban surfaces. Several other authors, including Weng and Quattrocchi (2007), also insist on the benefit provided by hyperspectral imagery for the characterization of impervious surfaces. Currently, two new spaceborne hyperspectral mission programmes are under study: HYPXIM (Briottet et al. 2011) for 'HYPer Spectral IMagerie à haute résolution et grand champ' and SHALOM (Dor, Kafri, and Varacalli 2014) for 'Spaceborne Hyperspectral Applicative Land and Ocean Mission', with a spatial resolution of, respectively, 8 and 10 m. Thus, there is a real interest in evaluating the potential benefits of these missions for the study of urban areas.
One of the key points in achieving these applications is the atmospheric correction phase, which aims to retrieve, from the at-sensor radiance, the reflectance associated with the targeted surface, which is by nature independent of the irradiance conditions on the one hand and of the environment topography on the other hand. Different correction methods exist. ATREM (Goetz et al. 1997) for 'Atmospheric REMoval program' and ACORN (Miller 2002) for 'Atmosphere CORrection Now' assume a flat homogeneous ground with a Lambertian surface. FLAASH (Cooley et al. 2002) for 'Fast Line-of-sight Atmospheric AnalySis of Hypercubes' and COCHISE (Miesch et al. 2005) for 'atmospheric COrrection Code for Hyperspectral Images of remote-sensing SEnsors', for their part, consider a heterogeneous background. The main drawback of these four methods is that they perform atmospheric correction without accounting for the three-dimensional structure of the ground, which may introduce artefacts due to slopes and shadows. SIERRA (Lenot, Achard, and Poutier 2003) for 'Spectral reflectance Image Extraction from Radiance with Relief and Atmospheric correction' bypasses the slope issue by using a digital elevation model (DEM), but only ATCOR4 (Richter and Schlapfer 2002) for 'Atmospheric and Topographic CORrection' and ICARE (Lacherade et al. 2008) for 'Inversion Code for urban Areas Reflectance Extraction' also take shadows into account. Yet, these algorithms are very expensive in terms of computation time, and they require accurate elevation data, which may be unavailable. Considering these two drawbacks, Chen et al. (2013) introduced a faster, classification-orientated method (spectrum identification is not possible) that is able to compute reflectance in shadow areas using only the at-sensor radiance image and the atmospheric and viewing conditions, based on empirical assumptions.
The objective of this work is to compare the classification performances obtained on spaceborne hyperspectral data sets simulated at different spatial resolutions and corrected by two atmospheric compensation methods: COCHISE, a flat ground algorithm, and the empirical method developed by Chen et al. (2013), which takes shadows into account. These two atmospheric compensation methods are very different in essence. The first one is a physically based tool which is very similar to ATCOR or FLAASH. The second one keeps the flat surface assumption for the atmospheric compensation but adds some empirical hypotheses to retrieve the reflectance in shadowed areas.

Atmospheric correction methods
The framework used for both atmospheric correction methods is described in this section. This work only considers the reflective domain (from 0.4 to 2.5 µm).

Radiative transfer framework
The reflectance is a unitless, wavelength-dependent parameter representing the ratio between the radiance reflected by a surface and the irradiance incident on this surface. Assuming a Lambertian surface and given illumination conditions, the reflectance value ρ(x, y) of a pixel (x, y) can be formulated through Equations (1) to (3), where R_tot is the radiance measured by the sensor, R_env is the portion of R_tot coming from the neighbourhood of the target, and R_atm is the portion scattered by the atmosphere without interacting with the ground. I_tot, the total irradiance incident on the target pixel, includes four components:
• I_dir: the photons hitting the target directly from the sun, without any interaction with the atmosphere.
• I_dif: the photons scattered in the atmosphere at least once before hitting the target.
• I_coup: the photons that make at least one round trip between the ground and the atmosphere before hitting the target.
• I_refl: the photons hitting the neighbouring 3D structure at least once before hitting the target.
Finally, τ↑_dir is the direct upward transmission. Most of these terms are illustrated in Figure 1.
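The relation between these terms can be illustrated with a minimal Python sketch. It assumes the usual Lambertian relation, in which the radiance reflected by the target and transmitted directly to the sensor equals ρ · I_tot · τ↑_dir / π, so that subtracting R_env and R_atm from R_tot and rescaling yields the reflectance. The function names, argument names, and numerical values are hypothetical and serve only as an illustration.

```python
import numpy as np

def total_irradiance(i_dir, i_dif, i_coup=0.0, i_refl=0.0):
    """Sum of the four irradiance components reaching the target."""
    return i_dir + i_dif + i_coup + i_refl

def lambertian_reflectance(r_tot, r_env, r_atm, i_tot, tau_dir_up):
    """Reflectance of a Lambertian target from at-sensor radiance terms.

    Assumed relation (not copied from the article):
        rho = pi * (R_tot - R_env - R_atm) / (I_tot * tau_dir_up)
    All inputs may be scalars or NumPy arrays (per band and/or per pixel).
    """
    r_direct = r_tot - r_env - r_atm  # radiance coming straight from the target
    return np.pi * r_direct / (i_tot * tau_dir_up)

# Round trip on arbitrary values: simulate R_tot for a known reflectance,
# then recover that reflectance.
rho_true, tau = 0.3, 0.8
i_tot = total_irradiance(800.0, 150.0, 20.0, 30.0)
r_env, r_atm = 4.0, 12.0
r_tot = rho_true * i_tot * tau / np.pi + r_env + r_atm
rho = lambertian_reflectance(r_tot, r_env, r_atm, i_tot, tau)
```

Because every operation is element-wise, the same function applies unchanged to a whole radiance cube as long as the array shapes broadcast.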
In cities where the number, the size, and the density of buildings imply the omnipresence of canyon-like structures, shadows constitute a major issue, especially for applications such as classification where a strong variability of spectra can lead to high confusion rates and an overestimation of the number of classes (Lacherade et al. 2008). For example, in an urban data set acquired over downtown Toulouse, France, in winter and early in the morning, we identified as many as 33% of shadowed pixels. In order to highlight the interest of correcting the effects of shadow in a reflectance retrieval process, we compared two algorithms. The first one, COCHISE, does not distinguish between shadowed and sunlit areas, while the second one, an empirical method developed by Chen et al. (2013), processes the two kinds of area separately.

COCHISE
COCHISE considers a flat and heterogeneous ground in order to compute the reflectance value of a pixel, which means, on the one hand, that the I_refl term of Equation (3) is ignored and, on the other hand, that I_dir and I_dif are considered constant over the whole scene. This method is used as a reference to evaluate the benefit of the Chen et al. (2013) method. Starting from Equation (2) and Miesch et al. (2005), we obtain Equations (4) and (5), where τ↑_dir and τ↑_dif are, respectively, the direct and diffuse components of the upwelling transmission, V(x, y) is the neighbourhood of the target pixel (x, y), S is the atmospheric spherical albedo, and the quantity defined by Equation (5) is a mean reflectance value associated with the Earth-atmosphere coupling effect. The G and F environment functions stand for the probability that energy reaching the target through this effect comes from the neighbour (u, v). These functions are computed using a Monte Carlo algorithm (Miesch et al. 2000). The atmospheric parameters involved in Equations (2), (4), and (5), that is, R_atm, τ↑_dir, τ↑_dif, I_dir, I_dif, and S, are computed using the radiative transfer code MODTRAN, for 'MODerate resolution atmospheric TRANsmission' (Berk, Bernstein, and Robertson 1989), given other parameters such as aerosol type and abundance, molecular atmospheric profile, water vapour content, and illumination conditions.
The inversion process used to obtain the reflectance ρ(x, y) is based on an iterative algorithm. At first, the reflectance associated with the environment of the target is considered equal to the reflectance of the target. The next reflectance value is then computed from this environment estimate. Usually, two iterations of the algorithm are enough to reach convergence.

Empirical method
The proportions of each radiative term in Equation (3) are very different depending on whether the target is in a sunlit area or a shadowed one. Consequently, Chen et al. (2013) proposed to use different estimation methods in order to retrieve the reflectance in each area type. The framework of Chen's method is presented in Figure 2. The algorithm has been improved so that the atmospheric terms are no longer computed with 6S (Vermote et al. 1997) but with MODTRAN (Berk, Bernstein, and Robertson 1989), which is better suited to the processing of hyperspectral data owing to its finer spectral resolution.

Shadow mask generation
The first step consists in the determination of a shadow mask based on the automatic shadow detection algorithm proposed by Nagao, Matsuyama, and Ikeda (1979). This choice was made according to the comparison of shadow detection methods carried out by Adeline et al. (2013b). The purpose of this algorithm is to apply an automatic histogram thresholding process to a linear combination of the bands R_λr, R_λg, R_λb, and R_λnir, which are, respectively, the radiance values associated with the red, green, blue, and near infrared channels of an image. Then, a final optional step has been added for images where water is present. Indeed, the Nagao algorithm usually considers water pixels as shadow pixels due to their low reflectance. To solve this problem, another histogram thresholding process is applied to the shadow pixels detected by the first one, this time only on the green channel, which has shown the best ability to discriminate between shadow and water. The shadow mask is a binary array M with M = 0 for a shadowed pixel and M = 1 for a sunlit one. Figure 3 shows an example of shadow extraction for an area located in the centre of Toulouse and including several buildings and a waterway.
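The thresholding step can be sketched as follows. The text does not state which automatic thresholding rule is applied, so this sketch swaps in Otsu's method as one common choice; this is an assumption, not the rule used by the authors. The band combination is not reproduced here either: the function takes an already computed brightness index image (hypothetical name `index_image`) and returns a mask following the convention M = 0 for shadow and M = 1 for sunlit.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Threshold maximizing the between-class variance of a 1D sample (Otsu)."""
    hist, edges = np.histogram(values, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(hist)                    # number of samples up to each bin
    w1 = w0[-1] - w0                        # number of samples above each bin
    s0 = np.cumsum(hist * centers)          # cumulative sum of bin values
    m0 = s0 / np.maximum(w0, 1)             # mean of the lower class
    m1 = (s0[-1] - s0) / np.maximum(w1, 1)  # mean of the upper class
    between = w0 * w1 * (m0 - m1) ** 2      # between-class variance (unnormalized)
    return centers[np.argmax(between)]

def shadow_mask(index_image):
    """Binary mask M: 0 for shadowed pixels (dark), 1 for sunlit pixels."""
    t = otsu_threshold(index_image.ravel())
    return (index_image > t).astype(np.uint8)

# Toy scene: left half dark (shadow), right half bright (sunlit).
toy = np.hstack([np.full((4, 4), 0.1), np.full((4, 4), 0.9)])
mask = shadow_mask(toy)
```

The optional water refinement described above would apply the same kind of thresholding a second time, restricted to the pixels where M = 0 and using the green channel only.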

Irradiance characterization
In sunlit areas, direct and diffuse terms account for at least 95% of the total irradiance (Chen et al. 2013), which means that I_tot can be reasonably approximated by these two terms. In shadow areas, however, the direct irradiance is absent and the total irradiance is mainly composed of diffuse and reflected irradiance. In this case, the diffuse term may be reduced by the proximity of buildings. The coupling irradiance term is neglected in both cases, which leads to two approximated expressions, one per area type, where α_sky ∈ [0, 1] is a factor accounting for the fraction of sky viewed from the ground. For example, a pixel located in a perfectly flat area will have a sky view factor equal to 1. Sky view factors can be computed using either a DEM or fish-eye pictures (Gal et al. 2007). However, the purpose of Chen's method is to correct shadow effects without elevation data. Therefore, a mean α_sky value of 0.75 is used, according to the study by Gal et al. (2007), which asserts that most of the sky view factors found near typical urban 3D structures belong to [0.7, 0.87]. The reflected irradiance, which depends on the 3D surface and the reflectances located in the neighbourhood of the target, is very difficult to estimate accurately without a DEM. This term can be neglected in sunlit areas, where it represents a minor percentage of the total irradiance, but not in the shadowed regions. In this method, I_refl is estimated as the sum of direct and diffuse irradiances weighted by a factor κ representing the effects of the 3D structure and by a mean reflectance ρ̄ which stands for the reflecting surface around the target. In the Chen et al. (2013) method, κ is set to 0.2 for cluttered urban environments and 0 for open suburban areas. For ρ̄, a mean of spectra associated with several urban materials (such as concrete, tile, and steel) found in spectral libraries is used.
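The irradiance approximations described above can be sketched as follows. The two expressions are an interpretation of the text, not a copy of the article's equations: I_tot ≈ I_dir + I_dif in sunlit areas, and I_tot ≈ α_sky · I_dif + I_refl in shadowed areas, with I_refl = κ · ρ̄ · (I_dir + I_dif). The function and argument names are hypothetical; the defaults α_sky = 0.75 and κ = 0.2 are the values quoted in the text.

```python
import numpy as np

def approx_total_irradiance(i_dir, i_dif, sunlit_mask, rho_mean,
                            alpha_sky=0.75, kappa=0.2):
    """Approximate total irradiance per pixel from a sun/shadow mask.

    sunlit_mask follows the convention M = 1 (sunlit) and M = 0 (shadow).
    Assumed expressions (interpretation of the text):
        sunlit: I_tot = I_dir + I_dif
        shadow: I_tot = alpha_sky * I_dif + kappa * rho_mean * (I_dir + I_dif)
    """
    sunlit_mask = np.asarray(sunlit_mask)
    i_sun = i_dir + i_dif                      # coupling and reflection neglected
    i_refl = kappa * rho_mean * (i_dir + i_dif)  # irradiance reflected by buildings
    i_shadow = alpha_sky * i_dif + i_refl      # no direct term in the shadow
    return np.where(sunlit_mask == 1, i_sun, i_shadow)

# One sunlit and one shadowed pixel under the same illumination (arbitrary units).
i_tot = approx_total_irradiance(800.0, 200.0, [1, 0], rho_mean=0.25)
```

Setting `kappa=0` reproduces the open suburban case mentioned in the text, where the shadowed pixel only receives the sky-limited diffuse term.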

Environment radiance processing
R_env accounts for the proportion of radiance coming from the neighbourhood of the target and scattered by the atmosphere towards the photosensor associated with the target. It is computed iteratively, where the quantity defined by Equation (15) is the mean reflectance spectrum associated with the neighbourhood of the target at iteration t + 1, and the process is initialized with ρ_0 = ρ̄. Usually, two iterations of the process suffice to reach convergence.

Classification methods
This article aims to evaluate the efficiency of the two atmospheric correction methods previously described in the context of classification applications. In order to carry out this comparison, we considered several classification approaches, supervised or not. The first one is k-means++ (Arthur and Vassilvitskii 2007), an improved version of the widely used and unsupervised k-means algorithm (MacQueen 1967), where the initial centroids are not chosen randomly but according to the assumption that distant centroids (in the vector space) will lead to a better identification of the classes. The second one is the Support Vector Machine (SVM) algorithm (Boser, Guyon, and Vapnik 1992), a robust supervised method able to perform well in harsh situations (high-dimension data and few learning samples) and even to process non-linearly separable data using kernels to simulate higher dimensional spaces.
However, these two approaches are not suited to data sets where shadows are omnipresent. Indeed, such data sets often induce the presence of a shadow class which gathers a large proportion of pixels associated with a wide variety of materials. Thus, we propose a new method involving a sequential process (cf. Figure 4) where sunlit and shadow pixels are classified separately. Using the shadow mask described in Section 2.3, the sunlit pixels are extracted and classified by either a k-means++ or an SVM algorithm, with a number of classes L fixed by the user. Then, the centroids of each class are preserved and used as input for the classification of shadowed pixels. In this case, a Spectral Angle Mapper (SAM) classification has been chosen for its ability to focus on the shape of spectra, which remains globally similar for a pair of pixels associated with the same material, even if one is located in a sunlit area and the other in a shadow area. For each pixel p and each centroid μ_i, i ∈ [1, L], a SAM value is computed over the B bands of the image. The class chosen for pixel p is the one minimizing the SAM between p and its representative centroid. It should be noted that for this method to be efficient, the same classes must be present in both sunlit and shadow areas. Furthermore, the spectral bands used to classify shadow pixels must be chosen carefully. Indeed, in shadow areas and for spectral bands from the near infrared to the shortwave infrared (SWIR) (above 0.8 µm), the amplitude of the signal tends towards zero as the wavelength grows, which means that beyond a given wavelength only noise is measured.
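The second stage of this sequential process can be sketched in Python. The sketch uses the standard definition of the spectral angle, arccos(p · μ_i / (‖p‖ ‖μ_i‖)), with the dot product and norms taken over the B bands. The array layout (pixels as rows, bands as columns) and the function names are assumptions made for the illustration.

```python
import numpy as np

def spectral_angle(pixels, centroids):
    """Spectral angles in radians between N pixels and L centroids.

    pixels: array of shape (N, B); centroids: array of shape (L, B).
    Returns an array of shape (N, L).
    """
    dots = pixels @ centroids.T
    norms = (np.linalg.norm(pixels, axis=1)[:, None]
             * np.linalg.norm(centroids, axis=1)[None, :])
    cos = np.clip(dots / np.maximum(norms, 1e-12), -1.0, 1.0)
    return np.arccos(cos)

def classify_shadow_pixels(pixels, centroids):
    """Assign each shadowed pixel to the centroid with the smallest angle."""
    return np.argmin(spectral_angle(pixels, centroids), axis=1)

# Two sunlit-class centroids and two dim (shadowed) spectra with the same shapes.
centroids = np.array([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]])
shadowed = np.array([0.1 * centroids[0], 0.2 * centroids[1]])
labels = classify_shadow_pixels(shadowed, centroids)
```

Since the angle is invariant to a positive scaling of the spectrum, a shadowed pixel that is simply a dimmer copy of a sunlit spectrum is assigned to the same class, which is the property the method relies on.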
To summarize, four different classifiers are considered in this study:
• KM: k-means++ algorithm applied to the whole image
• SS-KM: sun/shadow classification with a k-means++ to classify the sunlit pixels and SAM to classify the shadowed ones
• SVM: SVM algorithm with polynomial kernel applied to the whole image
• SS-SVM: sun/shadow classification with an SVM to classify the sunlit pixels and SAM to classify the shadowed ones
Concerning the supervised methods, two kinds of training data sets are used. The first approach includes, for each class, both sunlit and shadowed pixels (GTWS: Ground Truth With Shadows), which implies better results but does not represent a very realistic application case. Indeed, it is rare to have access to accurate ground truth for every image to classify. This ground truth is built either from a spectral library (where the spectra are usually acquired under ideal illumination conditions) or manually, in which case the user most often picks sunlit pixels whose associated class is obvious. This is why we consider a second approach where only sunlit pixels are included in the training sets (SGT: Sunlit Ground Truth), which also implies that their variability is mainly due to the characteristics of the materials.

Working data and simulation
The data used in this study were acquired over an area covering both the downtown and the suburbs of Toulouse, France (cf. Figure 5), during the Umbra airborne campaign (Adeline et al. 2013a), which took place on 24 October 2012. Two sensors of the Hyspex product line were used simultaneously in order to cover a wavelength interval ranging from 400 to 2500 nm. The first one is a VNIR-1600 sensor covering the 400-1000 nm interval with a spectral resolution of 3.7 nm and 160 bands. The second one is a SWIR-320m-e sensor covering the 1000-2500 nm interval with a spectral resolution of 6 nm and 256 bands. During this flight, both sensors were flown at an altitude of 2392 m, which implies a spatial resolution of 80 cm for the visible and near infrared (VNIR) sensor and 160 cm for the SWIR sensor. In order to work on a single data set, the IGN (French national geographic institute) undertook the coregistration between the VNIR and SWIR data sets. To do so, the VNIR image was downsampled so that its spatial resolution matches that of the SWIR image.
In this study, we focus on spaceborne hyperspectral data. Thus, a protocol for simulating HYPXIM (Briottet et al. 2011) images from Hyspex data has been established. HYPXIM is a space mission aiming to commission a satellite which would carry a high-resolution hyperspectral sensor covering a wavelength interval ranging from 400 to 2500 nm with a spectral resolution of 10 nm (accurate enough for urban material characterization) and a swath of 16 km. Its planned spatial resolution is 8 m.
The simulation protocol includes four steps (cf. Figure 6). First a top of atmosphere (TOA) transition is simulated using MODTRAN.

TOA transition
First, the acquired radiance L is transferred to TOA level through a per-channel linear relation. For each channel, the transmission coefficient k(λ) and the path radiance l(λ) (associated with absorption and scattering mechanisms) are obtained by linear regression over several pairs of radiances (R_i, R_i^TOA) computed with Comanche (Miesch et al. 2005) for various ground reflectances ρ_i and a specific type of atmosphere.
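The regression step can be sketched as follows, assuming the relation R^TOA(λ) ≈ k(λ) · R(λ) + l(λ) is fitted independently for each channel by ordinary least squares. The radiance pairs are taken as inputs (in the study they come from Comanche simulations, which are not reproduced here); the function names and the toy values are hypothetical.

```python
import numpy as np

def fit_toa_transfer(r_sensor, r_toa):
    """Per-band least-squares fit of r_toa ≈ k * r_sensor + l.

    r_sensor, r_toa: arrays of shape (n_samples, n_bands), one row per
    simulated ground reflectance. Returns (k, l), each of shape (n_bands,).
    """
    x_mean = r_sensor.mean(axis=0)
    y_mean = r_toa.mean(axis=0)
    cov = ((r_sensor - x_mean) * (r_toa - y_mean)).sum(axis=0)
    var = ((r_sensor - x_mean) ** 2).sum(axis=0)
    k = cov / var                 # slope: transmission coefficient
    l = y_mean - k * x_mean       # intercept: path radiance
    return k, l

def to_toa(radiance_cube, k, l):
    """Apply the fitted relation to a cube whose last axis is the band axis."""
    return k * radiance_cube + l

# Toy example with two bands and four simulated radiance pairs.
k_true, l_true = np.array([0.9, 0.8]), np.array([5.0, 3.0])
r_sensor = np.array([[10.0, 20.0], [20.0, 40.0], [30.0, 60.0], [40.0, 80.0]])
r_toa = k_true * r_sensor + l_true
k, l = fit_toa_transfer(r_sensor, r_toa)
```

Keeping the band axis last lets NumPy broadcasting apply the fitted coefficients to an image of shape (rows, columns, bands) without any explicit loop.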

Spectral agglomeration
Then, a spectral agglomeration of the Hyspex bands is carried out by convolution with the spectral responses of the HYPXIM bands, in this case a Gaussian function centred on each HYPXIM band with a full width at half maximum FWHM = 5 nm, where L_int^(1)(λ_0) are the integrated radiances corresponding to the HYPXIM instrument bands and σ = FWHM / 2.355.
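A minimal sketch of this spectral agglomeration is given below. It assumes that each target band is a weighted mean of the source bands, with Gaussian weights of standard deviation σ = FWHM / 2.355 normalized to sum to one; the normalization over the discrete source wavelengths and all names are assumptions made for the illustration.

```python
import numpy as np

def gaussian_srf(src_wavelengths, centre, fwhm):
    """Normalized Gaussian spectral response sampled on the source bands."""
    sigma = fwhm / 2.355  # FWHM = 2 * sqrt(2 * ln 2) * sigma ≈ 2.355 * sigma
    weights = np.exp(-0.5 * ((src_wavelengths - centre) / sigma) ** 2)
    return weights / weights.sum()

def spectral_resample(spectra, src_wavelengths, target_centres, fwhm):
    """Resample spectra of shape (N, S) to the target bands, giving (N, T)."""
    weights = np.stack([gaussian_srf(src_wavelengths, c, fwhm)
                        for c in target_centres])  # shape (T, S)
    return spectra @ weights.T

# Toy check: a flat spectrum must stay flat after resampling.
src = np.linspace(400.0, 600.0, 201)   # 1 nm sampling (arbitrary)
flat = np.full((3, src.size), 2.0)
resampled = spectral_resample(flat, src, [450.0, 500.0, 550.0], fwhm=5.0)
```

Normalizing the weights keeps the radiance level unchanged for a spectrally flat signal, which is a convenient sanity check for any resampling of this kind.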

Spatial agglomeration
Regarding the spatial agglomeration step, both the signal-to-noise ratio (SNR) and the modulation transfer function (MTF) of the simulated instrument (cf. Table 1) have been taken into account. The spatial agglomeration is done using a 2D Gaussian filter whose standard deviation σ is set from p, the spatial resolution of the agglomerated image. For practical reasons, the new spatial resolution can only be a multiple of the original one. The latter being 1.6 m, we can simulate the exact spatial resolution of HYPXIM (8 m). We also simulated two other spatial resolutions: 4.8 m, which represents an intermediate scale between Hyspex and HYPXIM, and 9.6 m, which is approximately the spatial resolution of SHALOM (Dor, Kafri, and Varacalli 2014). The filter characterized by σ is represented by a matrix whose size is defined relative to p_0, the spatial resolution of the original image. The filter is applied to the image in order to simulate p-sized pixels (the agglomerated image being p/p_0 times smaller than the original one).

Noise addition
Finally, the simulated sensor noise is added. Its standard deviation is computed from a and b, two wavelength-dependent coefficients specific to the simulated instrument, and from L_int^(2), the radiance agglomerated spectrally and spatially. Note that, while every spectral band is kept for the atmospheric correction step, the noisy bands and those corresponding to the water vapour absorption bands are set aside for the classification step (for HYPXIM: [1-4], [45-53], [63-70], [84-101], [125-149], and [181-192]).
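The band selection applied before classification can be sketched as follows. The excluded ranges are those listed above; the sketch assumes they are 1-based and inclusive, which the text does not state explicitly. The total number of bands passed to the function is a placeholder, not a documented HYPXIM value.

```python
import numpy as np

# Band ranges set aside for classification, as listed in the text
# (assumed here to be 1-based and inclusive).
EXCLUDED_RANGES = [(1, 4), (45, 53), (63, 70), (84, 101), (125, 149), (181, 192)]

def kept_band_mask(n_bands, excluded=EXCLUDED_RANGES):
    """Boolean mask over bands: True for bands kept for classification."""
    keep = np.ones(n_bands, dtype=bool)
    for first, last in excluded:
        keep[first - 1:last] = False  # convert the 1-based range to a slice
    return keep

# Placeholder band count, chosen only so that every listed range fits.
keep = kept_band_mask(200)
n_excluded = int((~keep).sum())
```

A reflectance cube stored with the band axis last can then be reduced with `cube[..., keep]` before being passed to a classifier, while the full cube remains available for the atmospheric correction step.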

Results
The quality of the results produced by the two atmospheric correction algorithms described in this study has been evaluated through a sensitivity study involving several parameters: spatial resolution, classification approach and, in the supervised case, training set composition. Four spatial resolutions are compared: 1.6, 4.8, 8, and 9.6 m. The classification approaches and the training set compositions are detailed in Section 3. Regarding the k-means++ classifier, the algorithm is run 10 times on each data set (each time with a different random initialization), and the final accuracy is the mean of these 10 results. The SVM classifier is also run 10 times on each data set. Each time, a new training set is selected randomly from the whole ground truth, and the final accuracy is the mean of these 10 results.

Ground truth description
In order to measure the accuracy of the classifiers' results, a ground truth has been built manually from the data set with the finest spatial resolution (cf. Figure 7). Five common urban classes are considered: asphalt, gravel, tile, vegetation, and water. The training sets used for the supervised classifiers are composed of 2% of the samples included in the ground truth. The classifiers' efficiency is evaluated by measuring the average and overall accuracies of the results over the rest of the samples. The overall accuracy (OA) represents the general proportion of well-classified pixels, whereas the average accuracy (AA) represents the proportion of well-classified pixels by class, which is more pertinent when the classes do not have the same number of elements.
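The difference between the two scores can be made concrete with a short sketch. It assumes that AA is the unweighted mean of the per-class proportions of well-classified pixels, computed over the classes present in the ground truth; labels are assumed to be integers starting at 0, and the function name is hypothetical.

```python
import numpy as np

def overall_and_average_accuracy(y_true, y_pred, n_classes):
    """Return (OA, AA) from integer label arrays of equal length."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    confusion = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(confusion, (y_true, y_pred), 1)  # rows: truth, columns: prediction
    oa = np.trace(confusion) / confusion.sum()
    support = confusion.sum(axis=1)            # number of true samples per class
    present = support > 0
    per_class = np.diag(confusion)[present] / support[present]
    aa = per_class.mean()
    return oa, aa

# Unbalanced toy case: the minority class is entirely misclassified.
oa, aa = overall_and_average_accuracy([0, 0, 0, 0, 1], [0, 0, 0, 0, 0], n_classes=2)
```

In this toy case OA equals 0.8 while AA equals 0.5, which illustrates why AA is the more informative score when class sizes differ.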
The ground truths associated with the three other spatial resolutions are obtained by undersampling the manually built one. Each low-resolution sample is labelled according to a majority voting rule applied to a set of high-resolution samples, the dominant class being validated if and only if it labels at least 50% of this set (see Figure 8(a)). This approach is likely the most representative of a situation where an operator only has access to low spatial resolution data and must therefore build a ground truth from it.
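The undersampling rule can be sketched as follows, assuming that each low-resolution sample covers a square block of factor × factor high-resolution samples and that the 50% share is measured against the whole block, unlabelled samples included. The value -1 for unlabelled samples and the function name are conventions chosen for this illustration.

```python
import numpy as np

def downsample_labels(labels, factor, n_classes, unlabeled=-1, min_share=0.5):
    """Majority-vote undersampling of a 2D label map.

    A block receives the dominant class only if that class covers at least
    min_share of the block; otherwise the block is left unlabelled.
    """
    h2, w2 = labels.shape[0] // factor, labels.shape[1] // factor
    blocks = (labels[:h2 * factor, :w2 * factor]
              .reshape(h2, factor, w2, factor)
              .swapaxes(1, 2)
              .reshape(h2, w2, factor * factor))
    counts = np.stack([(blocks == c).sum(axis=-1) for c in range(n_classes)],
                      axis=-1)
    winner = counts.argmax(axis=-1)
    share = counts.max(axis=-1) / (factor * factor)
    return np.where(share >= min_share, winner, unlabeled)

fine = np.array([[0, 0, 1, 2],
                 [0, 1, -1, -1],
                 [2, 2, 1, 1],
                 [2, 2, 1, 0]])
coarse = downsample_labels(fine, factor=2, n_classes=3)
```

In the toy map, the top-right block contains no class reaching the 50% share, so it stays unlabelled and is excluded from the accuracy computation.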

Unsupervised case
Regarding the unsupervised results (cf. Table 2), several observations can be made. First, the atmospheric correction method leading to the best classification rate depends on the spatial resolution. With the highest spatial resolution, the empirical method is slightly more efficient. In Figure 9, we focus on four specific areas of the working data (1.6 m) where large shadows are present. We can see that the empirical method classifies the shadowed pixels more accurately than COCHISE (especially water and vegetation pixels, although strong confusions between asphalt and gravel remain), at least for the classic k-means and SVM classifiers. However, for the three other spatial resolutions, COCHISE outperforms the empirical method, especially when a classic k-means is used. This reversal in performance can be explained by a higher degree of mixity between sunlit and shadowed pixels when the spatial resolution is low. Indeed, the empirical method assumes that pixels are either totally shadowed or totally illuminated. Therefore, the pixels located in mixed areas, between light and shadow, are not properly handled and their correction will be either over- or underestimated. The larger the pixel, the higher the proportion of mixed pixels (relative to the total number of pixels), as shown in Figure 10. Table 3 shows that when the classification is performed only on sunlit areas, the results obtained on the data set corrected with the empirical method are less accurate than those obtained on the data set corrected with COCHISE, which means that the model used by the empirical method to compute the reflectance in sunlit areas is not as relevant as the one used by COCHISE. When the spatial resolution is high enough, the good classification rates obtained by the empirical method in shadowed areas allow it to slightly surpass COCHISE. However, as the resolution becomes coarser, this benefit carries less and less weight.
Regarding the sun/shadow classification approach (SS-KM), Table 2 shows that processing shadowed and sunlit pixels separately significantly improves the classification accuracy, especially when the resolution is low. Indeed, a sequential process avoids the creation of a class gathering all the shadowed pixels, which is a recurrent issue for unsupervised algorithms. Overall, the classification accuracy tends to decrease as the spatial resolution becomes coarser, which is not surprising considering that lower resolutions imply bigger pixels and a higher number of mixed spectra, which in turn makes the pixels more difficult to label.

Supervised case
The results obtained with a supervised classification algorithm are systematically better (cf. Table 4), especially when the learning step is done using only sunlit pixels (SGT columns in the table). The empirical method again leads to better classification results for the finest spatial resolution, but this time for the classical SVM algorithm only. When sunlit and shadowed pixels are processed sequentially, the flat ground hypothesis which limits the COCHISE method is counterbalanced (even for the SS-KM algorithm, the results obtained with COCHISE and the empirical method are close). Similarly, the sun/shadow version of SVM (SS-SVM) always leads to more accurate results. When sunlit pixels as well as shadowed pixels are included in the training step (GTWS columns in the table), however, the classic SVM is more efficient. This can be explained by the robustness of this algorithm when it is used with adequate training sets, sufficiently representative of the intrinsic variability (in this study, the illumination level) of the classes existing in the image. Yet, note that when the user does not have access to any ground truth and must therefore create it manually, it can be difficult to define the class of shadowed pixels. In these conditions, using an algorithm that is efficient with only sunlit samples can be helpful.
Figure 9. The two accuracy processing methods: (a) undersampled ground truth method and (b) oversampled classification map method. In (a), the high-resolution ground truth is undersampled in order to fit the low-resolution classification map, whereas in (b) the low-resolution classification map is oversampled in order to fit the high-resolution ground truth.

Mixity and quality assessment
For supervised and unsupervised algorithms, the accuracy of the results globally decreases as the spatial resolution becomes coarser. However, some discrepancies appear for the intermediate spatial resolutions, especially with the SVM classification method. These discrepancies may be a consequence of the ground truth used to compute the accuracy of these results. Indeed, in order to make a relevant comparison of data sets associated with several spatial resolutions, the growing mixity factor (cf. Figure 10) should be taken into account. Thus, instead of undersampling the original 1.6 m ground truth, which is equivalent to measuring the classification accuracy over highly mixed samples, we propose a second accuracy processing method. It consists in oversampling the classification maps in order to use the original ground truth on them (cf. Figure 8(b)). Such a process allows us to weight the classification score associated with a low-resolution pixel by its corresponding level of mixity. The effect is shown in Tables 5 and 6. Because of the mixity, the classification accuracies are globally much lower than those obtained through the first accuracy processing method. The discrepancies have also almost totally disappeared, meaning that for data sets associated with varying spatial resolutions, if we consider in each case the same classes of pure spectra, the classification accuracy strictly decreases as the spatial resolution becomes coarser, mainly because of the strong mixity induced by the low resolution. However, if we consider the results obtained using data sets corrected by COCHISE and then classified by the sun/shadow versions of k-means and SVM, we observe that the classification accuracy remains high down to 8 m (the planned spatial resolution of HYPXIM) before falling at 9.6 m (approximately the planned spatial resolution of SHALOM).
Table 5. Accuracies, average (AA) and overall (OA), obtained with the unsupervised classifiers using the second accuracy processing method.
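The second accuracy processing method can be sketched as follows, assuming a nearest-neighbour oversampling in which each low-resolution label is replicated over the factor × factor block of high-resolution samples it covers. The value -1 for unlabelled ground truth samples and the function names are conventions chosen for this illustration.

```python
import numpy as np

def upsample_map(class_map, factor):
    """Replicate each coarse label over a factor x factor block."""
    return np.repeat(np.repeat(class_map, factor, axis=0), factor, axis=1)

def accuracy_on_fine_ground_truth(class_map, fine_truth, factor, unlabeled=-1):
    """Overall accuracy of a coarse map measured on the fine ground truth.

    fine_truth is expected to have factor times the shape of class_map.
    """
    upsampled = upsample_map(class_map, factor)
    valid = fine_truth != unlabeled          # ignore unlabelled samples
    return (upsampled[valid] == fine_truth[valid]).mean()

coarse_map = np.array([[0, 1]])
fine_truth = np.array([[0, 0, 1, 0],
                       [0, -1, 1, 1]])
score = accuracy_on_fine_ground_truth(coarse_map, fine_truth, factor=2)
```

In this toy case, the second coarse pixel covers three samples of class 1 and one of class 0, so it is counted as only partly correct, which is how the score reflects the level of mixity inside a low-resolution pixel.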

Conclusion
In this article, we compared the efficiency of two atmospheric correction algorithms, the COCHISE method and a classification-orientated empirical method, for the classification of urban hyperspectral data. This comparison has been conducted using a Hyspex data set acquired over Toulouse, France, in 2012, from which four hyperspectral data sets were simulated at various spatial resolutions, with the objective of evaluating the impact of spatial resolution on classification results. Four classifiers have been considered in the study: a k-means, an SVM, and a sun/shadow classification algorithm using either a k-means or an SVM as a first step. In the end, the empirical atmospheric correction method, which takes shadows into account, appears more accurate than COCHISE, the flat ground method, on high spatial resolution data. However, as the spatial resolution becomes coarser, the mixity between sunlit and shadowed pixels becomes more and more troublesome and, finally, the benefit of considering shadow areas is no longer significant enough to compensate for the lack of precision of the empirical method's model in the sunlit areas. For these cases, the combined use of the COCHISE correction method and the sun/shadow classification approaches seems to be a relevant alternative for processing urban data. In this work, we did not include the EnMAP spatial resolution (30 m) in the comparison because it is not adapted to an urban context (Heldens et al. 2011). For cities such as Toulouse, the choice of the spatial resolution seems critical. Indeed, it appears that the classification performances obtained with a ground sample distance equal to 8 m (the planned spatial resolution of HYPXIM) are significantly better than those obtained with a ground sample distance equal to 10 m (the planned spatial resolution of SHALOM). Future work will consider the addition of the ICARE icare_XC (Ceamanos et al. 2016) atmospheric correction algorithm to the comparison process.
Such a method, which uses a digital elevation model as a priori information, may perform even better than the empirical method on high spatial resolution data sets because it accounts for 3D data on the surface. Furthermore, it would be interesting to compare these algorithms on several other data sets simulated from missions such as Pléiades, WorldView-3, and Sentinel-2. Finally, the fusion of panchromatic high-resolution data with hyperspectral data such as HYPXIM or SHALOM is also a promising avenue. Indeed, the recent work of Loncan et al. (2015) compared several panchromatic/hyperspectral fusion methods on urban data and selected the most efficient one in order to generate a hypercube with a ground sample distance equal to 2 m. Even if these methods are limited when too many mixed pixels are present in the scene, it would be interesting to evaluate the impact of such fusion methods on the classification performances.
Table 6. Accuracies, average (AA) and overall (OA), obtained with the supervised classifiers using the second accuracy processing method.