New Results in Perceptually Lossless Compression of Hyperspectral Images

Hyperspectral images (HSI) have hundreds of bands, which impose a heavy burden on data storage and transmission bandwidth. Quite a few compression techniques have been explored for HSI in the past decades. One high-performing technique is the combination of principal component analysis (PCA) and JPEG-2000 (J2K). However, since several new compression codecs have been developed after J2K in the past 15 years, it is worthwhile to revisit this research area and investigate whether there are better techniques for HSI compression. In this paper, we present some new results in HSI compression. We aim at perceptually lossless compression of HSI. Perceptually lossless means that the decompressed HSI data cube has a performance metric near 40 dB in terms of peak signal-to-noise ratio (PSNR) or human visual system (HVS) based metrics. The key idea is to compare several combinations of PCA and video/image codecs. Three representative HSI data cubes were used in our studies. Four video/image codecs, including J2K, X264, X265, and Daala, have been investigated and four performance metrics were used in our comparative studies. Moreover, some alternative techniques such as video, split band, and PCA only approaches were also compared. It was observed that the combination of PCA and X264 yielded the best performance in terms of compression performance and computational complexity. In some cases, the PCA + X264 combination achieved a gain of more than 3 dB over the PCA + J2K combination.


Introduction
Hyperspectral images (HSI) have found a wide range of applications. For many practical applications, it is unnecessary to compress data losslessly because lossless compression can achieve only two to three times compression. Instead, it is more practical to apply perceptually lossless compression [6] [7] [8] [9]. A simple rule of thumb is that if the peak signal-to-noise ratio (PSNR) or a human visual system (HVS) inspired metric is above 40 dB, then the decompressed image is considered "near perceptually lossless" [10]. In several recent papers, we have applied perceptually lossless compression to maritime images [10], sonar images [10], and Mastcam images [11] [12] [13].
In the past few decades, several alternative techniques have been proposed for compressing HSI. In [14], a tensor approach was proposed to compress HSI. In [15], a missing data approach was presented. Another simple and straightforward approach is to apply PCA directly to HSI. For instance, in [3], the authors used 10 PCA compressed bands for anomaly detection. There are also some conventional, simple, and somewhat naïve approaches to compressing HSI. One idea, known as split band (SB), is to split the hundreds of HSI bands into groups of 3-band images and then compress each 3-band image separately. Another idea, known as the video approach (Video), is to treat the 3-band images as video frames and compress the frames as a video. The SB and Video approaches have been used for multispectral images [13] and were observed to achieve reasonable performance.
One powerful approach to HSI compression is the combination of PCA and J2K [16]. The idea is to first apply PCA to decorrelate the hundreds of bands and then apply a J2K codec to compress the few PCA bands.
In the compression literature, there have been many new developments after J2K [17] in the past 15 years. X264 [18], a fast implementation of the H264 standard, has been widely used by YouTube and many other social media platforms. X265 [19], a fast implementation of H265, is a new codec that will succeed X264.
Moreover, a free video codec known as Daala emerged recently [20]. In light of these new codecs, it is timely and worthwhile to revisit the HSI compression problem.
In this paper, we summarize our study in this area. Our aim is to achieve perceptually lossless compression of HSI at 100 to 1 compression. The key idea is to compare several combinations of PCA and video/image codecs. Three representative HSI data cubes, including the Pavia and AVIRIS data sets, were used in our studies. Four video/image codecs, including J2K, X264, X265, and Daala, have been investigated and four performance metrics were used in our comparative studies. Moreover, some alternative techniques such as video, split band, and PCA only approaches were also compared. It was observed that the combination of PCA and X264 yielded the best performance in terms of compression performance (rate-distortion curves) and computational complexity. In the Pavia data case, the PCA + X264 combination achieved a gain of more than 3 dB over the PCA + J2K combination. Most importantly, our investigations showed that the PCA + X264 combination can achieve more than 40 dB of PSNR at 100 to 1 compression. This means that perceptually lossless compression of HSI is achievable even at 100 to 1 compression.
C. Kwan, J. Larkin, Journal of Signal and Information Processing
The key contributions are as follows. First, we revisited the hyperspectral image compression problem and extensively compared several approaches: PCA only, Video approach, Split Band approach, and a two-step approach. Second, for the two-step approach, we compared four variants: PCA + J2K, PCA + X264, PCA + X265, and PCA + Daala. We observed that the two-step approach is better than PCA only, Video, and Split Band approaches, as perceptually lossless compression can be achieved at 100 to 1 ratio. Third, within the two-step approach, our experiments showed that the PCA + X264 combination is better than other variants in terms of performance and computational complexity. To the best of our knowledge, we have not seen such a study in the literature.
Our paper is organized as follows. Section 2 summarizes the HSI data, the technical approach, the various algorithms, and performance metrics. In Section 3, we focus on the experimental results, including the PCA only results, video approach, split band approach, and two-step approach (PCA + video codecs).
Four performance metrics were used to compare different algorithms. Finally, some concluding remarks are included in Section 4.

Data
We have used several representative HSI data sets in this paper. The Pavia and AVIRIS image cubes were collected using airborne sensors and the Air Force image was collected on the ground. The numbers of bands in the three data sets vary from about one hundred to more than two hundred.
The first image we tested was the Pavia data, a 610 × 340 × 103 image cube. The image was taken with a Reflective Optics System Imaging Spectrometer (ROSIS) sensor during a flight over northern Italy. Figure 1 shows the RGB bands of the Pavia image cube.

Image 2: AF image
The second image was the image cube used in [3] and it consists of 124 bands and has a height of 267 pixels and a width of 342 pixels. The RGB image of this data set is shown in Figure 2.

Compression Approaches
Here, we first present the various work flows of several representative compression approaches for HSI. We then include some background materials for several video/image codecs in the literature. We will also mention two conventional performance metrics and two other metrics motivated by human visual systems (HVS).

PCA only
PCA is also known as the Karhunen-Loève transform (KLT). Compared with the discrete cosine transform (DCT) and the wavelet transform (WT), PCA is optimal in terms of energy compaction because it is data-dependent, whereas the DCT and WT are independent of the input data.
The work flow is shown in Figure 4. After some preprocessing steps, PCA compresses the raw HSI data cube (N bands) into a pre-defined number of bands (r bands) and those r bands will be saved or transmitted. At the receiving end, an inverse PCA will be performed to reconstruct the HSI image cube.
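As an illustration only, the PCA only work flow described above can be sketched in Python with NumPy. The function names `pca_compress` and `pca_reconstruct`, the (height, width, bands) array layout, and the eigendecomposition of the band covariance matrix are assumptions made for this sketch; the program actually used in the experiments is not described at this level of detail.

```python
import numpy as np


def pca_compress(cube, r):
    """Project an (H, W, N) cube onto its top-r principal components.

    Returns the r PCA bands plus the side information (basis, mean)
    that the receiver needs for the inverse PCA.
    """
    h, w, n = cube.shape
    pixels = cube.reshape(-1, n).astype(np.float64)  # one spectrum per row
    mean = pixels.mean(axis=0)
    centered = pixels - mean
    cov = centered.T @ centered / (pixels.shape[0] - 1)  # N x N band covariance
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues come back in ascending order
    basis = eigvecs[:, ::-1][:, :r]   # keep the r leading eigenvectors (N x r)
    bands = (centered @ basis).reshape(h, w, r)
    return bands, basis, mean


def pca_reconstruct(bands, basis, mean):
    """Inverse PCA: map the r PCA bands back to N spectral bands."""
    h, w, r = bands.shape
    pixels = bands.reshape(-1, r) @ basis.T + mean
    return pixels.reshape(h, w, basis.shape[0])
```

Only the r PCA bands, the N × r basis, and the N-element mean need to be stored or transmitted; the reconstruction error comes from the discarded N − r components.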

Split Band (SB) Approach
This idea is very simple. The HSI bands are divided into groups of 3-band images. Each 3-band image is then compressed as a still image with an image codec. This approach has been observed to work well for multispectral (MS) image cubes [13] where there are only nine bands. The work flow is shown in Figure 5.
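The grouping step can be sketched as follows. The helper name `split_into_frames` and the rule of repeating the last band to fill an incomplete final group are assumptions for illustration; the text does not state how a band count that is not a multiple of three is handled.

```python
import numpy as np


def split_into_frames(cube, bands_per_frame=3):
    """Group an (H, W, N) cube into (F, H, W, bands_per_frame) images."""
    h, w, n = cube.shape
    n_frames = -(-n // bands_per_frame)  # ceiling division
    pad = n_frames * bands_per_frame - n
    if pad:
        # Assumption: repeat the last band to complete the final group.
        filler = np.repeat(cube[:, :, -1:], pad, axis=2)
        cube = np.concatenate([cube, filler], axis=2)
    frames = cube.reshape(h, w, n_frames, bands_per_frame)
    return np.moveaxis(frames, 2, 0)  # frame index first
```

In the SB approach each of the F images would be passed to a still image codec on its own; the same grouping can serve as the frame sequence for a video codec.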

Video Approach
This approach is similar to the SB approach. Here, the 3-band images are treated as video frames and a video codec is then applied. Details can be found in Figure 5.
We include some details for some of the blocks.

Two-step Approach: PCA + Video
The two-step approach has been used before in [13] [16]. In [13], X265 was observed to perform better in the second step. In [16], the second step was a J2K codec. However, the study in [13] was for MS images rather than HSI.
The work flow for the two-step approach is summarized in Figure 6. In the second step, we propose to treat the PCA bands as a video.
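Before PCA bands can be handed to a video codec, the floating-point PCA outputs have to be mapped to an integer pixel format. A minimal sketch of one possible mapping is given below. The per-band min/max scaling to 8 bits and the helper names are assumptions for illustration only; the bit depth and scaling used in the experiments are not stated here, and a higher bit depth may be preferable in practice.

```python
import numpy as np


def bands_to_uint8_frames(bands):
    """Scale each band of an (H, W, r) array to 0..255.

    Returns (r, H, W) uint8 frames plus the per-band offset and scale
    that must be kept as side information for the inverse mapping.
    """
    frames = np.moveaxis(bands, 2, 0).astype(np.float64)
    lo = frames.min(axis=(1, 2), keepdims=True)
    hi = frames.max(axis=(1, 2), keepdims=True)
    scale = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    quantized = np.round((frames - lo) / scale * 255.0).astype(np.uint8)
    return quantized, lo, scale


def frames_to_bands(quantized, lo, scale):
    """Undo the scaling after decoding, returning an (H, W, r) array."""
    frames = quantized.astype(np.float64) / 255.0 * scale + lo
    return np.moveaxis(frames, 0, 2)
```

With this mapping, the rounding error per band is at most half a quantization step, that is, the band's range divided by 510, which sets a ceiling on the quality reachable regardless of the video codec.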

Brief Review of Relevant Compression Algorithms
Instead of reinventing the wheel, we use image codecs available in the market, objectively evaluate them, and eventually recommend the best codec to our customer.
With the above in mind, we include a brief overview of some representative codecs.

DCT based algorithms
• VP8 and VP9: These video compression algorithms are owned by Google. Their performance is somewhat close to that of X264. We did not include VP8 and VP9 in our study because they are not as popular as X264 and X265.
• X264 [18]: X264 is the current state-of-the-art in video compression. YouTube uses X264. It has good still image compression.
• X265 [19]: This is the next-generation video codec and has excellent still image and video compression. However, its computational complexity is much higher than that of X264. In general, X265 has the same basic structure as previous standards. Several studies concluded that X265 yields the same quality as X264, but with only half of the bitrate. It should be noted that X264 and X265 are optimized implementations of H264 and H265, respectively.
• Daala [20]: Recently, there has been a parallel activity at the Xiph.Org Foundation, which implements a compression codec called Daala. It is based on the DCT. There are pre- and post-filters to increase energy compaction and remove block artifacts. Daala borrows ideas from [26].
The block-coding framework in Daala is illustrated in Figure 7.
In this study, we compared Daala with X264, X265, and J2K in our experiments.

Wavelet-based algorithms
J2K is a wavelet-based compression standard [17] [27] [28] [29]. It has better performance than JPEG. However, J2K requires the use of the whole image for coding and hence is not suitable for real-time applications. In addition, motion-J2K for video compression is not popular in the market.

Performance Metrics
In almost all compression systems, researchers use peak signal-to-noise ratio (PSNR) or structural similarity (SSIM) to evaluate the compression algorithms.
Given a fixed compression ratio, algorithms that yield higher PSNR or SSIM are regarded as better algorithms. However, PSNR and SSIM do not correlate well with human perception. Recently, a group of researchers investigated a number of different performance metrics [30]. Extensive experiments were performed to investigate the correlation between human perception and various performance metrics. According to the results in [30], two performance metrics, HVS and HVSm, correlate well with human perception. One image example shown in Figure 8 demonstrates that HVS and HVSm have high correlation with human subjective evaluation results. In the past, we have used HVS and HVSm in several applications [11] [12] [13].
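For reference, PSNR is defined as 10 log10(peak² / MSE), where MSE is the mean squared error between the original and decompressed data. A minimal sketch is given below. The function name `psnr` and the default choice of the peak value (the maximum of the reference data) are assumptions of this sketch; the HVS and HVSm metrics involve DCT-domain perceptual weighting and are not reproduced here.

```python
import numpy as np


def psnr(reference, test, peak=None):
    """Peak signal-to-noise ratio in dB between two arrays of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    if peak is None:
        # Assumption: use the largest reference value as the peak.
        peak = reference.max()
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")  # identical inputs
    return 10.0 * np.log10(peak ** 2 / mse)
```

Under this definition, the 40 dB rule of thumb corresponds to a root-mean-square error of 1% of the peak value.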

Experimental Results
Here, we briefly describe the experimental settings. In the PCA only approach, a program was written for PCA. The inputs are one hyperspectral image and the number of principal components to be used in the compression. The outputs are the PCA bands. The performance metrics are generated by comparing the original hyperspectral image with the inverse-PCA outputs.
In the Video only approach, we used ffmpeg to call X264 and X265. For J2K, we used the built-in J2K function in Matlab. In each codec, there is a quantization or quality parameter (qp) that controls the compression ratio. We chose around 50 qp values in our experiments in order to generate smooth performance curves.
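The exact ffmpeg settings are not listed here, so the following is only a hedged sketch of how one encode in such a qp sweep might be set up. The helper names, file names, raw-frame input format, frame rate, and the choice of a 4:4:4 output pixel format are all assumptions; the command is built but not executed.

```python
def ffmpeg_encode_command(raw_path, out_path, width, height, qp,
                          codec="libx264", pixel_format="rgb24"):
    """Build (not run) an ffmpeg command that encodes headerless raw frames."""
    if codec not in ("libx264", "libx265"):
        raise ValueError("codec must be 'libx264' or 'libx265'")
    cmd = [
        "ffmpeg", "-y",
        "-f", "rawvideo",
        "-pixel_format", pixel_format,      # layout of the raw input frames
        "-video_size", f"{width}x{height}",
        "-framerate", "25",                 # arbitrary; frames are bands, not time
        "-i", raw_path,
        "-c:v", codec,
    ]
    if codec == "libx264":
        cmd += ["-qp", str(qp)]             # constant quantization parameter
    else:
        cmd += ["-x265-params", f"qp={qp}"]
    # Assumption: keep full chroma resolution so no band is subsampled.
    cmd += ["-pix_fmt", "yuv444p", out_path]
    return cmd


def compression_ratio(compressed_bytes, original_bytes):
    """Ratio as used in the figures: 0.01 corresponds to 100 to 1."""
    return compressed_bytes / original_bytes
```

A sweep would call `ffmpeg_encode_command` once per qp value (for example through `subprocess.run`), measure the output file size, and plot each metric against `compression_ratio`.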
In the two-step approach, the PCA is applied first, followed by the video codecs.

PCA Only
Here, we applied PCA directly to compress the 103 bands to 3, 6, and 9 bands, which we denote as PCA3, PCA6, and PCA9, respectively. From Figure 9, one can see that PCA3 achieved 33 times compression with 44.75 dB of PSNR.
The other metrics are also high. Similarly, PCA6 and PCA9 also attained high values in performance metrics. This means that PCA alone can achieve reasonable compression performance. However, if our goal is to achieve 100 to 1 compression with higher than 40 dBs of PSNR, then the PCA only approach may be insufficient.
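The limitation can be seen from simple arithmetic: keeping r of N bands gives a compression of at most N / r. The sketch below uses hypothetical helper names and ignores side information (PCA basis and mean) as well as any difference in data types, so it is only an upper-bound estimate and does not exactly match the 33 times reported for PCA3.

```python
import math


def pca_only_ratio(n_bands, r):
    """Upper bound on PCA only compression when r of n_bands are kept."""
    return n_bands / r


def max_bands_for_target(n_bands, target_ratio):
    """Largest number of PCA bands that still meets the target ratio."""
    return math.floor(n_bands / target_ratio)


# 103 bands reduced to 3 gives roughly 34 to 1 at best.
pavia_pca3 = pca_only_ratio(103, 3)
# At 100 to 1, only a single PCA band could be kept for a 103-band cube.
pavia_bands_at_100 = max_bands_for_target(103, 100)
```

Reaching 100 to 1 with PCA alone would therefore leave a single band for the 103-band cube, which is why a second compression step on the PCA bands is considered.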

Video Approach
As mentioned earlier, the video approach treats the HSI data cube as a video where each frame takes 3 bands out of the data cube. There are 35 frames in total in the video for the Pavia data. We then applied four video codecs (J2K, X264, X265, and Daala) to the video. Four performance metrics were generated as shown in Figure 10. If one compares the metrics in Figure 9 and Figure 10, one can see that the Video approach is slightly better than the PCA only approach. For instance, at a compression ratio of 0.03, PCA3 yielded 38.2 dB and the Video approach yielded more than 40 dB in terms of PSNR. X265 performed better than the others at compression ratios less than 0.1.

Split Band (SB) Approach
Here, the SB approach means that every 3 bands in the hyperspectral image cube are treated as a separate image. We then applied four image codecs to each 3-band image. The averaged metrics from all 3-band images were computed. Comparing Figure 10 and Figure 11, one can see that the Video approach is slightly better. For example, at a compression ratio of 0.05, J2K has 43 dB (HVSm) using the SB approach and X265 has 52 dB (HVSm) using the Video approach.

Two-step approach
The two-step approach first compresses the HSI cube by using PCA to a number of bands (3, 6, 9, etc.). The second step applies a video codec to compress the PCA bands. We have five case studies below.

PCA3 + Video
Figure 12 summarizes the two-step approach (PCA3 + Video). It can be seen that, at a compression ratio of 0.01, the two-step approach can get above 40 dB of PSNR. The other metrics are also high. Daala has better visual performance (HVS and HVSm) than the others. We can also notice that the PCA3 + Video approach can attain much higher compression than the PCA only, SB, and Video approaches. That is, the two-step approach can achieve more than 100 to 1 compression with close to 40 dB of HVSm, whereas the SB and Video approaches cannot achieve 100 to 1 compression with the same performance metrics (40 dB).

PCA6 + Video
Figure 13 summarizes the PCA6 + Video results. At a compression ratio of 0.01, the PCA6 + Video approach appears to be slightly better than PCA3 + Video. X264 is better than the others in three out of four metrics. In particular, at a compression ratio of 0.01, X264 has 45 dB in terms of HVSm. This value is very high and can be considered as perceptually lossless.

PCA9 + Video Approach
From Figure 14, it is clear that PCA9 + Video is slightly worse than the PCA6 + Video case. For example, at a compression ratio of 0.01, PCA9 + Video has 44 dB in terms of PSNR whereas PCA6 + Video has 45 dB. X264 is better than the other codecs.

PCA12 + Video
As shown in Figure 15, the performance of PCA12 + Video is somewhat similar to PCA9 + Video.

PCA15 + Video
As shown in Figure 16, the performance of PCA15 + Video is somewhat similar to PCA12 + Video.

Comparison of different combinations of the two-step approach
The performance comparison of different combinations of the two-step approach is summarized in Table 1. First, we observe that PCA3 and PCA6 have better performance than PCA9 to PCA15. Second, PCA6 has better performance with X264 and Daala. Third, for PCA6, X264 is 3.16 dB better than J2K in terms of PSNR and 4.2 dB better in terms of HVSm. This is quite significant. For PCA6, Daala has slightly better performance than X264 and X265. However, we noticed that Daala took more computation time than X264. Hence, for practical applications, X264 may be a better choice for HSI compression.

PCA only approach
Here, we applied PCA directly to compress the 124 bands to 3, 6, and 9 bands, which we denote as PCA3, PCA6, and PCA9, respectively. The results are shown in Figure 17.

Video approach
Comparing the performance of the Video approach (Figure 18) with the PCA only approach (Figure 17), one can immediately notice that the Video approach allows higher compression to be achieved. For instance, at a compression ratio of 0.01, X265 achieved about 38 dB in PSNR. X265 performs well for small ratios (high compression).

SB approach
Comparing the SB approach in Figure 19 with the Video approach in Figure 18, the Video approach is better. For instance, if one looks at the PSNR values at a compression ratio of 0.05, one can see that the X265 codec in the Video approach has a value of 44 dB whereas the best codec in the SB approach (J2K) has a value of 41.5 dB.

Two-step approach
Here, PCA is combined with the Video approach. That is, PCA is first applied to the 124 bands to obtain 3, 6, 9, 12, and 15 bands. After that, a video codec is applied to further compress the PCA bands.

PCA3 + Video
From Figure 20, we can see that PCA3 + Video can achieve a compression ratio of 0.01 with more than 40 dB of PSNR. Hence, the performance is better than the earlier approaches (PCA only, Video, and SB). Daala has better performance in terms of HVS and HVSm.

PCA6 + Video
From Figure 21 and Figure 20, we can see that PCA6 + Video is better than PCA3 + Video. For example, at 0.01 compression ratio, Daala has 44 dBs (HVSm) for PCA6 + Video whereas Daala only has 34.75 dB for PCA3 + Video. X264 has better metrics in PSNR and SSIM, but Daala has better performance in terms of HVS and HVSm.

PCA9 + Video
Comparing Figure 21 and Figure 22, PCA9 + Video is worse than PCA6 + Video. For instance, at a compression ratio of 0.01, PCA9 + Video has 42 dB (PSNR) and PCA6 + Video has slightly over 44 dB of PSNR. Daala has better scores in HVS and HVSm, but X264 has higher values in PSNR and SSIM.

PCA12 + Video
As shown in Figure 23, the performance of PCA12 + Video is worse than some of the earlier combinations. For example, Daala's HVSm value is 40 dB at a compression ratio of 0.01, which is lower than for PCA6 + Video (Figure 21) and PCA9 + Video (Figure 22). PCA12 + Video is better than PCA3 + Video (Figure 20).

PCA15 + Video
From Figure 24, we can see that PCA15 + Video is similar to PCA3 + Video (Figure 20), but worse than the other PCA + Video combinations (Figure 21, Figure 22, Figure 23).

Comparison of different combinations of the two-step approach
From Table 2, we have the following observations. First, the PCA6 + Video combination has the best performance for each codec. Second, X264 has the best performance in PSNR whereas Daala has the best performance in HVS and HVSm. Third, X264 is 1.45 dB better than J2K for the PCA6 case. As mentioned earlier, X264 is faster to run than Daala. Hence, X264 may be more suitable in practical applications.

PCA only approach
Here, we applied PCA directly to compress the 213 bands to 3, 6, and 9 bands, which we denote as PCA3, PCA6, and PCA9, respectively. From Figure 25, one can see that PCA3 achieved 72 times compression with 39.7 dB of PSNR. The other metrics are also high. Similarly, PCA6 and PCA9 also attained high performance, but at lower compression. This means PCA alone can achieve reasonable compression.

Video approach
Here, the 213 bands are divided into groups of 3 bands. As a result, there are 71 groups, which are then treated as 73 frames in a video. After that, different video codecs are applied. The performance metrics are shown in Figure 26.
Compared with the PCA only approach, the Video approach is slightly inferior. For instance, PCA6 has a PSNR of 44 dB at a compression ratio of 0.028, whereas the Video only approach has about 42.5 dB at the same ratio.

SB Approach
Here the 73 groups of 3-band images are compressed separately. The results shown in Figure 27 are worse than the video approach. This is understandable as the correlations between frames were not taken into account in the SB approach.
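One way to gauge how much redundancy exists across bands is to measure the correlation between adjacent spectral bands. The sketch below is our illustration, not part of the reported experiments; the function name and the use of the Pearson correlation coefficient are assumptions, and adjacent-band correlation is only a proxy for the frame-to-frame correlation a video codec exploits.

```python
import numpy as np


def adjacent_band_correlation(cube):
    """Pearson correlation between each band and the next one.

    cube has shape (H, W, N); the result has N - 1 entries.
    A constant band yields NaN because its variance is zero.
    """
    n = cube.shape[2]
    flat = cube.reshape(-1, n).astype(np.float64)
    corr = np.corrcoef(flat, rowvar=False)  # N x N correlation matrix
    return np.diagonal(corr, offset=1)
```

Values close to 1 indicate that neighbouring bands carry largely the same spatial content, which an approach that codes each 3-band image independently cannot exploit.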

Two-step approach
We have the following five case studies based on the number of PCA bands coming out of the first step.

PCA3 + Video
From Figure 28, the performance metrics appear to flatten out after a com-

PCA9 + Video
Comparing Figure 29 and Figure 30, we can see that PCA9 + Video has better metrics than PCA6 + Video. For instance, at a compression ratio of 0.01, PCA9 + Video achieved 40 dB of HVSm (Daala), whereas PCA6 + Video achieved 38.5 dB.

PCA12 + Video
Comparing Figure 30 and Figure 31, we can see that PCA12 + Video is slightly worse than PCA9 + Video.

PCA15 + Video
Comparing Figure 31 and Figure 32, it can be seen that PCA15 + Video is slightly worse than PCA12 + Video.
Comparison of different combinations in the Two-step approach

Conclusion
In this paper, we summarize some new results for HSI compression. The key idea is to revisit a two-step approach to HSI data compression. The first step adopts PCA to compress the HSI data spectrally. That is, the number of bands is greatly reduced to a few bands via PCA. The second step applies the latest video/image codecs to further compress the few PCA bands. Four well-known codecs (J2K, X264, X265, and Daala) were used in the second step. Three HSI data sets with different numbers of bands were used in our studies. Four performance metrics were utilized in our experiments. We have several key observations. First, we observed that compressing the HSI to six bands has the best overall performance in all of the three HSI data sets. This is different from the observation in [16] where more PCA bands were included in the J2K step.
Second, the X264 codec gave the best performance in terms of compression performance and computational complexity. Third, the PCA6 + X264 combination can be 3 dB better than the PCA6 + J2K combination in the Pavia data at 100 to 1 compression and this is quite significant. Fourth, even at 100 to 1 compression, the PCA6 + X264 combination can attain better than 40 dB in PSNR for all of the three data sets. This means the compression performance is perceptually lossless at 100 to 1 compression.