Full Image Inference Conditionally upon Available Pieces Transmitted into Limited Resources Context

In a context marked by the proliferation of smartphones and multimedia applications, the processing and transmission of images have become a real problem. Image compression is the first approach to address this problem; it nevertheless suffers from its inability to adapt to the dynamics of resource-limited environments, consisting mainly of mobile equipment and wireless networks. In this work, we propose a stochastic model to gradually estimate an image from information on its pixels that is transmitted progressively. We consider this transmission as a dynamical process, where the sender pushes the data in decreasing order of significance. In order to adapt to network conditions and performance, instead of truncating the pixels, we suggest a new method called Fast Reconstruction Method by Kalman Filtering (FRM-KF), which consists of recursive inference of the not-yet-received layers belonging to a sequence of bitplanes. After empirical analysis, we estimate the parameters of our model, which is a linear discrete Kalman filter. We assume the initial law of information to be the uniform distribution on the set [0, 255] corresponding to the range of gray levels. The performance of the FRM-KF method has been evaluated in terms of the ratios of image quality to size of data sent and of image quality to processing time. A high quality was reached quickly with relatively little data (less than 10% of the image data is needed to obtain up to the sixth-quality image). The processing time also decreases quickly with the number of received layers. However, we found that the processing time may be large starting from an image resolution of 1024 * 1024. Hence, we recommend the FRM-KF method for resolutions less than or equal to 512 * 512. A statistical comparative analysis reveals that FRM-KF is competitive and suitable for implementation in limited-resource environments.

How to cite this paper: Saoungoumi-Sourpele, R., Nlong, J.M., Fotsa-Mbogne, D.J., Kamdjoug, J.-R.K. and Bitjoka, L. (2021) Full Image Inference Conditionally upon Available Pieces Transmitted into Limited Resources Context. Journal of Signal and Information Processing, 12, 57-69. https://doi.org/10.4236/jsip.2021.123003

Received: April 25, 2021; Accepted: July 3, 2021; Published: July 6, 2021

Copyright © 2021 by author(s) and Scientific Research Publishing Inc. This work is licensed under the Creative Commons Attribution-NonCommercial International License (CC BY-NC 4.0). http://creativecommons.org/licenses/by-nc/4.0/


Introduction
Transmission of digital images has been widely studied since the early years of the Internet [1]. It deals with the compression and transmission of image data in such a way that the receiver can start decoding and displaying the received images even without receiving the whole file. Because of the large amounts of data needed in image technology, applications are highly constrained by the available resources and by the quality of service during the transmission. In video streaming, for instance, the latency of the transmission of individual image frames plays a fundamental role [2] [3] because of the isochronous character of video. The images must be displayed at a given frequency, with a fault threshold above which the visual quality of the video is not acceptable. Many techniques have been proposed in the literature to tackle these problems, among which are image compression and Progressive Image Transmission (PIT).
The primary objective of PIT is to transmit a significant and interpretable core of the image and subsequently transmit complementary layers in order to gradually improve the quality. This method requires a preparation of the image to be transmitted. PIT techniques can be grouped into three main areas: the spatial domain [4], the methods based on the transform domain, and the pyramid-structured domain [4] [5]. As new areas of interest emerge, such as live streaming over narrow networks and wireless sensor networks, digital image transmission remains a significant challenge. As reviewed in [4], most of the recent improvements in image coding are based on the wavelet transform. The challenge is then to organize the transmission of the bitstream so that it adapts to the fluctuations of the network and to the capabilities of the receiving device.
In this work, we are interested in the progressive transmission and refinement of still images, as a process that adapts to low-quality network service. A special focus is placed on the JPEG2000 format since it is the most used standard nowadays [6] [7]. However, the method presented here is general enough to be applied to any image format that encodes the image file as a two-dimensional data container (the resolution and the color depth), and where the resolution and the color depth can be picked independently to adapt to the end-user device. The display of an image is considered as a progressive process in order to adapt to network conditions. The sender selects the image data, layer by layer, from the most significant to the least significant, depending on the quality of the desired image at the receiver. Upon reception of these data, the receiver decodes a blurred version of the image and smooths it by statistically inferring the missing information. As more refinement data come from the network, this process is repeated recursively. The Kalman filtering (KF) algorithm will be used to infer the refinement data [8] [9] [10].
The rest of the paper is organized as follows. In Section 2, we review the literature on progressive image transmission. Section 3 deals with the theoretical foundations of discrete Kalman filtering, while Section 4 presents the proposed method, which models image transmission as a filtering procedure. In Section 5, we apply FRM-KF to a standard image gallery and discuss its performance.

Short Background on PIT
PIT techniques can be grouped into three main areas, namely the spatial domain, the transform domain and the pyramid-structured domain.

Spatial Domain
Spatial domain methods are based on the bit-plane decomposition method (BPDM) [4] and the vector quantization method (VQM).
The bit-plane decomposition method is the most intuitive one when tackling the problem of progressive transmission. Indeed, the gray level of each pixel in an image is coded over 8 bits having different significances. The collection of the i-th most significant bits of all pixels constitutes the i-th bit plane, to be transmitted at the i-th step. On the receiver side, the binary image is rebuilt after receiving a certain number of bit planes, and gradually refined with the arrival of the other planes. BPDM does not introduce any distortion, but it suffers from a lack of flexibility and limited performance in terms of adaptation to variations in network conditions. Improvements of BPDM in terms of reduction of storage space, and therefore of transmission time, are available in the literature: quantization of pixels and selection of areas of interest with higher priorities [11].
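As a minimal sketch of the bit-plane idea described above (not the implementation evaluated in this paper), the following Python code splits an 8-bit grayscale image into its bit planes and rebuilds an approximation from the most significant ones. The function names and the toy 2 x 2 image are hypothetical, and missing bits are simply set to zero.

```python
def bit_planes(image):
    """Split an 8-bit grayscale image (list of rows) into 8 bit planes.

    planes[0] holds the most significant bit of every pixel,
    planes[7] the least significant one.
    """
    return [[[(pixel >> (7 - i)) & 1 for pixel in row] for row in image]
            for i in range(8)]


def reconstruct(planes_received):
    """Rebuild an approximation from the first received planes.

    Bits belonging to planes not yet received are treated as 0.
    """
    rows, cols = len(planes_received[0]), len(planes_received[0][0])
    out = [[0] * cols for _ in range(rows)]
    for i, plane in enumerate(planes_received):
        weight = 1 << (7 - i)  # significance of the i-th plane
        for r in range(rows):
            for c in range(cols):
                out[r][c] += plane[r][c] * weight
    return out


image = [[200, 13], [97, 255]]      # toy image, hypothetical values
planes = bit_planes(image)
coarse = reconstruct(planes[:3])    # only the 3 most significant planes
exact = reconstruct(planes)         # all 8 planes give back the original
```

With all eight planes the reconstruction is lossless, which matches the remark that BPDM introduces no distortion; with fewer planes the result is a coarser version of the image.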
During vector quantization, the pixels are grouped in blocks (code-blocks), each of which is transformed into a vector. The obtained vectors are grouped into a lighter structure called a code-book, in which they are the codewords. Codewords are progressively transmitted and used to produce an approximate image on the receiver side. The main available improvement of the VQM is the Tree-Structured Vector Quantization method (TSVQM), which transmits first the vector quantizations that contribute most quickly to a better image quality [12].
The VQM has some disadvantages: block artifacts during display, transmission codeword by codeword, additional complexity on the transmitter side, and computational overhead for the creation, organization and selection of codewords.
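The encoding and decoding steps of vector quantization can be sketched as follows. This is an illustration only: the 2 x 2 block size, the three-entry code-book and the function names are assumptions, and neither code-book training nor the tree-structured ordering of TSVQM is shown.

```python
def nearest_codeword(block, codebook):
    """Index of the codeword closest to block (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((b - w) ** 2 for b, w in zip(block, codebook[i])))


def encode(image, codebook):
    """Replace each 2x2 block (flattened to a 4-vector) by a codeword index."""
    indices = []
    for r in range(0, len(image), 2):
        row_indices = []
        for c in range(0, len(image[0]), 2):
            block = [image[r][c], image[r][c + 1],
                     image[r + 1][c], image[r + 1][c + 1]]
            row_indices.append(nearest_codeword(block, codebook))
        indices.append(row_indices)
    return indices


def decode(indices, codebook):
    """Rebuild an approximate image by pasting the codewords back as 2x2 blocks."""
    image = []
    for row_indices in indices:
        top, bottom = [], []
        for i in row_indices:
            w = codebook[i]
            top += [w[0], w[1]]
            bottom += [w[2], w[3]]
        image += [top, bottom]
    return image


codebook = [[0] * 4, [128] * 4, [255] * 4]          # hypothetical code-book
image = [[10, 5, 250, 240], [0, 12, 255, 251]]      # toy 2x4 image
indices = encode(image, codebook)                   # what would be transmitted
approx = decode(indices, codebook)                  # receiver-side approximation
```

Because every pixel of a block is replaced by the corresponding codeword entry, the decoded image is constant within each block here, which illustrates the block artifacts mentioned above.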

Transform Domain
The main goal of transform-based methods (e.g. the Discrete Cosine Transform (DCT)) is to concentrate the energy in low-frequency areas, which are grouped into a small number of coefficients. The low-frequency coefficients have a strong and decisive impact on the final, overall quality of the image. Before their transmission in decreasing order of importance, those coefficients are ordered following a scanning pattern (e.g. the zigzag scan used in JPEG [13]) or a multistage quantization based on the variances of the coefficients.
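To make the scanning step concrete, the sketch below orders the entries of a square coefficient block along the zigzag path, so that low-frequency coefficients (top-left corner) come first. Only the ordering is shown; the transform itself is not computed, and the function names and the 3 x 3 block are hypothetical.

```python
def zigzag_order(n):
    """Positions (row, col) of an n x n block in zigzag order.

    Positions are visited anti-diagonal by anti-diagonal; the direction
    alternates from one anti-diagonal to the next.
    """
    def key(position):
        r, c = position
        s = r + c
        # odd anti-diagonals run top-right to bottom-left, even ones the reverse
        return (s, r if s % 2 else -r)

    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def zigzag_scan(block):
    """Flatten a square block of coefficients along the zigzag path."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


block = [[1, 2, 6],
         [3, 5, 7],
         [4, 8, 9]]          # toy values chosen so the scan returns 1..9
scanned = zigzag_scan(block)
```

Transmitting `scanned` from its first entry onward sends the coefficients in the decreasing order of importance described above, under the assumption that importance decreases with frequency.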

Pyramid Structured Domain
The pyramidal shape is ideal for progressive transmission. To form a pyramid, an image is reduced in resolution according to a predefined method such as the Discrete Wavelet Transform (DWT) [4] [5] or the Quadrature Mirror Filter (QMF) [12]. The reduced image has few coefficients, and the transmission process therefore consists of transmitting first the top of the pyramid, followed by the differences between the current layer and the next layer.
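The "top of the pyramid plus differences" scheme can be sketched as follows. For brevity, the sketch swaps in plain 2 x 2 averaging for the DWT or QMF reduction cited above; that choice, the function names and the toy image are all assumptions.

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 block."""
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4
             for c in range(0, len(img[0]), 2)]
            for r in range(0, len(img), 2)]


def upsample(img):
    """Double the resolution by repeating each value over a 2x2 block."""
    out = []
    for row in img:
        wide = [v for v in row for _ in (0, 1)]
        out += [wide, list(wide)]
    return out


def build_pyramid(img, levels):
    """Return the coarsest image and the difference layers, coarse to fine."""
    diffs, current = [], img
    for _ in range(levels):
        coarse = downsample(current)
        predicted = upsample(coarse)
        diffs.append([[a - b for a, b in zip(ra, rb)]
                      for ra, rb in zip(current, predicted)])
        current = coarse
    return current, diffs[::-1]


def refine(top, diffs):
    """Yield the image obtained after each received difference layer."""
    current = top
    for d in diffs:
        predicted = upsample(current)
        current = [[a + b for a, b in zip(ra, rb)]
                   for ra, rb in zip(predicted, d)]
        yield current


image = [[10, 20, 30, 40], [50, 60, 70, 80],
         [90, 100, 110, 120], [130, 140, 150, 160]]   # toy 4x4 image
top, diffs = build_pyramid(image, levels=2)           # top is sent first
stages = list(refine(top, diffs))                     # receiver-side refinements
```

The receiver can display `top` immediately and replace it by each element of `stages` as the corresponding difference layer arrives; the last stage equals the original image.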

Fast Progressive Image Transmission
None of the above techniques integrates a prediction of the data not yet received. Such an inference allows faster access to a transmitted image. One method that tries to achieve this goal is pixel interpolation, which estimates the not-yet-received data using a model built from the available data. The SIDE-MATCH algorithm is an implementation of the interpolation method [14]. Despite their expected rapid convergence to a relatively good image quality, inference methods suffer from a number of drawbacks, namely the difficulty of producing good-quality images at the beginning of the process and a complexity that induces large computation times.

The Discrete Kalman Filter
Filtering is a procedure which aims at estimating the state of a given dynamic system from noisy observations. Usually, the outputs are given as a sequence $\{Y_n\}_{n \in T}$, where $T \subseteq \mathbb{R}$ denotes a set of time values. It can be discrete or continuous, depending on data availability and observation rate. Each output $Y_n$ is related to an unknown or partially known state $X_n$ through a stochastic model of the form

$$Y_n = H_n(X_n) + V_n, \quad n \in T, \tag{1}$$

where $V_n$ is the noise occurring in the measurement procedure.
$H_n$ represents an averaged relationship between $Y_n$ and $X_n$. In other terms, it is a trend of evolution of $Y$ as a function of $X$. The observation noise is usually assumed to be a normal (Gaussian) random variable [15]. The additional hypothesis of independence of the system $\{V_n\}_{n \in T}$ is very common and useful for computations. In Equation (1) [17]. As mentioned in [18] [19] [20], applications of filtering cover areas such as sensorless control, prognostics and health management (PHM), fault-tolerant control of AC drives, management of storage systems, signal processing, robotics, computer vision, real-time industrial control systems, localization, navigation, mobile trajectory tracking and other applications combining knowledge of a priori dynamics with sensor measurements.
A large class of these applications is covered by discrete filtering, which can be described by the general linear problem (2), where $A_n$, $B_n$, $C_n$ and $D_n$ are matrices expressing the dynamics of the signal and of the observation. The filtering problem (2) has an explicit solution in the Gaussian linear case, known as the "discrete Kalman Filter", which is presented as follows.
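For reference, one common textbook statement of the linear Gaussian problem and of its Kalman recursion is sketched here. The way the matrices $A_n$, $B_n$, $C_n$, $D_n$ enter the model, the noise symbols $W_n$, $V_n$, and the notation $\hat{X}_{n|m}$, $P_{n|m}$ are assumptions made for this sketch; they are not necessarily the exact form of problem (2).

```latex
% Assumed model: W_n, V_n independent standard Gaussian white noises,
% independent of the Gaussian initial state X_0.
\begin{align*}
X_{n+1} &= A_n X_n + B_n W_n, \\
Y_n     &= C_n X_n + D_n V_n.
\end{align*}

% Prediction step
\begin{align*}
\hat{X}_{n|n-1} &= A_{n-1}\,\hat{X}_{n-1|n-1}, \\
P_{n|n-1}       &= A_{n-1} P_{n-1|n-1} A_{n-1}^{\top} + B_{n-1} B_{n-1}^{\top}.
\end{align*}

% Update step (innovation covariance assumed invertible)
\begin{align*}
K_n           &= P_{n|n-1} C_n^{\top}
                 \left( C_n P_{n|n-1} C_n^{\top} + D_n D_n^{\top} \right)^{-1}, \\
\hat{X}_{n|n} &= \hat{X}_{n|n-1} + K_n \left( Y_n - C_n \hat{X}_{n|n-1} \right), \\
P_{n|n}       &= \left( I - K_n C_n \right) P_{n|n-1}.
\end{align*}
```

Here $\hat{X}_{n|n}$ is the conditional expectation of $X_n$ given $Y_{0:n}$ and $P_{n|n}$ is the associated error covariance. Only the pair $(\hat{X}_{n-1|n-1}, P_{n-1|n-1})$ is needed to compute the next estimate.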
The techniques developed for linear filtering can sometimes be extended to the nonlinear case by means of linearization methods [21]. However, there are more general results that can be applied in nonlinear cases, such as particle filtering.

Model Statement
We consider the progressive transmission of a JPEG2000 image, encoded in bitplanes. We assume the transmission is done bitplane by bitplane, over a narrow network channel. Because of the poor network quality, the receiver cannot wait until all the data are transmitted before decoding and displaying the image. Moreover, the transmission can stop unpredictably at any time. Thus, the receiver has to use the data received so far to estimate the whole image as well as possible. A first approach consists in simply refreshing the estimated image with newly received layers and displaying the result when its quality reaches a given threshold. Instead, we learn from successive bitplanes or layers, considered as partial observations of the image, to infer the missing parts. Hence, the bitplane transmission can be viewed as a dynamic system with partial observations. Since image structures vary, we can use a representative sample of coefficients for statistical inference.
However, following our purpose of inference, a stochastic description is needed here. Hence, from the receiver's viewpoint, a model that recalls the problem (2) can be considered, given by Equations (7) and (8). Equation (7) describes the dynamics of the remaining information to be received, while Equation (8) gives the next layer to be received. Indeed, we make the hypothesis of an arithmetico-geometric progression of the part of the image that remains to be sent ($X_n$). In the same manner, we assume an affine relationship between the current layer to be sent ($Y_n$) and the current part of the image that remains to be sent ($X_n$). The choice of an affine model is as simple as it is natural for a first modeling, and it will prove reasonable. Notice that $Y_0 = 0$ and that, by formulation of the problem, $X_0$ follows the uniform distribution on $[0, 255]$. Equation (7) underlines an exponential variation of the estimation errors along with their variances. The filtering procedure will consist in determining the mathematical expectation of $X_n$ conditionally upon $Y_{0:n}$, at the steps $n = 1, \ldots, 7$.
The choice of the upper bound $n = 7$ is motivated by the fact that we process images channel by channel: for a true-color image, the red, green and blue channels are each coded on 8 bits (numbered from 0 to 7). The estimation $S_n$ of $S$ is given by

The use of the Kalman filter also gives us the benefit of its memoryless characteristic: it only retains the previous state to infer the current one. It is therefore not necessary to keep all the previously computed states in memory for the prediction method.
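The recursive, memoryless nature of the procedure can be illustrated with a scalar Kalman filter for an affine model. The sketch below is not the calibrated FRM-KF model: the exact placement of the coefficients `a`, `b`, `c`, `alpha`, `beta`, the observation noise variance `r`, all numerical values and the observation sequence are assumptions chosen for illustration. The initial mean and variance are those of the uniform law on [0, 255], used as a moment-matched stand-in for a Gaussian prior.

```python
def kalman_step(mean, var, y, a, b, c, alpha, beta, r):
    """One predict/update cycle of a scalar Kalman filter.

    Assumed model (hypothetical form and parameters):
        x_next = a * x + b + c * w,           w ~ N(0, 1)
        y      = alpha * x_next + beta + v,   v ~ N(0, r)
    """
    # Prediction: propagate the previous estimate through the dynamics
    mean_pred = a * mean + b
    var_pred = a * a * var + c * c
    # Update: correct the prediction with the newly received observation
    innovation = y - (alpha * mean_pred + beta)
    innovation_var = alpha * alpha * var_pred + r
    gain = var_pred * alpha / innovation_var
    mean_new = mean_pred + gain * innovation
    var_new = (1 - gain * alpha) * var_pred
    return mean_new, var_new


# Hypothetical parameters and observations, for illustration only
params = dict(a=0.5, b=0.0, c=1.0, alpha=0.5, beta=0.0, r=4.0)
prior_mean, prior_var = 127.5, 255.0 ** 2 / 12   # moments of U[0, 255]

mean, var = prior_mean, prior_var
history = []
for y in [30.0, 16.0, 8.0, 4.0, 2.0, 1.0, 0.5]:  # seven steps, n = 1..7
    # Only the latest (mean, var) pair is carried from one step to the next
    mean, var = kalman_step(mean, var, y, **params)
    history.append((mean, var))
```

Each iteration overwrites `mean` and `var`, so the memory footprint does not grow with the number of received layers; `history` is kept here only for inspection.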

Calibration and Validation of the Model
The dynamics of the conditional distribution law (characterized by its mean vector and its variance-covariance matrix) is driven by the filtering equations. Since we adopted $\beta = 0$, it remains to find $\alpha$ in such a way that (18) is minimal. After $\alpha$ and $\beta$ have been identified, one can obtain consecutively $a$, $b$ and $c$ by minimizing the sums of squared errors (SSEs) given in (19) and (20). Following the aforementioned regressions, we obtained Table 1.
Note that all the parameters satisfy the hypotheses of Proposition 1 and therefore guarantee the exponential convergence of the filter.

Experimental Evaluation of FRM-KF
This section aims at applying the filtering procedure described above to a sample of 210 images coming from the University of Southern California Signal and Image Processing Institute (USC-SIPI) database. Figure 1 illustrates the evolution of the visual rendering of images as the quality layers are received and the filtering procedure is applied.
Compared to the results in [5], the visual rendering obtained there at the fifth step is obtained here at the third step (Figure 1(d)), corresponding to a good visual quality for human perception. The Peak Signal to Noise Ratio (PSNR) was measured for the successive estimated images based on the received layers. We compared our PSNRs to those of reference methods: the Set Partitioning In Hierarchical Tree (SPIHT) method, the method of Tzu-Chuen and the method of Tung [5].
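For reference, the PSNR between an original image and an estimate can be computed as in the following sketch, which uses the usual definition for 8-bit images (peak value 255). The function name and the list-of-rows image representation are assumptions.

```python
import math


def psnr(original, estimate, peak=255):
    """Peak Signal to Noise Ratio in dB between two equally sized images.

    PSNR = 10 * log10(peak^2 / MSE), where MSE is the mean squared error
    over all pixels. Identical images give an infinite PSNR.
    """
    n_pixels = len(original) * len(original[0])
    mse = sum((o - e) ** 2
              for row_o, row_e in zip(original, estimate)
              for o, e in zip(row_o, row_e)) / n_pixels
    if mse == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse)
```

A higher PSNR means the estimate is closer to the original, so the curve of PSNR against the number of received layers summarizes how fast a progressive method converges.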
A regression analysis showed, for all considered methods, that there is an affine relation between the number of received layers and the measured PSNR (adjusted R-squared of at least 93%), with highly significant slope and intercept. Table 2 gives the regression coefficients of each method for the 256 * 256 and 512 * 512 resolutions of the Lenna image studied in [5]. Table 3 gives the difference of the regression coefficients of each considered method with respect to FRM-KF. The intercept shows that the Tzu-Chuen Lu method has the best initial PSNR while FRM-KF has the worst, probably because its initial estimation is drawn uniformly at random. Fortunately, FRM-KF has the best slope, which is about 1.63 times the second-highest slope, displayed by the SPIHT method. Hence, after only a few received layers (from the third onward), FRM-KF presents the best performance compared to the other methods.
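The kind of affine fit used in this analysis can be sketched with ordinary least squares on a single predictor. The function name is hypothetical and the data points below are made up to check the arithmetic; they are not measurements from this study.

```python
def fit_affine(xs, ys):
    """Least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared, adjusted_r_squared), where the
    adjustment is the usual one for a single predictor.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1 - ss_res / ss_tot
    adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
    return slope, intercept, r2, adjusted_r2


layers = [1, 2, 3, 4]              # number of received layers (made-up data)
psnr_values = [3.0, 5.0, 7.0, 9.0] # made-up values lying exactly on a line
slope, intercept, r2, adjusted_r2 = fit_affine(layers, psnr_values)
```

In this reading, the intercept reflects the quality of the initial estimate and the slope the PSNR gained per additional received layer, which is how the two coefficients are interpreted in the comparison above.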
The database coming from the USC-SIPI contains 73 images with a 256 * 256 resolution, 83 images with a 512 * 512 resolution, 53 images with a 1024 * 1024 resolution and only 1 image with a 2050 * 2050 resolution. For our statistical analyses we therefore focused on the 256 * 256, 512 * 512 and 1024 * 1024 resolutions. Again we found an affine relation between the PSNR and the number of received layers. The p-value was less than $2 \times 10^{-16}$ and the adjusted $R^2$ (model fitting factor) was between 88.77% and 96.96%. In order to give a general behavior of the FRM-KF method, the computed values of the slope and the intercept are given in Table 4.
We evaluated the time needed to decode the images. The first phase, which consists of generating white noise, decoding the first quality layer of the original image and combining the two, took about $2.24 \times 10^{-1} \pm 2.868 \times 10^{-2}$, $8.87 \times 10^{-1} \pm 5.612 \times 10^{-2}$ and $3.505 \pm 1.587 \times 10^{-1}$ seconds (average ± standard deviation) for 256 * 256, 512 * 512 and 1024 * 1024 resolution images, respectively. The time needed to decode each subsequent quality layer and to combine it with the previous result was $3.149 \times 10^{-2} \pm 4.012 \times 10^{-3}$, $1.223 \times 10^{-1} \pm 1.024 \times 10^{-2}$ and $4.95 \times 10^{-1} \pm 4.009 \times 10^{-2}$ seconds, respectively, for the same resolutions. The images used on current mobile devices have a resolution of at least 512 * 512. Given the time required to process a 1024 * 1024 image, we recommend the FRM-KF method for resolutions less than or equal to 512 * 512.
Focusing on the amount of data transmitted for each quality layer during the streaming of images, we notice that less than 10% of the image data is needed to obtain up to the sixth-quality image. The process is therefore suitable, in terms of processing and memory resources, for small devices with low computing capabilities.

Conclusions
This work addressed the problem of image transmission in resource-limited environments. We were interested in the progressive transmission and refinement of still images, as a process that adapts to low-quality network service. In order to achieve our objectives, we proposed a stochastic model which presents the missing parts of the image as noise effects. In a stochastic context, the problem of dynamically estimating a signal conditionally upon available observations is known as filtering. Thus, we successfully calibrated a Kalman filter model using statistical regression and some general considerations. The resulting model was precisely a discrete Kalman filter.
Applying the filtering procedure to a dataset of 209 images, we obtained satisfactory results. Indeed, we evaluated the evolution of the Peak Signal to Noise Ratio (PSNR) with respect to the number of received layers. An affine relation was found independently of the PIT method we considered (Set Partitioning In Hierarchical Tree, Tzu-Chuen, Tung and FRM-KF methods). The FRM-KF approach we proposed appeared to be the one which improves the PSNR fastest.
The performance of the FRM-KF method has been further evaluated in terms of the ratios of image quality to size of data sent and of image quality to processing time. A high quality was reached quickly with relatively little data (less than 10% of the image data is needed to obtain up to the sixth-quality image). The processing time also decreases quickly with the number of received layers. However, we found that the processing time may be large starting from an image resolution of 1024 * 1024. Hence, we recommend the FRM-KF method for resolutions less than or equal to 512 * 512.
In future work, we expect to extend our method to multimedia communication environments subject to disturbances, in order to ensure robustness to breakdowns and interference. We should also consider adapting our approach to video streaming in order to ensure a greater continuity of video streaming service content.