Lightness Perception Model for Natural Images

A perceptual lightness anchoring model based on visual cognition is proposed. It recovers the absolute lightness of natural images using a filling-in mechanism driven by single-scale boundaries. First, it adapts the response of retinal photoreceptors to varying levels of illumination. Luminance-correlated contrast information is then obtained through multiplex encoding, without an additional luminance channel. Dynamic normalization is used to obtain smooth and continuous boundary contours, and different boundaries are used for the ON and OFF channel diffusion layers. Theoretical analysis and simulation results indicate that the model can recover natural images under varying illumination and mitigates the trapping, blurring and fogging problems to some extent.


Introduction
Studies show that humans can perceive a wide dynamic range of lightness, from dim moonlight to dazzling sunlight. The human visual system has a stable perceptual capability for scenes under variable illumination, which is referred to as lightness constancy. First of all, two concepts must be distinguished: luminance and lightness. Luminance values denote light intensities within the retinal image, while lightness relates to our perceived world. Luminance values in the retinal image are a product not only of the actual physical shade of gray of the imaged surfaces but also, and even more so, of the intensity of the light illuminating those surfaces. The luminance of any region of the retinal image can vary by a factor of no more than thirty to one as a function of the physical reflectance of that surface. However, it can vary by a factor of a billion to one as a function of the amount of illumination on that surface. The net result is that any given luminance value can be perceived as literally any shade of gray, depending on its context within the image. Despite this challenge, humans perceive the gray shades of surfaces with rough accuracy [1]. Lightness represents the cortical perceptual result of retinal stimuli, and the process of lightness perception can be understood as a mapping from natural image input to visual percept output.
The BCS/FCS (Boundary Contour System/Feature Contour System) model proposed by Grossberg et al. is a representative model of visual lightness perception. Its extended versions have explained a large body of psychological experimental data [2,3]. The BCS/FCS model discounts the illuminant to obtain contrast information through retinal preprocessing. Further processing by the visual cortex yields boundary contours and surfaces. However, such illuminant-discounted information can only estimate relative measures of surface reflectance. To perceive effectively, the visual cortex needs to compute absolute lightness values that exploit the full dynamic range. This is the so-called anchoring problem.
Grossberg held that boundaries and surfaces are the units of visual perception, and proposed the aFILM model in 2006 [4]. That model adds a lightness anchoring stage to the BCS/FCS framework. Sepp et al. proposed a multi-scale filling-in model to reconstruct image surface lightness [5]. It extends the confidence-based filling-in model to multi-scale processing, thus speeding up the filling-in process. A key aspect of visual cognition research is to process natural images effectively while explaining psychological data, which facilitates higher cognitive processes such as object recognition and provides inspiration for machine vision.
This paper presents a neural dynamic model that simulates the mapping from luminance to lightness, in line with recent neurophysiological and psychological experimental findings. The proposed model can recover natural images under varying levels of illumination and mitigates the trapping, blurring and fogging problems to some extent.

Model Description
After the retinal photoreceptors receive the input image, retinal adaptation occurs first. The light-adapted signal is passed to multiplex coding and boundary detection. The contrast information from multiplex coding is used to recover the absolute surface lightness, and the boundary contours block activity diffusion between surfaces. The final perceptual lightness output is obtained through the surface filling-in mechanism. The overall model architecture is depicted in Figure 1.

Retinal Adaptation
The model retina calculates the steady state of retinal adaptation to a given input image [4]. It adapts the response of photoreceptors to varying levels of incoming light, since otherwise the visual process could be desensitized by saturation right at the photoreceptor. Light adaptation at the photoreceptor outer segment protects each photoreceptor from saturation by using intracellular temporal adaptation that shifts the photoreceptor sensitivity curve [6]. As illustrated in Figure 2, the light-adapted signal is further processed at the photoreceptor inner segment, where it receives feedback from a horizontal cell (HC) that is connected with other HCs by gap junctions [7], forming a syncytium that is sensitive to spatial contrast. HC inhibition further adjusts the sensitivity curve to realize spatial contrast adaptation. It is assumed that the permeability of gap junctions between HCs decreases as the difference between the inputs to the HCs from coupled photoreceptors increases. The model retina thereby segregates and selectively suppresses signals in regions that have strong contrasts, such as a light source. HC connections are not constrained to nearest neighbors but reach farther regions. Inhibition of the photoreceptor by the HC controls the output of the photoreceptor by modulating Ca$^{2+}$ influx at the photoreceptor inner segment. This feedback prevents the output from being saturated by localized high-contrast input signals. The outer segment of the retinal photoreceptor can be modeled by the equation: where $(i, j)$ denotes spatial position, and $g(t)$ is an automatic gain control term that simulates negative feedback mediated by Ca$^{2+}$ ions and follows: The relation between $H_{ij}$ and the HC activity $h_{ij}$ follows: where $a_H$ and $b_H$ are constants. The activity of an HC connected to its neighbors through gap junctions is defined by the diffusion equation: where $P_{pqij}$ is the permeability between the cells at $(i, j)$ and $(p, q)$, and the accompanying parameters are constants. $N_{ij}$ is the neighborhood of size $r$ to which the HC at $(i, j)$ is connected:
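The contrast-dependent coupling between HCs can be illustrated with a minimal one-dimensional sketch. The sigmoid form of the permeability, the leaky relaxation of each HC toward its own input, and all names and parameter values (`permeability`, `hc_syncytium`, `beta`, `lam`, `radius`, `dt`) are assumptions made for illustration only; they are not the model's actual equations.

```python
import math


def permeability(u, v, beta=0.1, lam=0.02):
    """Assumed sigmoid: coupling falls toward 0 as the input difference |u - v| grows."""
    return 1.0 / (1.0 + math.exp((abs(u - v) - beta) / lam))


def hc_syncytium(inputs, radius=2, steps=200, dt=0.1):
    """Relax HC activities with explicit Euler steps (1-D, hypothetical dynamics)."""
    n = len(inputs)
    h = list(inputs)
    for _ in range(steps):
        new_h = []
        for i in range(n):
            flow = 0.0
            # Connections reach beyond nearest neighbours, up to `radius` cells away.
            for p in range(max(0, i - radius), min(n, i + radius + 1)):
                if p != i:
                    flow += permeability(inputs[p], inputs[i]) * (h[p] - h[i])
            # Leak toward the cell's own input plus lateral gap-junction flow.
            new_h.append(h[i] + dt * (-(h[i] - inputs[i]) + flow))
        h = new_h
    return h


# A dim region (about 0.2) next to a bright region (about 0.9).
signal = [0.2, 0.25, 0.2, 0.22, 0.9, 0.95, 0.9, 0.92]
h = hc_syncytium(signal)
print([round(v, 3) for v in h])
```

In this sketch, activity is averaged within each region, while the near-zero permeability across the dim/bright border keeps the two regions segregated, which is the qualitative behavior the text attributes to the HC syncytium.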

Multiplex Contrast Code
Lightness anchoring models generally incorporate an extra luminance-driven channel, in addition to retinal contrast channels, to recover absolute lightness. Multi-scale band-pass filters may be used to obtain contrast and luminance information; the aFILM model, for example, acquires contrast information through small-scale filtering and luminance information through large-scale filtering [4]. There is evidence that a larger disinhibitory surround exists outside the classical receptive field of retinal ganglion cells. Accordingly, a multiplex retinal code is proposed to solve the anchoring problem. The code is composed of retinal contrast responses that are locally modulated by brightness (ON cells) or darkness (OFF cells). The modulation is implemented by an extensive disinhibitory surround, or outer surround (OS), an annulus situated beyond the classical center-surround receptive field of retinal ganglion cells, as shown in Figure 3 [8]. The classical receptive field is sensitive to contrast information, and the outer surround is sensitive to local luminance. Luminance-correlated contrast information can thus be obtained through multiplex encoding without additional luminance channels. This is also plausible from a neurophysiological point of view. Because ON cell and OFF cell responses are asymmetric for the classical center-surround receptive field of retinal ganglion cells, a self-inhibition mechanism is adopted: where $x^{+}_{ij}$ and $x^{-}_{ij}$ are the ON cell and OFF cell responses respectively, $[\cdot]^{+} = \max(\cdot, 0)$, and the remaining parameter is a decay factor.
$G_s$ is a Gaussian kernel, applied by convolution. The last term on the right-hand side of the above equations denotes self-inhibition, which is used to solve the response asymmetry problem [9].
Let $m^{+}$ and $m^{-}$ denote the local luminance-correlated multiplexed contrast responses of the ON and OFF channels respectively. The activity of the outer surround is computed by convolution with a Gaussian kernel: $\mathrm{Norm}[\cdot]$ implements the normalization operator, which maps its input into $[0, 1]$. Outer surround activity multiplicatively gates the classical center-surround responses of retinal ganglion cells. The multiplexed contrast responses are therefore defined as where the parameter with subscript $o$ is a saturation constant.
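The gating idea can be sketched on a one-dimensional luminance staircase. This is a simplified linear illustration, not the model's equations: it omits the self-inhibition, decay and saturation terms, uses the pixel itself as the receptive-field center, and assumes that ON responses are gated by the normalized outer-surround activity and OFF responses by its complement. All function names and parameter values (`multiplex`, `sigma_s`, `sigma_o`) are hypothetical.

```python
import math


def gaussian_kernel(sigma):
    """Normalized, symmetric 1-D Gaussian kernel truncated at 3 sigma."""
    radius = int(3 * sigma)
    k = [math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in range(-radius, radius + 1)]
    total = sum(k)
    return [v / total for v in k]


def convolve(signal, kernel):
    """Same-size filtering with edge replication (kernel is symmetric)."""
    r = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - r, 0), n - 1)
            acc += w * signal[idx]
        out.append(acc)
    return out


def norm01(values):
    """Map values linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def multiplex(lum, sigma_s=1.0, sigma_o=4.0):
    surround = convolve(lum, gaussian_kernel(sigma_s))
    on = [max(c - s, 0.0) for c, s in zip(lum, surround)]    # half-wave rectified ON
    off = [max(s - c, 0.0) for c, s in zip(lum, surround)]   # half-wave rectified OFF
    outer = norm01(convolve(lum, gaussian_kernel(sigma_o)))  # outer-surround activity
    m_on = [x * o for x, o in zip(on, outer)]                # gated by local brightness
    m_off = [x * (1.0 - o) for x, o in zip(off, outer)]      # gated by local darkness
    return on, off, m_on, m_off


# Staircase with three equal luminance steps at indices 10, 20 and 30.
lum = [0.2] * 10 + [0.4] * 10 + [0.6] * 10 + [0.8] * 10
on, off, m_on, m_off = multiplex(lum)
print([round(m_on[i], 4) for i in (10, 20, 30)])
print([round(m_off[i], 4) for i in (9, 19, 29)])
```

In this sketch the ungated ON responses are identical at every step because the steps are equal, whereas the gated ON responses grow and the gated OFF responses shrink with luminance, so the contrast code also carries luminance information.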

Boundary Detection
In the BCS/FCS model, the same retinal contrast information is used in both boundary detection and the filling-in mechanism. In this paper, we use different contrast information for the two. Boundary detection is performed by a dynamic normalization network [10]. First, we define an operator: We can get: For simplicity we denote: Subsequently, we define nonlinear diffusion operators: where the neighborhood consists of the four nearest neighbors of $(i, j)$; $a$, $b$, $c$ and $d$ are the min-syncytium, max-syncytium, normalized ON-type and normalized OFF-type cell activities respectively; and $\delta$ denotes Dirac's delta function. As seen from Equation (17), a cell $a_{ij}$ of the min-syncytium can only decrease its activity over time, as long as there is any activity gradient between this cell and its four nearest neighbors. The result of the diffusion is that $a_{ij}$ decreases to the global minimum value. In an analogous fashion, a cell $b_{ij}$ of the max-syncytium finally reaches the global maximum value of activity. Smooth and continuous boundary contours are obtained from the early dynamics of the dynamic normalization network. We denote the resultant responses as $y^{+}$ and $y^{-}$. The ON contour and OFF contour, which represent diffusion barriers, are defined as where the parameter with subscript $w$ is a saturation constant. The detected boundaries are often discontinuous due to noise or other factors, which allows activity exchange between adjacent surface representations. Consequently, perceived luminance contrasts between surfaces decrease, because the surfaces eventually adopt the same value of perceptual activity. This is the so-called fogging problem. To counteract fogging, an interaction zone around contours is defined. Within this zone, brightness activity and darkness activity undergo mutual inhibition, which slows down the diffusion rate at boundary leaks. Fogging is thus decelerated, and surface edges appear blurry at boundary gaps. An auxiliary quantity is first defined with a threshold value. The interaction zone activity $Z$ is defined as where the parameter with subscript $z$ is a saturation constant and $G_{\sigma_z}$ is a Gaussian kernel with standard deviation $\sigma_z$.
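The behavior of the min- and max-syncytium can be illustrated with a one-dimensional sketch that uses two nearest neighbors instead of four. The rectified-gradient update rule, the explicit Euler integration and the names `dynamic_normalization`, `steps` and `dt` are assumptions for illustration. The sketch is run to convergence so that the global minimum and maximum can be checked, whereas the model uses only the early dynamics of the network.

```python
def dynamic_normalization(x, steps=500, dt=0.2):
    """Return (normalized, a, b) for a 1-D input x (hypothetical update rule)."""
    n = len(x)
    a = list(x)  # min-syncytium: activity can only decrease
    b = list(x)  # max-syncytium: activity can only increase
    for _ in range(steps):
        new_a, new_b = [], []
        for i in range(n):
            nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
            # Only negative gradients flow into a, only positive ones into b.
            new_a.append(a[i] + dt * sum(min(a[j] - a[i], 0.0) for j in nbrs))
            new_b.append(b[i] + dt * sum(max(b[j] - b[i], 0.0) for j in nbrs))
        a, b = new_a, new_b
    # Rescale each input value by the local estimates of minimum and maximum.
    normalized = [
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, lo, hi in zip(x, a, b)
    ]
    return normalized, a, b


x = [0.3, 0.5, 0.9, 0.4, 0.1, 0.6, 0.7, 0.2]
normalized, a, b = dynamic_normalization(x)
print([round(v, 3) for v in normalized])
```

With a small value of `steps`, each cell only sees the extrema of a limited neighborhood, which gives a local rather than global normalization in this sketch.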

Surface Filling-in
The multiplex contrast responses diffuse within the regions delimited by boundary contours to form perceptual surface representations. The diffusion layers are computed by the dynamic equations: where $f^{+}_{ij}$ and $f^{-}_{ij}$ represent brightness and darkness activity respectively, the accompanying coefficient is a constant, and $E_{in}$ is an inhibitory reversal potential. Diffusion coefficients: where the accompanying coefficient is a constant. From the equations above we can see that OFF contours are used to block the diffusion of brightness activity, while ON contours are employed to block the diffusion of darkness activity. In this way, the activity trapping problem is alleviated. The perceptual activity of surface representations where $g_{leak}$ is the leakage conductance and $V_{rest}$ is the resting potential.
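Boundary-gated diffusion can be sketched for a single one-dimensional layer as follows. The form of the diffusion coefficient, the passive decay term and all names and values (`fill_in`, `eps`, `decay`, `dt`) are assumptions chosen for illustration; the sketch omits the reversal potential and the interaction zone. For a brightness layer, `boundary` would stand in for the OFF contour, and for a darkness layer, the ON contour.

```python
def fill_in(feature, boundary, eps=1000.0, steps=2000, dt=0.2, decay=0.01):
    """Diffuse `feature` input in 1-D; links touching a boundary cell are nearly closed."""
    n = len(feature)
    f = [0.0] * n
    # Assumed diffusion coefficient of the link between cells i and i + 1.
    perm = [1.0 / (1.0 + eps * (boundary[i] + boundary[i + 1])) for i in range(n - 1)]
    for _ in range(steps):
        new_f = []
        for i in range(n):
            flow = 0.0
            if i > 0:
                flow += perm[i - 1] * (f[i - 1] - f[i])
            if i < n - 1:
                flow += perm[i] * (f[i + 1] - f[i])
            # Passive decay, lateral diffusion and the contrast-driven input.
            new_f.append(f[i] + dt * (-decay * f[i] + flow + feature[i]))
        f = new_f
    return f


n = 12
feature = [0.0] * n
feature[5] = 1.0               # contrast response just left of the contour
boundary = [0.0] * n
boundary[6] = 1.0              # contour cell acting as a diffusion barrier

sealed = fill_in(feature, boundary)
leaky = fill_in(feature, [0.0] * n)   # no contour: activity spreads everywhere
print([round(v, 2) for v in sealed])
print([round(v, 2) for v in leaky])
```

With the contour in place, activity fills the region on the left and remains close to zero on the right. Without it, activity spreads across the whole layer, which corresponds to the loss of contrast between surfaces that the text calls fogging.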

Simulation Results
To validate the effectiveness of the proposed model, we first evaluate the performance of each stage; the lightness perception performance of the whole model is then tested using natural images. The test images used in the simulation are 256 × 256 pixels.

Retinal Adaptation
First, we evaluate the performance of retinal adaptation. The simulation results are illustrated in Figure 4. The original input image, HC activity and retinal adaptation output are shown in Figures 4(a)-(c) respectively. The input image has strong contrast, so the dark part of the scene can hardly be seen. After retinal adaptation, we can see not only the clouds in the bright sky region but also the sand in the dark region, and even small rocks on the ground. This is the result of the light adaptation and spatial contrast adaptation of the retina.

Boundary Contour Detection
In this section, we compare the boundary detection performance of dynamic normalization with that of the classical Laplacian linear filtering method. Figures 5(a

Multiplex Contrasts
For the synthetic stimulus in Figure 6(a), the one-dimensional luminance staircase is shown in (b)-(d) by the black solid lines. The red solid lines denote ON responses, while the blue dashed lines correspond to OFF responses. Figure 6(b) shows the responses of the classical center-surround receptive field. OFF responses are always higher than ON responses around edges, and both decrease as luminance increases. The asymmetry of ON and OFF responses is resolved by the self-inhibition mechanism, as illustrated in Figure 6(c), where ON and OFF responses are sensitive only to contrast and insensitive to luminance. We can therefore modulate ON responses with local brightness and OFF responses with local darkness, obtaining luminance-correlated multiplex contrast responses. As seen in Figure 6(d), ON responses increase while OFF responses decrease as the luminance increases.

Surface Filling-in
Nonlinear diffusion is used to implement surface filling-in and recover absolute perceived lightness. Most filling-in models, such as confidence-based filling-in, suffer from blurring, trapping and fogging problems. As described previously, in this paper different boundaries are used for the brightness and the darkness diffusion layers. As a consequence, trapping is weakened to some extent.
In addition, the interaction zone alleviates fogging. Figure 7(a) shows the result of confidence-based filling-in, which exhibits a serious fogging problem.

Conclusions
In this paper, a perceptual lightness anchoring model based on visual cognition is proposed. It recovers the absolute lightness of real-world images using a filling-in mechanism driven by single-scale boundaries. Furthermore, it weakens trapping, blurring and fogging to some extent. Reasonable perceptual results are obtained for natural images under varying illumination conditions. The proposed model could be applied to image enhancement and image reconstruction in machine vision, and could facilitate robust higher-level processing such as object recognition. However, there are no standard objective measures for evaluating the performance of visual cognition models. Most evaluations are qualitative, and this paper is no exception. Additionally, the model recovers absolute lightness from a single-scale channel, so its noise suppression ability is weaker than that of multi-scale processing. We will investigate these aspects in future work.

Figure 1. Model block diagram.