A recurrent model of contour integration in primary visual cortex. (2008)
Venue: J. of Vision
Citations: 10 (1 self)
BibTeX
@ARTICLE{Hansen08arecurrent,
author = {Thorsten Hansen},
title = {A recurrent model of contour integration in primary visual cortex.},
journal = {J. of Vision},
year = {2008},
pages = {1--25}
}
Abstract
Physiological and psychophysical studies have demonstrated the importance of colinearity in visual processing. Motivated by these empirical findings, we present a novel computational model of recurrent long-range processing in the primary visual cortex. Unlike other models, we restrict the long-range interaction to cells of parallel orientation with colinearly aligned receptive fields. We also employ a recurrent interaction using modulatory feedback, in accordance with empirical findings. Self-normalizing shunting equations guarantee the saturation of activities after a few recurrent cycles. The primary computational goal of the model is to evaluate local, often noisy orientation measurements within a more global context and to selectively enhance coherent activity by excitatory, modulating feedback. All model simulations were done with the same set of parameters. We show that the model qualitatively reproduces empirical data of response facilitation and suppression for a single bar element depending on the local surround outside the classical receptive field (M. K. Kapadia, M. Ito, C. D.

Introduction

The response properties of neurons in the early visual stages are characterized by their receptive field (RF) properties. In this article we focus on the role of recurrent long-range processing for the enhancement of contours. We present a model of contour integration using intralaminar long-range horizontal connections in V1. We employ a novel connection pattern where the horizontal connections are confined to parallel, near-colinear orientations. Unlike cocircular filters that have been employed previously, connections between parallel, near-colinear orientations have been found in vivo.

How is this accomplished? We suggest a computational framework involving long-range connections, modulating feedback, and recurrent interactions.
The task of contour extraction cannot be solved solely on the basis of the incoming data (the feedforward excitatory input), but requires additional constraints and assumptions about the shape of salient contours. The feedforward input is insufficient to define a contour because the initial measurements are fragmented and noisy. Thus, additional knowledge about the shape of a salient contour needs to be incorporated in the visual system. A main property of salient contours is that they are smooth or colinear, as reflected in the Gestalt law of good continuation.

Motivated by empirical findings, we present a model of recurrent long-range interaction in the primary visual cortex for contour processing. The model enhances initially weak oriented activity along a contour that fits into a global context, while suppressing spurious noisy activity. At the same time, activities for multiple orientations at corners and junctions are preserved. With the same parameter settings, the model is employed to simulate physiological data on contour grouping.

Methods

In this section we describe the model and present the measures of contour saliency for the numerical evaluation of the model.

Model definition

We propose a biologically plausible model of contour integration using intralaminar long-range horizontal connections and interlaminar recurrent interactions in the primary visual cortex. The model connections create a recurrent network which transforms the feedforward input to a stable point where contours are more salient compared to noise.

Motivation of the model components and the model architecture

The model has several key components. It incorporates (i) localized receptive fields for oriented contrast processing, (ii) cooperative horizontal long-range integration, (iii) inhibitory short-range connections, and (iv) feedforward and feedback processing. These components are motivated by empirical findings.
Oriented contrast processing

Simple and complex cells in the primary visual cortex respond best to oriented edges and bars defined by a luminance contrast.

Inhibitory short-range connections

Short-range connections are rather unspecific for a particular orientation.

Modulating feedback

Several physiological studies indicate that feedback projections have a modulating or gating rather than generating effect on cell activities. The framework builds upon previous work by Grossberg and colleagues.

The core model architecture that we propose here consists of three main stages: (1) an initial preprocessing of the input and a recurrent processing within the two following stages, (2) a combination stage of modulatory feedback and feedforward input, and (3) a cooperative-competitive stage of center-surround long-range interaction. The key component of this architecture is the recurrent processing at stages 2 and 3.

Feedforward preprocessing

In the feedforward path, the initial luminance distribution is processed by isotropic LGN cells, followed by orientation-selective simple and complex cells. The interactions in the feedforward path are governed by basic linear equations to keep the processing in the feedforward path relatively simple and to focus on the contribution of the recurrent interaction. A more elaborate processing in the feedforward path would make use of, e.g., nonlinear processing at the level of LGN cells and simple cells.

Journal of Vision (2008) 8(8):8, 1-25, Hansen & Neumann

LGN on- and off-cells

Retinal ganglion cells and cells in the LGN have receptive fields with a circular center-surround organization.

Here and in the following, $*$ denotes the spatial correlation operator and $[x]^{+} := \max\{x, 0\}$ denotes half-wave rectification. The DoG operator is parameterized by the standard deviations of the center and surround Gaussians ($\sigma_c = 1$, $\sigma_s = 3$), respectively.
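The center-surround stage can be illustrated with a minimal Python sketch. It assumes that the DoG operator is the difference of two normalized isotropic Gaussians with $\sigma_c = 1$ and $\sigma_s = 3$, and that on- and off-responses are the half-wave rectified positive and negative parts of the correlation result. The function names, the kernel radius, and the zero padding at the image border are illustrative choices, not taken from the article.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalized 2D isotropic Gaussian sampled on a (2*radius+1)^2 grid."""
    size = 2 * radius + 1
    k = [[math.exp(-((x - radius) ** 2 + (y - radius) ** 2) / (2.0 * sigma ** 2))
          for x in range(size)] for y in range(size)]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]

def dog_kernel(sigma_c=1.0, sigma_s=3.0, radius=9):
    """Difference of Gaussians: center minus surround (radius is an assumption)."""
    center = gaussian_kernel(sigma_c, radius)
    surround = gaussian_kernel(sigma_s, radius)
    return [[c - s for c, s in zip(cr, sr)] for cr, sr in zip(center, surround)]

def correlate_at(image, kernel, cy, cx):
    """Spatial correlation at one pixel; pixels outside the image count as zero."""
    radius = len(kernel) // 2
    h, w = len(image), len(image[0])
    acc = 0.0
    for ky, row in enumerate(kernel):
        for kx, kv in enumerate(row):
            y, x = cy + ky - radius, cx + kx - radius
            if 0 <= y < h and 0 <= x < w:
                acc += kv * image[y][x]
    return acc

def on_off_response(image, kernel, cy, cx):
    """Assumed on/off split: [r]^+ for the on-cell, [-r]^+ for the off-cell."""
    r = correlate_at(image, kernel, cy, cx)
    return max(r, 0.0), max(-r, 0.0)
```

Because both Gaussians are normalized, the kernel sums to zero, so a uniform luminance patch produces no response in either channel, while a bright spot on a dark background drives only the on-channel.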
The chosen ratio of the sizes of the center and surround, $\sigma_s/\sigma_c = 3$, is larger than the ratio 1.6 that would approximate a Laplacian.

Simple cells

Simple cells in V1 have elongated subfields (on and off) which sample the input of appropriately aligned LGN responses. Input sampling is modeled by correlation with rotated, anisotropic Gaussians. The Gaussians are shifted perpendicularly to their main axis by $\tau = \pm 3$ to model left and right subfields of an odd-symmetric simple cell. The activations of the off-channel are computed analogously to those of the on-channel. Simple cells are modeled for two polarities (dark-light and light-dark) in $O_{\max} = 4$ orientations ($\theta = 0, \pi/O_{\max}, \ldots, (O_{\max} - 1)\pi/O_{\max}$). The standard deviations of the anisotropic Gaussians are set to $\sigma_y = 1$, $\sigma_x = 3\sigma_y$. For each orientation, the simple cell activity is computed by pooling the two subfield responses, separately for light-dark (ld) and dark-light (dl) simple cells. We employed separate equations for the on and off subfields here because this makes the connection to a proposed model variant using early feedback clearer (Model variant using early feedback section). In this variant of the model we replace Equation 2 by a feedback-controlled inhibition from cells of opposite contrast polarity.

Complex cells

Cortical complex cells are polarity insensitive. Their response is generated by pooling simple cells of opposite polarities. Before pooling, simple cells of opposite polarities compete and are spatially blurred. The final complex cell responses represent oriented contrast energy similar to the energy models proposed to measure motion energy.

Recurrent long-range interaction

The output of the feedforward preprocessing defines the input to the recurrent loop, which has two stages: a combination stage where bottom-up and top-down inputs are fused, and a stage of long-range interaction.
Solving the equation at equilibrium, $\partial_t V_\theta = 0$, results in a normalization of activity. The weighting parameter $\delta_V = 2$ is chosen so that the dimensions of $C_\theta$ and $W_\theta$ are approximately equal, the decay parameter $\alpha_V = 0.2$ is chosen small compared to $\mathrm{net}_\theta$, and $\beta_V = 10$ scales the activity to be sufficiently large for the subsequent long-range interaction. For the first iteration step, feedback responses $W_\theta$ are set to $C_\theta$.

Long-range interaction

At the long-range stage, the contextual influences on cell responses are modeled. Orientation-specific, anisotropic long-range connections provide the excitatory input. The inhibitory input is given by isotropic interactions in both the spatial and orientational domain. Long-range connections are modeled by a filter whose spatial layout is similar to the bipole filter proposed in earlier work. Essentially, excitatory input is provided by correlation of the feedforward input with the long-range filter $B_\theta$. A cross-orientation inhibition prevents the integration of cell responses at positions where responses for the orthogonal orientation also exist. In the equation governing the excitatory input, $*$ denotes spatial correlation and $[x]^{+} := \max\{x, 0\}$ denotes half-wave rectification.

The long-range filter is defined as a polar-separable function, i.e., as the product of an angular function $B_{\mathrm{ang}}$ and a radial function $B_{\mathrm{rad}}$. The angular function $B_{\mathrm{ang}}$ is maximal for the preferred direction $\theta$ and smoothly rolls off in a cosine fashion, being zero for angles deviating more than $\alpha/2$ from the preferred orientation:

$$B_{\mathrm{ang}}(\varphi) = \begin{cases} \cos\left(\dfrac{\pi/2}{\alpha/2}\,(\theta - \varphi)\right) & \text{if } |\varphi - \theta| \le \alpha/2, \\ 0 & \text{else.} \end{cases}$$

The parameter $\alpha$ that defines the opening angle of the long-range filter is set to 20 deg. The radial function $B_{\mathrm{rad}}$ is constant for values smaller than $r_{\max} = 25$ and smoothly decays to zero in a Gaussian fashion for values larger than $r_{\max}$:

$$B_{\mathrm{rad}}(r) = \begin{cases} \exp\left(-r^{2}/(2\sigma)\right) & \text{if } r > r_{\max}, \\ 1 & \text{else.} \end{cases} \qquad (11)$$

The standard deviation of the Gaussian is set to $\sigma = 3$. The long-range filter is finally normalized such that the filter integrates to one.
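The construction of the long-range filter described above can be sketched in Python. Several points are assumptions rather than transcriptions: the radial decay is measured from $r_{\max}$ and uses $\sigma^2$, i.e. $\exp(-(r - r_{\max})^2/(2\sigma^2))$, so that the profile is continuous at $r_{\max}$ as the verbal description suggests; the angular function is applied to both lobes ($\theta$ and $\theta + \pi$), following the bipole-like layout; and the grid radius and function names are hypothetical.

```python
import math

def b_ang(phi, theta=0.0, alpha=math.radians(20.0)):
    """Cosine roll-off around theta; zero beyond alpha/2. Angles in radians."""
    # Assumption: the filter has two lobes, so differences are wrapped modulo pi.
    d = (phi - theta + math.pi / 2.0) % math.pi - math.pi / 2.0
    if abs(d) <= alpha / 2.0:
        return math.cos((math.pi / 2.0) / (alpha / 2.0) * d)
    return 0.0

def b_rad(r, r_max=25.0, sigma=3.0):
    """Constant up to r_max, then Gaussian decay (assumed to start at r_max)."""
    if r <= r_max:
        return 1.0
    return math.exp(-((r - r_max) ** 2) / (2.0 * sigma ** 2))

def long_range_filter(theta=0.0, radius=40):
    """Polar-separable filter B_rad(r) * B_ang(phi), normalized to sum to one."""
    size = 2 * radius + 1
    kernel = []
    for iy in range(size):
        row = []
        for ix in range(size):
            x, y = ix - radius, iy - radius
            r = math.hypot(x, y)
            phi = math.atan2(y, x)
            row.append(b_rad(r) * b_ang(phi, theta))
        kernel.append(row)
    total = sum(map(sum, kernel))
    return [[v / total for v in row] for row in kernel]
```

With a reference orientation of 0 deg, the weights concentrate in two narrow wedges along the horizontal axis and vanish in the perpendicular direction, which is the colinear connection pattern the model relies on.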
A plot of the long-range filter for a reference orientation of 0 deg is depicted in the accompanying figure.

Responses are not salient if neighboring cells of random orientation show strong responses. Such activity has an inhibitory effect on the target cell (Kapadia et al., 1995). This inhibitory effect is modeled by a sampling of activity with isotropic Gaussians from the orientational neighborhood ($g_{\sigma_o,\theta}$, $\sigma_o = 0.5$) and the spatial neighborhood ($G_{\sigma_{\mathrm{sur}}}$, $\sigma_{\mathrm{sur}} = 8$). The standard deviation of the Gaussian in the spatial domain is set to $\sigma_{\mathrm{sur}} = 8$ to model the smaller extent of the inhibitory short-range connections. This parameterization results in an effective spatial extension of about half the size of the excitatory long-range interaction modeled by the long-range filter. The standard deviation in the orientational domain is set to $\sigma_o = 0.5$ to give near-zero input for the orthogonal orientation. The orientational weighting function $g_{\sigma_o,\theta}$ is implemented by a 1D Gaussian $g_\sigma$, discretized on a zero-centered grid of size $O_{\max}$, normalized, and circularly shifted so that the maximum value is at the position corresponding to $\theta$. The spatial profile of the 2D Gaussians weighted by the orientational Gaussian is visualized in the accompanying figure.

The equation is solved at equilibrium, resulting in a nonlinear, divisive interaction at the long-range stage (Equation 13), where $\alpha_W = 0.2$ is the decay parameter and $\eta^{+} = 5$, $\eta^{-} = 2$, and $\beta_W = 0.001$ are scale factors. The multiplicative contribution of $V_\theta$ ensures that long-range connections have a modulating rather than generating effect on cell activities. To understand the general behavior, let us consider the second term of Equation 13, $\beta_W V_\theta (1 + \eta^{+}\,\mathrm{net}^{+}_\theta)$. This term denotes the soft-gating of the activity from the long-range integration $\mathrm{net}^{+}_\theta$ by the response of the combination stage $V_\theta$.
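The modulating character of the soft-gating term can be made concrete with a small Python sketch. It assumes, as a reading of the description rather than a transcription of Equation 13, that the dynamics have a three-term form with a decay $-\alpha_W W_\theta$, the soft-gated excitation $\beta_W V_\theta (1 + \eta^{+}\,\mathrm{net}^{+}_\theta)$, and a shunting inhibition $-\eta^{-} W_\theta\,\mathrm{net}^{-}_\theta$. Setting the time derivative to zero then gives a divisive expression. The function name and the symbol $\mathrm{net}^{-}_\theta$ for the inhibitory input are assumptions.

```python
def long_range_equilibrium(v, net_plus, net_minus,
                           alpha_w=0.2, beta_w=0.001,
                           eta_plus=5.0, eta_minus=2.0):
    """Equilibrium of an assumed three-term shunting equation.

    0 = -alpha_w * W + beta_w * v * (1 + eta_plus * net_plus)
        - eta_minus * W * net_minus
    solved for W. All inputs are expected to be non-negative.
    """
    excitation = beta_w * v * (1.0 + eta_plus * net_plus)  # soft-gated by v
    return excitation / (alpha_w + eta_minus * net_minus)  # divisive inhibition
```

Under these assumptions, a cell with no bottom-up activity ($V_\theta = 0$) stays silent however strong the long-range input is, whereas an active cell is enhanced by excitatory context and divisively suppressed by inhibitory context.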
Stability concerning the boundedness of input and output activation is achieved by the combined effect of this soft-gating mechanism together with the divisive, or shunting, inhibition that becomes effective through the third term of Equation 13. In addition, the gating variable $V_\theta$ is also bounded by employing a shunting mechanism to achieve mass action for the excitatory term (compare Equation 7). The model is robust against parameter changes, mainly due to the compressive transformation equations. For the combination of responses (Equation 7), however, it is crucial that the activities in both streams are of a similar order of magnitude. Also, the relative RF sizes must not be substantially altered. The current parameter setting results in relative RF sizes of complex cells : isotropic short-range filter : long-range interaction of about 1:2.5:4, assuming a cut-off of the Gaussians at $2\sigma$ (or 95% of the total energy). The model is implemented in Matlab; the m-files of the model implementation are provided as Supplementary Material.

Model evaluation

Two measures of contour saliency and orientation significance are used to numerically evaluate the competencies of the model.

Contour saliency

To quantify the contour enhancement, we use a previously suggested saliency measurement. The relative enhancement of contour activity, $r$, was then defined as the ratio of the mean saliency along the contour and the mean saliency measured over all positions. A second measurement, $z$, compared the standard deviation of the saliencies at all positions, $\sigma_{\mathrm{all}}$, with the difference of the mean saliencies. A salient contour is characterized by high values of $r$ and $z$.

Modifications of the saliency definition

We also investigated modifications of the saliency definition as given above.
In these modifications, we defined the net saliency as the sum across all orientations at a given position. Further, we used an alternative formulation for the ratio where the mean saliency along the contour was compared with the mean saliency of the background.
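The evaluation measures can be sketched in Python as follows. The sketch assumes that the second measure is the difference between the mean contour saliency and the mean saliency over all positions, divided by the standard deviation over all positions (taken here as the population standard deviation), and that the background consists of all positions not on the contour. The function names and the data layout (nested lists indexed by row and column, contour given as a set of positions) are illustrative.

```python
from statistics import mean, pstdev

def net_saliency(responses):
    """Sum activities across orientations; responses[y][x] is a list per cell."""
    return [[sum(cell) for cell in row] for row in responses]

def contour_measures(saliency, contour):
    """Return (r, r_background, z) for a saliency map and contour positions.

    saliency: 2D list of non-negative values
    contour:  set of (y, x) tuples marking contour positions
    """
    positions = [(y, x) for y in range(len(saliency))
                 for x in range(len(saliency[0]))]
    all_vals = [saliency[y][x] for (y, x) in positions]
    on_contour = [saliency[y][x] for (y, x) in positions if (y, x) in contour]
    background = [saliency[y][x] for (y, x) in positions if (y, x) not in contour]

    r = mean(on_contour) / mean(all_vals)               # contour vs. all positions
    r_background = mean(on_contour) / mean(background)  # alternative ratio
    # Assumed form of the second measure: mean difference in units of sigma_all.
    z = (mean(on_contour) - mean(all_vals)) / pstdev(all_vals)
    return r, r_background, z
```

Both ratios are undefined when the corresponding mean is zero, and the second measure when all saliencies are equal, so a practical implementation would need to guard against these cases.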