100 years of good Gestalt: new vistas
How good Gestalt determines low level vision
In classical models of vision, low level visual tasks are explained by low level neural mechanisms. For example, in crowding, perception of a target is impeded by nearby elements because, it is argued, responses of neurons coding for nearby elements are pooled. Indeed, performance deteriorated when a vernier stimulus was flanked by two lines, one on each side. However, performance improved strongly when the lines were embedded in squares. Low level interactions cannot explain this uncrowding effect because the neighboring lines are still present as parts of the squares. It seems that good Gestalts determine crowding, contrary to classical models, which predict that low level crowding should occur even before the squares, i.e. higher level features, are computed. Crowding and other types of contextual modulation are just one example. Very similar results were also found for visual backward and forward masking, feature integration along motion trajectories, and many other paradigms. I will discuss how good Gestalts determine low level processing by recurrent, dynamic computations, thus mapping the physical into perceptual space.
A century of Gestalt theory: The good, the bad, and the ugly
100 years ago Wertheimer published his paper on phi motion, widely recognized as the start of Gestalt theory. I will evaluate what it has offered to modern vision science. (1) The good: The emergence of structure in perceptual experience and the subjective nature of phenomenal awareness have remained central topics of research. Using methods and tools that were not at the Gestaltists' disposal, much progress has been made in outlining principles of perceptual grouping and figure-ground organization. (2) The bad: Gestalt theory was criticized for offering mere demonstrations with simple or confounded stimuli, and for formulating a law, with little precision, for every factor that influenced perceptual organization. Köhler's electrical field theory was proven wrong, and the underlying notion of psychophysical isomorphism proved unproductive. Claims about Gestalt principles being preattentive, innate, and independent of experience appeared exaggerated. (3) The ugly: Several Gestalt notions do not fit well with the rest of what we know about vision. How can we understand the relationships between parts and wholes in light of the visual cortical hierarchy and dynamics? How can internal laws based on a general minimum principle yield veridicality in the external world? Integrating Gestalt theory into modern vision science poses serious challenges.
On grouping and shape formation: new results
The problem of perceptual organization was first studied by Gestalt psychologists in terms of grouping/segmentation by asking “how do individual elements group into parts that in their turn group into larger wholes separated from other wholes?”. The aim of this work is to use Gestalt psychologists’ insights to answer the following questions: What is shape? What is its meaning? How does it pop out from grouping? What is the relationship between grouping and shape? Shape perception and its meaning were studied starting from the square/diamond illusion and following the phenomenological approach laid out by Gestalt psychologists. The role of the frame of reference in determining shape perception was discussed and largely weakened or refuted in the light of a large number of new effects, based on some phenomenal meta-shape properties that are useful and necessary to define the meaning of shape. On the basis of new illusions, it is suggested that the meaning of shape can be reconsidered as a multiplicity of meta-shape attributes that operate like meaningful primitives of the language of shape perception. Through these results, limits and advantages of the Gestalt approach to perceptual organization within modern vision science are discussed.
Gestalt influences in modal and amodal filling-in
Rob van Lier
Whatever we see, its appearance belongs to the output of the perceptual system. Time and again it appears that relatively simple stimulus manipulations reveal extraordinary perceptual output, with varying degrees of phenomenological presence. The brain appears to fill in various properties that cannot be directly derived from the retinal image. With respect to that process, a distinction has often been made between modal and amodal filling-in. This modal-amodal dichotomy particularly holds for the phenomenological appearance of the filled-in properties. Regarding brain processes, however, this distinction is much less obvious; amodal filling-in in particular can be situated in a “grey zone”, somewhere in between seeing and thinking. Here Gestalt principles of visual organization may compete with influences of higher level aspects such as knowledge and familiarity. I will review recent studies on various filling-in phenomena and show how they help to understand the underlying mechanisms of perception.
Amodal completion and shape approximation
With few exceptions (Fantoni et al, 2008, Vision Research, 48, 1196–1216; Fulvio et al, 2009, Journal of Vision, 9(4):5, 1–19) the amodal completion of angles has been conceived as the production of a trajectory that interpolates veridically represented input segments. This is also the case for the Gerbino illusion, originally explained as the consequence of amodal additions based on good continuation (Gerbino, 1978, Italian Journal of Psychology, 5, 88-100). Alternatively, amodal completion might involve approximation. Curve fitting by polynomial functions makes the difference clear (Ullman, 1996, High-level Vision. MIT Press). Interpolation generates a curve that connects all points and minimizes the changes of direction; approximation generates a curve that minimizes the distances from points, with a variable error intrinsic to noisy data. In the Gerbino illusion, approximation generates a smooth hexagon that cannot match the arrangement of input segments, given the coincidental occlusion of vertices. Fantoni et al (2008) provided evidence of approximation in amodal completion of 3D surfaces. With reference to such phenomena I will discuss the assumption that perceptual experience includes representations not only of the optic input but also of the degree of mismatch between the input and approximated shapes.
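The contrast between interpolation and approximation can be illustrated with a minimal sketch. It is not taken from the abstract or from the cited works: the sample points are hypothetical, Lagrange interpolation stands in for "a curve that connects all points" (it does not minimize changes of direction), and an ordinary least-squares line stands in for "a curve that minimizes the distances from points" (it minimizes squared vertical distances).

```python
def lagrange_interpolate(points, x):
    """Evaluate at x the unique polynomial that passes through every point."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def least_squares_line(points):
    """Return (slope, intercept) of the line minimizing squared vertical error."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical noisy samples of a straight contour (assumption, not real data).
samples = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.2), (3.0, 2.8), (4.0, 4.1)]

slope, intercept = least_squares_line(samples)

# Interpolation reproduces every sample; approximation leaves a residual at each one.
interp_residuals = [abs(lagrange_interpolate(samples, x) - y) for x, y in samples]
approx_residuals = [abs(slope * x + intercept - y) for x, y in samples]

print("max interpolation residual:", max(interp_residuals))
print("max approximation residual:", max(approx_residuals))
```

In this toy case the interpolating curve has zero residual at the samples, whereas the fitted line carries a nonzero mismatch at each sample; it is this kind of mismatch that the abstract proposes is represented in perceptual experience.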
Understanding perceptual organization: What, how, and why?
Koffka famously asked why things look the way they do. I will argue that answering this question entails answering at least two important additional questions: what things look like and how they come to look that way. Gestalt psychologists answered the what-question by producing and discussing phenomenological demonstrations of geometrical image features (e.g., proximity and similarity of elements in grouping and surroundedness and small size of regions in figure/ground perception), the how-question by hypothesizing holistic brain processes that settle into minimum-energy states, and the why-question by appealing to simplicity (Prägnanz). I will contrast this classical Gestalt approach with modern approaches to perceptual organization based on behavioral, neuroscientific, and ecological methods. I will argue that direct behavioral reports of phenomenology are epistemologically primary to other kinds of evidence and thus indispensable. I will also characterize several important developments in answering the why-question as involving the explication of ecological factors that support the perception of environmental surfaces, much as would be expected if perception relied on Helmholtzian unconscious inferences.
Shading Gradient Based Cues to Depth and Figure-Ground Perception
Tandra Ghose and Stephen Palmer
Rubin (1921) first identified the problem of figure-ground organization (FGO) in ink-blot-like images and isolated several factors (cues) that influence the process. For almost 75 years afterwards, similar flat 2-D bipartite displays were used to investigate FGO, leading to the identification of many more cues. Only very recently was border ownership discussed in relation to the asymmetrical luminance profiles of the watercolor illusion. However, none of these studies had discussed the role of the important information provided by shading and texture gradients, which are available in natural and artificial images. I will discuss the FGO cues of Extremal Edges and Gradient Cuts, which exploit the regularity in shading gradients and influence the interpretation of images because they reflect the structure of bounded surfaces in the 3-D world. I will also discuss how the discovery of such "powerful" cues to depth and FGO opened up ways of studying important open questions in FGO that could not be addressed with the "weaker" Gestalt cues [e.g., recent study by Brooks & Palmer, 2011, J Cogn Neurosci., 23(3):631-44].
Definition of Shape
Zygmunt Pizlo, Yunfeng Li, Yun Shi, Tadamasa Sawada and Robert Steinman
Gestalt Psychology made shape perception important 100 years ago, but we still do not know what shape is. Most assume that all patterns and objects have shape. This is unsatisfactory because our common sense and perceptions tell us that a random-dot pattern has less shape than a butterfly. Today, we propose a new analytical definition of shape, based on the amount of symmetry an object contains. Symmetry, here, is understood broadly, i.e., as any type of spatial regularity, measured by its self-similarity. This definition makes it possible to classify objects along a one-dimensional shape continuum, with amorphous objects, such as bent wires, crumpled papers and potatoes, having little or even zero shape. Implications derived from our definition can explain: (i) why shapes are perceived veridically; (ii) how the shapes of non-rigid, as well as rigid, objects can be handled; (iii) how content-addressable memory for shapes can be organized; and (iv) how informative a priori shape constraints (priors) allow veridical perception of unfamiliar shapes. Note that our “shapes” are measured by applying the Minimum Description Length Principle, which makes our definition a modern version of the Gestalt Law of Prägnanz. It is also similar to Leeuwenberg’s Structural Information Theory.
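The idea of placing objects on a shape continuum according to their self-similarity can be illustrated with a toy sketch. This is not the authors' definition or measure: the function name, the restriction to a single mirror reflection about the vertical axis through the centroid, the scale-dependent score, and both point sets are assumptions made purely for illustration.

```python
import math


def mirror_self_similarity(points):
    """Toy score in (0, 1]: 1.0 means the point set maps exactly onto itself
    under reflection across the vertical line through its centroid.
    Hypothetical measure for illustration; it depends on the units used."""
    cx = sum(x for x, _ in points) / len(points)
    total = 0.0
    for x, y in points:
        rx = 2 * cx - x  # mirror image of x across the line x = cx
        # Distance from the mirrored point to the closest original point.
        total += min(math.hypot(rx - px, y - py) for px, py in points)
    mean_dist = total / len(points)
    return 1.0 / (1.0 + mean_dist)


# Hypothetical "butterfly-like" set: every point has an exact mirror partner.
half = [(1.0, 0.0), (2.0, 1.0), (1.5, 2.0)]
symmetric = half + [(-x, y) for x, y in half]

# Hypothetical irregular set with no mirror regularity.
irregular = [(0.0, 0.0), (3.0, 1.0), (1.0, 4.0), (5.0, 2.0), (2.0, 2.5), (4.5, 4.0)]

score_symmetric = mirror_self_similarity(symmetric)
score_irregular = mirror_self_similarity(irregular)

print("symmetric set:", score_symmetric)
print("irregular set:", score_irregular)
```

The sketch covers only one transformation; the abstract's broad notion of symmetry as any spatial regularity would require considering other transformations (rotations, translations, and so on), which are omitted here.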
A pluralist approach to Gestalts
Peter van der Helm
According to the law of Prägnanz, Gestalts result from a nonlinear process: like any physical system, the brain tends towards relatively stable neural states characterized by cognitive properties such as symmetry, harmony, and simplicity. This idea led, initially, to representational approaches modeling those cognitive properties, and later, to dynamic-systems approaches modeling those neural states. Not surprisingly, this modeling duality triggered a controversy about which of these two kinds of approaches might be the better one. Now, however, it is time to realize that these two kinds of approaches are complementary, and that both stories are needed to tell the whole story. Future research may reveal whether the two complementary stories remain different or can be merged into one story, but a bridging function may be played by connectionism -- not so much because of its theoretical ideas, but rather because of the modeling tools it borrowed from mathematics.