Fuzzy perceptual grouping in image understanding*

Anca L. Ralescu** and James G. Shanahan
Laboratory for International Fuzzy Engineering Research (LIFE), Siber Hegner Building 4th floor, 89-1 Yamashita-cho, Naka-ku, Yokohama 231, Japan
(e-mail: jimi@ Tel: (+81)(45)212-8239 / 212-8258) (fax: (+81)(45)212-8255)

Abstract

The image understanding work we present is part of a navigation support system. We explore the use of fuzzy techniques for image understanding via perceptual organization. A brief review of previous work on perceptual organization introduces our motivation for using fuzzy techniques for representation, which in turn entails their use for reasoning about the result of the organization. The approach is supported by experimental results obtained for an office scene environment.

I. INTRODUCTION

The opening phrase of [10] paraphrases Aristotle in stating that vision is to know what is where. Marr points out that this indeed seems to be the essence of vision from both the plain man's and the scientist's position. However, this definition is deceptively simple: the nature of vision is yet to be understood despite studies carried out in philosophy, physics (optics), psychology/cognitive science, biology, neuroscience and, most recently, computer science (computer vision). Prior to the advent of computer vision, studies in vision were concerned with different aspects of the human vision system. Computer vision, on the other hand, concerns itself with producing theories of vision which can be implemented successfully in a computer. The hope, rather than the requirement, is that these theories can and will help to better understand the human vision system as well. This aspect of computer vision becomes more pressing as the desire grows to build machines which can interact with, or give some kind of support to, the human user.
Indeed, one can say that the ability to build such machines will be the most important contribution of the research in intelligent systems. Thus, to a large extent, the way humans understand images will offer valuable clues on how such machines should be built.

It is beyond the scope of this paper to give a full review of the impact on computer vision of studies on human visual perception. However, we must mention the Gestalt theory of perception, on which the perceptual organization approach in vision [20] is based. The basic idea of Gestalt is embodied in its Pragnanz law, according to which human activity in general, and perception in particular, consists in identifying wholes. Wholes are organized according to properties such as similarity, proximity, closure, good continuation, regularity, symmetry, simplicity, etc. However, the Gestalt psychologists did not succeed in explaining just how these wholes could be obtained. According to [10] there are two reasons for this failure: lack of mathematical knowledge and lack of an information processing approach.

* The order of the authors' names is strictly alphabetical. Correspondence can be addressed to either author.
** On leave from the Department of Electrical and Computing Engineering, University of Cincinnati, USA.

An important aspect of visual perception which will be useful for vision is that perception is an active inferential process. This has been the starting point for many studies on perception [5] and on visual perceptual grouping [6], [9], [11], [12], [15], [16]. In [5] perception is viewed as a process of generating hypotheses, the outcome of the perception corresponding to the most probable explanation of the stimulation received. In [9] perceptual organization is controlled by a probabilistic mechanism based on the notion of 'non-accidental grouping'; in [15] perception is viewed as inference in a Bayesian network.
These approaches make use of prior knowledge in the form of the prior probabilities needed in the Bayesian approach, knowledge about non-accidental groupings, about the probability of explanations given stimuli, etc. It is not clear to what extent prior knowledge is needed in order for perception to take place. An elaborate discussion on the nature of perception as a direct function of what is present in an image is given in [2]. In more recent times, evidence from clinical neuroscience [17], [18], as well as psychological theories, indicates that prior knowledge is not necessary (and in some cases of brain lesions it is even useless). Gibson's [3, 4] concept of 'direct perception' maintains that the optic information in the image provides more than enough to enable perception. Marr [10] also emphasizes that image representations must be in terms of tokens which can be extracted reliably and repeatedly from the image and which correspond to changes in the viewed surface.

0-7803-2461-7/95/$4.00 © 1995 IEEE

A thorough review of perceptual organization work in vision is given in [15], where different methods are discussed according to the level at which perceptual organization is applied, the computational approach, or the type of images to which they apply. Whether heuristics-based or analytical, these studies embody some aspect of the Gestalt Pragnanz principle. However, it is remarked in [15] that, while most image processing and recognition tasks could be stated as perceptual grouping tasks, there is as yet no approach or study which incorporates all the ideas put forward by the Gestalt movement. This indicates how difficult the issues are.

In this paper we follow the approach we first described in [13] and [8]. Similarly to [10], we consider the vision process as consisting of three tasks: selection, grouping and discrimination of tokens extracted from the image.
Selection forbids combination of dissimilar tokens; grouping specifies the properties used to combine similar tokens; discrimination controls grouping by creating boundaries. Gestalt ideas are used in deciding grouping properties, which are expressed as fuzzy predicates, hence fuzzy perceptual grouping (FPG). The fuzzy predicates are specified as fuzzy sets (triangular, trapezoidal, or semi-trapezoidal), requiring respectively three, four or two parameters. Although these parameters can be considered as prior knowledge, they can be easily tuned, making it unnecessary that they be specified exactly. The approach is bottom-up, based solely on the data in the image; no prior knowledge other than the fuzzy set parameters is used. The work closest to ours is that of [7], in that fuzzy predicates are used to express grouping properties. However, unlike [7], in our work the use of fuzzy techniques extends to making inferences about the result of grouping and to constructing new tokens.

II. FUZZY PERCEPTUAL GROUPING OPERATORS

The particular instance of our work is that of images to which noise reduction and edge operators have been applied. The input is a collection of line segments fitted from the results of edge detection. The goal is to obtain contours in the image which are reasonable approximations of the objects present in the image. For many reasons, physical and mathematical (illumination conditions, inter-object reflections, shadows, occlusion, general purpose edge detectors, etc.), many linear segments which should be present in the image get segmented and displaced, making it combinatorially explosive to reason with the detected line segments. Also, near junctions/corners or in the close presence of other strong features, these line segments get displaced from the straight lines that correspond to the region boundaries, thus making simple collinearization impossible.
Thus, in Gestalt terminology we can say that grouping is restoring the wholes (in this case longer line segments and L-junctions). The properties of similarity and proximity play an important role in grouping. These properties are defined differently for constructing line segments and for L-junctions. In all cases, though, the vague nature of these properties lends itself to representation using fuzzy sets.

A. Perceptual organization for obtaining straight line segments (grouping and discrimination)

Grouping: Given a collection $S$ of straight segments (fitted from the results of edge detection, after noise reduction) we want to obtain a collection $S_1$ of straight segments that summarizes $S$ in the following sense:
(i) $|S_1| < |S|$, where $|\cdot|$ denotes the cardinality of a set.
(ii) Each segment in $S_1$ is obtained via a grouping operation from a subset of segments in $S$.
(iii) Each segment in $S_1$ is at least as long as the longest segment in the grouping which produced it.

The grouping properties for producing straight line segments are similarity (equal slope) and proximity (close distance), shown in Figures 1 and 2 respectively. Proximity is measured in two ways (Fig. 2), using the perpendicular distance and what we call the parallel distance (equivalent to the endpoint distance of [7]). These are integrated using the fuzzy logic min operator for conjunction to obtain the overall degree of proximity. The parameters for the fuzzy sets are either provided by the user, or they can be learned, based on knowledge of scene, camera position, etc.

The collinearity of two segments is defined as the aggregation of similarity and proximity, that is,

Collinearity = H(similarity, proximity)

where $H : [0, 1] \times [0, 1] \to [0, 1]$ is an aggregation operator of fuzzy logic; in our experiments $H(a, b) = \frac{a + b}{2}$, but other aggregation operators are possible.
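The collinearity computation described above can be illustrated with a minimal Python sketch. It assumes decreasing semi-trapezoidal membership functions, reads the (u1, u0) pairs printed in Figure 2 as "full membership up to u1, none from u0 on", and takes H as the arithmetic mean. The function names and the similarity parameters (5 and 15 degrees) are hypothetical, since the text does not give the slope-similarity set numerically.

```python
def semi_trapezoid(x, u1, u0):
    """Decreasing semi-trapezoidal membership: 1 up to u1, 0 from u0 on."""
    if x <= u1:
        return 1.0
    if x >= u0:
        return 0.0
    return (u0 - x) / (u0 - u1)


def proximity(perp_dist, par_dist):
    """Overall proximity: fuzzy conjunction (min) of the two distance degrees."""
    # (u1, u0) values as printed in Figure 2 (pixels); their reading as
    # "1 up to u1, 0 from u0 on" is an assumption of this sketch.
    return min(semi_trapezoid(perp_dist, 4.0, 8.0),
               semi_trapezoid(par_dist, 7.0, 11.0))


def similarity(angle_diff_deg, u1=5.0, u0=15.0):
    """Slope similarity from the absolute angle difference in degrees."""
    # u1 and u0 are hypothetical values chosen only for illustration.
    return semi_trapezoid(abs(angle_diff_deg), u1, u0)


def collinearity(angle_diff_deg, perp_dist, par_dist):
    """Aggregate similarity and proximity with H(a, b) = (a + b) / 2."""
    a = similarity(angle_diff_deg)
    b = proximity(perp_dist, par_dist)
    return (a + b) / 2.0
```

For instance, `collinearity(10, 6, 9)` combines a similarity of 0.5 with a proximity of min(0.5, 0.5). Swapping the last line of `collinearity` for another aggregation operator changes only how the two degrees are combined.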
Figure 1: The similarity between two segments.

Figure 2: The proximity of two segments. Membership functions over distance in pixels: perpendicular distance (u1 = 4, u0 = 8); parallel distance (u1 = 7, u0 = 11).

Figure 3: Determining the extent of the inferred segment (on the line obtained in (i)); panel (c): perpendicular projection.

Given a line segment L0 (usually the longest segment), coll-L0 is the collection of segments which satisfy the collinearity property with L0. coll-L0 can be thought of as a fuzzy segment. Next, this segment is defuzzified in order to obtain a crisp representative. This requires two operations: (i) determine location, and (ii) determine extent.

For (i), the slope and one point on the line containing the representative segment are obtained by defuzzifying the fuzzy set of slopes coll-Θ0 and the fuzzy set of midpoints coll-M0 for the segments in coll-L0:

$\text{coll-}\Theta_0 = \theta_0/\mu_0 + \theta_1/\mu_1 + \dots + \theta_k/\mu_k$
$\text{coll-}M_0 = M_0/\mu_0 + M_1/\mu_1 + \dots + M_k/\mu_k$

where $\mu_i$ is the membership degree of the $i$-th segment in coll-L0. Several defuzzification methods are available, the most common being the Center of Gravity (COG) method, according to which we select

$\theta = \sum_i w_i \theta_i$ and $M = \sum_i w_i M_i$, where $w_i = \frac{\mu_i}{\sum_j \mu_j}$, $i, j = 0, 1, \dots, k$.

For (ii), the extent of the segment is determined from the max/min projections of the endpoints of the segments in coll-L0 on the line determined in (i). These projections can be true projections (i.e. perpendicular to this line) or projections along the x- or y-axis (Fig. 3 a, b).

Discrimination (overlap of two segments): Experiments show that collinearity is usually overridden for segments which overlap (in a sense to be explained below). This suggests that the overlap is acting as a discrimination operator between the groupings to which the overlapping segments belong.
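Before the overlap is defined, the defuzzification steps (i) and (ii) described above can be illustrated with a minimal Python sketch. It assumes that segments are given as pairs of endpoints, that slopes are represented as orientation angles normalized to (-π/2, π/2], and that the extent uses perpendicular projections. All function names are hypothetical, and near-vertical segments (whose angles wrap around) are not handled.

```python
import math


def defuzzify_cog(values, memberships):
    """Center of gravity: sum_i w_i * v_i with w_i = mu_i / sum_j mu_j."""
    total = sum(memberships)
    if total == 0:
        raise ValueError("all membership degrees are zero")
    return sum(mu * v for v, mu in zip(values, memberships)) / total


def orientation(seg):
    """Orientation angle of a segment, normalized to (-pi/2, pi/2]."""
    (x1, y1), (x2, y2) = seg
    theta = math.atan2(y2 - y1, x2 - x1)
    if theta > math.pi / 2:
        theta -= math.pi
    elif theta <= -math.pi / 2:
        theta += math.pi
    return theta


def representative_segment(segments, memberships):
    """Crisp representative of a fuzzy segment (location, then extent)."""
    # (i) location: COG of the orientations and of the midpoints.
    theta = defuzzify_cog([orientation(s) for s in segments], memberships)
    mx = defuzzify_cog([(s[0][0] + s[1][0]) / 2 for s in segments], memberships)
    my = defuzzify_cog([(s[0][1] + s[1][1]) / 2 for s in segments], memberships)
    # (ii) extent: min/max perpendicular projections of all endpoints
    # onto the line through (mx, my) with direction (ux, uy).
    ux, uy = math.cos(theta), math.sin(theta)
    ts = [(x - mx) * ux + (y - my) * uy for s in segments for (x, y) in s]
    t_min, t_max = min(ts), max(ts)
    return ((mx + t_min * ux, my + t_min * uy),
            (mx + t_max * ux, my + t_max * uy))
```

As a usage example, two horizontal segments from (0, 0) to (2, 0) and from (5, 0) to (3, 0) with membership degrees 1.0 and 0.5 yield a representative running from (0, 0) to (5, 0).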
In a strict sense the overlap is defined between segments which are strictly collinear, that is, they are on the same line, as follows:
(i) two strictly collinear segments $S$ and $T$ overlap if $S \cap T \neq \emptyset$;
(ii) the degree of overlap, $O$, of two strictly collinear segments is given by $O = \frac{|S \cap T|}{|S \cup T|}$, where $|\cdot|$ denotes the length of a segment.

In our problem, two segments $S$ and $T$ are seldom (if ever) on the same line. We extend the concept of overlap to segments which are not necessarily on the same line as follows. Given two segments $S = A_1A_2$ and $T = B_1B_2$ and the usual distance between two points, $d$, we consider successively:
(a) $E = \{d_{ij} : d_{ij} = d(A_i, B_j),\ i, j = 1, 2\}$, the collection of endpoint distances;
(b) $d_{ps}(P, T)$, the distance from a point $P$ to a segment $T$: $d_{ps}(P, T) = \inf\{d(x, P) \mid x \in T\}$. It can be seen that $d_{ps}(P, T) = \min\{d(P, B_1), d(P, B_2)\}$ if $Q \notin T$, and $d_{ps}(P, T) = |PQ|$ otherwise, where $\Delta$ is the line on which $T$ lies and $Q = \mathrm{Pr}_\Delta(P)$ is the projection of $P$ on $\Delta$;
(c) $dd(S, T)$, the directed distance between two segments $S$ and $T$: $dd(S, T) = \sup\{d_{ps}(x, T) \mid x \in S\}$. It can be seen that $dd(S, T) = \max\{d_{ps}(A_1, T), d_{ps}(A_2, T)\}$.

We now define the quantity

$N = \max\{d : d \in E\} - [dd(S, T) + dd(T, S)]$

and finally the overlap $O(S, T)$ as

$O(S, T) = \frac{N \vee 0}{\max\{d : d \in E\}}$

where $\vee$ denotes the maximum. It is easy to show that:
- if $S$ and $T$ are strictly collinear, $O(S, T)$ reduces to (ii);
- if $S$ and $T$ are perpendicular, $O(S, T) = 0$;
- if $S$ and $T$ are parallel, the overlap decreases as $dd(S, T) + dd(T, S)$ increases.

Remark: Similarly, we have $NO(S, T) = \frac{|N \wedge 0|}{\max\{d : d \in E\}}$, the degree of non-overlap between $S$ and $T$ (with $\wedge$ denoting the minimum); for strictly collinear segments this is equal to the gap between the segments relative to the largest distance between endpoints.

B. Perceptual organization for obtaining junctions

Once the line merging has reached a stable state, junction inference can be carried out. An image junction is a set of lines which co-terminate (Fig. 4). In this work only L-junctions are considered.
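As an illustration of the overlap measure O(S, T) and the non-overlap NO(S, T) defined in Section II.A, the following Python sketch follows steps (a) to (c) literally. It assumes segments are given as pairs of (x, y) endpoints; the function names are hypothetical and the degenerate case of two coincident zero-length segments is simply rejected.

```python
import math


def dist(p, q):
    """Euclidean distance d between two points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def d_ps(p, seg):
    """Distance from point p to segment seg = (b1, b2), step (b)."""
    b1, b2 = seg
    dx, dy = b2[0] - b1[0], b2[1] - b1[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return dist(p, b1)
    # Position of the projection Q of p on the supporting line of seg.
    t = ((p[0] - b1[0]) * dx + (p[1] - b1[1]) * dy) / length_sq
    if 0.0 <= t <= 1.0:  # Q lies on the segment: distance is |PQ|
        return dist(p, (b1[0] + t * dx, b1[1] + t * dy))
    return min(dist(p, b1), dist(p, b2))


def dd(s, t):
    """Directed distance from segment s to segment t, step (c)."""
    return max(d_ps(s[0], t), d_ps(s[1], t))


def _n_and_dmax(s, t):
    d_max = max(dist(a, b) for a in s for b in t)  # max over E, step (a)
    if d_max == 0:
        raise ValueError("both segments are the same single point")
    return d_max - (dd(s, t) + dd(t, s)), d_max


def overlap(s, t):
    """O(S, T) = (N v 0) / max E."""
    n, d_max = _n_and_dmax(s, t)
    return max(n, 0.0) / d_max


def non_overlap(s, t):
    """NO(S, T) = |N ^ 0| / max E."""
    n, d_max = _n_and_dmax(s, t)
    return abs(min(n, 0.0)) / d_max
```

For the collinear segments from (0, 0) to (2, 0) and from (1, 0) to (3, 0), the sketch gives an overlap of 1/3, which matches the ratio of the shared length (1) to the total extent (3).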
Other possible junctions are defined in terms of L-junctions which share a common segment and which satisfy the similarity and proximity criteria. In practice, lines rarely terminate exactly at the same point. Instead they terminate in a small common region. In constructing junctions, the grouping properties of L-junction proximity and L-junction angle constraint are used. L-junction proximity is defined in terms of the minimum endpoint distance between line segments, as depicted in Figure 5(a). Because the L-junction angle constraint is defined in terms of the inner angle, highly collinear segments will not be considered. The membership functions for L-junction proximity and the L-junction angle constraint are shown in Fig. 5(b) and (c) respectively.

Image Name                   Image Size   Initially Fitted Line Segments   Lines after FPG   CPU time (seconds approx.)
Desk on the left (Fig. 6)    512 x 432    654                              291               11
Desk on the right (Fig. 7)   512 x 432    360                              176               5

Figure 4: Examples of junctions (L-junction, arrow, fork tree, etc.).

Figure 5: L-junctions: (a) example of L-junction (endpoint distance and inner angle); (b) proximity membership function over Euclidean distance; (c) L-junction angle membership function (dashed membership functions are for proximity of collinear segments).

III. EXPERIMENTAL RESULTS

The approach presented in the previous sections is currently being implemented in the framework of a navigation support system using linguistic instructions (LINSS). In this system the navigating agent is assumed to possess (to various degrees) independent means of navigation (such as obstacle avoidance), the navigation task taking place in an unfamiliar environment. For test purposes we consider an office scene environment. In this section we present two experiments in this environment, and the inference results for straight line segments and L-junctions.
The system is currently implemented in FRIL [1] (a Prolog-like, support logic based system, useful for fast program development), C, and Khoros (an integrated software development environment for image processing and visualization [14]) on a Sun Sparc 10 workstation.

Straight line segment results: Figures 6(a-d) and 7(a-d) show the straight line inference results for two office scene images. Data concerning the two images and the performance of the algorithm are summarized in Table 1. The CPU times shown do not include the initial noise reduction and edgel extraction processing time.

Junction results: The output of the straight line segment inference is input to the L-junction inference operator. Short lines (less than 15 pixels long) were eliminated. Fig. 8(a, b) shows the results of inferring L-junctions from Figures 6(c) and 7(c) respectively. Table 2 shows the performance of the algorithm.

IV. CONCLUSIONS

Fuzzy sets/logic based concepts are a natural choice for representing perceptual grouping criteria as well as for recursively constructing new structures from grouped input structures. This is supported by the evidence of the initial results for straight line segments and L-junctions presented in this paper. In future work we will address the inference of higher order structures and the development of an efficient algorithm for applying a collection of grouping operators.

TABLE 1: Summary of experimental results for fuzzy perceptual grouping (straight lines)

Figure 6: Results for the image desk on the left (straight line segment inference): (a) original gray scale image; (b) image after noise reduction and edgel extraction; (c) results of FPG; (d) original image with overlaid FPG results.

Figure 7: Results of FPG for the image desk on the right (straight line segment inference): (a) original gray scale image; (b) image after noise reduction and edge extraction; (c) results of FPG; (d) original image with overlaid FPG results.