# «EXTRACTION OF CONTEXTUAL KNOWLEDGE AND AMBIGUITY HANDLING FOR ONTOLOGY IN VIRTUAL ENVIRONMENT A Dissertation by HYUN SOO LEE Submitted to the Office ...»

For this reason, an effective 3D reconstruction technique must have the following

**characteristics:**

Fast and full-scale 3D reconstruction method

The following section describes a new 3D reconstruction method and how the virtual ontology is generated.

## 5. 2 FUZZY DYNAMIC PROGRAMMING AND AMBIGUITY HANDLING

To solve for ambiguities, we apply fuzzy dynamic programming (FDP) to extract Z-depth information about image pairs which we then map to Metaearth architecture.FDP was originally proposed by Jadeh and Bellman and it has been used to solve optimization problems with multi-stages and ambiguities [65, 66].

The suggested approach is termed the “hybrid virtual ontology generation” method. Figure 49 shows the general procedure. Multi-view images are used as the inputs and rectified to eliminate image distortion. Next, the fuzzy color-based segmentation method (Section 4) is applied. The result is that each input image now has fuzzy color and segment. FDP is applied for handling color-related ambiguities and occlusion-related ambiguities, the more accurate epipolar constraints are generated and the disparity map is calculated. Finally, Z-depth is calculated and mapped into each region generated via the fuzzy color-based segmentation method.

** Figure 49. Procedure of Z-depth extraction using fuzzy dynamic programming.**

The suggested algorithm can be compared with Klaus et al.’s segment-based stereo matching method [67]. The main assumption of the segment-based algorithm is that the scene can be approximated by a set of non-overlapping planes. Segment-based

**methods generally perform with these consecutive algorithms [67] :**

1) Identify regions of homogeneous color using color segmentation methods

2) Determine disparities of reliable points using local window-based matching

3) Obtain disparity planes using the plane fitting algorithm

4) Generate an optimal disparity plane.

Even though Klaus et al.’s algorithm has been as a good disparity map generator experimentally, it is limited in terms of handling occlusion and is based on constant color information.

**The suggested algorithm consists of five steps:**

Step 1 : Image rectification Step 2 : Fuzzy color-based color segmentation method Step 3 : Calculation of epipolar constraint using FDP Step 4 : Calculation of Z-depth Step 5 : Mapping Z-depth information The following sections explain each step with examples.

## 5.2.1 IMAGE RECTIFICATION AND FUZZY COLOR-BASED SEGMENTATION

The objective of image rectification is to eliminate distortions from cameras’ lens and several projections such as affine transformation and projective transformation. In this dissertation, Fischler and Bolles [68] ’s RANdom Sample Consensus (RANSAC) algorithm is applied. RANSAC method is successful and robust method in image rectification [63]. The corresponding points are matched in image pairs and 2D homography is computed. Using this 2D homography, the input images are rectified via the Levenberg-Marquardt algorithm (LM method) [63], a nonlinear programming method. The detailed description of image rectification using RANSAC and LM method is described in Hartley and Zisserman [63]. Figure 50 shows the process.(a) Stereo vision using two cameras (b) Left image from stereovision system.

The image (Figure 50 (b)) from stereovision (Figure 50 (a)) may contain several distortions. Using RANSAC and LM method, each stereo image is rectified to get Figure 50 (c). As the result, the corresponding point of the targeted pixel on the epipolar line

**exists on the same vertical axis thus satisfying equations (8) and (9):**

In these equation, ( xl, yl ) and ( xr, yr ) represent the corresponding point pair in the left and right images. However, it is very difficult to satisfy equation (8), even with the application of the RANSAC estimator and LM algorithm. In real application,

**equation (8) is replaced by equation (10):**

The existence of d ' and d ' ' makes it difficult to extract exact Z-depth information. This issue is resolved in Section 5.2.2.

After rectifying the image, fuzzy color-based over-segmentation method is applied to each input image. Through this process, each image is segmented and each pixel of each image has fuzzy color. The fuzzy color is used as an input to FDP.

## 5.2.2 AMBIGUITY HANDLING USING FUZZY DYNAMIC PROGRAMMING

Using image rectification process and fuzzy color based over-segmentation method, corresponding points in left and right image are located in similar Y position.However, as d ' and d ' ' (in equation (9) and (10)) exist, accurate matching may not be possible. Most stereovision approaches use block matching methods for resolving this issue, but block matching methods with FDP is used. Figure 51 shows the block matching method using fuzzy colors.

A targeted pixel in the left image is compared with a region within a block in the right image. In this block matching, the compared measure is nothing but fuzzy color which is extracted using fuzzy color based over-segmentation. The usage of fuzzy color can contribute to mitigating color-based ambiguity. In addition, it makes it possible to use equation (7). With the assumption that fuzzy colors within small blocks are the same, equation (11) is derived.

This equation is used as one constraint of FDP. We modify it from Faugeras et al.’ dynamic programming for occlusion handling [64]. Figure 52 shows how dynamic programming can detect occluded and non-occluded regions.

** Figure 52. Faugeras et al.**

’s dynamic programming for detecting occluded regions [64].

While Faugeras et al.’s method has limitation in stereo image pairs with different color- based ambiguity, the suggested FDP overcomes the limitation by fuzzy colors. The objective function of FDP consists of the weighted sum of a fuzzy sum of absolute dissimilarity cost and a fuzzy gradient-based dissimilarity cost. Equation 12 defines the fuzzy sum of absolute dissimilarity cost.

This dissertation uses 5 as the initial block size ( N ), which means that a pixel in the left image can be corresponded to one pixel of 25 ( 5 5 ) pixels in the right image. The fuzzy gradient-based dissimilarity cost is defined in equation 13.

Equation (14) is the weighted cost in terms of a specific y value. A value of 0.8 is used as an initial . In terms of the Y axis, the total cost function of FDP is defined in Equation (15).

The decision variable is the difference between a target pixel’s x position in the left image and the X position in the right image (= d ). As a constraint of d, we consider the range between the initial stage’s minimum and maximum cost function values. Using the backtracking approach of dynamic programming, we obtain the minimizing difference. Finally, the accurate matching pair and occluded regions are detected. Since it generates a more accurate epipolar constraint, the exact Z depth information can be extracted. Next, we describe how Z depth information can be extracted and mapped to regions.

## 5.2.3 CALCULATION OF Z DEPTH AND MAPPING PROCEDURE

After detecting d using FDP, a more accurate epipolar constraint is calculated.Z-depth is calculated using these epipolar constraints and the affined fundamental matrix approach. In this dissertation, two camera matrices are assumed as affined camera matrices. Since two corresponding point pairs ( x and x ' ) satisfy equation (16), the fundamental matrix( F ) is calculated using the Least Mean Square (LMS) method [69].

In this dissertation, the matching point x is selected from segmented regions. From a segmented region, several x are selected randomly and X is calculated using the suggested method. Finally, the region’s Z depth is defined as the average Z value of the calculated X in the region. Thus, each region’s pixels have the same Z depth.

## 5.3 GENERATION OF VIRTUAL ONTOLOGY WITH Z DEPTH INFORMATION

Z-depth information is extracted from the proposed FDP method as described in Section 5.2. The usage of fuzzy dynamic programming is caused from the extraction of fuzzy color and fuzzy color based segmentation. Then, each segmented region has the same Z depth using the method shown in section 5.2.3.Finally, Metaearth architecture with Z depth information is generated. As described in Sections 4.2, 4.3 and 4.4, the generation sequence and procedure are the same. The difference is that a vertex in the virtual space layer has Z depth information as shown in Figure 29. This characteristic causes a great influence on object-merging process (Section 4.2) and semantic-merging process (Section 4.3).

In the object-merging process, the depth-related merging condition is added.

Z depth-related merging condition

This constraint makes it possible to distinguish objects with similar color and different distance. This constraint is added in the semantic merging condition for more clearer mapping. Figure 54 is an application of FDP.

As another example, Tsukuba stereo image pairs are tested for generating virtual ontology with Z-depth. Figure 55 shows the sequence of generating virtual ontology using the suggested approach.

As a result of the proposed approach, the generated virtual model has virtual ontology with Metaearth architecture. Unlike the method in Section 4, the generated virtual ontology has Z depth. In other words, this model can be considered as real 3D virtual ontology. The remaining task is to map contexts to the generated Metaearth architecture.

This section describes how to map contexts into Metaearth architecture. Using the method presented in Sections 4 and 5, we generate the architecture from a 2D scene or multi-view scene. Recall that there are four layers (virtual space; mapping; library;

ontology). We generate the virtual space layer from the fuzzy color-based segmentation process and the object-merging process. The mapping layer and the library layer are generated from the semantic-merging process. Finding that context mapping is manually conducted by operators or pre-trained detectors (whereby a poorly trained operator or detector may cause poor mapping) in similar approaches, the mapping methodology relies on graph isomorphism. Since Metaearth architecture is a type of graph as a format of virtual ontology, isomorphic matching properties can be applied with little error. The following sections explain graph isomorphic matching and the suggested algorithm.

Graph isomorphism used for matching the two graphs using the two mapping functions and . If there are two mapping functions satisfying equations (23) and (24) in two graphs such as G1 (V1, E1 ) and G2 (V2, E2 ), it is defined as G1 is isomorphic ( G1 G2 ) to G2 [70].

We observe that the graphs are isomorphic using equations (22) and (23). Yet we can state only that the two scenes are identical if each depth’s nodes are identical.

In general applications, most problems belong to the category of “graph-subgraph isomorphism” which is defined as the existence of only one equation – (22) or (23).

In Figure 57 below the two graphs have a graph ( G2 )-subgraph ( G1 ) isomorphic relationship.

** Figure 57. Graph-subgraph isomorphism between G2 and G1.**

This concept can be used in many graph-based pattern matching problems.

Figure 58 is one example of the graph-subgraph isomorphism test. The problem is proven as NP-Complete problem by Gray [71] and a near-optimal solution is provided by researchers such as Ullmann [72] and Cordella et al. [73]. The comparisons among existing subgraph isomorphic matching algorithms are tested and compared in Battiti and Mascia [74] and Foggia et al.[75].

** Figure 58. Graph-subgraph isomorphism in CAD feature recognition.**

Even though it is an NP-complete problem, Messmer and Bunke [76] prove that it can be solved in polynomial time. In this dissertation, the context mapping is achieved using this graph-subgraph isomorphism. The following section describes the context mapping in Metaearth architecture.

## 6.2 METAEARTH-SUB METAEARTH ISOMORPHISM AND CONTEXT MAPPING

According to the section above, a subgraph can be considered as a pattern and a graph is considered as a target graph. A subgraph ( G1 in Figure 57) is used in a pattern and a graph ( G2 ) as a target graph with patterns which are to be detected. Our objective is to assign contexts into the ontology cores in Metaearth architecture. We use context library for the mapping. Figure 59 shows the conceptual context mapping process with context library and Metaearth architecture.** Figure 59. Context mapping process with context library and Metaearth architecture.**