# EXTRACTION OF CONTEXTUAL KNOWLEDGE AND AMBIGUITY HANDLING FOR ONTOLOGY IN VIRTUAL ENVIRONMENT

A Dissertation by HYUN SOO LEE

Definition 5. Mathematical definition of virtual ontology

**Virtual ontology is defined as a graph:**

$$VO = (VC, C_1, C_2, MR, CR)$$

In Definition 5, $VC$ is the set constituting the virtual space layer in the Metaearth architecture. $C_1$ and $C_2$ are the components constituting the library layer and the ontology layer, respectively. Finally, each context in $C_2$ is mapped to $C_1$ through $CR$, and $C_1$ is linked to each element of $VC$ through $MR$. From this viewpoint, $MR$ can be considered a component of the mapping layer in the Metaearth architecture. Virtual ontology can be represented using this Metaearth architecture. The next sections explain how the Metaearth architecture can be generated automatically or semi-automatically from scenes.

## 4. CONSTRUCTION OF VIRTUAL ONTOLOGY USING ONE SCENE

This section describes how to generate the Metaearth architecture with virtual ontology from a 2D image. The process starts by over-segmenting a scene (the 2D image) using the proposed algorithm. As an additional advantage, the suggested approach also generates relationships between components that are not adjacent.

**Figure 16. The general procedure of existing scene understanding algorithms.**

Next, we group the segmented regions via the object merging process and the semantic merging process. Finally, the 2D scene is converted into the Metaearth architecture (described in Section 3.2). Figure 16 shows the general procedure of existing scene understanding techniques as an IDEF0 diagram, and Figure 17 illustrates the procedure broken down into steps and sub-steps.

**Figure 17. The general procedure for generating virtual ontology from a scene.**

The three sub-steps depicted in Figure 17 and detailed in Figure 18 are: 1) fuzzy color-based over-segmentation; 2) object merging; and 3) semantic merging.

1) Fuzzy color-based over-segmentation. The approach treats over-segmentation as a pre-processing step. Most existing methods, such as Chung and Fung [58], use a pre-defined fuzzy color for segmentation that depends on prior knowledge or training mechanisms, but the result and its quality are not stable because uniform color quantization or trained fuzzy colors are not extracted from the input itself.

**Figure 18. Detailed sub-processes of virtual ontology generation from a scene.**

2) Object merging. The segmentation process generates an initial graph structure using adjacency among segments. This graph forms the initial layout for generating the virtual space layer in the Metaearth architecture.

3) Semantic merging. Having constructed the virtual space layer, we can generate the core layer and the mapping layer via the semantic merging process.

The following sections explain the three sub-processes with suitable examples and implementations.

## 4.2 FUZZY COLOR-BASED OVER-SEGMENTATION AND GENERATION OF THE VIRTUAL SPACE LAYER

Over-segmentation is considered a preprocessing stage. For an image of 1024 × 768 pixels, the initial number of pixels is 786,432 (= 1024 × 768). Note that handling around 700,000 variables is a large-scale optimization problem. In addition, each pixel has many properties such as true color, hue, or luminance. In true color, a pixel has red (R), green (G), and blue (B) channels, and each R, G, B value takes one value from 0 to 255 ($2^8$ values). If a pixel has a true color, the size of the state space increases to $786{,}432 \times 2^{24}$. Obviously, this increase contributes to computational complexity; thus, in order to reduce the number of variables, we conduct random sampling to select a subset of pixels. Considering the input image's height and width, we extract 1/700 of the pixels. Figure 19 shows the randomly sampled pixels.

The true colors of the extracted pixels are sampled at a 1/700 scale of the original image size. Figure 20 shows the R, G, B plot of the extracted pixels.

**Figure 20. Red, Green and Blue color plotting from randomly extracted pixels.**
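The sampling step above can be sketched as follows. This is a minimal illustration assuming the image is held as a NumPy array; `sample_pixels` is a hypothetical helper name, since the dissertation does not give an implementation:

```python
import numpy as np

def sample_pixels(image, fraction=1 / 700, seed=0):
    """Randomly sample a fraction of an image's pixels as (R, G, B) rows."""
    h, w, _ = image.shape
    pixels = image.reshape(-1, 3)              # flatten to (h * w, 3)
    n = max(1, int(len(pixels) * fraction))    # number of pixels to keep
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(pixels), size=n, replace=False)
    return pixels[idx]

# A 1024 x 768 image yields 786,432 pixels; 1/700 of that is 1,123 samples.
image = np.random.randint(0, 256, size=(768, 1024, 3), dtype=np.uint8)
samples = sample_pixels(image)
print(samples.shape)  # (1123, 3)
```

Sampling without replacement keeps the color distribution of the input while shrinking the optimization problem by roughly three orders of magnitude.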

The objective of this process is to obtain a K-means fuzzy color quantization of the input image. The problem is determining the value of K. In most K-means algorithms or KNN approaches, K is assumed in advance or determined using learning methods.

Figure 21 shows the general K-means algorithm.
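The general K-means loop amounts to alternating an assignment step and a centroid-update step. The compact sketch below is an illustration, not the dissertation's exact implementation:

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain K-means: alternate nearest-center assignment and centroid update."""
    rng = np.random.default_rng(seed)
    # initialize centers as k distinct random data points
    centers = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # assignment step: each point goes to its nearest center
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # update step: each center becomes the mean of its members
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return centers, labels
```

Applied to the sampled R, G, B points with K = 12, this yields the fuzzy color quantization described above.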

To determine an effective K value, we modify Ray and Turi's method [59]. Their method uses an intra-cluster distance measure (within-cluster scatter) and an inter-cluster distance measure (between-cluster separation). The K value is determined by conditions such as minimizing the intra-cluster scatter while maximizing the inter-cluster separation.

Definition 6. Inner distance

In this manner, the optimal K is calculated using the nonlinear program in Table 5.

**Table 5. K-fuzzy color determination using nonlinear programming.**

The objective function of the nonlinear program is subject to four types of constraints. The first and second guarantee a locally minimizing ratio distance; the remaining constraints exclude trivial solutions in which the distance between two clusters becomes unusually large. Solving the nonlinear program allows us to calculate an optimal K value from the input image.

Figure 22 shows this process. Thus, we obtain the value 12 as the optimal K minimizing the ratio distance.

**Figure 22. Determination of K using nonlinear programming.**
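The ratio criterion behind this K selection can be sketched as follows. `ratio_distance` and `best_k` are illustrative names, and the plain intra/inter ratio below is a simplification of the modified Ray and Turi criterion solved by the nonlinear program in Table 5:

```python
import numpy as np

def ratio_distance(points, centers, labels):
    """Validity measure: intra-cluster scatter divided by inter-cluster separation."""
    intra = np.mean(np.sum((points - centers[labels]) ** 2, axis=1))
    k = len(centers)
    inter = min(np.sum((centers[i] - centers[j]) ** 2)
                for i in range(k) for j in range(i + 1, k))
    return intra / inter

def best_k(points, k_range, cluster_fn):
    """Pick the K in k_range whose clustering minimizes the ratio distance."""
    scores = {}
    for k in k_range:
        centers, labels = cluster_fn(points, k)
        scores[k] = ratio_distance(points, centers, labels)
    return min(scores, key=scores.get)
```

Here `cluster_fn` is any routine returning centers and labels for a given K (e.g., K-means); the K minimizing the ratio, 12 for the example image, is selected.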

Finally, the R, G, B color values from the randomly extracted pixels (Figure 20) are segmented into 12 regions (Figure 23).

**Figure 23. Segmented regions using nonlinear programming.**

After the segmented color spaces are acquired from the input source, fuzzy colors are extracted from each segmented region. Because this fuzzy color extraction is based on the input source, we call it "dynamic fuzzy color extraction". Other methods, such as Chung and Fung [58] and Konstantinidis et al. [60], use predefined fuzzy colors or trained parameters for extracting fuzzy colors; in this sense, existing methods can be considered "static fuzzy color generation". In this dissertation, a triangular fuzzy membership function [61] is used to define fuzzy color. Definition 9 represents the triangular-based fuzzy color.

Definition 9. Fuzzy color

2. Input image-based fuzzy color generation is superior to pre-defined fuzzy color generation.

The first advantage means that a specific pixel can belong to many segments (one-to-many mapping) with membership values $\mu_{i,j}$. This characteristic is useful for determining the ordering of the next-best segment for a pixel's true color.

Finally, the image is re-segmented using the generated fuzzy colors. Figure 25 shows that each red boundary-based segment is obtained by measuring the distance between a pixel's color and the generated fuzzy colors. We select the closest fuzzy color as the representative color of the pixel. If any of the 8 adjacent pixels around a pixel have the same fuzzy color, they are assigned to the same group.
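A triangular fuzzy color and the closest-color selection can be sketched as follows. The per-channel triangles, the fixed half-width, and the min aggregation across channels are assumptions for illustration; Definition 9's exact form is not reproduced here:

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular fuzzy membership: 0 at a and c, 1 at the peak b."""
    x = np.asarray(x, float)
    left = (x - a) / max(b - a, 1e-9)
    right = (c - x) / max(c - b, 1e-9)
    return np.clip(np.minimum(left, right), 0.0, 1.0)

def make_fuzzy_color(r, g, b, width=60):
    """Hypothetical helper: one triangle per channel centered on an RGB peak."""
    return [(v - width, v, v + width) for v in (r, g, b)]

def membership(pixel, fuzzy_color):
    """Overall membership of an RGB pixel in one fuzzy color (min over channels)."""
    return min(tri(pixel[ch], *fuzzy_color[ch]) for ch in range(3))

def closest_fuzzy_color(pixel, fuzzy_colors):
    """Index of the fuzzy color with the highest membership for this pixel."""
    return max(range(len(fuzzy_colors)),
               key=lambda i: membership(pixel, fuzzy_colors[i]))
```

Because a pixel can have nonzero membership in several fuzzy colors at once, this directly realizes the one-to-many mapping noted above.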

However, fuzzy color-based segmented regions exhibit several problems, one of which is over-abstraction. Even though fuzzy color can map one pixel to multiple colors, the small number of fuzzy colors (the K fuzzy colors) results in over-abstraction, i.e., comparatively distinct colors are grouped into one segment. Figure 26 shows an example of over-abstraction.

From the 557 initially segmented regions, we select the first and second regions.

The second region, as shown in Figure 26(b), represents part of the cloud. It can be considered a properly over-segmented region because it has only one context (cloud). The first region, with three contexts (sky, cloud, and building wall), exhibits "over-abstraction". Because over-abstraction caused by the small number of fuzzy colors requires re-segmenting some regions, we propose "cell division using the EM algorithm" to prevent it. The EM (Expectation-Maximization) algorithm is a statistical method that converges to a (local) maximum of the likelihood when estimating a parameter; it is based on the concept of replacing one difficult likelihood maximization with a sequence of easier maximizations whose limit is the answer to the original problem [62]. Table 6 delineates the procedure.

Step 1: In each fuzzy color segmented region, an R, G, B histogram is generated.

Step 2: Estimation of Gaussian mixture distributions. To each R, G, B histogram, a Gaussian mixture distribution is fitted.

Step 2.1: Estimation of a unimodal Gaussian distribution using direct fitting (regression).

Step 2.2: Estimation of a multimodal Gaussian mixture distribution using the EM algorithm.

Here, we assume that each color histogram from a specific segment can be explained using a Gaussian mixture distribution. Using the suggested algorithm, the over-abstracted region is divided into smaller regions. Figure 27 shows the result.

In this figure, one region is divided into four regions. The histograms of the red and blue channels are fitted by bimodal Gaussian mixture distributions, and the histogram of the green channel is fitted by a unimodal Gaussian distribution. Finally, the four inner regions ($4 = 2 \times 1 \times 2$) are segmented using cell division.
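The EM fit in Step 2.2 can be sketched for a single color channel as follows. This NumPy-only version works on raw sample values rather than histogram counts (an assumption), and `em_gmm_1d` is an illustrative name:

```python
import numpy as np

def em_gmm_1d(x, k=2, iters=100):
    """EM for a 1-D Gaussian mixture: E-step responsibilities, M-step re-estimation."""
    x = np.asarray(x, float)
    # deterministic init: spread the means over the data's quantiles
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))
    var = np.full(k, x.var() + 1e-6)
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each component for each sample
        dens = (pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var))
                / np.sqrt(2 * np.pi * var))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update weights, means, and variances from the responsibilities
        nk = resp.sum(axis=0)
        pi = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
    return pi, mu, var
```

Running this per R, G, B channel recovers the modes whose combinations (e.g., 2 × 1 × 2 above) define the divided cells.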

Another problem with the suggested fuzzy color-based segmentation is noise.

Figure 25 shows that our segmented image contains many small areas, or "noise regions". Even though each noise region has a specific color and context, such regions can safely be ignored in the interest of over-segmentation; we use 1,000 pixels as the predefined maximum size.

Figure 28 shows the maximum size of the noise region (the blue box).

This noise region is merged with other noise regions or with non-noise regions. As candidate cells to be merged with a noise region, the adjacent regions are listed, and the fuzzy color intensities in each of the R, G, B domains are compared. If their differences are less than a pre-defined small value, we merge the noise cell with the targeted candidate cell. In this process, the threshold values differ in each R, G, B color domain. Each threshold value is determined by the estimated parameters of the Gaussian mixture distributions as shown in Table 5. From the mean and variance estimators in each R, G, B domain, we calculate the range $\mu_{i,j} \pm \sigma_{i,j}$, where $i$ denotes the color domain ($1 = R$, $2 = G$, $3 = B$) and $j$ denotes the $j$-th estimator. The reason to use $1\sigma$ is that it is empirically sufficient for extracting the threshold values. From the plot of the $\mu_{i,j} \pm \sigma_{i,j}$ ranges, the minimum inter-distance is calculated and used as the threshold value. Figure 29 is a graphic illustration.

**Figure 29. Extraction of a threshold value for noise region handling in the red color domain.**

If the minimum value is 0 or negative while extracting the threshold $\tau_i$ ($i = 1, 2, 3$), the next positive value is chosen as $\tau_i$. After extracting $\tau_i$ ($i = 1, 2, 3$), the R, G, B intensity differences between the target noise region and its adjacent regions are measured. If these differences are all less than $\tau_i$, the two regions are merged; we merge the region with fewer pixels into the region with more pixels. Figure 30 shows the result after cell division and noise region handling.
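The threshold extraction and merge test can be sketched as follows. `channel_threshold` finds the smallest positive gap between the μ ± σ intervals of one color channel (considering consecutive intervals only, an assumption), and `should_merge` applies the per-channel thresholds; both names are illustrative:

```python
import numpy as np

def channel_threshold(means, sigmas):
    """Smallest positive gap between the mu +/- sigma intervals of one channel."""
    order = np.argsort(means)
    lo = np.asarray(means, float)[order] - np.asarray(sigmas, float)[order]
    hi = np.asarray(means, float)[order] + np.asarray(sigmas, float)[order]
    gaps = sorted(lo[i + 1] - hi[i] for i in range(len(order) - 1))
    for g in gaps:
        if g > 0:          # skip zero/negative gaps (overlapping intervals)
            return g
    return 0.0             # all intervals overlap; no positive gap exists

def should_merge(noise_rgb, cand_rgb, thresholds):
    """Merge when every channel difference is below that channel's threshold."""
    return all(abs(n - c) < t for n, c, t in zip(noise_rgb, cand_rgb, thresholds))
```

A noise region is then merged with the first adjacent candidate that passes `should_merge`, with the smaller region absorbed into the larger one.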

Compared with the initial segmentation (Figure 25), our finalized image of 78 regions is free of noise regions and most over-abstracted regions.

The final step consists of constructing a virtual space layer for Metaearth architecture.

Figure 31 shows the fuzzy color-based over-segmented region after mapping its fuzzy colors. In this example, our 1024 × 768 image is compressed into 78 regions.

Our objective is the construction of the Metaearth architecture containing virtual ontology. The following section describes the first two steps shown in Table 4: the identification of virtual components from the real world and the generation of the virtual space layer. For a general 2D image, we replace the first step with the fuzzy color-based segmentation, so that each region represents one virtual component. This proposition is supported by the inference that each virtual component has a unique color and the full virtual model can be assembled from these components. Thus, the over-segmented regions are converted into a graph structure using each region's bonding intensity ($BI_{i,j}$), defined as:

Definition 10. Bonding intensity ($BI_{i,j}$). The bonding intensity $BI_{i,j}$ is the number of shared pixels between the $i$-th region and the $j$-th region.

Proposition 1. The symmetry of the bonding intensity

The proof of Proposition 1 is trivial: the number of pixels shared between regions $i$ and $j$ does not depend on their order. In the adjacency matrix, the value in the $i$-th row and $j$-th column is $BI_{i,j}$ ($= BI_{j,i}$). After constructing the adjacency matrix, we construct the initial graph in which each vertex represents one region. The information stored in a vertex and an edge appears in Figure 32.
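Definition 10 can be implemented directly on a label image. The sketch below counts 4-adjacent pixel pairs that span two regions; whether the dissertation counts 4- or 8-adjacency is not specified, so 4-adjacency is an assumption:

```python
import numpy as np

def bonding_intensity(labels):
    """Symmetric matrix BI[i, j]: count of 4-adjacent pixel pairs spanning regions i, j."""
    n = int(labels.max()) + 1
    bi = np.zeros((n, n), dtype=int)
    h, w = labels.shape
    for y in range(h):
        for x in range(w):
            a = labels[y, x]
            for dy, dx in ((0, 1), (1, 0)):        # right and down neighbours
                ny, nx = y + dy, x + dx
                if ny < h and nx < w and labels[ny, nx] != a:
                    b = labels[ny, nx]
                    bi[a, b] += 1
                    bi[b, a] += 1                  # symmetry (Proposition 1)
    return bi
```

The nonzero entries of this matrix give the edges of the initial graph, with one vertex per region.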

In this figure, the Z-depth and texture information may be omitted depending on the input sources. For example, the initial graph generated using Section 5's methods includes Z-depth information.

Figure 33 shows the initial graph generated from the fuzzy color-based over-segmented regions.

**Figure 33. Conversion from fuzzy color-based segments into a graph.**

This generated graph becomes the virtual space layer in Metaearth architecture.