«Ear Recognition Biometric Identification using 2- and 3-Dimensional Images of Human Ears Anika Pflug Thesis submitted to Gjøvik University College ...»
The ear can easily be captured from a distance, even if the subject is not fully cooperative. This makes ear recognition particularly interesting for smart surveillance tasks and for forensic image analysis. Nowadays the observation of ear characteristics is a standard technique in forensic investigation and has been used as evidence in hundreds of cases.
The strength of this evidence has, however, also been called into question by courts in the Netherlands. In order to study the strength of ear prints as evidence, the Forensic Ear Identification Project (FearID) was initiated by nine institutes from Italy, the UK, and the Netherlands in 2006. In their test system, they measured an equal error rate (EER) of 4% and came to the conclusion that ear prints can be used as evidence in a semi-automated system.
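For readers unfamiliar with the measure: the equal error rate is the operating point at which the false match rate and the false non-match rate coincide. The following is a minimal sketch of how an EER could be estimated from comparison scores. The function name, the score convention (higher means more similar) and the toy scores are assumptions for illustration and are not taken from the FearID evaluation.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from similarity scores (higher = more similar).

    genuine:  scores of comparisons between samples of the same ear
    impostor: scores of comparisons between samples of different ears
    """
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(genuine) | set(impostor))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False non-match rate: genuine comparisons rejected at threshold t.
        fnmr = sum(s < t for s in genuine) / len(genuine)
        # False match rate: impostor comparisons accepted at threshold t.
        fmr = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(fnmr - fmr)
        if gap < best_gap:
            # Report the mean of both rates where they are closest.
            best_gap, eer = gap, (fnmr + fmr) / 2
    return eer


# Toy scores (hypothetical, for illustration only).
toy_genuine = [0.9, 0.8, 0.7, 0.3]
toy_impostor = [0.1, 0.2, 0.4, 0.75]
toy_eer = equal_error_rate(toy_genuine, toy_impostor)
```

With only a handful of scores the two error rates rarely meet exactly, which is why the sketch reports the mean of both rates at the threshold where they are closest.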
The German criminal police use the physical properties of the ear in connection with other appearance-based properties to collect evidence for the identity of suspects from surveillance camera images. Figure 3.1 illustrates the most important elements and landmarks of the outer ear, which are used by the German BKA for manual identification of suspects.
In this work we extend existing surveys on ear biometrics. Abaza et al. contributed an excellent survey on ear recognition in March 2010. Their work covers the history of ear biometrics, a selection of available databases and a review of 2D and 3D ear recognition systems. This work amends the survey by Abaza et al. with the following:
• A survey of free and publicly available databases.
• More than 30 publications on ear detection and recognition from 2010 to 2012 that were not discussed in any of the previous surveys.
• An outlook on future challenges for ear recognition systems with respect to concrete applications.
In the next section we give an overview of image databases suitable for studying ear detection and recognition approaches for 2D and 3D images. Thereafter, we discuss existing ear detection approaches for 2D and 3D images. In Section 3.4 we give an overview of ear recognition approaches for 2D images, and in Section 3.5 we do the same for 3D images. We conclude by providing an outlook on future challenges and applications for ear recognition systems.
3. EAR BIOMETRICS: A SURVEY OF DETECTION, FEATURE EXTRACTION AND RECOGNITION METHODS

3.2 Available Databases for Ear Detection and Recognition
In order to test and compare the detection or recognition performance of a computer vision system in general, and of a biometric system in particular, image databases of sufficient size must be publicly available. In this section, we give an overview of suitable databases for evaluating the performance of ear detection and recognition systems, which can either be downloaded freely or licensed with reasonable effort.
3.2.1 USTB Databases
The University of Science and Technology in Beijing (USTB) offers four collections¹ ² of 2D ear and face profile images to the research community. All USTB databases are available under license.
• Database I: The dataset contains 180 images in total, which were taken from 60 subjects in 3 sessions between July and August 2002. The database only contains images of the right ear from each subject. During each session, the images were taken under different lighting conditions and with a different rotation. The subjects were students and teachers from USTB.
• Database II: Similarly to database I, this collection contains right ear images from students and teachers from USTB. This time, the number of subjects is 77 and there were 4 different sessions between November 2003 and January 2004. Hence the database contains 308 images in total, which were taken under different lighting conditions.
• Database III: In this dataset, 79 students and teachers from USTB were photographed in different poses between November 2004 and December 2004. Some of the ears are occluded by hair. Each subject rotated his or her head from 0 degrees to 60 degrees to the right and from 0 degrees to 45 degrees to the left. This was repeated on two different days for each subject, which resulted in 1600 images in total.
• Database IV: Consisting of 25500 images from 500 subjects taken between June 2007 and December 2008, this is the largest dataset at USTB. The capturing system consists of 17 cameras and is capable of taking 17 pictures of the subject simultaneously. These cameras are distributed in a circle around the subject, who is placed in the center. The interval between the cameras is 15 degrees. Each volunteer was asked to look upwards, downwards and at eye level, which means that this database contains images at different yaw and pitch poses. Please note that this database only contains one session for each subject.
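The image counts quoted for the USTB databases can be cross-checked against the stated numbers of subjects, sessions, cameras and head poses. The sketch below is only a consistency check on the figures given above. It assumes one image per subject and session for Databases I and II, and one image per camera and head pose for Database IV; the variable and function names are illustrative.

```python
# Figures quoted in the text: subjects, images per subject, reported total.
# Database III is left out because the text gives no per-subject image count.
USTB = {
    "I": {"subjects": 60, "per_subject": 3, "reported": 180},    # 3 sessions
    "II": {"subjects": 77, "per_subject": 4, "reported": 308},   # 4 sessions
    # 17 cameras x 3 head poses (upwards, downwards, eye level)
    "IV": {"subjects": 500, "per_subject": 17 * 3, "reported": 25500},
}


def implied_total(entry):
    """Total image count implied by subjects x images per subject."""
    return entry["subjects"] * entry["per_subject"]


# True where the implied total equals the total reported in the text.
consistent = {name: implied_total(e) == e["reported"] for name, e in USTB.items()}

# Assumption: with 15 degrees between neighbouring cameras, 17 cameras have
# 16 gaps between them and therefore cover an arc of 16 * 15 degrees.
camera_arc_degrees = (17 - 1) * 15

print(consistent)
print(camera_arc_degrees)
```

Under these assumptions the reported totals for Databases I, II and IV agree with the per-subject counts.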
3.2.2 UND Databases
The University of Notre Dame (UND) offers a large variety of image databases that can be used for biometric performance evaluation. Among them are five databases containing 2D images and depth images, which are suitable for evaluating ear recognition systems. All databases from UND can be made available under license³.
• Collection E: 464 right profile images from 114 human subjects, captured in 2002. For each subject, between 3 and 9 images were taken on different days and under varying pose and lighting conditions.
• Collection F: 942 3D (depth images) and corresponding 2D profile images from 302 human subjects, captured in 2003 and 2004.
• Collection G: 738 3D (depth images) and corresponding 2D profile images from 235 human subjects, captured between 2003 and 2005.
• Collection J2: 1800 3D (depth images) and corresponding 2D profile images from 415 human subjects, captured between 2003 and 2005.
• Collection NDOff-2007: 7398 3D and corresponding 2D images of the faces of 396 human subjects. The database contains different yaw and pitch poses, which are encoded in the file names.

Figure 3.2: Example images from the WPUT ear database. The database contains ear photographs of varying quality, taken under different lighting conditions. Furthermore, the database contains images in which the ear is occluded by hair or by earrings.

¹ http://www1.ustb.edu.cn/resb/en/doc/Imagedb 123 intro en.pdf
² http://www1.ustb.edu.cn/resb/en/doc/Imagedb 4 intro en.pdf
³ http://cse.nd.edu/ cvrl/CVRL/Data Sets.html
3.2.3 WPUT-DB
The West Pomeranian University of Technology has collected an ear database with the goal of providing more representative data than comparable collections⁴. The database contains 2071 images in total from 501 subjects of all ages. For each subject, the database contains between 4 and 8 images, which were taken on different days and under different lighting conditions. Some subjects wear headdresses, earrings or hearing aids, and some ears are additionally occluded by hair. The presence of each of these disruptive factors is encoded in the file names of the images. Figure 3.2 shows some example images from the database. The database can be freely downloaded from the given URL.
3.2.4 IIT Delhi
The IIT Delhi Database is provided by the Hong Kong Polytechnic University⁵. It contains ear images that were collected between October 2006 and June 2007 at the Indian Institute of Technology Delhi in New Delhi (see Figure 3.3). The database contains 121 subjects, and at least 3 images were taken per subject in an indoor environment. In total, the database consists of 421 images.

Figure 3.4: SCface example images. These images show examples of the photographed pictures, not of the pictures collected with the surveillance camera system.

⁴ http://ksm.wi.zut.edu.pl/wputedb/
⁵ http://www4.comp.polyu.edu.hk/˜ sajaykr/IITD/Database c Ear.htm
3.2.5 IIT Kanpur
The IITK database was contributed by the Indian Institute of Technology in Kanpur⁶. This database consists of two subsets.
• Subset I: This dataset contains 801 side face images collected from 190 subjects. The number of images acquired per individual varies from 2 to 10.
• Subset II: The images in this subset were taken from 89 individuals. For each subject 9 images were taken with three different poses. Each pose was captured at three different scales. Most likely, all images were taken on the same day. It is not stated whether subset II contains the same subjects as subset I.
3.2.6 SCface
The SCface database is provided by the Technical University of Zagreb⁷ and contains 4160 images from 130 subjects. The aim is to provide a database that is suitable for testing algorithms under surveillance scenarios. Unfortunately, all surveillance camera images were taken at a frontal angle, so the ears are not visible in these images. However, the database also contains a set of high-resolution photographs of each subject, which show the subject in different poses. These poses include views of the right and left profile, as shown in Figure 3.4. Even though the surveillance camera images are likely to be unsuitable for ear recognition studies, the high-resolution photographs could be used for examining an algorithm's robustness to pose variations.
3.2.7 Sheffield Face Database
This database was formerly known as the UMIST⁸ database and consists of 564 images of 20 subjects of mixed race and gender. Each subject is photographed in a range of different yaw poses, including a frontal view and profile views.

Figure 3.5: Some example images from the NCKU face database, showing the same subject at different angles.

⁶ http://www.cse.iitk.ac.in/users/biometrics/
⁷ http://www.scface.org/
⁸ http://www.sheffield.ac.uk/eee/research/iel/research/face
3.2.8 YSU
Youngstown State University collected a new kind of biometric database for evaluating forensic identification systems. For each of the 259 subjects, 10 images are provided. The images are grabbed from a video stream and show the subject in poses between zero and 90 degrees. This means that the database contains right profile images and a frontal view image for each subject. It also contains hand-drawn sketches of 50 randomly selected subjects from a frontal angle. However, this part of the database is not of interest for ear recognition systems.
3.2.9 NCKU
The National Cheng Kung University in Taiwan has collected an image database that consists of 37 images for each of the 90 subjects. It can be downloaded from the university's website⁹. Each subject is photographed at different angles between -90 degrees (left profile) and 90 degrees (right profile) in 5-degree steps. Figure 3.5 displays some examples. Such a series of images was collected on two different days for each of the subjects. All images were taken under the same lighting conditions and with the same distance between the subject and the camera.
As this data was originally collected for face recognition, some of the ears are partly or fully occluded by hair, which makes this data challenging for ear detection approaches. Consequently, only a subset of this database is suitable for ear recognition.
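The figure of 37 images per series follows directly from the stated pose range and step size. The short sketch below reproduces it. The overall total is not quoted in the text; it is derived here under the assumption that every subject has a complete series on both days.

```python
# Pose angles as described: -90 (left profile) to 90 (right profile),
# sampled in 5-degree steps, both end points included.
angles = list(range(-90, 91, 5))
images_per_series = len(angles)

SUBJECTS = 90
SESSIONS = 2  # one series on each of two different days

# Assumption: every subject has a complete series in both sessions.
images_per_subject = images_per_series * SESSIONS
implied_total = SUBJECTS * images_per_subject

print(images_per_series, images_per_subject, implied_total)
```

Only the poses near -90 and 90 degrees show an ear in profile, which is one more reason why just a subset of the images is relevant for ear recognition.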
3.2.10 UBEAR dataset
The UBEAR dataset contains images of the left and the right ear of 126 subjects. The images were taken under varying lighting conditions, and the subjects were not asked to remove hair, jewelry or headdresses before the pictures were taken. The images are cropped from a video stream, which shows the subject in different poses, such as looking towards the camera, upwards or downwards.
Additionally, the ground truth for the ear's position is provided together with the database, which makes it particularly convenient for researchers to study the accuracy of ear detection and to study ear recognition performance independently of any ear detection.
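One common way to use such ground truth is to compare each detected ear region with the annotated one by their intersection over union (IoU). The sketch below illustrates the idea. The format of the UBEAR annotations is not described here, so the axis-aligned bounding boxes, the function names and the acceptance threshold of 0.5 are assumptions for illustration only.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the overlap; zero if the boxes do not intersect.
    inter_w = max(0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def detection_rate(detections, ground_truth, threshold=0.5):
    """Fraction of images whose detection overlaps the annotation enough.

    detections and ground_truth are lists of boxes, one pair per image.
    The default threshold of 0.5 is an assumed convention.
    """
    hits = sum(iou(d, g) >= threshold for d, g in zip(detections, ground_truth))
    return hits / len(ground_truth)


# Hypothetical boxes for two images: one exact hit, one complete miss.
example_rate = detection_rate(
    [(0, 0, 10, 10), (0, 0, 10, 10)],
    [(0, 0, 10, 10), (20, 20, 30, 30)],
)
```

Conversely, cropping the images with the annotated regions instead of detected ones allows recognition performance to be measured without any influence from detection errors.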
Table 3.1: Summary of automatic ear detection methods for 2D and 3D images

Figure 3.6: Examples for different ear detection techniques. Panel caption: original image, edge-enhanced image and corresponding edge orientation model.
3.3 Ear Detection
This section summarizes the state of the art in automatic ear detection in 2D and 3D images. Essentially all ear detection approaches rely on common properties of the ear's morphology, such as the occurrence of certain characteristic edges or frequency patterns. Table 3.1 gives a short overview of the ear detection methods outlined below. The upper part of the table contains algorithms for 3D ear localization, whereas the lower part lists algorithms designed for ear detection in 2D images.
⁹ http://robotics.csie.ncku.edu.tw/Databases/FaceDetect PoseEstimate.htm

Chen and Bhanu propose three different approaches for ear detection. In the first of these, they train a classifier that recognizes a specific distribution of shape indices characteristic of the ear's surface. However, this approach only works on profile images and is sensitive to any kind of rotation, scale and pose variation. In their later ear detection approaches, Chen and Bhanu detected image regions with a large local curvature using a technique they called step edge magnitude. Then a template, which contains the typical shape of the outer helix and the anti-helix, is fitted to clusters of lines. In a further approach, Chen and Bhanu narrowed the number of possible ear candidates by detecting the skin region first, before the helix template matching is applied on the curvature