«BY GEORGIOS DIAMANTOPOULOS A THESIS SUBMITTED TO THE UNIVERSITY OF BIRMINGHAM FOR THE DEGREE OF DOCTOR OF PHILOSOPHY DEPARTMENT OF ELECTRONIC, ...»
A total of four subjects (all male, Caucasian, 25-35 years old) took part in this experiment. The subjects stood in front of the screen at a distance of 50cm and a chinrest was used to fix their head such that head-tracking could be turned off for EyeLink (head-tracking in EyeLink depends on four markers that need to be placed at the corners of the screen, whose bounds must be within the specified tracking range of ±30°/±20°). Additionally, the chinrest served to eliminate errors from slight head-movement due to swaying of the subjects while standing, for both eye-trackers.
The full screen was divided in increments of 100 pixels resulting in a maximum of points; the centre of the grid was manually adjusted according to the subject’s height such that he could focus on it by gazing straight ahead. Before the experiment, each subject tested his maximum field of view by gazing at the full grid; rows or columns of points that were beyond his maximum field of view were removed from the grid, resulting in a total of 195/195/195/165 points (REACT) and 150/150/150/120 points (EyeLink-II) for each subject respectively. Fewer points were used for EyeLink-II as the eye-tracker’s visor blocked the top view beyond approximately 30°. During the trial the points were displayed in random order to avoid anticipatory eye-movements from the subject.
[8] This was found to be true experimentally. Outside of this range calibration failed too often, making the test very cumbersome and tiring for both the experimenter and subjects. Of course, unless a successful calibration is performed, no eye-tracker would be able to operate.
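The relationship between grid positions and visual angle at the 50cm viewing distance can be sketched as follows. This is a minimal illustration only: the pixel pitch (`PIXEL_PITCH_CM`), the grid dimensions passed to `build_grid` and all function names are assumptions, since the physical size of the screen is not stated here. Each axis is also converted independently, which is a simplification.

```python
import math
import random

VIEWING_DISTANCE_CM = 50.0   # from the text: subjects stood 50cm from the screen
GRID_STEP_PX = 100           # from the text: grid increments of 100 pixels
PIXEL_PITCH_CM = 0.1         # ASSUMPTION: hypothetical physical size of one pixel


def offset_to_angle(offset_px, pitch_cm=PIXEL_PITCH_CM,
                    distance_cm=VIEWING_DISTANCE_CM):
    """Visual angle (degrees) of a point offset_px pixels from the grid
    centre along one axis, for a viewer looking straight at the centre."""
    return math.degrees(math.atan2(offset_px * pitch_cm, distance_cm))


def build_grid(cols_each_side, rows_each_side, step_px=GRID_STEP_PX):
    """Grid of (x, y) pixel offsets centred on the straight-ahead gaze point."""
    return [(c * step_px, r * step_px)
            for r in range(-rows_each_side, rows_each_side + 1)
            for c in range(-cols_each_side, cols_each_side + 1)]


def presentation_order(points, seed=None):
    """Shuffled copy of the points, so the next target cannot be anticipated."""
    order = list(points)
    random.Random(seed).shuffle(order)
    return order
```

For example, `build_grid(7, 6)` yields 15 × 13 = 195 points, which happens to match the REACT point count for three of the subjects, although the actual grid dimensions are not given in the text.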
FIGURE 34: PHYSICAL AND VIRTUAL SCREEN FOR THE COMPARISON WITH EYELINK.
Calibration for the EyeLink involved displaying nine points on the virtual screen as described earlier. The REACT eye-tracker was calibrated using two different sets of nine points: a) the same set as the EyeLink and b) nine points that follow the same pattern but are spread across the complete field of view of the subject. Two calibration modes were used in order to perform a fair comparison to EyeLink as well as demonstrate the capability of REACT to operate and be calibrated beyond the range of the EyeLink-II.
Data was recorded for both EyeLink modes (pupil-only and PCR mode) and for each of the two different calibration modes of REACT thus resulting in four trials per subject. During calibration of the EyeLink, it was ensured that it performs optimally by adjusting the headband and position of the cameras as necessary such that a good calibration result (less than 1° average validation error) was obtained on each trial and for all subjects. Subjects were given ten minutes rest
Statistics for the error, in degrees of visual angle, of each mode across all subjects are shown in Table 27. The statistics were calculated for five different groups of ranges:
a) The closest approximation (±23°/±23°) of the conservative ±20°/±18° tracking range as per the EyeLink specification.
b) The closest approximation (±32.5°/±23°) of the pupil-only tracking range (±30°/±20°) as per the EyeLink specification.
c) Beyond ±23°/±23°; that is, the complete range minus the ±23°/±23° range.
d) Beyond ±32.5°/±23°; that is, the complete range minus the ±32.5°/±23° range.
e) The complete range.
The statistics were calculated in the above groupings such that the accuracy of the eye-trackers within the ranges in question can be analysed separately. The ranges were not exactly matched to the specification for two reasons: a) to have an evenly spread grid while minimising the number of points and consequently the fatigue of the subject and b) because the primary focus of this experiment is to test the maximum tracking range and associated accuracy of the eye-trackers over the complete range.
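The grouping described above can be sketched as follows. This is an illustrative sketch rather than the code used to produce Table 27: the function names and the sample format are hypothetical, and the use of the population standard deviation is an assumption, as the text does not say which estimator was used.

```python
import math

# Range limits (horizontal, vertical) in degrees, from groups a) and b) above.
CONSERVATIVE = (23.0, 23.0)
PUPIL_ONLY = (32.5, 23.0)


def within(point, limits):
    """True if the point's (h, v) angles fall inside +/-limits on both axes."""
    h, v = point
    return abs(h) <= limits[0] and abs(v) <= limits[1]


def mean_std(errors):
    """Mean and population standard deviation; (nan, nan) for an empty group.
    ASSUMPTION: population (not sample) standard deviation."""
    if not errors:
        return (math.nan, math.nan)
    mean = sum(errors) / len(errors)
    var = sum((e - mean) ** 2 for e in errors) / len(errors)
    return (mean, math.sqrt(var))


def group_statistics(samples):
    """samples: list of ((h_deg, v_deg), error_deg) pairs (hypothetical format).
    Returns {group: (mean, std)} for groups a) to e)."""
    groups = {
        "a": [e for p, e in samples if within(p, CONSERVATIVE)],
        "b": [e for p, e in samples if within(p, PUPIL_ONLY)],
        "c": [e for p, e in samples if not within(p, CONSERVATIVE)],
        "d": [e for p, e in samples if not within(p, PUPIL_ONLY)],
        "e": [e for _, e in samples],
    }
    return {name: mean_std(errs) for name, errs in groups.items()}
```

Because each group is summarised by a plain mean, a single point with a very high error (for instance where tracking was lost) raises the group average noticeably when the group contains only a few dozen points.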
As expected, the EyeLink achieves a low error within the two constrained tracking ranges (2.44±3.56°, 3.30±5.53° for pupil-only mode and 4.09±7.63°, 5.23±8.33° for PCR mode), though higher than in the specification. This is not surprising given that within these ranges only a small number of points were tested (25 and 35 for each range respectively) and some of these points were slightly beyond the specified maximum tracking range, thus resulting in loss of tracking and high errors which significantly raise the average. The errors within the constrained field of view would be below 1° if these outliers[9] were removed. Since the REACT tracker does not suffer from such losses of tracking, it displays significantly lower error than the EyeLink for the limited range calibration (1.11±0.73°, 1.57±1.14°). The same cannot be said for the full range calibration, where
[9] There is no reliable way to detect loss of tracking in the EyeLink. When tracking is lost, the eye-tracker will either output incorrect values or, if the pupil is completely lost, the loss will be interpreted as a blink.
The results for the two “difference” groups that examine eye-movements beyond the specified tracking ranges are quite different, with the lowest error achieved by the REACT tracker in full range calibration mode, followed by the EyeLink in pupil-only mode, then the REACT tracker in limited range calibration mode and finally the EyeLink in PCR mode. These results are consistent with what would be expected from the experimental setup.
The REACT tracker rarely suffers from loss of tracking, as shown in the previous evaluation sections, and thus its accuracy depends mostly on the accuracy of the pupil detection (reviewed previously) and the gaze mapping. In these groups, the gaze mapping algorithm performs best when calibrated with points that span the full range of targeted eye-movements and thus the REACT tracker in full range calibration mode comes first. The pupil-only mode of EyeLink is able to track over a larger range than the PCR mode and consequently suffers from loss of tracking less often (the pupil is easier to detect than the glint in large angle eye-movements), coming second. For the same reasons, the limited range calibration mode of the REACT tracker comes third while the EyeLink PCR mode comes last; for these groups, the glint will often fall onto the sclera, thus resulting in loss of tracking for the EyeLink.
Having said all that, looking at the distribution of the error across the test grid is much more informative than the above statistics which are easily influenced by high errors within each group. The distributions for both eye-trackers and all four trial modes are shown in Figure 35 and Figure 36 respectively.
The EyeLink shows consistently fair performance within the middle of the grid (approximately ±32.5° horizontally and 23°/52° above and below the centre, vertically) in pupil-only mode. The highest errors are found outside this range, in all directions. In PCR mode, the map is much more inconsistent which is also not surprising given that it’s very easy for the glint to disappear in the sclera as mentioned several times before. Consistently low results can only be observed in a small central range of approximately ±23° in both directions.
As explained before, the REACT tracker suffers from loss of tracking a lot less and thus the map in both modes is much smoother. In limited range calibration mode, the largest errors are observed at the left, top and right edges of the map while in full range calibration mode, the errors on the edges of the map are small to medium and the highest errors are observed slightly more centrally than in the limited range calibration mode. Once again, this is because the gaze mapping will be best near the calibration points.
Cases where the hardware obscured the view of the subject can be clearly seen on the maps. For example, in the pupil-only EyeLink map, two symmetrically placed points with errors that are fairly high relative to their neighbours are observed near the centre, most likely where the camera mounts or the cameras themselves were in the field of view of the users. Similarly, one point with high error relative to its neighbours is observed near the bottom and slightly to the right side of both REACT maps.
One other artefact that needs to be explained is that the error progressively reaches high values towards the top left corner of both REACT maps. This increase is caused by the orientation of the camera in reference to the tracked eye. It is evident that the camera is pointed upwards and placed off-centre, to one side, thus distorting the uniformity of the feature points in the image.
The resulting distortion is illustrated in Figure 37, where the pupil points that correspond to the calibration points for two subjects are shown. Of course, the camera can never be perfectly placed without a rigid setup and this problem exists with all other head-mounted eye-trackers including EyeLink (though because of the sophistication of the headband and the camera mounts, it is much easier to adjust the camera). Similarly, higher errors are observed on the left side of the maps than on the right side for EyeLink, though the pattern seems less severe than with REACT. This is due to two reasons:
a) EyeLink appears to use a mapping function that uses a quadratic equation, versus a linear function in the case of REACT. Hence, with a non-linear approach it is better able to cope with the non-linearity in question.
b) EyeLink, being a mature product that has been in development for at least 15 years, appears to have developed some additional correction mechanisms beyond the non-linear mapping function.
Both points above have been deduced from the debugging information that is written to the EyeLink output file.
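The difference between a linear and a quadratic mapping function can be illustrated with a generic least-squares fit over the calibration points. The sketch below is not the mapping used by either eye-tracker: the exact form of the EyeLink function is only inferred from its debugging output, and the feature sets, function names and the normal-equations solver are all assumptions chosen for illustration.

```python
def linear_features(x, y):
    """First-order terms of the pupil position (x, y)."""
    return [1.0, x, y]


def quadratic_features(x, y):
    """Second-order polynomial terms of the pupil position (x, y)."""
    return [1.0, x, y, x * y, x * x, y * y]


def solve(a, b):
    """Solve a * w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - tail) / m[r][r]
    return w


def fit(features, pupil_points, targets):
    """Least-squares weights mapping pupil (x, y) to one screen coordinate,
    obtained from the normal equations (A^T A) w = A^T t."""
    rows = [features(x, y) for x, y in pupil_points]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(ata, atb)


def predict(features, weights, x, y):
    """Screen coordinate predicted for pupil position (x, y)."""
    return sum(w * f for w, f in zip(weights, features(x, y)))
```

With nine calibration points, the six-term quadratic model is still over-determined, so it can absorb curvature in the pupil positions that a three-term linear model leaves as residual error. One mapping is fitted per screen axis by calling `fit` twice with the horizontal and vertical targets.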
In conclusion, the results of this experiment appear to support the specified tracking range of the EyeLink (±30°/±20° pupil-only, ±20°/±18° PCR mode). Even though the statistics calculated from the measurements taken in this experiment do not aid in precisely quantifying the range and accuracy within this field of view because of the high errors observed where there has been loss of tracking near the borders of this range, looking directly at the measurements for each subject reveals that when there was successful tracking the error is, in most cases, below 1°. This is not surprising as the EyeLink is a well-respected[10] eye-tracker within academia and it would not have gained its reputation if it were not for its high accuracy and high sampling rate (250 or 500Hz for EyeLink-II, other EyeLink models go up to 2000Hz).
However, this experiment was primarily focused on exploring what happens beyond the tracking range that EyeLink is known to operate well within. It was expected that, especially in PCR mode, loss of tracking would occur for visual angles beyond the specified tracking range because the corneal reflection would fall on the sclera, thus preventing EyeLink from being used to successfully track eye-movements beyond ±30°/±20°. Indeed, it was found that, outside of this range, the EyeLink performs poorly in terms of accuracy and also performs inconsistently, especially in the horizontal direction and especially in PCR mode. In contrast, REACT performs satisfactorily (average error 5.74° when it is calibrated with points that span the full range) up to approximately ±56°/±52°, which is the full range of movements that were possible for the four subjects that participated in this experiment.
Finally, the quadratic gaze mapping function of EyeLink was found to be superior to the linear function used by REACT, which in fact causes high errors concentrated on the top left corner of the grid (though this will depend on the exact placement of the camera). This is not a concern for the thesis presented here, as gaze mapping is not required functionality of the REACT eye-tracker for the target application; it was implemented only so that this comparison could be made.
[10] SR Research’s website (http://www.sr-research.com/publications.html) mentions that EyeLink has been cited in over 1400 peer-reviewed publications.
TABLE 27: ERROR STATISTICS FOR BOTH EYE-TRACKERS, IN DEGREES. DIFFERENT SETS OF STATISTICS ARE CALCULATED FOR THE FULL TRACKING RANGE OF EYELINK ACCORDING TO ITS SPECIFICATION (±30°/±20°), THE CONSERVATIVE TRACKING RANGE FOR PCR MODE (±20°/±18°), OUTSIDE THE LATTER TWO RANGES AND OVER THE COMPLETE RANGE.
FIGURE 35: EYELINK-II ERROR (IN DEGREES, AVERAGE ACROSS ALL SUBJECTS) HEAT MAP FOR THE COMPLETE SET OF GRID POINTS BOTH FOR PUPIL-ONLY MODE (TOP) AND PUPIL AND CORNEAL-REFLECTION MODE (BOTTOM).
FIGURE 36: REACT EYE-TRACKER ERROR (IN DEGREES, AVERAGE ACROSS ALL SUBJECTS) HEAT MAP FOR THE COMPLETE SET OF GRID POINTS FOR BOTH CALIBRATION MODES; SAME NINE POINTS USED TO CALIBRATE EYELINK (TOP) AND NINE POINTS COVERING THE FULL FIELD-OF-VIEW (BOTTOM).
FIGURE 37: PLOTS OF THE PUPIL POSITIONS THAT CORRESPOND TO THE CALIBRATION POINTS DISPLAYED ON THE SCREEN FOR TWO SUBJECTS (TOP, BOTTOM).
In the eye-tracking literature, usability is rarely assessed and the eye-tracker which forms the basis for the hardware design (Babcock and Pelz, 2004) of the REACT eye-tracker is no exception. This may be because the comfort of the user is not as important for applications where the subject does not interact with another human being as is the case for applications targeted by the REACT eye-tracker. Another reason may be that engineers are much more concerned with the functional aspects of eye-tracking than with the non-functional requirements.