We collected the MSP-GAZE database to study and evaluate appearance-based gaze estimation approaches. This corpus covers important factors that affect appearance-based gaze estimation: individual differences between users, head movement, and the distance between the user and the screen.
The MSP-GAZE database was collected in a laboratory setting with steady, sufficient illumination. During data collection, participants look at and click on points displayed at random positions on a monitor while their gaze behavior and mouse movements are recorded.
A total of 46 students from different disciplines at the University of Texas at Dallas participated in the data collection. The average age of the participants is 22.7 (min 19, max 35). They are balanced between genders, and span diverse ethnic groups, including Caucasian, Asian, Indian, and Hispanic participants. This diversity introduces a variety of facial appearances, allowing us to evaluate the effect of individual differences on the appearance model. Each subject participated in two data collection sessions held on different days, following the same protocol; collecting data from the same subject on different days lets us evaluate the consistency of the appearance model across time. The average interval between sessions is 7 days. Each session consists of 14 recordings with different purposes (training and testing) and settings in terms of head movement (with or without) and user-monitor distance (near, medium, far, and user-defined).
Our initial study focused on the eye-pair appearance eigenspace approach, in which projections onto the eye appearance eigenspace basis are used to build regression models that estimate the gaze position. We compared user-dependent models (training and testing on the same subject) with user-independent models (the test subject is excluded from the training data). As expected, because of individual differences between subjects, performance decreases when the models are trained without data from the target user. Our current study aims to reduce the gap between the user-dependent and user-independent conditions.
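To make the eigenspace-plus-regression idea concrete, the following is a minimal NumPy sketch of the pipeline on synthetic data. The image dimensions, component count, and variable names are illustrative assumptions, not the settings used for MSP-GAZE.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for eye-pair crops: each row is a flattened grayscale
# patch (hypothetical 20x50 region -> 1000 pixels). Real data would come
# from the recorded videos.
n_samples, n_pixels, n_components = 200, 1000, 20
latent = rng.normal(size=(n_samples, n_components))
basis_true = rng.normal(size=(n_components, n_pixels))
X = latent @ basis_true + 0.01 * rng.normal(size=(n_samples, n_pixels))

# Gaze targets (x, y) on screen; here synthetically tied to the latent factors.
W_true = rng.normal(size=(n_components, 2))
Y = latent @ W_true

# 1) Build the eye-appearance eigenspace via PCA (SVD of centered images).
mean = X.mean(axis=0)
Xc = X - mean
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
eigenbasis = Vt[:n_components]            # top eigen-directions ("eigen-eyes")

# 2) Project each image onto the eigenspace to get low-dimensional features.
Z = Xc @ eigenbasis.T                     # shape (n_samples, n_components)

# 3) Fit a linear regression from eigenspace coefficients to gaze position.
Z1 = np.hstack([Z, np.ones((n_samples, 1))])   # append a bias term
W, *_ = np.linalg.lstsq(Z1, Y, rcond=None)

def estimate_gaze(image):
    """Estimate the (x, y) gaze position for one flattened eye-pair image."""
    z = (image - mean) @ eigenbasis.T
    return np.append(z, 1.0) @ W
```

A user-dependent model would fit steps 1-3 on the target subject's own recordings, while a user-independent model would fit them on the pooled data of the remaining subjects.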
For further information on the corpus, please read:
Once the corpus is fully processed (eye-pair extraction and synchronization between the two cameras), we will release it to the research community.
Copyright Notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.