Effects of depth of field on eye movement

Depth of field is an important image feature, which can be used to enhance observers' perception of stereopsis. To explore how depth of field influences observers' attention in images, a within-subject experiment with two factors, two scenes and six levels of depth of field, was performed. A remote desktop eye tracker, Tobii X120, was used to record participants' eye movement during the experiment. The results showed that a limited depth of field could direct observers' attention to the sharp area. However, manipulating depth of field in the same scene may not influence the distribution of observers' attention. In addition, it was concluded that depth from the focus object influences observers' attention significantly, which means that the closer the object is, the more attention it obtains.


Introduction
When an observer focuses on an object in the real world, this object is projected onto the fovea of the retina and hence perceived as sharp. The images of other objects become more and more blurred with increasing distance from the focus plane. It is possible to create an image with the proper blur in it so that it matches the image of the real scene on the retina. The distance range within which objects are perceived as sharp in images is defined as depth of field.
Depth of field was first used by cinematographer Gregg Toland and director Orson Welles in the movie 'Citizen Kane' in 1941. In this movie, a very small aperture was used to make details sharp everywhere; this is known as 'deep focus'. Since then, depth of field has been widely used by photographers and cinematographers in photographs and movies as an important photographic technique. A small depth of field means a lot of blur, whereas a very large depth of field indicates that the image is sharp almost everywhere (see Fig. 1).
From a practical point of view, the depth of field effect is popular in the cinema and photography industries for a couple of reasons. First, it can be controlled to create a miniaturisation (diorama illusion) or magnification effect to impress the audience [1]. Second, it is believed that depth of field can improve the aesthetic appeal or realism of photographs [2][3][4]. This probably explains why so many photographs and movies show a sharp central object against a blurred background. Furthermore, a limited depth of field may enhance the feeling of stereopsis in 2D materials [5,6].
From a theoretical point of view, depth of field has attracted the attention of scientists in the area of visual perception since Pentland first proposed in 1987 that depth of field is a depth cue [7]. As concluded in Pentland's work, limited depth of field in photographs could help rebuild depth maps from realistic imagery. Since then, depth of field has been widely discussed as a depth cue [3,5,6,8]. In addition, depth of field has been studied as a means of directing observers' attention by manipulating the blurry part of an image [9][10][11].
We noticed that the above-mentioned research on depth of field mostly used psychophysical methods. The relationship between depth of field and eye movement has not been well documented, although eye movement data could directly show how depth of field directs viewers' attention. However, researchers focusing on virtual reality have investigated how depth of field and an eye-tracking system together improve users' immersion, perception, and performance in a virtual environment [3,12,13]. Hillaire et al. used an eye-tracking system to obtain users' focus point on the screen and adapt the amount of depth of field blur in real time in the virtual environment [3].
The purpose of the current study is to investigate how depth of field influences eye movement when observing photographs. The eye movement data, such as fixation count, fixation time, and area of interest, are discussed in order to further investigate how depth of field directs attention. To achieve the goal, we conducted a within-subjects experiment, in which participants were required to observe photographs with different levels of depth of field and their eye movement data were recorded by an eye tracker.

Participants
Six female and six male college students from Hohai University participated in the experiment. The mean age of the participants was 25.4 years with a standard deviation of 2.8 years. All the participants had normal or corrected-to-normal visual acuity. This research was approved by Hohai University and done according to Chinese Law and common local ethical practice.

Apparatus
This experiment was performed on a DELL liquid crystal display (LCD). The monitor was set to a screen resolution of 1280 × 800 pixels. The participants' eye movement was recorded by a remote desktop eye tracker, Tobii X120. The sampling rate of the eye tracker was 120 Hz, and the tracker tolerated slight head movement. Hence, a chinrest was not used in the current experiment.

Stimuli
Stimuli used in the experiment were generated with an Olympus E-330 d-SLR camera with a 50-mm Olympus Zuiko macro lens. The aperture of the camera lens could be set from F2.0 (the smallest depth of field) to F22 (the largest depth of field). The angle of view of the camera was 13.2° (horizontally) × 9.9° (vertically). The size of the stimuli displayed on the screen was constrained by the angle of view of the camera. More details can be found in the authors' previous work [14].
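As a worked illustration of this constraint, the sketch below computes the on-screen size at which a stimulus would subtend the camera's 13.2° × 9.9° angle of view. It assumes the 70 cm viewing distance reported in the Procedure section and a stimulus centred on, and perpendicular to, the line of sight; the function name is introduced here for illustration only.

```python
import math

def size_for_visual_angle(angle_deg: float, distance_cm: float) -> float:
    """Extent (cm) of a centred, fronto-parallel stimulus that subtends
    angle_deg when viewed from distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

VIEWING_DISTANCE_CM = 70.0  # viewing distance reported in the Procedure section

width_cm = size_for_visual_angle(13.2, VIEWING_DISTANCE_CM)
height_cm = size_for_visual_angle(9.9, VIEWING_DISTANCE_CM)
print(f"{width_cm:.1f} cm x {height_cm:.1f} cm")
```

Under these assumptions the stimulus would measure roughly 16 cm by 12 cm on the screen.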
Two different scenes were built to make the stimuli (shown in Fig. 2). Each contained six different objects standing at regular intervals on a white ground. The depth (vertical distance) between every two objects was always 15 cm and the scale of the scene was fixed at 75 cm. When taking the picture of the scene, the focus object was always the foremost object: Woody or Apple. The other objects were gradually blurred depending on the vertical distance to the foremost object and the aperture size of the camera. As shown in Fig. 2, each scene was named after its focal object: the 'Woody' scene and the 'Apple' scene. It could be seen that the Woody scene contained more details and had a higher colour contrast than the Apple scene. In addition, the Apple scene had more overlapping.
For each scene, six levels of depth of field were created by manipulating the aperture size of the camera: F2, F3.2, F5, F8, F13, and F20. Observers were able to discriminate these six levels of depth of field [14]. F2 represented the smallest depth of field and F20 represented the largest depth of field. To be more intuitive, we calculated the size of depth of field with the following equations: In the equations, L is the focal length of the camera, a is the aperture size, β is the angular size of the blur circle, and D is the distance from the focal object to the lens. The calculated size of depth of field in mm is shown in Table 1.
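To give a feel for how aperture drives depth of field, the sketch below uses the standard thin-lens near-limit and far-limit approximation. This is not necessarily the formulation behind Table 1: it works with a linear circle of confusion rather than the angular blur circle β, and both the subject distance and the circle-of-confusion diameter are hypothetical values chosen for illustration. Only the 50 mm focal length and the six f-numbers come from the text.

```python
def depth_of_field_mm(focal_mm, f_number, subject_mm, coc_mm):
    """Total depth of field (far limit minus near limit) in mm,
    using the standard thin-lens approximation."""
    blur_term = f_number * coc_mm * (subject_mm - focal_mm)
    if blur_term >= focal_mm ** 2:
        return float("inf")  # far limit lies at or beyond infinity
    near = subject_mm * focal_mm ** 2 / (focal_mm ** 2 + blur_term)
    far = subject_mm * focal_mm ** 2 / (focal_mm ** 2 - blur_term)
    return far - near

FOCAL_MM = 50.0     # focal length of the lens, from the text
SUBJECT_MM = 500.0  # hypothetical distance from lens to focal object
COC_MM = 0.015      # hypothetical circle-of-confusion diameter
F_NUMBERS = [2, 3.2, 5, 8, 13, 20]

dof_by_aperture = {
    n: depth_of_field_mm(FOCAL_MM, n, SUBJECT_MM, COC_MM) for n in F_NUMBERS
}
for n, dof in dof_by_aperture.items():
    print(f"F{n}: {dof:.1f} mm")
```

With these assumed inputs the depth of field grows monotonically, and roughly in proportion to the f-number, which is the behaviour the six aperture levels rely on.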

Procedure
This experiment was a within-subject design, including two independent variables: six levels of depth of field and two scene contents. The participants were seated 70 cm in front of the LCD screen in a normally lit office room. No chin rest was used; however, participants were asked to try their best not to move their head during the experiment. The experiment was conducted with Tobii Studio. Before the experiment started, the eye tracker was calibrated for each participant. The experiment consisted of 12 stimuli (6 levels of depth of field × 2 scene contents) in total. An instruction image appeared first, and participants pressed the space bar when they were ready to start. Participants were required to view each stimulus freely for 10 s. Before each stimulus appeared, a grey image with a cross in the centre was shown as a mask, and participants had to fixate on the cross for 3 s. All the stimuli appeared automatically, following a fixed order. The experimental sequence is shown in Fig. 3. The main experiment lasted for <3 min.
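The timing described above can be checked with a short sketch that builds the mask-then-stimulus timeline and sums its duration. The ordering of the 12 stimuli below is an assumption made for illustration; the text only states that the order was fixed.

```python
from itertools import product

APERTURES = ["F2", "F3.2", "F5", "F8", "F13", "F20"]
SCENES = ["Woody", "Apple"]
MASK_S = 3       # fixation cross shown before each stimulus
STIMULUS_S = 10  # free-viewing time per stimulus

# Assumed order: all Woody stimuli, then all Apple stimuli.
trials = list(product(SCENES, APERTURES))

timeline = []
for scene, aperture in trials:
    timeline.append(("mask", MASK_S))
    timeline.append((f"{scene}-{aperture}", STIMULUS_S))

total_s = sum(duration for _, duration in timeline)
print(f"{len(trials)} trials, {total_s} s ({total_s / 60:.1f} min)")
```

Twelve trials of 13 s each give 156 s, which agrees with the reported duration of under 3 min.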

Data preparation
Although 12 participants took part in the experiment, the sample quality of 5 participants was <80%, meaning that >20% of their eye movement data were not recorded. Hence, only the data of the remaining seven participants were included in the following analysis.
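A minimal sketch of this exclusion rule is given below. The per-participant quality percentages are invented for illustration, and the helper names are hypothetical; in practice the percentage of valid samples would be taken from the eye-tracking software's recording summary.

```python
QUALITY_THRESHOLD = 80.0  # minimum percentage of valid gaze samples

def valid_sample_percentage(valid_flags):
    """Percentage of gaze samples flagged as valid in one recording."""
    if not valid_flags:
        return 0.0
    return 100.0 * sum(valid_flags) / len(valid_flags)

def select_participants(quality_by_participant, threshold=QUALITY_THRESHOLD):
    """Keep participants whose sample quality reaches the threshold."""
    return sorted(
        pid for pid, quality in quality_by_participant.items()
        if quality >= threshold
    )

# Hypothetical sample-quality values (per cent) for 12 participants.
quality = {
    "P01": 95.2, "P02": 71.4, "P03": 88.0, "P04": 64.9,
    "P05": 91.3, "P06": 79.5, "P07": 83.7, "P08": 55.0,
    "P09": 97.1, "P10": 86.4, "P11": 76.8, "P12": 90.0,
}

kept = select_participants(quality)
print(len(kept), "participants retained:", kept)
```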

Heat maps
Fig. 4 provides a quick glance at the distribution of observers' visual attention. Red areas received more attention. In the Woody scene, observers' attention was clearly concentrated on Woody. With increasing depth of field, observers' attention gradually moved to other objects. In the Apple scene, similar to the Woody scene, the focus object Apple obtained the most attention. When depth of field increased, the heat map became more symmetrical.
The results suggest that a small depth of field could direct viewers' attention to the focus object. Viewers' attention could be manipulated by controlling the size of depth of field. The results confirm previous research that reported a relationship between depth of field and attention [9,10]. However, the change in the distribution with a changed depth of field seemed to differ between the two scenes. Observers' visual attention was spatially more uniform in the Apple scene than in the Woody scene. The reason could be that there were many overlapping areas in the Apple scene.

Fixation count and fixation time on the whole image
To analyse the effects of depth of field on viewers' eye movement, the fixation count and fixation time were recorded. First, each participant's fixation count and fixation time on the whole stimulus were summed. Figs. 5a and b show that the total fixation count and time fluctuate when depth of field increases from a small value to a relatively larger value. When depth of field is large, fixation count and time gradually level off. The mean fixation count and time follow a similar trend (see Figs. 5c and d). The effects of depth of field on fixation count and time seem similar for both the Woody scene and the Apple scene.
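The aggregation described above can be sketched as follows, assuming fixations are available as (participant, stimulus, duration) records. The records and function names are hypothetical; an actual analysis would start from the fixation export of the eye-tracking software.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical fixation records: (participant, stimulus, duration in s).
fixations = [
    ("P01", "Woody-F2", 0.30),
    ("P01", "Woody-F2", 0.45),
    ("P01", "Woody-F20", 0.20),
    ("P02", "Woody-F2", 0.25),
    ("P02", "Woody-F20", 0.35),
    ("P02", "Woody-F20", 0.40),
]

def summarise(records):
    """Fixation count and summed fixation time per (participant, stimulus)."""
    count = defaultdict(int)
    time = defaultdict(float)
    for participant, stimulus, duration in records:
        count[(participant, stimulus)] += 1
        time[(participant, stimulus)] += duration
    return dict(count), dict(time)

def mean_per_stimulus(per_participant):
    """Average a per-(participant, stimulus) measure over participants.
    Participants with no fixation on a stimulus are absent from the input
    and are therefore not counted as zero here."""
    grouped = defaultdict(list)
    for (_, stimulus), value in per_participant.items():
        grouped[stimulus].append(value)
    return {stimulus: mean(values) for stimulus, values in grouped.items()}

count, time = summarise(fixations)
mean_count = mean_per_stimulus(count)
mean_time = mean_per_stimulus(time)
print(mean_count)
print(mean_time)
```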
We performed a 6 (depth of field) × 2 (scene) repeated-measures analysis of variance (ANOVA) and found that neither Depth of field nor Scene had a significant main effect on fixation count or fixation time. The post hoc least significant difference (LSD) pair-wise comparisons showed that the fixation count differed significantly between depths of field of 5.7 and 9.2 mm (P = 0.023). In other words, fixation count in an image increased significantly when depth of field increased from 5.7 to 9.2 mm. For fixation time, however, the post hoc LSD pair-wise comparisons showed that fixation time on an image decreased significantly from 4.03 to 3.58 s when depth of field increased from 9.2 to 14.5 mm.
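For a within-subject factor, an LSD pair-wise comparison is often computed as an unadjusted paired comparison between two levels. As a rough illustration under that assumption, the sketch below computes a paired t statistic for two depth-of-field levels. The per-participant fixation counts are invented, the function name is hypothetical, and the exact error term used by a given statistics package may differ.

```python
import math
from statistics import mean, stdev

def paired_t(level_a, level_b):
    """Paired t statistic and degrees of freedom for two within-subject
    conditions measured on the same participants (same order in both lists)."""
    if len(level_a) != len(level_b) or len(level_a) < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    diffs = [b - a for a, b in zip(level_a, level_b)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1

# Hypothetical fixation counts for seven participants at two levels.
counts_smaller_dof = [20, 22, 19, 25, 21, 23, 20]
counts_larger_dof = [23, 24, 22, 26, 25, 24, 23]

t_value, df = paired_t(counts_smaller_dof, counts_larger_dof)
print(f"t({df}) = {t_value:.2f}")
```

The resulting statistic would then be compared against the t distribution with the printed degrees of freedom, without any correction for multiple comparisons.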
The results indicate that when depth of field is very small, fixation count and fixation time change when depth of field changes. A very small depth of field means that there is a lot of blur in the image. Observers' area of interest is mainly in the sharp part. When depth of field gets larger, observers may find that the background gets sharper. Hence, more areas attract attention, leading to a higher fixation count. Meanwhile, the fixation time may get smaller because of more saccadic and scanning eye movement.

Fixation count on each object
The stimuli used in the current experiment included six objects at six different depths. The amount of blur of each object is determined by its depth and the level of depth of field. In this section, each object was selected as an independent area of interest. Figs. 6a and b illustrate the total fixation count over all participants on each object across six levels of depth of field in the Woody scene and Apple scene, respectively. It is obvious that the focus object (depth = 0 cm) received the most attention. Its total fixation count is much higher than that of objects at other depths in both the Woody scene and the Apple scene. For the focus object, there is a decreasing trend with increasing depth of field. Contrary to the focus object, objects at depths of 15 and 75 cm receive an increasing fixation count with increasing depth of field in the Woody scene. However, for objects at 30, 45, and 60 cm, the relationship between depth of field and fixation count fluctuated. Fig. 7 shows how the depth of each object influences the total fixation count in the Woody scene and Apple scene separately. It is obvious that the total fixation count decreases with increasing depth from the focus object. The trend seems not to be influenced by either the scene content or the size of depth of field.
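Counting fixations per area of interest can be sketched as follows. The rectangles are hypothetical pixel regions on a 1280 × 800 screen, keyed by each object's depth from the focus object. Where regions overlap, the sketch assigns the fixation to the nearer object on the assumption that it occludes the objects behind it; this tie-breaking rule is an assumption rather than a documented step.

```python
# Hypothetical AOI rectangles (left, top, right, bottom) in pixels,
# keyed by the object's depth from the focus object in cm.
AOIS = {
    0: (100, 300, 400, 780),
    15: (380, 280, 560, 650),
    30: (540, 260, 700, 560),
    45: (680, 240, 820, 480),
    60: (800, 220, 920, 420),
    75: (900, 200, 1000, 360),
}

def aoi_for_point(x, y, aois=AOIS):
    """Depth key of the nearest AOI containing (x, y), or None if outside all."""
    for depth in sorted(aois):  # nearest object first
        left, top, right, bottom = aois[depth]
        if left <= x <= right and top <= y <= bottom:
            return depth
    return None

def count_fixations_per_aoi(points, aois=AOIS):
    """Number of fixation points falling in each AOI."""
    counts = {depth: 0 for depth in aois}
    for x, y in points:
        depth = aoi_for_point(x, y, aois)
        if depth is not None:
            counts[depth] += 1
    return counts

# Hypothetical fixation positions in screen pixels.
fixation_points = [(250, 500), (390, 400), (500, 300), (950, 250), (50, 50)]
print(count_fixations_per_aoi(fixation_points))
```

Summing fixation durations instead of incrementing a counter would give the per-object fixation time in the same way.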
A 6 (depth of field) × 6 (depth) × 2 (scene) repeated-measures ANOVA was performed on the data set. It was found that Depth was a significant main factor that influenced fixation count on each object (F(5, 30) = 19.53, P = 0.001), whereas depth of field did not have a significant main effect. The results suggest that the farther away the object is from the focus object, the less attention the object obtains. The results are consistent with previous work, which concluded that objects closer to the observer received more attention [15,16]. The reason could also be that the farther away the object is from the focus object, the more blurred the object is. Hence, we may conclude that fixation count does not change significantly in the same scene with different levels of depth of field; however, limited depth of field in a scene could affect fixation count on objects at different depths.

Fixation time on each object
We performed a 6 (depth of field) × 6 (depth) × 2 (scene) repeated-measures ANOVA and found that depth was a significant main factor that affected fixation time (F(5, 20) = 14.01, P = 0.001). Scene content and depth of field did not influence fixation time significantly. However, post hoc LSD pair-wise comparisons showed that fixation time in the scene with a depth of field of 9.2 mm was significantly different from that with a depth of field of 14.5 and 39.1 mm (P = 0.05 and P = 0.001).
The results shown in Figs. 8 and 9 suggest that observers spend more time perceiving the objects closer to them. They also indicate that blur introduced by depth of field does not influence the distribution of observers' attention.
In the current study, depth of field was not found to be a significant main factor for either fixation count or fixation time. These results suggest that depth of field does not influence observers' viewing behaviour. However, this is not a general conclusion, for several reasons. First, participants were free to look at the stimulus wherever they wanted. The results may change when participants have other tasks such as change detection, quality evaluation, gaming, and so on. Second, the scene content in the current experiment was semi-natural and not very interesting. Photographs from daily life may contain more information and therefore lead to more natural eye movement.

Conclusion
In summary, a limited depth of field can direct observers' attention to the sharp area. Manipulating depth of field in the same scene may not influence the distribution of observers' attention. The authors also concluded that depth from the focus object influences observers' attention significantly. The closer the object is, the more attention it obtains.