A Comparative Study of Pointing Techniques for Eyewear Using a Simulated Pedestrian Environment



Introduction
Although introduced in the 1960s [54], until recently see-through head-mounted displays (eyewear) were essentially dedicated to military and research environments [35]. Recently, however, there has been substantial commercial interest in developing eyewear technologies for public use, including Microsoft Hololens or Epson Moverio. While current devices have limitations, like narrow field-of-view, hardware is quickly improving, creating new possibilities for interaction in the office and home as well as during daily activities such as commuting. Indeed, academic and industry experts have suggested that eyewear displays might underpin a foundational change for the next generation of mobile interaction [7,30,38,27]. One potentially important advantage of eyewear displays is that they enable head-up interaction, which may enhance the user's situational awareness while engaged in concurrent activities, such as walking along a busy sidewalk. Current phones, in contrast, encourage a posture in which the head is bent down rather than looking outwards at the environment, causing poor environmental focus and attention, and raising significant safety concerns [3,40,19,34,26].
Research on input methods for eyewear displays is in its infancy, especially when considering the interplay between the user's external urban activity and their internal (eyewear-driven) task. We focus on two-dimensional pointing as a basic interaction for input on eyewear displays. While other interaction modalities are being explored (for example gesture and voice), pointing remains fundamental in most vision-based human-computer interfaces. However, the design and evaluation of pointing techniques for eyewear displays poses particular challenges: their design must be properly adapted to mobility (e.g., efficient interaction while walking), and evaluations should account for environmental factors (e.g., navigating through a crowd), including impact on situation awareness and social acceptability.
As a first step in exploring the design space of eyewear pointing techniques, we examined practical solutions that can be readily adapted and implemented using today's off-the-shelf devices. We used a phone as an input device, a choice based on versatility (numerous sensors packaged in a small volume, allowing it to simulate handheld trackpads or in-air controllers for example), its ubiquitous ownership (smartglass users are likely to own and carry a phone) and its mobile pragmatics (for example bimanual techniques or bulky apparatus are impractical while walking).
We present an empirical study in which pointing techniques that have been proven useful in comparable contexts (like standing in front of an ultra-wall [39]) are adapted to phone input for eyewear displays, including variants of in-air pointing as well as using the phone as a hand-held trackpad. Direct touch on the phone, without eyewear, was used as a control condition. We compare these techniques in three different environments: no simulator (while stationary), simulated empty street and simulated busy street. Running such an experiment directly in a busy city street would put participants at risk, and is therefore ethically undesirable. Instead, inspired by the work of Schwebel [49], we developed a street simulator that enabled us to gain insights on the use of these techniques in pedestrian environments, while keeping our participants safe and preserving internal validity. Finally, we looked at three key metrics: perceived social acceptability, performance as a function of the simulated environment, and impact on situation awareness.
Our results demonstrate that (a) the trackpad technique was the most socially acceptable, most accurate, and fastest for eyewear, and (b) the in-air techniques (which are increasingly integrated in commercial AR products) tended to perform poorly and were subjectively unacceptable. Importantly, while a key expected benefit of eyewear is that its head-up view should improve situation awareness and safety while walking [38,30,27], our results indicate that this may not be true: situation awareness was worse when using the candidate techniques (i.e., with eyewear) than in the control condition without eyewear.
We make three specific research contributions:
1. empirical evidence of the perceived social acceptability of pointing techniques for eyewear, collected through interviews and a web survey;
2. empirical evidence of the relative performance of eyewear pointing techniques in terms of speed, accuracy and ability to maintain situation awareness (for example avoid simulated pedestrian hazards);
3. demonstration of a Virtual Reality method for safely evaluating interaction techniques in a simulated pedestrian environment.

Background and Related Work
Two categories of previous research are briefly reviewed in the following subsections: first, general background research on pointing methods that might be adapted to eyewear displays, and second, research that focuses on interaction while engaged in other activities such as walking a busy street.

Pointing with Eyewear
Pointing to targets is an elemental component of interaction with graphically displayed content, and it is therefore important that efficient and acceptable pointing methods are developed for eyewear displays. While alternative selection methods that negate the need for pointing have been proposed, such as speech (e.g., [31]) or hand gestures (e.g., [36]), pointing-based methods offer substantial advantages due to their familiarity and learnability ('see and point versus learn and remember' [52]). Abundant novel or improved pointing techniques are represented in the HCI literature, and many could be adapted for eyewear. A key requirement for pedestrian pointing, however, is that the method can be operated while standing or walking. When eyewear displays are explicitly considered, the most commonly suggested pointing techniques are based on in-air pointing, in which the movement of the hand or a hand-held object is mapped to cursor movement (e.g., [9,17,22]); a similar in-air pointing method is used with the Microsoft Hololens. Another approach is to use trackpad-like interaction, with a dedicated device held in the hand [39,33], on the body [15,6,5,47,16] or in the environment [14,56]. Finally, the use of eye-tracking or head-tracking is also possible [21,39].
All these pointing techniques are promising and most are valid candidates for eyewear pointing (providing the sensing mechanism can be made mobile). Research on large display interaction is also of interest as it often considers the users' need to stand or walk near the display [39,55,28]. Only a few previous works specifically investigated pointing on eyewear. The work of Jalaliniya et al. on head and eye tracking [21], or Hsieh et al. on gloves [17] are examples. However, as far as we know, these works either do not comprehensively tackle social acceptability and formally investigate pointing performance under realistic urban movement, or rely on exotic, unrealistically bulky hardware. As a result, it is still unclear what is currently the best solution.
We focused on one-handed pointing techniques rather than bi-manual techniques because users often carry objects while walking. We used phones as input devices because they are readily available without requiring users to acquire a specialised input device; they also embed sensors that provide both touch (trackpad) and movement sensing (in-air controllers).

Interaction in Pedestrian Environments
The design of interaction techniques for use in pedestrian environments raises special challenges, including the need for the user to maintain situation awareness during interaction (to reduce safety concerns such as collisions with people or vehicles) and the need for the movements or actions required for interaction to be socially acceptable. In addition, there are also challenges for researchers in evaluating new technologies for pedestrian environments.
Situation awareness. Several recent studies have highlighted evidence that the use of mobile phones in urban areas is elevating the risk of personal injury [3,40]. Rather than looking upwards and outwards at the environment, when interacting with a phone, the user's posture has the head bent down, causing poor environmental focus and divided attention [42,48,31]. This leads to the emergence of the "phone zombies" phenomenon: pedestrians who pay insufficient attention to their environment while looking at their phones, sometimes walking into other people or traffic [3,40,19,34]. In an attempt to ease these problems, cities such as Singapore and Melbourne have started to install LED strips on pavements at pedestrian crossings [26].
Rather than altering the environment, another approach to improving situation awareness is to alter the interface [31], and to explore interaction mechanisms that are better suited to the challenges of pedestrian environments [30,57,31,43], such as eyes-free interaction [57,20] or the use of eyewear [30,17]. Researchers have argued that eyewear displays allow more seamless integration between the display of information and the surrounding environment, and as a result, improve situation awareness [38,30,27]. However, there is a lack of empirical studies testing this assumption, possibly due to the risks associated with placing experimental participants in congested urban settings.
Social Acceptability. Montero et al. define social acceptability as the combination of the user's social acceptance, which defines how comfortable a user is in executing a particular action, as well as spectators' social acceptance, which refers to the impression it makes on witnesses of such action [37]. Interacting with eyewear displays needs to be socially acceptable for public performance; this is especially relevant for eyewear given the current scepticism toward such devices [25]. While the social acceptability of actions may change as technologies become widespread [37], the likelihood of technology adoption is greatly improved if its interaction requirements are socially acceptable [12,45]. Several factors are known to influence the social acceptability of actions, including movement duration (the shorter the better) [8], and movement amplitude (small, discreet movements are better) [37,46].

Evaluating interaction techniques for pedestrian environments
There are well known trade-offs between lab and field studies, with lab studies facilitating internal validity at the cost of external validity, and field studies the inverse [24,18].
Beyond concerns of internal validity, there are additional and important safety concerns that complicate the potential conduct of field studies in urban pedestrian environments [49]. Consequently, researchers have examined the use of simulations to reproduce some of the realistic interaction context in safe settings. When the research focus is on the act of walking (e.g., to understand motor perturbations to interaction caused by pacing), treadmills have been used [4,2,41]. When the research focus is on environmental artefacts, video projection [32] and virtual reality [18,23,50,49] can be used. We chose the latter approach. Notably, we were inspired by the work of Schwebel et al., who used a simulated street environment to investigate child safety at road crossings [50]. Using a simulated environment not only replicates common pedestrian constraints, but also provides better control of parameters and allows us to put participants in simulated risky situations without physical risks.

Perceived Social Acceptability
To reiterate, social acceptability is a key issue in the design of interaction techniques for use in public settings. We structured our investigation of social acceptability in two parts: (1) semi-structured interviews, and (2) a large-scale web survey. The goal of the study was to seek participants' perception of the social acceptability of the investigated interaction techniques. We focused on the user's social acceptance [37].

Table 1. Techniques included in our social acceptability study

In-Air Techniques
  Front Translation: Translation of the hand on a plane facing the user [39,55]
  Down Translation: Translation of the hand on a plane parallel to the ground [22]
  Front Rotation: Rotation of the wrist and forearm as if laser-pointing on a plane in front of the user (Fig. 2 left) [39,55]
  Down Rotation: Rotation of the wrist and forearm as if laser-pointing on the ground [22]
  Front Taps: In-air taps on a plane parallel to the user (back-and-forth movements of the forearm and the index finger) [9]

On-Body Touch Techniques
  Finger Touch: The tip of the index is used as a trackpad controlled with the thumb [5]
  Palm Touch: The joined area of the four long fingers is used as a trackpad controlled with the thumb [6]
  Pocket Touch: The pocket area is used as a trackpad [47]

On-Device Touch Technique
  Trackpad: A hand-held device is used as a touch surface (Fig. 2 right)

Our interview sessions were inspired by Rico and Brewster's methodology [45]: participants were asked to perform different gestures as if they were interacting with the device, and we gathered their feedback on the gestures' social acceptability (for public and private use). All interviews took place in a public setting within a local university campus.
For the web survey, participants watched online videos of the techniques, and were asked to rate their social acceptability. Videos are often used as a way to assess social acceptability [1,45,6,51]. The web survey was included to broaden participation in the study.
In our interview sessions, we examined a set of nine pointing techniques selected from previous literature (cf. Table 1). Five of them were in-air gestures and one used a hand-held device as a trackpad. Though not the main focus of this work, we also included three body-touch techniques to learn about people's perception of less-common input methods.
After the interview sessions, we discarded the techniques with very poor rankings and kept Front Rotation, Down Rotation, Trackpad, Finger Touch and Pocket Touch for the web survey. Palm Touch was excluded because it was comparable to Trackpad. Front Rotation was slightly modified to allow movements from both the elbow and the wrist. Pocket Touch, which received polarized feedback in our interviews, was also modified so that the control area was shifted to the side of the thigh, further away from the genitals.

Participants and Procedure
For the interview sessions, we recruited eight participants (5 female), aged 22 to 45 years old (M = 32.1, SD = 6.5) from our students and university staff. Seven participants lived in Singapore and one in France. No compensation was offered. Each session lasted 45 minutes.
For each of the nine pointing techniques, we carried out the following procedure: (1) demonstrated the pointing movements for that technique and made sure its principles were understood, (2) asked the participant to perform the movements for approximately 30 seconds in a busy public area of our university campuses, (3) conducted a semi-structured interview focusing on their perception of the social acceptability of the techniques, at home or in the street. We finished the session by asking our participants to rank all techniques by order of social acceptability for usage in a private setting (e.g., home) and public setting (e.g., a street).
From our web survey, we gathered 56 responses (25 female, 1 preferred not to disclose) from 18 to 57 years old (M = 27.7, SD = 8.2, 7 preferred not to disclose). The web survey was advertised using our university's mailing lists. Of the participants, 50% were students (undergraduates and postgraduates), 26% were IT professionals and 5% were in academia and research. They were mostly from South-East Asia (n = 41) and Europe (n = 8). The survey was divided into six parts, one dedicated to each technique, and a summary. In each survey part, participants were shown a short video of an actor walking in a street and demonstrating the use of the technique. Then they were asked to provide feedback on the perceived social acceptability of these techniques. Finally, they ranked techniques in order of perceived acceptability both for public and private contexts (1, most acceptable; 6, least acceptable).

Results and discussion
During the interview sessions and in the web survey, we asked participants about their perception of the social acceptability of the techniques in private and in public. We did not observe any statistically significant effect of the participants' continent of origin on the recorded answers.
Private Use. In the interviews, Finger Touch was ranked as being the most socially acceptable technique (M = 1.8/9), followed by Phone Touch (M = 3.3/9), Pocket Touch and Palm Touch (both M = 3.8/9). Among the in-air techniques, Front Taps was ranked as the least socially acceptable (M = 4.6/9). A Friedman test showed a significant effect of technique on average ranking (χ²(8) = 37.3, p < .00001), although Bonferroni-corrected analysis showed no pairwise differences.
In general, the interview results suggested that for private use, smaller on-body or on-device actions were perceived as more socially acceptable than larger in-air movements of the device. On-body movements were perceived to be less tiring (2 interview participants) and as a result easier to use in private, and on-device actions were reported as being familiar (5). The in-air techniques were considered as "tiresome" (4) and "intrusive" (5). Our web survey results tended to confirm this trend: we found a significant main effect of technique on the average ranking for private use (χ²(5) = 108.3, p < .0001, see Fig. 3).
Consistent with previous work [37,46], participants expressed concerns with high amplitude movements. In particular, five interview participants reported that Front Translation exceeded their "personal space", and got "in the way of others". Participants also expressed strong concerns about Front Taps [9], which made them appear as though they were pointing at others, explaining the large ranking difference compared to a private setting. This is potentially important, as contemporary implementations such as Microsoft Hololens use this modality as a primary means for interaction. Finally, Pocket Touch was polarizing in our interviews: half of our participants expressed little social concern, while the other half strongly objected to what they perceived as a sexually suggestive gesture (2 participants, 1 male and 1 female, even entirely refused to perform the gesture in public as per protocol).
As suggested by one of the interview participants, in the web survey videos we moved the control area for Pocket Touch further to the outside thigh region. This improved the ranking of the technique compared to our interviews. The rest of our web survey results tend to confirm the trend observed during the interviews: we found a main effect of technique on the average ranking for public use (χ²(5) = 119.7, p < .0001, see Fig. 3).
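The Friedman statistics reported in this section are computed from per-participant rankings. The following is a minimal sketch of that computation, assuming complete rankings without ties (one rank per technique per participant); the function name `friedman_chi_square` and the toy rankings are hypothetical illustrations and are not the study's data.

```python
def friedman_chi_square(rankings):
    """Friedman chi-square for n raters who each rank k items (no ties).

    rankings: list of n lists, each a permutation of 1..k.
    Returns (chi_square, degrees_of_freedom).
    """
    n = len(rankings)        # number of participants
    k = len(rankings[0])     # number of ranked techniques
    # Sum of the ranks received by each technique across participants.
    rank_sums = [sum(row[j] for row in rankings) for j in range(k)]
    chi2 = (12.0 / (n * k * (k + 1))) * sum(s * s for s in rank_sums) \
        - 3.0 * n * (k + 1)
    return chi2, k - 1


# Toy example (illustrative only): 4 participants ranking 3 techniques.
toy_rankings = [[1, 2, 3], [1, 3, 2], [2, 1, 3], [1, 2, 3]]
toy_chi2, toy_df = friedman_chi_square(toy_rankings)
print(toy_chi2, toy_df)
```

The degrees of freedom equal the number of ranked items minus one, which is consistent with the χ²(8) reported for the nine interview techniques and the χ²(5) reported for the six-point web survey ranking.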

Performance and situation awareness
We explored pointing techniques enabled by everyday devices and usable by pedestrians. We compared the three techniques presented in Fig. 2: Front Rotation, Relaxed Rotation and Trackpad. Front Rotation requires positioning the phone flat (screen up), then pressing and holding the screen while rotating the wrist and forearm left, right, up, or down to move the cursor, not unlike tilt techniques [44]. Trackpad requires sliding the thumb on the screen to move the cursor. Relaxed Rotation requires holding the phone sideways while keeping the arm down in a relaxed position; the cursor is then moved by pressing and holding on screen while rotating the wrist left, right, up, or down. We designed Relaxed Rotation to require movements comparable in amplitude to Down Rotation. As a result, we believe that it should be perceived to have similar social acceptability (Down Rotation was perceived as the most socially acceptable in-air technique in the previous study). In all three techniques, target acquisition could be performed either by tapping on the screen or pressing one of the volume buttons. In practice, and due to the different grasps, the volume buttons were only used with Relaxed Rotation.
We included direct touch pointing on the phone display as a control condition, with participants instructed to hold and interact with the phone using one hand. Eyewear was disabled and removed in this condition, so participants had to look down while acquiring the targets.
Except for the control condition, all techniques made use of a mobile phone as an indirect, eyes-free controller. The controller was always manipulated with only one hand because pedestrians often need their other hand for activities such as opening doors, carrying bags, etc. In all techniques but direct touch (control), the visual feedback was exclusively displayed on the smartglasses.
Our social acceptability study included several on-body techniques that we did not include in this experiment because we wanted to focus on currently pragmatic phone-based techniques. Furthermore, our pre-tests and pilots indicated that the Down Rotation technique was excessively hard to control, so we eliminated it. In-air translation-based techniques and Front Taps were also excluded due to their poor social-acceptability findings in the previous study.
We compared the remaining techniques under three different environments: No Simulator, in which participants stood while performing the pointing task; Empty Street, where participants walked in an empty street simulation with no red lights or pedestrians; and Crowded Street, where participants walked in a street simulation including traffic lights and pedestrians (Fig. 1).
We formulated the following hypotheses:
H1: Users achieve the fastest pointing with Phone because of their familiarity with traditional direct-touch pointing.
H2: Users achieve the lowest walking speed and poorest situational awareness with Phone because they are required to look down (at the phone).
H3: In the two street environments, users achieve faster pointing with Trackpad than with Front Rotation and Relaxed Rotation, because they are accustomed to trackpads and because the technique's input is arguably less sensitive to walking movements.
H4: Users achieve the highest walking speed and best situational awareness using Trackpad because they are not required to look down, and Trackpad's input is arguably less sensitive to walking movements.

Street Simulation
Exploring safety or situation awareness in the wild implies putting participants at risk (e.g., within close vicinity to vehicle traffic), which is not ethically acceptable. Instead, inspired by previous works in social science [49], we rely on a street simulator (see Fig. 1) to investigate the ability of users to maintain situational awareness while interacting with the eyewear device. Participants stood in front of a wide display, and their body movements were tracked using fiducial markers. Walking on the spot caused the camera to move forward at a speed that the participant could control (treadmills, often used in previous works [4,2,41], do not allow pace control). Participants had to step sideways to avoid incoming pedestrians in the Crowded environment, and stop at red lights.
As realistic as it is, a simulation cannot be as externally valid as an in-the-wild experiment. The generalizability of our findings to real street scenarios remains for further work. Nevertheless, the method does require participants to remain aware of the situation and as a result provides actionable insights on situation awareness.
Street Elements. Several factors influence a pedestrian's walking behavior, such as street layout, illumination, and other pedestrians. Previous work in social sciences has focused on distracted behavior in road-crossings [49,50,3]. However, Oulasvirta et al. observed that the most attention-taxing situations encountered by pedestrians are when they walk in busy streets [42]. After discussion and further observations of our own behavior in the street, we included incoming pedestrians and changing traffic lights.
We used a simple street layout: a series of blocks with the same length and walkway width, not unlike some North-American cities. We designed these blocks to appear shorter (in length) than usual, to increase the number of intersections encountered by the participants.
Layout and Traffic Lights. Each street block was separated by a crosswalk and a traffic light. Traffic lights could have four different behaviors: Fixed Green, Fixed Red, Changing Green and Changing Red. Fixed Green remained green; Changing Green and Changing Red changed from one to the other when participants were 0.016 to 0.039 blocks away. Changing Green and Fixed Red switched to green after a wait time of 1 to 2.5 seconds. The ordering of light behaviors was randomized, but we ensured that each behavior appeared at least once every four lights. Audio feedback of a car honk was played if participants jaywalked.
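The constraint that every behavior appears at least once every four lights can be met in several ways; the text does not say which was used. The sketch below shows one simple option, assuming the constraint applies to consecutive, non-overlapping groups of four lights: each group is an independent shuffle of the four behaviors. The function name `light_sequence` is hypothetical.

```python
import random

BEHAVIORS = ["Fixed Green", "Fixed Red", "Changing Green", "Changing Red"]


def light_sequence(n_lights, rng=None):
    """Return n_lights behaviors, shuffled within consecutive groups of four.

    Assumption: 'at least once every four lights' is interpreted as
    'each aligned group of four lights contains all four behaviors'.
    """
    rng = rng or random.Random()
    sequence = []
    while len(sequence) < n_lights:
        group = BEHAVIORS[:]      # copy so the constant is never mutated
        rng.shuffle(group)        # random order inside the group
        sequence.extend(group)
    return sequence[:n_lights]    # trim a possibly incomplete last group


# Example: a reproducible sequence for a 12-light street.
example_lights = light_sequence(12, random.Random(0))
print(example_lights)
```

Passing a seeded `random.Random` instance keeps the sequence reproducible, which fits the use of predefined street configurations across participants.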
Pedestrian Behavior. Simulated pedestrians walked towards the participants at a speed randomly assigned between 2.46 and 3.78 blocks per minute. They walked in straight lines, stopped to avoid "bumping" into participants, and respected traffic lights. Audio feedback of a pedestrian shouting "hey!" was also played if participants collided with them. In the Crowded Street condition, the street contained approximately 8 pedestrians per block (see companion video).
Steps and Position Tracking. Fiducial markers [10] were attached to the participants' ankles to track stomping motions, as well as the participant's lateral position in front of the display. Our tracking algorithm enabled us to compute the participants' simulated walking speed as a function of both their stomping pace and the vertical amplitude of their steps.

Vanishing Point Adaptation
The vanishing point of the scene was kept aligned with the participant's position in front of the display when they stepped sideways, as opposed to constantly fixed at the center of the display, to further support the realism and immersiveness of the simulation (see video figure).

Participants and Apparatus
Twelve right-handed participants were recruited from Singapore Management University's students and staff (7 female) aged 20 to 30 years old. Remuneration was the equivalent of 7.4 USD. All but one reported that they had used their phone while walking in the street at least once in the two days before the experiment.
The experimental software was run on an Epson Moverio BT-300 smart-glass, a Samsung S7 Edge smart-phone and two computers (one for the simulation, one for the devices). The simulation was run on a large TV monitor (75 inches diagonal, 1.65 × 0.93 meters) positioned 1 meter from the ground. Participants stood 1 meter from the display, and could move left and right in front of it.

Task
Participants were instructed to perform an ISO 9241-9 standard Fitts multidirectional pointing task as established by Soukoreff et al. [53] (see Fig. 5) using one of the four techniques (see Fig. 2). We chose a Fitts' Law task type for internal validity: controlling pointing distance and size simplifies comparison between techniques and with previous and future work. Except for the Phone condition, the display area on the eyewear appeared to be approximately 199.2 × 199.2 mm one meter away from the user (720 × 720 px), the radius of the targets layout (see Fig. 5) was 84.7 mm (306 px) and the radius of the targets was 10.5 mm (38 px). In the Phone condition, the total pointing area was 65.4 × 65.4 mm (1,376 × 1,376 px), the radius of the target layout was 27.9 mm (585 px) and the radius of the targets was 3.5 mm (73 px). In both cases, the ratio of the layout radius to the target radius remains constant (306/38 ≈ 585/73 ≈ 8), yielding the same Index of Difficulty [11]. The extra space around the targets discouraged the use of edge pointing.
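The claim that both layouts share one Index of Difficulty can be checked numerically. The sketch below assumes the Shannon formulation, ID = log2(D/W + 1), and approximates the movement distance D by the layout diameter and the target width W by the target diameter; with nine targets the true distance between successive targets is slightly shorter than the diameter, so the values are indicative only. The function name `index_of_difficulty` is hypothetical.

```python
import math


def index_of_difficulty(layout_radius, target_radius):
    """Shannon formulation of Fitts' Index of Difficulty, in bits.

    Assumption: distance D is approximated by the layout diameter and
    width W by the target diameter, so only their ratio matters.
    """
    distance = 2 * layout_radius
    width = 2 * target_radius
    return math.log2(distance / width + 1)


# Pixel sizes reported for the eyewear and phone conditions.
id_eyewear = index_of_difficulty(306, 38)
id_phone = index_of_difficulty(585, 73)

# Millimetres per eyewear pixel when the 1280 px wide display is
# treated as a 354 mm wide surface one meter away.
MM_PER_PX_EYEWEAR = 354 / 1280

print(round(id_eyewear, 2), round(id_phone, 2))
print(round(306 * MM_PER_PX_EYEWEAR, 1), round(38 * MM_PER_PX_EYEWEAR, 1))
```

Because the Index of Difficulty depends only on the ratio D/W, the two conditions differ by less than 0.01 bits despite their very different physical sizes.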
All three eyewear techniques (Front Rotation, Relaxed Rotation, and Trackpad) were indirect and relative. We defined transfer functions that map the participant's input to cursor movements using Nancel et al.'s sigmoid function [39], with v_t and G_t respectively the input speed and gain at time t, and λ a constant. We tuned the function's parameters separately for each technique (see Table 2).
Reporting generalizable (typically, physical) display units is complicated with smart glasses: the pixels can be perceived as if they were at any distance from the user's eyes. Furthermore, since the virtual display is displayed as a flat surface facing the user, rather than a spherical one, angular units cannot be used consistently. For simplicity and generalizability, we report distances and speeds on the display as if the display was projected one meter away from the user's eyes. The 1280 × 720 pixel map of the Moverio BT-300 corresponds to a 354 × 199 mm area one meter away, so one pixel is 0.277 mm wide.
In the No Simulator condition, participants executed the task while standing. In both street conditions, participants completed the task while navigating through the simulated street. Specifically, in the Empty Street condition, participants were only required to walk on the spot. They were instructed to strive to maintain a natural walking speed. In the Crowded Street condition, they were asked to also avoid pedestrians and respect traffic rules, as in the real world. If they failed to meet these rules, the simulation flashed red while the corresponding audio feedback was played (a man shouting "hey!" for pedestrians, and a car honk for the lights). We simplified this experiment by standardizing the walk to a straight path, under the hypothesis that it is reasonably similar to following a well-known path in terms of cognitive load. This method was designed to simulate the most common external constraints encountered by pedestrians while enabling measures of pace, awareness, and interaction performance, as realistically as possible.
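The sigmoid transfer function used for the three eyewear techniques is only described in words here. The following is a minimal sketch assuming a generic logistic form that interpolates between a minimum and a maximum gain as input speed increases; the parameter names `g_min`, `g_max` and `v_inf`, the helper `cursor_displacement`, and all numeric values are illustrative assumptions, not the tuned parameters of Table 2 and not necessarily the exact formula of [39].

```python
import math


def sigmoid_gain(speed, g_min, g_max, lam, v_inf):
    """Logistic gain G_t for an input speed v_t (assumed generic form).

    The gain stays near g_min for slow input, near g_max for fast input,
    and crosses the midpoint at speed v_inf; lam sets the slope.
    """
    return g_min + (g_max - g_min) / (1.0 + math.exp(-lam * (speed - v_inf)))


def cursor_displacement(input_delta, dt, **params):
    """Scale one input displacement by the gain for its current speed."""
    speed = abs(input_delta) / dt
    return input_delta * sigmoid_gain(speed, **params)


# Hypothetical parameters, for illustration only.
PARAMS = dict(g_min=1.0, g_max=5.0, lam=0.05, v_inf=100.0)

slow = cursor_displacement(0.1, 0.01, **PARAMS)   # 10 units/s input
fast = cursor_displacement(2.0, 0.01, **PARAMS)   # 200 units/s input
print(slow, fast)
```

A speed-dependent gain of this kind lets slow movements position the cursor precisely while fast movements cover the display quickly, which is why the parameters need tuning for each input technique.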
We finished the experiment with a short semi-structured interview during which participants provided subjective feedback. Before starting the experiment, as training, participants were introduced to both conditions of the street simulator. During this training, we also asked participants to find what they thought was their usual walking speed and we recorded it for later comparison. Before each technique and under each environment, participants also had the opportunity to practice the technique before starting.

Design
We used a 3 × 4 within-subjects design with the following factors and levels: environment {No Simulator, Crowded Street, Empty Street} and technique {Front Rotation, Relaxed Rotation, Trackpad, Phone}. To ensure consistency of the simulation across all participants, we generated four predefined crowded street configurations (including pedestrian position and speed, lights, etc.).
The experiment was divided into three parts, one for each environment condition. Each of these parts was divided into four blocks, each one dedicated to a technique. In accordance with ISO 9241-9 [53], each block started with the cursor centered and an initial unmeasured target selection, followed by four selections of each of the 9 targets in opposing order (see Fig. 5). Environment and technique orders were counterbalanced using a Latin Square. In the street conditions, the simulation ran uninterrupted during a block, and restarted afterwards. For all trials, we measured selection time, wrong selections (clicks outside of the target), "walking" speed, number of pedestrian collisions, and jaywalking. Participants were allowed to take breaks between blocks.
We found a significant main effect of Street Environment (F
Jaywalking and Collisions. We only considered the Crowded Street scenario when investigating jaywalking and pedestrian collisions. Note that because of the differences in selection time (Fig. 6-top), we computed the average number of jaywalkings and pedestrian collisions per street block (a street block is a unit of distance in the simulator). We did not find a significant effect of technique on jaywalking (p = .13, with 0.1 jaywalk/block on average), nor on collisions (p = .49, 0.659 collisions/block on average).
Figure 7 shows these results. At the end of the experiment, participants also ranked the techniques from most-preferred to least-preferred, specifically in a crowded environment. Five participants ranked Trackpad best, while five others ranked it second-best. Five participants ranked Phone first, one participant second-best, and another one third-best. One participant ranked Phone worst. Front Rotation was ranked second by three participants, while Relaxed Rotation was consistently ranked either third-best or least-preferred. Many participants felt that the Trackpad technique posed less danger on the street than the Phone, which required frequent look-ups.

Ecological Validity Experiment
As an extra validation step, we ran an ecological validity experiment to challenge our simulation-bound results in a real street situation. For safety reasons, it was not ethically acceptable to use external participants. Four of the authors took part in the experiment to test the four pointing techniques in the (actual) wild. We used the same techniques and protocol, with two environment conditions, Wild and Inside, and 5 repetitions of each pointing target instead of the 4 used in the controlled experiment. In the Wild condition, the authors walked along a busy underground concourse. We measured an average of 22.9 pedestrians/minute at this location and time, with a large variance. In the Inside condition, the authors performed the pointing tasks while standing still. We counterbalanced the order of the techniques using a Latin Square, and measured both selection times and selection errors. Due to the small population, we only report descriptive statistics, shown in Fig. 8. None of the four authors collided with a pedestrian. The results of this experiment show the same trend as our main experiment. Participants were generally faster and more accurate than in the main experiment, which can be explained by a higher expertise with the techniques and by a lower pedestrian density.

Limitations
Techniques Some participants reported difficulties with Relaxed Rotation due to the width of the phone: it made it difficult to press-and-hold or click. Relaxed Rotation may perform differently with a more ergonomically adapted in-air controller. Two participants also reported visual fatigue. Hopefully, this issue can be resolved on future eyewear displays.
Generalizability Our street-walking simulator allows us to put participants in controlled situations resembling crowded streets. This novel methodology makes it possible to gather preliminary insights on situation awareness without putting participants at risk. Concerns over generalizability to real-world situations are eased through our limited validity experiment, but further experimental validation is difficult due to risks of participant harm. When Schwebel et al. ran into a similar problem, they argued that at least three indicators can still be considered: immersion, interactiveness, and realism [49].
Though our setup is not as immersive as a virtual-reality "cave", we join Schwebel et al. in arguing that a large display area covering most of the participant's field of view provides sufficient immersion. At least four of our participants agreed and described our simulation as "immersive".
We think interactiveness is a strength of our simulator, as it included pace and position control (using feet tracking) instead of a less interactive controller such as a joystick, or no control at all, as with treadmills.
On the realism side, our participants' opinions were more divided: three reported the simulation as "unrealistic" and four stated that it was "realistic". Criticisms mostly concerned the in-place stepping mechanism, although two participants described the speed control as being "natural". This is a trade-off for the interactiveness required by our experiment. Once VR treadmills that allow pace control are commercialized, they may provide a better alternative.
Allowing participants to change direction, like walking around a corner, would allow more complex paths to be walked and add interesting factors. Similarly, half of our participants observed that real-world pedestrians typically give way when a collision is about to occur (rather than stop in front of the participant in our simulation). This behavior can easily be added, though we needed participants to actively avoid the pedestrians.
Less trivially, the stakes of colliding with a pedestrian or jaywalking remain limited compared to real life. Keeping participants safe is of course the main point of using a simulation, but recent approaches such as force-feedback [29] could be put to use to produce physical sensations at no cost to safety.

Discussion
Participants were able to point faster and with fewer errors using the Phone technique. We therefore find support for H1. Contrary to our expectations, the better performance of Phone did not come at the cost of slower walking speed or worse situation awareness. Therefore, we reject H2 and H4. This can be explained by a strong discrepancy in participants' prior experience between the control and candidate techniques.
Trackpad emerged as the best pointing technique for eyewear in terms of speed and error rate, in both the No Simulator and our street simulations. Therefore, we find support for H3. Trackpad was also perceived as more socially acceptable, easier to use and more enjoyable than the in-air techniques. These results could be influenced by the smaller movements or lower cognitive burden associated with highly familiar touch interaction.
The in-air techniques, Front Rotation and Relaxed Rotation, performed worse in every condition. Performance with these techniques was also more adversely affected by street crowding than the other techniques (selection times increased dramatically in the Street conditions). Despite the higher movement amplitude and the need to keep the forearm up, Front Rotation was found easier to use than Relaxed Rotation. This might be due to the additional joint involved (wrist + elbow vs. wrist only) [13]. We observe significant differences in selection time for all environments, and increased errors in the Crowded Street compared to the No Simulator condition. The participants were also able to walk faster in Empty Street than in Crowded Street.
Interestingly, Trackpad and Phone performance were not significantly affected by the simulated Empty Street condition and were close to those in the No Simulator condition. The two techniques did not significantly differ in terms of situation awareness with our simulator. The two in-air techniques suffered substantially more from the simulator.
Though these results could only be safely obtained using a simulation, we believe they provide valuable insights sufficient to reliably recommend the use of the Trackpad for eyewear pointing by pedestrians (provided that it is implemented with an efficient transfer function 8 ).
Contrary to previous assumptions [31,38,30,27], and to our surprise, the use of eyewear did not improve situation awareness in our simulator; indeed, with the harder-to-use input techniques, situation awareness with eyewear was worse than with regular phone interaction. User feedback was divided: five participants reported that it was easier to deal with divided attention using the smartglasses, while four others stated the opposite. One obvious caveat, however, is that people are highly familiar with current touchscreen interaction, and our participants' performance in the eyewear conditions might improve with familiarization. In terms of pure input performance, as expected, Phone was superior.

Conclusion
This work contributed the first empirical study of eyewear pointing while mobile, taking into account environmental awareness and perceived social acceptability. In our street simulations or in a quiet building, participants were faster using a hand-held trackpad than with every investigated variant of in-air techniques. The trackpad was also perceived as the most socially acceptable technique.
Research on eyewear for pedestrians is still in its infancy. Our results indicate muted benefits regarding situation awareness. However, we remain confident that eyewear might offer situation awareness advantages, in particular for more passive tasks such as reading, and we plan to explore this as future work. We would also like to explore body-based techniques, like Finger- and Pocket Touch, as they were deemed highly acceptable. There are also many other promising paths to investigate, as outlined in our related work.
We believe that our simulation-based method is promising, particularly given the ethical concerns associated with evaluation in-the-wild. While the generalizability of our simulation cannot be fully assessed, we argue it provides valuable insights on situation awareness. However, the generalizability of other aspects, such as path finding, might be easier to investigate in future work. Simulations will never be as externally valid as in-the-wild studies, but they require fewer resources and allow much greater control. Compared to other forms of lab studies, we argue they can provide worthwhile improvements in external validity.

Fig. 1. Illustrating the simulator: using a phone to point on eyewear while walking.

Fig. 5. Pointing task interface used during the experiment (black is transparent on the glasses).Each time a participant validates a target (currently the rightmost disk on the figure), a new one is highlighted until completion of the task.The superimposed arrow indicates the path to alternating targets following ISO 9241-9 standard procedure [53].
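The alternating target path described in this caption can be sketched as follows. This is a minimal illustration of a common way to sequence an ISO 9241-9 style multi-directional tapping task with an odd number of circularly arranged targets. The starting target, the indexing direction, the unit radius, and the function names are assumptions, not details taken from the paper.

```python
import math


def target_order(n_targets=9):
    """Visiting order in which every move crosses the circle to a roughly opposite target.

    Assumes an odd number of targets indexed 0..n-1 around the circle.
    """
    if n_targets % 2 == 0:
        raise ValueError("this sketch assumes an odd number of targets")
    step = (n_targets + 1) // 2  # jump just past the diametrically opposite point
    return [(k * step) % n_targets for k in range(n_targets)]


def target_positions(n_targets=9, radius=1.0):
    """(x, y) centres of the targets, evenly spaced on a circle (assumed layout)."""
    return [
        (
            radius * math.cos(2 * math.pi * i / n_targets),
            radius * math.sin(2 * math.pi * i / n_targets),
        )
        for i in range(n_targets)
    ]


if __name__ == "__main__":
    order = target_order(9)
    print(order)  # -> [0, 5, 1, 6, 2, 7, 3, 8, 4]
    pos = target_positions(9)
    # Movement amplitude is the same for every consecutive pair of targets.
    for a, b in zip(order, order[1:]):
        print(a, "->", b, round(math.dist(pos[a], pos[b]), 4))
```

Because the step between consecutive targets is constant, every movement has the same amplitude, which keeps the difficulty of each selection within a block uniform.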