"...many people think about input only at the device level, as a means of obtaining improved time-motion efficiency through use of the sensorimotor system... this is a big mistake. Effectively structuring the pragmatics of input can also have a significant impact on the cognitive level of the interface."
Bill Buxton [29]
Chapter 8
The Bimanual Frame-of-Reference
8.1 Overview
This experiment investigates the synergy of the two hands for virtual object manipulation. The results suggest that the two hands together provide sufficient perceptual cues to form a frame of reference which is independent of visual feedback. The same is not true for one hand moving in empty space. My interpretation is that users may not have to constantly maintain visual attention when both hands can be involved in a manipulation.
The data suggest a transfer of skill from this experiment's bimanual condition to the unimanual condition. Subjects who performed the bimanual condition first seemed to learn a more effective task strategy for the unimanual condition. This further suggests that involving both hands in the physical articulation of a task can influence cognitive aspects of performance, in terms of the task strategy used.
8.2 Introduction
The central hypothesis of this study is that the combined action of the two hands can play a vital role for the interactive manipulation of virtual objects. Certainly, there is much informal evidence to support this position. Most everyday manipulative tasks involve both hands: for example, striking a match; unscrewing a jar; sweeping with a broom; writing on a piece of paper; dealing cards; threading a needle; or painting on a canvas, where the preferred hand holds the paintbrush, the nonpreferred hand the palette.
A few virtual reality applications and demonstrations, both on the desktop [141][149][80] and in fully immersive situations [121][119][164], have recognized the design possibilities for the two hands, and for years Buxton [27][95] has argued that one can improve both the naturalness and degree of manipulation of interfaces by employing both hands. Yet, beyond a possibly improved efficiency of hand motion, there has been little formal evidence of precisely what advantages, if any, the two hands can bring to virtual manipulation.
Moreover, just because a behavioral strategy is exhibited in the real world, this does not necessarily mean that it will be useful in a virtual environment. Virtual environments offer opportunities to violate the limitations of physical reality, and one only needs to mimic those qualities of physical reality which facilitate skill transfer or which form essential perceptual cues for the human participant to perform his tasks.
To establish the utility of the two hands in virtual environments, an experiment needs to formally demonstrate what users can do with two hands that they can't easily do with one, and address some questions of when, and why, a bimanual interface might offer some advantages. I do not claim to answer all of these questions, but the current study offers some data which suggest areas where involving both hands may have some advantages.
The experiment suggests that the two hands together form a hand-relative-to-hand frame of reference. A frame of reference is a centered and oriented perceptual coordinate system which is specified by a center point plus three directional axes. An interesting property of the bimanual frame of reference is that the information can be encoded by the hands themselves, and as such does not necessarily rely on visual feedback. As an intuitive example, it is easy to touch your index fingers behind your head, but this action is clearly not guided by vision.
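The definition above can be made concrete with a minimal sketch. It assumes positions in millimeters and orthonormal axes; the `Frame` class and all coordinates are hypothetical illustrations, not part of the experimental software. It shows that the preferred hand's position, expressed in the frame of the nonpreferred hand, is unchanged when both hands move together through the environment.

```python
from dataclasses import dataclass

Vec = tuple[float, float, float]


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


@dataclass(frozen=True)
class Frame:
    """A frame of reference: a center point plus three directional axes.

    The axes are assumed to be unit length and mutually perpendicular.
    """
    origin: Vec
    x_axis: Vec
    y_axis: Vec
    z_axis: Vec

    def to_local(self, point: Vec) -> Vec:
        """Express a world-space point in this frame's coordinates."""
        d = sub(point, self.origin)
        return (dot(d, self.x_axis), dot(d, self.y_axis), dot(d, self.z_axis))


# Hypothetical pose 1: nonpreferred hand at the world origin, axes unrotated.
head_a = Frame((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
plate_a = (30.0, 0.0, 10.0)

# Hypothetical pose 2: both hands rotated 90 degrees about the z axis and
# translated by (100, 50, 0) as a single rigid unit.
head_b = Frame((100.0, 50.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
plate_b = (100.0, 80.0, 10.0)

relative_a = head_a.to_local(plate_a)
relative_b = head_b.to_local(plate_b)
print(relative_a, relative_b)
```

In both poses the hand-relative-to-hand coordinates are the same, even though the environment-relative coordinates of each hand differ.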
8.3 Hypotheses
This study investigates the following specific hypotheses:
H1. The two hands together provide sufficient perceptual cues to form a frame of reference which is independent of visual feedback.
H2. When using just one hand, subjects can employ other body-relative cues (such as sense of joint angles or sense of torso midline) to make an unbiased estimate of a remembered hand position, but these cues are less precise. Thus, unimanual control is more dependent on visual feedback.
H3. The physical articulation of a task can influence cognitive aspects of performance, in terms of the task strategy used. Using two hands together encourages exploration of the task solution space, and this will allow subjects to get a better sense of what a good strategy is for the experimental task.
8.4 Cognitive aspects of performance
Hypothesis 3, which asserts that using two hands can influence cognitive aspects of performance, has previously been articulated by Buxton. Working with his Input Research Group, Leganchuk [108] has provided some preliminary evidence which suggests that "representation of the task in the bimanual case reduces cognitive load."
Leganchuk's experiment (fig. 2.18 on page 38) studied an "area sweeping" task in which subjects selected an area encompassing a target. This is similar to sweeping out a rectangle to select a set of targets in a graphics editing application. Using both hands allowed subjects to complete the task significantly faster than using just one hand. Furthermore, the difference in times could not be attributed to the increased time-motion efficiency alone. This was interpreted as evidence that the bimanual technique "reduces cognitive load."
Another way to investigate the hypothesis that bimanual control can influence cognitive aspects of performance is to take direct measures of cognition, such as quantifiable metrics of learning, memory, or transfer of skill. Leganchuk's strategy of taking differences between one and two-handed techniques relies on the assumption that differences beyond those clearly accounted for by increased time-motion efficiency can be attributed to differences in cognitive load. But if one can demonstrate a direct metric of cognition, this assumption does not have to be introduced.
8.5 The Experiment
8.5.1 Subjects
Seventeen unpaid subjects (13 female, 4 male) were recruited from the University of Virginia psychology department's subject pool. One subject (male) was left-handed. No subjects had experience with 3D input devices or two-handed computer interfaces.
8.5.2 Task
The task consisted of two phases. In the primary phase, users attempted to align two virtual objects. The purpose of this phase was to engage the user in an initial task which would require moving and placing the hand(s) in the environment. The second phase consisted of a "memory test" where users tried to reproduce the placement of their dominant hand without any visual feedback.
I used the neurosurgical interface props as input devices. In the current experiment, the doll's head controls the orientation and depth of a target object. The target object is an extruded triangular shape (fig. 8.1, left). The plate tool controls the position and orientation of a blue semi-transparent rectangular plane on the screen.
For the primary task, users were instructed to align and intersect the triangle and the plane so that they were coplanar (fig. 8.1, right). The triangle would highlight in yellow when the plane was aligned with it. The plane was considered to be aligned with the triangle if the plane was within 13 millimeters of all three corners of the triangle. Each edge of the triangle was 50 millimeters long. The triangle appeared at a new initial random orientation for each trial. The "stage" seen in the background of figure 8.1 served only as a simple perceptual aid and never moved.
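The alignment criterion can be sketched as a point-to-plane distance check. This is an illustration under stated assumptions rather than the code used in the experiment: the plane is treated as infinite (the on-screen plane was a finite rectangle), "within 13 millimeters" is read as less than or equal to, and the function names and coordinates are hypothetical.

```python
import math

ALIGN_TOLERANCE_MM = 13.0  # tolerance stated in the text
TRIANGLE_EDGE_MM = 50.0    # triangle edge length stated in the text


def point_plane_distance(point, plane_point, plane_normal):
    """Perpendicular distance from a point to an (assumed infinite) plane."""
    normal_length = math.sqrt(sum(c * c for c in plane_normal))
    offset = [p - q for p, q in zip(point, plane_point)]
    return abs(sum(o * n for o, n in zip(offset, plane_normal))) / normal_length


def is_aligned(corners, plane_point, plane_normal, tolerance=ALIGN_TOLERANCE_MM):
    """True if the plane passes within the tolerance of every triangle corner."""
    return all(
        point_plane_distance(c, plane_point, plane_normal) <= tolerance
        for c in corners
    )


def tilted_normal(degrees):
    """Hypothetical helper: normal of a plane tilted about the x axis."""
    a = math.radians(degrees)
    return (0.0, -math.sin(a), math.cos(a))


# Equilateral triangle with 50 mm edges lying in the z = 0 plane.
height = TRIANGLE_EDGE_MM * math.sqrt(3) / 2
corners = [(0.0, 0.0, 0.0), (TRIANGLE_EDGE_MM, 0.0, 0.0), (TRIANGLE_EDGE_MM / 2, height, 0.0)]

print(is_aligned(corners, (0.0, 0.0, 0.0), tilted_normal(0)))   # True: coplanar
print(is_aligned(corners, (0.0, 0.0, 0.0), tilted_normal(10)))  # True: apex about 7.5 mm off
print(is_aligned(corners, (0.0, 0.0, 0.0), tilted_normal(20)))  # False: apex about 14.8 mm off
```

With a plane hinged along one edge of the triangle, as in this example, the 13 millimeter tolerance admits a tilt of roughly 17 degrees before the far corner falls out of range.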

Figure 8.1 Stimuli for the primary task.
A footpedal was used as a clutch for the plate tool. When subjects held the pedal down, the plane could move freely relative to the target object. When the pedal was released, the plane would stay embedded in the target object. If the two were aligned when the pedal was released, this ended the primary task.

Figure 8.2 The memory test.
At the end of the primary task, the computer recorded the position and orientation of the preferred hand (which was always holding the plate tool). A dialog then appeared telling the subject to "Get Ready for Memory Test!" (fig. 8.2, a). For the memory test, subjects were instructed to put their preferred hand down on a mouse pad at the side of the work space, close their eyes (fig. 8.2, b), and then to attempt to exactly reproduce the position and orientation of the plate tool without any visual feedback (fig. 8.2, c). At the end of each trial, the computer displayed the subject's best accuracy so far.
8.5.3 Experimental conditions
The experiment compared two conditions, a bimanual condition and a unimanual condition. In the bimanual condition, simultaneous motion of both input devices was possible. The doll's head was held in the nonpreferred hand, the plate in the preferred hand. Since the distance between the two objects does not affect the alignment required for the primary task, it is always possible for the doll's head and the plate tool to make physical contact when completing the task. Subjects were instructed to use this technique in the bimanual condition, since the purpose of the experiment was to test how well subjects could use the nonpreferred hand as a reference. The bimanual condition is shown in figure 8.2.
In the unimanual condition, subjects were instructed to always keep their nonpreferred hand in their lap. Subjects were only allowed to grasp one device at a time, using only their preferred hand. There was a definite device acquisition time required when switching input devices, but subjects were instructed that time to complete the task was not important -- only their accuracy on the memory test mattered. For the memory test, the unimanual condition was identical to the bimanual condition, except that the nonpreferred hand was no longer available as a reference.
Clearly, both conditions utilized a space-multiplexed design, with a separate input device for each function, as opposed to a time-multiplexed design, where a single device controls multiple functions by changing modes. Brooks [19] reports that overloading a device with multiple functions can often cause confusion. Thus, I chose a space-multiplexed design for the unimanual condition because I did not want the possible issue of users becoming confused over which "mode" they were in to interfere with the experiment itself.
For the unimanual condition only, a second clutch footpedal was needed to allow subjects to rotate the doll's head and leave it "parked" at a particular orientation, thus allowing them to put down the doll's head and pick up the plate tool. Users had no difficulty in using the two pedals: there were no experimental trials where a user clicked the wrong pedal in the unimanual condition.
Originally, I had planned to use two footpedals in the bimanual condition as well, but pilot studies suggested this was problematic. If the footpedal is used to "park" the target object in the bimanual condition, the user is again moving relative to the environment, not relative to the reference frame specified by the nonpreferred hand. Once pilot subjects developed some experience with the task, they would essentially always hold down the second footpedal to maintain the doll's head as a reference. Thus in the bimanual case the second footpedal seemed to introduce confusion without adding any new or helpful capabilities.
8.5.4 Experimental procedure and design
A within-subjects Latin square design was used. Eight subjects (six female, two male) performed the unimanual condition first and nine subjects (seven female, two male) performed the bimanual condition first. Subjects performed 12 experimental trials for each condition.
During a practice session subjects were introduced to the equipment (footnote 1) and allowed to become familiar with it. I gradually introduced each element of the experimental procedure and made sure that subjects could perform the task before moving on. Practice sessions lasted 10-20 minutes (prior to the first experimental condition) and 5-10 minutes (prior to the second experimental condition). There was not a fixed number of trials or set time limit for practice, but rather each subject practiced until he or she felt completely comfortable with the equipment and experimental procedure.
8.6 Results
Accuracy on the memory test was the only dependent measure. Accuracy was measured in terms of angle (shortest-arc rotation to align the remembered reference frame with the ideal reference frame) and distance (the translational offset between the reference frames). Distance was also logged as single-axis offsets in the reference frame of the plate tool (offsets along the left-right axis, front-back axis, and up-down axis). The left-right axis is the primary axis along which the hands made contact.
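One way these measures could be computed is sketched below. The representation is an assumption: orientations are taken to be unit quaternions in (w, x, y, z) order and positions to be in millimeters. The function names and the mapping of the plate tool's axes to world directions are hypothetical.

```python
import math


def shortest_arc_angle_deg(q_ideal, q_remembered):
    """Angle of the smallest rotation taking one orientation to the other.

    Both arguments are unit quaternions (w, x, y, z). The absolute value
    handles the fact that q and -q describe the same orientation.
    """
    d = abs(sum(a * b for a, b in zip(q_ideal, q_remembered)))
    d = min(1.0, d)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(d))


def distance_mm(p_ideal, p_remembered):
    """Translational offset between the two reference frames."""
    return math.dist(p_ideal, p_remembered)


def axis_offsets_mm(p_ideal, p_remembered, tool_axes):
    """Signed offsets along each axis of the plate tool's frame.

    tool_axes maps an axis name to a unit vector in world coordinates.
    """
    d = [r - i for i, r in zip(p_ideal, p_remembered)]
    return {name: sum(x * y for x, y in zip(d, axis)) for name, axis in tool_axes.items()}


# Hypothetical trial: remembered pose is rotated 90 degrees about z and
# displaced by (3, 4, 0) mm from the ideal pose.
q_ideal = (1.0, 0.0, 0.0, 0.0)
q_remembered = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
tool_axes = {
    "left-right": (1.0, 0.0, 0.0),
    "front-back": (0.0, 1.0, 0.0),
    "up-down": (0.0, 0.0, 1.0),
}

angle = shortest_arc_angle_deg(q_ideal, q_remembered)
dist = distance_mm((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
offsets = axis_offsets_mm((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), tool_axes)
print(angle, dist, offsets)
```

Keeping the per-axis offsets signed, rather than taking absolute values, is what allows a later test for systematic bias along each axis.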
Figure 8.3 Overall means obtained in each experimental condition.
Figure 8.3 reports the overall means obtained in each experimental condition. An analysis of variance with repeated measures was performed on the within-subjects factor of Condition (unimanual vs. bimanual). Condition was not a significant factor for angle but was highly significant for measures of distance (fig. 8.4).
Figure 8.4 Significance levels for main effects.
This evidence strongly supports hypothesis H1, suggesting that subjects were able to utilize the perceptual cues provided by the nonpreferred hand to reproduce their six degree-of-freedom posture independent of visual feedback. Subjects were significantly more accurate with both hands than with just one, supporting H2.
The analysis also revealed a significant Condition × Order interaction for Distance (F(1,15) = 9.09, p < .01). It is often assumed that alternating the order of two conditions across subjects automatically controls for order effects caused by transfer of skill. But this is not true if there is a one-way (or asymmetric) transfer of skill between the two conditions. A Condition × Order interaction is the statistical evidence for such an asymmetrical transfer effect [139].
The means grouped by order (fig. 8.5) show this effect. When performing unimanual first, subjects' distance was 5% better on the subsequent bimanual condition than those subjects who completed bimanual first. But when performing bimanual first, subjects performed 28% better on the subsequent unimanual condition than those subjects who completed unimanual first. My interpretation is that subjects learned a more effective task strategy in the bimanual condition, and were able to transfer some of this skill to the unimanual condition.
My qualitative observations also support this position. When performing the unimanual condition first, subjects had a tendency to avoid using the doll's head: only 2 out of 8 of these subjects consistently reoriented the target object with the doll's head. Subjects would instead adapt the plate tool to the initial (randomly generated) orientation of the target object. But for 8 out of the 9 subjects who tried the bimanual condition first, during the unimanual condition they would re-orient the doll's head on essentially every trial. As one subject explained, during the bimanual condition she had learned that "instead of accepting what it gave me, I did better when I moved [the doll's head]."
All of this evidence supports H3, suggesting that bimanual control can affect performance at the cognitive level by influencing a subject's task-solving strategy. To definitively demonstrate that there is a bimanual to unimanual transfer effect, future experimental work should compare the results obtained in this experiment to data from subjects who perform two blocks of the unimanual condition or two blocks of the bimanual condition.
Figure 8.5 Means grouped by order of experimental conditions.
Finally, an analysis of signed distance errors supported H2: with just one hand, subjects could make unbiased estimates of a remembered hand position. The estimate is said to be "unbiased" because in the unimanual condition the means of the signed errors along each axis did not significantly differ from zero, although means along the up-down and front-back axes did approach significance (fig. 8.6).
The analysis shows a highly significant bias for the bimanual condition along the left-right axis (fig. 8.6), and a bias nearing significance along the up-down axis. Although significant, the left-right bias effect is caused by an overall bias of less than 2 millimeters (fig. 8.7). This small bias effect is difficult to interpret, but for practical purposes, the small magnitude of the effect means there is essentially no bias in the bimanual condition either.
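A test for bias of this kind asks whether the mean signed error along an axis differs from zero. The text does not state which test was used; the sketch below assumes a one-sample t statistic, and the error values are invented purely for illustration (they are not the experimental data).

```python
import math
import statistics


def one_sample_t(signed_errors):
    """t statistic and degrees of freedom for the null hypothesis mean == 0."""
    n = len(signed_errors)
    mean = statistics.fmean(signed_errors)
    sd = statistics.stdev(signed_errors)  # sample standard deviation
    return mean / (sd / math.sqrt(n)), n - 1


# Invented signed errors (mm) along one axis, for illustration only.
symmetric_errors = [1.0, -1.0, 2.0, -2.0]  # scatter around zero: no bias
shifted_errors = [1.0, 2.0, 3.0]           # consistently positive: biased

t_symmetric, df_symmetric = one_sample_t(symmetric_errors)
t_shifted, df_shifted = one_sample_t(shifted_errors)
print(t_symmetric, df_symmetric)
print(t_shifted, df_shifted)
```

The statistic would then be compared against a t distribution with the given degrees of freedom. As the bimanual left-right result shows, a bias can be statistically significant and still too small in magnitude to matter in practice.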
Figure 8.6 Tests for bias in remembered hand position.
Figure 8.7 Means of signed distance errors.
8.7 Qualitative results
The unimanual condition seemed to impose two chief difficulties for users. First, when orienting the target object with the doll's head, the subject had to think ahead to the next step to anticipate which orientations of the target would be easiest for the action of the plate tool. Second, since the unimanual condition requires movement relative to the environment, the user had to remember approximately where he or she had "parked" the doll's head.
The bimanual condition avoids both of these difficulties. When using both hands, it is much easier to see what orientations of the target will be easy to select with the plane. As one subject commented, "it was easier to get them to come together, and faster too." Another noted that "two hands have much more flexibility for how you solve the problem." And since the doll's head is always part of the interaction, subjects had no need to remember where they had "parked" it.
The bimanual condition does introduce the possibility of the two objects colliding. For example, if the plate is directly behind the doll's head and the subject needs to move it forward, he or she cannot do this directly since the doll's head is in the way. But the virtual plane is bigger than the physical plate, so one can solve this problem by (for example) holding the plate immediately to the right of the doll's head. On the screen, one sees the blue plane intersecting the target object even though the two input devices don't physically intersect. Two subjects initially found this to be confusing, but quickly adapted after they realized that one does not have to physically intersect the two objects (footnote 2).
I asked subjects about their task strategies and the cues they had used to perform the memory test. Subjects generally tried to hold as many of the variables constant as possible, and then memorized the rest. For example, subjects often kept the elbow and wrist angles fixed and would try to maintain an invariant hand posture with respect to the input device. Remaining variables such as the height and depth of the hand placement were then estimated from memory.
In the bimanual condition, subjects seemed to have an innate sense of where they had touched (either on the doll's head or on their hand)-- as one subject explained, "the touch knew the position"-- and many subjects thought of the angle of the plane as a separate thing to memorize. Subjects certainly also made use of the physical landmarks on the doll's head (such as the ears or the features of the face). Without these landmarks, subjects probably would have been less accurate, but many subjects seemed to zero in on the exact spot even before physical contact was made. As one subject commented, "even before you touch the spot you know it."
In the unimanual condition, the edge and surface of the desk served as a physical reference and most subjects tried to use this to their advantage, for example by resting their forearm against the desk and remembering the touch point. Subjects would often attempt to estimate the left-right placement of their hand using body-relative cues such as the torso midline or the positions of their legs. One subject commented that "because there was nothing to land on, you sort of lose your sense of balance." Another described her hand as "just floating in space, but using both hands gave you something else to reference."
8.8 Discussion
The experimental results have clear design implications for the role of visual feedback and attention in human-computer interaction. Users maintain a fairly precise, body-relative representation of space which does not depend on visual feedback. A relatively inaccurate environment-relative representation of the space is also maintained. My interpretation is that two hands and split attention go well together, opening up new possibilities for eyes-free interaction.
When using two hands, the user's attention does not necessarily have to constantly monitor the manipulation itself, and attention can be directed towards a secondary task, such as watching an animation or a representation of the manipulation from a second viewpoint. Unimanual control is more dependent on visual feedback and can therefore impede the user's ability to split attention between multiple tasks.
The Worlds-in-Miniature (WIM) interface metaphor [164] (fig. 2.12 on page 28) provides an example of these issues in action. The WIM provides the virtual reality user with a hand-held miniature representation of the immersive life-size world. Users manipulate the WIM with both hands, using the nonpreferred hand to orient the clipboard and the preferred hand to manipulate objects in the WIM.
By using both hands, users can quickly develop a sense of the space represented by the WIM. Stoakley [164] reports that some users have manipulated objects in the WIM without even looking at it (that is, users have manipulated objects while holding the clipboard below the field-of-view seen in the immersive display). This is convenient for tasks such as hanging a picture, where it is useful to manipulate objects in the WIM while attending to the 1:1 scale view to check the picture's alignment and to evaluate how it fits the architectural space.
This is perhaps the strongest qualitative evidence that interaction techniques based on hand-relative-to-hand manipulation can allow users to focus attention on their tasks without necessarily becoming distracted by the interface technology.
8.9 Conclusion and future work
This study has provided some initial evidence which helps to support the high-level hypothesis that using both hands can help users gain a better sense of the space they are working in [66]. Immersive VR systems which use just one hand often do not offer users any physical reference points. Making use of both hands provides a simple way to increase the degree of manipulation and to let the user's own hand act as a physical reference. Another technique is to introduce a grounding object such as a drafting table; but even in this situation, using two hands plus the grounding object allows interesting design possibilities [1].
A second high-level hypothesis is that in some cases using both hands can change the way users think about a task. This experiment also provided some initial evidence in favor of this hypothesis, suggesting that it may be easier to explore alternative strategies for problem solving when both hands can be used. This was reinforced by the qualitative observation that subjects were more likely to "take what they were given" in the unimanual condition because they had difficulty anticipating which orientation of the target object would be easy to select with the plane.
The input devices used in this study were rich in tactile orientation cues and this probably helped subjects to perform the experimental task more precisely. If the experiment had used featureless spheres as input devices, for example, subjects probably would have had a less acute sense of the orientation of each device [85]. I also believe that allowing contact between the two hands was a factor in the experiment, but not the only factor. When using two hands, subjects could often come quite close to the original position even before contact was established. Further study is necessary to determine if this differs significantly from moving a single hand relative to the environment.
As a thought experiment, one can imagine using a single hand to move the plate tool relative to a doll's head mounted on a pivot (or a similar mechanical assemblage, such as a set of gimbal rings). This would be analogous to using one hand on a tablet fitted with a physical template, which works well [30]. But the current experimental data suggest that the dynamic role of the nonpreferred hand also led to a cognitive performance benefit in terms of task strategy chosen. The task syntax supported by moving one hand relative to a reference object on a pivot is quite similar to that required by this experiment's unimanual condition. As such I speculate that using the pivot with just one hand would have some of the same limitations: users might have difficulty anticipating what orientation of the pivot object would be most facile for the action of the plate tool.
When using two hands, a mechanical pivot (as opposed to the current free-moving doll's head) might have advantages for some tasks. A jeweler's or a technician's workbench provides an example from the real world: a clamp or armature is often used to hold an object being worked on, so that both hands can perform detailed work on the object being held. An interesting issue for future research is to see if a mechanical apparatus is really necessary, or whether a virtual solution using some combination of constraint modes and clutching operations would be viable. I speculate that a combination of the two approaches could provide a powerful yet highly usable interface metaphor.
Footnote 1. All experimental data were collected using a Polhemus FASTRAK [137] six degree-of-freedom magnetic tracking system.
Footnote 2. I have occasionally seen this same problem in the context of the neurosurgery application.
Copyright © 1996, Ken Hinckley. All rights reserved.