In this paper, Long et al. undertake an ambitious mission, acting as explorers and cartographers to map the human perceptual space of gesture similarity onto a computational model.
Citing how gestures' iconic nature lends them to memorability, Long et al. begin the paper by emphasizing the benefits and widespread use of gestures for UI control. Unfortunately, gestures are also difficult to design, for three main reasons:
- A gesture may be difficult to recognize computationally
- A gesture may appear too similar to another gesture in the eyes of users, making it difficult to remember
- A gesture may be difficult to learn or remember
Exploring past work on human perceptual similarity, Long et al. note that
- The logarithm of a quantitative metric correlates with perceived similarity
- If the range of differences between gestures is small, the differences are linearly related to perceived similarity
- The same person may use different similarity metrics for different gestures
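The logarithmic relationship above can be illustrated with a toy example (the numbers here are my own, not from the paper): under a log relationship, two pairs of gestures whose metric values differ by the same *ratio* should feel equally dissimilar, regardless of absolute magnitude.

```python
import math

# Hypothetical stroke lengths in arbitrary units (illustrative data only).
pairs = [(10, 20), (100, 200), (50, 75)]

for a, b in pairs:
    # Perceived dissimilarity is modeled as proportional to the
    # difference of logs, i.e. the log of the ratio b/a.
    print(f"{a} vs {b}: log-ratio = {abs(math.log(b) - math.log(a)):.3f}")
```

Here (10, 20) and (100, 200) yield the same log-ratio (log 2 ≈ 0.693), even though the absolute differences are 10 and 100.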
Long et al. ran two experiments.
In the first, a diverse gesture set was created whose gestures varied widely along multiple features and orientations.
In the second, the gesture set was divided into several categories, each featuring similar gestures that varied along a particular feature.
For both experiments, participants were given display tablets with pens, were shown all possible combinations of three gestures at a time (called triads), and were asked to choose the gesture most dissimilar from the other two. The results were then analyzed
- to determine which measurable geometric properties of the gestures influenced their perceptual similarity
  - Measured by MDS (Multi-Dimensional Scaling) with dimensions 2 through 6; the best dimensionality was determined by stress and goodness-of-fit (r^2)
  - The distances between points were the dissimilarities reported by the participants
  - A large spread of points along a dimension means that the corresponding geometric property is a strong determinant of similarity/dissimilarity
- to produce a model of gesture similarity that, given two gestures, could predict how similar people would perceive those gestures to be
  - Measured by running regression analyses to determine which geometric features correlated with reported similarity/dissimilarity
  - Weights correspond to the strength of each feature's contribution to similarity
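As a sketch of how the MDS step works: given a symmetric dissimilarity matrix (the 4-gesture matrix below is made up for illustration, not the paper's data), classical MDS recovers point coordinates whose pairwise distances approximate the reported dissimilarities.

```python
import numpy as np

# Made-up symmetric dissimilarity matrix for 4 gestures (zeros on the diagonal).
D = np.array([
    [0.0, 1.0, 2.0, 2.2],
    [1.0, 0.0, 1.8, 2.0],
    [2.0, 1.8, 0.0, 0.5],
    [2.2, 2.0, 0.5, 0.0],
])

n = D.shape[0]
# Double-centering: B = -1/2 * J D^2 J, where J is the centering matrix.
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J

# Eigendecomposition; keep the top k positive eigenvalues as dimensions
# (the paper sweeps k from 2 to 6 and picks by stress/r^2; here k = 2).
eigvals, eigvecs = np.linalg.eigh(B)
order = np.argsort(eigvals)[::-1]
k = 2
coords = eigvecs[:, order[:k]] * np.sqrt(np.maximum(eigvals[order[:k]], 0))

# Pairwise distances of the recovered points approximate D.
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
print(np.round(recovered, 2))
```

Gestures 2 and 3, reported as most similar (0.5), land closest together in the recovered space, which is what makes the spread along each MDS dimension interpretable as a perceptual property.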
In addition to Rubine's features, Long et al. also use the following features as candidates for similarity:
- Long Feature 14: Aspect = abs( 45° - angle of the bounding box (Rubine Feature 4) )
- Long Feature 15: Curviness
- Long Feature 16: Total angle traversed / total length = Rubine Feature 9 / Rubine Feature 8
- Long Feature 17: Density metric 1 = total length / distance between first and last points = Rubine Feature 8 / Rubine Feature 5
- Long Feature 18: Density metric 2 = total length / length of bounding box diagonal = Rubine Feature 8 / Rubine Feature 3
- Long Feature 19: Non-subjective openness = distance between first and last points / length of bounding box diagonal = Rubine Feature 5 / Rubine Feature 3
- Long Feature 20: Area of bounding box = (MaxX - MinX) * (MaxY - MinY)
- Long Feature 21: Log(area) = Log( Long Feature 20 )
- Long Feature 22: Total angle / total absolute angle = Rubine Feature 9 / Rubine Feature 10
- Long Feature 23: Log(total length) = Log( Rubine Feature 8 )
- Long Feature 24: Log(aspect) = Log( Long Feature 14 )
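A minimal sketch of computing several of these features from a stroke given as a list of (x, y) points. The example stroke and the helper name are my own illustration, not code from the paper; curviness (Long Feature 15) and the angle-sum Rubine features are omitted for brevity.

```python
import math

def long_features(points):
    """Compute a few of Long's similarity features for a pen stroke."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    w = max(xs) - min(xs)          # bounding-box width
    h = max(ys) - min(ys)          # bounding-box height
    diag = math.hypot(w, h)        # Rubine Feature 3: bbox diagonal length
    bbox_angle = math.atan2(h, w)  # Rubine Feature 4: angle of bbox diagonal
    # Rubine Feature 5: distance between first and last points.
    endpoint_dist = math.hypot(xs[-1] - xs[0], ys[-1] - ys[0])
    # Rubine Feature 8: total stroke length.
    total_len = sum(math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i])
                    for i in range(len(points) - 1))
    feats = {
        "aspect":   abs(math.radians(45) - bbox_angle),  # Long Feature 14
        "density1": total_len / endpoint_dist,           # Long Feature 17
        "density2": total_len / diag,                    # Long Feature 18
        "openness": endpoint_dist / diag,                # Long Feature 19
        "area":     w * h,                               # Long Feature 20
    }
    feats["log_area"]   = math.log(feats["area"])        # Long Feature 21
    feats["log_len"]    = math.log(total_len)            # Long Feature 23
    feats["log_aspect"] = math.log(feats["aspect"])      # Long Feature 24
    return feats

# Example: an L-shaped stroke (closed strokes would need a guard on density1).
print(long_features([(0, 0), (0, 2), (3, 2)]))
```

For this L-shape, openness is 1.0 (the endpoints span the whole bounding-box diagonal) while density 1 exceeds 1 because the ink path (length 5) is longer than the straight line between the endpoints.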
The results from the first experiment showed that curviness, Log(aspect), total absolute angle, density 1, and, to a lesser extent, the angle between first and last points, the initial angle, and the distance between first and last points were important features for distinguishing between gestures from the first set.
The results from the second experiment likewise found that Log(aspect), density 1, and total absolute angle were important features for discriminating between gestures. Additionally, Long et al. found that figures with horizontal or vertical orientations were perceived as more similar than figures with diagonal orientations.
The predictive model derived from experiment 1 had slightly more predictive power than the model derived from experiment 2.
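The predictive models amount to a weighted distance in feature space. The sketch below assumes a simple weighted absolute-difference form with made-up weights and feature values, since the paper's fitted coefficients are not reproduced here.

```python
# Hypothetical weights (NOT the paper's fitted values) for a few of the
# features the experiments found important.
WEIGHTS = {"log_aspect": 0.9, "density1": 0.7, "total_abs_angle": 0.5}

def predicted_dissimilarity(feats_a, feats_b, weights=WEIGHTS):
    """Predict perceived dissimilarity as a weighted sum of absolute
    feature differences (one plausible form for such a model)."""
    return sum(w * abs(feats_a[k] - feats_b[k]) for k, w in weights.items())

# Made-up feature vectors for two gestures.
g1 = {"log_aspect": -0.2, "density1": 1.4, "total_abs_angle": 3.1}
g2 = {"log_aspect": -1.5, "density1": 2.9, "total_abs_angle": 1.0}
print(predicted_dissimilarity(g1, g2))
```

Fitting the weights by regression against the triad judgments, as Long et al. do, is what lets the model generalize to gesture pairs the participants never saw.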
Long et al. end their paper by noting that their exploration of the human perceptual space is not exhaustive, since that space has not yet been completely explored (my opinion: or may not even be entirely knowable).
Cited Work
A. Chris Long, Jr., James A. Landay, Lawrence A. Rowe, and Joseph Michiels. 2000. Visual similarity of pen gestures. In Proceedings of the SIGCHI conference on Human Factors in Computing Systems (CHI '00). ACM, New York, NY, USA, 360-367. DOI=10.1145/332040.332458 http://doi.acm.org/10.1145/332040.332458