In 2007, Wobbrock et al. published "Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes," a paper that has since changed how research is presented and how algorithms are valued. Traditionally, research and tools aimed at solving a given problem in computer science focus on the effectiveness or efficiency of the solution. As such, solutions tend to grow progressively more complicated, involved, or sophisticated. While this growth in complexity is typical and should be expected to some degree, the people who could most benefit from these solutions are often forgotten. Wobbrock et al. noticed this trend and set out to address the need for a simple, understandable, cheap solution to their target problem. They understood that the average programmer and interface specialist does not typically have access to more advanced recognition techniques, such as hidden Markov models, neural networks, and feature-based statistical recognizers (e.g., Rubine's). To meet the community's needs, Wobbrock et al. sought to create an easy-to-implement recognizer that performed reasonably well. The result was their $1 recognizer.
Today, the $1 recognizer is extremely well known even outside the sketch recognition community, simply because it has been readily adopted and widely deployed. Its success highlights the value of understanding the beneficiaries of your work and making your solution both effective and simple.
Unlike the (previously discussed) Rubine algorithm, the $1 algorithm does not rely on a trained classifier. Instead, it performs recognition on the fly: it compares a given candidate gesture to a set of stored templates, computes a distance for each, and returns a list of matches ordered from most to least similar.
Before the $1 recognizer can calculate this distance, it first performs a number of pre-processing steps aimed at orienting and positioning the candidate gesture in the same 'frame of reference' as the template gestures. These steps make the comparison invariant to time, rotation, scale, and position.
Pre-processing steps (a minimal code sketch of all four transformations follows this list):
1. Resample the point path to N points
   - New points are equidistant along the path
   - Effect: Temporal Invariance
2. Rotate so that the indicative angle is 0°
   - The indicative angle is the angle from the centroid (xAvg, yAvg) to the first point (x0, y0)
   - Effect: Rotational Invariance
3. Scale and translate
   - Scale the gesture (non-uniformly) to a reference square
   - Effect: Scale Invariance
   - Translate the gesture so that its centroid is at (0, 0)
   - Effect: Position Invariance
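The four steps map directly to short helper functions. Below is a minimal Python sketch of the pre-processing phase, assuming gestures are lists of (x, y) tuples; the choices of N = 64 and a 250-unit reference square are assumptions (they match commonly cited defaults), and the function names are my own:

```python
import math

def resample(points, n=64):
    """Resample a stroke to n points spaced equally along its arc length."""
    path_len = sum(math.dist(points[i - 1], points[i]) for i in range(1, len(points)))
    interval = path_len / (n - 1)
    pts = list(points)
    new_points = [pts[0]]
    d = 0.0
    i = 1
    while i < len(pts):
        seg = math.dist(pts[i - 1], pts[i])
        if seg > 0 and d + seg >= interval:
            # Interpolate a new point exactly one interval along the path.
            t = (interval - d) / seg
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            new_points.append(q)
            pts.insert(i, q)  # q becomes the start of the next segment
            d = 0.0
        else:
            d += seg
        i += 1
    while len(new_points) < n:  # guard against floating-point rounding
        new_points.append(pts[-1])
    return new_points[:n]

def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def rotate_to_zero(points):
    """Rotate about the centroid so the indicative angle becomes 0."""
    cx, cy = centroid(points)
    theta = math.atan2(points[0][1] - cy, points[0][0] - cx)
    cos_t, sin_t = math.cos(-theta), math.sin(-theta)
    return [((x - cx) * cos_t - (y - cy) * sin_t + cx,
             (x - cx) * sin_t + (y - cy) * cos_t + cy) for x, y in points]

def scale_to_square(points, size=250.0):
    """Non-uniformly scale the bounding box to a size-by-size square.
    (Nearly 1D strokes make w or h approach 0, a known weak spot; see
    the drawbacks below.)"""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    w, h = max(xs) - min(xs), max(ys) - min(ys)
    return [(x * size / w, y * size / h) for x, y in points]

def translate_to_origin(points):
    """Translate so the centroid sits at (0, 0)."""
    cx, cy = centroid(points)
    return [(x - cx, y - cy) for x, y in points]
```

A candidate is then pre-processed as translate_to_origin(scale_to_square(rotate_to_zero(resample(raw_points)))), and every stored template goes through the same pipeline once, when it is added.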
Matching Step:
4. Find the average point-to-point distance between the candidate gesture C and each template Ti:
   - AvgDistance(C, Ti) = ( sum from k=1 to N of sqrt( (C[k]x - Ti[k]x)^2 + (C[k]y - Ti[k]y)^2 ) ) / N
After computing the average distances, we take the minimum, di*, and convert it to a score in the range [0..1], where size is the side length of the reference square:
- Score = 1 - di* / ( (1/2) * sqrt( size^2 + size^2 ) )
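Both formulas reduce to a few lines of Python. A sketch, assuming both gestures have already been pre-processed to the same N points as above and that size matches the reference square's side length from the scaling step:

```python
import math

def path_distance(candidate, template):
    """Average pointwise Euclidean distance between two resampled paths."""
    return sum(math.dist(c, t) for c, t in zip(candidate, template)) / len(candidate)

def score(best_distance, size=250.0):
    """Convert the smallest template distance di* to a [0..1] score;
    half the reference square's diagonal bounds the achievable distance."""
    return 1.0 - best_distance / (0.5 * math.sqrt(size ** 2 + size ** 2))
```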
Since the indicative angle is only an approximation of the best alignment angle, this step can optionally be followed by searching for the distance at the best angle. The fastest technique discussed for finding this angle (with a small amount of error) is the Golden Section Search (GSS), sketched below.
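Here is a sketch of GSS-based angle refinement; the ±45° search range and 2° stopping threshold follow the paper's reported settings, while the function and variable names are my own:

```python
import math

PHI = 0.5 * (math.sqrt(5) - 1)  # golden ratio

def distance_at_angle(candidate, template, theta):
    """Rotate the candidate by theta (radians) about its centroid,
    then take the average pointwise distance to the template."""
    cx = sum(x for x, _ in candidate) / len(candidate)
    cy = sum(y for _, y in candidate) / len(candidate)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = [((x - cx) * cos_t - (y - cy) * sin_t + cx,
                (x - cx) * sin_t + (y - cy) * cos_t + cy)
               for x, y in candidate]
    return sum(math.dist(p, q) for p, q in zip(rotated, template)) / len(rotated)

def distance_at_best_angle(candidate, template,
                           a=math.radians(-45), b=math.radians(45),
                           threshold=math.radians(2)):
    """Golden Section Search over [a, b] for the rotation that minimizes
    the distance, stopping once the bracket is narrower than threshold."""
    x1 = PHI * a + (1 - PHI) * b
    x2 = (1 - PHI) * a + PHI * b
    f1 = distance_at_angle(candidate, template, x1)
    f2 = distance_at_angle(candidate, template, x2)
    while abs(b - a) > threshold:
        if f1 < f2:  # minimum lies in the lower sub-bracket
            b, x2, f2 = x2, x1, f1
            x1 = PHI * a + (1 - PHI) * b
            f1 = distance_at_angle(candidate, template, x1)
        else:        # minimum lies in the upper sub-bracket
            a, x1, f1 = x1, x2, f2
            x2 = (1 - PHI) * a + PHI * b
            f2 = distance_at_angle(candidate, template, x2)
    return min(f1, f2)
```

In a complete recognizer, this search runs once per template; sorting the templates by the resulting distances yields the ordered match list described earlier, and the smallest distance feeds the score formula above.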
Wobbrock et al. evaluated the $1 recognizer alongside the Rubine classifier and a Dynamic Time Warping (DTW) recognizer and found that for...
- The Effect of Training on Recognition Errors
- $1 performed comparably to DTW
- $1 performed better than Rubine, especially with 3 training examples or fewer
- The Effect of Speed on Recognition Errors
- $1 performed comparably to DTW
- $1 performed better than Rubine
- Recognition Scores in N-best List [0..1]
- $1 discriminated between other gesture classes slightly more sharply than DTW
- $1 discriminated between other gesture classes considerably more sharply than Rubine
- Speed of Recognition
- $1 overwhelmingly outperformed DTW
- 0.04 minutes for $1 vs. 60.02 minutes for DTW over 14,400 tests
- $1 outperformed Rubine
- 0.04 minutes for $1 vs. 0.60 minutes for Rubine over 14,400 tests
$1 drawbacks:
- It is a geometric template matcher, so the result is simply the closest match in 2D space.
- It cannot distinguish gestures whose identities depend on:
  - specific orientations
  - aspect ratios
  - locations
- Horizontal and vertical lines are deformed by non-uniform scaling (demonstrated in the sketch after this list).
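The last drawback is easy to demonstrate with the hypothetical scale_to_square helper from the pre-processing sketch: a nearly straight horizontal stroke has almost no height, so non-uniform scaling stretches its tiny vertical jitter to fill the entire reference square.

```python
# A nearly horizontal stroke: 100 units wide, 1 unit of pen jitter tall.
line = [(0.0, 0.0), (50.0, 1.0), (100.0, 0.0)]

# Non-uniform scaling stretches both dimensions to the full square, so
# the 1-unit jitter becomes as prominent as the 100-unit stroke.
print(scale_to_square(line, size=250.0))
# -> [(0.0, 0.0), (125.0, 250.0), (250.0, 0.0)]  (a deep "V", not a line)
```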
The $1 recognizer represents a simple, effective solution for basic recognition of single-stroke gestures.
Cited work: Jacob O. Wobbrock, Andrew D. Wilson, and Yang Li. "Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes." In Proceedings of the 20th Annual ACM Symposium on User Interface Software and Technology (UIST 2007), pages 159-168. ACM, New York, NY, USA, 2007. http://dl.acm.org/citation.cfm?id=1294238




