Wednesday, February 27, 2013
Protractor: A Fast and Accurate Gesture Recognizer
Published in 2010 and written by Yang Li, this paper presents a fast, flexible, and simple template-based recognizer that calculates the distance of a given gesture to the template set using an angle-based metric.
Template-based recognition lends itself well to situations where users create their own personalized gestures: since the recognition is purely data-driven and feature-agnostic, it stays flexible and responsive to the input users provide.
Li describes geometric and feature-based recognizers as parametric, complex, and inflexible methods that are unresponsive to input. This observation is not without merit; such recognizers do tend toward inflexibility, forcing users to conform to the recognizer and to the existing training data that the recognizer runs on.
Li also notes that users do not want to provide multiple instances of their gesture, so the ability to recognize accurately from small amounts of input is paramount.
Interestingly, Li notes that Protractor can be either rotation invariant or rotation sensitive, which is not an option I have seen offered in other recognizers, where you're usually stuck with one mode or the other. This choice is nice, since some gestures aren't recognizable if you're locked into a single orientation.
Preprocessing
Like $1, Protractor resamples the gesture to 16 points and translates its centroid, (Xavg, Yavg), to the origin, (0, 0). Unlike $1, Protractor then enters a noise-reduction phase for gesture orientation. If recognition is set to be rotation invariant, the gesture is rotated so that its indicative angle (the angle from the centroid to the starting point) is zero. Otherwise, the gesture is rotated to the nearest of eight base orientations.
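A minimal Python sketch of this preprocessing pipeline might look like the following. The function names and the resampling helper are my own (the resampling follows the style of $1, which the paper builds on), not code from the paper:

```python
import math

N = 16  # Protractor resamples each gesture to 16 points

def resample(points, n=N):
    """Resample a stroke to n points spaced evenly along its path
    (helper in the style of $1; naming is my own)."""
    interval = sum(math.dist(points[i - 1], points[i])
                   for i in range(1, len(points))) / (n - 1)
    resampled = [points[0]]
    accumulated = 0.0
    pts = list(points)
    i = 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if d > 0 and accumulated + d >= interval:
            t = (interval - accumulated) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            resampled.append(q)
            pts.insert(i, q)  # q becomes the start of the next segment
            accumulated = 0.0
        else:
            accumulated += d
        i += 1
    while len(resampled) < n:  # guard against floating-point shortfall
        resampled.append(pts[-1])
    return resampled[:n]

def translate_to_origin(points):
    """Move the centroid (Xavg, Yavg) to (0, 0)."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return [(x - cx, y - cy) for x, y in points]

def rotate(points, angle):
    c, s = math.cos(angle), math.sin(angle)
    return [(x * c - y * s, x * s + y * c) for x, y in points]

def vectorize(points, rotation_invariant=True):
    """Preprocess a stroke into a normalized 2N-dimensional vector."""
    pts = translate_to_origin(resample(points))
    # Indicative angle: direction of the starting point from the centroid.
    ind = math.atan2(pts[0][1], pts[0][0])
    if rotation_invariant:
        delta = -ind  # align the indicative angle to zero
    else:
        # Snap to the nearest of 8 base orientations (multiples of 45 deg).
        delta = (math.pi / 4) * round(ind / (math.pi / 4)) - ind
    pts = rotate(pts, delta)
    vec = [coord for p in pts for coord in p]
    mag = math.sqrt(sum(v * v for v in vec))
    return [v / mag for v in vec]
```

The output is a unit-length vector of 32 numbers (x and y for each of the 16 points), which is what the angular-distance step operates on.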
With the gestures processed for orientation, the 16 points form a vector used to compute the inverse-cosine (angular) distance between gestures through a closed-form solution for the minimum angular distance under rotation. This closed-form solution is much quicker than finding the optimal alignment through an iterative process.
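The closed-form step can be sketched like this, assuming each gesture has already been preprocessed into a normalized vector [x1, y1, x2, y2, ...] as described above. The function names and the (name, vector) template pairs are my own conventions, not the paper's:

```python
import math

def optimal_cosine_distance(v1, v2):
    """Closed-form minimum angular distance between two normalized
    gesture vectors over all rotations (sketch; naming is my own)."""
    # a and b accumulate the aligned and cross terms per (x, y) pair.
    a = sum(v1[i] * v2[i] + v1[i + 1] * v2[i + 1] for i in range(0, len(v1), 2))
    b = sum(v1[i] * v2[i + 1] - v1[i + 1] * v2[i] for i in range(0, len(v1), 2))
    theta = math.atan2(b, a)  # rotation that best aligns v2 with v1
    cos_sim = a * math.cos(theta) + b * math.sin(theta)
    return math.acos(max(-1.0, min(1.0, cos_sim)))  # clamp float error

def recognize(candidate, templates):
    """Pick the (name, vector) template nearest the candidate vector."""
    return min(templates, key=lambda t: optimal_cosine_distance(candidate, t[1]))[0]
```

Because the optimal rotation angle falls out of a single arctangent rather than a search, each template comparison is a constant number of passes over 32 numbers, which is where the speedup over $1's iterative golden-section search comes from.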
Study
Li ran his Protractor recognizer against the $1 recognizer on a set of 4,800 samples of 16 gestures and found that their recognition rates were similar. However, the time to recognize was much smaller for Protractor.
Li also studied the effect of orientation sensitivity (invariant, 2-way, 4-way, and 8-way base orientations) on error rates and found that 8-way was significantly less accurate than the other three sensitivity levels due to noise in the data.
Li also points out that his recognizer requires 1/4 of the memory space that $1 does, which is important for the mobile devices the recognizer would likely run on.
I certainly wouldn't have come up with the idea to build a template recognizer on an angle-based distance. I never really thought about the implications of designing recognition schemes around the target platform (mobile), but it makes sense, and I can see how Protractor is strong in that respect.
I'd like to see how different kinds of data sets would affect Protractor's recognition rates and performance compared to $1...
Yang Li. 2010. Protractor: a fast and accurate gesture recognizer. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '10). ACM, New York, NY, USA, 2169-2172. DOI=10.1145/1753326.1753654 http://doi.acm.org/10.1145/1753326.1753654