Thursday, January 24, 2013

Rubine Algorithm - Specifying Gestures By Example


Written in 1991, Dean Rubine's paper "Specifying Gestures by Example" describes one of the earliest and most widely used algorithms for single-stroke sketch recognition today.

The Rubine recognizer is essentially a weighted linear discriminator that classifies a given single-stroke sketch based on a training set. This means that the Rubine recognizer 'learns' what the shapes look like through features, much like how humans visually perceive the shapes. This contrasts with other recognizers of the time, which hard-coded the properties of the shapes to be recognized. Another major strength of Rubine's algorithm comes from its simple set of features, which have a few interesting properties:

  • Each feature is incrementally computable in constant time per input point.
    • Allows large gestures to be handled as efficiently as small ones
  • A small change in the input results in a small change in the feature value
    • Classification performs better since the features are not multi-modal
  • Each feature is 'meaningful'
    • Features can be used in gesture semantics as well as for recognition
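As a rough sketch of the classification step: each gesture class gets a weight vector, and the recognizer picks the class whose linear evaluation over the feature vector scores highest. The function and variable names below are my own, and training the weights (which Rubine does from a common covariance matrix over labeled examples) is not shown:

```python
def classify(features, weights):
    """Pick the gesture class with the highest linear evaluation score.

    `weights` maps each class name to a vector [w_c0, w_c1, ..., w_cF],
    where w_c0 is the bias term and the rest pair up with the features.
    Training these weights from examples is omitted here.
    """
    best_class, best_score = None, float("-inf")
    for cls, w in weights.items():
        # v_c = w_c0 + sum_i (w_ci * f_i)
        score = w[0] + sum(wi * fi for wi, fi in zip(w[1:], features))
        if score > best_score:
            best_class, best_score = cls, score
    return best_class
```

With hand-picked toy weights, `classify([2.0, 0.5], {"line": [0.0, 1.0, 0.0], "arc": [0.0, 0.0, 1.0]})` returns `"line"`.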

Features:
  • Initial direction (angle) of gesture
    • f1 = cos(alpha) = (x2 - x0) / sqrt( (x2 - x0)^2 + (y2 - y0)^2 )
    • f2 = sin(alpha) = (y2 - y0) / sqrt( (x2 - x0)^2 + (y2 - y0)^2 )
      • Using both sin and cos helps avoid discontinuities as the angle passes through 2pi and 0
  • Length of the diagonal of the bounding box (size of gesture)
    • f3 = sqrt( (Xmax - Xmin)^2 + (Ymax - Ymin)^2 )
  • Angle of the bounding box
    • f4 = arctan( (Ymax - Ymin) / (Xmax - Xmin) )
  • Distance between first and last points
    • f5 = sqrt( (XP-1 - X0)^2 + (YP-1 - Y0)^2 )
  • The angle between the first and last points
    • f6 = cos(beta) = (XP-1 - X0) / f5
    • f7 = sin(beta) = (YP-1 - Y0) / f5
  • Total gesture length
    • f8 = sum from i = 0 to i = P-2 ( sqrt( deltaXi^2 + deltaYi^2 ) )
      • Where deltaXi = Xi+1 - Xi  and deltaYi = Yi+1 - Yi
  • Total angle traversed
    • f9 = sum from i = 1 to i = P-2 ( THETAi )
      • Where THETAi = arctan( (deltaXi * deltaYi-1 - deltaYi * deltaXi-1) /
                               (deltaXi * deltaXi-1 + deltaYi * deltaYi-1) )
  • Total absolute values of the angles between each point
    • f10 = sum from i = 1 to i = P-2 ( abs(THETAi) )
  • Total squared values of the angles between each point (Sharpness)
    • f11 = sum from i = 1 to i = P-2 ( THETAi ^ 2 ) 
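The feature list above can be sketched directly in Python. This is a minimal implementation under my own naming and input conventions (a stroke is a list of (x, y) tuples with at least three distinct points); the preprocessing Rubine describes, like timestamps for the duration/speed features and filtering of near-duplicate points, is omitted:

```python
import math

def rubine_features(points):
    """Compute the geometric Rubine features f1-f11 for one stroke.

    Assumes `points` has at least 3 entries and no repeated points,
    so the divisions below are safe.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    # f1, f2: cosine and sine of the initial angle (points 0 and 2)
    d = math.hypot(xs[2] - xs[0], ys[2] - ys[0])
    f1 = (xs[2] - xs[0]) / d
    f2 = (ys[2] - ys[0]) / d

    # f3, f4: diagonal length and angle of the bounding box
    # (atan2 used instead of arctan to avoid division by zero)
    w = max(xs) - min(xs)
    h = max(ys) - min(ys)
    f3 = math.hypot(w, h)
    f4 = math.atan2(h, w)

    # f5, f6, f7: distance and angle between first and last points
    f5 = math.hypot(xs[-1] - xs[0], ys[-1] - ys[0])
    f6 = (xs[-1] - xs[0]) / f5
    f7 = (ys[-1] - ys[0]) / f5

    # Per-segment deltas: deltaXi = X(i+1) - Xi
    dx = [xs[i + 1] - xs[i] for i in range(len(points) - 1)]
    dy = [ys[i + 1] - ys[i] for i in range(len(points) - 1)]

    # f8: total stroke length
    f8 = sum(math.hypot(a, b) for a, b in zip(dx, dy))

    # f9, f10, f11: signed, absolute, and squared turning angles
    f9 = f10 = f11 = 0.0
    for i in range(1, len(dx)):
        theta = math.atan2(dx[i] * dy[i - 1] - dy[i] * dx[i - 1],
                           dx[i] * dx[i - 1] + dy[i] * dy[i - 1])
        f9 += theta
        f10 += abs(theta)
        f11 += theta * theta

    return [f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11]
```

For an L-shaped stroke like `[(0,0), (1,0), (2,0), (2,1), (2,2)]`, the total length f8 comes out to 4 and the total turning f9 to one quarter turn, which is a quick sanity check that the loop visits each segment pair exactly once.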

Rubine also discusses a less well-remembered system for implementing gestural interfaces on top of existing drag-and-drop based programs called "GRANDMA", which stands for Gesture Recognizers Automated in a Novel Direct Manipulation Architecture. Primarily, GRANDMA was aimed (lol) at making sketch recognition easy to adopt, since most recognition features were being custom-built for every new program. Other than its (rather funny) name, I don't think GRANDMA is particularly memorable, but maybe that's coming from my perspective as a student in the year 2013, when gestural interfaces are fairly common on popular mobile devices.

Cited work:
Dean Rubine. "Specifying Gestures by Example." SIGGRAPH '91: Proceedings of the 18th Annual Conference on Computer Graphics and Interactive Techniques, pages 329-337. ACM, New York, NY, USA, 1991. http://dl.acm.org/citation.cfm?id=122753
