In this paper, Paulson and Hammond develop a free-form sketch recognition system called "PaleoSketch" that geometrically recognizes the following low-level primitives with very high accuracy (98-99%):
- Lines
- Polylines
- Circles
- Ellipses
- Arcs
- Curves
- Spirals <= Unique feature!
- Helixes <= Unique feature!
After recognition, the user is presented with a disambiguation menu to select from the shapes that fit the primitive test conditions. Once the user has selected their intended shape, the sketch is "beautified" by removing the actual user strokes and replacing them with a Java2D shape. The primitives recognized by the lower-level PaleoSketch recognizer are restricted to single strokes. The PaleoSketch system can also recognize combinations of these primitives through a higher-level hierarchical recognizer similar to other systems in the field.
The primary focus of PaleoSketch was to place the fewest restrictions on the user and have the system conform to the user rather than the other way around. As was mentioned in my previous post, geometric recognizers have the benefit that recognition is not as heavily dependent on how the user drew the sketch as it is in feature/gesture-based recognizers. So, in line with their goal of supporting user-independent recognition, PaleoSketch primitive recognition is geometric.
Paulson and Hammond also note that their use of corner recognition and a multistage recognition scheme (pre-recognition, primitive/shape recognition, beautification, and higher-level/hierarchical recognition [See Figure 1]) was largely influenced by Sezgin et al.'s work on the SSD recognizer. Yu and Cai's work was also a major inspiration, particularly as it related to corner recognition and a feature area error metric.
The Recognition Stages
In pre-recognition, PaleoSketch...
- Removes consecutive and duplicate points
- Constructs direction, speed, curvature, and corner graphs
- Calculates NDDE (Normalized Distance between Direction Extremes) <= New feature!
  - curved stroke and polyline discrimination
- Calculates DCR (Direction Change Ratio) <= New feature!
  - curved stroke and polyline discrimination
- Removes tails
- Tests for overtracing
- Tests for closed figures
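To make the two new features a little more concrete, here is a minimal Python sketch of part of the pre-recognition stage. It is my own illustration, not the authors' code. It assumes that NDDE is the stroke length between the points of highest and lowest direction value divided by the total stroke length, and that DCR is the maximum change in direction divided by the average change in direction. All function names are hypothetical, and thresholds and any handling of noisy stroke ends are left out.

```python
import math

def remove_duplicate_points(points):
    """Drop points that repeat the coordinates of the previous point."""
    cleaned = []
    for p in points:
        if not cleaned or p != cleaned[-1]:
            cleaned.append(p)
    return cleaned

def direction_graph(points):
    """Direction (radians) of each segment, unwrapped so values do not jump by 2*pi."""
    directions = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        angle = math.atan2(y1 - y0, x1 - x0)
        if directions:
            prev = directions[-1]
            while angle - prev > math.pi:
                angle -= 2 * math.pi
            while angle - prev < -math.pi:
                angle += 2 * math.pi
        directions.append(angle)
    return directions

def path_length(points, start=0, end=None):
    """Length of the stroke between point indices start and end."""
    end = len(points) - 1 if end is None else end
    return sum(math.dist(points[i], points[i + 1]) for i in range(start, end))

def ndde(points):
    """Assumed definition: stroke length between direction extremes / total length."""
    d = direction_graph(points)
    lo, hi = sorted((d.index(min(d)), d.index(max(d))))
    total = path_length(points)
    return path_length(points, lo, hi) / total if total else 0.0

def dcr(points):
    """Assumed definition: maximum direction change / average direction change."""
    d = direction_graph(points)
    changes = [abs(b - a) for a, b in zip(d, d[1:])]
    if not changes:
        return 0.0
    mean = sum(changes) / len(changes)
    return max(changes) / mean if mean else 0.0

# A smooth half circle versus a polyline with one sharp corner.
arc = [(math.cos(math.pi * i / 20), math.sin(math.pi * i / 20)) for i in range(21)]
corner = [(i, 0) for i in range(6)] + [(5, j) for j in range(1, 6)]
print("arc:    NDDE=%.2f DCR=%.2f" % (ndde(arc), dcr(arc)))
print("corner: NDDE=%.2f DCR=%.2f" % (ndde(corner), dcr(corner)))
```

Under these assumed definitions, the arc's direction changes steadily, so its direction extremes sit near the two ends of the stroke (high NDDE) and no single change stands out (DCR near 1). The polyline has one spike in its direction graph, which drives DCR up.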
In primitive/shape recognition, PaleoSketch...
runs the shape tests for the primitives listed above. If a shape doesn't pass any of the shape tests, it is classified as a 'complex shape'. After this classification, the shape is broken into substrokes and recombined to see if any of the constituent combinations can be redefined as a primitive shape.
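The "run every shape test, fall back to complex" flow can be sketched as below. This is only an illustration of the control flow: the two tests, their error measures, and the 0.1 thresholds are placeholders I made up, not the tests or thresholds from the paper.

```python
import math

def line_test(points, max_error=0.1):
    """Hypothetical test: mean distance from the endpoint chord, relative to chord length."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    chord = math.hypot(x1 - x0, y1 - y0)
    if chord == 0:
        return False
    total = sum(abs((x1 - x0) * (y0 - y) - (x0 - x) * (y1 - y0)) / chord
                for x, y in points)
    return (total / len(points)) / chord <= max_error

def circle_test(points, max_error=0.1):
    """Hypothetical test: mean deviation of point radii from the mean radius."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    radii = [math.hypot(x - cx, y - cy) for x, y in points]
    mean_r = sum(radii) / len(radii)
    if mean_r == 0:
        return False
    deviation = sum(abs(r - mean_r) for r in radii) / len(radii)
    return deviation / mean_r <= max_error

# Placeholder registry; the real system tests all of the primitives listed above.
SHAPE_TESTS = [("line", line_test), ("circle", circle_test)]

def classify(points):
    """Return every primitive whose test passes, or ['complex'] if none do."""
    passed = [name for name, test in SHAPE_TESTS if test(points)]
    return passed if passed else ["complex"]

line = [(i, 0) for i in range(11)]
circle = [(math.cos(2 * math.pi * i / 24), math.sin(2 * math.pi * i / 24))
          for i in range(24)]
corner = [(i, 0) for i in range(6)] + [(5, j) for j in range(1, 6)]
print(classify(line), classify(circle), classify(corner))
```

Returning a list rather than a single label mirrors the disambiguation menu: every shape that fits is offered to the user, and only a stroke that fits nothing is handed on as a complex shape.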
Then in beautification, PaleoSketch...
beautifies the stroke by returning the Java2D shape.
Finally, in higher-level/hierarchical recognition, PaleoSketch...
ranks/scores the primitive shapes; these scores feed a classification system that determines whether multiple strokes should be identified as polylines, curves, or 'complex' shapes. Complex shapes are reanalyzed for tails. The identified tails are then removed and the shapes are retested to determine if they can now be recognized as primitives. If they pass, they're reclassified as shapes. Otherwise, they remain complex shapes.
In the accompanying study, the full Paleo recognizer was tested against
- a version of Paleo without the two new features (DCR and NDDE)
- a version of Paleo without the ranking
- the SSD recognizer
The results clearly show that the full version of Paleo is the most accurate. The DCR and NDDE features make a very significant contribution toward giving the correct top interpretation and a lesser, but still major, contribution toward giving the correct interpretation at all.
The PaleoSketch multi-stage recognition, in combination with a clearly thought-out set of features and shape tests, makes the Paleo recognizer a very accurate and impressive system. The failure cases given in the paper are ones that people probably would have misclassified too, which I think makes an interesting point in itself: recognizers can't always (100%) understand what any given user intends to convey or draw. People are, frankly, imperfect. Put another way, "To err is human."
The inclusion of so many thresholds in the paper feels disconcertingly arbitrary, however, and I would have liked to have seen more discussion of why those thresholds were chosen and whether they remain valid across a broad range of domains and applications.
Brandon Paulson and Tracy Hammond. 2008. PaleoSketch: accurate primitive sketch recognition and beautification. In Proceedings of the 13th international conference on Intelligent user interfaces (IUI '08). ACM, New York, NY, USA, 1-10. DOI=10.1145/1378773.1378775 http://doi.acm.org/10.1145/1378773.1378775


