The new technology makes it possible to use hand gestures to perform a host of complex operations. For instance, a user can wave his or her hand up and down to make a menu selection on a PC and then confirm the selection with a ‘push’ gesture, even from a distance. Maps and other images can also be expanded or shrunk with ‘push’ and ‘pull’ gestures.
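To make the interaction concrete, here is a minimal sketch of how recognised gesture events might drive a menu of this kind. The gesture labels and the Menu class are hypothetical illustrations; the article does not describe Fujitsu's actual interface.

```python
# A minimal sketch of mapping recognised gestures to menu operations.
# The gesture labels ("wave_up", "wave_down", "push") are assumptions;
# the article does not specify the recogniser's output format.

class Menu:
    def __init__(self, items):
        self.items = items
        self.index = 0  # currently highlighted item

    def handle(self, gesture):
        if gesture == "wave_up":
            self.index = (self.index - 1) % len(self.items)
        elif gesture == "wave_down":
            self.index = (self.index + 1) % len(self.items)
        elif gesture == "push":
            return self.items[self.index]  # 'push' confirms the selection
        return None

menu = Menu(["Maps", "Photos", "Settings"])
for gesture in ["wave_down", "wave_down", "push"]:
    result = menu.handle(gesture)
    if result:
        print("Selected:", result)  # -> Selected: Settings
```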
In recent years, hand gesture recognition technologies, which enable devices to be controlled through natural human movements, have been attracting attention. This approach to operating a device through hand motions detected with a camera is typically employed when a user is physically separated from a PC or tablet, when the user's hands are wet or dirty, or when the user does not want to directly touch a publicly accessible device due to hygiene concerns.
When substituting hand gestures for mouse operations, vertical and horizontal hand motions, which correspond to moving the mouse cursor, have been straightforward to detect. Detecting depth, however, in the form of the forward and backward motions equivalent to mouse clicks, has proven difficult.
Detecting forward and backward hand motions with a single camera relies on changes in the area of the user’s palm. When the user is wearing short-sleeved clothing, however, or when the background is a colour similar to the user’s skin tone, it is difficult to distinguish between the user’s palm and the background.
As a result, it has been impossible to accurately extract the area of the user's hand from the background. Until now, forward and backward motion has been detected using distance sensors or multiple cameras, but the equipment required for these approaches has made them quite costly.
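As a rough illustration of the palm-area principle described above, the following sketch classifies a frame-to-frame change in the extracted palm area as a forward or backward motion. The area values and the ratio threshold are illustrative assumptions, not figures from the article.

```python
# A minimal sketch: read growth or shrinkage of the palm's pixel area
# as motion toward or away from a single camera. The 1.2 ratio threshold
# is an assumed value for illustration.

def classify_motion(prev_area, curr_area, ratio_threshold=1.2):
    """Classify depth motion from the ratio of palm areas in two frames."""
    if prev_area <= 0:
        return "unknown"
    ratio = curr_area / prev_area
    if ratio > ratio_threshold:
        return "push"    # hand grew larger: moved toward the camera
    if ratio < 1.0 / ratio_threshold:
        return "pull"    # hand shrank: moved away from the camera
    return "steady"

areas = [5000, 5200, 7000, 6900, 5100]  # synthetic per-frame palm areas (px)
print([classify_motion(p, c) for p, c in zip(areas, areas[1:])])
# -> ['steady', 'push', 'steady', 'pull']
```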
Fujitsu Laboratories and Fujitsu R&D Center claim their new technology can accurately extract the area of the user's hand from video captured by a single camera, enabling constant changes in the apparent size of the hand to be reliably tracked. Key features of the new technology are as follows.
The hand is first detected in a captured image by clipping out blocks of set sizes and calculating whether each block contains the hand's characteristics, as captured by a pre-registered palm model.
The hand's actual position and size are determined by repeating this operation across the entire image while varying the position and size of the blocks. Next, the hand's area is extracted by taking the detected position and size and progressively adjusting the colour threshold within that region until the hand's original shape is recovered. Together, these operations ensure stable detection of the hand area, as illustrated in the sketch below.
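Under stated assumptions, the following sketch illustrates the two-stage idea: a block scan at several sizes scored against a palm template (standing in for the pre-registered palm model), followed by stepwise widening of a skin-colour threshold inside the best-matching block. The scoring function, colour ranges, and coverage target are assumptions; the article does not disclose Fujitsu's actual model.

```python
import cv2
import numpy as np

def palm_score(block, model):
    """Hypothetical similarity: negative mean squared error between a block
    (resized to the model's resolution) and a pre-registered palm template."""
    resized = cv2.resize(block, (model.shape[1], model.shape[0]))
    diff = resized.astype(np.float32) - model.astype(np.float32)
    return -float(np.mean(diff ** 2))

def detect_palm(gray, model, sizes=(48, 64, 96), stride=8):
    """Scan square blocks of several sizes across the whole image; return the
    (x, y, size) of the block that best matches the palm model."""
    best_score, best_box = -np.inf, None
    h, w = gray.shape
    for size in sizes:
        for y in range(0, h - size + 1, stride):
            for x in range(0, w - size + 1, stride):
                score = palm_score(gray[y:y + size, x:x + size], model)
                if score > best_score:
                    best_score, best_box = score, (x, y, size)
    return best_box

def extract_hand(bgr, box, target_coverage=0.3, hue_step=5, hue_max=40):
    """Within the detected block, widen an assumed skin-colour hue range step
    by step until the mask covers a plausible share of the block."""
    x, y, size = box
    roi = cv2.cvtColor(bgr[y:y + size, x:x + size], cv2.COLOR_BGR2HSV)
    lo = np.array([0, 40, 60], np.uint8)
    hi = np.array([15, 255, 255], np.uint8)
    mask = cv2.inRange(roi, lo, hi)
    while mask.mean() / 255.0 < target_coverage and hi[0] < hue_max:
        hi[0] += hue_step  # relax the hue limit to admit more of the hand
        mask = cv2.inRange(roi, lo, hi)
    return mask

# Synthetic usage: a grey frame with a skin-toned square standing in for the palm.
frame = np.full((240, 320, 3), 90, np.uint8)
frame[80:144, 120:184] = (40, 60, 160)                # skin-like BGR patch
cv2.circle(frame, (152, 112), 20, (20, 30, 100), -1)  # darker palm centre
model = cv2.cvtColor(frame[80:144, 120:184], cv2.COLOR_BGR2GRAY)
box = detect_palm(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), model)
mask = extract_hand(frame, box)
print(box, round(mask.mean() / 255.0, 2))  # -> (120, 80, 64) 1.0
```

A production recogniser would presumably replace the template score with the trained palm model and adapt the colour range continuously as lighting and backgrounds change, but the scan-then-threshold structure follows the description above.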
The new technology is being showcased at the International Conference on Machine Vision Applications (MVA) 2013, commencing May 20, 2013, at Ritsumeikan University in Kyoto, Japan.