An Efficient Real Time Method of Fingertip Detection


📝 Original Info

  • Title: An Efficient Real Time Method of Fingertip Detection
  • ArXiv ID: 1108.0502
  • Date: 2015-03-13
  • Authors: Jagdish Lal Raheja, Karen Das, Ankit Chaudhary

📝 Abstract

Fingertip detection is used in many applications and is now common in the area of Human Computer Interaction. This paper presents a novel, time-efficient method that detects fingertips after cropping the irrelevant parts of the input image. A binary silhouette of the input image is generated using an HSV color space based skin filter, and the hand is cropped based on the histogram of the hand image. The cropped image is then used to locate the fingertips.

📄 Full Content

Interactive systems based on gesture recognition need a real-time implementation to work with acceptable performance. The literature contains many examples in which gestures based on fingertip detection are used to control a system. Our focus is on natural hand gesture recognition without any markers or sensor-based gloves. Many researchers have proposed methods for dynamic hand gesture recognition using fingertip detection, but these approaches have several limitations. Garg [Garg, P. et al., 2009] uses 3D images to recognize the hand gesture, but this process is complex and not time efficient. Processing time is a critical factor in real-time applications; as Ozer [Ozer, I.B., 2005] states, "Designing a real-time video analysis is truly a complex task". Yang [Yang] analyses the hand contour to select fingertip candidates, finds peaks in their spatial distribution, and checks local variance to locate fingertips. These methods are not invariant to the orientation of the hand. Other methods use directionally variant templates to detect fingertips [Kim, J.M. and Lee, W.K., 2008], [Sanghi, et al., 2008]. A few other methods depend on specialized instruments and setups, such as an infrared camera [Oka, K. et al., 2002], a stereo camera [Ying], a fixed background [Crowley, J.L., et al., 1995], [Quek, F.K.H. et al., 1995], or markers on the hand. This paper describes a novel method for recognizing motion patterns generated by the hand without any kind of sensor or marker.

Detecting moving fingertips in video requires a fast and robust implementation. Many fingertip detection methods are based on hand segmentation, because it reduces the pixel area to be processed by selecting only the areas of interest. However, most hand segmentation methods cannot segment the hand cleanly under conditions such as fast hand motion, a cluttered background, or poor lighting [Christian]. Poor hand segmentation usually invalidates the fingertip detection that follows it. Some researchers [Oka, et al., May 2002], [Oka, et al., Dec 2002], [Sato, 2000] use an infrared camera to get a reliable segmentation. A few researchers [Crowley, et al., 1995], [Quek, et al., 1995], [Christian], [Tomita, et al., 1994], [Keaton, et al., 2002], [Wu et al., 2000] limit the degree of background clutter, the finger motion speed, or the lighting conditions to get a reliable segmentation in their work. Some fingertip detection methods cannot accurately localize fingertips that point in multiple directions. Researchers [Crowley, et al., 1995], [Quek, et al., 1995], [Brown, et al., 2000], [Tomita, et al., 1995] assume that the hand always points upward to get precise localization.

The skin filter is applied to the current input frame of the video. It is based on the HSV colour space (it can also be based on YCbCr). In the HSV colour space, skin is filtered using the chromaticity (hue and saturation) values, while in the YCbCr colour space the Cb and Cr values are used. The skin filter creates a binary image with the background in black and the hand region in white. In the next step, the binary image is smoothed using an averaging filter. Figure 2 shows the different steps of the skin filtering process. The output of the skin filter can contain many errors, caused by wrongly detected pixels or by skin-coloured pixels in the background. To remove these errors, the biggest BLOB (Binary Linked Object) is taken to be the hand and everything else is treated as background, as shown in figure 3(a). In the resulting image, hand pixels are set to ‘1’ and background pixels to ‘0’. The filtered-out hand, after removing all errors, is shown in figure 3(b). The only limitation of this filter is that the BLOB for the hand must be the biggest one.
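The skin filtering and biggest-BLOB steps described above can be sketched as follows. This is a minimal illustration in pure Python, not the authors' implementation: the function names `skin_mask` and `largest_blob`, the hue and saturation thresholds, and the use of 4-connectivity are all assumptions, since the text gives no numeric values. The averaging-filter smoothing step is left out for brevity.

```python
import colorsys
from collections import deque

# Hypothetical thresholds: the source gives no numeric values.
H_MAX = 50 / 360.0            # upper hue bound, as a fraction of the hue circle
S_MIN, S_MAX = 0.15, 0.70     # saturation band


def skin_mask(rgb_image):
    """Return a binary silhouette: 1 where hue/saturation look like skin, else 0."""
    mask = []
    for row in rgb_image:
        out = []
        for r, g, b in row:
            h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            out.append(1 if h <= H_MAX and S_MIN <= s <= S_MAX else 0)
        mask.append(out)
    return mask


def largest_blob(mask):
    """Keep only the largest 4-connected region of 'on' pixels (assumed to be the hand)."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    best = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill collects one connected region.
            seen[y][x] = True
            queue = deque([(y, x)])
            blob = []
            while queue:
                cy, cx = queue.popleft()
                blob.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(blob) > len(best):
                best = blob
    cleaned = [[0] * width for _ in range(height)]
    for y, x in best:
        cleaned[y][x] = 1
    return cleaned
```

Calling `largest_blob(skin_mask(frame))` on a frame given as rows of `(r, g, b)` tuples yields the cleaned silhouette. As the text notes, this only works when the hand is the biggest skin-coloured region in the frame.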

Wrist end detection is based on the histogram of the binary silhouette. The histogram-generating functions are defined over the binary silhouette imb, where m and n index the rows and columns of the matrix imb.
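The formulas themselves do not appear in this text. One common way to build such histograms, given here as an assumption rather than as the authors' exact definition, is to count the ‘on’ pixels along each row and each column of imb. The symbols H_row and H_col are illustrative names.

```latex
H_{\mathrm{row}}(m) = \sum_{n} \mathit{imb}(m, n), \qquad
H_{\mathrm{col}}(n) = \sum_{m} \mathit{imb}(m, n)
```

Under this reading, H_row(m) is the number of hand pixels in row m and H_col(n) is the number of hand pixels in column n.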

After a 4-way scan of the image, we choose the maximum count of ‘on’ pixels (‘1’ in the binary silhouette) across all scans. It was noted that the maximum count of ‘on’ pixels marks the wrist end, and the opposite end of that scan is the finger end. Figure 4 shows the scanning process. The yellow bar shown in figure 4 corresponds to the first ‘on’ pixels encountered in the binary silhouette when scanning from left to right.
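One possible reading of the four-way scan is sketched below. It assumes that each scan stops at the first row or column containing any ‘on’ pixel and records how many ‘on’ pixels that line holds; the side with the largest count is then taken as the wrist. The function names `first_on_count` and `detect_wrist_side` and this interpretation of the bars are assumptions, not details confirmed by the text.

```python
def first_on_count(lines):
    """Count the 'on' pixels in the first scan line that contains any."""
    for line in lines:
        count = sum(line)
        if count:
            return count
    return 0


def detect_wrist_side(imb):
    """Return (wrist_side, finger_side, counts) for a binary silhouette."""
    rows = [list(r) for r in imb]
    cols = [list(c) for c in zip(*imb)]
    counts = {
        "left": first_on_count(cols),               # left-to-right scan
        "right": first_on_count(reversed(cols)),    # right-to-left scan
        "top": first_on_count(rows),                # top-to-bottom scan
        "bottom": first_on_count(reversed(rows)),   # bottom-to-top scan
    }
    opposite = {"left": "right", "right": "left", "top": "bottom", "bottom": "top"}
    wrist = max(counts, key=counts.get)             # widest first contact
    return wrist, opposite[wrist], counts


# Toy silhouette: wrist enters from the bottom, one finger points up.
imb = [
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0, 0],
]
wrist, fingers, counts = detect_wrist_side(imb)
```

For the toy silhouette, the bottom-to-top scan meets three ‘on’ pixels at once, more than any other direction, so the wrist is placed at the bottom and the fingers at the top. Ties between directions are not handled in this sketch.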

Similarly, the green bar corresponds to the right-to-left scan, the red bar to the bottom-to-top scan, and the pink bar to the top-to-bottom scan of ‘on’ pixels in the binary silhouette. The red bar has a greater magnitude than the other bars for that particular image frame, so we can infer that the wrist end is at the bottom of the frame and, consequently, that the fingers point upward. Here the direction from wrist to fi
