EE368: Digital Image Processing
Face Detection
Siddharth Joshi, Gaurav Srivastava
{[email protected], [email protected]}

Abstract

Human face detection has become a major field of interest in current research because there is no deterministic algorithm for finding the face(s) in a given image. Moreover, the algorithms that do exist are highly specific to the kind of images they accept as input. The problem addressed here is to detect faces in a given color class group photograph. The approach we take is a mixture of heuristics and known algorithms: a series of simple rejection blocks applied in sequence until only the faces remain. The deeper a rejection block sits in the series, the more specifically it can be trained to eliminate non-faces. Various methods, including neural networks, template matching, maximal rejection, Fisher linear discriminants, and eigenfaces, have been tried. Finally, a combination of skin color segmentation, morphological operations (erosion), and eigenfaces has been used.

1. Introduction

A face detection algorithm is very specific to the kind of problem at hand and cannot be guaranteed to work until it is applied and results are obtained. We have followed a multiple-algorithm approach for face detection, which is in effect a series of simple rejection blocks. In designing the final algorithm, many different schemes were tried. The first step is skin segmentation, which is good enough to reject most of the data; it therefore forms the first stage of the final algorithm as well. Neural networks have also been applied (as described later) but are not included in the final algorithm. As the surviving data becomes more compact, we need more specific rejection classifiers. Fisher linear discriminants and template matching were found not to perform as well as the eigenface method, so the final version uses eigenface projection.
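The series-of-rejection-blocks architecture described above can be sketched as follows. This is a minimal illustration, not the report's implementation: the function name `cascade_detect` and the block interface (a predicate per stage) are assumptions introduced here for clarity.

```python
def cascade_detect(candidates, rejection_blocks):
    """Run candidate face regions through a series of rejection blocks.

    Each block is a function that returns True to keep a candidate.
    Deeper blocks are assumed to be more specific (and more expensive),
    so cheap tests run first and later stages see only the survivors.
    """
    for block in rejection_blocks:
        candidates = [c for c in candidates if block(c)]
        if not candidates:
            break  # everything rejected; nothing left to test
    return candidates


# Toy usage with two dummy rejection blocks over integer "candidates":
blocks = [lambda c: c > 0, lambda c: c % 2 == 0]
survivors = cascade_detect([-2, 1, 2, 3, 4], blocks)
```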
In the overall algorithm there are many parameters that have to be decided by experiment; they are chosen with respect to optimality of the result, runtime, etc.

2. Skin Color Segmentation

Skin color segmentation represents the RGB image in a transformed 3-D space in which the various skin colors lie close to each other within a small, constrained region. Skin color segmentation is simply the initial rejection stage. The smaller the region in which the skin values lie, the better the quality of the segmentation; better rejection means less data to handle in the next block. We have used skin color segmentation in the YCbCr space, which gives better results than the HSV space. The transformation equations from RGB to YCbCr space are given in Figure 1.

Figure 1. Transformation equation, RGB to YCbCr.

As shown in Figure 2, the skin color values form a fairly compact region, bounded by the lines whose equations are given in Figure 3.

Figure 2. Skin color values in YCbCr space.
Figure 3. Equations in YCbCr space that bound the skin values.

This skin segmentation does well at marking the areas where there actually is skin, i.e. faces, hands, etc. However, it also marks points on unwanted objects such as the wall, the bar, trees, and skin-colored jackets. These false positives cover a large area compared to the actual skin.

3. Neural Networks

The neural network is based on a histogram approach rather than on directly training the network on a fixed-size image. The network first converts the RGB image to YES space; the transformed skin color values have E and S components close to zero. The conversion is

    [Y]   [0.253  0.684  0.063] [R]
    [E] = [0.500 -0.500  0.000] [G]
    [S]   [0.250  0.250 -0.500] [B]

Figure 4. Transformation equation, RGB to YES.

The neural network takes as input an image much smaller than a group photograph, usually of a size that contains a single face, and tells whether the given image contains a face or not. We feed the network with blocks cut from the given image and, according to the output, either keep the block for further processing or reject it. Exact details are described later. The given image (or block of the image) is first converted to YES space, and then a histogram is constructed for each of the three dimensions. These histogram values are fed to the neural network. The number of histogram bins determines the complexity of the network; we have kept this value at 20. Since the neural operation is still expensive in terms of time, we must choose the inputs to the network judiciously. Because we already have the marked skin-segment data, we can use this information (skipping the unmarked blocks) to cut out blocks and give them as input to the network. Even so, this operation cannot be done for every marked pixel, so we traverse the image in blocks of size 25x25. The input block to the network has a size of around 60x90; once such a block has been classified as a face it is not fed to the network again, even if we traverse its lower part again, and if it is classified as a non-face we clear a 25x25 or 25x50 block. This neural network eliminated a good amount of falsely marked skin data. It almost never loses a face, but objects such as the wall and jackets are still detected.

Figure 5. A sample run of the neural network on the image.

In the final version of the algorithm we have not used the neural network, as we placed more emphasis on morphological operations.

4. Eigenfaces

We use eigenfaces for detection because the eigenspace formulation leads to a powerful alternative to standard techniques such as template matching or normalized correlation.
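The skin color segmentation stage of Section 2 can be sketched as below. This is a minimal illustration under two stated assumptions: the report's actual RGB-to-YCbCr equations (Figure 1) and skin-bounding lines (Figure 3) are not reproduced in this text, so the standard ITU-R BT.601 conversion and a commonly quoted rectangular Cb/Cr range are used as stand-ins, not the report's actual values.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an H x W x 3 RGB array (values 0-255) to YCbCr.

    Stand-in for the report's Figure 1: standard ITU-R BT.601
    full-range conversion with Cb/Cr offset by 128.
    """
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

def skin_mask(rgb, cb_range=(77, 127), cr_range=(133, 173)):
    """Mark pixels whose (Cb, Cr) fall inside a rectangular skin region.

    Stand-in for the report's Figure 3 bounding lines: the rectangular
    ranges here are placeholder values, not the report's bounds.
    """
    ycbcr = rgb_to_ycbcr(rgb)
    cb, cr = ycbcr[..., 1], ycbcr[..., 2]
    return ((cb >= cb_range[0]) & (cb <= cb_range[1]) &
            (cr >= cr_range[0]) & (cr <= cr_range[1]))
```

As in the report, the mask marks skin-toned pixels (faces, hands) but will also pass skin-colored background, which the later rejection stages must remove.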
The reconstruction error (or residual) of the eigenspace decomposition is an effective indicator of similarity. The residual error is easily computed using the projection coefficients and the original signal energy. The detection is equivalent to matching with a linear combination of eigentemplates and allows for a greater range of distortions in the input signal (including lighting, and moderate rotation and scale). We have used a training set of
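The eigenface detection idea described above, computing the residual from the projection coefficients and the original signal energy rather than by explicit reconstruction, can be sketched as follows. This is an illustrative PCA sketch, not the report's implementation; the function names and the choice of SVD for the decomposition are assumptions introduced here.

```python
import numpy as np

def train_eigenfaces(faces, k):
    """PCA on a set of flattened, equal-sized face patches.

    faces: N x D array, one flattened training face per row.
    k: number of eigenfaces to keep.
    Returns the mean face and the top-k orthonormal eigenfaces (k x D).
    """
    mean = faces.mean(axis=0)
    centered = faces - mean
    # Rows of vt are the orthonormal principal axes of the training set.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]

def residual_error(patch, mean, eigenfaces):
    """Distance from face space for one flattened patch.

    Because the eigenfaces are orthonormal, the residual is the original
    signal energy minus the energy of the projection coefficients, so no
    explicit reconstruction of the patch is needed.
    """
    x = patch - mean
    coeffs = eigenfaces @ x          # projection coefficients
    return float(x @ x - coeffs @ coeffs)
```

A low residual means the patch is well explained by the face subspace, so thresholding the residual acts as the final rejection block of the cascade.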