Research Journal of Recent Sciences, ISSN 2277-2502
Vol. 3(4), 20-26, April (2014), Res. J. Recent Sci.
International Science Congress Association

Predicting Gender Using Iris Images

Bansal A., Agarwal R. and Sharma R.K.
Department of ECE, G.L.A. University, Mathura, UP, INDIA
Department of EIED, Thapar University, Patiala, Punjab, INDIA
School of Mathematics and Computer Applications, Thapar University, Patiala, Punjab, INDIA

Available online at: www.isca.in, www.isca.me
Received 9th August 2013, revised 30th October 2013, accepted 29th November 2013

Abstract
Among the various biometric authentication systems, the iris recognition system is considered to be one of the more accurate and reliable. The main objective of these systems is to identify the user as authentic or as an imposter. These systems do not reveal the imposter's gender or ethnicity. The majority of gender classification practices utilize facial information. Very few references in the literature report the identification of human attributes such as gender with the help of iris images. In this paper, gender has been identified using iris images. The feature vector of an iris image is created by combining statistical features with texture features obtained using wavelets. A gender prediction model using a Support Vector Machine (SVM) has been developed and an accuracy of 85.68% has been achieved.

Keywords: Iris recognition; gender prediction; support vector machine; wavelet transform; statistical features.

Introduction
Human beings recognize gender far more easily than a machine does. Gender classification is an important problem of computer vision. Gender recognition is useful in the areas of forensic sciences, security systems, and many more. In recent years many researchers [1-4] have used facial images to classify gender, whereas only a few studies [5,6] of this nature have utilized the properties of iris images to predict gender.
Iris recognition, an accurate and reliable biometric authentication system, identifies an individual who has been enrolled. The user is identified as authentic or as an imposter. For security purposes, it is equally important to determine the gender of an imposter. Iris images possess distinct phase information which spans more than 200 degrees of freedom. Human attributes such as ethnicity and gender can be determined using these distinct features.

Khan et al. [8] presented a critical evaluation and comparative study of different techniques used for gender classification. Qiu et al. [9,10] proposed a model to determine ethnicity from iris images. Their model classifies an individual as Asian or non-Asian. They employed the K-means clustering algorithm and a Gabor filter bank to obtain commonly occurring fundamental texture elements. They reported a correct classification rate of 88.3% on the test set using an SVM classifier. Thomas et al. [6] claimed to be the first to predict gender from iris images. They used the IrisBEE software for segmentation and normalization and extracted texture features from the real component of log-Gabor-filtered normalized iris images. They also utilized certain geometrical features such as the area of the pupil, the difference between the centres of the iris and the pupil, and the difference between the areas of the iris and the pupil. Their model achieved an accuracy close to 80% using a C4.5 classifier. Lagree and Bowyer [5] predicted both ethnicity and gender from iris textures. They reported an accuracy of more than 90% for ethnicity determination and 62% for gender determination.

Here, significant features have been extracted from iris images using a statistical feature extraction technique along with a 2-D wavelet tree based feature extraction technique. The features extracted by the two techniques have been combined to form a feature vector. A binary SVM classifier has been used to predict gender from the feature vector created from an iris image.
Material and Methods
A gender classification/prediction model using iris images requires four steps, namely, iris image capturing, image pre-processing, feature extraction and classification. These steps are discussed in brief in the following sub-sections.

Iris Image capturing: In this work, the I-SCAN-2 Dual iris scanner of Cross Match Technologies (http://www.crossmatch.com/i-scan-2.php) has been used for capturing the iris images. The front view of the I-SCAN-2 is shown in figure-1. It uses near-infrared illumination and produces bmp images. Images of both irises are captured in a single scan. A dataset of 400 iris images of 200 subjects (100 females, 100 males) generated using the I-SCAN-2 has been considered for the experimentation.

Figure-1 I-SCAN-2 Dual iris scanner

Image pre-processing: The process of converting the image of an eye into a form from which the desired features can be extracted is known as image pre-processing. Initially, image localization/segmentation is performed to segment the iris region by detecting the inner and outer boundaries of the iris and removing the eyelids and eyelashes that may occlude the iris region. Here, this step is carried out using the circular Hough transform [11-14].

Eye images of a person captured at different instants may have dimensional inconsistencies. These occur due to stretching of the iris caused by pupil dilation under varying levels of illumination, varying imaging distance, rotation of the camera, head tilt, and rotation of the eye within the eye socket. After segmentation, normalization is carried out to remove these dimensional inconsistencies. Daugman proposed a homogeneous rubber sheet model for normalization in which the doughnut-shaped iris region is converted into a rectangular image with angular and radial resolutions [7,15].
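The rubber sheet mapping can be illustrated with a short sketch. The code below is a minimal illustration rather than the implementation used in this work: it assumes that segmentation has already produced two circles (pupil and iris boundaries), samples the annulus between them on a polar grid, and uses nearest-neighbour lookup. The function name, the default grid size and the synthetic test image are assumptions made for the example.

```python
import math


def rubber_sheet_normalize(image, pupil, iris, radial_res=8, angular_res=16):
    """Unwrap the iris annulus into a radial_res x angular_res rectangle.

    image : list of rows of grey levels
    pupil : (cx, cy, r) circle for the iris-pupil boundary
    iris  : (cx, cy, r) circle for the iris-sclera boundary
    radial_res must be at least 2. Nearest-neighbour sampling keeps the
    sketch short; an interpolating sampler could be substituted.
    """
    height, width = len(image), len(image[0])
    normalized = []
    for i in range(radial_res):
        # Fraction of the way from the pupil boundary (0) to the sclera boundary (1).
        t = i / (radial_res - 1)
        row = []
        for j in range(angular_res):
            theta = 2.0 * math.pi * j / angular_res
            # Points on each boundary circle along the direction theta.
            xp = pupil[0] + pupil[2] * math.cos(theta)
            yp = pupil[1] + pupil[2] * math.sin(theta)
            xs = iris[0] + iris[2] * math.cos(theta)
            ys = iris[1] + iris[2] * math.sin(theta)
            # Linear interpolation between the two boundary points.
            x = (1.0 - t) * xp + t * xs
            y = (1.0 - t) * yp + t * ys
            col = min(max(int(round(x)), 0), width - 1)
            line = min(max(int(round(y)), 0), height - 1)
            row.append(image[line][col])
        normalized.append(row)
    return normalized


# Synthetic check image: each pixel stores its rounded distance from (32, 32).
demo_eye = [[round(math.hypot(x - 32, y - 32)) for x in range(64)] for y in range(64)]
demo_strip = rubber_sheet_normalize(demo_eye, pupil=(32, 32, 5), iris=(32, 32, 20))
```

In the synthetic image the first row of `demo_strip` holds values near the pupil radius and the last row holds values near the iris radius, which is the behaviour expected of the unwrapped strip.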
In normalization, with the centre of the pupil as reference, radial vectors are drawn on the iris region extending from the iris-pupil boundary to the iris-sclera boundary. The number of data points selected along each radial line is defined as the radial resolution and the number of radial lines going around the iris region is defined as the angular resolution. In the 2-D normalized iris image, the number of rows corresponds to the radial resolution and the number of columns corresponds to the angular resolution. Here, the radial resolution is taken as 100 and the angular resolution as 500, so a 100 x 500 normalized iris image is obtained after normalization. Next, image enhancement is carried out to compensate for low contrast and a poor light source. Figures 2(a)-2(d) show the images at the different stages of image pre-processing.

Figure-2(a) Original eye image
Figure-2(b) Segmented iris image
Figure-2(c) Normalized iris image
Figure-2(d) Enhanced normalized iris image

Feature Extraction: Feature extraction is the most important task in applications employing image processing [15,16]. Different techniques such as Gabor filters [7,17], the wavelet transform [18], the Hilbert transform [13], cumulative sum based change analysis [19], etc. have been employed by various researchers to extract features from the iris. A feature vector created from these features is used for classification. In this work, the feature vector for an iris image is obtained by combining two different feature extraction techniques, namely, statistical features and the wavelet transform.

Statistical Features: Statistical features have been computed along two different directions, i.e., the angular direction and the radial direction. Every row of the 2-D normalized iris image corresponds to a virtual circle drawn on the iris region.
The top-most row of the 2-D array represents a circular ring near the iris-pupil boundary and the bottom-most row represents a circular ring near the iris-sclera boundary. Moving from the top row to the bottom row corresponds to moving along the radial direction. Statistical features computed along each row therefore correspond to features computed along virtual circles. Similarly, statistical features have been computed along each column of the 2-D array; the columns correspond to radial vectors drawn from the pupil to the sclera at different angles [20]. Moving from one vector to another corresponds to moving along the angular direction. The statistical features considered in this work are the mean, the median and the standard deviation. These features differ from those computed by Thomas et al. [6]. They computed the mean, standard deviation and variance on the real part of a log-Gabor-filtered normalized iris image, whereas here the statistical features have been computed directly from the pixel values of the normalized iris image. The statistical computation gives a feature vector F for an image. This vector is denoted as:

$$F = [\bar{X}_r,\ Md_r,\ s_r,\ \bar{X}_c,\ Md_c,\ s_c] \qquad (1)$$

where $\bar{X}_r$ is a row vector of the means computed along each row, $Md_r$ is a row vector of the medians computed along each row, $s_r$ is a row vector of the standard deviations computed along each row, $\bar{X}_c$ is a row vector of the means computed along each column, $Md_c$ is a row vector of the medians computed along each column, and $s_c$ is a row vector of the standard deviations computed along each column. For a 100 x 500 normalized iris image, each statistical feature computed along the rows has 100 elements and each statistical feature computed along the columns has 500 elements. Thus, the size of the statistical feature vector for an iris image is 3 x (100 + 500) = 1800.

2-D Wavelet tree: As discussed, another set of features is extracted from an iris image using the 2-D DWT (Discrete Wavelet Transform). An image gets decomposed into four sub-sampled images by the DWT.
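Before the wavelet features are developed further, the statistical feature vector F of equation (1) can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name is hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say whether the population or the sample form was used.

```python
import statistics


def statistical_features(normalized):
    """Flatten a 2-D normalized iris image into the vector F of equation (1).

    Order: row means, row medians, row standard deviations,
    then column means, column medians, column standard deviations.
    """
    rows = [list(r) for r in normalized]
    cols = [list(c) for c in zip(*rows)]  # transpose to get the columns
    features = []
    for vectors in (rows, cols):
        features.extend(statistics.mean(v) for v in vectors)
        features.extend(statistics.median(v) for v in vectors)
        # Assumption: population standard deviation; use statistics.stdev
        # instead if the sample form is wanted.
        features.extend(statistics.pstdev(v) for v in vectors)
    return features


# Tiny 2 x 3 example: 3 * (2 + 3) = 15 features.
demo_features = statistical_features([[1, 2, 3], [4, 5, 6]])
```

For an image with R rows and C columns the result has 3 x (R + C) elements, so a 100 x 500 input yields the 1800 elements stated in the text.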
An N x N image splits into four sub-images, each of size N/2 x N/2, containing information from different frequency components. After decomposition, the four sub-sampled images are approximation (LL), horizontal (HL), vertical (LH) and diagonal (HH). Note that:
i. the diagonal sub-image is high-pass filtered in both the horizontal and vertical directions,
ii. the vertical sub-image is low-pass filtered in the vertical direction and high-pass filtered in the horizontal direction,
iii. the horizontal sub-image is high-pass filtered in the vertical direction and low-pass filtered in the horizontal direction, and
iv. the approximation sub-image is low-pass filtered in both directions.

An image can be decomposed more than once using the DWT. There are mainly two ways of doing so, namely, pyramidal and packet decomposition. In pyramidal decomposition, further decompositions are applied only to the approximation (LL) sub-band, which is decomposed again at each level. In packet decomposition, the decomposition is not limited to the LL sub-band; all sub-bands may be decomposed further at each level. Figure-3 shows a three level pyramidal decomposition and figure-4 shows a two level packet decomposition. In this work, pyramidal decomposition at three levels is implemented.

Figure-3 Three level pyramidal decomposition
Figure-4 Two level packet decomposition

Different wavelet basis functions can be used for feature extraction. Here, a three level pyramidal decomposition using the db2 wavelet basis function has been considered for extracting features. Feature vectors may be created by considering different possible combinations of the LL, LH and HL components obtained at different levels of decomposition.
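The pyramidal scheme described above can be sketched as follows. To stay short and dependency-free, the sketch swaps in the Haar wavelet for the db2 basis used in this work, and pads odd dimensions by repeating the last row or column; both choices are assumptions made for illustration. With this rounding-up rule, a 100 x 500 input shrinks to 50 x 250, 25 x 125 and then 13 x 63, i.e. 13 x 63 = 819 coefficients per third-level sub-band.

```python
def haar_dwt2(image):
    """One level of a 2-D Haar DWT on a list-of-lists image.

    Returns (approximation, horizontal, vertical, diagonal), named as in
    the text. Odd dimensions are padded by repeating the last row/column
    (an assumption of this sketch).
    """
    rows = [list(r) for r in image]
    if len(rows[0]) % 2:
        rows = [r + [r[-1]] for r in rows]
    if len(rows) % 2:
        rows.append(list(rows[-1]))
    approx, horiz, vert, diag = [], [], [], []
    for i in range(0, len(rows), 2):
        top, bottom = rows[i], rows[i + 1]
        a_row, h_row, v_row, d_row = [], [], [], []
        for j in range(0, len(top), 2):
            a, b, c, d = top[j], top[j + 1], bottom[j], bottom[j + 1]
            a_row.append((a + b + c + d) / 2)  # low-pass in both directions
            h_row.append((a + b - c - d) / 2)  # low-pass horizontally, high-pass vertically
            v_row.append((a - b + c - d) / 2)  # high-pass horizontally, low-pass vertically
            d_row.append((a - b - c + d) / 2)  # high-pass in both directions
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, horiz, vert, diag


def pyramidal_decomposition(image, levels):
    """Repeatedly decompose only the approximation sub-band."""
    approx, details = image, []
    for _ in range(levels):
        approx, h, v, d = haar_dwt2(approx)
        details.append((h, v, d))
    return approx, details


# A 100 x 500 test pattern stands in for a normalized iris image.
pattern = [[(7 * r + 3 * c) % 256 for c in range(500)] for r in range(100)]
ll3, detail_levels = pyramidal_decomposition(pattern, 3)
ll3_size = len(ll3) * len(ll3[0])
```

A packet decomposition would differ only in that `haar_dwt2` would also be applied to the detail sub-bands at each level.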
Here an attempt has been made to study the effect on system performance of different combinations of the LL, HL and LH components obtained after the third level of decomposition. With a 100 x 500 normalized iris image, each third-level sub-band contributes 819 coefficients, so the wavelet feature vector has 819, 1638 or 2457 elements when one, two or three sub-band components are used. Adding the 1800 statistical features, the size of the combined feature vector (statistical and wavelet) is thus 2619, 3438 or 4257. These features have been further used for predicting the gender of subjects using an SVM classifier.

Classification: In the proposed model, a Support Vector Machine (SVM) has been used for classification. SVM, a binary classifier that optimally separates two classes, is based upon the statistical learning theory proposed by Cortes and Vapnik [21]. Burges [22] and Cristianini and Shawe-Taylor [23] provide in-depth information on SVM. In SVM, a hyper-plane is constructed as the decision surface in such a way that the margin of separation between positive and negative examples is maximized [23,24]. SVM employs a kernel based learning algorithm. The effectiveness of SVM depends upon the selection of the kernel and the kernel parameters. Some of the kernel functions are polynomial, Gaussian and Radial Basis Function (RBF).

Consider a linearly separable binary classification problem with $\mathbf{x}_i$ as a feature vector and $d_i$ as the corresponding label. The value of $d_i$ is 0 for class-1 and 1 for class-2. The set $\{(\mathbf{x}_i, d_i)\}$ represents pairs of feature vectors and corresponding labels, where i = 1, 2, 3, ..., N. SVM needs relatively little training data and generalizes well. The separating hyper-plane that defines the boundary between class-1 and class-2 is given by:

$$\mathbf{w} \cdot \mathbf{x} + b = 0 \qquad (2)$$

where $\mathbf{x}$ is an input feature vector, $\mathbf{w}$ is an adjustable weight vector, and $b$ is a bias. For the optimal hyper-plane, the optimum values of $\mathbf{w}$ and $b$ are determined under the condition that the margin of separation between positive and negative examples is maximized. For a given training set $\{(\mathbf{x}_i, d_i)\}$, the optimum weight and bias pair $(\mathbf{w}_o, b_o)$ must satisfy equations (3) and (4):

$$\mathbf{w}_o \cdot \mathbf{x}_i + b_o \ge +1 \quad \text{for positive examples} \qquad (3)$$

$$\mathbf{w}_o \cdot \mathbf{x}_i + b_o \le -1 \quad \text{for negative examples} \qquad (4)$$

The optimal hyper-plane so obtained can then be used for binary classification of any input feature vector $\mathbf{x}$. The data points that satisfy equation (3) or (4) with equality are known as support vectors.

Results and Discussion
In this work, experiments have been conducted on a dataset of 400 images. The image processing module of MATLAB 7.1 has been used to implement the gender classification model. To validate the experiments, the 10-fold cross validation technique has been employed: the complete dataset is divided into 10 equally sized groups, and each group is used for testing once while the other nine groups are used to train the classifier. Hence, there are 10 iterations of training and testing the SVM. The accuracy of the proposed model is measured using the Correct Classification Rate (CCR) independently for each iteration. The overall accuracy of the proposed model is the mean of the accuracies obtained from the individual iterations.

As discussed, two different feature extraction techniques, namely, statistical features and the wavelet transform, have been combined to generate the feature vector. The effect on system performance of different combinations of the LL, HL and LH components, obtained after the third level of decomposition, along with the same statistical features has been studied. The overall length of the feature vector so obtained is 2619, 3438 or 4257 for one, two or three sub-band components respectively. An attempt has also been made to study the effect of the kernel function on the accuracy of the proposed model. Experiments have been conducted with three different kernel functions for the SVM, namely, polynomial, Gaussian and radial basis function.
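The validation protocol described above can be sketched independently of the classifier. The code below is a minimal illustration, not the MATLAB implementation used in this work: the function names, the shuffling seed and the way folds are formed are assumptions. It splits 400 sample indices into 10 folds, defines the CCR as the percentage of correct predictions, and averages per-fold CCR values; the sample values are the Gaussian-kernel "Statistical + LL + HL" row of table-2.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each fold is the test set once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumed: random assignment to folds
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def correct_classification_rate(predicted, actual):
    """Percentage of samples whose predicted label equals the true label."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)


def overall_accuracy(fold_ccrs):
    """Mean of the per-iteration CCR values."""
    return sum(fold_ccrs) / len(fold_ccrs)


splits = list(k_fold_indices(400, k=10))

# Per-iteration CCR values (%) taken from table-2, row "Statistical + LL + HL".
gaussian_ll_hl = [84.6, 84.0, 84.8, 86.2, 84.2, 85.0, 86.8, 87.4, 86.8, 87.0]
mean_ccr = round(overall_accuracy(gaussian_ll_hl), 2)
```

In a full experiment, a classifier would be trained on each `train` list and its predictions on the matching `test` list passed to `correct_classification_rate`.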
Table-1 gives the accuracy of the proposed gender classification model for the polynomial kernel function, listing the accuracy of each iteration and the overall accuracy for each feature vector. It can be observed from table-1 that the CCR varies from 79.0% to 85.8% for this kernel function. The maximum overall accuracy of the model with the polynomial kernel function is 83.78%, for the feature vector generated by combining the LL and HL sub-band wavelet coefficients with the statistical features.

The accuracy of the proposed model with the Gaussian kernel function is given in table-2. One can note from table-2 that the CCR of the proposed gender classification model varies from 80.0% to 87.4% over the different feature vectors and iterations. The maximum overall accuracy obtained with the Gaussian kernel function is 85.68%, for the feature vector generated by combining the statistical features with the LL and HL sub-band components.

In another experiment, the system accuracy with the RBF kernel function is calculated and listed in table-3. Here, the system accuracy varies from 79.8% to 86.8% and the maximum overall accuracy obtained is 84.44%, for the feature vector generated by combining the statistical features with the LL and HL sub-band components.

The overall accuracy of the proposed model for feature vectors of different lengths and for the different kernel functions is shown as a bar chart in figure-5. From this it is evident that the maximum accuracy of 85.68% is obtained for the Gaussian kernel function with the feature vector generated by combining the LL and HL components of the decomposed image with the statistical features.
Table-1 Comparison of accuracy for different feature vectors with polynomial kernel function (all values in %)
Feature vector / Iteration       1     2     3     4     5     6     7     8     9    10    Overall
Statistical + LL              80.2  79.0  81.0  84.0  82.8  82.8  83.7  81.1  80.8  83.6    81.90
Statistical + HL              79.8  79.6  80.0  81.4  79.8  80.6  81.8  82.6  82.8  81.6    81.00
Statistical + LH              80.0  80.2  81.1  82.6  80.2  82.0  82.4  83.6  83.2  82.8    81.81
Statistical + LL + HL         82.6  81.8  82.5  84.6  82.5  83.2  84.6  85.2  85.0  85.8    83.78
Statistical + LL + LH         81.2  82.0  81.6  82.2  82.2  81.2  82.2  81.0  81.6  82.8    81.80
Statistical + HL + LH         80.6  80.2  79.9  81.0  80.2  80.6  81.6  82.2  82.4  82.0    81.07
Statistical + LL + HL + LH    79.8  80.1  79.8  80.8  81.0  82.2  82.8  81.2  80.8  82.8    81.13

Table-2 Comparison of accuracy for different feature vectors with Gaussian kernel function (all values in %)
Feature vector / Iteration       1     2     3     4     5     6     7     8     9    10    Overall
Statistical + LL              82.2  81.2  83.2  85.8  84.8  85.0  85.8  83.6  83.0  85.6    84.02
Statistical + HL              81.6  81.0  82.4  83.8  81.6  82.6  83.6  85.0  84.6  83.6    82.98
Statistical + LH              82.2  80.8  83.8  84.6  82.0  84.2  84.2  85.8  85.4  83.6    83.66
Statistical + LL + HL         84.6  84.0  84.8  86.2  84.2  85.0  86.8  87.4  86.8  87.0    85.68
Statistical + LL + LH         83.1  84.4  83.8  84.6  84.0  83.4  84.4  82.8  83.2  84.2    83.79
Statistical + HL + LH         81.8  82.0  82.0  83.4  82.0  82.8  83.8  84.2  84.6  83.8    83.04
Statistical + LL + HL + LH    81.6  81.8  80.0  82.0  83.2  84.6  85.0  82.8  83.0  84.6    82.86

Table-3 Comparison of accuracy for different feature vectors with RBF kernel function (all values in %)
Feature vector / Iteration       1     2     3     4     5     6     7     8     9    10    Overall
Statistical + LL              81.8  79.8  82.8  85.0  83.6  84.2  85.8  83.2  82.2  83.6    83.20
Statistical + HL              81.0  80.8  81.6  83.2  81.2  80.8  83.6  83.8  83.2  82.0    82.12
Statistical + LH              81.8  80.2  82.4  83.6  81.6  82.8  84.2  84.0  83.8  82.2    82.66
Statistical + LL + HL         83.2  83.0  83.0  84.6  83.6  83.6  86.8  85.2  85.2  86.2    84.44
Statistical + LL + LH         82.2  83.6  83.6  83.0  83.0  83.0  84.4  81.0  82.8  83.4    83.00
Statistical + HL + LH         80.4  81.8  81.4  83.2  81.2  82.6  83.8  83.2  84.0  82.6    82.42
Statistical + LL + HL + LH    80.8  80.6  79.8  81.2  82.8  84.0  84.2  81.8  82.8  84.0    82.20

Figure-5 Overall accuracy of the proposed model

System performance has also been measured in terms of specificity (1 - False Positive Rate) and sensitivity (1 - False Negative Rate) for each combination of feature vector and kernel function. The best specificity of the proposed model is 0.93 and the best sensitivity of the proposed model is 0.94.

Conclusion
A gender prediction model based on SVM using iris images, which combines statistical features with 2-D DWT based features, has been proposed and implemented in this work. Experiments have been conducted to study the effect on system performance of different combinations of the LL, HL and LH components, obtained after the third level of decomposition, along with the same statistical features. An attempt has also been made to study the effect of the kernel function on the accuracy of the proposed model. The maximum overall accuracy obtained is 83.78% with the polynomial kernel function, 85.68% with the Gaussian kernel function and 84.44% with the RBF kernel function.
For every kernel function, the maximum overall accuracy is obtained for the combination of the LL and HL components of the decomposed image along with the statistical features. The maximum accuracy of 85.68% obtained in this work is encouraging and indicates an improved gender classification model as compared to the iris-based models proposed earlier. Additional features and combinations of classifiers may be considered to improve the system accuracy further.

References
1. Chu W.S., Rong C. and Song Chen C., Identifying gender from unaligned facial images by set classification, In Proceedings of 20th IEEE International Conference on Pattern Recognition, 2636-2739 (2010)
2. Han X., Ugail H. and Pahnar I., Gender classification based on 3D face geometry features using SVM, In Proceedings of IEEE International Conference on Cyber World, 114-118 (2009)
3. Nazir M., Ishtaiq M., Batool A., Jaffar A. and Mirza A.M., Feature Selection for efficient gender classification, In Proceedings of WSEAS International conference, 70-75 (2010)
4. Rai P. and Khanna P., Gender classification using Radon and Wavelet Transform, In Proceedings of IEEE International Conference on Industrial and Information Systems, 448-451 (2010)
5. Lagree S. and Bowyer K.W., Predicting ethnicity and gender from iris texture, In Proceedings of IEEE International Conference on Technologies for Homeland Securities (HST), 440-445 (2011)
6. Thomas V., Chawla N., Bowyer K.W. and Flynn P.J., Learning to predict gender from iris images, In Proceedings of IEEE International Conference on Biometrics: Theory, Applications, and Systems (BTAS) (2007)
7. Daugman J., High confidence visual recognition of persons by a test of statistical independence, IEEE Transaction on Pattern Analysis and Machine Intelligence, 15, 1148-1161 (1993)
8. Khan S.A., Nazir M., Akram S. and Riaz N., Gender classification using image processing techniques: A survey, Multitopic Conference (INMIC), 2011 IEEE 14th International, 25-30 (2011)
9. Qiu X.C., Sun Z.A. and Tan T.N., Global texture analysis of iris images for ethnic classification, Springer LNCS 3832: International Conference on Biometrics, 411-418 (2006)
10. Qiu X.C., Sun Z.A. and Tan T.N., Learning appearance primitives of iris images for ethnic classification, In Proceedings of IEEE International Conference on Image Processing (ICIP), II, 405-408 (2007)
11. Kong W. and Zhang D., Accurate iris segmentation based on novel reflection and eyelash detection model, In Proceedings of 2001 International Symposium on Intelligent Multimedia, Video and Speech Processing, 263-266 (2001)
12. Ma L., Wang Y. and Tan T., Iris recognition using circular symmetric filters, In Proceedings of 16th International Conference on Pattern Recognition, 414-417 (2002)
13. Tisse C., Martin L., Torres L. and Robert M., Person identification technique using human iris recognition, In Proceedings of International Conference on Vision Interface, 294-299 (2002)
14. Wildes R., Asmuth J., Green G., Hsu S., Kolczynski R., Matey J. and McBride S., A system for automated iris recognition, In Proceedings IEEE Workshop on Applications of Computer Vision, 121-128 (1994)
15. Zahedi Morteza and Mohamadian Zahra, A Fully Automatic and Haar like Feature Extraction-Based Method for Lip Contour Detection, Research Journal of Recent Sciences, 2(1), 17-20 (2013)
16. Mohamadian Zahra, Image Duplication Forgery Detection using Two Robust Features, Research Journal of Recent Sciences, 1(12), 1-6 (2012)
17. Daugman J., How iris recognition works, IEEE Transaction on CSVT, 14(1), 21-30 (2004)
18. Bodade R.M. and Talbar S.N., Shift invariant iris feature extraction using rotated complex wavelet and complex wavelet for iris recognition system, In Proceedings of Seventh International Conference on Advances in Pattern Recognition, 449-452 (2009)
19. Ko J.G., Gil Y.H. and Yoo J.H., Iris recognition using cumulative sum based change analyses, International Symposium on Intelligent Signal Processing and Communication System, 275-278 (2006)
20. Bansal A., Agarwal R. and Sharma R.K., SVM based gender classification using iris images, In Proceedings of 4th IEEE International Conference on Computational Intelligence and Computer Networks, 425-429 (2012)
21. Cortes C. and Vapnik V., Support vector networks, Machine Learning, Kluwer Academic Publishers, Boston, 273-97 (1995)
22. Burges C.J.C., A Tutorial on Support Vector Machines for Pattern Recognition, Kluwer Academic Publishers, Boston (1998)
23. Cristianini N. and Shawe-Taylor J., An Introduction to Support Vector Machines and other kernel-based Learning Methods, Cambridge University Press, Cambridge (2000)
24. Haykin S., Neural Networks - A comprehensive foundation, Pearson Education, 2nd ed. (2004)