Projects

Overview

The current projects in our lab:
Graphical models for predicting missing/incomplete biographic data in biometric records
Autoencoders for imparting privacy to biometric images
Image Phylogeny Tree (IPT) for digital forensics
Soft biometrics for iris analysis
Sensor forensics for near-infrared iris imagery
Presentation attack detection in biometrics
Detecting stem cells in MRI
Speaker recognition from degraded audio samples
Generalized Additive Models for biometric fusion

Graphical models for predicting missing/incomplete biographic data in biometric records

Description

In classical face recognition, an input probe image is compared against a gallery of labeled face images in order to determine its identity. In most applications, the gallery images (identities) are assumed to be independent of each other, i.e., the relationship between gallery images is not exploited during the face recognition process. We propose a graph-based approach in which gallery images are used to generate a network structure where the nodes correspond to individual identities (and consist of face images as well as biographic attributes such as gender, ethnicity, name, etc.) and the edge weights define the degree of similarity between two such nodes. One application of the graph-based gallery is the prediction of biographic attributes. We use the graph structure to model the relationship between the biometric records in a database, and then show the benefits of such a graph in deducing the biographic labels of incomplete records, i.e., records that have missing biographic data.
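
To make the idea concrete, here is a minimal label propagation sketch over a record-similarity graph. The spreading rule (a Zhou et al.-style iteration), the toy edge weights, and the convergence settings are illustrative assumptions, not the exact formulation in the publications below.

```python
import numpy as np

def propagate_labels(W, y, alpha=0.85, iters=100):
    """Propagate known labels over a similarity graph.

    W : (n, n) non-negative edge weights (e.g., face match scores).
    y : length-n array of class indices for labeled records, -1 if missing.
    Returns a predicted class index for every record.
    """
    n = len(y)
    classes = np.unique(y[y >= 0])
    # One-hot seed matrix; records with missing labels start as all zeros.
    Y0 = np.zeros((n, classes.size))
    for j, c in enumerate(classes):
        Y0[y == c, j] = 1.0
    # Symmetrically normalize the weight matrix.
    d = W.sum(axis=1)
    d[d == 0] = 1.0
    S = W / np.sqrt(np.outer(d, d))
    # Iterate F <- alpha * S F + (1 - alpha) * Y0.
    F = Y0.copy()
    for _ in range(iters):
        F = alpha * (S @ F) + (1 - alpha) * Y0
    return classes[F.argmax(axis=1)]

# Toy gallery of 4 records; gender is known for records 0 and 3 only.
W = np.array([[0.0, 0.9, 0.1, 0.0],
              [0.9, 0.0, 0.2, 0.1],
              [0.1, 0.2, 0.0, 0.8],
              [0.0, 0.1, 0.8, 0.0]])
y = np.array([0, -1, -1, 1])   # 0 = male, 1 = female, -1 = missing
print(propagate_labels(W, y))  # -> [0 0 1 1]
```

The two unlabeled records inherit the labels of the identities they are most strongly connected to, which is the intuition behind deducing missing biographic data from the graph.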

Publications

T. Swearingen and A. Ross, "A label propagation approach for predicting missing biographic labels in face-based biometric records," IET Biometrics, Vol. 7, Issue 1, pp. 71 - 80, 2018.

T. Swearingen and A. Ross, "Predicting Missing Demographic Information in Biometric Records using Label Propagation Techniques," Proc. of the 15th International Conference of the Biometrics Special Interest Group (BIOSIG 2016), (Darmstadt, Germany), September 2016.

Autoencoders for imparting privacy to biometric images

Description

In this project, we design and evaluate a convolutional autoencoder that perturbs an input face image to impart privacy to a subject. Specifically, the proposed autoencoder transforms an input face image such that the transformed image can be successfully used for face recognition but not for gender classification. In order to train this autoencoder, we propose a novel training scheme, referred to as semi-adversarial training. The training is facilitated by attaching a semi-adversarial module consisting of a pseudo gender classifier and a pseudo face matcher to the autoencoder. The objective function utilized for training this network has three terms: one to ensure that the perturbed image is a realistic face image; another to ensure that the gender attributes of the face are confounded; and a third to ensure that biometric recognition performance due to the perturbed image is not impacted. Extensive experiments confirm the efficacy of the proposed architecture in extending gender privacy to face images.
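
A minimal PyTorch sketch of the three-term objective is shown below. The specific loss forms (L1 reconstruction, cross-entropy toward a 0.5 gender posterior, cosine distance between matcher embeddings) and the term weights are plausible stand-ins, not the exact losses used in the papers.

```python
import torch
import torch.nn.functional as F

def semi_adversarial_loss(x, x_perturbed, gender_logits, emb_orig, emb_pert,
                          w_pix=1.0, w_gender=1.0, w_match=1.0):
    """Combine the three terms described above (sketch).

    x, x_perturbed : input image and autoencoder output.
    gender_logits  : pseudo gender classifier applied to x_perturbed.
    emb_orig/pert  : pseudo face matcher embeddings of x and x_perturbed.
    """
    # Term 1: the perturbed output should remain a realistic face image.
    loss_pix = F.l1_loss(x_perturbed, x)
    # Term 2: push the gender prediction toward maximum uncertainty (p = 0.5).
    p = torch.sigmoid(gender_logits)
    loss_gender = F.binary_cross_entropy(p, torch.full_like(p, 0.5))
    # Term 3: keep matcher embeddings close so recognition is unaffected.
    loss_match = 1.0 - F.cosine_similarity(emb_orig, emb_pert).mean()
    return w_pix * loss_pix + w_gender * loss_gender + w_match * loss_match

# Smoke test with random tensors standing in for real network outputs.
x = torch.rand(2, 3, 224, 224)
xp = x + 0.01 * torch.randn_like(x)
loss = semi_adversarial_loss(x, xp, torch.randn(2, 1),
                             torch.randn(2, 512), torch.randn(2, 512))
print(loss.item())
```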

Publications

V. Mirjalili, S. Raschka, A. Namboodiri, A. Ross, "Semi-Adversarial Networks: Convolutional Autoencoders for Imparting Privacy to Face Images," Proc. of 11th IAPR International Conference on Biometrics (ICB 2018), (Gold Coast, Australia), February 2018. [To Appear]

V. Mirjalili and A. Ross, "Soft Biometric Privacy: Retaining Biometric Utility of Face Images while Perturbing Gender," Proc. of International Joint Conference on Biometrics (IJCB), (Denver, USA), October 2017.

Image Phylogeny Tree (IPT) for digital forensics

Description

Consider an image that is subjected to a sequence of simple photometric transformations such as gamma correction, histogram equalization, brightness and contrast adjustment, etc. This results in a family of transformed images. Given a set of such "near-duplicate" images, we develop a method that automatically deduces the relationship between these images and constructs an Image Phylogeny Tree (IPT) that captures their evolutionary structure (i.e., which image originated from which one). We advance the state of the art by (a) first using appropriate basis functions to model the relationship between every image pair, and (b) then using the estimated parameters of the model to characterize the relationship between the image pair. This work caters to applications in digital image forensics.
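
The sketch below conveys the pipeline under simplifying assumptions: a monomial basis stands in for the basis functions, the pairwise fit residual serves as an asymmetric cost, the root is fixed at node 0, and a greedy attachment step replaces the actual tree-construction procedure.

```python
import numpy as np

def fit_cost(src, dst, degree=3):
    """Fit a polynomial pixel-intensity mapping src -> dst and return the
    mean squared residual of the fit as an asymmetric pair cost."""
    s, d = src.ravel().astype(float), dst.ravel().astype(float)
    coeffs = np.polyfit(s, d, degree)
    return float(np.mean((d - np.polyval(coeffs, s)) ** 2))

def phylogeny_edges(images):
    """Greedy IPT sketch: compute costs for every ordered pair, then
    repeatedly attach the cheapest unattached image to the growing tree."""
    n = len(images)
    cost = np.full((n, n), np.inf)
    for i in range(n):
        for j in range(n):
            if i != j:
                cost[i, j] = fit_cost(images[i], images[j])
    in_tree, edges = {0}, []   # node 0 assumed to be the root in this sketch
    while len(in_tree) < n:
        i, j = min(((i, j) for i in in_tree
                    for j in range(n) if j not in in_tree),
                   key=lambda e: cost[e])
        edges.append((i, j))
        in_tree.add(j)
    return edges

# Toy family: a random "original" and a gamma-corrected descendant.
rng = np.random.default_rng(0)
root = rng.random((32, 32))
child = root ** 0.8
print(phylogeny_edges([root, child]))  # -> [(0, 1)]
```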

Publications

S. Banerjee and A. Ross, "Computing an Image Phylogeny Tree from Photometrically Modified Iris Images," Proc. of International Joint Conference on Biometrics (IJCB), (Denver, USA), October 2017.

Soft biometrics for iris analysis

Description

Recent research has explored the possibility of automatically deducing the sex of an individual based on near infrared (NIR) images of the iris. Previous articles on this topic have extracted and used only the iris region, while most operational iris biometric systems typically acquire the extended ocular region for processing. Therefore, in this work, we investigate the sex-predictive accuracy associated with four different regions: (a) the extended ocular region; (b) the iris-excluded ocular region; (c) the iris-only region; and (d) the normalized iris-only region. We employ the BSIF (Binarized Statistical Image Feature) texture operator to extract features from these regions, and train a Support Vector Machine (SVM) to classify the extracted feature set as Male or Female. Experiments on a dataset containing 3314 images suggest that the iris region only provides modest sex-specific cues compared to the surrounding periocular region. This research further underscores the importance of using the periocular region in iris recognition systems.
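
A BSIF-flavored sketch of the feature extraction and classification stage follows. True BSIF uses ICA-learned filters; random filters stand in here, and the images and labels are synthetic.

```python
import numpy as np
from scipy.signal import convolve2d
from sklearn.svm import SVC

def bsif_like_histogram(img, filters):
    """BSIF-style descriptor: convolve with a filter bank, binarize each
    response, pack the bits into a per-pixel code, and histogram the codes."""
    code = np.zeros(img.shape, dtype=int)
    for bit, f in enumerate(filters):
        resp = convolve2d(img, f, mode='same', boundary='symm')
        code |= (resp > 0).astype(int) << bit
    hist, _ = np.histogram(code, bins=2 ** len(filters),
                           range=(0, 2 ** len(filters)))
    return hist / hist.sum()

rng = np.random.default_rng(0)
filters = [rng.standard_normal((7, 7)) for _ in range(8)]   # 8-bit codes
# Synthetic stand-ins for ocular-region images and their sex labels.
X = [bsif_like_histogram(rng.random((64, 64)), filters) for _ in range(20)]
y = rng.integers(0, 2, size=20)        # 0 = Male, 1 = Female
clf = SVC(kernel='linear').fit(X, y)
print(clf.predict(X[:3]))
```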

Publications

D. Bobeldyk and A. Ross, "Predicting Eye Color from Near Infrared Iris Images," Proc. of 11th IAPR International Conference on Biometrics (ICB 2018), (Gold Coast, Australia), February 2018. [To Appear]

D. Bobeldyk and A. Ross, "Iris or Periocular? Exploring Sex Prediction from Near Infrared Ocular Images," Proc. of the 15th International Conference of the Biometrics Special Interest Group (BIOSIG), (Darmstadt, Germany), September 2016.

Sensor forensics for near-infrared iris imagery

Description

The field of digital image forensics concerns itself with the task of validating the authenticity of an image or determining the device that produced the image. Device or sensor identification can be accomplished by estimating sensor-specific pixel artifacts, such as Photo Response Non-Uniformity (PRNU), that leave an imprint in the resulting image. Research in this field has predominantly focused on images obtained using sensors operating in the visible spectrum. Iris recognition systems, on the other hand, utilize sensors operating in the near-infrared (NIR) spectrum. In this research, we focus on sensor identification in the context of near-infrared iris images. This work can be extended to investigate adversarial influences that deliberately perturb the sensor-specific artifacts and deter sensor identification algorithms.
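
The sketch below shows the standard PRNU workflow (residual extraction, fingerprint estimation, correlation-based matching) under simplifying assumptions: a Gaussian filter replaces the wavelet denoiser usually employed, and the sensors are simulated.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def noise_residual(img, sigma=1.5):
    """Noise residual W = I - F(I); a Gaussian filter stands in for the
    denoising filter F used in the PRNU literature."""
    return img - gaussian_filter(img, sigma)

def estimate_fingerprint(images):
    """Basic PRNU estimate from several images of the same sensor:
    K ~ sum(W_i * I_i) / sum(I_i ** 2)."""
    num = sum(noise_residual(i) * i for i in images)
    den = sum(i * i for i in images)
    return num / np.maximum(den, 1e-8)

def identify_sensor(img, fingerprints):
    """Pick the sensor whose fingerprint correlates best with the residual."""
    def ncc(a, b):
        a, b = a - a.mean(), b - b.mean()
        return float((a * b).sum() /
                     (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
    w = noise_residual(img)
    scores = [ncc(w, img * k) for k in fingerprints]
    return int(np.argmax(scores)), scores

# Two simulated sensors, each with its own multiplicative PRNU pattern.
rng = np.random.default_rng(0)
K_true = [0.02 * rng.standard_normal((64, 64)) for _ in range(2)]
shots = lambda k: [np.clip(0.5 + 0.1 * rng.standard_normal((64, 64)), 0, 1)
                   * (1 + k) for _ in range(10)]
fps = [estimate_fingerprint(shots(k)) for k in K_true]
probe = 0.5 * (1 + K_true[1])          # flat-field image from sensor 1
print(identify_sensor(probe, fps)[0])  # -> 1
```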

Publications

S. Banerjee and A. Ross, "From Image to Sensor: Comparative Evaluation of Multiple PRNU Estimation Schemes for Identifying Sensors from NIR Iris Images," 5th International Workshop on Biometrics and Forensics (IWBF), (Coventry, UK), April 2017.

Presentation attack detection in biometrics

Description

This work addresses the problem of presentation attacks against iris recognition systems. Iris recognition systems attempt to recognize individuals based on their iris patterns typically acquired in the near-infrared spectrum. However, it is possible for an adversarial user to circumvent the system by presenting a deliberately modified iris pattern or a fake iris pattern. These are called presentation attacks (PAs). Examples of PAs include (1) using printed images of another person’s iris, (2) presenting a fake eye, (3) displaying an eye image on a Kindle, or (4) wearing cosmetic contact lenses to mask one’s own iris pattern. To detect such attacks, we develop a deep convolutional neural network (CNN) that can determine if an input eye image corresponds to an attack or not. By sampling patches from the images, the proposed CNN is able to extract discriminatory features for effective presentation attack detection. Upon testing our algorithms on several image datasets of real and fake eyes, we observed True Detection Rates as high as 100% at a False Detection Rate of 0.2% in both intra-dataset and cross-dataset experiments.
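
A schematic of the patch-based scoring idea is given below, with an intentionally tiny CNN and a uniform patch grid standing in for the actual architecture and patch-sampling strategy.

```python
import torch
import torch.nn as nn

class PatchPADNet(nn.Module):
    """Small CNN that scores a 32x32 grayscale patch as live vs. attack."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.head = nn.Linear(32 * 8 * 8, 1)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

def score_image(net, image, patch=32, stride=32):
    """Tile the eye image into patches, score each patch, and fuse the
    per-patch scores by averaging; image is a (1, H, W) tensor."""
    patches = image.unfold(1, patch, stride).unfold(2, patch, stride)
    patches = patches.reshape(-1, 1, patch, patch)
    with torch.no_grad():
        return torch.sigmoid(net(patches)).mean().item()

net = PatchPADNet()               # untrained; for shape checking only
eye = torch.rand(1, 128, 128)     # stand-in for an NIR eye image
print(score_image(net, eye))      # closer to 1 -> attack, by this convention
```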

Detecting stem cells in MRI

Description

A supervised learning system requires labeled data during the training phase. Obtaining labels can be an expensive process, especially in medical imaging applications where a qualified expert may be needed to carefully analyze images and annotate them. This constrains the amount of labeled data available. This study explores the possibility of incorporating labeling behavior (viz., labeling latency) in a supervised convolutional neural network (CNN) framework in order to improve its performance in the presence of limited labeled data. The problem of “spot” detection in MRI scans is considered in this work. In this two-class problem, (a) labeling behavior is available only during the training phase unlike traditional features that are available both during training and testing; and (b) the labeling behavior is associated with only one class (the positive samples) unlike other side information that is available for all classes. To address these issues, a new CNN architecture referred to as L-CNN is designed. The proposed method utilizes the labeling behavior of the expert to cluster the labeled data into multiple categories; a source CNN is then trained to distinguish between these categories. Next, a transfer learning paradigm is used where a target CNN is initialized using this source CNN and its weights updated with the limited labeled data that is available. Experimental results on an existing MRI database show that the proposed L-CNN performs better than a conventional CNN and, further, significantly outperforms the previous state-of-the-art, thereby establishing a new baseline for “spot” detection in MRI.
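
The overall flow can be summarized in a short sketch; the latency clustering, the toy network, and the weight transfer below are schematic stand-ins for the actual L-CNN design.

```python
import numpy as np
import torch.nn as nn
from sklearn.cluster import KMeans

# Step 1: group labeled samples by how long the expert took to label them
# (labeling latency); the cluster ids become the source-task labels.
latencies = np.array([[0.4], [0.5], [2.1], [2.3], [0.6], [1.9]])  # toy seconds
source_labels = KMeans(n_clusters=2, n_init=10).fit_predict(latencies)

def make_cnn(n_out):
    return nn.Sequential(
        nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
        nn.Flatten(), nn.Linear(8 * 16, n_out))

# Step 2: train a source CNN to distinguish the latency-derived categories
# (training loop omitted), then initialize the target CNN from its weights.
source = make_cnn(n_out=2)
target = make_cnn(n_out=2)                    # final task: spot vs. non-spot
target.load_state_dict(source.state_dict())   # transfer-learning step
# Step 3: fine-tune `target` on the limited spot/non-spot labels.
```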

Publications

M. J. Afridi, A. Ross, E. Shapiro, "On Automated Source Selection for Transfer Learning in Convolutional Neural Networks," Pattern Recognition, Vol. 73, pp. 65 - 75, January 2018.

M. J. Afridi, A. Ross, X. Liu, M. Bennewitz, D. Shuboni, E. Shapiro, "Intelligent and Automatic in vivo Detection and Quantification of Transplanted Cells in MRI," Magnetic Resonance in Medicine, Vol. 78, Issue 5, pp. 1991 - 2002, November 2017.

M. J. Afridi, A. Ross, E. Shapiro, "L-CNN: Exploiting Labeling Latency in a CNN Learning Framework," Proc. of 23rd International Conference on Pattern Recognition (ICPR), (Cancun, Mexico), December 2016.

M. J. Afridi, X. Liu, E. M. Shapiro, A. Ross, "Automatic in vivo Cell Detection in MRI," Proc. of 18th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), (Munich, Germany), October 2015.

M. J. Afridi, M. Latourette, M. F. Bennewitz, A. Ross, X. Liu, E. M. Shapiro, "Machine Learning and Computer Vision based Quantification of Cell Number in MRI-based Cell Tracking," Proc. of 23rd Annual Meeting of the International Society for Magnetic Resonance in Medicine (ISMRM), (Toronto, Canada), June 2015.

Speaker recognition from degraded audio samples

Description

Detecting human speech and identifying its source, i.e., the speaker, is an active area of research in the machine learning and biometrics communities. As with other types of digital signals such as images and video, an audio signal can undergo degradations during its generation, propagation, and recording. Identifying the speaker from such degraded speech data is a challenging task and an open research problem. In this research project, we develop deep-learning-based algorithms for speaker recognition from degraded audio signals. We use speech features such as Mel-Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC) to represent the audio signals, and design convolutional neural networks (CNNs) that learn speaker-dependent features from the MFCC and LPC based representations to perform speaker recognition.
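
As a sketch, the snippet below computes MFCC features with librosa and pushes them through a tiny CNN; the architecture and all parameters are illustrative, not those of the published system.

```python
import numpy as np
import librosa
import torch
import torch.nn as nn

# A synthetic one-second waveform stands in for a (possibly degraded) utterance.
sr = 16000
wave = np.random.randn(sr).astype(np.float32)
mfcc = librosa.feature.mfcc(y=wave, sr=sr, n_mfcc=20)   # (20, frames)

class SpeakerCNN(nn.Module):
    """Tiny CNN over the MFCC time-frequency map; outputs speaker logits."""
    def __init__(self, n_speakers):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_speakers))

    def forward(self, x):
        return self.net(x)

model = SpeakerCNN(n_speakers=10)
x = torch.from_numpy(mfcc)[None, None]   # shape (1, 1, 20, frames)
print(model(x).shape)                    # -> torch.Size([1, 10])
```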

Publications

A. Chowdhury and A. Ross, "Extracting Sub-glottal and Supra-glottal Features from MFCC using Convolutional Neural Networks for Speaker Identification in Degraded Audio Signals," Proc. of International Joint Conference on Biometrics (IJCB), (Denver, USA), October 2017.

Generalized Additive Models for biometric fusion

Description

Biometric systems recognize individuals based on their biological attributes such as face, fingerprints, or iris. However, in several scenarios, additional information such as gender, ethnicity, etc. may be available. Previous literature shows the impact of demographic factors on recognition performance, but there is limited work on systematically incorporating them into the biometric matching framework. In fact, most of the current approaches primarily use demographic data in the context of biometric identification, by restricting the search to only those identities in the database having the same demographic characteristics as the input probe. In this work, we develop a principled approach to combine demographic data with biometric match scores via a Generalized Additive Model (GAM) that is applicable to the biometric verification scenario. The proposed GAM learns a set of penalized spline-based transformation functions that describe the relationship between match scores and demographic factors. In this regard, the proposed approach has three significant benefits. One, it can be used to predict if the use of specific demographic information (e.g., ethnicity) can improve the accuracy of a specific biometric matcher (e.g., a fingerprint matcher). Two, it can more effectively combine demographic information with match scores by utilizing the aforementioned transformation functions. Three, the use of spline-based transformation functions enhances the resilience of the fusion framework to incorrect demographic labels. The proposed scheme is extensively evaluated on three databases: the MORPH face database, the LFW face database, and the WVU database consisting of face and fingerprint modalities. Experimental results with cross-validation clearly convey the benefits of the proposed GAM scheme.
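
A GAM-flavored sketch using scikit-learn follows: a spline basis expansion on the match score plus a one-hot demographic term, combined in a penalized logistic model. This mirrors the additive structure described above, but it is not the exact penalized-spline GAM from this work, and the data are synthetic.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, SplineTransformer

# Toy verification data: column 0 = match score, column 1 = demographic group.
rng = np.random.default_rng(0)
n = 400
scores = rng.random(n)
group = rng.integers(0, 3, size=n)   # e.g., an encoded demographic label
genuine = (scores + 0.05 * group + 0.1 * rng.standard_normal(n)) > 0.7

X = np.column_stack([scores, group])
# Additive model: a smooth transformation of the score plus a per-group
# offset; the L2 penalty in LogisticRegression plays the role of the
# spline penalty in a proper GAM.
gam = make_pipeline(
    ColumnTransformer([
        ('score_spline', SplineTransformer(degree=3, n_knots=8), [0]),
        ('group_onehot', OneHotEncoder(), [1]),
    ]),
    LogisticRegression(C=1.0, max_iter=1000),
)
gam.fit(X, genuine)
print(gam.predict_proba(X[:3])[:, 1])   # fused genuine-match probabilities
```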

Publications