  1. Dynamic Facial Expression Recognition with Atlas Construction and Sparse Representation

In this paper, a new dynamic facial expression recognition method is proposed. Dynamic facial expression recognition is formulated as a longitudinal groupwise registration problem. The main contributions of this method lie in the following aspects: 1) subject-specific facial feature movements of different expressions are described by a diffeomorphic growth model; 2) a salient longitudinal facial expression atlas is built for each expression by a sparse groupwise image registration method, which can describe the overall facial feature changes among the whole population and can suppress the bias due to large intersubject facial variations; and 3) both the image appearance information in the spatial domain and the topological evolution information in the temporal domain are used to guide recognition by a sparse representation method. The proposed framework has been extensively evaluated on five databases for different applications: the extended Cohn-Kanade, MMI, FERA, and AFEW databases for dynamic facial expression recognition, and the UNBC-McMaster database for spontaneous pain expression monitoring. This framework is also compared with several state-of-the-art dynamic facial expression recognition methods. The experimental results demonstrate that the recognition rates of the new method are consistently higher than those of the other methods under comparison.


  1. Lossless Compression of JPEG Coded Photo Collections

The explosion of digital photos has posed a significant challenge to photo storage and transmission for both personal devices and cloud platforms. In this paper, we propose a novel lossless compression method to further reduce the size of a set of JPEG coded correlated images without any loss of information. The proposed method jointly removes inter/intra image redundancy in the feature, spatial, and frequency domains. For each collection, we first organize the images into a pseudo video by minimizing the global prediction cost in the feature domain. We then present a hybrid disparity compensation method to better exploit both the global and local correlations among the images in the spatial domain. Furthermore, the redundancy between each compensated signal and the corresponding target image is adaptively reduced in the frequency domain. Experimental results demonstrate the effectiveness of the proposed lossless compression method. Compared with the JPEG coded image collections, our method achieves average bit savings of more than 31%.
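
The step of arranging a collection into a pseudo video can be pictured as choosing, for every image, the reference from which it is cheapest to predict. The sketch below is only an illustration under stated assumptions: the pairwise cost matrix is taken as given, the choice of image 0 as the root is arbitrary, and Prim's minimum spanning tree algorithm stands in for whatever optimization the paper actually uses. The function name `prediction_tree` is hypothetical.

```python
def prediction_tree(cost):
    """Greedy minimum spanning tree (Prim) over a symmetric cost matrix.

    cost[i][j] is an assumed feature-domain cost of predicting image j
    from image i. Returns the coding order and each image's reference.
    """
    n = len(cost)
    parent = {0: None}  # image 0 is an arbitrary root with no reference
    order = [0]
    while len(order) < n:
        # Cheapest edge from an already ordered image to a new image.
        _, ref, new = min(
            (cost[i][j], i, j)
            for i in order
            for j in range(n)
            if j not in parent
        )
        parent[new] = ref
        order.append(new)
    return order, parent


def total_cost(cost, parent):
    """Sum of prediction costs along the chosen reference links."""
    return sum(cost[ref][img] for img, ref in parent.items() if ref is not None)
```

Reading the images in the returned order, each one predicted from its assigned reference, gives one possible pseudo video for the later compensation stages.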


  1. On the Capacity of the Intensity-Modulation Direct-Detection Optical Broadcast Channel

The capacity of the intensity-modulation direct-detection optical broadcast channel (OBC) is investigated, under both average and peak intensity constraints. An outer bound on the capacity region is derived by adapting Bergmans’ approach to the OBC. Inner bounds are derived by using superposition coding with either truncated-Gaussian (TG) distributions or discrete distributions. While the discrete distribution achieves higher rates, the TG distribution leads to a simpler representation of the achievable rate region. At high signal-to-noise ratio (SNR), it is shown that the TG distribution is nearly optimal. It achieves the symmetric-capacity within a constant gap (independent of SNR), which approaches half a bit as the number of users grows. It also achieves the capacity region within a constant gap. At low SNR, it is shown that on–off keying (OOK) with time-division multiple-access (TDMA) is optimal. This is interesting in practice since both OOK and TDMA have low complexity. At moderate SNR (typically [0,8] dB), a discrete distribution with a small alphabet size achieves fairly good performance.
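
For orientation, an intensity-modulation direct-detection broadcast channel with both constraints is commonly written as below. The notation is an assumption and is not taken from the abstract: $X$ is the transmitted intensity, $Y_k$ and $Z_k$ are the received signal and Gaussian noise at user $k$ of $K$, $A$ is the peak constraint, and $E$ is the average constraint.

```latex
Y_k = X + Z_k, \qquad Z_k \sim \mathcal{N}(0, \sigma_k^2), \qquad k = 1, \dots, K,
\qquad 0 \le X \le A, \qquad \mathbb{E}[X] \le E
```

The nonnegativity and peak constraints on $X$ are what separate this setting from the standard Gaussian broadcast channel with a power constraint, which is why input distributions such as truncated Gaussians or small discrete alphabets enter the analysis.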


  1. Pixel modeling using histograms based on fuzzy partitions for dynamic background subtraction

We propose a novel pixel-modeling approach for background subtraction using histograms based on strong uniform fuzzy partitions. In the proposed method, the temporal distribution of pixel values is represented by a histogram based on a triangular partition. The threshold for background segmentation is set adaptively according to the shape of the histogram. Histogram accumulation is controlled adaptively by a fuzzy controller under a supervised learning framework. Benefiting from the adaptive scheme, with no parameter tuning, the proposed algorithm functions well across a wide spectrum of challenging environments. The performance of the proposed method is evaluated against more than 20 state-of-the-art methods in complex outdoor environments, particularly in those consisting of highly dynamic backgrounds and camouflaged foregrounds. Experimental results confirm that the proposed method performs effectively in terms of both the true positive rate and the noise suppression ability. Further, it outperforms other state-of-the-art methods by a significant margin.
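
A histogram over a strong uniform triangular partition can be sketched as follows: each pixel value contributes to at most two adjacent bins, with memberships that sum to 1. This is a minimal illustration, not the paper's implementation. The bin centers, the function names, and the fixed `rate` are assumptions; in the paper the accumulation is controlled adaptively by a fuzzy controller, which is not modeled here.

```python
def triangular_memberships(value, centers):
    """Memberships of `value` in a uniform triangular fuzzy partition.

    `centers` must be evenly spaced and increasing. Returns a dict
    {bin_index: membership} for the two bins surrounding the value.
    """
    step = centers[1] - centers[0]
    # Clamp so that out-of-range values fall entirely into an end bin.
    value = min(max(value, centers[0]), centers[-1])
    pos = (value - centers[0]) / step
    left = min(int(pos), len(centers) - 2)
    frac = pos - left
    return {left: 1.0 - frac, left + 1: frac}


def accumulate(histogram, value, centers, rate=0.05):
    """Blend one observation into a pixel's normalized fuzzy histogram."""
    weights = triangular_memberships(value, centers)
    return [
        (1.0 - rate) * h + rate * weights.get(i, 0.0)
        for i, h in enumerate(histogram)
    ]


# Hypothetical 6-bin partition of 8-bit intensities.
centers = [0.0, 51.0, 102.0, 153.0, 204.0, 255.0]
histogram = [1.0 / len(centers)] * len(centers)
for observed in (100, 104, 98, 101):
    histogram = accumulate(histogram, observed, centers)
```

Because the memberships always sum to 1, the histogram stays normalized after every update, so its shape can be compared across pixels when choosing a segmentation threshold.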


  1. Layer-Based Approach for Image Pair Fusion

Recently, image pairs, such as noisy and blurred images or infrared and noisy images, have been considered as a solution to provide high-quality photographs under low lighting conditions. In this paper, a new method for decomposing the image pairs into two layers, i.e., the base layer and the detail layer, is proposed for image pair fusion. In the case of infrared and noisy images, simple naive fusion leads to unsatisfactory results due to the discrepancies in brightness and image structures between the image pair. To address this problem, a local contrast-preserving conversion method is first proposed to create a new base layer of the infrared image, which can have visual appearance similar to another base layer, such as the denoised noisy image. Then, a new way of designing three types of detail layers from the given noisy and infrared images is presented. To estimate the noise-free and unknown detail layer from the three designed detail layers, the optimization framework is modeled with residual-based sparsity and patch redundancy priors. To better suppress the noise, an iterative approach that updates the detail layer of the noisy image is adopted via a feedback loop. This proposed layer-based method can also be applied to fuse another noisy and blurred image pair. The experimental results show that the proposed method is effective for solving the image pair fusion problem.
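
The base/detail split that the method builds on can be sketched in a few lines. This is a generic two-layer decomposition under stated assumptions, not the paper's pipeline: a plain mean filter stands in for the smoothing or denoising step, images are nested lists of grayscale values, and the function names are hypothetical. Recombining a base layer from one image with a detail layer from the other is the naive fusion that the abstract describes as unsatisfactory for infrared and noisy pairs.

```python
def box_blur(image, radius=1):
    """Mean filter that averages only the in-bounds neighbors of each pixel."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += image[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def decompose(image, radius=1):
    """Split an image into a smooth base layer and a residual detail layer."""
    base = box_blur(image, radius)
    detail = [[p - b for p, b in zip(prow, brow)]
              for prow, brow in zip(image, base)]
    return base, detail


def recombine(base, detail):
    """Add a detail layer back onto a base layer."""
    return [[b + d for b, d in zip(brow, drow)]
            for brow, drow in zip(base, detail)]
```

By construction, `recombine(*decompose(image))` returns the original image, so any change in the fused result comes from how the two layers are modified or swapped before they are added back together.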


  1. Adaptive Pairing Reversible Watermarking

This letter revisits the pairwise reversible watermarking scheme of Ou et al., 2013. An adaptive pixel pairing that considers only pixels with similar prediction errors is introduced. This adaptive approach provides an increased number of pixel pairs where both pixels are embedded and decreases the number of shifted pixels. The adaptive pairwise reversible watermarking outperforms the state-of-the-art low embedding bit-rate schemes proposed so far.


  1. Adaptive Part-Level Model Knowledge Transfer for Gender Classification 

In this letter, we propose an adaptive part-level model knowledge transfer approach for gender classification of facial images based on Fisher vector (FV). Specifically, we first decompose the whole face image into several parts and compute the dense FVs on each face part. An adaptive transfer learning model is then proposed to reduce the discrepancies between the training data and the testing data for enhancing classification performance. Compared to the existing gender classification methods, the proposed approach is more adaptive to the testing data, which is quite beneficial to the performance improvement. Extensive experiments on several public domain face data sets clearly demonstrate the effectiveness of the proposed approach.


  1. Patch-Based Video Denoising With Optical Flow Estimation

A novel image sequence denoising algorithm is presented. The proposed approach takes advantage of the self-similarity and redundancy of adjacent frames. The algorithm is inspired by fusion algorithms, and as the number of frames increases, it tends to a pure temporal average. The use of motion compensation by regularized optical flow methods permits robust patch comparison in a spatiotemporal volume. The use of principal component analysis ensures the correct preservation of fine texture and details. An extensive comparison with the state-of-the-art methods illustrates the superior performance of the proposed approach, with improved texture and detail reconstruction.


  1. Fusion of Quantitative Image and Genomic Biomarkers to Improve Prognosis Assessment of Early Stage Lung Cancer Patients

This study aims to develop a new quantitative image feature analysis scheme and investigate its role along with two genomic biomarkers, namely protein expression of the excision repair cross-complementing 1 (ERCC1) genes and a regulatory subunit of ribonucleotide reductase (RRM1), in predicting cancer recurrence risk of Stage I non-small-cell lung cancer (NSCLC) patients after surgery. Methods: By using chest computed tomography images, we developed a computer-aided detection scheme to segment lung tumors and computed tumor-related image features. After feature selection, we trained a Naïve Bayesian network based classifier using 8 image features and a Multilayer Perceptron classifier using 2 genomic biomarkers to predict cancer recurrence risk. The two classifiers were trained and tested using a dataset with 79 Stage I NSCLC cases, a synthetic minority oversampling technique, and a leave-one-case-out validation method. A fusion method was also applied to combine the prediction scores of the two classifiers. Results: AUC (area under the ROC curve) values are 0.78±0.06 and 0.68±0.07 when using the image feature and genomic biomarker based classifiers, respectively. The AUC value significantly increased to 0.84±0.05 (p<0.05) when the two classifier-generated prediction scores were fused using an equal weighting factor. Conclusion: A quantitative image feature based classifier yielded significantly higher discriminatory power than a genomic biomarker based classifier in predicting cancer recurrence risk. Fusion of prediction scores generated by the two classifiers further improved prediction performance. Significance: We demonstrated a new approach that has potential to assist clinicians in more effectively managing Stage I NSCLC patients to reduce cancer recurrence risk.
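
Score fusion with an equal weighting factor amounts to averaging the two classifiers' outputs case by case and then evaluating the fused score. The sketch below illustrates that idea only. The function names are hypothetical, the classifiers themselves are not modeled, and the AUC is computed with the generic pairwise (Mann-Whitney) formulation, which may differ from the study's ROC fitting procedure.

```python
def fuse_scores(image_scores, genomic_scores, weight=0.5):
    """Weighted average of two per-case prediction scores.

    weight=0.5 corresponds to the equal weighting factor in the abstract.
    """
    return [weight * a + (1.0 - weight) * b
            for a, b in zip(image_scores, genomic_scores)]


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    case receives the higher score; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Under a leave-one-case-out protocol, each case's two scores would come from classifiers trained on the remaining cases, and `auc` would then be applied once to the full list of fused scores.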


  1. Optimum Co-Design for Spectrum Sharing Between Matrix Completion Based MIMO Radars and a MIMO Communication System

Spectrum sharing enables radar and communication systems to share the spectrum efficiently by minimizing mutual interference. Recently proposed multiple-input multiple-output radars based on sparse sensing and matrix completion (MIMO-MC), in addition to reducing communication bandwidth and power as compared to MIMO radars, offer a significant advantage for spectrum sharing. The advantage stems from the way the sampling scheme at the radar receivers modulates the interference channel from the communication system transmitters, rendering it symbol dependent and reducing its row space. This makes it easier for the communication system to design its waveforms in an adaptive fashion so that it minimizes the interference to the radar subject to meeting rate and power constraints. Two methods are proposed. First, based on the knowledge of the radar sampling scheme, the communication system transmit covariance matrix is designed to minimize the effective interference power (EIP) to the radar receiver, while maintaining certain average capacity and transmit power for the communication system. Second, a joint design of the communication transmit covariance matrix and the MIMO-MC radar sampling scheme is proposed, which achieves even further EIP reduction.


  1. Multivideo Object Cosegmentation for Irrelevant Frames Involved Videos

Even though there has been a large amount of previous work on video segmentation techniques, it is still a challenging task to extract video objects accurately without interaction, especially for videos that contain irrelevant frames (frames containing no common targets). In this paper, a novel multivideo object cosegmentation method is proposed to cosegment common or similar objects in the relevant frames of different videos. It includes three steps: 1) object proposal generation and clustering within each video; 2) weighted graph construction and common object selection; and 3) irrelevant frame detection and pixel-level segmentation refinement. We apply our method to challenging datasets, and exhaustive comparison experiments demonstrate the effectiveness of the proposed method.


  1. Multi-Viewpoint Panorama Construction with Wide-Baseline Images

We present a novel image stitching approach, which can produce visually plausible panoramic images from input taken from different viewpoints. Unlike previous methods, our approach allows wide baselines between images and non-planar scene structures. Instead of 3D reconstruction, we design a mesh-based framework to optimize alignment and regularity in 2D. By solving a global objective function consisting of alignment and a set of prior constraints, we construct panoramic images, which are locally as perspective as possible and yet nearly orthogonal in the global view. We improve composition and achieve good performance on misaligned areas. Experimental results on challenging data demonstrate the effectiveness of the proposed method.


  1. A Security-Enhanced Alignment-Free Fuzzy Vault-Based Fingerprint Cryptosystem Using Pair-Polar Minutiae Structures

Alignment-free fingerprint cryptosystems that perform matching using relative information between minutiae, e.g., local minutiae structures, are promising, because they can avoid the recogni