SIGMAP 2015 Abstracts


Area 1 - Multimedia Signal Processing

Full Papers
Paper Nr: 3
Title:

Compressed Domain ECG Biometric Identification using JPEG2000

Authors:

Yi-Ting Wu, Hung-Tsai Wu and Wen-Whei Chang

Abstract: In wireless telecardiology applications, electrocardiogram (ECG) signals are often represented in compressed format for efficient transmission and storage. Incorporating biometrics based on the compressed ECG enables faster person identification, as it bypasses full decompression. This study presents a new method that combines ECG biometrics with data compression within a common JPEG2000 framework. To this end, the ECG signal is treated as an image and the JPEG2000 standard is applied for data compression. Features relating to ECG morphology and heartbeat intervals are computed directly from the compressed ECG. Different classification approaches are used for person identification. Experiments on standard ECG databases demonstrate the validity of the proposed system for biometric identification, with high accuracies on both healthy and diseased subjects.

Paper Nr: 9
Title:

An Adaptive Reconstruction Algorithm for Image Block Compressed Sensing under Low Sampling Rate

Authors:

Cai Xu, Xie Zheng-Guang, Huang Hong-Wei and Jiang Xiao-Yan

Abstract: Block Compressed Sensing (BCS) adapts compressed sensing to images. Because the well-known BCS with Smoothed Projected Landweber algorithm (BCS-SPL) performs poorly when the sampling rate is low, we propose a novel algorithm, Total Variation based Sampling Adaptive Block Compressed Sensing with OMP (Orthogonal Matching Pursuit) (TVSA-BCS-OMP), to address this shortcoming. TVSA-BCS-OMP partitions the whole image into overlapping blocks to eliminate blocking artifacts. It assigns each block a sampling rate according to its texture complexity, measured by the block's Total Variation (TV), so that blocks with large TV receive higher sampling rates. Only a limited number of nonzero coefficients in each block are then retained according to the adaptively assigned sampling rate. Finally, we sample the blocks and reconstruct each of them with OMP. Experimental results show that at low initial sampling rates (below 0.2), TVSA-BCS-OMP achieves better reconstruction precision than BCS-SPL, particularly in textured blocks. In addition, the new algorithm requires less reconstruction time than BCS-SPL.
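
As an illustrative sketch (not the authors' code), the TV-driven rate allocation described in the abstract could look as follows; block extraction, measurement, and OMP reconstruction are omitted, and the proportional allocation rule is an assumption:

```python
def block_total_variation(block):
    """Anisotropic total variation of a 2D block given as a list of rows."""
    rows, cols = len(block), len(block[0])
    tv = 0.0
    for i in range(rows):
        for j in range(cols):
            if j + 1 < cols:                      # horizontal differences
                tv += abs(block[i][j + 1] - block[i][j])
            if i + 1 < rows:                      # vertical differences
                tv += abs(block[i + 1][j] - block[i][j])
    return tv

def allocate_rates(blocks, base_rate):
    """Give blocks with larger TV proportionally higher sampling rates,
    keeping the mean rate equal to base_rate."""
    tvs = [block_total_variation(b) for b in blocks]
    total = sum(tvs)
    n = len(blocks)
    if total == 0.0:                              # flat image: uniform rate
        return [base_rate] * n
    return [base_rate * n * tv / total for tv in tvs]
```

A textured block thus receives a higher rate than a flat one, matching the idea that detail-rich regions need more measurements.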

Paper Nr: 13
Title:

Traffic Signs Detection and Tracking using Modified Hough Transform

Authors:

Pavel Yakimov and Vladimir Fursov

Abstract: Traffic Sign Recognition (TSR) systems can not only improve safety, compensating for possible human carelessness, but also reduce tiredness by helping drivers keep an eye on the surrounding traffic conditions. This paper proposes an efficient algorithm for real-time TSR. The article considers the practicability of using the HSV color space to extract red regions. An algorithm that removes noise to improve the accuracy and speed of detection was developed. A modified Generalized Hough Transform is then used to detect traffic signs, and the current velocity of the vehicle is used to predict the sign's location in adjacent frames of a video sequence. Finally, the detected objects are classified. The developed algorithms were tested on real scene images and on the German Traffic Sign Detection Benchmark (GTSDB) dataset and showed efficient results.
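
As an illustrative sketch (not the authors' implementation), the HSV-based red extraction that the abstract describes as the first pipeline step could be written as below; the hue, saturation, and value thresholds are hypothetical:

```python
import colorsys

def is_red(r, g, b, s_min=0.4, v_min=0.2):
    """Return True if an RGB pixel (components in [0, 1]) looks red in HSV.
    Red wraps around hue 0, so both ends of the hue circle are checked."""
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    red_hue = h <= 20 / 360 or h >= 340 / 360
    return red_hue and s >= s_min and v >= v_min

def red_mask(image):
    """Binary mask over a list-of-rows image of (r, g, b) tuples."""
    return [[1 if is_red(*px) else 0 for px in row] for row in image]
```

The resulting mask would then feed the noise-removal and Hough-transform stages.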

Short Papers
Paper Nr: 1
Title:

Gramophone Noise Reconstruction - A Comparative Study of Interpolation Algorithms for Noise Reduction

Authors:

Christoph F. Stallmann and Andries P. Engelbrecht

Abstract: Gramophone records were the main recording medium for seven decades and have regained widespread popularity over the past few years. Records are susceptible to noise caused by scratches and other mishandling, often making the listening experience unpleasant. This paper analyses and compares twenty different interpolation algorithms for the reconstruction of noisy samples, categorized into duplication and trigonometric approaches, polynomials, and time series models. A dataset of 800 songs divided amongst eight different genres was used to benchmark the algorithms. It was found that the ARMA model performs best over all genres. Cosine interpolation has the lowest computational time, with the AR model achieving the most effective interpolation for a limited time span. It was also found that less volatile genres such as classical, country, rock and jazz music are easier to reconstruct than more unstable electronic, metal, pop and reggae audio signals.
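
As an illustrative sketch (standard textbook form, not code from the paper), cosine interpolation, the computationally cheapest of the compared algorithms, reconstructs a gap of missing samples from its two boundary values:

```python
import math

def cosine_interpolate(y0, y1, mu):
    """Interpolate between y0 and y1; mu in [0, 1] is the position."""
    t = (1 - math.cos(mu * math.pi)) / 2   # smooth ease-in/ease-out weight
    return y0 * (1 - t) + y1 * t

def fill_gap(left, right, n_missing):
    """Reconstruct n_missing samples between known samples left and right."""
    return [cosine_interpolate(left, right, (i + 1) / (n_missing + 1))
            for i in range(n_missing)]
```

Unlike linear interpolation, the cosine weight gives a smooth transition at both gap boundaries, at the cost of ignoring any periodic structure in the signal, which is why model-based methods such as AR and ARMA interpolate more accurately.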

Paper Nr: 8
Title:

Beat Discovery from Dimensionality Reduced Perspective Streams of Electrocardiogram Signal Data

Authors:

Avi Bleiweiss

Abstract: Spectral characteristics of ECG traces have identified a stochastic component in the inter-beat interval that triggers each new cardiac cycle. Yet the stream consistently shows impressive reproducibility of the inherent core waveform. Accordingly, the presence of near-deterministic structure argues strongly for representing a single-cycle ECG wave by a state vector in a low-dimensional embedding space. Rather than performing arrhythmia clustering directly on the high-dimensional state space, our work first reduces the dimensionality of the extracted raw features. Analysis of heartbeat irregularities then becomes more tractable computationally, and thus more relevant to emerging wearable and IoT devices that are severely resource and power constrained. In contrast to prior work that searches for a two-dimensional embedding space, we project feature vectors onto a three-dimensional coordinate frame. This adds an essential depth-perception facet for the specialist who qualifies cluster memberships; furthermore, by removing stream noise, we retained a high percentage of the source energy. We performed extensive analysis and classification experiments on a large arrhythmia dataset, and report robust results that support the intuition of expert-neutral similarity.
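
The abstract does not name the embedding technique; as an illustrative sketch only, a projection onto a three-dimensional subspace can be obtained with PCA via power iteration (all helper names below are hypothetical):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mat_vec(m, v):
    return [dot(row, v) for row in m]

def covariance(data):
    """Return the covariance matrix and the mean-centred data."""
    n, d = len(data), len(data[0])
    mean = [sum(x[j] for x in data) / n for j in range(d)]
    centered = [[x[j] - mean[j] for j in range(d)] for x in data]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    return cov, centered

def top_components(cov, k=3, iters=50):
    """Leading k eigenvectors by power iteration with deflation."""
    d = len(cov)
    comps = []
    for _ in range(k):
        v = [1.0 / (i + 1) for i in range(d)]     # deterministic start
        for _ in range(iters):
            v = mat_vec(cov, v)
            for c in comps:                       # deflate found components
                p = dot(v, c)
                v = [vi - p * ci for vi, ci in zip(v, c)]
            norm = sum(x * x for x in v) ** 0.5 or 1.0
            v = [x / norm for x in v]
        comps.append(v)
    return comps

def embed_3d(data):
    """Project feature vectors onto the top three principal axes."""
    cov, centered = covariance(data)
    comps = top_components(cov, k=3)
    return [[dot(x, c) for c in comps] for x in centered]
```

Each heartbeat's feature vector is thus reduced to three coordinates that a specialist can inspect with depth perception, as described in the abstract.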

Paper Nr: 17
Title:

A Time Delay based Approach to Enhance Lung Diseases Diagnostic

Authors:

Fatma Ayari, Ali Alouani and Mekki Ksouri

Abstract: We address Chronic Obstructive Pulmonary Disease (COPD) using a new methodology based on the Passive Time Delay Technique (PTDT). Lung sounds were recorded using a multichannel stethoscope on 28 healthy subjects and 20 COPD patients. The sensors were distributed on the posterior and anterior chest wall. During recordings, all participants were breathing at matching airflow rates. The time delay (TD) was identified for the inspiration phase, and an average TD value was obtained from three repeated measurements for each inspiration phase. The TD computed for COPD patients, 440 ± 87 ms, was remarkably greater than that computed for normal subjects, 160 ± 10 ms. Results are presented as mean ± SD, the standard deviation of the time delay in ms. Significance (P < 0.05) was assessed using the Wilcoxon test. These preliminary results are very encouraging for developing this technique further and enhancing COPD monitoring.
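
As an illustrative sketch (not the authors' implementation), a passive time delay between two sensor channels is commonly estimated as the lag that maximizes their cross-correlation:

```python
def cross_correlation_delay(x, y, max_lag, fs):
    """Return the delay of y relative to x, in milliseconds.

    x, y    -- equal-length sample sequences from two sensors
    max_lag -- largest lag (in samples) to search in either direction
    fs      -- sampling rate in Hz
    """
    best_lag, best_score = 0, float("-inf")
    n = len(x)
    for lag in range(-max_lag, max_lag + 1):
        # sum over the overlapping region of x and the shifted y
        score = sum(x[i] * y[i + lag]
                    for i in range(max(0, -lag), min(n, n - lag)))
        if score > best_score:
            best_lag, best_score = lag, score
    return 1000.0 * best_lag / fs
```

Averaging this estimate over repeated inspiration phases would yield per-subject TD values of the kind reported in the abstract.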

Area 2 - Multimedia Systems and Applications

Short Papers
Paper Nr: 12
Title:

Contrast-to-Noise based Metric of Denoising Algorithms for Liver Vein Segmentation

Authors:

A. Nikonorov, A. Kolsanov, M. Petrov, Y. Yuzifovich, E. Prilepin and K. Bychenkov

Abstract: We analyse CT image denoising as applied to vessel segmentation. The proposed semi-global quality metric, based on the contrast-to-noise ratio, allowed us to estimate initial image quality and the efficiency of denoising procedures without prior knowledge of a noise-free image. We show that total variation filtering in the L1 metric provides the best denoising compared to other well-known procedures such as non-local means denoising or anisotropic diffusion. The computational complexity of this denoising algorithm is addressed by comparing its implementations for Intel MIC and NVIDIA CUDA HPC systems.
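
As an illustrative sketch (a textbook definition, not the paper's exact semi-global metric), the contrast-to-noise ratio underlying the proposed measure compares a vessel region against its background:

```python
import statistics

def cnr(region, background):
    """CNR = |mean(region) - mean(background)| / std(background)."""
    contrast = abs(statistics.fmean(region) - statistics.fmean(background))
    noise = statistics.pstdev(background)
    return contrast / noise if noise > 0 else float("inf")
```

Effective denoising lowers the background standard deviation while preserving the vessel/background contrast, so the CNR rises, which is what lets the metric rank denoising procedures without a noise-free reference.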

Paper Nr: 18
Title:

Foveated Model based on the Action Potential of Ganglion Cells to Improve Objective Image Quality Metrics

Authors:

Sergio A. C. Bezerra and Alexandre Pohl

Abstract: In this work, a foveated model (FM) based on the action potential of ganglion cells in the human retina is employed to improve the results obtained by traditional and perceptual image quality metrics. The LIVE and VAIQ image databases are used in the experiments to test and validate this model. Statistical techniques, such as the Pearson Linear Correlation Coefficient (PLCC), the Spearman Rank-Order Correlation Coefficient (SROCC) and the Root Mean Square Error (RMSE), are used to evaluate the performance of the Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM) metrics, as well as their versions improved by the FM. The results are encouraging because the proposed model improves the performance of the metrics investigated.

Posters
Paper Nr: 14
Title:

Computational Correction for Imaging through Single Fresnel Lenses

Authors:

Artem Nikonorov, Sergey Bibikov, Maksim Petrov, Yuriy Yuzifovich and Vladimir Fursov

Abstract: The lenses of modern single lens reflex (SLR) cameras may contain a dozen or more individual lens elements to correct aberrations. With processing power more readily available, the modern trend in computational photography is to develop techniques for correcting simple-lens aberrations in post-processing. We propose a similar approach to remove aberrations from images captured by a single imaging Fresnel lens. The image is restored in three stages: deblurring the base color channel, sharpening the other channels, and applying color correction. The first two steps are based on a combination of restoration techniques used for images obtained from simple refractive lenses. The color correction stage is necessary to remove the strong color shift caused by the chromatic aberrations of a simple Fresnel lens. The technique was tested on real images captured by a simple lens made as a three-step approximation of a Fresnel lens. The promising results open up new opportunities for using lightweight Fresnel lenses in miniature computer vision devices.

Paper Nr: 15
Title:

Objective Assessment of Asthenia using Energy and Low-to-High Spectral Ratio

Authors:

Farideh Jalalinajafabadi, Chaitaniya Gadepalli, Mohsen Ghasempour, Frances Ascott, Mikel Luján, Jarrod Homer and Barry Cheetham

Abstract: Vocal cord vibration is the source of voiced phonemes. Voice quality depends on the nature of this vibration. Vocal cords can be damaged by infection, neck or chest injury, tumours and more serious diseases such as laryngeal cancer. Such physical harm can cause a loss of voice quality. Voice quality assessment is carried out by Speech and Language Therapists (SLTs). SLTs use a well-known subjective assessment approach called GRBAS, an acronym for a five-dimensional scale of measurements of voice properties originally recommended for clinical and research use by the Japanese Society of Logopedics and Phoniatrics and by European researchers. The properties are ‘Grade’, ‘Roughness’, ‘Breathiness’, ‘Asthenia’ and ‘Strain’. The objective assessment of the G, R, B and S properties has been well researched and can be carried out by commercial measurement equipment. However, the assessment of Asthenia has been less extensively researched. This paper concerns the objective assessment of ‘Asthenia’ using features extracted from 20 ms frames of the sustained vowel /a/. We develop two regression models to objectively estimate Asthenia against SLT scores: ‘K nearest neighbour regression’ (KNNR) and ‘Multiple linear regression’ (MLR). These new approaches to the prediction of Asthenia are based on different subsets of features, different sets of data and different prediction models in comparison with previous approaches in the literature. The performance of the system was evaluated using the Normalised Root Mean Square Error (NRMSE) for each of 20 trials, taking as a reference the average score for each subject selected. The subsets of features that produce the lowest NRMSE are determined and used to evaluate the two regression models. The objective system was compared with the scoring of each individual SLT and was found to have an NRMSE, averaged over 20 trials, lower than two of them and only slightly higher than the third.
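
As an illustrative sketch (not the authors' model; the distance measure and feature handling are assumptions), the KNNR predictor named in the abstract estimates a score as the mean of the k nearest training scores:

```python
def knn_regress(train_x, train_y, query, k=3):
    """Predict a score for query as the mean score of its k nearest
    neighbours (squared Euclidean distance) in the training features."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    nearest = sorted(range(len(train_x)),
                     key=lambda i: dist(train_x[i], query))[:k]
    return sum(train_y[i] for i in nearest) / k
```

Applied to acoustic feature vectors extracted from the 20 ms frames, such a model would output a continuous Asthenia estimate that can be compared to the SLT scores via NRMSE.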

Paper Nr: 19
Title:

Real-time Super Resolution Algorithm for Security Cameras

Authors:

Seiichi Gohshi

Abstract: Security is one of the most important concerns in our daily lives. Security camera systems have been introduced to keep us safe in shops, airports, downtown areas, and other public spaces. Security cameras have infrared imaging modes for low-light conditions. However, infrared imaging sensitivity is low, and the quality of images recorded in low-light conditions is often poor, as they do not always possess sufficient contrast and resolution; thus, infrared imaging devices produce blurry monochrome images and videos. A real-time nonlinear signal processing technique that improves the contrast and resolution of low-contrast infrared images and video is proposed. The proposed algorithm can be implemented on a field programmable gate array (FPGA).
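
As an illustrative sketch only (the paper's nonlinear operator is not specified in the abstract), one common pattern for this class of enhancer is an unsharp mask whose high-frequency term passes through a nonlinear limiter, shown here in 1-D:

```python
def enhance(signal, gain=1.5, limit=10.0):
    """Boost local contrast: amplify the high-frequency component of each
    sample, clipped by a nonlinear limiter to avoid ringing artifacts."""
    out = []
    n = len(signal)
    for i in range(n):
        # 3-tap local mean with edge replication
        local = (signal[max(i - 1, 0)] + signal[i]
                 + signal[min(i + 1, n - 1)]) / 3
        high = signal[i] - local                      # high-frequency part
        high = max(-limit, min(limit, gain * high))   # nonlinear limiter
        out.append(signal[i] + high)
    return out
```

In hardware, the same structure maps naturally onto an FPGA pipeline of a small convolution followed by a saturating multiply-add.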

Paper Nr: 20
Title:

Mosaic-based Privacy-protection with Reversible Watermarking

Authors:

Yuichi Kusama, Hyunho Kang and Keiichi Iwamura

Abstract: Video surveillance has been applied in many fields, particularly for detecting suspicious activity in public places such as shopping malls. As the use of video-surveillance cameras increases, so too does the threat to individual privacy. Therefore, video-surveillance technologies that protect individual privacy must be implemented. In this study, we propose a scheme in an MPEG2 video-encoding environment that successfully employs mosaicking, encryption, and restoration of faces captured in videos.
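
As an illustrative sketch (not the proposed scheme, which operates inside the MPEG2 pipeline and adds encryption for reversibility), the mosaicking step itself is block averaging over the detected face region:

```python
def mosaic(image, x0, y0, w, h, block=2):
    """Pixelate the w-by-h region at (x0, y0) of a grayscale image
    (list of rows) by replacing each block with its average value."""
    out = [row[:] for row in image]        # leave the input untouched
    for by in range(y0, y0 + h, block):
        for bx in range(x0, x0 + w, block):
            cells = [(y, x) for y in range(by, min(by + block, y0 + h))
                            for x in range(bx, min(bx + block, x0 + w))]
            avg = sum(out[y][x] for y, x in cells) / len(cells)
            for y, x in cells:
                out[y][x] = avg
    return out
```

In a reversible scheme, the information destroyed by this averaging must be preserved elsewhere (for example, encrypted and embedded as a watermark) so that authorized parties can restore the original face.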

Paper Nr: 21
Title:

Similarity-based Image Retrieval for Revealing Forgery of Handwritten Corpora

Authors:

Ilaria Bartolini

Abstract: Authorship attribution is a problem with a long history and a wide range of applications. Recent work in non-traditional authorship attribution contexts demonstrates the practicality of automatically analysing documents based on authorial style. However, such analyses are difficult to apply, and few “best practices” are available. In this paper, we show how quantitative techniques based on image similarity search can be profitably exploited for revealing forgery of handwritten corpora. In more detail, we explore the case where a document is represented by the image of the document itself. Preliminary experimental results on real data demonstrate the effectiveness of the proposed approach.