SIGMAP 2014 Abstracts


Area 1 - Multimedia and Communications

Full Papers
Paper Nr: 47
Title:

Noise Mitigation over Powerline Communication Using LDPC-Convolutional Code and Fusion of Mean and Median Filters

Authors:

Yassine Himeur and Abdelkrim Boukabou

Abstract: In this paper, we propose a new impulse noise mitigation approach for Orthogonal Frequency Division Multiplexing (OFDM) signals transmitted over the Powerline Communication (PLC) channel. Recently, the LDPC-Convolutional Code (LDPC-CC) has received much interest as a low-complexity alternative to LDPC block codes. The proposed approach exploits the redundancy introduced by the LDPC-CC and by the cyclic prefix (CP) added at the OFDM transmitter to recover noisy coefficients. Noisy coefficients are detected by taking their neighbors into account, and are replaced by a fusion of the median and mean of the neighboring coefficients within a window, using a dynamic threshold computed from the noise variance and the peak noise value in the received signal. The proposed technique provides good robustness to impulse noise without adding significant complexity to the transmission system, and achieves promising results when compared to filtering and coding techniques used alone.
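
The detection-and-replacement step described in the abstract can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: it assumes a 1-D stream of coefficients, a 5-sample window, and a threshold of `k` times a robust noise-scale estimate (the paper derives its threshold from the noise variance and peak noise value, which may differ).

```python
import numpy as np

def fuse_filter(signal, window=5, k=3.0):
    """Replace samples flagged as impulsive by a fusion (average) of the
    median and mean of their neighbourhood.

    A sample is flagged when it deviates from the local median by more
    than a dynamic threshold derived from a noise-scale estimate.
    `k` is a tuning choice for this sketch, not a value from the paper.
    """
    x = np.asarray(signal, dtype=float)
    half = window // 2
    padded = np.pad(x, half, mode='edge')
    out = x.copy()
    # Robust noise-scale estimate from the median absolute deviation.
    sigma = 1.4826 * np.median(np.abs(x - np.median(x)))
    thr = k * sigma
    for i in range(len(x)):
        neigh = padded[i:i + window]
        med = np.median(neigh)
        if abs(x[i] - med) > thr:
            # fusion of median and mean of the neighbouring coefficients
            out[i] = 0.5 * (med + np.mean(neigh))
    return out
```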

Short Papers
Paper Nr: 11
Title:

Product Integral Binding Coefficients for High-order Wavelets

Authors:

Nick Michiels, Jeroen Put and Philippe Bekaert

Abstract: This paper provides an efficient algorithm to calculate product integral binding coefficients for a heterogeneous mix of wavelet bases. These product integrals are ubiquitous in multiple applications such as signal processing and rendering. Previous work has focused on simple Haar wavelets. Haar wavelets excel at encoding piecewise constant signals, but are inadequate for compactly representing smooth signals for which high-order wavelets are ideal. Our algorithm provides an efficient way to calculate the tensor of these binding coefficients. The algorithm exploits both the hierarchical nature and vanishing moments of the wavelet bases, as well as the sparsity and symmetry of the tensor. We demonstrate the effectiveness of high-order wavelets with a rendering application. The smoother wavelets represent the signals more effectively and with less blockiness than the Haar wavelets of previous techniques.

Paper Nr: 33
Title:

Combining Top-down and Bottom-up Visual Saliency for Firearms Localization

Authors:

Edoardo Ardizzone, Roberto Gallea, Marco La Cascia and Giuseppe Mazzola

Abstract: Object detection is one of the most challenging issues for computer vision researchers. The analysis of human visual attention mechanisms can help automatic inspection systems to discard useless information and improve performance and efficiency. In this paper we propose an attention-based method to estimate the position of firearms in images of people holding them. Both top-down and bottom-up mechanisms are involved in our system. The bottom-up analysis is based on a state-of-the-art approach. The top-down analysis is based on the construction of a probabilistic model of the firearm position with respect to the position of the person's face. This model has been created by analyzing information from a publicly available database of movie frames showing actors holding firearms.

Paper Nr: 42
Title:

A Fast and Robust Key-Frames based Video Copy Detection Using BSIF-RMI

Authors:

Yassine Himeur, Karima Ait-Sadi and Abdelmalik Oumamne

Abstract: Content Based Video Copy Detection (CBVCD) has gained considerable scientific interest in recent years. Most video duplicates arise from transformations applied to an original video. This paper presents a fast video copy detection approach based on key-frame extraction that is robust to different transformations. In the proposed scheme, the key-frames of the videos are first extracted based on the Gradient Magnitude Similarity Deviation (GMSD). The descriptor used in the detection process is extracted using a fusion of Binarized Statistical Image Features (BSIF) and the Relative Mean Intensity (RMI). The feature vectors are then reduced by Principal Component Analysis (PCA), which further accelerates the detection process while preserving robustness against different transformations. The proposed framework is tested on the query and reference datasets of the CBCD task of Muscle VCD 2007 and TRECVID 2009, and our results are compared with those reported in the literature. The proposed approach shows promising performance in terms of both robustness and execution time.
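
The PCA reduction step is standard and can be sketched as follows. This is a generic illustration: plain random vectors stand in for the paper's BSIF-RMI descriptors.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project row-wise feature vectors onto their top principal
    components, shrinking descriptor length before matching."""
    X = np.asarray(features, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centred data gives the principal axes in Vt,
    # already sorted by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    return Xc @ components.T, components, mean
```

New query descriptors are reduced with the same `components` and `mean` before nearest-neighbour matching.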

Paper Nr: 52
Title:

Efficient Rate Control for Intra-Frame Coding in High Efficiency Video Coding

Authors:

Feng Cen, Qianli Lu and Weisheng Xu

Abstract: In this paper, a coding tree unit (CTU)-level rate control (RC) scheme is proposed for intra-frame coding in High Efficiency Video Coding (HEVC). The CTU-level target bits are allocated based on content complexity, and the parameters of the Cauchy-based rate-quantization (R-Q) model of the current CTU are estimated from the neighboring previously encoded CTUs. The proposed RC does not exploit any information from adjacent frames, so it inherently handles the initial frame and scene-change frames. The experimental results demonstrate the accurate rate estimation and stable video quality of the proposed RC scheme.
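
The Cauchy-based R-Q model mentioned above has the well-known form R(Q) = a·Q^(−α). A minimal sketch of fitting such a model from previously encoded (Q, rate) pairs and inverting it to pick a quantization step for a CTU bit budget; the parameter values are illustrative, not from the paper.

```python
import math
import numpy as np

def cauchy_rq_bits(q, a, alpha):
    """Cauchy-based R-Q model: R(Q) = a * Q**(-alpha)."""
    return a * q ** (-alpha)

def q_for_target_bits(target_bits, a, alpha):
    """Invert the model to get the quantization step for a bit budget."""
    return (a / target_bits) ** (1.0 / alpha)

def fit_cauchy_rq(qs, rates):
    """Least-squares fit of (a, alpha) in the log domain:
    log R = log a - alpha * log Q."""
    logq, logr = np.log(qs), np.log(rates)
    A = np.column_stack([np.ones_like(logq), -logq])
    (loga, alpha), *_ = np.linalg.lstsq(A, logr, rcond=None)
    return math.exp(loga), alpha
```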

Posters
Paper Nr: 35
Title:

Protecting Digital Fingerprint in Automated Fingerprint Identification System using Local Binary Pattern Operator

Authors:

Ait Sadi Karima, Bouchair Imène, K. Zebbiche and Moussadek Laadjel

Abstract: Local binary pattern (LBP) operators, which measure the local contrast within a pixel's neighbourhood, have been successfully applied to texture analysis, face recognition, and image retrieval. In this paper, we present a new application of LBP operators for securing digital fingerprints in an Automated Fingerprint Identification System (AFIS) by inserting a robust watermark (an ID image and a face image) that increases security and also facilitates recognition of the person. To improve the security of the embedding, the watermarks are scrambled using the Arnold technique and are then hidden in the fingerprint image of the corresponding person. Experimental results show that the proposed watermarking method is robust against commonly used image processing operations, such as additive noise, luminance change, contrast enhancement, and JPEG compression, while neither changing the fingerprint features nor degrading the visibility of the original fingerprint images.
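
The Arnold scrambling step is a standard transform and can be sketched directly. This is an illustration of the technique, not the authors' code: it assumes a square watermark image and the classic cat map (x, y) → (x + y, x + 2y) mod N, whose inverse is (x, y) → (2x − y, −x + y) mod N.

```python
import numpy as np

def arnold_scramble(img, iterations=1):
    """Scramble a square image with the Arnold cat map
    (x, y) -> (x + y, x + 2y) mod N."""
    a = np.asarray(img)
    n = a.shape[0]
    assert a.shape[0] == a.shape[1], "Arnold map needs a square image"
    out = a.copy()
    for _ in range(iterations):
        nxt = np.empty_like(out)
        for x in range(n):
            for y in range(n):
                nxt[(x + y) % n, (x + 2 * y) % n] = out[x, y]
        out = nxt
    return out

def arnold_unscramble(img, iterations=1):
    """Invert the map with (x, y) -> (2x - y, -x + y) mod N."""
    a = np.asarray(img)
    n = a.shape[0]
    out = a.copy()
    for _ in range(iterations):
        nxt = np.empty_like(out)
        for x in range(n):
            for y in range(n):
                nxt[(2 * x - y) % n, (y - x) % n] = out[x, y]
        out = nxt
    return out
```

The extractor applies the inverse map the same number of iterations to recover the watermark; the iteration count acts as a key.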

Paper Nr: 44
Title:

Auditory Features Analysis for BIC-based Audio Segmentation

Authors:

Tomasz Maka

Abstract: Audio segmentation is one of the stages in the audio processing chain whose accuracy plays a primary role in the final performance of audio recognition and processing tasks. This paper presents an analysis of auditory features for audio segmentation. A set of features is derived from a time-frequency representation of the input signal and is calculated based on properties of the human auditory system. The efficiency of several sets of audio features for BIC-based audio segmentation has been analysed. The obtained results show that auditory features derived from different frequency scales are competitive with the widely used MFCC features in terms of accuracy and the number of detected change points.
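
BIC-based segmentation decides whether a feature window is better modelled by one full-covariance Gaussian or by two Gaussians split at a candidate change point. A minimal ΔBIC sketch, with penalty weight λ; the features compared in the paper are auditory, but any (N, d) feature matrix works here.

```python
import numpy as np

def delta_bic(features, split, lam=1.0):
    """Delta-BIC for splitting a feature sequence at `split`:
    positive values favour two separate full-covariance Gaussians,
    i.e. an acoustic change point at that frame."""
    X = np.asarray(features, dtype=float)
    n, d = X.shape
    x1, x2 = X[:split], X[split:]

    def logdet_cov(z):
        c = np.atleast_2d(np.cov(z, rowvar=False, bias=True))
        _, ld = np.linalg.slogdet(c)
        return ld

    # model-complexity penalty: extra mean (d) and covariance (d(d+1)/2)
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
    return 0.5 * (n * logdet_cov(X)
                  - len(x1) * logdet_cov(x1)
                  - len(x2) * logdet_cov(x2)) - penalty
```

Scanning `split` over the window and thresholding the maximum ΔBIC at zero yields the detected change points.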

Paper Nr: 54
Title:

Image Stitching with Efficient Brightness Fusion and Automatic Content Awareness

Authors:

Yu Tang and Jungpil Shin

Abstract: Image stitching, also called photo stitching, is the process of combining multiple photographic images with overlapping fields of view to produce a segmented panorama or a high-resolution image. Image stitching is challenging in two respects. First, the photos of a sequence, taken from various angles, will have different brightness, which leads to an unnatural stitched result without harmonious brightness. Second, ghosting artifacts due to moving objects are also a common problem, and eliminating them is not an easy task. This paper presents several novel techniques that make addressing these two difficulties significantly less labor-intensive while remaining efficient. For the brightness problem, each input image is blended from several images with different brightness. For the ghosting problem, we propose an intuitive technique that computes a stitching line from a novel energy map, which combines a gradient map indicating the presence of structures with a prominence map that determines the attractiveness of a region. The stitching line can easily skirt around moving objects or salient parts, following the observation that human eyes mostly notice only the salient features of an image. We compare the results of our method with those of four state-of-the-art image stitching methods, and our method outperforms all four in removing ghosting artifacts.

Paper Nr: 69
Title:

Towards a Wake-up and Synchronization Mechanism for Multiscreen Applications using iBeacon

Authors:

Louay Bassbouss, Görkem Güçlü and Stephan Steglich

Abstract: TV sets and companion devices (smartphones, tablets, etc.) have outgrown their original purpose and now play an important role together in offering the best user experience across multiple screens. However, the collaboration between TV and companion applications faces challenges that go beyond traditional single-screen applications. These range from discovery, wake-up and pairing of devices, to application launch, communication, synchronization and adaptation to the target device and screen size. In this position paper, we limit ourselves to two of these aspects and introduce an idea for a new wake-up and synchronization mechanism for multiscreen applications using iBeacon technology.

Paper Nr: 70
Title:

Smoothed Surface Transitions for Human Motion Synthesis

Authors:

Ashish Doshi

Abstract: Multiview techniques to reconstruct an animation from 3D video have advanced by leaps and bounds in recent years. It is now possible to synthesise a 3D animation by fusing motions between different sequences. Prior work in this area has established methods to successfully identify inter-sequence transitions of different or similar actions. In some instances, however, the transitions at these nodes in the motion path cause an abrupt change between the motion sequences. Hence, this paper proposes a framework that smooths these inter-sequence transitions while preserving the detailed dynamics of the captured movement. Laplacian-based mesh deformation, in addition to shape- and appearance-based feature methods including SIFT and MeshHOG features, is used to obtain temporally consistent meshes. These meshes are then interpolated within a temporal window and concatenated to reproduce a seamless transition between the motion sequences. A quantitative analysis of the inter-sequence transitions, evaluated using the three-dimensional shape-based Hausdorff distance, is presented for the synthesised 3D animations.

Area 2 - Multimedia Signal Processing

Full Papers
Paper Nr: 4
Title:

Clothes Change Detection Using the Kinect Sensor

Authors:

Dimitris Sgouropoulos, Theodoros Giannakopoulos, Sergios Petridis, Stavros Perantonis and Antonis Korakis

Abstract: This paper describes a methodology for detecting when a human has changed clothes. Changing clothes is a basic activity of daily living, which makes the methodology valuable for tracking the functional status of elderly people in the context of a non-contact, unobtrusive monitoring system. Our approach uses Kinect and the OpenNI SDK, along with a workflow of basic image analysis steps. Evaluation has been conducted on a set of real recordings under various illumination conditions, which is publicly available along with the source code of the proposed system at http://users.iit.demokritos.gr/ tyianak/ClothesCode.html.

Paper Nr: 16
Title:

A Multi-Threaded Full-feature HEVC Encoder Based on Wavefront Parallel Processing

Authors:

Stefan Radicke, Jens-Uwe Hahn, Christos Grecos and Qi Wang

Abstract: The High Efficiency Video Coding (HEVC) standard was finalized in early 2013. It provides far better coding efficiency than any preceding standard, but it also bears significantly higher complexity. In order to cope with the high processing demands, the standard includes several parallelization schemes that make multi-core encoding and decoding possible. However, the effective realization of these methods is up to the respective codec developers. We propose a multi-threaded encoder implementation, based on HEVC's reference test model HM11, that makes full use of the Wavefront Parallel Processing (WPP) mechanism and runs on regular consumer hardware. Furthermore, our software produces output bitstreams identical to those of HM11 and supports all of its features that are allowable in combination with WPP. Experimental results show that our prototype is up to 5.5 times faster than HM11 running on a machine with 6 physical processing cores.

Paper Nr: 30
Title:

Three-stage Unstructured Filter for Removing Mixed Gaussian plus Random Impulse Noise

Authors:

Fitri Utaminingrum, Keiichi Uchimura and Gou Koutaki

Abstract: Digital images are often contaminated by more than one type of noise, i.e. mixed noise. In this paper, we propose a three-stage process that extends the K-SVD method from reducing Gaussian noise alone to handling mixed Gaussian and impulse noise, optimizing the system input and preserving edge structure. The three-stage process combines impulse noise removal, edge reconstruction and image smoothing. Suppressing impulse noise in the early stages with the Decision Based Algorithm (DBA) and repairing the edge structure with an edge map optimize the performance of the K-SVD method when smoothing an image. The performance of the filter is analysed in terms of the Peak Signal to Noise Ratio (PSNR), the Mean Structural Similarity (MSSIM) index and the Blind Image Quality Index (BIQI). The simulation results show a significant improvement over previous research.
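
The DBA stage referred to above is a decision-based median filter: only pixels at the extreme values are treated as impulses. A minimal sketch under that assumption (the paper's exact variant may differ):

```python
import numpy as np

def dba_filter(img, lo=0, hi=255):
    """Decision Based Algorithm sketch: a pixel is treated as impulse
    noise only when it equals the extreme values `lo` or `hi`; such
    pixels are replaced by the median of the noise-free values in their
    3x3 window (or by the already-processed left neighbour when the
    window itself is all noise)."""
    a = np.asarray(img, dtype=int)
    out = a.copy()
    h, w = a.shape
    for y in range(h):
        for x in range(w):
            if a[y, x] != lo and a[y, x] != hi:
                continue  # noise-free pixels are left untouched
            win = a[max(0, y - 1):y + 2, max(0, x - 1):x + 2].ravel()
            good = win[(win != lo) & (win != hi)]
            if good.size:
                out[y, x] = int(np.median(good))
            elif x > 0:
                out[y, x] = out[y, x - 1]
    return out
```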

Paper Nr: 39
Title:

Self-calibration of Large Scale Camera Networks

Authors:

Patrik Goorts, Steven Maesen, Yunjun Liu, Maarten Dumont, Philippe Bekaert and Gauthier Lafruit

Abstract: In this paper, we present a method to calibrate large scale camera networks for multi-camera computer vision applications in sport scenes. The calibration process determines precise camera parameters, both within each camera (focal length, principal point, etc.) and between the cameras (their relative position and orientation). To this end, we first extract candidate image correspondences over adjacent cameras, without using any calibration object, relying solely on existing feature matching computer vision algorithms applied to the input video streams. We then pairwise propagate these camera feature matches over all adjacent cameras using a chained, confidence-based voting mechanism and a selection relying on the general displacement across the images. Experiments show that this removes a large number of outliers before using existing calibration toolboxes dedicated to small scale camera networks, which would otherwise fail to find the correct camera parameters over large scale camera networks. We successfully validate our method on real soccer scenes.

Paper Nr: 46
Title:

Real-time Local Stereo Matching Using Edge Sensitive Adaptive Windows

Authors:

Maarten Dumont, Patrik Goorts, Steven Maesen, Philippe Bekaert and Gauthier Lafruit

Abstract: This paper presents a novel aggregation window method for stereo matching that combines the disparity hypothesis costs of multiple pixels in a local region more efficiently for increased hypothesis confidence. We propose two adaptive windows per pixel region, one following the horizontal edges in the image, the other the vertical edges. Their combination defines the final aggregation window shape, which rigorously follows all object edges, yielding better disparity estimations with at least 0.5 dB gain over similar methods in the literature, especially around occluded areas. A qualitative improvement is also observed: the disparity maps are smooth while respecting sharp object edges. Finally, these shape-adaptive aggregation windows are represented by a single quadruple per pixel, thus supporting an efficient GPU implementation with negligible overhead.

Paper Nr: 53
Title:

Evaluation of Acoustic Feedback Cancellation Methods with Multiple Feedback Paths

Authors:

Bruno Bispo and Diamantino Freitas

Abstract: Acoustic feedback limits the maximum stable gain of a public address system and may cause the system to become unstable. Acoustic feedback cancellation methods use an adaptive filter to identify the impulse response of the acoustic feedback path and then remove its influence from the system. However, if traditional adaptive filtering algorithms are used, a bias is introduced into the estimate of the acoustic feedback path obtained by the adaptive filter. Several methods have been proposed to overcome the bias problem, but they are generally evaluated on a public address system with only one microphone and one loudspeaker. This work evaluates some of the state-of-the-art methods on a public address system with one microphone and four loudspeakers, which results in multiple feedback paths and corresponds to a more realistic scenario for a typical system. Simulation results demonstrate that, with multiple feedback paths, the acoustic feedback cancellation methods are able to increase the maximum stable gain of the public address system by 12 dB when the source signal is speech.
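
The adaptive identification underlying such methods is typically an (N)LMS filter estimating the feedback-path impulse response. A minimal NLMS system-identification sketch, shown here for a single path with white-noise excitation; the evaluated bias-compensation methods themselves are not reproduced.

```python
import numpy as np

def nlms_identify(x, d, taps=8, mu=0.5, eps=1e-6):
    """Identify an unknown (feedback) path with normalized LMS:
    w <- w + mu * e * x_vec / (||x_vec||^2 + eps),
    where e is the error between the desired and filtered signal."""
    w = np.zeros(taps)
    buf = np.zeros(taps)  # most recent input samples, newest first
    for xi, di in zip(x, d):
        buf = np.roll(buf, 1)
        buf[0] = xi
        e = di - w @ buf
        w += mu * e * buf / (buf @ buf + eps)
    return w
```

In a real system the excitation is the loudspeaker signal, which is correlated with the source; that correlation is exactly what biases the estimate and motivates the compared methods.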

Short Papers
Paper Nr: 14
Title:

Basic Concept of Cuckoo Search Algorithm for 2D Images Processing with Some Research Results - An Idea to Apply Cuckoo Search Algorithm in 2D Images Key-points Search

Authors:

Marcin Woźniak and Dawid Połap

Abstract: In this paper, the idea of applying the cuckoo search algorithm to search for key-points in 2D images is formulated. For a set of test images we present and verify a simplified version of the algorithm, its efficiency and its precision. Research results are presented and discussed in comparison to classic methods such as simplified SURF and SIFT algorithms, to show the potential efficiency of applied computational intelligence.
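
A minimal cuckoo search sketch, assuming the usual Lévy-flight formulation (Mantegna's algorithm) and a toy objective; the paper's key-point fitness function is not reproduced here.

```python
import math
import numpy as np

def cuckoo_search(f, dim=2, n_nests=15, iters=300, pa=0.25,
                  lower=-5.0, upper=5.0, seed=0):
    """Minimal cuckoo search: new solutions via Levy flights around
    each nest, biased by the best nest; a fraction `pa` of the worst
    nests is abandoned and re-seeded each generation."""
    rng = np.random.default_rng(seed)

    def levy(size, beta=1.5):
        # Mantegna's algorithm for Levy-stable step lengths.
        num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
        sigma = (num / den) ** (1 / beta)
        u = rng.normal(0.0, sigma, size)
        v = rng.normal(0.0, 1.0, size)
        return u / np.abs(v) ** (1 / beta)

    nests = rng.uniform(lower, upper, size=(n_nests, dim))
    fit = np.array([f(n) for n in nests])
    for _ in range(iters):
        best = nests[np.argmin(fit)].copy()
        for i in range(n_nests):
            cand = nests[i] + 0.01 * levy(dim) * (nests[i] - best)
            cand = np.clip(cand, lower, upper)
            j = rng.integers(n_nests)  # compare with a random nest
            fc = f(cand)
            if fc < fit[j]:
                nests[j], fit[j] = cand, fc
        # abandon a fraction pa of the worst nests
        n_drop = int(pa * n_nests)
        worst = np.argsort(fit)[-n_drop:]
        nests[worst] = rng.uniform(lower, upper, size=(n_drop, dim))
        fit[worst] = np.array([f(n) for n in nests[worst]])
    best = nests[np.argmin(fit)]
    return best, float(f(best))
```

For key-point search the fitness would score a candidate location by some local distinctiveness measure instead of this toy function.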

Paper Nr: 25
Title:

Lip Tracking Using Particle Filter and Geometric Model for Visual Speech Recognition

Authors:

Islem Jarraya, Salah Werda and Walid Mahdi

Abstract: Automatic lip-reading is a technology that helps in understanding exchanged messages in noisy environments or in the case of elderly hearing impairment. To build such a system, we need to implement three subsystems: a lip locating and tracking system, a labial descriptor extraction system, and a classification and speech recognition system. In this work, we present a spatio-temporal approach to track and characterize lip movements for the automatic recognition of visemes of the French language. First, we segment the lips using color information and a geometric model of the lips. Then, we apply a particle filter to track lip movements. Finally, we extract and classify the visual information to recognize the pronounced viseme. This approach is applied with multiple speakers in natural conditions.

Paper Nr: 40
Title:

3D Dual-Tree Discrete Wavelet Transform Based Multiple Description Video Coding

Authors:

Jing Chen, Canhui Cai and Li Li

Abstract: A 3D dual-tree discrete wavelet transform (DT-DWT) based multiple description video coding algorithm is proposed to combat transmission errors or packet loss due to Internet or wireless network channel failures. Each description of the proposed multiple description coding scheme consists of a base layer and an enhancement layer. First, the input image sequence is encoded by a standard H.264 encoder at a low bit rate to form the base layer, which is then duplicated into each description. Second, the difference between the reconstructed base layer and the input image sequence is encoded by a 3D dual-tree wavelet encoder to produce four coefficient trees. After noise-shaping, these four trees are partitioned into two groups, individually forming the enhancement layers of the two descriptions. Since the 3D DT-DWT provides 28 directional subbands, the enhancement layer can be coded without motion estimation. The rich directional selectivity of the DT-DWT solves the mismatch problem and improves coding efficiency. If all descriptions are available at the receiver, a high quality video can be reconstructed by the central decoder. If only one description is received, a side decoder can be used to reconstruct the source with acceptable quality. Simulation results show that the quality of the video reconstructed by the proposed algorithm is superior to that of state-of-the-art multiple description video coding methods.

Paper Nr: 57
Title:

Visual Attention in Edited Dynamical Images

Authors:

Ulrich Ansorge, Shelley Buchinger, Christian Valuch, Aniello Raffaele Patrone and Otmar Scherzer

Abstract: Edited (or cut) dynamical images, such as videos or graphical animations, are created by changing perspectives in imaging devices. They are abundant in everyday and working life. However, little is known about how attention is steered with regard to this material. Here we propose a simple two-step architecture of gaze control for this situation. This model relies on (1) a down-weighting of repeated information contained in optic flow within takes (between cuts), and (2) an up-weighting of repeated information between takes (across cuts). This architecture is both parsimonious and realistic. We outline the evidence supporting this architecture and also identify the outstanding questions.

Paper Nr: 64
Title:

An Integrated Approach for Efficient Analysis of Facial Expressions

Authors:

Mehdi Ghayoumi and Arvind K. Bansal

Abstract: This paper describes a new automated facial expression analysis system that integrates Locality Sensitive Hashing (LSH) with Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) to improve the execution efficiency of emotion classification and the continuous identification of unidentified facial expressions. Images are classified using feature vectors on the two most significant segments of the face: the eye segments and the mouth segment. LSH uses a family of hashing functions to map similar images into a set of collision buckets. Taking a representative image from each cluster reduces the image space by pruning redundant similar images in the collision buckets. The application of PCA and LDA reduces the dimension of the data space. We describe the overall architecture and the implementation. The performance results show that the integration of LSH with PCA and LDA significantly improves computational efficiency, and improves accuracy by reducing the frequency bias of similar images during the PCA and SVM stages. After classifying the images in the database, we tag the collision buckets with basic emotions, and apply LSH to new unidentified facial expressions to identify the emotions. This LSH-based identification is suitable for fast continuous recognition of unidentified facial expressions.
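
The LSH component can be illustrated with random-hyperplane hashing, one common family of LSH functions for cosine similarity; the abstract does not state which family the authors use, so treat this as a generic sketch.

```python
import numpy as np

class SignLSH:
    """Random-hyperplane LSH: each hash bit is the sign of a dot
    product with a random direction, so nearby (small-angle) feature
    vectors tend to land in the same collision bucket."""

    def __init__(self, dim, n_bits=16, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.normal(size=(n_bits, dim))
        self.buckets = {}

    def key(self, v):
        bits = (self.planes @ np.asarray(v, dtype=float)) > 0
        return bits.tobytes()

    def add(self, v, label):
        self.buckets.setdefault(self.key(v), []).append(label)

    def query(self, v):
        return self.buckets.get(self.key(v), [])
```

A representative image per non-empty bucket then stands in for its near-duplicates, which is the pruning step the abstract describes.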

Posters
Paper Nr: 8
Title:

Computational Complexity Analysis of the Graph Extraction Algorithm for Volumetric Segmentation Method

Authors:

Dumitru Dan Burdescu, Liana Stanescu, Marius Brezovan and Cosmin Stoica Spahiu

Abstract: Partitioning images into homogeneous regions or semantic entities is a basic problem for identifying relevant objects. Visual segmentation is related to some semantic concepts because certain parts of a scene are pre-attentively distinctive and have a greater significance than other parts. Unfortunately, while there is a huge number of papers on segmentation methods for planar images, most of them graph-based, very few papers address volumetric segmentation methods. The major concept used in graph-based volumetric segmentation is the homogeneity of regions, and thus the edge weights are based on color distance. A large number of approaches to segmentation are based on finding compact regions in some feature space; a recent technique using feature-space regions first transforms the data by smoothing it in a way that preserves boundaries between regions. In this paper we extend our previous work on planar images by adding a new step to the volumetric segmentation algorithm that allows us to determine regions closer to the semantic entities. The key to the whole volumetric segmentation algorithm is the honeycomb structure: the volumetric segmentation module creates virtual cells of prisms with a tree-hexagonal structure defined on the set of voxels of the input spatial image, and a spatial triangular grid graph having tree-hexagons as cells of vertices. The computational complexity analysis shows that our volumetric segmentation methods run in linear time.

Paper Nr: 12
Title:

Real-time Super Resolution Equipment for 8K Video

Authors:

Seiichi Gohshi

Abstract: Resolution is one of the most important factors in assessing image quality, and image quality improves in proportion to resolution. Super Resolution (SR) is a resolution-improving technology that differs from conventional image enhancement methods. SR methods, of which there are many, typically have complex algorithms that involve iterations, and it is not easy to apply them to video running in real time at 50 or 60 frames per second; i.e., each frame has to be processed in 20 ms (50 Hz) or 16.7 ms (60 Hz). In this paper, a simple form of SR that uses non-linear signal processing is proposed to cope with the difficulties of real-time processing. Real-time hardware equipped with an FPGA (Field Programmable Gate Array) for 4K and 8K video is presented and its SR capability is discussed.

Paper Nr: 18
Title:

Methods and Algorithms of Cluster Analysis in the Mining Industry - Solution of Tasks for Mineral Rocks Recognition

Authors:

Olga Baklanova and Olga Ya Shvets

Abstract: This paper describes an algorithm for the automatic segmentation of colour images of ores using methods of cluster analysis. Several examples illustrate the use of the algorithm in solving mineral rock recognition problems. Results of studies of different colour spaces with k-means clustering are demonstrated, a technique for pre-computing the values of the centroids is proposed, and formulas for translating metrics into the HSV colour space are given. The effectiveness of the proposed method lies in the automatic identification of objects of interest in the overall image; the tuning parameter of the algorithm is a number indicating the number of segments to be allocated. The paper also contains a short description of a cluster analysis algorithm for mineral rock recognition in the mining industry.
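
The k-means clustering at the core of the method, including the option to seed it with precomputed centroid values, can be sketched as follows (plain NumPy, with illustrative colour data rather than ore images):

```python
import numpy as np

def kmeans_pixels(pixels, k, iters=20, init=None, seed=0):
    """Plain k-means on an (N, 3) array of colour-space pixel values.
    `init` lets precomputed centroid values seed the clustering, as
    the abstract suggests; otherwise k random pixels are used."""
    X = np.asarray(pixels, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = (np.asarray(init, dtype=float) if init is not None
                 else X[rng.choice(len(X), k, replace=False)])
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # assign each pixel to the nearest centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each centroid to the mean of its assigned pixels
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids
```

For a real image, the segment count `k` is the single tuning parameter the abstract mentions.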

Paper Nr: 41
Title:

Image Enhancement for Hand Sign Detection

Authors:

Jing-Wein Wang, Tzu-Hsiung Chen and Tsong-Yi Chen

Abstract: This paper proposes compact hand extraction to assist computerized handshape recognition. We devised an image enhancement technique based on singular value decomposition that removes dark backgrounds by preserving the skin-color pixels of a hand image. A polynomial approximation of the YCbCr color model was then used to extract the hand. After alignment, we applied illumination compensation with an adaptable singular value decomposition. Experimental results for images from our database showed that, against complex scenes, our method functions more efficiently than conventional methods that do not use compact hand extraction.

Paper Nr: 50
Title:

Gabor Fused to 2DPCA for Face Recognition

Authors:

Faten Bellakhdhar, Kais Loukil and Mohamed Abid

Abstract: Automatic visual recognition of human faces is an extremely attractive research subject, motivated by a wide range of commercial and law enforcement applications. In the last thirty years numerous algorithms for face recognition have been developed; detailed surveys are available in the literature. The state of the art in human face recognition is the subspace methods originated by Principal Component Analysis (PCA), the Eigenfaces of the facial images. Recently, a technique called Two-Dimensional PCA (2DPCA) was proposed for human face representation and recognition. In this paper, we use the phase and magnitude of Gabor representations of the face as a new representation, followed by a face recognition algorithm based on the 2D Principal Component Analysis approach. The performance of the proposed algorithm is tested on the public and widely used FRGCv2 and ORL face databases. Experimental results show that the use of 2DPCA achieves promising results, makes it easier to evaluate the covariance matrix accurately, and requires less computation time.
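
2DPCA differs from classical PCA in operating on image matrices directly: the projection axes are eigenvectors of an image scatter matrix, so no vectorization of the images is needed. A minimal sketch, with random matrices standing in for the Gabor magnitude/phase images used in the paper:

```python
import numpy as np

def twodpca_fit(images, n_components):
    """2DPCA: eigenvectors of the image scatter matrix
    G = mean((A - mean)^T (A - mean)) over the training images."""
    A = np.asarray(images, dtype=float)
    mean = A.mean(axis=0)
    G = np.zeros((A.shape[2], A.shape[2]))
    for img in A:
        d = img - mean
        G += d.T @ d
    G /= len(A)
    vals, vecs = np.linalg.eigh(G)
    # eigh returns ascending eigenvalues; keep the largest ones
    return vecs[:, ::-1][:, :n_components]

def twodpca_project(img, W):
    """Project an image matrix onto the 2DPCA axes: Y = A W."""
    return np.asarray(img, dtype=float) @ W
```

Classification then compares the projected feature matrices Y, e.g. by a nearest-neighbour distance.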

Paper Nr: 63
Title:

Comparison of Adaptive Filters for Wavelet Generation

Authors:

A. Gayathri, M. S. Sinith and S. Chithra

Abstract: Adaptive filter algorithms can be used to determine the filter bank coefficients of standard wavelets. Here, the LMS, NLMS and RLS algorithms are used to find the coefficients of the dB2, dB3 and Coiflet1 wavelets. The LMS and NLMS algorithms are modified to form the VSSLMS, NVSSLMS and time-varying NLMS algorithms, and their performances are also analysed. The best results are obtained with the NLMS algorithm. Using the coefficients obtained, the scaling and wavelet functions are regenerated, demonstrating that these algorithms can be used to find the filter bank coefficients of a signal of interest.
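
The idea of recovering wavelet filter-bank coefficients with an adaptive filter can be sketched with plain LMS identifying the db2 low-pass taps, written out from their closed form; the paper's modified variants (VSSLMS, NVSSLMS, time-varying NLMS) are not reproduced here.

```python
import numpy as np

def lms_identify(x, d, taps, mu=0.01):
    """Plain LMS system identification: w <- w + mu * e * x_vec,
    where e is the error against the desired signal d."""
    w = np.zeros(taps)
    buf = np.zeros(taps)  # most recent input samples, newest first
    for xi, di in zip(x, d):
        buf = np.roll(buf, 1)
        buf[0] = xi
        e = di - w @ buf
        w += mu * e * buf
    return w

# The db2 low-pass filter taps, from their closed form
# [(1+sqrt(3)), (3+sqrt(3)), (3-sqrt(3)), (1-sqrt(3))] / (4*sqrt(2)).
s3 = np.sqrt(3.0)
db2 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))
```

Driving the (unknown) filter with white noise and running LMS on the input/output pair recovers the taps, from which the scaling function can be regenerated by cascade iteration.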

Paper Nr: 66
Title:

Image Denoising Algorithm with a Three-dimensional Grid System of Coupled Nonlinear Elements

Authors:

Atsushi Nomura, Yoshiki Mizukami and Koichi Okada

Abstract: This paper presents an image denoising algorithm using a three-dimensional grid system of coupled nonlinear elements. The system consists of a two-dimensional image grid and a one-dimensional grid representing quantized image brightness. At each grid point, a FitzHugh-Nagumo type nonlinear element is placed and coupled with the elements at its nearest neighboring grid points. The FitzHugh-Nagumo element is described by a set of time-evolving ordinary differential equations and is tuned to be excitable. When we externally stimulate the grid system with an image brightness distribution, noise in the distribution is reduced and the signal is strengthened as time proceeds. The image denoising algorithm utilizes this property of the grid system, and we propose to modify the external stimuli so as to have broad Gaussian distributions. We confirm the performance of the algorithm on artificial and real images in comparison with two classical algorithms: a diffusion equation and median filtering.
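
A single FitzHugh-Nagumo element of the kind used in the grid can be sketched as follows; the equations are the standard excitable form, and the parameter values are illustrative choices that make the element excitable, not the paper's values.

```python
def fhn_step(u, v, stim, dt=0.01, a=0.1, eps=0.01, b=2.5):
    """One forward-Euler step of a FitzHugh-Nagumo element tuned to be
    excitable:
        du/dt = u (u - a) (1 - u) - v + I
        dv/dt = eps (u - b v)
    A suprathreshold stimulus (u pushed above a) triggers a full
    excursion of u toward 1 before the recovery variable v pulls it
    back; a subthreshold stimulus simply decays."""
    du = u * (u - a) * (1.0 - u) - v + stim
    dv = eps * (u - b * v)
    return u + dt * du, v + dt * dv
```

In the grid system, each element would additionally receive diffusive coupling from its nearest neighbours; that coupling term is omitted in this single-element sketch.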

Area 3 - Multimedia Systems and Applications

Full Papers
Paper Nr: 28
Title:

An Application Supporting Gastroesophageal Multichannel Intraluminal Impedance-pH Analysis

Authors:

Piotr Tojza, Dawid Gradolewski and Grzegorz Redlarski

Abstract: Due to a significant rise in the number of patients diagnosed with diseases of the upper gastrointestinal tract and the high cost of treatment, there is a need for further research on one of the most popular diagnostic tests used in this case: esophageal Multichannel Intraluminal Impedance and pH (MII-pH) measurement. This may lead to finding new diagnostically relevant information that quickens and improves the diagnostic procedure. The paper presents the algorithm used in a new computer application dedicated to researchers and physicians interested in research connected with gastroesophageal impedance and pH data analysis. The possibility to modify a wide range of the algorithm's parameters, as well as a rich set of program functions, allows researchers to search for new criteria for assessing pH and impedance data when diagnosing diseases of the upper gastrointestinal tract. This, in turn, may improve the time and accuracy of the MII-pH analysis, which substantially affects the patient's diagnosis time and treatment. Moreover, the diagnosing physician will be able to assess more tests, which is important given the significant rise in the number of patients seeking attention for diseases of the upper gastrointestinal tract.

Short Papers
Paper Nr: 10
Title:

Design and Implementation of a Driver Drowsiness Detection System - A Practical Approach

Authors:

Aleksandar Čolić, Oge Marques and Borko Furht

Abstract: This paper describes the steps involved in designing and implementing a driver drowsiness detection system based on visual input (driver’s face and head). It combines off-the-shelf software components for face detection, human skin color detection, and eye state (open vs. closed) classification in a novel way. Preliminary results show that the system is reliable and tolerant to many real-world constraints.

Paper Nr: 17
Title:

Image Display System Using Bamboo-blind Type Screen that Can Discharge Smell

Authors:

Keisuke Tomono, Rei Shu, Mana Tanaka and Akira Tomono

Abstract: In order to enhance the realistic sensation of visuals, a display system was invented in which smell, carried by an airflow, is discharged through the screen toward the viewer. A bamboo-blind type screen, in which thin rods and gaps are arranged in the vertical direction, was used, and the visuals were displayed on this slatted screen using projectors. Both airflow and scent generators were attached to the back of the screen. This work demonstrated the following: by installing blade mechanisms, the direction of the airflow could be controlled so that smell could be directed toward the intended locations. Moreover, if the screen was large enough and an animated series of visuals was presented, the slatted display maintained high visual quality. Applications to digital signage, large-screen virtual games, and the like can be expected.

Paper Nr: 19
Title:

HMM-based Breath and Filled Pauses Elimination in ASR

Authors:

Piotr Żelasko, Tomasz Jadczyk and Bartosz Ziółko

Abstract: The phenomena of filled pauses and breaths pose a challenge to Automatic Speech Recognition (ASR) systems dealing with spontaneous speech, including the recognizer modules of Interactive Voice Response (IVR) systems. We propose a method based on Hidden Markov Models (HMM), which is easily integrated into HMM-based ASR systems and allows detection of these disturbances without introducing additional parameters. Our method involves training models of the disturbances and inserting them into the phrase Markov chain between word-final and word-initial phoneme models. Application of the method in our ASR system shows improved recognition results on the LUNA corpus of Polish telephone speech.
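The insertion scheme the abstract describes, optional disturbance models spliced between word-final and word-initial phoneme models, can be sketched as a graph-construction step over the phrase chain. The word, phoneme, and disturbance names below are invented for illustration and are not the authors' actual model inventory.

```python
# Sketch: build a phrase-level Markov chain in which optional disturbance
# models (breath "br", filled pause "fp") are inserted between the final
# phoneme of one word and the initial phoneme of the next.
def phrase_chain(words, disturbances=("br", "fp")):
    states, arcs = [], []
    prev_final = None
    for w, phones in words:
        first = f"{w}/{phones[0]}"
        for a, b in zip(phones, phones[1:]):      # within-word phoneme arcs
            arcs.append((f"{w}/{a}", f"{w}/{b}"))
        states += [f"{w}/{p}" for p in phones]
        if prev_final is not None:
            arcs.append((prev_final, first))      # direct word-to-word link
            for d in disturbances:                # optional detour through
                states.append(f"{d}@{w}")         # a disturbance model
                arcs.append((prev_final, f"{d}@{w}"))
                arcs.append((f"{d}@{w}", first))
        prev_final = f"{w}/{phones[-1]}"
    return states, arcs

# hypothetical two-word Polish phrase with toy phoneme sets
states, arcs = phrase_chain([("ala", ["a", "l", "a"]), ("ma", ["m", "a"])])
print(len(states), len(arcs))  # 7 8
```

Because the disturbance states are reached only via optional arcs, the decoder can pass through them when the acoustics fit a breath or filled pause, without any change to the word models themselves.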

Paper Nr: 20
Title:

Food Image Presentation System that Discharge Smell Through Screen and Psychological Effect

Authors:

Akira Tomono, Mana Tanaka, Rei Shu and Keisuke Tomono

Abstract: The authors are currently engaged in a project on an image display system with a screen from which smells are discharged toward users together with the images, with the aim of applying it to digital signage and other purposes in order to enhance the realistic sensation of food images. We conducted experiments in which food images were presented and smells were discharged from the same position, and analyzed the psychological impact on users. A subject questionnaire and a cerebral blood-flow meter were used for the analysis. The first experiment showed that when an airflow and a smell were discharged in conformity with an image of cooking in a steaming hot pot, an inhaling action occurred and the smell perception rate was enhanced. The second experiment showed that when a smell matching a food image was discharged, the realistic sensation increased and oxyhemoglobin rose in the vicinity of the temple, because the salivation-related central nerves became active.

Paper Nr: 51
Title:

A Multimedia Tracing Traitors Scheme Using Multi-level Hierarchical Structure for Tardos Fingerprint Based Audio Watermarking

Authors:

Faten Chaabane, Maha Charfeddine and Chokri Ben Amar

Abstract: This paper presents a novel approach in the traitor-tracing field. It applies a multi-level hierarchical structure to the probabilistic fingerprinting code used, the well-known Tardos code. This structure addresses the high computational cost and running time of the Tardos code during its accusation step. The generated fingerprint is embedded in the audio stream extracted from the media by an audio watermarking technique operating in the frequency domain; this watermarking technique is an original choice compared to existing works in the literature. Assuming that the collusion strategy is known, we compare the performance of our traitor-tracing framework against different types of attacks. We show how the proposed hierarchy and the watermarking layer have a beneficial impact on the performance of our tracing system.
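As background for the Tardos code the abstract builds on, the sketch below generates a flat (non-hierarchical) binary Tardos code and computes symmetric accusation scores. The cutoff `t`, the code length, and the symmetric scoring variant are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def tardos_generate(n_users, code_len, t=0.01, rng=None):
    """Binary Tardos fingerprint generation: per-position biases p_i
    follow the arcsine density truncated to (t, 1 - t); user codewords
    are independent Bernoulli(p_i) draws. `t` is illustrative."""
    rng = rng or np.random.default_rng(0)
    tp = np.arcsin(np.sqrt(t))
    r = rng.uniform(tp, np.pi / 2 - tp, code_len)
    p = np.sin(r) ** 2                       # arcsine-distributed biases
    X = (rng.random((n_users, code_len)) < p).astype(int)
    return X, p

def tardos_score(y, x, p):
    """Symmetric accusation score of one user's codeword x against the
    pirated copy y: matches add evidence, mismatches subtract it."""
    g = np.sqrt((1 - p) / p)
    s = np.where(x == 1, np.where(y == 1, g, -g),
                 np.where(y == 1, -1 / g, 1 / g))
    return s.sum()

X, p = tardos_generate(n_users=20, code_len=1000)
y = X[3]  # pirated copy equal to user 3's codeword (no collusion)
scores = [tardos_score(y, X[u], p) for u in range(20)]
print(int(np.argmax(scores)))  # 3
```

The accusation step scores every user against the pirated copy, which is exactly the linear-in-users cost that a hierarchical structure, as proposed in the paper, aims to reduce.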

Posters
Paper Nr: 23
Title:

Hierarchical Optimization Using Hierarchical Multi-competitive Genetic Algorithm and its Application to Multiple Vehicle Routing Problem

Authors:

Shudai Ishikawa, Ryosuke Kubota and Keiichi Horio

Abstract: In this paper, a new optimization technique that is effective for hierarchical optimization problems is proposed. This technique is an extension of the multi-competitive distributed genetic algorithm (mcDGA). The method consists of two levels, upper and lower: the solution space to be searched is determined at the upper level, and the optimum solution within a given solution space is found at the lower level. Migration of individuals and competition are performed at the lower layer, so that the optimal solution can be found efficiently. We apply the proposed hierarchical mcDGA to the multiple vehicle routing problem (mVRP) to confirm its effectiveness. Simulation results show the effectiveness of the proposed method.
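The two-level division of labour can be illustrated on a toy mVRP: a simple upper-level GA searches the space of customer-to-vehicle assignments (the "solution space"), while a lower level finds the best route inside each assignment. This omits the migration and competition mechanisms of the actual mcDGA; the instance, population sizes, and rates are all illustrative.

```python
import itertools, math, random

random.seed(1)
DEPOT = (0.0, 0.0)
CUSTOMERS = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(7)]
N_VEHICLES = 2

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def tour_len(stops):
    pts = [DEPOT] + list(stops) + [DEPOT]     # route starts/ends at depot
    return sum(dist(pts[i], pts[i + 1]) for i in range(len(pts) - 1))

def route_cost(stops):
    # lower level: exact best ordering of one vehicle's stops
    if not stops:
        return 0.0
    return min(tour_len(p) for p in itertools.permutations(stops))

def fitness(assign):
    # upper-level individual = customer-to-vehicle assignment vector
    routes = [[CUSTOMERS[i] for i, v in enumerate(assign) if v == k]
              for k in range(N_VEHICLES)]
    return sum(route_cost(r) for r in routes)

pop = [[random.randrange(N_VEHICLES) for _ in CUSTOMERS] for _ in range(20)]
for gen in range(30):
    pop.sort(key=fitness)
    parents = pop[:8]                         # truncation selection
    pop = parents[:]
    while len(pop) < 20:
        a, b = random.sample(parents, 2)
        cut = random.randrange(1, len(CUSTOMERS))
        child = a[:cut] + b[cut:]             # one-point crossover
        if random.random() < 0.2:             # mutation: reassign a customer
            child[random.randrange(len(child))] = random.randrange(N_VEHICLES)
        pop.append(child)

best = min(pop, key=fitness)
print(round(fitness(best), 3))
```

The key property mirrored here is that the upper level never orders routes itself: each upper-level candidate defines a sub-problem whose optimum is delegated to the lower level.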

Paper Nr: 27
Title:

Home Position Recognition Methods Using Polarizing Films for an Eye-gaze Interface System

Authors:

Kohichi Ogata, Kensuke Sakamoto and Shingo Niino

Abstract: In eye-gaze interface systems, users’ head movements during use result in detection errors. This problem causes inaccurate positioning of the mouse pointer on the display screen. A mechanism that allows the user to recognize the home position, which is an appropriate position for the head while using an eye-gaze interface system, can provide a useful solution to the problem because the user can then simply adjust his/her head position. The implementation of such a mechanism does not require special equipment, such as a position sensor to detect head movement and calculate compensation. We thus propose in this paper methods for recognizing the home position using polarizing films. Taking advantage of the characteristics of polarizing films, we propose two guidance methods to help an eye-gaze interface system user recognize whether or not his/her head is in the home position and adjust the position of the head if needed. The results of our experiments reveal that our proposed methods improve the usability of eye-gaze interface systems, and that one of our methods is more effective than the other. Therefore, our mechanism is useful in producing simple and low-cost interface systems.

Paper Nr: 45
Title:

Automatic Letter/Pillarbox Detection for Optimized Display of Digital TV

Authors:

Lúcia Carreira and Maria Paula Queluz

Abstract: In this paper we propose a method for the automatic detection of the true aspect ratio of digital video, based on detecting the presence and width of horizontal and vertical black bars, also known as letterbox and pillarbox effects. If active format description (AFD) metadata is not present, the proposed method can be used to identify the correct AFD and associate it with the video content. If AFD information is present, the method can be used to verify its correctness and to correct it in case of error. Additionally, the proposed method can detect whether relevant information (such as broadcaster logos and hard subtitles) is merged into the black bars and, in the case of subtitles, can extract them from the bars and relocate them to the active picture area (allowing the letterbox to be removed).
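The core black-bar measurement can be sketched as scanning inward from each frame edge for rows and columns that are uniformly near-black. The luminance threshold is an illustrative assumption; the paper's actual detector, AFD handling, and logo/subtitle logic are not reproduced here.

```python
import numpy as np

def detect_bars(frame, thresh=16):
    """Return (top, bottom, left, right) black-bar widths in pixels.
    A row/column counts as 'black' only if every pixel is below the
    (illustrative) 8-bit luminance threshold."""
    dark_rows = frame.max(axis=1) < thresh    # letterbox candidates
    dark_cols = frame.max(axis=0) < thresh    # pillarbox candidates

    def run_len(flags):                       # length of leading True run
        n = 0
        for f in flags:
            if not f:
                break
            n += 1
        return n

    return (run_len(dark_rows), run_len(dark_rows[::-1]),
            run_len(dark_cols), run_len(dark_cols[::-1]))

# 720-line 16:9 frame carrying a wider (~2.35:1) picture:
# the letterbox bars appear as 88 black lines at top and bottom.
frame = np.zeros((720, 1280), dtype=np.uint8)
frame[88:632, :] = 128                        # active picture area
print(detect_bars(frame))  # (88, 88, 0, 0)
```

From the measured bar widths, the active-picture aspect ratio (and hence the appropriate AFD value) follows directly as `(width - left - right) / (height - top - bottom)`.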

Paper Nr: 55
Title:

Why Using the Alpha-stable Distribution in Neuroimage?

Authors:

Diego Salas-Gonzalez, Juan M. Górriz, Javier Ramírez and Elmar W. Lang

Abstract: The main goal of this contribution is to draw attention to the potential and wide range of applications of the α-stable distribution in biomedical settings, specifically in neuroimaging. The α-stable density is a heavy-tailed, possibly skewed distribution with desirable properties similar to those of the Gaussian; indeed, the Gaussian distribution is a particular case of the α-stable family. Since the Gaussian distribution is used ubiquitously in brain image processing, we believe the α-stable density can potentially serve as an alternative to it in several biomedical applications involving brain imaging. Applications of the α-stable distribution considered in this work include brain image processing approaches for intensity normalization of SPECT images, MRI segmentation, and feature extraction for the diagnosis of parkinsonian syndromes.
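The two properties the abstract relies on, the Gaussian as the α = 2 special case and heavy tails for α < 2, can be illustrated by sampling symmetric α-stable variates with the standard Chambers-Mallows-Stuck method (shown here for the symmetric β = 0 case only):

```python
import numpy as np

def symmetric_stable(alpha, size, rng=None):
    """Chambers-Mallows-Stuck sampler for symmetric (beta = 0)
    alpha-stable variates with unit scale."""
    rng = rng or np.random.default_rng(0)
    U = rng.uniform(-np.pi / 2, np.pi / 2, size)   # uniform angle
    W = rng.exponential(1.0, size)                 # unit exponential
    return (np.sin(alpha * U) / np.cos(U) ** (1 / alpha)
            * (np.cos(U - alpha * U) / W) ** ((1 - alpha) / alpha))

gauss = symmetric_stable(2.0, 100_000)   # alpha = 2: Gaussian, std = sqrt(2)
heavy = symmetric_stable(1.5, 100_000)   # alpha < 2: heavy-tailed
print(round(gauss.std(), 1))             # 1.4, i.e. ~sqrt(2)
print((np.abs(heavy) > 5).mean() > (np.abs(gauss) > 5).mean())  # True
```

At α = 2 the formula reduces to 2 sin(U)√W, a Gaussian with variance 2, while for α < 2 the tail mass beyond any fixed cutoff is markedly larger, which is exactly the heavy-tail behaviour proposed for modelling brain-image intensities.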

Paper Nr: 65
Title:

Gender Classification Using M-Estimator Based Radial Basis Function Neural Network

Authors:

Chien-Cheng Lee

Abstract: A gender classification method using an M-estimator based radial basis function (RBF) neural network is proposed in this paper. In the proposed method, three types of effective features, namely facial texture, hair geometry, and moustache features, are extracted from a face image. An improved RBF neural network based on an M-estimator is then used to classify gender according to the extracted features. The improved RBF network uses an M-estimator in place of the traditional least-mean-square (LMS) criterion to deal with outliers in the data set. The FERET database is used to evaluate our method: 600 images are chosen, of which 300 are used as training data and the remaining 300 as test data. The experimental results show that the proposed method achieves good performance.
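The substitution the abstract describes, an M-estimator in place of the LMS criterion, can be sketched as iteratively reweighted least squares on RBF features using Huber weights. The synthetic regression data, the Huber choice of M-estimator, and all hyperparameters below are illustrative, not the paper's facial-feature pipeline.

```python
import numpy as np

def rbf_features(X, centers, gamma=0.5):
    # Gaussian radial basis functions evaluated at each sample
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def huber_irls(Phi, y, delta=1.0, iters=20):
    """M-estimator fit of the RBF output weights via iteratively
    reweighted least squares with Huber weights: large residuals
    (outliers) get weight delta/|r| instead of 1."""
    w = np.linalg.lstsq(Phi, y, rcond=None)[0]        # LMS initialisation
    for _ in range(iters):
        r = y - Phi @ w
        wt = np.where(np.abs(r) <= delta, 1.0,
                      delta / np.maximum(np.abs(r), 1e-12))
        A = Phi.T @ (Phi * wt[:, None])               # weighted normal eqs
        b = Phi.T @ (y * wt)
        w = np.linalg.solve(A + 1e-8 * np.eye(Phi.shape[1]), b)
    return w

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (100, 1))
y = np.sin(X[:, 0])
y[:10] += 5.0                                         # gross outliers
centers = np.linspace(-3, 3, 10)[:, None]
Phi = rbf_features(X, centers)

w_lms = np.linalg.lstsq(Phi, y, rcond=None)[0]        # plain LMS fit
w_hub = huber_irls(Phi, y)                            # robust M-estimator fit

Xt = np.linspace(-3, 3, 200)[:, None]
err = lambda w: np.mean((rbf_features(Xt, centers) @ w - np.sin(Xt[:, 0])) ** 2)
print(err(w_hub) < err(w_lms))   # robust fit should track sin(x) better
```

The design point mirrored here is that only the training criterion changes: the RBF feature layer is untouched, and robustness comes entirely from down-weighting large residuals during the weight fit.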