SIGMAP 2009 Abstracts


Area 1 - Multimedia Communications

Posters
Paper Nr: 32
Title:

REAL-TIME SVC DECODER IN EMBEDDED SYSTEM

Authors:

Srijib Narayan Maiti, Amit Gupta, Emiliano Mario Piccinelli and Kaushik Saha

Abstract: Scalable Video Coding (SVC) has been standardized as an annex to the existing H.264 specification to bring more scalability into the video standard while keeping compatibility with it. Naturally, the immediate support for SVC in embedded systems will be based on existing implementations of H.264. This paper deals with the implementation of an SVC decoder in an SoC, built on top of an existing implementation of H.264. The additional processing required by various functionalities, as compared to H.264, is also substantiated with profiling information on a four-issue VLIW processor.

Area 2 - Multimedia Signal Processing

Full Papers
Paper Nr: 25
Title:

NOISE POWER ESTIMATION USING RAPID ADAPTATION AND RECURSIVE SMOOTHING PRINCIPLES

Authors:

François Xavier Nsabimana, Udo Zölzer and Vignesh Subbaraman

Abstract: In this paper we present an algorithm for the robust estimation of the noise power from speech signals contaminated by highly non-stationary noise sources, for use in speech enhancement. The noise power is first estimated by minimum statistics principles with a very short window. From the resulting noise power excess, the overestimation is accounted for using recursive averaging techniques. The performance of the proposed technique is finally compared with different existing approaches using various grading tests.
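
As a rough illustration of the two ingredients named in this abstract (a short-window minimum tracker followed by recursive averaging), the Python sketch below processes the per-frame power of a single frequency bin. The function name, the window length `win` and the smoothing constant `alpha` are assumptions chosen for illustration; this is not the authors' algorithm and does not include their overestimation compensation.

```python
def track_noise_power(frame_power, win=4, alpha=0.9):
    """Return a smoothed noise-power estimate for every frame.

    frame_power: list of non-negative power values (one bin over time).
    win:   length of the sliding minimum window (hypothetical value).
    alpha: recursive smoothing constant in [0, 1) (hypothetical value).
    """
    estimates = []
    smoothed = None
    for t in range(len(frame_power)):
        # Minimum statistics: minimum over the last `win` frames.
        start = max(0, t - win + 1)
        local_min = min(frame_power[start:t + 1])
        # First-order recursive averaging of the minimum track.
        if smoothed is None:
            smoothed = local_min
        else:
            smoothed = alpha * smoothed + (1.0 - alpha) * local_min
        estimates.append(smoothed)
    return estimates
```

With a window longer than a short speech burst, the minimum track ignores the burst and keeps following the noise floor, for example in `track_noise_power([1, 1, 1, 9, 9, 1, 1], win=4)`.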

Paper Nr: 38
Title:

DESIGN METHOD FOR WEDGE-SHAPED FILTERS

Authors:

Radu Matei

Abstract: We present an analytical design method for a particular class of two-dimensional filters, namely wedge filters. The method relies on a frequency mapping which is applied to a 1D IIR low-pass prototype filter of a desired shape. We used a flat-top filter and a Gaussian filter as prototypes. Such filters have applications in texture analysis based on spatial filtering using various filter banks. This paper addresses only the wedge filter design method; it does not present an application in texture classification or other image processing tasks, which are extensively treated in other works.

Short Papers
Paper Nr: 10
Title:

A LOSSLESS IMAGE COMPRESSION METHOD WITH THE REVERSIBLE DATA EMBEDDING CAPACITY

Authors:

Ju-Yuan Hsiao and Zhi-Yu Xu

Abstract: Current reversible data embedding techniques can be divided into two categories: those that take untreated original images as the reversible target and those that take images treated by lossy compression as the reversible target. So far, no literature has discussed reversible data embedding performed during lossless still image compression. Our method employs Chuang and Lin’s simplified lossless image compression method to develop a reversible data embedding technique. It successfully embeds the secret data in the lossless image compression codes and restores the original image once the secret data have been retrieved. According to the experimental results, the proposed method performs well in terms of embedding capacity and compression.

Paper Nr: 15
Title:

NEW APPROACH TO AUDIO SEARCH BASED ON KEY-EVENT DETECTION PROCEDURE

Authors:

Igor E. Kheidorov, Peter D. Kukharchik and Yan Jingbin

Abstract: In this paper a new approach to audio search in multimedia databases is developed, based on a key-event detection procedure. The main idea of the proposed approach is to represent each audio fragment as a sequence of context-dependent “key-events”, specially determined or taken from the database. Wavelets and the support vector method were used as the basis for the procedure of key-event selection and classifier training. The experiments show good results and prospects for the proposed approach.

Paper Nr: 18
Title:

A HIGH-LEVEL KERNEL TRANSFORMATION RULE SET FOR EFFICIENT CACHING ON GRAPHICS HARDWARE - Increasing Streaming Execution Performance with Minimal Design Effort

Authors:

Sammy Rogmans, Philippe Bekaert and Gauthier Lafruit

Abstract: This paper proposes a high-level rule set that allows algorithmic designers to optimize their implementation on graphics hardware with minimal design effort. The rules suggest possible kernel splits and merges to transform the kernels of the original design, resulting in an inter-kernel rather than low-level intra-kernel optimization. The rules consider both traditional texture caches and next-gen shared memory – which are used in the abstract stream-centric paradigms such as CUDA and Brook+ – and can therefore be implicitly applied in most generic streaming applications on graphics hardware.

Paper Nr: 27
Title:

ROBUST VOICE ACTIVITY DETECTION BASED ON PITCH AND SUB-BAND ENERGY

Authors:

Zhihao Zhang and Jinlong Lin

Abstract: A new Voice Activity Detection (VAD) method is proposed that tracks various background noises and is robust in both stationary and variable noise environments. Many previous VAD methods assume that the background contains only certain kinds of noise, so they cannot deal efficiently with the noise found in practical applications. In the proposed approach, determinate speech, determinate noise and potential speech regions are defined. The first two regions are located using extracted pitch contour information, and the ambiguous region is further resolved using updated thresholds of sub-band energy in the frequency domain of the obtained determinate noise. Experiments are carried out with an exhaustive comparison to three standard VAD methods: G729b, ETSI AFE and AMR. The results show that our approach is more robust than the others in real conditions.

Paper Nr: 31
Title:

HIGH-RESOLUTION IMAGE GENERATION USING WARPING TRANSFORMATIONS

Authors:

Gabriel Scarmana

Abstract: In recent years the emphasis for applications of 3D modelling has shifted from measurement to visualization. New communication and visualization technologies have created an important demand for photo-realistic content in 3D real-time animations, interactive fly-overs and walk-arounds, panoramic images, visualizations and simulations based on real-world data. These image-based approaches require acquisition procedures that are simple and flexible with the use of consumer photo- or video-cameras. Ideally the user should be able to move freely while acquiring the images with any device ranging from a mobile phone to a video camera. In this context, a device independent algorithm for the estimation of an enhanced resolution image from multiple low-resolution and distorted compressed video images having arbitrary views is proposed in this paper. This process of spatial image enhancement is demonstrated here in a controlled scenario whereby the different views of the same scene are warped, firstly, to a common orientation so that a rigorous least squares area-based matching technique can then compute the registration parameters needed for their accurate combination. The sequence is acquired using a digital camera in video mode, which samples the image of a static scene from different angles. The warping is an iterative process relying on manual intervention and is used here to compensate for the different range of scales and orientations of the low-resolution imagery. Once this imagery is brought into registration and complies with pre-established image correlation criteria, they are combined to recover a high-resolution composite. Although the quality and resolution of the sensor arrays used to capture digital data continue to evolve, it is important that any technique used to enhance spatial resolution must be device independent, thus capable of using input from not only low-resolution images but also from higher resolution devices.

Paper Nr: 43
Title:

ROBUST AUTHENTICATION USING LIKELIHOOD RATIO BASED SCORE FUSION OF VOICE AND FACE

Authors:

Messaoud Bengherabi, Lamia Mezai, Farid Harizi, Abderrazak Guessoum and Mohamed Cheriet

Abstract: With the increased use of biometrics for identity verification, there has been a similar increase in the use of multimodal fusion to overcome the limitations of unimodal biometric systems. While there are several types of fusion (e.g. decision level, score level, feature level, sensor level), research has shown that score level fusion is the most effective in delivering increased accuracy. Recently, a promising framework for the optimal combination of match scores based on the likelihood ratio test was proposed, in which the distributions of genuine and impostor match scores are modelled as finite Gaussian mixture models. In this paper, we examine the performance of combining face and voice biometrics at the score level using the LR classifier. Our experiments on the publicly available scores of the XM2VTS Benchmark database show a consistent improvement in performance compared to the well-known and efficient sum rule preceded by Min–Max, z-score and tanh score normalization techniques.
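
To make the two fusion styles mentioned in this abstract concrete, the Python sketch below contrasts a sum rule preceded by Min–Max normalization with a likelihood-ratio score. It is a simplified illustration under stated assumptions: each class density is a single Gaussian rather than the finite Gaussian mixture used in the framework, the modalities are treated as independent, and all function names and parameters are hypothetical.

```python
import math

def min_max(score, lo, hi):
    """Min-Max normalization of a raw score to [0, 1], given its known range."""
    return (score - lo) / (hi - lo)

def sum_rule(scores, ranges):
    """Average of Min-Max normalized scores (simple sum-rule fusion)."""
    normalized = [min_max(s, lo, hi) for s, (lo, hi) in zip(scores, ranges)]
    return sum(normalized) / len(normalized)

def log_gauss(x, mean, std):
    """Log-density of a univariate Gaussian."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)

def log_likelihood_ratio(scores, genuine, impostor):
    """Fused log LR; assumes independent modalities, one Gaussian per class.

    genuine / impostor: lists of (mean, std) pairs, one per modality.
    A positive value favours the genuine hypothesis.
    """
    total = 0.0
    for s, (g_mean, g_std), (i_mean, i_std) in zip(scores, genuine, impostor):
        total += log_gauss(s, g_mean, g_std) - log_gauss(s, i_mean, i_std)
    return total
```

A claim would then be accepted when the fused value exceeds a threshold chosen on development data; the threshold itself is outside the scope of this sketch.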

Paper Nr: 45
Title:

THE NEED FOR IMPULSIVITY & SMOOTHNESS - Improving HCI by Qualitatively Measuring New High-Level Human Motion Features

Authors:

Barbara Mazzarino and Maurizio Mancini

Abstract: The aim of this paper is to develop algorithms to measure motion features by investigating concepts which are commonly used to describe movement characteristics in both research studies and everyday life: impulsivity and smoothness. We also aim to implement such definitions in our developing environment VisNet and finally test if they can effectively measure impulsivity and smoothness in the same way these characteristics are perceived by human users.

Paper Nr: 46
Title:

TOWARD A SEMI-SUPERVISED APPROACH IN CLASSIFICATION BASED ON PRINCIPAL DIRECTIONS

Authors:

Luminita State, Catalina Cocianu, Doru Constantin, Corina Sararu and Panayiotis Vlamos

Abstract: Since similarity plays a key role in both clustering and classification, the problem of finding relevant indicators to measure the similarity between two patterns drawn from the same feature space has become of major importance. The advantages of using principal components stem from the fact that the bands are uncorrelated, so no information contained in one band can be predicted from knowledge of the other bands. The semi-supervised learning (SSL) problem has recently drawn considerable attention in the machine learning community, mainly due to its importance in practical applications. The aim of the research reported in this paper is to present experimentally derived conclusions on the performance of a PCA-based supervised technique in a semi-supervised environment. A series of conclusions, established experimentally by tests performed on samples of signals coming from two classes, is presented in the final section of the paper.

Paper Nr: 47
Title:

IMPROVED FUZZY-C-MEANS FOR NOISY IMAGE SEGMENTATION

Authors:

Moualhi Wafa and Ezzeddine Zagrouba

Abstract: Magnetic resonance (MR) imaging is an important diagnostic imaging technique for the early detection of abnormal changes in brain tissues. However, a serious limitation of MR images is the significant amount of noise, which can lead to inaccurate segmentation. In this paper, a robust segmentation method is realized, based on an improvement of the conventional Fuzzy-C-Means (FCM) obtained by modifying its membership function. A neighborhood attraction depending on the relative location and features of neighboring pixels is incorporated into the membership function to make the method robust to noise. Simulated and real brain MR images with different noise levels are used to demonstrate the superiority of the proposed method compared to some other FCM-based methods.
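
For reference, the membership function of the conventional FCM that this paper sets out to modify can be sketched as follows in Python. The sketch covers only the standard update for a scalar intensity value; the neighborhood attraction term proposed in the paper is not reproduced, and the function name and default fuzzifier `m=2.0` are assumptions for illustration.

```python
def fcm_memberships(value, centers, m=2.0):
    """Conventional FCM memberships of one intensity value to each center.

    Implements u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)),
    where d_i is the distance from `value` to center i and m > 1.
    """
    dists = [abs(value - c) for c in centers]
    # A value that coincides with a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** power for dj in dists) for di in dists]
```

Because each membership depends only on the pixel's own intensity, a single noisy pixel can be assigned to the wrong cluster, which is the weakness a neighborhood term is meant to address.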

Paper Nr: 53
Title:

MAMMOGRAPHIC IMAGE ANALYSIS FOR BREAST CANCER DETECTION USING COMPLEX WAVELET TRANSFORMS AND MORPHOLOGICAL OPERATORS

Authors:

V. Alarcon-Aquino, O. Starostenko, R. Rosas-Romero, J. Rodriguez-Asomoza, O. J. Paz-Luna, K. Vazquez-Muñoz and L. Flores-Pulido

Abstract: This paper presents an approach for the early diagnosis of breast cancer using the dual-tree complex wavelet transform (DT-CWT), which detects micro-calcifications in digital mammograms. The approach follows four basic strategies, namely, image denoising, band suppression, morphological transformation and inverse complex wavelet transform. The procedure of image denoising is carried out with a thresholding algorithm that recursively computes the optimal threshold at each level of wavelet decomposition. In order to maximize the detection, a morphological conversion is proposed and applied to the high-frequency sub-bands of the wavelet transformation. This procedure is applied to a set of digital mammograms from the Mammography Image Analysis Society (MIAS) database. Experimental results show that the proposed denoising algorithm and morphological transformation in combination with the DT-CWT procedure perform better than previously reported approaches.

Paper Nr: 54
Title:

A REVERSIBLE DATA HIDING SCHEME TO INVERSE HALFTONING

Authors:

Jia-Hong Lee, Hong-Jie Wu and Mei-Yi Wu

Abstract: A new inverse halftoning algorithm based on reversible data hiding techniques for halftone images is proposed in this research. The proposed scheme has the advantages of two commonly used methods, the look-up table (LUT) and Gaussian filtering methods. We embed a part of the important LUT templates into a halftone image and restore the image losslessly after these templates have been extracted. Then a hybrid method is performed to reconstruct a gray-scale image from the halftone image. In the image reconstruction process, the halftone image is scanned pixel by pixel. If the scanned pattern surrounding a pixel appears in the LUT templates, the gray value is predicted directly from the LUT value; otherwise, it is predicted using Gaussian filtering. Experimental results show that the gray-scale images reconstructed using the proposed scheme have better quality than those of both the LUT and Gaussian filtering methods.

Paper Nr: 55
Title:

SPEAKER VERIFICATION SYSTEM THROUGH TELEPHONE CHANNEL - An Integrated System for Telephony Plataform Asterisk

Authors:

Alexandre Maciel, Weber Campos, Clêunio França and Edson Carvalho

Abstract: This work presents a speaker verification system based on the GMM approach, designed for integration with Asterisk telephony platforms. A dedicated database with 30 speakers was developed and tested for acceptance and rejection. The results were effective, with success rates of 90%.

Paper Nr: 57
Title:

A PSYCHOACOUSTICALLY MOTIVATED SOUND ONSET DETECTION ALGORITHM FOR POLYPHONIC AUDIO

Authors:

Balaji Thoshkahna and K. R. Ramakrishnan

Abstract: We propose an algorithm for sound onset detection applying principles of psychoacoustics. A popular model of loudness perception in the human auditory system is used to compute a novelty function that allows for a more robust detection of onsets. The psychoacoustics paradigm also allows us to define thresholds for the novelty function that are both physically and perceptually meaningful and hence easy to manipulate according to the application. The algorithm performs well, with an overall detection accuracy of 86% for monophonic audio and 82% for polyphonic audio.

Posters
Paper Nr: 4
Title:

A SHAPE ERROR CONCEALMENT TECHNIQUE FOR ROBUST MPEG-4 SYSTEM

Authors:

Shih-Chang Hsia and Cheng Hung Hsiao

Abstract: This paper presents a fast and efficient error concealment method for recovering shape information. The proposed technique consists of block classification, edge direction interpolation and filtering interpolation. Using a logic criterion, each missing block is classified into one of four types: transparent, opaque, edge or isolated. Most of the computation is spent on the edge and isolated blocks to obtain a better cost-performance tradeoff. For edge-block recovery, the edge slope is computed by referring to a nearby available block, and the missing shape is then interpolated along the edge direction. Isolated blocks are handled with a cascaded filter that approximates the real shape. The experimental results show that the proposed method achieves better cost-performance in restoring the shape information than the competing algorithms, in terms of both the numerical parameters and the shape images. The processing speed is about 2 to 3 times faster than that of well-known methods. The adaptive algorithm has a low computational overhead, which makes it applicable to a real-time MPEG-4 coding system.

Paper Nr: 12
Title:

PARAMETER OPTIMIZATION IN TIME-FREQUENCY ε-FILTER BASED ON CORRELATION COEFFICIENT

Authors:

Tomomi Abe, Mitsuharu Matsumoto and Shuji Hashimoto

Abstract: The time-frequency ε-filter (TF ε-filter) can reduce most kinds of noise in a single-channel noisy signal while preserving a signal that varies drastically, such as a speech signal. The filter design is simple and it reduces noise effectively: not only small stationary noise but also large nonstationary noise. However, it has some parameters, which so far need to be set appropriately on an empirical basis. There have been few studies evaluating the appropriateness of ε-filter parameter settings in general. In this paper, we employ the correlation coefficient between the filter output and the difference between the input and the filter output as the evaluation function for the parameter setting. We also present an algorithm to set the optimal parameter of the TF ε-filter. We conducted experiments to compare the value of the correlation coefficient and the mean square error as the ε value changes. The experimental results show the applicability of our criterion to the parameter setting of the ε-filter.

Paper Nr: 13
Title:

NOISE REDUCTION BASED ON MEDIAN ε-FILTER

Authors:

Mitsuharu Matsumoto

Abstract: This paper describes a nonlinear filter, labeled the median ε-filter, which can reduce impulse noise while preserving edge information. The ε-filter is a nonlinear filter that can reduce small-amplitude noise while preserving edge information. Its algorithm is simple and it has many applications because it uses only switching and linear operations. Although it is difficult to reduce impulse noise with the ε-filter due to its characteristics, impulse noise can be reduced effectively while preserving edge information by combining the concepts of the median filter and the ε-filter. Due to the simple design, the calculation cost is relatively small, as with the ε-filter. To show the effectiveness of the proposed method, we also report the results of some comparative experiments concerning the filter characteristics.
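
The "switching and linear operations" of the conventional ε-filter can be illustrated with the short Python sketch below, which uses the common form y(k) = x(k) + Σ a_i F(x(k+i) − x(k)), with F(d) = d when |d| ≤ ε and 0 otherwise. Uniform weights and border clamping are assumptions made here for simplicity; the median ε-filter proposed in the paper is not reproduced.

```python
def epsilon_filter(signal, eps, half_width=1):
    """Conventional 1D epsilon-filter with uniform weights.

    Neighbours that differ from the centre sample by more than `eps`
    are switched off, so large edges are left untouched.
    """
    n = len(signal)
    taps = 2 * half_width + 1
    out = []
    for k in range(n):
        acc = 0.0
        for i in range(-half_width, half_width + 1):
            j = min(max(k + i, 0), n - 1)   # clamp indices at the borders
            diff = signal[j] - signal[k]
            if abs(diff) <= eps:            # switching operation
                acc += diff / taps          # linear smoothing, a_i = 1/taps
        out.append(signal[k] + acc)
    return out
```

In this sketch a step such as `[0, 0, 0, 10, 10, 10]` passes through unchanged for `eps=1`, but so does an isolated impulse such as `[0, 0, 10, 0, 0]`, which is consistent with the abstract's remark that impulse noise is hard to remove with the ε-filter alone.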

Paper Nr: 14
Title:

MIMO INSTANTANEOUS BLIND IDENTIFICATION BASED ON STEEPEST DESCENT METHOD

Authors:

Shen Xizhong, Hu Dachao and Meng Guang

Abstract: This paper presents a new MIMO instantaneous blind identification algorithm based on second-order temporal properties and the steepest descent method. The second-order temporal structure is reformulated in a particular way such that each column of the unknown mixing matrix satisfies a system of nonlinear multivariate homogeneous polynomial equations. The nonlinear system is solved by the steepest descent method. We construct a general objective for the system and convert the nonlinear problem into an optimization problem. Our algorithm allows the mixing matrix to be estimated for scenarios with, for example, 4 sources and 3 sensors. Finally, simulations show its effectiveness, with more accurate solutions than the algorithm based on the homotopy method.

Paper Nr: 42
Title:

AN ADAPTIVE COMPUTATION-AWARE ALGORITHM FOR MULTI-FRAME VARIABLE BLOCK-SIZE MOTION ESTIMATION IN H.264/AVC

Authors:

Mariusz Jakubowski and Grzegorz Pastuszak

Abstract: Block-matching motion estimation (BME) is the most computationally expensive process in every video codec. The algorithm proposed in this paper takes into account almost all key elements of BME, including integer-pixel ME (IPME), sub-pixel ME (SPME), variable block-size ME (VBSME) and multiple reference frame ME (MRFME). The algorithm is developed by adding an MRFME method to the multi-path adaptive computation-aware ME strategy (MPS) introduced in our previous papers. The algorithm, implemented in the H.264/AVC reference software, achieves results comparable to the fast full search (FFS) method in less than 3% of the execution time required by FFS.
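
To show why block matching is so costly, the Python sketch below implements a plain integer-pixel full search with the sum of absolute differences (SAD) as the matching cost. It is a baseline illustration only: it is neither the FFS implementation of the reference software nor the MPS strategy of the paper, and the function names and frame representation (lists of rows) are assumptions.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and its displaced candidate."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total

def full_search(cur, ref, bx, by, size, radius):
    """Exhaustive integer-pixel search; returns (best_sad, dx, dy)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or bx + dx < 0:
                continue
            if by + dy + size > height or bx + dx + size > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

The search evaluates up to (2·radius + 1)² candidates per block, and this count is multiplied again by every block size and reference frame considered, which is the cost that computation-aware strategies try to cut.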

Paper Nr: 61
Title:

FAST SPEAKER ADAPTATION IN AUTOMATIC ONLINE SUBTITLING

Authors:

Aleš Pražák, Z. Zajíc, L. Machlica and J. V. Psutka

Abstract: This paper deals with speaker adaptation techniques well suited to the task of online subtitling. Two methods are briefly discussed, namely MAP adaptation and fMLLR. The main emphasis is laid on the description of improvements to the adaptation process subject to the time requirements. Since the adaptation data are gathered continuously, simple modifications of the accumulated statistics have to be carried out in order to make the adaptation more accurate. Another proposed improvement efficiently employs the combination of fMLLR and MAP. In the case of online adaptation no prior transcriptions of the data are available. The transcriptions are produced by a recognition system, so it is appropriate to assign a suitable confidence measure to each of them. We have performed experiments focused on the trade-off between the adaptation speed and the amount of adaptation data. We were able to gain a relative WER reduction of 16.2%.
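
The idea of adapting from continuously accumulated statistics can be sketched with the textbook MAP update of a single Gaussian mean, μ = (τ·μ₀ + Σ γₜxₜ) / (τ + Σ γₜ). The Python class below is a minimal scalar illustration; the class name, the relevance factor `tau`, and the suggestion to scale `gamma` by a confidence score are assumptions, not the modifications described in the paper.

```python
class MapMeanAdapter:
    """Running MAP estimate of one Gaussian mean (scalar case)."""

    def __init__(self, prior_mean, tau=10.0):
        self.prior_mean = prior_mean  # speaker-independent mean
        self.tau = tau                # relevance factor (hypothetical value)
        self.gamma_sum = 0.0          # accumulated occupation counts
        self.weighted_sum = 0.0       # accumulated weighted observations

    def accumulate(self, x, gamma=1.0):
        """Add one observation; gamma could be scaled by a confidence score."""
        self.gamma_sum += gamma
        self.weighted_sum += gamma * x

    def mean(self):
        """MAP mean: interpolates between the prior and the data average."""
        return (self.tau * self.prior_mean + self.weighted_sum) / (
            self.tau + self.gamma_sum
        )
```

With little data the estimate stays close to the prior mean, and it moves toward the speaker's data as the accumulated counts grow, which is the speed-versus-data trade-off the abstract refers to.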

Paper Nr: 62
Title:

TRAINING OF SPEAKER-CLUSTERED ACOUSTIC MODELS FOR USE IN REAL-TIME RECOGNIZERS

Authors:

Jan Vaněk, Josef V. Psutka, Aleš Pražák and Josef Psutka

Abstract: The paper deals with the training of speaker-clustered acoustic models. Various training techniques (Maximum Likelihood, Discriminative Training, and two adaptation techniques based on MAP and Discriminative MAP) were tested in order to minimize the impact of speaker changes on the correct function of the recognizer when the response of the automatic cluster detector is delayed or incorrect. Such a situation is very frequent, e.g. in online subtitling of TV discussions (Parliament meetings). In our experiments the best cluster-dependent training procedure was discriminative adaptation, which provided the best trade-off between recognition results with correct and incorrect cluster detector information.

Area 3 - Multimedia Systems and Applications

Full Papers
Paper Nr: 17
Title:

ROBUST AND REVERSIBLE NUMERICAL SET WATERMARKING

Authors:

Gaurav Gupta, Josef Pieprzyk and Mohan Kankanhalli

Abstract: Numeric sets can be used to store and distribute important information such as currency exchange rates and stock forecasts. It is useful to watermark such data to prove ownership in case of illegal distribution. This paper analyzes the numerical set watermarking model presented by Sion et al. in “On watermarking numeric sets”, identifies its weaknesses, and proposes a novel scheme that overcomes these problems. One of the weaknesses of Sion’s watermarking scheme is the requirement to have a normally distributed set, which is not true for many numeric sets such as forecast figures. Experiments indicate that the scheme is also susceptible to subset addition and secondary watermarking attacks. The watermarking model we propose can be used for numeric sets with arbitrary distribution. Theoretical analysis and experimental results show that the scheme is strongly resilient against sorting, subset selection, subset addition, distortion, and secondary watermarking attacks.

Paper Nr: 48
Title:

RECOMMENDER TV - A Personalized TV Guide System Compliant with Ginga

Authors:

Paulo Muniz de Ávila and Sergio Donizetti Zorzo

Abstract: With the advent of digital television and the possibility of transmitting new services (channels, in the analog system), much more information will be made available than in the traditional analog system. The electronic programming guide (EPG) responsible for organizing such information has become inefficient because of the large volume of data supplied by service providers. For viewers to deal with this information overload, tools are needed that identify their needs and preferences. Personalized recommendation systems emerge as a solution to this problem, offering viewers programs relevant to their profiles. In this paper we present a personalized recommendation system, RecommenderTV, which is compliant with the reference implementation of the Ginga middleware. Implementing RecommenderTV required adding new features to the Ginga-NCL middleware that do not exist in the reference implementation. The service providers (broadcasters, in the analog system) and their importance to RecommenderTV are also discussed in this work. Finally, we report the results obtained from experiments with the implemented recommendation system.

Paper Nr: 49
Title:

DIGITAL RADIO AS AN ADAPTIVE SEARCH ENGINE - Verbal Communication with a Digital Audio Broadcasting Receiver

Authors:

Günther Schatter, Andreas Eiselt and Benjamin Zeller

Abstract: This contribution presents the fundamentals for the design and realization of an improved digital broadcast receiver. The concept is characterized on the one hand by a content repository, which automatically stores spoken content elements. On the other hand a speech-based and a graphical interface are implemented, which enable users to directly search for audio and data content. For that purpose a DAB/DMB receiver is augmented to simultaneously monitor a variety of information sources. Other digital communication methods such as Satellite and HD Radio™ or DVB are applicable as well. Besides broadcast receivers, the system can be integrated into next-generation mobile devices, such as cell phones, PDAs and sophisticated car radios. Priority is given to the idea of establishing a compact embedded solution for a new type of receiver, promoting the convergence of service offers familiar to users from the internet into the broadcast environment.

Short Papers
Paper Nr: 39
Title:

THE NEW DIGITAL TELEVISION AND THE INTERACTIVITY IN ITS MULTIMEDIA APPLICATIONS - Finding the Interactivity Using a Low Cost Solution

Authors:

José Luis Redondo García, José Luis González-Sánchez, Alfonso Gazo Cervero and Javier Corral-García

Abstract: The new interactive digital television, based on transmission standards like DVB-T, offers an interesting variety of services to both producers and consumers. This technology requires special decoders (STBs), which implement an intermediary software layer (MHP) that is able to execute applications. In order to achieve interactivity in these applications, STBs must have what is known as a return channel. To put this theory into practice we have created a scenario for interactive MHP applications using the Strong SRT 5510 decoder, aiming to show that it is possible to build low-cost solutions. With this infrastructure, UEX-TVi has been developed. University students can use it to access the topics of every subject, take tests and quizzes, and view web pages on TV. The potential of iTV is clearly demonstrated. Finally, the role that MHP will play in the future has been analysed, as well as the importance of the arrival of the new IP Television.

Paper Nr: 51
Title:

PANORAMIC IMMERSIVE VIDEOS - 3D Production and Visualization Framework

Authors:

W. J. Sarmiento and C. Quintero

Abstract: Panoramic immersive video is a new technology that allows the user to interact with a video beyond a simple production line, because it makes it possible to navigate the scene from different points of view. Although many devices for the production of panoramic videos have been proposed, these are still expensive. In this paper a framework for the production of virtual panoramic immersive videos using 3D production software is presented. The framework is composed of two stages: panoramic video production and immersive visualization. In the former stage a traditional 3D scene is taken as input and two outputs are generated: the panoramic video and the path sounds for immersive audio reproduction. In the latter, a desktop CAVE assembly is proposed in order to provide an immersive display.

Posters
Paper Nr: 11
Title:

DCADH: A GENERATING ALGORITHM OF DELAY-CONSTRAINED MULTICAST ROUTING TREE

Authors:

Yu-xi Zhu and Ling Zhou

Abstract: Multicast is the ability of a communication network to accept a single message from an application and to deliver copies of the message to multiple recipients at different locations. With the development of the Internet, multicast is widely applied in all kinds of real-time multimedia applications: distributed multimedia systems, collaborative computing, video-conferencing, distance education, etc. In order to construct a delay-constrained multicast routing tree, the average distance heuristic (ADH) algorithm is first analyzed, and a delay-constrained algorithm based on it, called DCADH, is then presented. Using ADH, a least-cost multicast tree can be constructed; if a path delay does not meet the delay upper bound, a shortest-delay path computed by Dijkstra’s algorithm is merged into the existing multicast tree to meet the bound. Simulation experiments show that DCADH performs well in achieving a low-cost multicast tree.
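
The repair step mentioned in this abstract relies on a shortest-delay path computed by Dijkstra’s algorithm. The Python sketch below shows that building block only; the ADH tree construction and the merging of the path into the tree are not reproduced, and the graph representation (a dictionary of neighbour-to-delay maps) and function name are assumptions.

```python
import heapq

def shortest_delay_path(graph, source, target):
    """Dijkstra over link delays.

    graph: {node: {neighbour: delay}} with non-negative delays.
    Returns (total_delay, path), or (inf, []) if target is unreachable.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue            # stale heap entry
        done.add(u)
        if u == target:
            break
        for v, delay in graph.get(u, {}).items():
            nd = d + delay
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return float("inf"), []
    # Walk the predecessor links back from the target.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path
```

A delay-constrained scheme of this kind would compare the delay of the tree path to a destination against the bound and, when the bound is exceeded, fall back to the path returned here.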

Paper Nr: 29
Title:

QUALITY OF SERVICE ISSUES FOR MULTISERVICE IP NETWORKS

Authors:

Matej Kavacky, Erik Chromý, Lenka Krulikovská and Juraj Pavlovič

Abstract: Our paper deals with the utilization of Quality of Service (QoS) mechanisms for a backbone IP network capable of transporting voice, data and video services. The first part presents selected QoS mechanisms for multiservice IP networks. The next part discusses the impact of selected QoS mechanisms on multimedia traffic through simulations. In the final part we propose a combination of QoS methods for a selected network configuration.

Paper Nr: 40
Title:

THE HIXOSFS MUSIC APPROACH VS COMMON MUSICAL FILE MANAGEMENT SOLUTIONS

Authors:

Nicola Corriero, Vittoria Cozza and Fabrizio Fattibene

Abstract: Hixosfs music is an extension of the Linux ext2 filesystem, with additional features for easily accessing and categorizing musical files. Specifically, it extends the inode struct inside the virtual file system, treating meta information related to the content of a musical file, such as album, author and title, as file properties. Comparisons have been made with the Linux user-space file system Musicmeshfs, the Linux ext2 xattr feature, and ad hoc user-space programs for efficiently retrieving multimedia data. Since Hixosfs music manages the musical tags at kernel level, it offers higher performance than the other solutions but with less flexibility.

Paper Nr: 41
Title:

CONTEXT AWARENESS USING ENVIRONMENTAL SOUND CUES AND COMMONSENSE KNOWLEDGE

Authors:

Mostafa Al Masum Shaikh, Keikichi Hirose and Helmut Prendinger

Abstract: Detecting or inferring human activity (e.g., an outdoor activity) by analyzing sensor data is often inaccurate, insufficient, difficult, and expensive. Therefore, this paper explains an approach to infer human activity and location from environmental sound cues and commonsense knowledge of everyday object usage. Our system uses mel-frequency cepstral coefficients (MFCC) and their derivatives as features, and continuous density hidden Markov models (HMM) as acoustic models. Our work differs from others in three key ways. First, we utilize both indoor and outdoor environmental sound cues, which are annotated according to the objects pertaining to the sound samples, to build an association between sounds and the objects that produce them. Second, by using a portable microphone instead of a fixed array of microphones to capture environmental sound, we can also infer outdoor environments such as being on the road or in a train station, which previous research was limited in its ability to do. Third, our model can easily incorporate new sets of activities by adding more appropriately annotated sound clips and re-training the HMM-based recognizer. A perceptual test is conducted to study human accuracy in the task and to obtain a baseline for assessing the performance of the system. Although the system’s performance is somewhat worse than human performance in a direct comparison, the preliminary results are encouraging, with accuracy rates for outdoor and indoor sound categories for activities above 67% and 61% respectively.