SIGMAP 2007 Abstracts
CONFERENCE
Area 1 - Multimedia Communications
Area 2 - Multimedia Signal Processing
Area 3 - Multimedia Systems and Applications
Title:
IMS SECURED CONTENT DELIVERY OVER PEER-TO-PEER NETWORKS
Author(s):
Jens Fiedler, Thomas Magedanz and Alejandro Menendez
Abstract:
Effective content distribution, which is safe against denial of service attacks, is one of the greatest challenges for content and service providers. Peer-to-peer technologies are known to be unaffected by such attacks, but lack any control by content owners or copyright holders. The work presented in this paper combines the effective and reliable content availability known from P2P with the capabilities of IMS, which is used for access control, charging and service discovery. Commercial use cases are discussed for content consumption and provisioning.

Title:
A METHODOLOGY FOR THE DEPLOYMENT OF LIVE AUDIO AND VIDEO SERVICES
Author(s):
D. Melendi, X. G. Pañeda, M. Vilas, R. Garcia and V. Garcia
Abstract:
Since the development of the first live audio and video services in the 90s, the deployment of these services has always been a challenging issue. Not only is it necessary to deal with the problems of the delivery of continuous information and the high consumption of resources, but also with those imposed by the nature of these services. Service managers do not get a second chance to broadcast live contents so it is important to ensure that everything works as planned. Most service managers only work based on their own experience, but they rarely follow any standardized method. With the aim of improving the current situation, the authors have designed a methodology for the deployment of live audio and video which is presented in this paper. The methodology tries to cover almost all the issues that may arise while putting one of these services into operation and proposes mechanisms to deal with those issues from a management perspective. It has been successfully used by the authors in the deployment of several live services for different companies.

Title:
PERFORMANCE OF AUDIO/VIDEO SERVICES ON CONSTRAINED VARIABLE USER ACCESS LINES
Author(s):
M. Vilas, X. G. Pañeda, D. Melendi, R. Garcia and V. Garcia
Abstract:
Nowadays, it is more and more common for the same access line to be shared among different services and even among different users. This change in home users’ behaviour, which has given rise to resource consumption close to the maximum available in user access lines, is mainly due to the increase in subscriber access capabilities that has taken place in the last few years. At the same time, the contracts between customers and network operators only provide guarantees for a reduced percentage of the maximum download/upload capacity of the line. In this paper, a study of the effects on streaming services caused by variations on the access line and by the traffic of other services is carried out. One of the main conclusions of the paper is that the delivery rate of UDP streaming sessions is mainly guided by the quality of the contents and does not take into account the congestion in the network. For this reason, a method for delivery rate estimation for UDP streaming sessions is presented.

Title:
LOW COMPLEXITY, LOW DELAY AND SCALABLE AUDIO CODING SCHEME BASED ON A NOVEL STATISTICAL PERCEPTUAL QUANTIZATION PROCEDURE
Author(s):
Cesar Alonso Abad, Miguel Angel Martín Fernandez, Carlos Alberola Lopez
Abstract:
In this paper we present Fast Perceptual Quantization (FPQ), a novel procedure to quantize and code audio signals. It employs the same psychoacoustic principles used in the popular MPEG/Audio coders, but substantially simplifies the complexity and computational needs of the encoding process. FPQ is based on defining a hierarchy of privileged quantization values so that the masking threshold calculated through a psychoacoustic model is leveraged to quantize the real values to the privileged ones when possible. The computational cost of this process is very low compared to MP3’s or AAC’s quantization/coding loops. Experimental results show that it is possible to achieve nearly transparent coding using as few as approximately 100 quantization values. This leads to very efficient bit compaction using Huffman or arithmetic coding, so that nearly state-of-the-art performance can be achieved in terms of the quality/bit-rate trade-off. Since the quantization and codification (bit compaction) procedures are completely independent here, efficient scalable decoding can be achieved either by parsing and entropy re-encoding the original quantized values or by coding the bit-planes independently and sorting them in order of perceptual significance. It is also possible to achieve very low delay, which makes the proposed coding scheme suitable for real-time applications.
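The abstract does not reproduce the exact FPQ procedure; purely as an illustration of the general idea of snapping spectral values to a small set of privileged quantization levels whenever the error stays below a psychoacoustic masking threshold, one might sketch it as follows (the level set, threshold values and function names are hypothetical):

import numpy as np

def quantize_to_privileged(spectrum, mask_threshold, levels):
    """Snap each spectral value to the nearest privileged level if the
    quantization error is below the masking threshold; otherwise keep
    the original value.  Illustrative sketch only."""
    spectrum = np.asarray(spectrum, dtype=float)
    levels = np.sort(np.asarray(levels, dtype=float))
    # nearest privileged level for every coefficient
    idx = np.argmin(np.abs(spectrum[:, None] - levels[None, :]), axis=1)
    snapped = levels[idx]
    error = np.abs(snapped - spectrum)
    return np.where(error <= mask_threshold, snapped, spectrum)

# toy usage with roughly 100 privileged values, as mentioned in the abstract
coeffs = np.random.randn(1024) * 10.0
privileged = np.linspace(-30, 30, 101)
masked = quantize_to_privileged(coeffs, mask_threshold=0.5, levels=privileged)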

Title:
DIFFUSE MATRIX An Optimized Data Structure for the Storage and Processing of Hyperspectral Images
Author(s):
Jose M. Chaves-González, Miguel A. Vega-Rodríguez, Pablo J. Martínez-Cobo, Juan A. Gómez-Pulido and Juan M. Sánchez-Pérez
Abstract:
This paper proposes a new format for storing and processing hyperspectral images captured by the AVIRIS spectrometer (Airborne Visible/InfraRed Imaging Spectrometer). Obtaining such images is difficult because the sensor that takes the images is carried in an aircraft that suffers turbulence while the camera is taking photos, so a geo-rectification process is necessary to correct the information of the different bands. The format proposed in this paper, DMF (Diffuse Matrix Format), allows more efficient storage because a list with the original information received by the sensor is saved for each position (X,Y) of the scanned ground. The format of the list saves space and time because no redundant information is stored. To show the possibilities of this new format, an application that performs some thresholding and filter operations has been built. This program first creates the diffuse matrix in memory from the file that stores the image information, and then some filter operations are executed over the diffuse matrix to check it. In this way, we show that diffuse matrix processing is fast and simple, and that the disk space used for its storage is considerably less than the space used by typical formats.
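The exact DMF layout is not specified in the abstract; a minimal sketch of the underlying idea, a per-(X,Y) list of raw band samples observed by the sensor for that ground position, could look like this (field and method names are assumptions):

from collections import defaultdict

class DiffuseMatrix:
    """Toy diffuse-matrix-like container: one list of (band, value) samples
    per scanned ground position (x, y).  Illustrative only."""
    def __init__(self):
        self.cells = defaultdict(list)

    def add_sample(self, x, y, band, value):
        self.cells[(x, y)].append((band, value))

    def threshold(self, band, t):
        """Return positions whose stored value in `band` exceeds t."""
        return [pos for pos, samples in self.cells.items()
                if any(b == band and v > t for b, v in samples)]

dm = DiffuseMatrix()
dm.add_sample(10, 20, band=5, value=0.83)
dm.add_sample(10, 21, band=5, value=0.12)
print(dm.threshold(band=5, t=0.5))   # -> [(10, 20)]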

Title:
FACIAL EXPRESSION SYNTHESIS AND RECOGNITION WITH INTENSITY ALIGNMENT
Author(s):
Hao Wang
Abstract:
This paper proposes a novel approach for facial expression synthesis that can generate arbitrary expressions for a new person with natural expression details. This approach is based on local geometry preserving between the input face image and the target expression image. In order to generate expressions with arbitrary intensity for a new person with unknown expression, this paper also develops an expression recognition scheme based on Supervised Locality Preserving Projections (SLPP), which aligns different subjects and different intensities on one generalized expression manifold. Experimental results clearly demonstrate the efficiency of the proposed algorithm.

Title:
A NEURAL NETWORK-BASED SYSTEM FOR FACE DETECTION IN LOW QUALITY WEB CAMERA IMAGES
Author(s):
Ioanna-Ourania Stathopoulou and George A. Tsihrintzis
Abstract:
The rapid and successful detection and localization of human faces in images is a prerequisite to a fully automated face image analysis system. In this paper, we present a neural network–based face detection system which arises from the outcome of a comparative study of two neural network models of different architecture and complexity. The fundamental difference in the construction of the two models lies in approaching the face detection problem either by seeking a general solution based on the full-face image or by composing the solution through the resolution of specific portions/characteristics of the face. The proposed system is based on the brightness contrasts between specific regions of the human face. We show that the second approach, even though more complicated, exhibits better performance in terms of detection and false-positive rates. We tested our system with low quality face images acquired with web cameras. The image test set includes both front and side view images of faces forming either a neutral or one of the “smile”, “surprise”, “disgust”, “scream”, “bored-sleepy”, “angry”, and “sad” expressions. The system achieved high face detection rates, regardless of facial expression or face view.

Title:
MACRO BLOCK SKIPPING ALGORITHMS FOR HIGH DEFINITION H.264/AVC VIDEO CODING IN THE BASELINE PROFILE
Author(s):
Susanna Spinsante, Ennio Gambi and Damiano Falcone
Abstract:
This paper discusses different macroblock skipping algorithms to be applied in the H.264/AVC Baseline profile, in order to facilitate the adoption of High Definition video coding in real-time applications. Moving from Standard to High Definition video coding, there is six times as much data to process: this motivates the search for suitable Mode Decision strategies that reduce complexity while preserving an acceptable video quality for the final user. The proposed schemes significantly speed up the Mode Decision procedure by forcing the selection of the SKIP mode over each frame, without significantly affecting the final quality.

Title:
SMOOTHED REFERENCE PREDICTION FOR IMPROVING SINGLE-LOOP DECODING PERFORMANCE OF H.264/AVC SCALABLE EXTENSION
Author(s):
So-Young Kim and Woo-Jin Han
Abstract:
It is well known that the multi-layer extension of H.264/AVC shows good spatial scalability performance, mainly due to its efficient inter-layer prediction techniques. Although single-loop decoding is a technique that reduces the decoder-side computational complexity by performing only one motion compensation to decode multi-layer data, its limited use of inter-layer prediction sometimes degrades the performance, especially for fast-motion sequences. In this paper, a smoothed reference prediction technique is proposed to improve single-loop decoding performance by replacing base-layer information with current-layer information and a simple block-based smoothing function. Experimental results show that the proposed method can improve the coding efficiency while keeping all the benefits of single-loop decoding mode. In addition, the proposed method was adopted into the Working Draft of the scalable extension of the H.264/AVC standard.

Title:
HIGHER-ORDER STATISTICS INTERPRETATION. APPLICATION TO POWER-QUALITY CHARACTERIZATION
Author(s):
Juan Jose Gonzalez de la Rosa, Africa Luque, Carlos G. Puntonet, J. M. Gorriz and Antonio Moreno Munoz
Abstract:
In this paper we present a practical review of higher-order statistics interpretation. Concretely, we focus on an unbiased estimate of the 4th-order time-domain cumulants. Some synthetic signals involving classical noise processes are characterized using this unbiased estimate, with the goal of checking its performance and of providing the scientific community with another result on the interpretation of this signal processing tool. A real-life practical example is presented in the field of electrical power quality event analysis. The work also aims to present a set of general advice on saving memory and gaining speed in a real signal processing framework dealing with non-stationary processes.
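The paper's own estimator is not reproduced in the abstract; a standard unbiased estimate of the 4th-order cumulant (the k-statistic) is available off the shelf in SciPy and can serve as a reference point for this kind of characterization:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
gaussian = rng.normal(size=10_000)          # 4th-order cumulant ~ 0
uniform = rng.uniform(-1, 1, size=10_000)   # negative 4th-order cumulant

# scipy.stats.kstat returns the unique symmetric unbiased estimator k_n
print(stats.kstat(gaussian, 4))   # close to 0 for Gaussian noise
print(stats.kstat(uniform, 4))    # clearly negative (sub-Gaussian process)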

Title:
USING 3D FEATURES TO EVALUATE CORK QUALITY
Author(s):
Beatriz Paniagua-Paniagua, Miguel A. Vega-Rodríguez, Hiroshi Nagahashi, Juan A. Gómez-Pulido and Juan M. Sánchez-Pérez
Abstract:
In this paper we study different 3D features of cork material. We do this in order to solve a classification problem existing in the cork industry: cork stopper/disk quality classification. The Cork Quality Standard sets seven different quality classes for cork stopper classification. These classes are based on a complex combination of cork stopper defects. In previous studies we only analysed those features that could be detected/acquired with a 2D camera. In this study we work in a 3D environment, in order to extract those features that could not be extracted with a 2D approach. As a conclusion we can say that the most important 3D cork quality detection feature takes into account dark and deep cork areas (usually, these areas indicate deep and important defects). Furthermore, the 3D features have considerably improved the results obtained by similar features with a 2D approach, because the 3D approach includes more information. This allows us to extract more complex features, as well as to improve the classification results.

Title:
A NEW ADAPTIVE CLASSIFICATION SCHEME BASED ON SKELETON INFORMATION
Author(s):
Catalina Cocianu, Luminita State, Ion Roşca and Panayiotis Vlamos
Abstract:
Large multivariate data sets can prove difficult to comprehend and hardly allow the observer to figure out the pattern structures, relationships and trends existing in samples; this justifies the effort of finding suitable methods for extracting relevant information from data. In our approach, we consider a probabilistic class model where each class h ∈ H is represented by a probability density function defined on R^n, where n is the dimension of the input data and H stands for a given finite set of classes. The classes are learned by the algorithm using the information contained in samples randomly generated from them. The learning process is based on the set of class skeletons, where the class skeleton is represented by the principal axes estimated from data. Basically, for each new sample, the recognition algorithm classifies it in the class whose skeleton is the “nearest” to this example. For each new sample allotted to a class, the class characteristics are re-computed using a first-order approximation technique. Experimentally derived conclusions concerning the performance of the newly proposed method are reported in the final section of the paper.
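The abstract does not spell out the algorithm; a minimal sketch of the nearest-skeleton rule, with each class skeleton taken as the sample mean plus leading principal axes and distance measured as the reconstruction residual in that subspace, might read as follows (the number of axes and the distance definition are assumptions, and the incremental update step is omitted):

import numpy as np

def skeleton(samples, k=2):
    """Class skeleton: mean and k leading principal axes of the samples."""
    X = np.asarray(samples, dtype=float)
    mean = X.mean(axis=0)
    # principal axes via SVD of the centred data
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def distance_to_skeleton(x, skel):
    """Residual norm of x after projection onto the class principal subspace."""
    mean, axes = skel
    d = np.asarray(x, dtype=float) - mean
    proj = axes.T @ (axes @ d)
    return np.linalg.norm(d - proj)

def classify(x, skeletons):
    """Assign x to the class whose skeleton is nearest."""
    return min(skeletons, key=lambda c: distance_to_skeleton(x, skeletons[c]))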

Title:
LOCAL DISSONANCE MINIMIZATION IN REAL TIME
Author(s):
Julian Villegas and Michael Cohen
Abstract:
This article discusses the challenges of applying the tonotopic consonance theory to minimize the dissonance of concurrent sounds in real time. It reviews previous solutions, proposes an alternative model, and presents a prototype programmed in Pd that aims to surmount the difficulties of prior solutions.

Title:
UNSUPERVISED NON PARAMETRIC DATA CLUSTERING BY MEANS OF BAYESIAN INFERENCE AND INFORMATION THEORY
Author(s):
Gilles Bougeniere, Claude Cariou, Kacem Chehdi and Alan Gay
Abstract:
In this communication, we propose a novel approach to perform the unsupervised and non-parametric clustering of n-D data within a Bayesian framework. The iterative approach developed is derived from the Classification Expectation-Maximization (CEM) algorithm, in which the parametric modelling of the mixture density is replaced by a non-parametric modelling using local kernels, and the posterior probabilities account for the coherence of the current clusters through the measure of class-conditional entropies. Applications of this method to synthetic and real data, including multispectral images, are presented. The classification results are compared with other recent unsupervised approaches, and we show that our method reaches a more reliable estimation of the number of clusters while providing slightly better rates of correct classification on average.

Title:
PHONETIC-BASED MAPPINGS IN VOICE-DRIVEN SOUND SYNTHESIS
Author(s):
Jordi Janer and Esteban Maestre
Abstract:
In voice-driven sound synthesis applications, phonetics convey musical information that might be related to the sound of an imitated musical instrument. Our initial hypothesis is that phonetics are user- and instrument-dependent, but they remain constant for a single subject and instrument. Hence, a user-adapted system is proposed, where mappings depend on how subjects perform musical articulations given a set of examples. The system consists of, first, a voice imitation segmentation module that automatically determines note-to-note transitions. Second, a classifier determines the type of musical articulation for each transition from a set of phonetic features. To validate our hypothesis, we ran an experiment where a number of subjects imitated real instrument recordings with the voice. Instrument recordings consisted of short phrases of sax and violin performed in three grades of musical articulation labeled as staccato, normal and legato. The results of a supervised training classifier (user-dependent) are compared to a classifier based on heuristic rules (user-independent). Finally, with the previous results we improve the quality of a sample-concatenation synthesizer by selecting the most appropriate samples.

Title:
FUSION PREDICTORS FOR DISCRETE-TIME LINEAR SYSTEMS WITH MULTISENSOR ENVIRONMENT
Author(s):
Ha Ryong Song and Vladimir Shin
Abstract:
New fusion predictors for linear dynamic systems with different types of observations are proposed. The fusion predictors are formed by a weighted sum of the local Kalman filters/predictors, with matrix weights that depend only on the time instants. The relationship between them and the optimal predictor is discussed. The high accuracy and computational efficiency of the fusion predictors are demonstrated on a first-order Markov process and on damped harmonic oscillator motion in a multisensor environment.
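The abstract describes the fusion predictor only as a matrix-weighted sum of local Kalman predictors. A scalar toy version of such fusion, with weights inverse to the local error variances (one common choice, not necessarily the paper's matrix weights), can illustrate the structure:

import numpy as np

def fuse(local_estimates, local_variances):
    """Fuse scalar local predictions with weights inversely proportional
    to their error variances; the weights sum to one."""
    x = np.asarray(local_estimates, dtype=float)
    p = np.asarray(local_variances, dtype=float)
    w = (1.0 / p) / np.sum(1.0 / p)
    fused = np.sum(w * x)
    # fused error variance, assuming uncorrelated local errors
    fused_var = 1.0 / np.sum(1.0 / p)
    return fused, fused_var

# three local predictors with different accuracies
print(fuse([1.02, 0.95, 1.10], [0.04, 0.09, 0.25]))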

Title:
SMALL TRICKS TO ENHANCE THE ACCURACY OF LICENSE PLATE CHARACTER RECOGNITION
Author(s):
Balázs Enyedi, Lajos Konyha, Kálmán Fazekas and Ján Turán
Abstract:
License plate recognition solutions to date are numerous and quite diverse. It is a complex problem field that can clearly be separated into two areas: localizing the actual license plate number and recognizing the individual characters. The current professional literature devotes relatively little attention to the individual steps of character recognition, which is exacerbated by the fact that the vast majority of solutions incur severe data losses due to the inconsiderate discarding of information that could significantly enhance the accuracy of the end result, that is, improve recognition reliability. Certain letters and numbers are very easy to mistake for one another, and some solutions focus too heavily on attempting to differentiate between them, complicating the recognition algorithm and possibly unnecessarily increasing its computational requirements. Instead, retaining certain information can result in much faster and more accurate recognition algorithms. This paper describes tricks to enhance accuracy and points out where significant data losses can occur during the recognition process. The solutions described here are applicable alongside any recognition algorithm, enhancing its accuracy and reliability.

Title:
UNSUPERVISED ALGORITHMS FOR SEGMENTATION AND CLUSTERING APPLIED TO SOCCER PLAYERS CLASSIFICATION
Author(s):
P. Spagnolo, P. L. Mazzeo, M. Leo and T. D’Orazio
Abstract:
In this work we consider the problem of soccer player detection and classification. The approach we propose starts from monocular images acquired by a still camera. Firstly, players are detected by means of background subtraction. An algorithm based on the energy content of pixels has been implemented in order to detect moving objects. The use of energy information, combined with a temporal sliding window procedure, makes the approach substantially independent of motion hypotheses. Players are then assigned to the corresponding team by means of an unsupervised clustering algorithm that works on colour histograms in RGB space. It is composed of two distinct modules: firstly, a modified version of the BSAS clustering algorithm builds the clusters for each class of objects; then, at runtime, each player is classified by evaluating its distance, in the feature space, from the previously detected classes. The algorithms have been tested on different real soccer matches of the Italian Serie A.

Title:
A FAST SEARCH ALGORITHM FOR SUB-PIXEL MOTION ESTIMATION IN H.264
Author(s):
Dong-kyun Park, Hyo-moon Cho and Jong-Hwa Lee
Abstract:
We propose an advanced sub-pixel block matching algorithm that reduces computational complexity by using a statistical characteristic of the SAD (Sum of Absolute Differences). Generally, the probability of the minimum SAD value is highest when a searching point is one pixel away from the reference point. Thus, we can overcome the high computational complexity problem by reducing the searching area. The proposed algorithm is a fast searching algorithm based on the TSS (Three Step Search) method. First, we find the three minimal SAD points among all nine searching points at integer distance; we then obtain two minimal SAD points at 1/2-pixel positions, between the second minimal SAD point and the first, and between the third and the first, respectively. Finally, we find the matching point by comparing the SAD values among six points that lie on the triangle formed by the first minimal SAD point and the two 1/2-pixel points, including 1/4-pixel positions. The proposed algorithm needs only 14 searching points in sub-pixel mode, whereas the conventional TSS method needs a total of 25 searching points. Therefore, this algorithm improves the processing speed by 51%.
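The abstract counts candidate points rather than giving code; a rough sketch of the common building blocks, a SAD cost and the selection of the best few integer candidates before fractional refinement, could look like this (the fractional interpolation and the exact triangle rule are omitted, and the helper names are assumptions):

import numpy as np

def sad(block, ref):
    """Sum of absolute differences between a block and a reference patch."""
    return int(np.abs(block.astype(np.int64) - ref.astype(np.int64)).sum())

def best_integer_candidates(block, ref_frame, cx, cy, n=3):
    """Evaluate the 3x3 integer neighbourhood around (cx, cy) and return the
    n positions with the smallest SAD, as a starting set for sub-pixel search.
    Assumes (cx, cy) lies away from the frame border."""
    h, w = block.shape
    costs = []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            patch = ref_frame[cy + dy:cy + dy + h, cx + dx:cx + dx + w]
            costs.append((sad(block, patch), (dx, dy)))
    costs.sort(key=lambda c: c[0])
    return costs[:n]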

Title:
LIVE TV SUBTITLING Fast 2-pass LVCSR System for Online Subtitling
Author(s):
Ales Prazak, Ludek Muller, J. V. Psutka and J. Psutka
Abstract:
The paper describes a fast 2-pass large vocabulary continuous speech recognition (LVCSR) system for automatic online subtitling of live TV programs. The proposed system implementation can be used for direct recognition of the TV program audio channel or for recognition of a shadow speaker who re-speaks the original audio channel. The first part of this paper focuses on the preparation of an adaptive language model for TV programs, where person names are specific to each subtitling session and have to be added to the recognition vocabulary. The second part outlines the recognition system conception for automatic online subtitling with a vocabulary of up to 150,000 words in real time. The recognition system is based on Hidden Markov Models, lexical trees, and bigram and quadgram language models in the first and second pass, respectively. Finally, experimental results from our project with Czech Television are reported and discussed.

Title:
IMPROVEMENT OF H.264 SKIP MODE
Author(s):
Kyohyuk Lee, Woojin Han and Tammy Lee
Abstract:
H.264 (MPEG-4 AVC) is the state-of-the-art international video coding standard, showing better coding efficiency than previous standards. This contribution concerns the improvement of the motion derivation process of the H.264 SKIP mode. H.264 exploits temporal or spatial motion field correlation to derive the current motion field. Temporal or spatial direct mode macroblocks for B slices and skip mode macroblocks for P slices are adopted to exploit motion field correlation. In general, the H.264 SKIP mode macroblock has a great impact on coding efficiency because about 30~70% of macroblocks are set as skip mode. A SKIP mode macroblock derives one motion vector for the whole 16x16 macroblock region from spatial correlation. In this contribution, we improve the SKIP mode motion field further instead of setting one motion vector for the 16x16 macroblock region: we split the 16x16 macroblock into four 8x8 sub-partitions and derive the SKIP mode motion field for each sub-partition separately. Experimental results showed an average 2.05% and up to 18.63% bit rate reduction, with especially higher coding efficiency in low bit rate conditions.
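The per-sub-partition derivation rule is not detailed in the abstract; purely as an illustration of splitting the 16x16 SKIP motion into four 8x8 fields, one could imagine each sub-partition taking the component-wise median of the motion vectors of its already-coded spatial neighbours (a hypothetical rule, not necessarily the authors'):

import numpy as np

def skip_mv_per_subblock(neighbour_mvs):
    """neighbour_mvs: dict mapping each 8x8 sub-partition index (0..3) to a
    list of (mvx, mvy) vectors of its already-coded spatial neighbours.
    Returns one derived motion vector per sub-partition (component-wise median)."""
    derived = {}
    for sub, mvs in neighbour_mvs.items():
        arr = np.asarray(mvs, dtype=float)
        derived[sub] = tuple(np.median(arr, axis=0))
    return derived

print(skip_mv_per_subblock({0: [(2, 1), (3, 1), (2, 0)],
                            1: [(3, 1), (4, 2), (3, 2)]}))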

Title:
FAST SOUND FILE CHARACTERISATION METHOD
Author(s):
Lucille Tanquerel and Luigi Lancieri
Abstract:
This article describes a fast technique for the characterization of sound documents based on a statistical measure of the variation of the signal. We show that a very limited sampling is sufficient to obtain reasonable performance of the characteristic while being 100 times faster to calculate than a complete sampling. During preliminary tests, we carried out a first validation of our approach by highlighting a correlation of 0.7 between the human perception of rhythm and our characteristic, as well as a recognition error lower than 5%. In this new series of tests, we show that our approach makes it possible to associate a cut file with its missing half, with an error rate of approximately 30%.

Title:
COMPARISON OF BACKGROUND SUBTRACTION METHODS FOR A MULTIMEDIA LEARNING SPACE
Author(s):
F. El Baf, T. Bouwmans and B. Vachon
Abstract:
This article first presents a multimedia application called Aqu@theque. This project consists in elaborating a multimedia system dedicated to aquariums which gives ludo-pedagogical information in an interactive learning area. The reliability of this application depends on the segmentation and recognition steps. We therefore focus on the segmentation step using the background subtraction principle. Our motivation is to compare different background subtraction methods used to detect fish in video sequences and to improve the performance of this application. In this context, we present a new classification of the critical situations that occur in videos and disturb the assumptions made by background subtraction methods. This classification can be used in any application using background subtraction, such as video surveillance, motion capture or video games.

Title:
EFFICIENT MOTION COMPENSATION ARCHITECTURE WITH RATE-DISTORTION OPTIMIZATION FOR H.264/AVC
Author(s):
Tian Song and Takashi Shimamoto
Abstract:
In this paper, a novel motion compensation architecture is proposed to support Rate-Distortion Optimization (RDO) in H.264/AVC. First, the scope of motion compensation in this work is defined to include not only half- and quarter-pixel motion compensation but also the deblocking filter and rate-distortion optimization. Then, based on this new concept of motion compensation, an efficient architecture for an H.264/AVC codec is constructed. The proposed architecture can select the best mode for INTRA macroblocks using the Lagrange function, by calculating the distortion and the generated bits. It can also calculate the Lagrange function for INTER macroblocks by receiving the motion vector information and the interpolation data from the ME (Motion Estimation) module, to construct a complete rate-distortion optimization architecture. A pipelined processing structure is designed for sub-block mode selection to achieve real-time processing for inputs up to HDTV resolution. Implementation results show that the proposed architecture can be realized with only 42,280 gates and 48,320 bits of SRAM.

Title:
EFFICIENT DIGITAL FREQUENCY DOWN CONVERTER STRUCTURE USING CIC FILTERS AND INTERPOLATED FOURTH-ORDER POLYNOMIALS
Author(s):
Youngbeom Jang, Do-Han Kim and Won-Sang Lee
Abstract:
In this paper, we propose an efficient digital frequency down converter (DFDC) structure using CIC (Cascaded Integrator-Comb) decimation filters and interpolated fourth-order polynomials (IFOP). Typical DFDCs with high decimation factors consist of a CIC filter and a halfband filter. By inserting the proposed IFOP between the CIC and halfband filters, it is shown that the passband droop and aliasing band attenuation characteristics are simultaneously improved. Since the IFOP requires only three multiplications, the proposed DFDC can be used in the intermediate frequency blocks of high-speed communication systems.
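As background for the structure described above, a plain CIC decimator (the integrator-comb cascade that the proposed IFOP stage would sit after) can be sketched in a few lines; the order, decimation factor and differential delay below are arbitrary example values, not those of the paper:

import numpy as np

def cic_decimate(x, R=8, N=3, M=1):
    """N-stage CIC decimator with decimation factor R and differential delay M."""
    y = np.asarray(x, dtype=np.int64)
    for _ in range(N):            # integrator stages: running sums
        y = np.cumsum(y)
    y = y[::R]                    # decimation by R
    for _ in range(N):            # comb stages: y[n] - y[n-M]
        y = y - np.concatenate((np.zeros(M, dtype=np.int64), y[:-M]))
    return y

tone = np.round(1000 * np.cos(2 * np.pi * 0.01 * np.arange(4096))).astype(np.int64)
print(cic_decimate(tone)[:8])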

Title:
SPECTRUM WEIGHTED HRTF BASED SOUND LOCALIZATION
Author(s):
Sergio Cavaliere and Pietro Santangelo
Abstract:
In the framework of humanoid robotics, it is of great importance to study and develop computational techniques that enrich robot perception and its interaction with the surrounding environment. The most important cues for the estimation of sound source azimuth are the interaural phase differences (IPD), interaural time differences (ITD) and interaural level differences (ILD) between the binaural signals. In this paper we present a method for the recognition of the direction of a sound located on the azimuthal plane (i.e. the plane containing the interaural axis). The proposed method is based on a spectrum-weighted comparison between the ILDs and IPDs extracted from microphones located at the ears and a set of stored cues; these cues were previously measured and stored in a database in the form of a data lookup table. While a direct lookup of the stored cues in the table suffers from the presence of both ambient noise and reverberation, as is usual in real environments, the proposed method, by exploiting the overall shape of the actual frequency spectrum of the signal, both its phase and its modulus, dramatically reduces localization errors. In the paper we also give experimental evidence that this method greatly improves on the usual HRTF-based identification methods.
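The abstract gives the idea rather than the formula; one way to picture a spectrum-weighted comparison of measured interaural level differences against stored per-azimuth cues (the weighting, table layout and function names are assumptions, and the IPD term is omitted) is:

import numpy as np

def estimate_azimuth(left_spec, right_spec, ild_table):
    """left_spec/right_spec: complex spectra at the two ears.
    ild_table: dict mapping azimuth -> stored ILD per frequency bin (dB).
    Bins carrying more signal energy get more weight in the comparison."""
    eps = 1e-12
    ild = (20 * np.log10(np.abs(left_spec) + eps)
           - 20 * np.log10(np.abs(right_spec) + eps))
    weights = np.abs(left_spec) ** 2 + np.abs(right_spec) ** 2
    weights = weights / weights.sum()

    def cost(az):
        return float(np.sum(weights * (ild - ild_table[az]) ** 2))

    return min(ild_table, key=cost)   # azimuth with the best weighted match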

Title:
COLOUR SPACES STUDY FOR SKIN COLOUR DETECTION IN FACE RECOGNITION SYSTEMS
Author(s):
Jose M. Chaves-González, Miguel A. Vega-Rodríguez, Juan A. Gómez-Pulido and Juan M. Sánchez-Pérez
Abstract:
In this paper we show the results of a comparison among different colour spaces, carried out in order to determine which one is better for human skin colour detection in face detection systems. Our motivation for this study is that there is no common opinion about which colour space is the best choice for finding skin colour in an image. This is important because most face detectors use skin colour to detect the face in a picture or a video. We have carried out a study using 10 different colour spaces (RGB, CMY, YUV, YIQ, YCbCr, YPbPr, YCgCr, YDbDr, HSV –or HSI– and CIE-XYZ). To make the comparisons we have used ground-truth images of 15 different people, comparing at the pixel level the number of correct detections and the errors (false negatives and false positives) for each colour space.
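The thresholds actually used in the study are not listed in the abstract; the pixel-level comparison itself is simple to reproduce, e.g. in YCbCr with a commonly quoted (but here merely illustrative) chrominance box:

import numpy as np

def skin_mask_ycbcr(rgb):
    """rgb: HxWx3 uint8 image.  Returns a boolean skin mask using a simple
    Cb/Cr box; the bounds below are illustrative, not the paper's values."""
    rgb = rgb.astype(float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (cb > 77) & (cb < 127) & (cr > 133) & (cr < 173)

def pixel_errors(mask, truth):
    """Counts used for a per-colour-space comparison against ground truth."""
    false_pos = int(np.sum(mask & ~truth))
    false_neg = int(np.sum(~mask & truth))
    return false_pos, false_neg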

Title:
BOUNDARY POINT DETECTION FOR ULTRASOUND IMAGE SEGMENTATION USING GUMBEL DISTRIBUTIONS
Author(s):
Brian Booth and Xiaobo Li
Abstract:
Due to high noise, low contrast, and other imaging artifacts, region boundaries in ultrasound images often do not conform to the assumptions of many image processing algorithms. Specifically, the beliefs that region boundaries have a high gradient magnitude or a high intensity can break down in this context. In this paper, we present an alternative way of detecting likely boundary points in ultrasound images by decomposing the image into one-dimensional intensity scans. These intensity scans, mimicking traditional A-mode ultrasound, are modeled using Gumbel distributions. Results show that the relationship between the modes of these distributions and region boundaries is relatively strong.
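As a quick sketch of the modelling step, one horizontal intensity scan can be fitted with a Gumbel distribution using SciPy; the mode of the fitted distribution (its location parameter) then gives a candidate boundary intensity (the row choice, tolerance and usage below are assumptions, not the paper's detector):

import numpy as np
from scipy import stats

def scan_gumbel_fit(image, row):
    """Fit a Gumbel distribution to one horizontal intensity scan and
    return (location, scale); the location is also the distribution mode."""
    scan = np.asarray(image, dtype=float)[row, :]
    loc, scale = stats.gumbel_r.fit(scan)
    return loc, scale

def candidate_boundary_points(image, row, tol=5.0):
    """Columns whose intensity lies near the fitted mode of the scan."""
    loc, _ = scan_gumbel_fit(image, row)
    scan = np.asarray(image, dtype=float)[row, :]
    return np.flatnonzero(np.abs(scan - loc) < tol)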

Title:
FEATURES EXTRACTION FOR MUSIC NOTES RECOGNITION USING HIDDEN MARKOV MODELS
Author(s):
Fco. Javier Salcedo, Jesús Díaz-Verdejo and José Carlos Segura
Abstract:
In recent years Hidden Markov Models (HMMs) have been successfully applied to human speech recognition. The present article shows that this technique is also valid for detecting musical characteristics, for example musical notes. However, any recognition system needs a suitable set of parameters, that is, a reduced set of magnitudes that represent the aspects relevant for classifying an entity. This paper shows how a suitable parameterisation and an adequate HMM topology yield a robust recognition system for musical notes. At the same time, the parameter extraction approach can be used in other recognition technologies applied to music.

Title:
AN IMPROVED SUPER RESOLUTION RECONSTRUCTION ALGORITHM FOR VIDEO SEQUENCE
Author(s):
Hyo-Moon Cho and Sang-Bok Cho
Abstract:
In this paper, we introduce an input image selection method to improve the quality of the reconstructed high-resolution (HR) image. To obtain an ideal super-resolution (SR) reconstruction, all input images must be well registered; however, registration is not ideal in practice. For this reason, the number of input images with low registration error is more important than the total number of input images for obtaining a good-quality HR image. The suitability of an input image can be evaluated using statistical and restricted registration properties. Therefore, we propose an automatic input image evaluation method as a pre-processing step of SR reconstruction, together with its architecture. In video sequences, all input images in a specified region are allowed to be used in SR reconstruction as low-resolution (LR) input images and/or as the reference image. The evaluation basis is decided by a threshold value, and this threshold is calculated using the maximum motion compensation error (MMCE) of the reference image. If the motion compensation error (MCE) of an LR input image is in the range 0 < MCE < MMCE, this LR input image is selected for SR reconstruction; otherwise, it is discarded. The optimal reference LR (ORLR) image is decided by comparing the number of selected LR input (SLRI) images for each reference LR input (RLRI) image. Finally, we generate an HR image using the optimal reference LR image, the selected LR images and Hardie’s interpolation method. The proposed algorithm is expected to improve the quality of SR without any user intervention.
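The selection rule itself is easy to state in code; a minimal sketch, treating the motion compensation error as a mean absolute difference after (externally supplied) motion compensation, might be (the MCE definition and helper names are assumptions):

import numpy as np

def mce(reference, compensated):
    """Toy motion compensation error: mean absolute difference between the
    reference frame and a motion-compensated LR frame."""
    return float(np.mean(np.abs(reference.astype(float) - compensated.astype(float))))

def select_lr_frames(reference, compensated_frames, mmce):
    """Keep only the LR frames whose MCE falls in the range 0 < MCE < MMCE."""
    return [i for i, f in enumerate(compensated_frames)
            if 0.0 < mce(reference, f) < mmce]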

Title:
SEARCHING FOR A ROBUST MFCC-BASED PARAMETERIZATION FOR ASR APPLICATION
Author(s):
J. V. Psutka, Luboš Šmídl and Aleš Pražák
Abstract:
The paper concerns the search for areas of robust settings of an MFCC-based parameterization with regard to the number of band-pass filters and the number of computed coefficients. Settings that are theoretically recommended for telephone and microphone speech are compared with a large number of experimental results, and a new technique for the determination of robust areas of {number of band-pass filters × number of coefficients} is designed.
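The grid of settings described above can be spanned directly; a small sketch with librosa (the concrete ranges, file name and the downstream recognition scoring are placeholders, not the paper's experimental setup) would be:

import itertools
import librosa

def mfcc_grid(path, filter_counts=(20, 26, 32, 40), coeff_counts=(8, 10, 13, 16)):
    """Compute MFCC matrices for every (number of band-pass filters,
    number of coefficients) combination; recognition scoring is external."""
    y, sr = librosa.load(path, sr=None)
    features = {}
    for n_mels, n_mfcc in itertools.product(filter_counts, coeff_counts):
        features[(n_mels, n_mfcc)] = librosa.feature.mfcc(
            y=y, sr=sr, n_mfcc=n_mfcc, n_mels=n_mels)
    return features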

Title:
IMAGE RESTORATION A New Explicit Approach in Filtering and Restoration of Digital Images
Author(s):
Pejman Rahmani, Benoît Vozel and Kacem Chehdi
Abstract:
Image restoration in the presence of noise is well known to be an ill-posed inverse problem. Deconvolution of blurry and noisy digital images is a very active research area in image processing. This paper introduces a novel approach composed of two optimized sequential stages of image processing: denoising followed by deconvolution. In the first stage, the denoising filter and the number of iterations are chosen in order to obtain the best value of the usual criteria and a good recovery of the blurry image. We assume that the statistics of the noise have been previously estimated. In the second stage, a deconvolution method is applied to an almost noise-free version of the blurry image. Compared with classical deconvolution methods, the numerical experiments with the proposed method show a significant improvement. The preliminary results of the new cascade approach are very encouraging as well.

Title:
ADAPTIVE AND COOPERATIVE SEGMENTATION SYSTEM FOR MONO- AND MULTI-COMPONENT IMAGES
Author(s):
Madjid Moghrani, Claude Cariou and Kacem Chehdi
Abstract:
We present a cooperative and adaptive system for multi-component image segmentation, in which the segmentation methods used are based upon the classification of pixels represented by statistical features chosen with respect to the nature of the regions to segment. One originality of this system is its adaptive character: it takes the local context in the image into account to automatically adapt the segmentation process to the nature of specific regions, which can be uniform or textured. The method used for detecting the nature of the regions is based on a classification of pixels with respect to Haralick’s uniformity index. A cooperative approach is then set up for the textured areas, which combines results coming from different classification methods and chooses the best result at the pixel level using an assessment index. In order to validate the system and show the relevance of the adaptive procedure used, experimental results are presented for the segmentation of synthetic and real multi-component CASI images.

Title:
A THREE-LAYER SYSTEM FOR IMAGE RETRIEVAL
Author(s):
Daidi Zhong and Irek Defée
Abstract:
Visual patterns are composed of basic features forming well-defined structures and/or statistical distributions, and both are often present simultaneously in visual images. This makes the problem of describing and representing visual patterns complicated. In this paper we propose a hierarchical retrieval system, based on subimages and combinations of feature histograms, to efficiently combine structural and statistical information for retrieval tasks. We illustrate the results on a face database retrieval problem. It is shown that a proper selection of subimages and feature vectors can significantly improve the performance with minimal complexity.

Title:
SPONTANEOUS AND PERSONALIZED ADVERTISING THROUGH MPEG-7 MARKUP AND SEMANTIC REASONING Exploring New Ways for Publicity and Marketing over Interactive Digital TV
Author(s):
Martín López-Nores, José J. Pazos-Arias, Jorge García-Duque, Yolanda Blanco-Fernández, Marta Rey-López and Esther Casquero-Villacorta
Abstract:
Publicity is one of the sustaining pillars of the television industry. In an increasingly competitive market, the involved agents are striving to exploit all the possibilities to get revenues from advertising, but their techniques lack targeting and are usually at odds with the comfort of the TV viewers. In response to those problems, this paper introduces a new advertising model that aims at harnessing the interactive capabilities of the modern TV receivers (either domestic or mobile ones). The approach is based on automatically identifying products which are semantically related to the things on screen that catch the viewer’s attention, and then assembling interactive services that provide him/her with personalized commercial functionalities.

Title:
KNOWLEDGE ENGINEERING FOR AFFECTIVE BI-MODAL HUMAN-COMPUTER INTERACTION
Author(s):
Efthymios Alepis, Maria Virvou and Katerina Kabassi
Abstract:
This paper presents knowledge engineering for a system that incorporates user stereotypes as well as a multi-criteria decision making theory for affective interaction. The system bases its inferences about students’ emotions on user input evidence from the keyboard and the microphone. Evidence from these two modes is combined by a user modelling component underlying the user interface. The user modelling component reasons about users’ actions and voice input and makes inferences in relation to their possible emotional states. The mechanism that integrates the inferences from the two modes has been based on the results of two empirical studies that were conducted in the context of the requirements analysis of the system. The evaluation of the developed system showed significant improvements in the recognition of the emotional states of users.

Title:
4I (FOR EYE) MULTIMEDIA Intelligent Semantically Enhanced and Context-aware Multimedia Browsing
Author(s):
Oleksiy Khriyenko
Abstract:
The next generation of integration systems will utilize different methods and techniques to achieve the vision of ubiquitous knowledge: the Semantic Web and Web Services, Agent Technologies and Mobility. Unlimited interoperability and collaboration are important for almost all areas of people’s lives. The development of a Global Understanding eNvironment (GUN) (Kaykova et al., 2005), which would support interoperation between all resources and the exchange of shared information, is a very promising and challenging task. As usual, a graphical user interface is one of the important parts of performing a process. Following the new technological trends, it is time to start a stage of semantic-based, context-dependent, multidimensional resource visualization and semantic metadata based browsing across resources. With the growing ubiquity of digital media content, whose management requires suitable annotation and systems able to use that annotation, the ability to combine continuous media data with its own multimedia-specific content description in one source brings the idea of a true multimedia semantic web one step closer. Thus, 4I (FOR EYE) technology (Khriyenko, 2007) is a perfect basis for the elaboration of intelligent, semantically enhanced and context-aware multimedia content browsing.

Title:
ENHANCING LSB STEGANOGRAPHY AGAINST STEGANALYSIS ATTACKS USING COMBINATIONAL LSBS
Author(s):
Yahya Belghuzooz and Ali Al-Qayedi
Abstract:
This paper describes an enhanced approach for hiding secret messages in the spatial domain of digital cover images such that the resulting stego-images are robust to steganalysis attacks. Firstly, different methods of hiding in the Least Significant Bits (LSBs) are comparatively discussed, including the Sequential and the Random algorithms. Then our approach is illustrated, which uses a combination of LSBs to store large amounts of secret information while maintaining robustness against detection by steganalysis attacks. The results achieved are commensurate with those obtained using widely available stego tools.

Title:
SPATIALIZED AUDIO CONFERENCES IMS Integration and Traffic Modelling
Author(s):
Christopher J. Reynolds, Martin J. Reed and Peter J. Hughes
Abstract:
Existing monophonic multiparty VoIP conferencing applications are currently limited to supporting a single conversation floor, with limited numbers of simultaneous speakers. We discuss the additional requirements and benefits of delivering a spatially enhanced audio application via Head Related Transfer Function (HRTF) filtering, which may support many conversation floors. Several network delivery architectures are presented, including integration to the Next Generation Network (NGN) IP Multimedia Subsystem (IMS). The delivery architectures are compared using traffic models, and implications for the scope of such an application are discussed.

Title:
USING IMAGE TO FOSTER BUSINESS TO CONSUMER ONLINE TRUST
Author(s):
Khalid Al-Diri, Dave Hobbs and Rami Qahwaji
Abstract:
Much of the latest research on business-to-consumer (B2C) e-commerce has focused on ways of building trust through cues that encourage consumers to purchase online, since online commerce suffers from the lack of the face-to-face interpersonal exchanges that enhance trust behaviour in conventional commerce. To bridge this human interaction dilemma, an extensive laboratory-based experiment was conducted to assess the trust of consumers using four online vendors’ websites. This paper addresses the issues and findings of a study that uses Western and Saudi images as well as video clips to mimic customer support in order to increase behavioural purchasing trust in the online vendor. The findings from the study clearly highlight that images have an important role to play in increasing the trust of online consumers, with Saudi images playing a pivotal role in increasing this kind of trust.

Title:
CHANGE DETECTION AND BACKGROUND UPDATE THROUGH STATISTIC SEGMENTATION FOR TRAFFIC MONITORING
Author(s):
T. Alexandropoulos, V. Loumos and E. Kayafas
Abstract:
Recent advances in computer imaging have led to the emergence of video-based surveillance as a monitoring solution in Intelligent Transportation Systems (ITS). The deployment of CCTV infrastructure in highway scenes facilitates the evaluation of traffic conditions. However, the majority of video-based ITS are restricted to manual assessment and lack the ability to support automatic event notification. This is due to the fact that the effective operation of intelligent traffic management relies strongly on the performance of an image processing front end, which performs change detection and background update. Each of these tasks needs to cope with specific challenges. Change detection is required to effectively isolate content changes from noise-level fluctuations, while background update needs to adapt to time-varying lighting variations without incorporating stationary occlusions into the background. This paper presents the operating principle of a video-based ITS front end. A block-based statistical segmentation method for feature extraction in highway scenes is analyzed. The presented segmentation algorithm focuses on the estimation of the noise model. The extracted noise model is utilized in change detection in order to separate content changes from noise fluctuations. Additionally, a statistical background estimation method, which adapts to gradual illumination variations, is presented.

Title:
IMPROVEMENT OF VOIP QUALITY BY PACKET DROPPING IN ADSL ROUTERS
Author(s):
Qin Dai, Matthias Baumann and Ralf Lehnert
Abstract:
Packet dropping is known as a simple mechanism to control TCP traffic. In this paper, TCP packet dropping is introduced in the egress router of an ADSL downlink. The aim is to improve the quality of VoIP connections that compete with TCP applications in the downlink direction. The ADSL downlink buffer is assumed to operate as a simple FCFS queue. Different simulations have been conducted to evaluate the mechanism in two scenarios. Firstly, the long-term impact of the mechanism on both the VoIP application and the TCP applications is investigated. Secondly, with more realistic network settings, the effectiveness of the mechanism for a short real speech sample is evaluated. The PESQ estimate of the speech is used to assess the service quality. The results indicate that in both cases packet dropping can improve the VoIP quality. However, the required high dropping ratio can result in TCP traffic bursts and therefore in unstable VoIP quality as well as bad TCP performance.

Title:
TOWARDS BUILDING FAIR AND ACCURATE EVALUATION ENVIRONMENTS
Author(s):
Dumitru Dan Burdescu and Marian Cristian Mihăescu
Abstract:
Each e-Learning platform has implemented means of evaluating learners’ knowledge through a specific grading methodology. This paper proposes a methodology for obtaining knowledge about the testing environment. The obtained knowledge is further used to make the testing system more accurate and fair. The integration of knowledge management into an e-Learning system is accomplished through a dedicated software module that analyzes the activities performed by learners, creates a learner model and provides a set of recommendations for course managers and learners in order to achieve previously set goals.

Title:
REVERSIBLE AND SEMI-BLIND RELATIONAL DATABASE WATERMARKING
Author(s):
Gaurav Gupta and Josef Pieprzyk
Abstract:
In 2002, Agrawal and Kiernan proposed a relational database watermarking scheme that modifies the least significant bits (LSBs) of numerical attributes selected using a secret key. The scheme does not address query preservation (some queries give different results when executed on the original and on the watermarked relation). Additive and secondary watermarking attacks on the watermarked relation are also possible. Such attacks can render the original watermark undetectable. Hence, an attacker who embeds his watermark in a previously watermarked relation can claim ownership of that relation. However, if the scheme is reversible, then a previous watermark, if any, can be detected in the reversed relation. In this paper, we propose an enhanced reversible, semi-blind and query-preserving watermarking scheme. Using this scheme, the correct owner of a relation can be identified even if the relation has been watermarked by multiple parties. If required, the database can also be restored to its original state. This finds applications in high-precision settings such as military operations or scientific experiments.
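For orientation, the keyed LSB-marking primitive that such schemes build on (in the spirit of Agrawal and Kiernan, here stripped down, irreversible, and not the authors' full query-preserving scheme) looks roughly like this; the parameter names are illustrative:

import hmac, hashlib

def mark_value(value, tuple_key, secret_key, num_lsb=2, gamma=10):
    """Decide pseudo-randomly (keyed HMAC) whether to mark this numeric
    attribute and, if so, which of the num_lsb low-order bits to set to
    which bit.  Simplified illustration only."""
    h = hmac.new(secret_key, tuple_key.encode(), hashlib.sha1).digest()
    if h[0] % gamma != 0:             # roughly 1/gamma of tuples get marked
        return value, False
    bit_index = h[1] % num_lsb        # which low-order bit to overwrite
    bit_value = h[2] % 2              # the watermark bit
    marked = (value & ~(1 << bit_index)) | (bit_value << bit_index)
    return marked, True

print(mark_value(1234, "row-17|colA", b"owner-secret"))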

Title:
HIGH RATE DATA HIDING IN SPEECH SIGNAL
Author(s):
Ehsan Jahangiri and Shahrokh Ghaemmaghami
Abstract:
One of the main issues with data hiding algorithms is the capacity of data embedding. Most data hiding methods suffer from low capacity, which can make them inappropriate for certain hiding applications. This paper presents a high-capacity data hiding method that uses encryption and the multi-band speech synthesis paradigm. In this method, an encrypted covert message is embedded in the unvoiced bands of the speech signal, which leads to a high data hiding capacity of tens of kbps in a typical digital voice file transmission scheme. The proposed method yields a new standpoint in the design of data hiding systems with respect to the three major, basically conflicting requirements in steganography, i.e. inaudibility, robustness, and data rate. The procedures to implement the method both in basic speech synthesis systems and in the standard mixed-excitation linear prediction (MELP) vocoder are also given in detail.

Title:
IMPROVING VOD P2P DELIVERY EFFICIENCY OVER INTERNET USING IDLE PEERS
Author(s):
Leandro Souza, Xiaoyuan Yang, Javier Balladini, Ana Ripoll and Fernando Cores
Abstract:
This paper presents DynaPeer Chaining, a peer-to-peer Video-on-Demand (VoD) delivery policy designed to deal with the high bandwidth requirements of multimedia content and the additional constraints imposed by the Internet environment: higher delays and jitter, network congestion, non-symmetrical client bandwidth and inadequate support for multicast communications. We consider the scenario where multiple ADSL-based peers stream the same video to multiple receivers. We propose an adaptive scheme that takes advantage of idle peers in order to improve system efficiency, even when extreme conditions (low request rates or limited peer resources) are considered. We conducted a performance comparison of our proposal with classic multicast (Patching) and other P2P delivery schemes, such as Pn2Pn and Chaining, improving their performance by 50% and 62% respectively, even when taking Internet constraints into account.

Title:
DESIGN AND IMPLEMENTATION OF MULTI-STANDARD AUDIO DECODER
Author(s):
Kong Ji, Liu Peilin, Deng Ning, Fu Xuan, Zhang Guocheng, He Bin, Liu Qianru
Abstract:
In this paper, the design and implementation of a Multi-Standard Audio Decoder is presented. The architecture of the decoder is designed to support MPEG-2/MPEG-4 AAC LC Profile (ISO/IEC 13818-7 2006) (ISO/IEC 14496-3 2006), Dolby AC-3 (ATSC 1995), Ogg Vorbis (Xiph.org Foundation 2004), Windows Media Audio (WMA) (Microsoft 2006) and MPEG-1 Layer 3 (MP3) (ISO-IEC/JTC1 SC29 1991). Based on an analysis of the algorithms of these standards, a software/hardware co-design method is used to implement the audio decoder, in which a module called FILTERBANK is designed as a hardware engine. The FILTERBANK, which can support the IMDCT (Inverse Modified Discrete Cosine Transform) process of the different standards, is configured by the CPU according to the decoded information. Compared with DSP/RISC or ASIC multi-standard decoder solutions, our Multi-Standard decoder achieves a balance between software flexibility and hardware efficiency. It also meets the requirements of low cost, low power and high audio quality. Implementation results on FPGA are given and the performance of the decoder is evaluated.

Title:
REALIZATION AND OPTIMIZATION OF H.264 DECODER FOR DUAL-CORE SOC
Author(s):
Jia-Ming Chen, Chiu-Ling Chen, Jian-Liang Luo, Po-Wen Cheng, Chia-Hao Yu, Shau-Yin Tseng and Wei-Kuan Shih
Abstract:
This paper presents an H.264/AVC decoder realization on a dual-core SoC (System-on-Chip) platform using a well-designed macroblock-level software partitioning. Furthermore, optimizations of the procedures executed on each core and of the data movement between the two cores are obtained through software and hardware techniques. The evaluation results show that video at D1 (720×480 pixels) resolution can be decoded in real time by this implementation, which provides valuable experience for similar designs.

Title:
IMPROVEMENTS IN SPEAKER DIARIZATION SYSTEM
Author(s):
Rong Fu and Ian D. Benest
Abstract:
This paper describes an automatic speaker diarization system for natural, multi-speaker meeting conversations using one central microphone. It is based on the ICSI-SRI Fall 2004 diarization system (Wooters et al., 2004), but it has a number of significant modifications. The new system is robust to different acoustic environments: it requires neither pre-trained models nor development sets to initialize the parameters. It determines the model complexity automatically. It adapts the segment model from a Universal Background Model (UBM), and uses the cross-likelihood ratio (CLR) instead of the Bayesian Information Criterion (BIC) for merging. Finally, it uses an intra-cluster/inter-cluster ratio as the stopping criterion. Altogether this reduces the speaker diarization error rate from 25.36% to 21.37% compared to the baseline system (Wooters et al., 2004).

Title:
BACKWARDS COMPATIBLE, MULTI-LEVEL REGIONS-OF-INTEREST (ROI) IMAGE ENCRYPTION ARCHITECTURE WITH BIOMETRIC AUTHENTICATION
Author(s):
Alexander Wong and William Bishop
Abstract:
Digital image archival and distribution systems are an indispensable part of the modern digital age. Organizations perceive a need for increased information security. However, conventional image encryption methods are not versatile enough to meet more advanced image security demands. We propose a universal multi-level ROI image encryption architecture that is based on biometric data. The proposed architecture ensures that different users can only view certain parts of an image based on their level of authority. Biometric authentication is used to ensure that only an authorized individual can view the encrypted image content. The architecture is designed such that it can be applied to any existing raster image format while maintaining full backwards compatibility so that images can be viewed using popular image viewers. Experimental results demonstrate the effectiveness of this architecture in providing conditional content access.

Title:
SVG BASED SECURE UNIVERSAL MULTIMEDIA ACCESS
Author(s):
Ahmed Reda Kaced and Jean-Claude Moissinac
Abstract:
In this paper, we develop and implement our Secure Universal Multimedia Access system (SUMA) for SVG content through the following three subtasks. For content adaptation, we rely on XML/RDF, CC/PP and XSLT; for signing and authenticating SVG content, we use the Merkle hash tree technique; and for content delivery, we develop a mechanism for dynamic delivery of multimedia content over wired/wireless networks. We present a signature scheme and an access control system that can be used for controlling access to SVG documents. The first part of this paper briefly describes the access control model on which the system is based. The second part presents the design and implementation of the SUMA adaptation engine. SUMA aims to deliver end-to-end authenticity of the original SVG content exchanged in a heterogeneous network, while allowing content adaptation by intermediary proxies between the content transmitter and the final users. Adaptation and authentication management are done by the intermediary proxies, transparently to the connected hosts, which are completely unaware of these processes.

Title:
APPEARANCE-BASED HUMAN GALLERY CONSTRUCTION FROM VIDEO
Author(s):
Kyongil Yoon, Yaser Yacoob, David Harwood and Larry Davis
Abstract:
An approach for constructing a dynamic gallery of people observed in a video stream is described. We consider two scenarios that require determining the number and identity of participants: outdoor surveillance and meeting rooms. In these applications face identification is typically not feasible due to the low resolution across the face. The proposed approach automatically computes an appearance model based on the clothing of people and employs this model in constructing and matching the gallery of participants. The appearance model uses a color/path-length profile and a robust distance measure based on Kernel Density Estimation (KDE) and the Kullback-Leibler (KL) distance to evaluate similarity between people and add models to the gallery. A one-to-one constraint is enforced to correctly match instances to models at each frame. In the meeting room scenario we exploit the fact that the relative locations of subjects are likely to remain unchanged for the whole sequence.

Title:
IMPACTS OF LEVEL-2 CACHE ON PERFORMANCE OF MULTIMEDIA SYSTEMS AND APPLICATIONS
Author(s):
Abu Asaduzzaman, Manira Rani and Darryl Koivisto
Abstract:
Multimedia systems often struggle while processing multimedia applications because of their limited resources. The demand for a tremendous amount of processing power raises serious challenges for multimedia systems and applications. Studies show that cache memory has a strong influence on the performance of multimedia systems and applications. In our previous work, we optimized level-1 cache parameters to enhance the performance of portable devices running an MPEG4 decoder. The focus of this paper is to evaluate the impact of the level-2 cache on the performance of multimedia systems running MPEG4 and H.264/AVC encoders. We developed a VisualSim model and C++ code to run the simulation. We measure miss rates, CPU utilization, and power consumption while varying the level-2 cache size. Simulation results show that the performance of multimedia systems and applications can be enhanced by optimizing the level-2 cache.

Title:
DC MOTOR USING MULTI ACTIVATION WAVELET NETWORK (MAWN) AS AN ALTERNATIVE TO A PD CONTROLLER IN THE ROBOTICS CONTROL SYSTEM
Author(s):
Walid Emar, Noora Khalaf, Maher Dababneh and Waleed Johar
Abstract:
In this paper, a robust MAWN is proposed. An application that constructs a wavelet network as an alternative to a PD controller in a DC-motor robotics control system is fully investigated. Experimental results show not only that the target performance can be achieved by the proposed wavelet network, but also that it outperforms the conventional PD controller. A literature survey conducted to shed light on this research field shows a sparsity of work addressing this concept, which is what stimulated this work.
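For readers unfamiliar with the idea, a generic wavelet-network controller (not the paper's MAWN) can be sketched as a weighted sum of dilated and translated wavelons applied to the tracking error, in the role a PD controller would otherwise play. The Mexican-hat mother wavelet, the parameter values and the absence of online weight training below are all simplifying assumptions.

# Generic wavelet-network sketch mapping a tracking error to a control signal.
import numpy as np

def mexican_hat(z: np.ndarray) -> np.ndarray:
    """Second-derivative-of-Gaussian ("Mexican hat") mother wavelet."""
    return (1.0 - z**2) * np.exp(-0.5 * z**2)

class WaveletNetwork:
    def __init__(self, centers, dilations, weights):
        self.c = np.asarray(centers, dtype=float)
        self.d = np.asarray(dilations, dtype=float)
        self.w = np.asarray(weights, dtype=float)

    def control(self, error: float) -> float:
        """Control output u(e) = sum_i w_i * psi((e - c_i) / d_i)."""
        return float(np.sum(self.w * mexican_hat((error - self.c) / self.d)))

# Example: three wavelons spread over the expected error range of the DC motor.
net = WaveletNetwork(centers=[-1.0, 0.0, 1.0],
                     dilations=[0.5, 0.5, 0.5],
                     weights=[0.2, 1.0, 0.2])
u = net.control(0.3)     # in practice the weights would be trained, e.g. online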

Title:
BLUEMUSIC: A MULTICHANNEL ARCHITECTURE FOR MUSIC DISTRIBUTION
Author(s):
Marco Furini
Abstract:
Despite the increasing number of e-music downloads and the widespread use of mobile devices, the mobile music market is slowly taking off. Prices seem to be the main obstacle: compared to the wired environment, a song is two to three times more expensive. Furthermore, the mobile network's data transfer rate makes download times very long. In this paper we propose Bluemusic, a multi-channel architecture that couples the use of the mobile phone network with the free-of-charge communication technologies provided in cellphones (e.g., Bluetooth or Wi-Fi) to distribute music in the mobile environment. To protect digital contents, Bluemusic is provided with a security mechanism that prevents illegal content distribution. An evaluation of our approach shows that Bluemusic can help the expansion of the mobile music market.

Title:
A NEW HORIZON BECKONS FOR SAUDI ARABIA IN THE TECHNOLOGICAL AGE OF E-COMMERCE & ON-LINE SHOPPING
Author(s):
Khalid Al-Diri, Dave Hobbs and Rami Qahwaji
Abstract:
Electronic commerce is a worldwide phenomenon. Its diffusion has apparently taken different paths in different nations, partially because of the significantly differing characteristics of national infrastructure and of the political and socio-economic environments for e-commerce adoption. The growing use of the Internet in Saudi Arabia provides a developing prospect for e-shopping. Despite the high potential of online shopping in Saudi Arabia, there is still a lack of understanding concerning the subject matter and its potential impact on consumers. This paper is part of a larger study and aims to establish a preliminary assessment, evaluation and understanding of the characteristics of online shopping in Saudi Arabia. Based on a sample of 144 Internet users, it explores their information-seeking patterns as well as their motivations and concerns for online shopping. Consumers in Saudi Arabia still lack trust in vendors' websites when utilizing the Internet as a shopping channel. They are mainly concerned about issues related to security and privacy when dealing with online vendors, and also about issues regarding the Saudi Internet network and the dominance of English as the main Internet language. The main motivators for Saudis to shop online were convenience, products/services not available offline, and price, in that order. We present and discuss our findings, and identify changes that will be required for broader acceptance and diffusion of online shopping in Saudi Arabia.

Title:
A REAL TIME TRAFFIC ENGINEERING SCHEME FOR BROADBAND CONVERGENCE NETWORK (BCN)
Author(s):
Hwa-Jong Kim, Myoung-Soon Jeong and Jong-Won Kim
Abstract:
Recently, Broadband Convergence Network (BcN), a Korean version of the Next Generation Network (NGN), was introduced to guarantee pre-defined QoS for high-speed multimedia services. Charging users for premium services is being considered in the BcN. For the BcN to be successfully diffused, however, a practical traffic engineering (TE) tool is required, because the BcN is composed of many kinds of subnetworks and real-time feedback (traffic control) would be mandatory for the premium services. In this paper, a new TE scheme for the BcN, Rule Based Capture (RBC)/User Satisfaction Parameter (USP), is proposed to resolve the latent problems of BcN TE. The USP is designed to be adopted by the many BcN subnetworks as a common intermediate description of service quality, instead of conventional QoS parameters. The RBC is introduced to address the real-time TE issue in the face of the vast accumulation of traffic monitoring data. The pilot RBC/USP is implemented on the Linux platform and its performance is investigated. We found that, using RBC/USP, the average traffic log size is reduced to 0.058% for FTP and 2.39% for the streaming service.

Title:
A MULTIMEDIA DATABASE MANAGEMENT SYSTEM FOR MEDICAL DATA
Author(s):
Liana Stanescu, Dumitru Burdescu, Marius Brezovan and Cosmin Stoica
Abstract:
The paper presents a relational multimedia database management system for managing visual and alphanumerical information from the medical domain. The MMDBMS offers numerical and char data types for alphanumerical information, and an Image data type used for storing the visual information in an original manner. The Image data type stores the image in binary form, together with its type, its dimensions, and automatically extracted color and texture information. This information is used in the content-based visual query process. The color information is represented by a color histogram quantized to 166 colors in the HSV color space. The texture information is represented by a vector of 12 values resulting from a texture detection method based on Gabor filters. As an element of originality, this DBMS provides a visual interface for building content-based image queries using color and texture characteristics, together with a modified Select command. Implemented using Java technologies, the MMDBMS is platform independent and can easily be used by medical personnel.
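A hedged sketch of the kind of descriptors the Image data type could store is given below in Python. The exact 166-colour HSV quantization and the Gabor filter bank used in the paper are not reproduced; the bin layout, frequencies and orientations chosen here are assumptions that merely yield a coarse colour histogram and a 12-value texture vector.

# Approximate colour and texture features for content-based queries; the bin
# counts and Gabor bank are illustrative, not the paper's exact parameters.
import numpy as np
from matplotlib.colors import rgb_to_hsv
from skimage.filters import gabor

def hsv_histogram(rgb: np.ndarray, bins=(18, 3, 3)) -> np.ndarray:
    """Coarse HSV colour histogram (162 bins here), normalised to sum to 1."""
    hsv = rgb_to_hsv(rgb.astype(float) / 255.0).reshape(-1, 3)
    hist, _ = np.histogramdd(hsv, bins=bins, range=((0, 1), (0, 1), (0, 1)))
    return hist.ravel() / hist.sum()

def gabor_texture(gray: np.ndarray,
                  frequencies=(0.1, 0.3),
                  thetas=np.arange(6) * np.pi / 6) -> np.ndarray:
    """12-value texture vector: mean filter energy for 2 frequencies x 6 orientations."""
    feats = []
    for f in frequencies:
        for t in thetas:
            real, imag = gabor(gray, frequency=f, theta=t)
            feats.append(np.sqrt(real**2 + imag**2).mean())
    return np.asarray(feats)

# Example feature extraction when an image is inserted into the Image column.
rgb = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
color_vec = hsv_histogram(rgb)
texture_vec = gabor_texture(rgb.mean(axis=2) / 255.0)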

Title:
ON ENCRYPTION AND AUTHENTICATION OF THE DC DCT COEFFICIENT
Author(s):
Li Weng and Bart Preneel
Abstract:
When encryption and authentication techniques are applied to image or video data, it is sometimes advantageous to limit the operation to the DC DCT coefficient of each 8×8 block in a picture. In this work, the performance of such an approach is evaluated. The problem is treated as an image quality problem, and the structural similarity metric is used to show that by authenticating the DC coefficient, about 60% of the information can be guaranteed, while by encrypting the DC coefficient, about 80% of the information can be hindered.
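The Python sketch below illustrates the setting: the DC coefficient of each non-overlapping 8×8 block is isolated with a 2-D DCT and an authentication tag is computed over those values alone. The hash choice and quantization are placeholders, and encryption would analogously mask only these DC values; none of this reproduces the paper's evaluation.

# Operate only on the DC DCT coefficient of each 8x8 block; the hash and
# quantization below are placeholder choices.
import hashlib
import numpy as np
from scipy.fft import dctn

def dc_coefficients(image: np.ndarray, block: int = 8) -> np.ndarray:
    """Collect the (0,0) DCT coefficient of every non-overlapping 8x8 block."""
    h, w = image.shape
    dcs = []
    for y in range(0, h - h % block, block):
        for x in range(0, w - w % block, block):
            dcs.append(dctn(image[y:y+block, x:x+block].astype(float),
                            norm="ortho")[0, 0])
    return np.asarray(dcs)

def authenticate_dc(image: np.ndarray) -> str:
    """A DC-only authentication tag: hash of the quantised DC coefficients."""
    dcs = np.round(dc_coefficients(image)).astype(np.int32)
    return hashlib.sha256(dcs.tobytes()).hexdigest()

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
tag = authenticate_dc(img)      # encryption would instead mask these DC values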

Title:
VOICE USER INTERFACE USING VOICEXML Environment, Architecture and Dialogs Initiative
Author(s):
Alexandre M. A. Maciel and Edson C. B. Carvalho
Abstract:
In this work we present a set of Internet applications with a voice user interface based on the VoiceXML language. The architecture, the main platforms and the dialog initiative modes were studied, and the applicability and limitations of the approach were determined.

Title:
A MULTIMEDIA IMS ENABLED RESIDENTIAL SERVICE GATEWAY
Author(s):
Vitor Pinto, Vitor Ribeiro, Iván Vidal, Jaime García, Francisco Valera, Arturo Azcorra
Abstract:
Internet access has been, until now, the main driver for the generalization of broadband connections in the residential market. For many years, simple IP-based services like email and web browsing were the typical services provided to residential customers. Today the telecommunications market is changing, and operators are looking for ways to provide value-added services through those same IP broadband connections. These will, on the one hand, increase their revenues and, on the other hand, provide the customer with a wider range of services that were until now inaccessible. Triple Play is already a reality, and the convergence between mobile and fixed networks is bringing into the home a new range of IP Multimedia Subsystem (IMS) based services, which used to be exclusive to the mobile world. However, to successfully deliver these new services, the interface between the residential and operator networks, provided by what is usually called the residential gateway (RGW), must be meticulously defined and implemented. This paper focuses on emerging residential services and the implications they impose on the RGW. The coexistence of IMS-based and non-IMS-based services is also addressed in this paper, with special emphasis on RGW Quality of Service (QoS) issues.

Title:
DOCXS A Distributed Computing Environment for Multimedia Data Processing
Author(s):
Tobias Lohe, Michael Fieseler, Steffen Wachenfeld and Xiaoyi Jiang
Abstract:
This paper presents DocXS, a distributed computing environment for multimedia data processing, which was developed at the University of Münster, Germany. DocXS is platform independent due to its implementation in Java, is freely available for non-commercial research, and can be installed on standard office computers. The main advantage of DocXS is that it does not require its users to care about code distribution or parallelization. Algorithms can be programmed using an Eclipse-based user interface, and the resulting Matlab and Java operators can be visually connected into graphs representing complex data processing workflows. Experiments with DocXS show that it scales very well with only a small overhead.
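DocXS itself is Java- and Eclipse-based, so the short Python sketch below is only meant to illustrate the underlying idea of wiring operators into a dependency graph and executing them in order; the operator names and the trivial workflow are made up for the example.

# Minimal workflow-graph execution: operators connected by dependencies and
# run in topological order (Python 3.9+ for graphlib).
from graphlib import TopologicalSorter

# Hypothetical operators: each maps the outputs of its predecessors to a result.
operators = {
    "load":   lambda: list(range(10)),
    "square": lambda xs: [x * x for x in xs],
    "sum":    lambda xs: sum(xs),
}
edges = {"square": {"load"}, "sum": {"square"}}   # node -> set of dependencies

results = {}
for node in TopologicalSorter(edges).static_order():
    deps = [results[d] for d in sorted(edges.get(node, ()))]
    results[node] = operators[node](*deps)
# results["sum"] == 285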

Title:
FIDELITY AND ROBUSTNESS ANALYSIS OF IMAGE ADAPTIVE DWT-BASED WATERMARKING SCHEMES
Author(s):
Franco Del Colle and Juan Carlos Gomez
Abstract:
An Image Adaptive Watermarking method based on the Discrete Wavelet Transform is presented in this paper. The robustness and fidelity of the proposed method are evaluated and the method is compared to state-of-the-art watermarking techniques available in the literature. For the evaluation of watermark transparency, an image fidelity factor based on a perceptual distortion metric is introduced. This new metric allows a perceptually aware objective quantification of image fidelity.
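As a minimal sketch of the general technique (not the paper's image-adaptive scheme), the Python snippet below additively embeds a binary mark into the diagonal DWT detail subband and measures the distortion it introduces; the fixed embedding strength alpha and the PSNR check stand in for the image-adaptive strength selection and the perceptual fidelity factor studied in the paper.

# Additive watermark embedding in a DWT detail subband; alpha fixed here,
# whereas image-adaptive schemes would vary it per coefficient.
import numpy as np
import pywt

def embed_watermark(image: np.ndarray, watermark: np.ndarray, alpha: float = 2.0):
    """Add a +/-1 watermark, scaled by alpha, to the diagonal detail subband."""
    cA, (cH, cV, cD) = pywt.dwt2(image.astype(float), "haar")
    wm = np.resize(np.where(watermark > 0, 1.0, -1.0), cD.shape)
    cD_marked = cD + alpha * wm
    return pywt.idwt2((cA, (cH, cV, cD_marked)), "haar"), wm

# Example: embed a random binary mark and check the distortion it introduces.
img = np.random.randint(0, 256, (128, 128)).astype(float)
mark = np.random.randint(0, 2, (64, 64))
marked, wm = embed_watermark(img, mark)
psnr = 10 * np.log10(255.0**2 / np.mean((img - marked) ** 2))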