SIGMAP 2019 Abstracts


Area 1 - Multimedia and Communications

Short Papers
Paper Nr: 14
Title:

Real-time Nonlinear Signal Processing Super Resolution of 8K Endoscope Cameras

Authors:

Seiichi Gohshi, Chinatsu Mori, Kenkichi Tanioka and Hiromasa Yamashita

Abstract: Presently, 8K is the highest resolution of video systems. Originally, 8K research started only for broadcasting services. However, aside from broadcasting, 8K resolution video systems also have an important application in the medical field, and endoscopic surgery is in its practical stage. In endoscopic surgery, a 0.02 mm thread is used. This thread is not visible when using a 4K endoscope. When an 8K endoscope is employed, the thread is visible, but fine focus is still necessary. Moreover, adjusting the focus of 8K by using only common tools, such as view finders or small monitors, is very difficult. Commercial HD/4K cameras are equipped with auto-focus functions; however, the central areas are not always the focus points. The focus is very deep and the focus points change during endoscopic surgeries. For these reasons, a surgeon must manually control the endoscope focus, and it is very difficult to adjust the focus accurately. Super resolution (SR) has been proposed to sharpen out-of-focus images. However, a real-time SR technology is necessary for the 8K endoscope. In this study, nonlinear signal processing super resolution (NLSP) is introduced to improve the resolution of 8K endoscope cameras. NLSP can enhance the 8K endoscope images and improve the camera’s focus depth.

Area 2 - Multimedia and Vision

Short Papers
Paper Nr: 9
Title:

Depth Generation using Structured Depth Templates

Authors:

Lei Zhang

Abstract: We propose a new stereoscopic image/video conversion algorithm that uses a two-directional structured depth model matching method. This work aims to provide an effective depth map for a 2D scene. By analyzing the structural features of the input image frame, a depth model called a structured depth model is estimated to serve as an initial depth map. The final depth map is then obtained through depth post-processing. Subjective evaluation is performed by comparing manually generated original depth maps with those generated by the proposed method.

Area 3 - Social Multimedia

Full Papers
Paper Nr: 8
Title:

An Overview on Image Sentiment Analysis: Methods, Datasets and Current Challenges

Authors:

Alessandro Ortis, Giovanni M. Farinella and Sebastiano Battiato

Abstract: This paper introduces the research field of Image Sentiment Analysis, analyses the related problems, provides an in-depth overview of current research progress, discusses the major issues and outlines the new opportunities and challenges in this area. An overview of the most significant works is presented. The related specific issues are discussed: emotion representation models, existing datasets and the most used features. A generalizable analysis of the problem is also presented, identifying and analyzing the components that affect the sentiment toward an image. Furthermore, the paper introduces some additional challenges and techniques that could be investigated, proposing suggestions for new methods, features and datasets.

Area 4 - Multimedia Signal Processing

Full Papers
Paper Nr: 2
Title:

Semi-supervised Audio Source Separation based on the Iterative Estimation and Extraction of Note Events

Authors:

Alejandro Delgado Castro and John E. Szymanski

Abstract: In this paper, we present an iterative semi-automatic audio source separation process for single-channel polyphonic recordings, where the underlying sources are isolated by clustering a set of note events, which are considered to be single notes or groups of consecutive notes coming from the same source. In every iteration, an automatic process detects the pitch trajectory of the predominant note event in the mixture and separates its spectral content from the mixed spectrogram. The predominant note event is then transformed back to the time domain and subtracted from the input mixture. The process repeats using the residual as the new input mixture until a predefined number of iterations is reached. When the iterative stage is complete, note events are clustered by the end-user to form individual sources. Evaluation is conducted on mixtures of real instruments and compared with a similar approach, revealing an improvement in separation quality.
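The estimate-and-subtract loop described in this abstract can be illustrated with a deliberately simplified sketch. It is not the authors' method: instead of tracking a pitch trajectory in a spectrogram, it assumes each "note event" is a single stationary sinusoid lying exactly on a DFT bin, and the function names (`dft`, `extract_predominant`, `iterative_separation`) are hypothetical. Only the control flow (find the predominant component, reconstruct it in the time domain, subtract it, repeat on the residual for a fixed number of iterations) mirrors the description.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (adequate for short toy signals)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def extract_predominant(x):
    """Return (bin, time-domain component) of the strongest sinusoid.

    Assumption: the component sits exactly on a DFT bin between 1 and n/2 - 1,
    so its amplitude and phase can be read directly from that bin.
    """
    n = len(x)
    spec = dft(x)
    k = max(range(1, n // 2), key=lambda i: abs(spec[i]))
    amp = 2 * abs(spec[k]) / n
    phase = cmath.phase(spec[k])
    comp = [amp * math.cos(2 * math.pi * k * t / n + phase) for t in range(n)]
    return k, comp


def iterative_separation(mixture, n_iter):
    """Repeatedly extract the predominant component and subtract it."""
    residual = list(mixture)
    events = []
    for _ in range(n_iter):
        k, comp = extract_predominant(residual)
        events.append((k, comp))
        # The residual becomes the input mixture of the next iteration.
        residual = [r - c for r, c in zip(residual, comp)]
    return events, residual


# Toy mixture: two sinusoids on bins 5 and 12 of a 64-sample frame.
N = 64
mixture = [
    1.0 * math.cos(2 * math.pi * 5 * t / N)
    + 0.5 * math.cos(2 * math.pi * 12 * t / N + 0.3)
    for t in range(N)
]
events, residual = iterative_separation(mixture, n_iter=2)
bins = [k for k, _ in events]
```

In the paper's setting, the extracted events would afterwards be grouped into sources by the end-user; that manual clustering step is outside the scope of this sketch.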

Paper Nr: 6
Title:

An Attempt to Create Speech Synthesis Model That Retains Lombard Effect Characteristics

Authors:

Gražina Korvel, Olga Kurasova and Bożena Kostek

Abstract: Speech with the Lombard effect has been extensively studied in the context of speech recognition and speech enhancement. However, few studies have investigated the Lombard effect in the context of speech synthesis. The aim of this paper is to create mathematical models that retain the Lombard effect. These models could be used as the basis of a formant speech synthesizer. The proposed models are based on dividing the speech signal into harmonics and modeling them as the output of a SISO system whose transfer function poles are multiple and whose inputs vary in time. An analysis of the Lombard effect of the synthesized signal is performed on the noise residual. The synthesized signal residual is described by vectors of acoustic parameters related to the Lombard effect. To test the performance of the created models in various noise conditions, two classifiers are employed, namely kNN and Naive Bayes. For comparison of results, we created models of sinusoids based on frequency tracks. The results show that a model based on the residual sinewave sum demonstrates the possibility of retaining the Lombard effect. Finally, future work directions are outlined in the conclusions.

Short Papers
Paper Nr: 1
Title:

Profile Extraction and Deep Autoencoder Feature Extraction for Elevator Fault Detection

Authors:

Krishna M. Mishra, Tomi R. Krogerus and Kalevi J. Huhtala

Abstract: In this paper, we propose a new algorithm for data extraction from time series signal data, together with automatic calculation of highly informative deep features to be used in fault detection. In data extraction, elevator start and stop events are extracted from sensor data, and a generic deep autoencoder model is developed for automated feature extraction from the extracted profiles. The extracted deep features are then classified with a random forest algorithm for fault detection. Sensor data are labelled as healthy or faulty based on the maintenance actions recorded. The remaining healthy data are used for validation of the model to prove its efficacy in terms of avoiding false positives. We have achieved 100% accuracy in fault detection along with avoiding false positives based on the new extracted deep features, which outperforms results using existing features. Existing features are also classified with random forest to compare results. Our developed algorithm provides better results due to the new deep features extracted from the dataset compared to existing features. This research will help various predictive maintenance systems to detect false alarms, which will in turn reduce unnecessary visits of service technicians to installation sites.

Paper Nr: 15
Title:

Illegal Audio Copy Detection using Fundamental Frequency Map

Authors:

Heui-su Son, Sung-woo Byun and Soek-Pil Lee

Abstract: In this paper, we present a new audio identification system which is robust to various attacks. The attacks considered are modifications such as changes of tempo, pitch and speed, as well as noise addition. We propose a two-dimensional representation of the audio signal called FFMAP, which consists of pitch components and frame components. We also employ Pearson’s correlation score to calculate the similarity between the original audio data and the query. Experimental results show that the proposed algorithm achieves high performance.
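As an illustration of the similarity step, the following minimal sketch computes Pearson's correlation between two small two-dimensional maps. The layout (rows as pitch components, columns as frames), the flattening of each map into a single vector, and the names `pearson` and `map_similarity` are assumptions made for the example; the abstract does not specify how the score is computed over the FFMAP.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # correlation is undefined for a constant input
    return cov / math.sqrt(var_a * var_b)


def map_similarity(map_a, map_b):
    """Flatten two equally shaped 2D maps row by row and correlate them."""
    flat_a = [v for row in map_a for v in row]
    flat_b = [v for row in map_b for v in row]
    return pearson(flat_a, flat_b)


# Hypothetical maps: rows are pitch components, columns are frames.
reference = [[0.0, 1.0, 0.0], [2.0, 0.0, 1.0]]
# A query that is a scaled and offset copy of the reference.
query = [[3.0 * v + 1.0 for v in row] for row in reference]
inverted = [[-v for v in row] for row in reference]

sim_same = map_similarity(reference, query)
sim_inverted = map_similarity(reference, inverted)
```

Because Pearson's correlation is invariant to gain and offset, a query that differs from the reference only by a linear rescaling scores close to 1, while an inverted map scores close to -1.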

Area 5 - Multimedia Systems and Applications

Full Papers
Paper Nr: 10
Title:

An Automatic Sleep Scoring Toolbox: Multi-modality of Polysomnography Signals’ Processing

Authors:

Rui Yan, Fan Li, Xiaoyu Wang, Tapani Ristaniemi and Fengyu Cong

Abstract: Sleep scoring is a fundamental but time-consuming process in any sleep laboratory. To speed up the process of sleep scoring without compromising accuracy, this paper develops an automatic sleep scoring toolbox with the capability of multi-signal processing. It allows the user to choose signal types and the number of target classes. Then, an automatic process comprising signal pre-processing, feature extraction, classifier training (or prediction) and result correction is performed. Finally, the application interface displays the predicted sleep structure, related sleep parameters and the sleep quality index for reference. To improve the identification accuracy of minority stages, a layer-wise classification strategy is proposed according to the signal characteristics of sleep stages. The context of the current stage is taken into consideration in the correction phase by employing a Hidden Markov Model to learn the transition rules of sleep stages in the training dataset. These transition rules are then used to correct the classification results. The performance of the proposed toolbox has been tested on 100 subjects with an average accuracy of 85.76%. The proposed automatic scoring toolbox would alleviate the burden on physicians, speed up sleep scoring, and expedite sleep research.
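One common way to realize such a correction phase is sketched below, under stated assumptions: transition probabilities are estimated by counting stage changes in labelled training sequences, and Viterbi decoding then combines them with per-epoch classifier probabilities. The abstract does not say which decoding scheme the toolbox uses, so the functions `estimate_transitions` and `viterbi`, the add-one smoothing, the uniform initial prior, and the two-stage toy data are all illustrative choices.

```python
import math


def estimate_transitions(sequences, n_states, smoothing=1.0):
    """Estimate a row-stochastic transition matrix from labelled sequences."""
    counts = [[smoothing] * n_states for _ in range(n_states)]
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return [[c / sum(row) for c in row] for row in counts]


def viterbi(probs, trans):
    """Most likely stage sequence given per-epoch classifier probabilities.

    probs[t][s] is the classifier's probability of stage s at epoch t.
    A uniform prior over the initial stage is assumed.
    """
    n_states = len(trans)
    eps = 1e-12  # guards against log(0)
    score = [math.log(p + eps) for p in probs[0]]
    backpointers = []
    for obs in probs[1:]:
        new_score, pointers = [], []
        for j in range(n_states):
            best_i = max(
                range(n_states),
                key=lambda i: score[i] + math.log(trans[i][j] + eps),
            )
            new_score.append(
                score[best_i]
                + math.log(trans[best_i][j] + eps)
                + math.log(obs[j] + eps)
            )
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace the best path backwards from the best final stage.
    state = max(range(n_states), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Toy training data with two stages that rarely switch back and forth.
training = [[0, 0, 0, 0, 1, 1, 1, 1], [0, 0, 0, 1, 1, 1, 1, 1]]
trans = estimate_transitions(training, n_states=2)

# Classifier output whose per-epoch argmax contains an isolated stage flip.
probs = [[0.9, 0.1], [0.9, 0.1], [0.45, 0.55], [0.9, 0.1], [0.9, 0.1]]
raw = [max(range(2), key=lambda s: p[s]) for p in probs]
corrected = viterbi(probs, trans)
```

In this toy case the isolated flip in the raw predictions is removed, because the learned transitions make a one-epoch excursion to the other stage less plausible than staying in the current one.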

Short Papers
Paper Nr: 12
Title:

A Framework for Visualizing the Dynamic Events of Carbon Nanocomposites using Virtual and Augmented Reality Tools

Authors:

Razib Iqbal, Taylor Kuttenkuler, Chad Brewer and Ridwan Sakidja

Abstract: Atomic interactions pertaining to carbon nanocomposites can be elusive and hard to comprehend, and as such great benefit can be gained by visualizing these interactions within a virtual or augmented reality setting. In this paper, we present a framework that can be used for Materials Science research and education incorporating topics related to the dynamics in nanomaterials. We developed a proof-of-concept implementation of this framework for Virtual Reality (VR) and Augmented Reality (AR) settings using the Unity game engine. Throughout this paper, we discuss our framework as well as the related user experiences and performance measurements we gathered when using our framework with the Google Daydream, HTC Vive, and Microsoft HoloLens to introduce scientists to the use of AR and VR as a tool for nanocomposite and molecular dynamics research.

Paper Nr: 7
Title:

Advanced Multi-neural System for Cuff-less Blood Pressure Estimation through Nonlinear HC-features

Authors:

Francesco Rundo, Alessandro Ortis, Sebastiano Battiato and Sabrina Conoci

Abstract: Blood Pressure (BP) is one of the most important physiological indicators that can provide useful information in the medical field. BP is usually measured by a sphygmomanometer device, which is composed of a cuff and a mechanical manometer. In this paper, a novel algorithmic approach to accurately estimate both systolic and diastolic blood pressure is presented. This algorithm exploits the PhotoPlethysmoGraphy (PPG) signal pattern acquired by a non-invasive and cuff-less Physio-Probe (PP) silicon-based SiPM device. The PPG data are then processed with an ad-hoc bio-inspired mathematical model which estimates both systolic and diastolic pressure values. We compared our results with those measured using a classical sphygmomanometer device, and encouraging results of about 97% accuracy were achieved.

Paper Nr: 13
Title:

Effective Image Processing Procedure for Skin Lesion Recognition in Contactless Skin Diagnosis Devices

Authors:

Hansoo Kim

Abstract: An image analysis procedure for recognizing various skin lesions in a contactless skin diagnosis environment is proposed. The proposed procedure is composed of five stages, and experimental results show that issues such as uneven distribution of light are properly addressed and that various skin lesions are effectively discriminated according to their characteristics using image processing technology and shadow analysis.