


Vol 29, No 1 (2019)
- Year: 2019
- Articles: 20
- URL: https://journals.rcsi.science/1054-6618/issue/view/12266
Proceedings of the 6th International Workshop
An Intelligent Information Technology for Symbol-Extraction from Weakly Formalized Graphic Documents
Abstract
In this paper, we consider algorithms for synthesizing effective distortion-tolerant features and decision rules to recognize weakly formalized objects on large-format bitmap images. In addition, we analyze the effect of binarization of the initial bitmap description on the quality of subsequent automatic recognition.



Developing and Studying the Algorithm for Segmentation of Simple Images Using Detectors Based on Doubly Stochastic Random Fields
Abstract
A problem of segmentation of simulated images with the simplest objects is considered. In addition, an algorithm is developed for segmentation of doubly stochastic images that is based on correlation properties rather than brightness properties of images. The efficiency of this algorithm is studied. To increase the segmentation accuracy, an anomaly-detection algorithm based on doubly stochastic random fields is proposed. The proposed algorithms are studied for various levels of signals. They are compared with the known segmentation algorithms. Image-segmentation software is developed. Its brief description is given.



Developing a Filtering Algorithm for Doubly Stochastic Images Based on Models with Multiple Roots of Characteristic Equations
Abstract
The properties of doubly stochastic models constructed using a combination of autoregression models with multiple roots of characteristic equations are studied. These models are demonstrated to be adequate to real multidimensional signals; the probabilistic and correlation properties of the simulated signals are studied. Based on the proposed models, a filtering algorithm is developed for doubly stochastic autoregression random fields generated by the models with multiple roots of the characteristic equations. The algorithm is compared to the alternative approaches.



Object Identification on Low-Count Images by Means of Maximum-Likelihood Descriptors of Precedents
Abstract
The object recognition/identification problem is considered. More specifically, the paper focuses on identifying objects by their intensity form (shape) on images of a special class. This class (low-count images) is related to the registration of low-intensity radiation and, therefore, relatively small numbers of photo-counts. Typical properties of low-count images are low signal-to-noise ratio, low contrast, and fuzzy shape of imaged objects. Therefore, classical methods intended to recognize images with acceptable quality characteristics are, in general, not sufficiently efficient for images of the considered class; new recognition approaches must be developed for them. Such an approach is proposed in the present paper. The proposed approach is oriented to methods of statistical (machine) learning and is intended to identify objects given by a set of random points (counts). In the framework of the discussed approach, the recognition problem is formalized as the statistical classification (identification) problem for the intensity of point processes with respect to classes formed according to the data observed earlier (precedents). To implement the proposed approach, we reduce the recognition problem to the problem of statistical learning with the maximum-likelihood descriptions of observed precedents. In the framework of the discussed approach, the identification process for intensities to be recognized is treated as follows: for each such intensity, we select a precedent from the already formed database such that it is maximum-likelihood for its description. We extend the notion of precedents up to the class of affine-like form transformations; i.e., the recognized image is determined up to its size and location. In the present paper, the proposed approach is developed up to the algorithmic implementation level. The structure of the obtained algorithms is close to the structure of the well-known EM-algorithm in its variational-Bayesian treatment.



Visible and Infrared Imaging Based Inspection of Power Installation
Abstract
The inspection of power lines is a crucial task for the safe operation of power transmission: its components require regular checking to detect damage and faults caused by corrosion, other environmental agents, or mechanical stress. In recent years, the use of Unmanned Autonomous Vehicles (UAVs) for environmental and industrial monitoring has grown steadily, and the demand for fast and robust algorithms for analyzing the data acquired by drones during inspections has increased. In this work, we use a UAV to acquire power transmission line data and apply image processing to highlight expected faults. Our method is based on a fusion algorithm for infrared and visible power line images, which is invariant to large scale changes and illumination changes in the real operating environment. Different image processing algorithms are applied to the visible and infrared thermal data to track the power lines and to detect faults and anomalies. The method identifies edges and hot spots from the set of frames with good accuracy. At the final stage, we identify hot spots using thermal images. The paper concludes with a description of the current work, which has been carried out within the SCIADRO research project.



Method of Optimal Circular Path for Iris Template Matching
Abstract
A modification of the conventional method for iris template matching is proposed. The normalized iris images to be compared are divided into several sectors, which are then shifted in the angular and radial directions to find the best alignment. The values of these shifts are determined by finding the optimal path in the distance array. This approach is algorithmically and computationally simpler than its well-known analogs and yields similar or better results in terms of recognition accuracy. In addition, the problem of selecting the optimal parameters for the wavelets that form the iris pattern is investigated. It is shown that improving the accuracy of alignment makes it possible to use more informative high-frequency wavelets. Numerical experiments are carried out on the ICE2005 and CASIA open image databases.
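The sector-wise alignment idea can be illustrated with a small sketch. This is not the authors' algorithm: it assumes one-dimensional binary templates, angular shifts only, a Hamming distance per sector, and a smoothness rule (hypothetical) that neighboring sectors may differ by at most one shift step. It also ignores the circular closure of the path. All function names are illustrative.

```python
def sector_distances(a, b, n_sectors, max_shift):
    """Build a distance array: one row per sector, one column per angular shift."""
    n = len(a)
    size = n // n_sectors
    shifts = list(range(-max_shift, max_shift + 1))
    table = []
    for k in range(n_sectors):
        idx = range(k * size, (k + 1) * size)
        # Normalized Hamming distance of sector k against b shifted by s.
        row = [sum(a[i] != b[(i + s) % n] for i in idx) / size for s in shifts]
        table.append(row)
    return table, shifts


def best_path_cost(table):
    """Cheapest path through the distance array by dynamic programming.

    Assumption: the shift index may change by at most one between
    neighboring sectors. Returns the mean per-sector distance on the path.
    """
    prev = list(table[0])
    for row in table[1:]:
        cur = []
        for j, d in enumerate(row):
            lo, hi = max(0, j - 1), min(len(row), j + 2)
            cur.append(d + min(prev[lo:hi]))
        prev = cur
    return min(prev) / len(table)
```

With `max_shift=0` the sketch reduces to a plain unshifted comparison, which makes the benefit of allowing shifts easy to check on toy data.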



Applied Problems
Multisource Speech Analysis for Speaker Recognition
Abstract
On a comprehensive speech database, speaker recognition characteristics are compared for various voice-source models. Inverse problems of finding a source from vowel speech segments are solved on the basis of a special speech-production model and voice-source models (A-source, piecewise-linear source, nonparametric source, and source found by means of the spectral relation method). In the first stage, we find the pulses such that the relative residuals of their segmented and their theoretical analogs computed by means of the speech-production model are less than 0.25. For the selected pulses, a posteriori estimates of the error of their determination are computed and the final selection of the source pulses is performed: for the recognition procedure, we keep only pulses whose a posteriori error estimates are below the accepted level of 0.3. In the space of parameters found for each source model, a statistical model is created for each speaker and the recognition is performed. For speaker recognition with respect to one vowel, the mean error is approximately equal to 66% for the piecewise-linear source, 61% for the spectral relation method, and 33% for the A-source.



Modern Problems of Brain-Signal Analysis and Approaches to Their Solution
Abstract
A number of problems of modern quantitative EEG are considered: the non-stationarity of the processed signal, the decreasing accuracy of the received results due to the averaging effect in the spectral analysis methods used in computer EEG, and the impossibility of detecting complexes in analyzed EEG without involving a clinician. A structural (syntactic) approach is suggested, which reduces the negative effect of these problems, and the mathematical apparatus for its realization is developed. The results of application of the structural approach to analysis of a real EEG are given.



Ground Object Information Recovery for Thin Cloud Contaminated Optical Remote Sensing Images
Abstract
Ground object information on optical remote sensing images is obscured by thin clouds. This paper proposes a ground object information recovery algorithm for thin cloud contaminated optical remote sensing images that combines support vector guidance filtering and transfer learning. First, thin cloud contaminated target images and cloud-free images are decomposed into multidirectional subbands by using the multi-directional nonsubsampled dual-tree complex wavelet packet transform (NS-DTCWPT). Then a support vector guided filter is applied to remove thin cloud, and a transfer learning method is used to predict the ground object information on the multidirectional subbands. Finally, the processed multidirectional subbands are reconstructed by using the inverse multi-directional NS-DTCWPT to obtain the ground object information recovery images. The proposed algorithm combines the advantages of support vector guidance filtering and transfer learning. Experimental results show that the proposed algorithm can effectively remove thin clouds from optical remote sensing images and obtain a good recovery of ground object information.



Motion Maps and Their Applications for Dynamic Object Monitoring
Abstract
In this paper, the task of dynamic object behavior analysis is considered. We use basic optical flow to form integral optical flow and use it to separate background and foreground and obtain regions of intensive motion. Based on information extracted from integral optical flow, we introduce the notion of motion maps and show how they can be used for dynamic object motion analysis. In motion maps, we analyze pixel motions statistically for each frame to obtain the number of pixels moving toward or away from each position and their comprehensive motion at each position. We then define and compute regional motion indicators to describe motions at the regional level. These indicators are further used for analysis of dynamic object behavior. We conduct a set of experiments on real-world videos. Experimental results show that motion maps can be effectively used for dynamic object monitoring.
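The per-position counting behind a motion map can be sketched as follows. This is a simplified reading of the abstract, not the authors' definition: it assumes the integral optical flow is already available as a per-pixel displacement `(dx, dy)`, counts a pixel as "moving toward" the position where its displacement lands, and counts it as "moving away" from its own position. The function name `motion_maps` is hypothetical.

```python
def motion_maps(flow):
    """Count, for each position, pixels arriving at it and pixels leaving it.

    flow[y][x] is an assumed (dx, dy) displacement for the pixel at (x, y).
    Returns two 2D lists of the same size: incoming and outgoing counts.
    """
    h, w = len(flow), len(flow[0])
    incoming = [[0] * w for _ in range(h)]
    outgoing = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = flow[y][x]
            tx, ty = x + round(dx), y + round(dy)
            if (tx, ty) == (x, y):
                continue  # static pixel: contributes to neither map
            outgoing[y][x] += 1
            if 0 <= tx < w and 0 <= ty < h:
                incoming[ty][tx] += 1  # displacement lands inside the frame
    return incoming, outgoing
```

Regional indicators could then be obtained by summing these counts over blocks of the frame, though how the paper defines them is not stated in the abstract.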



Segmentation of a Point Cloud by Data on Laser Scanning Intensities
Abstract
This paper presents a method of segmentation of a point cloud into individual objects. The described method is based on using the data on the intensities of the reflected signals obtained by laser scanning. To accelerate the calculations, the point cloud is preliminarily partitioned into non-overlapping sets, superpixels. The objects are segmented by merging the superpixels and taking account of two features: the distance between the histograms of the intensity distribution and the coordinates of the points. To improve the quality of segmentation, three-dimensional filtering using a low-frequency Gabor filter is suggested. The calculation results and an analysis of the effectiveness of the proposed method are given.
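A merge test that combines the two features named above, histogram distance and point coordinates, might look like the sketch below. The specific choices are assumptions, since the abstract does not give them: an L1 distance between normalized intensity histograms, Euclidean distance between superpixel centroids, intensities scaled to [0, 1], and illustrative thresholds. Each point is taken to be a tuple `(x, y, z, intensity)`.

```python
import math


def intensity_histogram(values, bins=4, lo=0.0, hi=1.0):
    """Normalized histogram of intensities assumed to lie in [lo, hi]."""
    counts = [0] * bins
    for v in values:
        k = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[max(k, 0)] += 1
    return [c / len(values) for c in counts]


def centroid(points):
    """Mean x, y, z of a list of (x, y, z, intensity) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def should_merge(sp_a, sp_b, max_hist_dist=0.5, max_centroid_dist=1.0):
    """Merge two superpixels if their histograms and positions are both close."""
    hist_a = intensity_histogram([p[3] for p in sp_a])
    hist_b = intensity_histogram([p[3] for p in sp_b])
    hist_dist = sum(abs(x - y) for x, y in zip(hist_a, hist_b))  # L1 distance
    spatial_dist = math.dist(centroid(sp_a), centroid(sp_b))
    return hist_dist <= max_hist_dist and spatial_dist <= max_centroid_dist
```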



Implementing an Android Application for Automatic Vietnamese Business Card Recognition
Abstract
This paper presents an application for automatic Vietnamese business card recognition. Vietnamese business cards are characterized by complex layouts, mixed Vietnamese-English text, and diverse typesetting. The goal is to extract meaningful information from the scanned image, such as names, phone numbers, email addresses, job titles, and more. The proposed recognition algorithm is divided into five parts: (1) business card processing, (2) text block finder, (3) image binarization, (4) OCR processing, and (5) linguistic processing. The final Android application supports both Vietnamese and English business cards, and it manages and synchronizes data on all of the user’s devices. It also allows users to edit or delete contact information at will. They can call, message, or email contacts directly in the application or share that information with others. The results show that our application has a definite advantage in recognizing Vietnamese business cards, or mixed English and Vietnamese cards, when compared to existing commercial applications on the Google Play store.



A Comparative Analysis of Segmentation Techniques for Lung Cancer Detection
Abstract
Cancer is a major cause of death worldwide. Lung cancer is one of the most common types of cancer and is the leading cause of cancer death. Lung cancer is defined as a malignant lung tumor characterized by unregulated cell growth in the lung tissues. If the disease is not treated at an early stage, this unregulated growth can spread into neighboring tissues and other parts of the body. Detecting lung cancer at an early stage is very difficult because there are very few or no symptoms at that point, and most cases are diagnosed at later stages. Treatment of lung cancer in the early stages can improve the survival rate, so early detection is essential. In this paper, we present a comparative analysis of various image segmentation techniques for the detection of lung cancer. These methods include thresholding methods, marker-controlled watershed segmentation, edge detection, and PDE-based segmentation techniques.



Face Recognition Using Fuzzy Minimal Structure Oscillation in the Wavelet Domain
Abstract
This paper proposes a novel face recognition technique using Fuzzy Minimal Structure Oscillation in the wavelet domain. The pipeline consists of three parts: convert color face images into red, green, and blue (RGB) face images; apply a 2D wavelet transformation (Haar) for dimension reduction, removing redundancies while preserving the original features of the image; and, lastly, apply Fuzzy Minimal Oscillation to distinguish unknown face images from a set of known images. The new face recognition algorithm was developed and tested using MATLAB 7.9 software.
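The first two stages, channel splitting and Haar-based dimension reduction, can be sketched in a few lines. The sketch assumes a single decomposition level, even image dimensions, and that only the approximation (LL) subband is kept. None of these details are given in the abstract, the function names are hypothetical, and the fuzzy classification stage is not covered.

```python
def split_rgb(image):
    """Split a 2D list of (r, g, b) tuples into three 2D channel lists."""
    return [[[px[c] for px in row] for row in image] for c in range(3)]


def haar_ll(channel):
    """Level-1 2D Haar approximation (LL) subband with orthonormal scaling.

    Each output value is the sum of a 2x2 block divided by 2, so the
    result has half the height and half the width of the input.
    """
    h, w = len(channel), len(channel[0])
    return [
        [
            (channel[y][x] + channel[y][x + 1]
             + channel[y + 1][x] + channel[y + 1][x + 1]) / 2.0
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def reduced_features(image):
    """Apply the Haar LL reduction to each color channel separately."""
    return [haar_ll(channel) for channel in split_rgb(image)]
```

Each level quarters the number of coefficients per channel, which is where the dimension reduction comes from.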



Cross-Layered Embedding of Watermark on Image for High Authentication
Abstract
Watermarking of multimedia is receiving increasing attention in the research community. In this non-blind watermarking approach, a watermark is reshaped and grouped into odd and even images. Next, a wavelet transform is applied to the original image and, to improve the robustness of the technique, watermarking is performed only in the upper diagonal regions of the middle-level wavelet coefficients. The backward process is then carried out to obtain the watermarked image. To prove ownership, the original and watermarked images undergo the same operation to recover the copyright information. Experimental results indicate that the quality of the watermarked image is better and that the algorithm is robust against filtering and geometrical attacks.



Representation, Processing, Analysis, and Understanding of Images
Applications of Algebraic Moments for Corner and Edge Detection for a Locally Angular Model
Abstract
An approach to subpixel corner and edge detection in images is described. The approach is based on algebraic moments of the brightness function describing a halftone image. For an ideal two-dimensional L-corner edge, we consider a model with the following four parameters: the coordinates of the corner vertex, the orientation and the degree measure of the corner, and the brightness values from both sides of the edge. A particular case of the angular model is a linear model that describes a linear edge. To obtain all parameters of the model, six algebraic moments are used. To compute the moments rapidly, masks are used. Based on the angular model, we propose an algorithm that makes it possible to quickly refine the coordinates of the corner and edge points in the image with subpixel accuracy. The use of integral characteristics increases the resistance to various kinds of noises. We perform experiments to confirm the efficiency of the proposed approach.



Dimensionality Reduction of Hyperspectral Images Using Pooling
Abstract
A hyperspectral image, with its huge number of narrow and contiguous bands, involves high computational complexity in processing and analysis. Hence, dimensionality reduction is applied as an essential pre-processing step for hyperspectral data. Pooling is a technique for reducing spatial dimension that has been successfully applied in convolutional neural networks. Various pooling strategies exist, such as max pooling and mean pooling, each with its own merits. In the present article, the concept of pooling is applied in the spectral dimension of the hyperspectral data to reduce the dimensionality, and the result is compared with a standard reduction process such as principal component analysis. Different pooling methods are applied and compared, and mean pooling is found to perform better. The results are compared in terms of overall accuracy and execution time.
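Pooling along the spectral axis can be sketched as below. The abstract does not state the window size, stride, or how leftover bands are handled, so the sketch assumes non-overlapping windows and drops trailing bands that do not fill a window. The function names are illustrative.

```python
def pool_spectrum(spectrum, window, mode="mean"):
    """Pool one pixel's spectrum over non-overlapping groups of bands.

    Trailing bands that do not fill a complete window are dropped
    (an assumption made for this sketch).
    """
    pooled = []
    for start in range(0, len(spectrum) - window + 1, window):
        group = spectrum[start:start + window]
        pooled.append(max(group) if mode == "max" else sum(group) / window)
    return pooled


def pool_cube(cube, window, mode="mean"):
    """Apply spectral pooling to every pixel of a cube indexed [row][col][band]."""
    return [[pool_spectrum(pixel, window, mode) for pixel in row] for row in cube]
```

For example, a window of 2 halves the number of bands while leaving the spatial dimensions unchanged.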



Software and Hardware for Pattern Recognition and Image Analysis
Improved Architecture and Configurations of Feedforward Neural Networks to Increase Forecasting Accuracy for Moments of Finite Normal Mixtures
Abstract
The methodology of forecasting using neural networks for modified data, with original values replaced by discrete ones, was successfully applied in the authors’ previous works to mixture moments approximated with the method of moving separation of mixtures. Previously, a uniform architecture was used for all analyzed time series. This paper proposes an improved type of neural network architecture with grid-search hyper-parameter tuning for expectation, variance, skewness, and kurtosis. It increases prediction accuracy to as much as 99.7% for some combinations of forecast periods and numbers of gaps. Numerous tables and plots are presented to better demonstrate the results.



A Smart Security System Using Multimodal Features from Videos
Abstract
A multi-modal biometric authentication system uses more than one biometric feature, and the use of multi-modal biometrics improves security by making the system invulnerable to spoofing attacks. The proposed system uses face and gait biometrics for authentication and identification. The authentication is done unobtrusively, without the knowledge or cooperation of the user, with the help of surveillance cameras. Videos captured from two surveillance cameras, placed at fronto-parallel and fronto-normal views, are given as input to the system. The gait system uses the video from the fronto-parallel view and a model-free approach to extract a spatio-temporal motion summary of the gait cycle. The gait features are compared by calculating the Euclidean distance between them. The face system uses the video from the fronto-normal view and an appearance-based approach to extract features from the face of the user. The face features are compared by calculating the Chi-Square dissimilarity between them. Score-level fusion is performed to provide an enhanced security system. A threshold value is set and compared with the scores to authenticate the person. The Minimum Distance Classifier is used to identify the person by fusing the multimodal features.
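The matching and fusion steps can be sketched as follows. Several details are assumptions, since the abstract does not specify them: the chi-square form used here is one common variant, the fusion is an equally weighted sum of unnormalized scores, and the feature vectors are stored in dictionaries with hypothetical keys `"gait"` and `"face"`.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two gait feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def chi_square(a, b):
    """Chi-square dissimilarity between two face histograms (one common form)."""
    return sum((x - y) ** 2 / (x + y) for x, y in zip(a, b) if x + y > 0)


def fused_score(probe, enrolled, w_gait=0.5, w_face=0.5):
    """Score-level fusion as a weighted sum of the two dissimilarities."""
    return (w_gait * euclidean(probe["gait"], enrolled["gait"])
            + w_face * chi_square(probe["face"], enrolled["face"]))


def authenticate(probe, enrolled, threshold):
    """Accept the claimed identity if the fused score is within the threshold."""
    return fused_score(probe, enrolled) <= threshold


def identify(probe, gallery):
    """Minimum distance classifier: the enrolled person with the lowest fused score."""
    return min(gallery, key=lambda name: fused_score(probe, gallery[name]))
```

In practice the two scores would usually be normalized to a common range before fusion, since Euclidean and chi-square values are not on the same scale.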



Erratum


