Contents of Volume 32 (2022)

1/2022 2/2022 3/2022 4/2022 5/2022


  • [1] Huang L., Shi X., Xiang J., (China)
    A method for joint detection and re-identification in multi-object tracking, pp. 285-300

    Abstract: To better balance detection accuracy and tracking speed, we propose an online balanced multi-object tracking method (BalMOT), which integrates object detection and appearance extraction into a single network and simultaneously outputs detections and appearance embeddings. We also model the training of classification, regression, and embedding features as a multi-task training problem, in which each part is weighted based on the task-independent uncertainty method. In addition, we introduce a transition layer to optimize the repeated gradient information in the network and reduce the training cost. After training, our BalMOT system reaches 71.9% multiple object tracking accuracy (MOTA) on the MOT17 challenge dataset, and the speed varies between 17.4 and 22.3 frames per second (FPS) depending on the size of the input image.

  • [2] Khan S., Mukati A., Rizvi S.S.H., Yazdanie N., (Pakistan)
    Tooth segmentation in 3D cone-beam CT images using deep convolutional neural network, pp. 301-318

    Abstract: Segmentation of an individual tooth in dental radiographs is of great significance in orthodontic surgery and dentistry. Machine learning techniques, especially deep convolutional neural networks, can play a key role in revolutionizing the way orthodontic surgeons and dentists work. Lately, many researchers have been working on tooth segmentation in 3D volumetric dental scans with a great degree of success, but to the best of our knowledge, there is no publicly available pre-trained neural network for performing tooth segmentation in 3D cone-beam dental CT scans. The methods proposed so far by researchers in this domain are based on complex multistep pipelines. The lack of a pre-trained model blocks the path for further exploration in this domain. In this research, we have produced a deep learning model for tooth segmentation from CBCT dental radiographs. The proposed model can segment teeth in CBCT scans in a single step. To train the proposed model, we obtained a dataset consisting of 70 3D CBCT volumes from a local health facility. We labeled the ground truth through a semi-automatic method and trained our neural network. The training yielded a validation accuracy of 95.57% on a binary class semantic segmentation of the 3D CBCT volumes. The model is able to segment teeth, regardless of their type, from the background in a single step. This eliminates the need for the complex and lengthy pipelines that many researchers have been proposing. The proposed model can be extended by incorporating labeling schemes. Custom labeling schemes will help healthcare professionals to perform the labeling as per their needs. The produced model can also provide a basis for further research in this domain.

  • [3] Sumathi K., Vinod V., (India)
    Classification of fruits ripeness using CNN with multivariate analysis by SGD, pp. 319-332

    Abstract: Ripeness estimation of fruits is an essential process that impacts the quality of fruits and their marketing. Nearly 30% to 35% of harvested fruits are wasted due to a lack of skilled workers in classification and fruit grading. Although ripeness can be assessed by humans, this is time consuming, costly and error prone. A lot of research has been carried out to automate the quality assessment of fruits. Several hyper-parameters have been considered in order to obtain a robust convolutional neural network (CNN). This paper focuses on an image resizer and the stochastic gradient descent (SGD) algorithm for computing the loss. It updates the parameters by concentrating on the red, green, and blue (RGB) channels to identify and classify the images as ripe or rotten. A real-time dataset (6702 images) of oranges, papayas and bananas was collected. Using the SGD optimizer, a learning rate of 0.01 and the nearest neighbor interpolation algorithm as resizer, the proposed model achieved an accuracy rate of 96.56% after 38 epochs in classifying the fruits as ripe or rotten. It is also observed that a small dataset can be used with a 16-layer Visual Geometry Group (VGG) network under the above specification and a good accuracy rate can still be achieved.
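
For entry [3] above, the two named ingredients (nearest neighbor interpolation as the resizer and an SGD update with a learning rate of 0.01) can be illustrated with a minimal pure-Python sketch. This is not the authors' pipeline: the function names `resize_nearest` and `sgd_step`, the floor-based index mapping, and the toy inputs are all assumptions made for illustration.

```python
def resize_nearest(image, new_h, new_w):
    """Resize a 2-D grid (list of rows) with nearest neighbor interpolation.

    Assumption: output index i maps to source index floor(i * old / new),
    which is one common convention; libraries may align pixel centres instead.
    """
    old_h, old_w = len(image), len(image[0])
    resized = []
    for i in range(new_h):
        src_i = min(old_h - 1, i * old_h // new_h)
        row = []
        for j in range(new_w):
            src_j = min(old_w - 1, j * old_w // new_w)
            row.append(image[src_i][src_j])
        resized.append(row)
    return resized


def sgd_step(params, grads, learning_rate=0.01):
    """One plain SGD update: p <- p - learning_rate * g."""
    return [p - learning_rate * g for p, g in zip(params, grads)]


# Toy data (hypothetical): a 2x2 "image" upsampled to 4x4.
small = [[1, 2],
         [3, 4]]
big = resize_nearest(small, 4, 4)

# Toy parameters and gradients (hypothetical values).
updated = sgd_step([1.0, -2.0], [10.0, -10.0])
```

Each source pixel is simply repeated in the enlarged grid, which is why nearest neighbor resizing is cheap but blocky; the SGD step moves every parameter a small distance against its gradient.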


  • [1] Yildirim S., Durmusoglu A., Sevim C., Bingol M.S., Kalkat M., (Turkey)
    Design of neural predictors for predicting and analysing COVID-19 cases in different regions, pp. 233-251

    Abstract: Nowadays, some unexpected viruses are causing people many troubles. The COVID-19 virus has spread around the world very rapidly. However, it seems that predicting cases and fatalities is not easy. Artificial neural networks are employed in many areas for predicting a system's parameters in simulation or real-time approaches. This paper presents the design of neural predictors for analysing the cases of COVID-19 in three countries. The three countries were selected because they lie in different regions. In particular, these major countries' cases were selected for predicting future effects. Furthermore, three types of neural network predictors were employed to analyse COVID-19 cases. NAR-NN is one of the proposed neural networks that have three layers with one input layer neurons, hidden layer neurons and an output layer with fifteen neurons. Each neuron uses the tan-sigmoid activation function. The other proposed neural network, ANFIS, consists of five layers with two inputs and one output, and ARIMA uses four iterative steps to predict. The proposed neural network types have been selected from many other types of neural networks. These neural network structures are feed-forward types rather than recurrent neural networks. Learning is faster than with other types of networks. Finally, the three types of neural predictors were used to predict the cases. The R² and MSE results showed that the three types of neural networks perform well in predicting and analysing the cases of the three countries.

  • [2] Lhotska L., Husak J., Stejskal J., Kotek M., Dolezal J., Adolf J., (CZ)
    Role of virtual reality in the life of ageing population, pp. 253-267

    Abstract: Virtual reality (VR) has been on the scene for several decades already. Its first applications were in gaming. However, hardware and software were expensive and thus not for everybody. Since then, technology has developed quickly and opened new application areas for VR. Currently many commercial systems are available for gaming, training and education, simulations, design, and also for medical purposes. In the article we focus on VR applications in healthcare. First we present existing commercial solutions and research studies showing the potential of VR in healthcare. In recent years, many interesting projects and applications have appeared that target the ageing population. We present examples of such projects. Based on our previous experience and an analysis of available solutions, we propose a conceptual architecture of a software environment for the development of such applications and discuss their potential use. Finally, the implementation of the proposed architecture for an interactive application of experience sets is described.

  • [3] Cui J., Su L., Wei R., Li G., Hu H., Dang X., (China)
    EEG authentication based on deep learning of triplet loss, pp. 269-283

    Abstract: As a novel biometric characteristic, the electroencephalogram (EEG) is used for biometric authentication. To solve the challenge of efficiently growing the number of classifications in traditional classification networks and to increase the practicality of engineering, this paper proposes an authentication approach for EEG data based on an attention mechanism and a triplet loss function. The method begins by feeding EEG signals into a deep convolutional network, maps them to 512-dimensional Euclidean space using a long short-term memory network combined with an attention mechanism, and obtains feature vectors for EEG signals with identity information; it then adjusts the network parameters using a triplet loss function, such that the Euclidean distance between feature vectors of similar signals decreases while the distance between signals of a different type increases. Finally, the recognition method is evaluated using publicly available EEG data sets. The experimental results suggest that the method maintains the recognition rate while effectively expanding the classifications of the model, thus boosting the practicability of EEG authentication.
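
The triplet loss idea in entry [3] above (pull embeddings of the same identity together, push different identities apart) can be sketched in a few lines. This is a generic hinge-style formulation, not necessarily the exact one used in the paper: the margin of 0.2, the use of plain rather than squared Euclidean distance, and the 2-dimensional toy vectors (instead of 512-dimensional embeddings) are assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin` (an assumed hyper-parameter).
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy embeddings (hypothetical values).
anchor = [0.0, 0.0]
positive = [0.1, 0.0]        # same identity, close to the anchor
easy_negative = [1.0, 0.0]   # different identity, already far away
hard_negative = [0.2, 0.0]   # different identity, still too close

easy = triplet_loss(anchor, positive, easy_negative)
hard = triplet_loss(anchor, positive, hard_negative)
```

Only the hard triplet produces a non-zero loss, so only it would contribute a gradient during training. Because the model learns a distance rather than a fixed set of class outputs, new identities can be enrolled by storing their embeddings instead of retraining a classifier head.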


  • [1] Hamplová A., Franc D., Veselý A., (CZ)
    An improved classifier and transliterator of hand-written Palmyrene letters to Latin, pp. 181-195

    Abstract: This article presents the problem of improving the classifier of handwritten letters from historical alphabets, using letter classification algorithms and transliterating them to Latin. We apply it to the Palmyrene alphabet, which is a complex alphabet in which some letters are very similar to each other. We created a mobile application for the Palmyrene alphabet that is able to transliterate hand-written letters or letters that are given as photograph images. At first, the core of the application was based on MobileNet, but the classification results were not good enough. In this article, we suggest an improved, better performing convolutional neural network architecture for the hand-written letter classifier used in our mobile application. Our suggested new convolutional neural network architecture shows an improvement in accuracy from 0.6893 to 0.9821 (a relative gain of about 42%, i.e. to roughly 142% of the original value) for the hand-written model in comparison with the original MobileNet. Future plans are to improve the photographic model as well.

  • [2] Bandi S.R., Anbarasan M., Sheela D., (India)
    Fusion of SAR and optical images using pixel-based CNN, pp. 197-213

    Abstract: In remote sensing, sensors operating at different wavelengths capture data. Every sensor has its own capabilities and limitations. Synthetic aperture radar (SAR) collects data that has a high spatial and radiometric resolution. Optical remote sensors capture images with good spectral information. Images fused from these sensors with a suitable algorithm carry more information, yielding data suited to weather forecasting, soil exploration, and crop classification. This work fuses optical and radar data of the Sentinel series satellites using a deep learning-based convolutional neural network (CNN). The three-fold work of the image fusion approach is performed in the CNN as a layered architecture covering the image transform in the convolutional layer, followed by the activity level measurement in the max pooling layer. Finally, the decision-making is performed in the fully connected layer. The objective of the work is to show that the proposed deep learning-based CNN fusion approach overcomes some of the difficulties of traditional image fusion approaches. To show the performance of the CNN-based image fusion, a good number of image quality assessment metrics are analyzed. The results demonstrate that the integration of spatial and spectral information is numerically evident in the output image and has high robustness. Finally, the objective assessment results outperform the state-of-the-art fusion methodologies.

  • [3] Merta J., Brandejský T., (CZ)
    Two-layer genetic programming, pp. 215-231

    Abstract: This paper focuses on a two-layer approach to the genetic programming algorithm and the improvement of the training process using ensemble learning. Inspired by the performance leap of deep neural networks, the idea of a multilayered approach to genetic programming is proposed, starting with two-layer genetic programming. The goal of the paper was to design and implement a two-layer genetic programming algorithm and to test its behaviour in the context of symbolic regression on several basic test cases, in order to reveal its potential to improve the learning process of genetic programming and increase the accuracy of the resulting models. The algorithm works in two layers. In the first layer, it searches for appropriate sub-models describing each segment of the data. In the second layer, it searches for the final model as a non-linear combination of these sub-models. In the experiments performed, two-layer genetic programming coupled with ensemble learning techniques showed the potential for improving the performance of genetic programming.


  • [1] Li Q., Qiang Y., Kong D., Liu X., (China)
    A model based on SVM-GDPSO for the voltage stability forecasting of large power system, pp. 131-146

    Abstract: Real-time stability assessment of a large power system is essential after a fault occurs. The paper proposes a new model (SVM-GDPSO) for assessing a large power system. To enhance the SVM, the tangent vector of the power flow Jacobian (PFJ) was taken as the learning target in order to improve precision. In addition, particle swarm optimization (PSO) with Gaussian disturbance (GD) was used to set the key parameters of the SVM, and meta-learning was utilized to reduce the search space of PSO. The experiment on the standard IEEE 118-bus test system demonstrated that this model could reflect the status of a large power system in time. Moreover, the method could locate the fault area and rank the fault level by observing the critical bus. The proposed method has a reliability rate of 97.22 %, which is superior to the back propagation neural network (BPNN) and SVM-GA, and it determines the fault area with a success rate of 96.61 %.

  • [2] Masoudi B., Danishvar S., (Iran, UK)
    Deep multi-modal schizophrenia disorder diagnosis via a GRU-CNN architecture, pp. 147-161

    Abstract: Schizophrenia is a complex mental disorder associated with changes in the function and structure of the brain. Accurate automatic diagnosis of schizophrenia is crucial and still a challenge. In this paper, we propose a method for the automatic diagnosis of schizophrenia based on the fusion of different neuroimaging features and a deep learning architecture. We propose a deep multimodal fusion (DMMF) architecture based on a gated recurrent unit (GRU) network and 2D-3D convolutional neural networks (CNN). The DMMF model combines functional connectivity (FC) measures extracted from functional magnetic resonance imaging (fMRI) data and low-level features obtained from fMRI, magnetic resonance imaging (MRI), or diffusion tensor imaging (DTI) data and creates latent and discriminative feature maps for classification. The fusion of ROI-based FC with fractional anisotropy (FA) derived from DTI images achieved a state-of-the-art diagnosis accuracy of 99.50% and an area under the curve (AUC) of 99.7% on the COBRE dataset. The results are promising for the combination of features. The high accuracy and AUC in our experiments show that the proposed deep learning architecture can extract latent patterns from neuroimaging data and can help to achieve accurate classification of schizophrenia and healthy groups.

  • [3] Ye Y., Jiang S., Wang S., Li H., (China)
    Distant supervision relation extraction based on mutual information and multi-level attention, pp. 161-179

    Abstract: Distant supervision for relation extraction, an effective method to reduce labor costs, has been widely used to search for novel relational facts from text. However, distant supervision always suffers from incorrect labelling problems. Meanwhile, existing methods for noise reduction often ignore the commonalities among the instances. To alleviate this issue, we propose a distant supervision relation extraction model based on mutual information and multi-level attention. In our proposed method, we calculate mutual information based on the attention mechanism. Mutual information is used to build attention at both word and sentence levels, which is expected to dynamically reduce the influence of noisy instances. Extensive experiments using a benchmark dataset have validated the effectiveness of our proposed method.


  • [1] Likhonina R., Uglickich E. (CZ)
    Hand detection application based on QRD RLS lattice algorithm and its implementation on Xilinx Zynq Ultrascale+, pp. 73-92

    Abstract: The present paper describes a hand detection application implemented on a Xilinx Zynq Ultrascale+ device, comprising a multi-core ARM Cortex A53 processor and FPGA programmable logic. It uses ultrasound data and is based on an adaptive QRD RLS lattice algorithm extended with hypothesis testing. The algorithm chooses between two use-cases: (1) “there is a hand in front of the device” vs (2) “there is no hand in front of the device”. For these purposes a new structure of the identification models was designed. The model representing use-case (1) is a regression model whose order is sufficient to cover all incoming data. The model responsible for use-case (2) is a regression model that has a smaller order than model (1) and a certain time delay, covering the maximal distance where the hand can possibly appear. The proposed concept was successfully verified using real ultrasound data in MATLAB optimized for parallel processing and implemented in parallel on four cores of the ARM Cortex A53 processor. It was shown that the computational time of the algorithm is sufficient for applications requiring real-time processing.

  • [2] Ji K.K., Li Z.Z., Chen J., Liu K.L., Wang G.Y. (China)
    Freeway accident duration prediction based on social network information, pp. 93-112

    Abstract: Accident duration prediction is the basis of freeway emergency management, and timely and accurate accident duration prediction can provide a reliable basis for road traffic diversion and rescue agencies. This study proposes a method for predicting the duration of freeway accidents based on social network information by collecting Weibo data of freeway accidents in Sichuan province and using the advantage that human language can convey multi-dimensional information. Firstly, text features are extracted through a TF-IDF model to represent the accident text data quantitatively; secondly, the variability between text data is exploited to construct an ordered text clustering model to obtain clustering intervals containing temporal attributes, thus converting the ordered regression problem into an ordered classification problem; finally, two nonparametric machine learning methods, namely support vector machine (SVM) and k-nearest neighbour method (KNN), are used to construct an accident duration prediction model. The results show that when the ordered text clustering model divides the text dataset into four classes, both the SVM model and the KNN model show better prediction results, and their average absolute error values are less than 22 %, which is much better than the prediction results of the regression prediction model under the same method.

  • [3] Qing D., Li J., Deng Q., Liu S. (China)
    Mining and quantifying the optimal DBH range of loblolly pine with improved particle algorithm, pp. 113-130

    Abstract: In order to fully understand the objective law of height and DBH (diameter at breast height) growth of loblolly pine trees and to explore the best DBH range for loblolly pine tree height growth, 13 340 loblolly pines with initial DBH between 1 and 7 inches were selected from Alabama as research objects, and statistics on their growth from 2000 to 2015 were collected. Because particle swarm optimization (PSO) is suitable for solving non-linear problems, finding the optimal DBH of loblolly pine is cast as a PSO optimization problem, which quantifies the optimal DBH range of loblolly pine at different scales by a mapping strategy. The experimental results show that the DBH range suitable for height growth of the pine tree is concentrated between 3.7 and 7.3 inches. The height of the pine tree begins to enter a period of rapid growth from a DBH of 3.9 inches (±0.2 inch). The tree height growth rate reached a maximum at a DBH of 6.4 inches (±0.6 inch), and the tree height entered a slow growth period after a DBH of 11.92 inches (±0.3 inch). In general, when the DBH exceeds 15.26 inches (±0.3 inch), the height of the pine tree stops growing.
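
Entry [3] above casts the search for an optimal value as a particle swarm optimization (PSO) problem. The sketch below shows plain one-dimensional PSO only; it is not the authors' improved variant or their mapping strategy. The function name `pso_minimize`, the coefficient values, the fixed seed, and the toy quadratic objective are assumptions chosen for illustration and do not use any data from the paper.

```python
import random


def pso_minimize(objective, lower, upper, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize a 1-D objective on [lower, upper] with basic PSO."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    pos = [rng.uniform(lower, upper) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]                          # personal best positions
    best_val = [objective(x) for x in pos]     # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # Velocity: inertia + pull toward personal best + pull toward global best.
            vel[i] = (inertia * vel[i]
                      + c1 * r1 * (best_pos[i] - pos[i])
                      + c2 * r2 * (gbest_pos - pos[i]))
            # Move and clamp to the search interval.
            pos[i] = min(upper, max(lower, pos[i] + vel[i]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i], val
    return gbest_pos, gbest_val


# Toy objective (hypothetical): a quadratic with its minimum at x = 5.
best_x, best_f = pso_minimize(lambda x: (x - 5.0) ** 2, lower=1.0, upper=10.0)
```

To maximize a quantity such as a growth rate, the same routine can be applied to the negated objective.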


  • [1] Abeska Y.Y., Cavas L. (Turkey)
    Artificial neural network modelling of green synthesis of silver nanoparticles by honey, pp. 1-14

    Abstract: Nanomaterials draw attention because of their unique physical, chemical and biological properties in areas such as catalysis, electronics, optics, medicine, solar energy conversion and water treatment. Green synthesis of silver nanoparticles has many advantages over physical and chemical methods, such as low cost, non-toxicity and environmental friendliness. In this paper, experimental conditions related to green synthesis of silver nanoparticles by honey were modelled using an artificial neural network (ANN). Agitation time, agitation rate, pH, temperature, honey concentration and AgNO3 concentration were selected as input parameters, and production of silver nanoparticles was used as the output parameter. According to the results, the optimum number of hidden neurons was found to be 40 with the Levenberg–Marquardt back-propagation algorithm. Under these conditions, the percentages of training, validation and testing were 75, 20 and 5, respectively. After the neural network was created, a separate input data set was applied, and the experimental and ANN-predicted data were compared. In conclusion, ANN can be an alternative and robust modelling approach that could help researchers in this field to estimate production of silver nanoparticles.

  • [2] Jozová Š., Uglickich E., Nagy I., Likhonina R. (CZ)
    Modeling of discrete questionnaire data with dimension reduction, pp. 15-41

    Abstract: The paper deals with the task of modeling discrete questionnaire data with a reduced dimension of the model. The discrete model dimension is reduced using the construction of local models based on independent binomial mixtures estimated with the help of recursive Bayesian algorithms in combination with the naive Bayes technique. The main contribution of the paper is a three-phase algorithm for discrete model dimension reduction, which allows modeling high-dimensional questionnaire data with a high number of explanatory variables and their possible realizations. The proposed general solution is applied to traffic accident questionnaire analysis, where it takes the form of the classification of the accident circumstances and prediction of the traffic accident severity using the currently measured discrete data. Results of testing the obtained model on real data and a comparison with theoretical counterparts are demonstrated.

  • [3] Daqrouq K., Alkhateeb A., Ahmad W., Khalaf E., Awad M., Noeth E., Alharbey R.A., Rushdi A.M. (Pakistan, Saudi Arabia)
    A universal ECG signal classification system using the wavelet transform, pp. 43-54

    Abstract: The electrocardiograph (ECG) is one of the most successful medical diagnostic tools. The ECG can show, roughly speaking, all types of heart disorders that appear as ECG signal arrhythmias or problems with the rate or rhythm of the human heartbeat. In this paper, a universal ECG signal arrhythmia classification system is proposed. The proposed system is based on using the wavelet transform in two of its known forms, namely, the discrete wavelet transform (DWT) and the wavelet packet transform (WPT), or a combination thereof. The purpose of the research reported herein is to find a universal classification system, in the sense of providing a capability for simultaneous classification of all types of known heart arrhythmias. Three algorithms based on the wavelet transform are tested for different wavelet levels, wavelet functions, training and testing ratios, and elapsed times. We rank these algorithms according to the elapsed times needed for their processing over the whole loop of the eight different arrhythmia classes. This ranking nominates the WPT-based algorithm as the best of the competing methods. A different ranking according to successful recognition rates assigns priority instead to the method combining the WPT and the DWT.

  • [4] Qiao F.J., Li B., Gao M.Q., Li J.J. (China)
    ECG signal classification based on adaptive multi-channel weighted neural network, pp. 55-72

    Abstract: The intelligent diagnosis of cardiovascular diseases is a topic of great interest. Many electrocardiogram (ECG) recognition technologies have emerged, but most of them have low recognition accuracy and poor clinical application. To improve the accuracy of ECG classification, this paper proposes a multi-channel neural network framework. Concretely, a multi-channel feature extractor is constructed by using four types of filters, which are weighted according to their importance, as measured by kurtosis. A bidirectional long short-term memory (BLSTM) network structure based on attention mechanism is constructed, and the extracted features are taken as the input of the network, and the algorithm is optimized by attention mechanism. An experiment conducted on the MIT-BIH arrhythmia database shows that the proposed algorithm obtains excellent results, with 99.20 % specificity, 99.87 % sensitivity, and 99.89 % accuracy. Therefore, the algorithm is practical and effective in the clinical diagnosis of cardiovascular diseases.
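
Entry [4] above weights filter channels "according to their importance, as measured by kurtosis". One simple way to read that idea is sketched below: compute the kurtosis of each channel's output and normalize the scores into weights. The normalization by the sum of scores, the function names, and the toy signals are assumptions; the paper may define the weighting differently.

```python
def kurtosis(signal):
    """Population kurtosis m4 / m2**2 (equals 3 for a Gaussian).

    Assumes the signal is not constant, otherwise m2 is zero.
    """
    n = len(signal)
    mean = sum(signal) / n
    m2 = sum((x - mean) ** 2 for x in signal) / n
    m4 = sum((x - mean) ** 4 for x in signal) / n
    return m4 / (m2 ** 2)


def kurtosis_weights(channels):
    """Turn per-channel kurtosis scores into weights that sum to 1."""
    scores = [kurtosis(ch) for ch in channels]
    total = sum(scores)
    return [s / total for s in scores]


# Toy channels (hypothetical): one with a sharp peak, one without.
spiky = [0, 0, 0, 0, 10, 0, 0, 0, 0, 0]
flat = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1]
weights = kurtosis_weights([spiky, flat])
```

Kurtosis is large for signals dominated by a few sharp peaks, so under this reading a channel that isolates sharp, impulse-like features receives a larger weight than one with an evenly spread output.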