The Involvement of Local Binary Pattern to Improve the Accuracy of Multi Support Vector-Based Javanese Handwriting Character Recognition

Indonesia is a country rich in cultural diversity, and the Javanese language is one example. Javanese is traditionally written with non-Latin letters known as the Javanese script. With advances in technology, however, the Javanese language is increasingly being forgotten. In the past, the Javanese script was taught as a school subject so that Indonesian students would continue to learn it. In this study, recognition of the Javanese script starts with preprocessing, which converts the RGB image of the script to a grayscale image. Feature extraction is then performed; the features used are texture features extracted with the Local Binary Pattern (LBP) algorithm. The extracted information is used as input to a Multi Support Vector Machine (Multi SVM) classifier that predicts the Javanese script character in the image. Combining LBP with Multi SVM yields an accuracy of 90% in recognizing Javanese script, better than the 80% obtained with Multi SVM alone.


INTRODUCTION
Indonesia is a country rich in cultural diversity, and the Javanese language is one example. Javanese is traditionally written with non-Latin letters known as the Javanese script. With advances in technology, however, the Javanese language is increasingly being forgotten. In the past, the Javanese script was taught as a school subject so that Indonesian students would continue to learn it [1], [2].
Recognition of the Javanese script starts with preprocessing, which converts the RGB image of the script to a grayscale image. Feature extraction is then performed; the features used are texture features extracted with the Local Binary Pattern (LBP) algorithm [3], [4]. The extracted information is used as input to the Multi SVM classifier to predict Javanese script images. In this study, combining the LBP method with the Multi SVM classification algorithm produces a fairly high accuracy of 90%, so it can be used as a system for recognizing Javanese script [5].
Javanese letters, generally called the Javanese script, are difficult to learn because many letters share similar curves and circles. This difficulty reduces the interest of young people in learning Javanese letters [6], [7]. In addition, there are not many applications for recognizing Javanese letters.
Image recognition can be performed through classification, either in real time (directly) or indirectly. Image classification is usually done with one of two learning techniques, namely unsupervised learning and supervised learning. Among supervised learning techniques, the algorithms most commonly used by researchers include the Support Vector Machine (SVM) [8]- [10], Neural Network (NN) [11]- [17], K-Nearest Neighbor (KNN) [18]- [20], and others. Supervised learning algorithms are typically used when researchers already have clearly labeled training data and variables for classifying the data. The research objects are images, which can be static or dynamic. Static images are obtained directly from available media such as books or the internet and can be used as they are, which makes detection easier. Dynamic images, by contrast, are drawn or written by humans on paper or other media, so each image differs from the next and will not always match a picture in a book or on the internet.
There have been many studies on handwriting recognition based on various methods. Susanto et al. [5], in their research on A High Performance of Local Binary Patterns on Classify Javanese Character Classification, used the Local Binary Pattern (LBP) algorithm for feature extraction and KNN as the classification algorithm, producing an accuracy of 82.5% with the parameters K = 3 and LBP [64 64]. Another study [21] recognized Javanese script characters using a Convolutional Neural Network as the classification algorithm on Javanese script images, resulting in an accuracy of 95.04%. A further study [22] recognized Javanese script using Area Based Feature Extraction and Support Vector Machine methods, resulting in an accuracy of 90.84%. These previous studies inspired the creation of a Javanese script recognition system in this work. This study proposes the LBP and Multi SVM methods. SVM was chosen because it achieves a fairly good level of accuracy. The LBP method is also considered more effective for recognizing the texture of Javanese script.
The LBP algorithm was chosen for feature extraction in this study because LBP was originally developed for texture analysis. This study also needs to analyze the written form of the Javanese script, so LBP is used to let the computer capture that form, and Multi SVM is used to classify it.

RESEARCH METHOD
Overall, this research analyzes and identifies patterns of Javanese script and recognizes whether a Javanese script character is correct or not. Features of the Javanese script are extracted with the LBP algorithm, and the Multi SVM algorithm classifies them so that the accuracy of the system in recognizing the script can be determined. The input data is first captured with a cellphone camera; the RGB image is then pre-processed by converting it into a grayscale image. The grayscale image is then converted into a binary image, after which feature extraction is carried out using the LBP algorithm. After feature extraction, classification is carried out with the Multi SVM algorithm to recognize the Javanese script letters. The hypothesis is that Javanese script writing can be recognized based on the pattern or shape of the curves of the Javanese script.
Javanese characters are recognized by combining the LBP and Multi SVM algorithms, with LBP used for feature extraction and Multi SVM used for prediction or classification of the Javanese script. The combined process begins with taking pictures using a cellphone. The RGB image from the photo is then converted to grayscale, after which the image is resized to 256x256 pixels. Feature extraction is then carried out using LBP, followed by classification using the Multi SVM algorithm. The flow of the research process is shown in Figure 1.

Data
The data consists of images of handwritten Javanese script covering 20 Javanese script letters, photographed with a mobile phone camera. Data collection was carried out by writing 800 Javanese script letters directly by hand: 600 were used as training data, with each letter group consisting of 30 images, and 200 were used as testing data. All 800 letters were handwritten with a black marker on A4 size white HVS paper. Each letter was then photographed with a Samsung A51 cellphone camera under the best available lighting. After that, each picture was grouped by letter and given a name. Table 1 is an example of naming the data for training. This image acquisition step produces the image dataset used in the research.

RGB Conversion
After the dataset is obtained, the next step is preprocessing, which aims to improve digital image quality so as to achieve the best results. The captured images initially have different sizes, so to simplify processing and maximize accuracy all images are resized to 256x256 pixels. Digital images can be resized with the MATLAB imresize() function.
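The preprocessing described above can be sketched as follows. This is a minimal illustration in plain Python, not the MATLAB code used in the study. The function names rgb_to_gray and resize_nearest are hypothetical, the grayscale weights are the ones documented for MATLAB's rgb2gray, and nearest-neighbour sampling is an assumption made for brevity (imresize uses bicubic interpolation by default).

```python
def rgb_to_gray(rgb):
    """Convert an RGB image (rows of (r, g, b) tuples) to grayscale.

    Uses the luma weights documented for MATLAB's rgb2gray; the paper
    does not state which conversion was actually applied.
    """
    return [[round(0.2989 * r + 0.5870 * g + 0.1140 * b) for (r, g, b) in row]
            for row in rgb]


def resize_nearest(img, out_h, out_w):
    """Resize a 2-D image with nearest-neighbour sampling (an assumption)."""
    in_h, in_w = len(img), len(img[0])
    return [[img[(y * in_h) // out_h][(x * in_w) // out_w]
             for x in range(out_w)]
            for y in range(out_h)]


# Tiny 2x2 stand-in for a photographed character.
photo = [[(255, 255, 255), (0, 0, 0)],
         [(255, 0, 0), (0, 0, 255)]]
gray = rgb_to_gray(photo)
resized = resize_nearest(gray, 256, 256)
```

Every image leaves this step with the same 256x256 shape, which is what lets the later feature vectors have a fixed length.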

Local Binary Pattern (LBP)
Local Binary Pattern is a widely used method for texture analysis, first introduced by [7]. LBP uses 8 neighboring pixels and 1 central pixel, whose value serves as a threshold. The value of each neighboring pixel is compared with the central pixel value: if the central value is greater than the neighboring pixel value, the neighboring pixel is assigned 0, otherwise it is assigned 1. Combining the resulting neighbor values from left to right gives a binary number, which can be converted to decimal; this value is known as the Local Binary Pattern (LBP), as shown in Figure 3.
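The thresholding rule above can be illustrated with a short Python sketch. The clockwise ordering starting at the top-left neighbor is an assumption, since the text only says the bits are combined from left to right; any fixed order works if it is used consistently. The name lbp_code and the example patch are hypothetical.

```python
# Clockwise neighbour offsets (dy, dx), starting at the top-left pixel.
# The starting point and direction are assumptions.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, y, x):
    """Return (decimal code, bit string) for the interior pixel at (y, x)."""
    center = img[y][x]
    bits = ""
    for dy, dx in OFFSETS:
        # A neighbour gets 0 when the centre is greater, otherwise 1.
        bits += "0" if center > img[y + dy][x + dx] else "1"
    return int(bits, 2), bits


patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code, bits = lbp_code(patch, 1, 1)
```

For this patch the center value is 6, the bit string is 10001111, and the decimal code is 143.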
The basic LBP has a limitation: a 3x3 pixel neighborhood cannot capture large-scale texture features. To deal with large-scale textures, LBP is therefore extended to larger local neighborhoods. The first step is to define a texture T in a local neighborhood of P + 1 (P > 0) pixels of a grayscale image, as in (1), where $g_c$ is the gray value of the center pixel of the local neighborhood and $g_p$ ($p = 0, \ldots, P-1$) are the gray values of P equally spaced pixels on a circle of radius R (R > 0), which form a circularly symmetric neighborhood. The notation (P, R) denotes P neighboring pixels on a circle of radius R. Neighbor values that do not fall exactly on a pixel are calculated using bilinear interpolation [23].

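$$T = t(g_c, g_0, \ldots, g_{P-1}) \tag{1}$$

$$LBP_{P,R} = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p \tag{2}$$

where $g_p$ is the gray value of the neighboring pixel, $g_c$ is the gray value of the center pixel, P is the number of pixels in the circular neighborhood of radius R, and the function s(x) is defined as in (3).

$$s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases} \tag{3}$$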
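As a worked illustration of the circular (P, R) operator with bilinear interpolation, the following Python sketch evaluates the sum of s(g_p - g_c) * 2^p for one pixel. It is a minimal sketch under stated assumptions: neighbor 0 is placed to the right of the center and the remaining neighbors proceed counter-clockwise, the pixel must lie at least R pixels from the image border, and the names bilinear and lbp_pr are hypothetical.

```python
import math


def bilinear(img, y, x):
    """Sample img at a fractional (y, x) position by bilinear interpolation."""
    y0, x0 = math.floor(y), math.floor(x)
    y1, x1 = min(y0 + 1, len(img) - 1), min(x0 + 1, len(img[0]) - 1)
    fy, fx = y - y0, x - x0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def lbp_pr(img, yc, xc, P=8, R=1.0):
    """LBP code of (yc, xc) using P neighbours on a circle of radius R."""
    gc = img[yc][xc]
    code = 0
    for p in range(P):
        angle = 2.0 * math.pi * p / P
        # Round the offsets so that axis-aligned neighbours land exactly
        # on pixel centres despite floating-point error in sin and cos.
        dy = round(-R * math.sin(angle), 9)
        dx = round(R * math.cos(angle), 9)
        gp = bilinear(img, yc + dy, xc + dx)
        if gp - gc >= 0:  # s(x) = 1 when x >= 0, otherwise 0
            code += 2 ** p
    return code


img = [[0, 0, 0],
       [0, 5, 9],
       [0, 0, 0]]
code = lbp_pr(img, 1, 1)  # only the right-hand neighbour reaches the centre value
```

Only neighbor 0 (value 9) is at least as large as the center (value 5), so only the lowest bit is set.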
Based on the definition above, LBP is invariant to grayscale transformations of the image that preserve the intensity order of the neighboring pixels. The histogram of LBP values computed over an area can be used as a texture descriptor. The LBP operator (P, R) produces $2^P$ different output values, corresponding to the $2^P$ binary patterns that can be formed by the P pixels in the neighbor set. Some of these binary patterns contain more information than others, so a subset of the $2^P$ binary patterns can be used to describe the texture of an image. These are the uniform patterns, denoted $LBP_{P,R}^{u2}$. A local binary pattern is called uniform if its circular binary representation contains at most 2 bitwise transitions from 0 to 1 or vice versa. For example, 00000000 (0 transitions) and 00110000 (2 transitions) are uniform, whereas 10001011 (4 transitions) and 10101101 (6 transitions) are not. According to the research of Ojala et al., uniform patterns account for about 90% of all patterns in the (8,1) neighborhood and about 70% in the (16,2) neighborhood of image textures.
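The transition-counting rule and the use of an LBP histogram as a texture descriptor can be sketched in Python as follows. This is an illustration only: giving each uniform pattern its own bin and pooling all non-uniform patterns into one extra bin is a common convention assumed here, not something the paper states, and the names transitions, is_uniform, and cell_histograms are hypothetical.

```python
def transitions(code, P=8):
    """Count circular 0/1 transitions in the P-bit pattern of `code`."""
    bits = [(code >> i) & 1 for i in range(P)]
    return sum(bits[i] != bits[(i + 1) % P] for i in range(P))


def is_uniform(code, P=8):
    """A pattern is uniform when it has at most two circular transitions."""
    return transitions(code, P) <= 2


# One bin per uniform 8-bit pattern, plus one shared bin for the rest
# (assumed convention).
UNIFORM_CODES = [c for c in range(256) if is_uniform(c)]
BIN_OF = {c: i for i, c in enumerate(UNIFORM_CODES)}
N_BINS = len(UNIFORM_CODES) + 1


def cell_histograms(codes, cell):
    """Split a 2-D array of 8-bit LBP codes into cell x cell blocks and
    return the per-block uniform-pattern histograms, concatenated."""
    feats = []
    for y0 in range(0, len(codes) - cell + 1, cell):
        for x0 in range(0, len(codes[0]) - cell + 1, cell):
            hist = [0] * N_BINS
            for y in range(y0, y0 + cell):
                for x in range(x0, x0 + cell):
                    hist[BIN_OF.get(codes[y][x], N_BINS - 1)] += 1
            feats.extend(hist)
    return feats


codes = [[0] * 4 for _ in range(4)]       # toy 4x4 code image
features = cell_histograms(codes, cell=2)  # 4 cells, N_BINS values each
```

For P = 8 there are 58 uniform patterns, so each cell contributes 59 values under this binning. The feature vector grows as the cell size shrinks, because a smaller cell size splits the same image into more cells.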

Support Vector Machine (SVM)
SVM is a learning system that uses linear functions as its hypothesis space to predict or classify. SVM works in a high-dimensional feature space and is trained with a learning algorithm based on optimization theory, implementing ideas from statistical learning theory [7], [9], [22]. SVM is a classification method that is currently widely used and developed in many studies. SVM originated in statistical learning theory and promises better classification or prediction results than other classification methods. SVM works very well on high-dimensional data sets. SVM uses a kernel technique, which maps the original data from its original dimension to another, relatively higher dimension. In the Artificial Neural Network (ANN) method, all training data is studied thoroughly during the training process. By contrast, the decision boundary of an SVM is determined only by a subset of the training data, the support vectors.
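A binary SVM separates two classes, so recognizing 20 letters requires combining several binary classifiers. The paper does not say which multi-class scheme its Multi SVM uses; the Python sketch below assumes a one-vs-rest scheme with linear decision functions, and the weights, labels, and function names are hypothetical values chosen for illustration rather than trained parameters.

```python
def decision_value(w, b, x):
    """Linear SVM decision function f(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict_one_vs_rest(models, x):
    """Return the label whose binary SVM gives the largest decision value.

    `models` maps each class label to a (w, b) pair taken from an already
    trained binary "this class vs. all others" SVM.
    """
    return max(models, key=lambda label: decision_value(*models[label], x))


# Hypothetical weights for three letters and a 2-D feature vector.
models = {
    "ha": ([1.0, 0.0], -0.5),
    "na": ([0.0, 1.0], -0.5),
    "ca": ([-1.0, -1.0], 0.5),
}
predicted = predict_one_vs_rest(models, [1.0, 0.0])
```

In practice the feature vector would be the LBP histogram of an image rather than two numbers, and a kernel function would replace the dot product when a non-linear boundary is needed.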

Testing Method
In this study, we evaluate with a confusion matrix, because it assesses the predictions of the Multi SVM algorithm in more detail through the quantities True Positive (TP), True Negative (TN), False Negative (FN), and False Positive (FP), as shown in (4) and Table 2. This yields the accuracy of the method, which in turn gives the percentage of handwriting patterns that are recognized correctly.

To obtain predictions from Multi SVM based on the prepared training data, each test image is first preprocessed and passed through LBP extraction, producing the feature values used as input parameters for testing the Multi SVM algorithm. After all the training data has been processed and stored in the Multi SVM training data variable, the next step is testing with the prepared test data. All test data goes through the same process, from preprocessing to LBP extraction, and the digital images are then recognized or predicted using custom functions. The result of multiSvm() is stored in a variable named out, which is used as a parameter to predict the Javanese script image. When the find() function is called with the out parameter and the testing data labels, it displays the predictions of the Multi SVM algorithm.

In Table 3, the highest accuracy was obtained with Cellsize 64, at 90%, while the lowest accuracy occurred with Cellsize 32, at 77%. These results show that combining LBP (Local Binary Pattern) feature extraction with Multi SVM classification trains the system to recognize and predict the types of Javanese script well. Starting from the initial character images, which have different sizes, each image is resized to a uniform 256x256, then the RGB image is converted into a grayscale image and then into a binary image. The binary images are then processed with the LBP algorithm using cell sizes of 32, 64, and 128 to extract the information in each image, and the resulting values are used as input parameters for Multi SVM classification. In testing the LBP and Multi SVM methods on a dataset of 800 images (600 for training and 200 for testing) covering 20 different characters, Cellsize 64 produces the highest accuracy, 90%, compared with the other cell sizes.
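The accuracy calculation can be illustrated with a small Python sketch. Equation (4) is not reproduced in this text, so the standard definition, accuracy = (TP + TN) / (TP + TN + FP + FN), is assumed; for a multi-class confusion matrix this reduces to the sum of the diagonal divided by the total number of test samples. The function names and the toy labels are hypothetical.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows index the true class, columns index the predicted class."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Toy example with three letters and five test samples.
classes = ["ha", "na", "ca"]
truth = ["ha", "ha", "na", "na", "ca"]
preds = ["ha", "na", "na", "na", "ca"]
cm = confusion_matrix(truth, preds, classes)
acc = accuracy(cm)
```

With 200 test images, the reported accuracies of 90%, 88.5%, and 77% correspond to 180, 177, and 154 correctly classified images respectively.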

CONCLUSION
LBP with Cellsize 64 produces an accuracy of 90%, Cellsize 128 an accuracy of 88.5%, and Cellsize 32 an accuracy of 77%. The combination of the LBP algorithm with Multi SVM as the classification algorithm produces a fairly high accuracy of 90%, so it can be used as a system for recognizing Javanese script. To improve and refine these results, further research could combine other algorithms to maximize accuracy. Future research could also try additional thresholding or edge-detection algorithms, such as Canny, to increase accuracy. Our algorithm still has shortcomings and its results are not yet optimal, so if this research is developed further, researchers may need to try several kernels in SVM and Multi SVM and analyze how the results depend on the preprocessing.

ACKNOWLEDGEMENT
We would like to thank Dian Nuswantoro University for providing a research grant in 2022-2023 to develop a deep understanding of machine learning, especially in Javanese character recognition.

Figure 3. LBP conversion to binary value

Table 3. Comparison of accuracy using MultiSVM and MultiSVM-LBP