Predicting News Article Popularity with Multi Layer Perceptron Algorithm

News media is now largely digitized, and printed news has increasingly moved online. The growing use of social media has increased public interest in reading news online, where articles must attract readers with their headlines. Online news businesses want to anticipate reader demand and to know whether a published article will reach enough readers to become popular. This paper therefore analyzes the performance of a Neural Network algorithm and other artificial intelligence techniques in predicting the popularity of news articles, so that media outlets can estimate whether their news will become popular. A popularity prediction system can also help an outlet increase its revenue when articles carry advertisements. The test results show that the Multi Layer Perceptron reaches an accuracy of 76%, while Random Forest reaches 70%.


INTRODUCTION
In the current era of information and technology, news media has largely been digitized [1]. One example is print news, which has now moved online. Reading, writing, and sharing information have become part of everyday life and entertainment [2], [3]. The emergence of online news has made people eager to discuss publicly available information. This is supported by the growth of social media such as YouTube, Instagram, Twitter, and Facebook, which has increased people's interest in reading online news. The advancement of social media has also widened the distribution of online news. As a result, information in online news flows quickly, and news becomes more dynamic and cheaper to produce but has a relatively short life span [4].
The fast flow of online news creates a new challenge for writers, who must keep innovating and presenting news that is always up to date, while the number of online news readers keeps growing. News needs to attract readers with its headlines or titles and to anticipate reader preferences. On the other hand, readers anticipate the content of an article from its headline, since the headline is expected to match that content [5]. Online news businesses want to know the future demand of readers, as well as whether published news can reach more readers. If they can find this out, they will be better prepared to make quick decisions about publishing news on their online platform [6]. Therefore, given the increasing interest in online news, this paper analyzes the performance of a Neural Network algorithm and other techniques in predicting news article popularity, which can help the media know whether their news will become popular. A news popularity prediction system can also increase media income when the news carries advertisements. Prediction systems of this kind are used in applications such as media advertising, traffic management, and economic trend forecasting. This research uses an online news data set from Tribun News Articles, which contains almost 1000 news titles, news article descriptions, publish times, publish dates, and view counts from February 01, 2021, to April 08, 2021. The data is processed by a model that classifies articles to predict their popularity.
Artificial Neural Networks are models that can reach a high level of accuracy on tasks such as classification and prediction. Such a model can be viewed as a multi-layer network of logistic regression units. Because it has more layers and a more complex structure, this research assumes that a neural network is a stronger predictor than one-layer parametric logistic regression. Artificial Neural Networks have been widely used for predictive modeling because they are good at analyzing patterns in data. One Artificial Neural Network that is often used as a predictive model is the Multi Layer Perceptron.
The algorithm can help news media decide whether an article is worth publishing, based on a prediction of its popularity. The media can run this classification before the news reaches the public, so that they can provide interesting content and headlines to achieve popularity.

Research Position
Previous research by Priyanka Rathord (2019), titled A Comprehensive Review on Online News Popularity Prediction using Machine Learning Approach, presented a comparative analysis of several popularity prediction methods, namely Random Forest, SVM, AdaBoost, KNN, Naive Bayes, Linear Regression, Logistic Regression, and Genetic Algorithm [6], [7]. That work reports the accuracy of each algorithm in predicting news popularity, with Random Forest achieving the highest accuracy. The present research extends it by using a Neural Network algorithm and comparing the result with that earlier best algorithm.
Jalal Rezaeenour (2018), in a journal article entitled Developing a New Hybrid Intelligent Approach for Prediction Online News Popularity, studied popular news prediction using the ELM (Extreme Learning Machine) neural network algorithm [4]. That research shows that the most important predictors of popularity are the publishing time (the number of visitors is higher on weekends) and the news topic (lifestyle and social media are the most popular topics on the site).
Feras Namous (2018), in research entitled Online News Popularity Prediction, studied popular news prediction using data sets from the Mashable news website and compared algorithms for classification and prediction. The algorithms with the highest accuracy for popular news prediction were Random Forest and the Multi Layer Perceptron neural network.
Based on this prior research, the methods with the best accuracy are Multi Layer Perceptron and Random Forest, so this paper compares the accuracy of Multi Layer Perceptron with that of Random Forest.

Dataset
The data used in this research are articles from Tribun News, available at https://www.kaggle.com/waseemakramkhan/the-tribune-news-articles. This data set contains almost 1000 news titles, news article descriptions, publish times, publish dates, view counts, and popularity labels from February 01, 2021, to April 08, 2021, collected from the Tribune newspapers. The Tribun News Article data set has a column that defines popularity, named is_popularity. This research predicts popularity from the headlines (titles) of the news articles. The data set is reasonably balanced: the two classes do not differ greatly in size, with 432 popular and 589 non-popular news items.
A second data set is taken from https://www.kaggle.com/datasets/szymonjanowski/internet-articles-data-with-usersengagement?resource=download. This data set has a top_article attribute that indicates whether an article is popular. Of its 10436 records, only 3853 were taken for this research so that the overall data set would be more balanced across both popularity classes. Because the top_article classes are unbalanced, the researchers took only a subset and combined it with the first data set. The combined data set is balanced.
The total data set is split into three sets: training, validation, and testing. Following the scheme in "Diagnosis Using Computer Tomography (CT) Images", this study uses train, testing, and validation split ratios of 80:10:10, 60:20:20, 60:25:15, 60:30:10, and 50:30:20.
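The partitioning step can be illustrated with a minimal sketch. It assumes scikit-learn's `train_test_split` applied twice with stratification on the popularity label. The helper name `split_three_way`, the toy titles, and the random seed are hypothetical, and the paper does not state how its own split was implemented. The sketch shows the 80:10:10 ratio; the other ratios are obtained by changing the arguments.

```python
from sklearn.model_selection import train_test_split


def split_three_way(titles, labels, train=0.8, val=0.1, test=0.1, seed=42):
    """Split data into train/validation/test sets with the given ratios."""
    if abs(train + val + test - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    # First cut: separate the training portion, keeping class proportions.
    x_train, x_rest, y_train, y_rest = train_test_split(
        titles, labels, train_size=train, stratify=labels, random_state=seed
    )
    # Second cut: divide the remainder between validation and test.
    x_val, x_test, y_val, y_test = train_test_split(
        x_rest, y_rest, train_size=val / (val + test),
        stratify=y_rest, random_state=seed,
    )
    return (x_train, y_train), (x_val, y_val), (x_test, y_test)


# Hypothetical toy data: 100 titles with alternating popularity labels.
titles = [f"headline {i}" for i in range(100)]
labels = [i % 2 for i in range(100)]

train_set, val_set, test_set = split_three_way(titles, labels)
print(len(train_set[0]), len(val_set[0]), len(test_set[0]))
```

Stratifying both cuts keeps the popular and not popular classes in roughly the same proportion in every subset, which matters when the classes are only approximately balanced.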

System Design
The main design of the Predicting News Article Popularity with Multi Layer Perceptron Algorithm research is as follows. This paper uses the text mining method [9], [10] to preprocess the data, which is divided into training, validation, and testing sets. In preprocessing, the data goes through case folding to convert letters to lowercase, followed by tokenizing, stop word removal, and lemmatization. The training data is then used in the prediction process with the Multi Layer Perceptron and Random Forest methods, with news article titles as input. The first method is MLP, which in this example has 3 kinds of layers: an input layer, hidden layers, and an output layer. After the model is saved, the next step is the validation process.
After the MLP has been trained, the same data is used to train the Random Forest algorithm, and the resulting model is also saved. The preprocessing stage for this method is the same as described above. The difference lies in the classification process of news popularity.
Validation is applied with 5-fold cross-validation: the data is split into K sections or folds (at this stage K = 5), and each fold is used as the testing set in turn. This step outputs the prediction and accuracy for each of the K folds. The testing step then runs the saved model on the testing data. At the testing stage, the system uses a confusion matrix to find the percentage of classification accuracy.
The two models, Random Forest and Multi Layer Perceptron, are then compared to determine which has the better accuracy for news popularity prediction. The prior expectation is that MLP has better accuracy and performance. However, the outcome is determined by the accuracy obtained at the testing stage.
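The validation and comparison steps described above can be sketched as follows. This is an illustrative sketch only: it assumes scikit-learn's `cross_val_score`, `MLPClassifier`, and `RandomForestClassifier`, and it uses synthetic features from `make_classification` as a stand-in for the TF-IDF vectors of real headlines. The hyperparameters shown (hidden layer size, number of trees, seeds) are assumptions and are not taken from the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for headline features and popularity labels (assumption).
X, y = make_classification(n_samples=300, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "MLP": MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

results = {}
for name, model in models.items():
    # 5-fold cross-validation: each fold serves once as the held-out part.
    fold_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")
    # Final fit on the full training set, then evaluation on the test set.
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "fold_scores": fold_scores,
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        "test_accuracy": accuracy_score(y_test, y_pred),
    }
    print(name, "mean CV accuracy:", round(fold_scores.mean(), 3),
          "test accuracy:", round(results[name]["test_accuracy"], 3))
    print(results[name]["confusion_matrix"])
```

Both models see exactly the same folds and the same test set, so any difference in accuracy reflects the models rather than the data partition.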
The system takes one parameter, a news article title, and the algorithm predicts its popularity. The entire process is built in the Python programming language using the Scikit-learn, NLTK, NumPy, Pandas, and Regular Expression libraries, and is integrated into a website through a Flask RESTful API so that users can use the system more easily.
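A minimal sketch of this single-parameter interface is given below. It assumes a scikit-learn pipeline that chains `TfidfVectorizer` and `MLPClassifier`. The training titles, labels, and the function name `predict_popularity` are hypothetical, and the Flask layer of the actual system is omitted because its routes are not described here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical toy training data: 1 = popular, 0 = not popular.
train_titles = [
    "government announces new fuel price policy",
    "national team wins dramatic cup final",
    "celebrity wedding draws huge crowd",
    "prime minister addresses the nation tonight",
    "local council schedules routine road repair",
    "weekly weather summary for the region",
    "minor changes to bus timetable announced",
    "library extends opening hours next month",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Vectorizer and classifier are fitted together so that a raw title
# can be passed straight to predict().
model = make_pipeline(
    TfidfVectorizer(lowercase=True, stop_words="english"),
    MLPClassifier(hidden_layer_sizes=(8,), max_iter=300, random_state=0),
)
model.fit(train_titles, train_labels)


def predict_popularity(title: str) -> str:
    """Return the predicted popularity class for one news title."""
    label = model.predict([title])[0]
    return "popular" if label == 1 else "not popular"


print(predict_popularity("national team announces new coach"))
```

Wrapping the vectorizer and the classifier in one pipeline means a web endpoint only has to forward the title string, which matches the one-parameter design described above.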

Preprocessing
This stage processes the data set. It begins with the news article titles as input, which are processed by tokenizing, stop word removal, and filtering of all words. The preprocessing stage in figure 2 produces clean data. This means that the data is in lowercase, does not contain meaningless words, and consists of base-form words with a valid meaning produced by the lemmatization step. In figure 3, the headline data is first processed through case folding: all letters are converted to lowercase and punctuation marks are removed. The sentence then goes to the tokenizing stage, which divides it into word tokens. Each of these words goes through a stop word removal process that filters out the words listed in the stop word list. Finally, the clean words enter the lemmatization stage, where they are changed to their base forms in English.
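The four preprocessing steps can be sketched in a few lines. To stay self-contained, the sketch below does not call NLTK. It substitutes a tiny hard-coded stop word set and a small lookup table for lemmatization, both of which are illustrative assumptions and far smaller than the resources a real system would use.

```python
import re

# Tiny illustrative resources (assumptions); a real system would use a full
# stop word list and a proper lemmatizer such as the ones provided by NLTK.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "on", "and", "for"}
LEMMAS = {
    "cities": "city",
    "markets": "market",
    "rising": "rise",
    "prices": "price",
    "hits": "hit",
}


def preprocess(title):
    """Case folding, tokenizing, stop word removal, and lemmatization."""
    text = title.lower()                        # case folding
    text = re.sub(r"[^a-z0-9\s]", " ", text)    # remove punctuation marks
    tokens = text.split()                       # tokenizing on whitespace
    tokens = [t for t in tokens if t not in STOP_WORDS]  # stop word removal
    return [LEMMAS.get(t, t) for t in tokens]   # lookup-based lemmatization


print(preprocess("Rising Prices in the Cities: Markets Are Closed!"))
# prints ['rise', 'price', 'city', 'market', 'closed']
```

Words missing from the lookup table, such as "closed", pass through unchanged, which is where a full lemmatizer would differ from this sketch.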

TF-IDF Process
TF-IDF weighting is applied to each word that has passed the preprocessing stage. This weighting assigns a value to each word, and these values are used as the input to the model.

Data Processing
At this stage, the data is processed. Predicting the popularity of news articles requires several techniques. After the data has been collected and has passed all preprocessing steps, it is classified using the Multi Layer Perceptron algorithm, which predicts the popularity of each news article.

Prediction using Multi Layer Perceptron
After the words have been weighted by TF-IDF, the prediction is carried out with the Multi Layer Perceptron method. The steps of the Multi Layer Perceptron method are as follows:
1. Determine the number of inputs, hidden layers, and outputs as training targets.
2. Randomly assign initial values to all weights between the input-hidden and hidden-output layers.
3. Perform the feed-forward pass.
4. Perform back propagation.
Table 1 below is a sample calculation of the Multi Layer Perceptron using 2 input layer neurons, 2 hidden layer neurons, and 1 output layer neuron.
Table 1. Sample of input values
In the sample above, X1 and X2 are the values of the input layer and Y is the actual result, that is, the popularity of the news article. The prediction should match the Y value. The first step is to define the network parameters:
Epoch = 150
Bias = 1
Figure 6. MLP Process Sample
The input layer neurons are forwarded to Neurons 1 and 2 of the hidden layer. The first step is to calculate the weights. W111 denotes the weight of hidden layer neuron 1, input layer 1, first weight, while B denotes the bias. The value at the input layer is calculated using the formula described in the previous chapter.
A higher epoch value generally gives more accurate results. However, an epoch value that is too high harms training performance, so the right epoch value needs to be determined.
After the weights have been calculated in the forward pass, they are updated in the backward pass. From the updated weights, the induced field and the neuron output are then computed, which are 0.0 and 0.9 in the two neurons of the hidden layer.
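The feed-forward and back propagation steps for a network of this size (2 input neurons, 2 hidden neurons, 1 output neuron, 150 epochs, bias starting at 1) can be sketched with NumPy. The sketch rests on several assumptions that are not taken from the paper: sigmoid activations, a mean squared error loss, a learning rate of 0.5, full-batch updates, and hypothetical input values in place of Table 1.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical sample: two input values per row and one target Y per row.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [1.0]])

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(2, 2))  # input -> hidden weights (random init)
b1 = np.ones((1, 2))                     # hidden bias, starting at 1
W2 = rng.normal(scale=0.5, size=(2, 1))  # hidden -> output weights (random init)
b2 = np.ones((1, 1))                     # output bias, starting at 1

learning_rate = 0.5   # assumption, not given in the text
epochs = 150

losses = []
for _ in range(epochs):
    # Feed forward: induced field, then neuron output, layer by layer.
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)
    losses.append(float(np.mean((output - y) ** 2)))

    # Back propagation: error terms use the sigmoid derivative s * (1 - s).
    # The constant factor of the squared-error gradient is folded into the
    # learning rate.
    delta_out = (output - y) * output * (1 - output)
    delta_hidden = (delta_out @ W2.T) * hidden * (1 - hidden)

    # Weight updates, averaged over the samples.
    W2 -= learning_rate * hidden.T @ delta_out / len(X)
    b2 -= learning_rate * delta_out.mean(axis=0, keepdims=True)
    W1 -= learning_rate * X.T @ delta_hidden / len(X)
    b1 -= learning_rate * delta_hidden.mean(axis=0, keepdims=True)

print("first epoch loss:", round(losses[0], 4))
print("last epoch loss:", round(losses[-1], 4))
```

Recording the loss at every epoch makes it possible to see where additional epochs stop improving the fit, which is the practical way to choose the epoch value.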

System Testing
Testing is carried out after the implementation phase is complete. It helps the research determine whether the system runs properly and as intended.

RESULTS AND DISCUSSION
After the system was built, blackbox testing was performed according to the scenario prepared in the previous chapter. The results of blackbox testing are represented in the following table:
Based on the testing scenarios, the results are as expected, and the tests can be considered successful. All features have been tested with the expected results.
Testing is also needed to confirm that the News Popularity Prediction website meets its requirements and runs correctly. These tests are carried out using User Acceptance Testing (UAT). The following are the results of the UAT questionnaire for the News Popularity Prediction system. This stage aims to find out whether the system that has been built meets user needs, and to what extent the application functions and is useful. From the responses of 15 users collected on August 2, 2022, the highest and lowest scores can be calculated as follows:
Table 3. Table Highest
From the calculation, the highest value is 4800, so the percentage result of the UAT test is as follows:

User Acceptance Testing Result
From the percentage above, it can be concluded that the usability of the system is strong, at 81% out of 100%.

Accuracy Testing Result
Accuracy testing aims to determine the level of success of the system in predicting the popularity of news by using several testing samples that have been split in the system.

Multilayer Perceptron Testing
Before the testing stage, the validation value is calculated using K-fold cross-validation. With the model parameters used for training and validation, five data set split ratios are validated. The validation results show that the highest average value occurs at the 50:30:20 ratio, at 61%. The values from fold 1 to fold 5 do not differ significantly. The accuracy of the Multi Layer Perceptron method was tested using the MLP Classifier on the five data set splits, and the result is represented by the confusion matrix in the following table:
In Table 5 above, 190 popular and 181 not popular items are predicted correctly, while the other 118 items are predicted incorrectly. From this table, the classification report is calculated as follows:
Table 6 is the classification report. It shows that the MLP Classifier model gives its highest accuracy, 76%, at the 80:10:10 split ratio. For not popular predictions, precision is 76%, recall is 77%, and f1-score is 76%. For popular predictions, precision is 76%, recall is 75%, and f1-score is 76%.
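As a check on the arithmetic, the accuracy can be recomputed from the counts quoted above: 190 and 181 correct predictions and 118 incorrect ones. The function name `accuracy_from_counts` is a hypothetical helper introduced only for this illustration.

```python
def accuracy_from_counts(correct_per_class, incorrect_total):
    """Accuracy = correct predictions / all predictions."""
    correct = sum(correct_per_class)
    return correct / (correct + incorrect_total)


# Counts quoted in the text: 190 popular and 181 not popular items predicted
# correctly, and 118 items predicted incorrectly.
mlp_accuracy = accuracy_from_counts([190, 181], 118)
print(f"{mlp_accuracy:.2%}")
```

The result, 371 correct out of 489 predictions, rounds to the 76% accuracy reported for the MLP Classifier.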

Random Forest Testing
In this study, the Random Forest algorithm is used as a comparison method for the Multi Layer Perceptron. It uses the same data set splits as the Multi Layer Perceptron. The stage before testing runs the validation function using K-fold cross-validation with a total of 5 folds. In the validation test, the highest average value occurs at the 50:30:20 ratio, at 63%. The values from fold 1 to fold 5 do not differ significantly. The accuracy of the Random Forest method was tested using the Random Forest Classifier on the five data set splits, and the result is represented by the confusion matrix in the following table:
In the confusion matrix table, 193 not popular and 150 popular items are predicted correctly, while the other 155 items are predicted incorrectly. Table 9 gives the classification report, which describes the precision, recall, and f1-score of each class.
According to Table 9, the highest accuracy is 70%, at the 80:10:10 split ratio. The not popular class has a precision of 76%, a recall of 60%, and an f1-score of 67%. The popular class has a precision of 66%, a recall of 80%, and an f1-score of 73%.

Comparison Method Between MLP and Random Forest
After the accuracy results of the 2 algorithms have been obtained, the next step is to compare them. Both methods use the same amount of data and the same class balance in each data set split. The comparison of Multi Layer Perceptron and Random Forest is presented in the following table:
The comparison shows that the highest accuracy is obtained by the Multi Layer Perceptron algorithm with the MLP Classifier, at 76%. The difference from the Random Forest algorithm is fairly large, at 6 percentage points, because Random Forest reaches 70% accuracy on the news headline data.

CONCLUSION
Based on the research that has been done, the following conclusions can be drawn. First, from the test results described in detail in chapter V, the accuracy of the Multi Layer Perceptron algorithm is 76%, compared with 70% for Random Forest. The difference in accuracy is 6 percentage points, so for the Predicting News Popularity with Multi Layer Perceptron research, the best classification and accuracy results are achieved by the Multi Layer Perceptron. Second, the News Popularity Prediction system is built with a website-based front end so that it can be accessed flexibly and can assist writers in choosing the right headlines when developing news content. Third, the system provides a Setting feature that lets the user choose the preferred prediction method, either Multi Layer Perceptron or Random Forest.