A Comparative Hybrid Machine Learning Model for Stock Market Prediction Using Sentiment and Performance Data
Abstract
Stock market prediction remains one of the most challenging problems in financial analytics because price fluctuations are driven by both structured financial indicators and unstructured qualitative signals such as news sentiment, investor reactions, and company disclosures. This paper develops a hybrid machine learning framework that integrates Support Vector Machine (SVM), Random Forest (RF), and Long Short-Term Memory (LSTM) models to predict stock price movements from sentiment and company performance data. The study examines percentage changes in stock price for 25 major publicly traded companies across three event categories: positive news events (Q1), negative news events (Q2), and yearly company performance reports (Q3). The data, collected from 2021 to 2025, comprise 12,500 financial news articles, 35,000 social media posts, and 125 company performance reports. The results show that negative news has a stronger influence on stock prices than positive news: prices fell by an average of 4.6% after negative events and rose by an average of 3.8% after positive events. The hybrid model achieved the highest classification accuracy (86.7%), the lowest RMSE (0.21), and a macro F1-score of 0.84, outperforming the individual SVM, RF, and LSTM models. Statistical validation through ANOVA, regression, and independent-samples t-tests indicated that sentiment and event type significantly affect stock price variations. The findings indicate that sentiment acts as a leading indicator for short-term stock prediction, while yearly reports contribute to long-term stability. The study highlights the advantage of hybrid machine learning approaches for financial forecasting and decision support.
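As an illustration only, the sketch below shows one common way to combine several classifiers and to compute the macro F1-score reported above. The abstract does not state how the SVM, RF, and LSTM outputs are fused, so soft voting (averaging class probabilities) is an assumption here, and the function names and sample probabilities are hypothetical.

```python
def soft_vote(prob_rows):
    """Average class-probability vectors from several models (assumed fusion rule)
    and return the index of the class with the highest mean probability."""
    n_models = len(prob_rows)
    n_classes = len(prob_rows[0])
    avg = [sum(row[c] for row in prob_rows) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c])


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical [down, up] probabilities from three models for one event
svm_p, rf_p, lstm_p = [0.6, 0.4], [0.3, 0.7], [0.2, 0.8]
predicted_class = soft_vote([svm_p, rf_p, lstm_p])  # 1 means "up"

# Hypothetical labels for four events
score = macro_f1([1, 0, 1, 1], [1, 0, 0, 1])
```

With these made-up inputs, the averaged probabilities favor the "up" class even though one model disagrees, which is the behavior that motivates combining models. Macro averaging weights the "up" and "down" classes equally regardless of how often each occurs.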

