Hybrid Big Data Analytics: Integrating Structured and Unstructured Data for Predictive Intelligence
Abstract
Hybrid big data analytics has emerged as a promising paradigm for predictive intelligence, yet most operational pipelines still rely on a single modality, either structured relational data or unstructured text, and so under-exploit complementary signals. This paper proposes a unified framework that integrates structured records (e.g., time-series sensor data, tabular attributes) with unstructured corpora (e.g., clinical narratives, web-scale text) through a multimodal deep learning architecture coupled with scalable clustering and query optimization. The method combines static encoders, temporal CNN/LSTM modules, and text representations (e.g., document embeddings with BiLSTM/CNN) in a learned fusion layer, and augments inference with a Gaussian Mixture Model optimized by the bio-inspired Salp Swarm Algorithm for low-latency, distributed querying. Experiments in two representative domains, infectious-disease forecasting and Industry 4.0 cycle-time projection, show consistent improvements over single-modality baselines (higher AUROC, F1, and AUPRC; lower MAE) while preserving near real-time responsiveness on commodity GPU/CPU clusters. We discuss integration complexity, interpretability challenges, and deployment constraints, and outline practical pathways for edge-side execution, cross-domain transfer learning, and explainability overlays. By systematically bridging structured and unstructured modalities, the study demonstrates material performance improvements and offers a robust template for multimodal analytics in high-stakes environments.
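The abstract does not specify how the Salp Swarm Algorithm is coupled to the Gaussian Mixture Model. As an illustration only, the sketch below assumes the swarm searches directly over the component means of a 1-D, equal-weight mixture with a fixed shared standard deviation, minimizing the negative log-likelihood. The function names (`gmm_nll`, `salp_swarm`), the synthetic data, the single-leader chain, the bounds, and the 0.5 threshold on `c3` are assumptions made for this example and are not taken from the paper.

```python
import math
import random


def gmm_nll(means, data, sigma=1.0):
    """Negative log-likelihood of a 1-D, equal-weight GMM with shared sigma.

    Simplifying assumption: only the component means are free parameters.
    """
    k = len(means)
    norm = 1.0 / (k * sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for x in data:
        density = sum(math.exp(-0.5 * ((x - m) / sigma) ** 2) for m in means)
        total -= math.log(norm * density + 1e-300)  # guard against log(0)
    return total


def salp_swarm(objective, dim, lb, ub, n_salps=20, n_iter=100, seed=0):
    """Minimize `objective` over [lb, ub]^dim with a basic salp chain."""
    rng = random.Random(seed)
    salps = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n_salps)]
    fitness = [objective(s) for s in salps]
    best = min(range(n_salps), key=fitness.__getitem__)
    food, food_fit = list(salps[best]), fitness[best]  # best solution so far
    history = [food_fit]

    for it in range(1, n_iter + 1):
        # c1 shrinks over time, moving the leader from exploration to exploitation.
        c1 = 2.0 * math.exp(-((4.0 * it / n_iter) ** 2))
        for i in range(n_salps):
            if i == 0:
                # Leader: jump around the food source.
                for j in range(dim):
                    c2, c3 = rng.random(), rng.random()
                    step = c1 * ((ub - lb) * c2 + lb)
                    salps[i][j] = food[j] + step if c3 >= 0.5 else food[j] - step
            else:
                # Follower: move halfway toward the salp ahead of it in the chain.
                for j in range(dim):
                    salps[i][j] = 0.5 * (salps[i][j] + salps[i - 1][j])
            # Keep every salp inside the search bounds.
            salps[i] = [min(max(v, lb), ub) for v in salps[i]]
            fit = objective(salps[i])
            if fit < food_fit:
                food, food_fit = list(salps[i]), fit
        history.append(food_fit)

    return food, food_fit, history


# Synthetic two-cluster data (illustrative, not from the paper).
data_rng = random.Random(1)
data = [data_rng.gauss(-2.0, 1.0) for _ in range(100)]
data += [data_rng.gauss(3.0, 1.0) for _ in range(100)]

means, nll, history = salp_swarm(
    lambda m: gmm_nll(m, data), dim=2, lb=-10.0, ub=10.0
)
print("estimated means:", sorted(means), "negative log-likelihood:", nll)
```

A fuller treatment would also search over mixture weights and variances, and would distribute the fitness evaluations across workers. Another plausible design, again an assumption, is to use the swarm result only to initialize a standard expectation-maximization fit.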
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.