The Role of Machine Learning in Enhancing Data Science Workflows: A Systematic Review
DOI:
https://doi.org/10.69968/ijisem.2025v4i1392-397
Keywords:
Machine Learning (ML), Data Science Workflows, Natural Language Processing, Artificial Intelligence, Internet of Things (IoT), Data Analytics, Data Mining
Abstract
Machine learning (ML) models, which are almost always built on big data, have attracted considerable interest in applications ranging from computer vision to natural language processing. The growing number of products and applications that integrate ML models reflects the intersection of data science, artificial intelligence, and software engineering. This review finds that machine learning plays a pivotal role in enhancing data science workflows by automating complex tasks, improving predictive accuracy, and enabling data-driven decision-making. Proactive and reactive algorithms, supported by the computational power of GPUs and TPUs, allow better forecasting and response in real-world applications. Incorporating data engineering techniques with AI improves scalability and efficiency and reduces human error. Furthermore, machine learning enables the discovery of hidden patterns, the handling of massive datasets, and the integration of smart computing into decision-making across domains such as business, healthcare, cybersecurity, and urban systems, thereby significantly advancing data science practice.
License
Copyright (c) 2025 Mudita Sharma

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Re-users must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. This license allows for redistribution, commercial and non-commercial, as long as the original work is properly credited.