AutoDiagn: An Automated Real-Time Diagnosis Framework for Big Data Systems
| dc.contributor.author | Demirbaga, Ümit | |
| dc.contributor.author | Wen, Zhenyu | |
| dc.contributor.author | Noor, Ayman | |
| dc.contributor.author | Mitra, Karan | |
| dc.contributor.author | Alwasel, Khaled | |
| dc.contributor.author | Garg, Saurabh | |
| dc.contributor.author | Zomaya, Albert Y. | |
| dc.date.accessioned | 2025-10-18T10:10:27Z | |
| dc.date.created | 2022 | |
| dc.date.issued | 2022 | |
| dc.department | Faculties, Faculty of Engineering, Architecture and Design, Department of Computer Engineering | |
| dc.description.abstract | Big data processing systems, such as Hadoop and Spark, usually work in large-scale, highly-concurrent, and multi-tenant environments that can easily cause hardware and software malfunctions or failures, thereby leading to performance degradation. Several systems and methods exist to detect big data processing systems' performance degradation, perform root-cause analysis, and even overcome the issues causing such degradation. However, these solutions focus on specific problems such as stragglers and inefficient resource utilization. There is a lack of a generic and extensible framework to support the real-time diagnosis of big data systems. In this article, we propose, develop and validate AutoDiagn. This generic and flexible framework provides holistic monitoring of a big data system while detecting performance degradation and enabling root-cause analysis. We present an implementation and evaluation of AutoDiagn that interacts with a Hadoop cluster deployed on a public cloud and tested with real-world benchmark applications. Experimental results show that AutoDiagn can offer a high accuracy root-cause analysis framework, at the same time as offering a small resource footprint, high throughput, and low latency. | |
| dc.description.sponsorship | Turkish Ministry of National Education; UKRI [EP/T021985/1, EP/R033293/1, EP/T022582/1]; National Natural Science Foundation of China [62072408]; Zhejiang Provincial Natural Science Foundation of China [LY20F020030] | |
| dc.description.sponsorship | This work was supported in part by the Turkish Ministry of National Education, in part by the following UKRI projects through SUPER under Grant EP/T021985/1, through PACE under Grant EP/R033293/1, and through Centre for Digital Citizens under Grant EP/T022582/1, in part by the National Natural Science Foundation of China under Grant 62072408, and in part by the Zhejiang Provincial Natural Science Foundation of China under Grant LY20F020030. | |
| dc.identifier.doi | 10.1109/TC.2021.3070639 | |
| dc.identifier.endpage | 1048 | |
| dc.identifier.issn | 0018-9340 | |
| dc.identifier.issn | 1557-9956 | |
| dc.identifier.issue | 5 | |
| dc.identifier.orcid | Zomaya, Albert/0000-0002-3090-1059 | |
| dc.identifier.orcid | Noor, Ayman/0000-0002-3344-2847 | |
| dc.identifier.orcid | Garg, Saurabh Kumar/0000-0001-8719-284X | |
| dc.identifier.orcid | Mitra, Karan/0000-0003-3489-7429 | |
| dc.identifier.orcid | Demirbaga, Umit/0000-0001-5159-0723 | |
| dc.identifier.scopus | 2-s2.0-85103783969 | |
| dc.identifier.scopusquality | Q1 | |
| dc.identifier.startpage | 1035 | |
| dc.identifier.uri | https://doi.org/10.1109/TC.2021.3070639 | |
| dc.identifier.uri | https://hdl.handle.net/11772/21875 | |
| dc.identifier.volume | 71 | |
| dc.identifier.wos | WOS:000778905700004 | |
| dc.identifier.wosquality | Q2 | |
| dc.indekslendigikaynak | Web of Science | |
| dc.indekslendigikaynak | Scopus | |
| dc.language.iso | en | |
| dc.publisher | IEEE Computer Society | |
| dc.relation.ispartof | IEEE Transactions on Computers | |
| dc.relation.publicationcategory | Article - International Peer-Reviewed Journal - Institutional Faculty Member | |
| dc.rights | info:eu-repo/semantics/closedAccess | |
| dc.snmz | WoS_20251016 | |
| dc.subject | Big Data | |
| dc.subject | Monitoring | |
| dc.subject | Task Analysis | |
| dc.subject | Real-Time Systems | |
| dc.subject | Degradation | |
| dc.subject | Measurement | |
| dc.subject | Data Visualization | |
| dc.subject | Root-Cause Analysis | |
| dc.subject | Big Data Systems | |
| dc.subject | QoS | |
| dc.subject | Hadoop | |
| dc.subject | Performance | |
| dc.title | AutoDiagn: An Automated Real-Time Diagnosis Framework for Big Data Systems | |
| dc.type | Article | |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | 6197518d-2220-4e55-aa0a-5fc7d5c6606d | |
| relation.isAuthorOfPublication.latestForDiscovery | 6197518d-2220-4e55-aa0a-5fc7d5c6606d |










