Automate: Automatic Anomaly Detection and Root Cause Analysis Framework for Hadoop

dc.contributor.authorLiu, Xinyuan
dc.contributor.authorJha, Devki Nandan
dc.contributor.authorLi, Yinhao
dc.contributor.authorBarika, Mutaz
dc.contributor.authorDemirbaga, Ümit
dc.contributor.authorRanjan, Rajiv
dc.date.accessioned2025-10-18T09:16:43Z
dc.date.created2024
dc.date.issued2024
dc.departmentFaculties, Faculty of Engineering, Architecture and Design, Department of Computer Engineering
dc.description1st IEEE International Conference on Meta Computing, ICMC 2024 -- Qingdao -- 210263
dc.description.abstractBig data frameworks, such as Hadoop, unlock immense potential. Yet they come with complex challenges, such as elusive faults and straggler tasks, that disrupt the execution workflow and reduce resource utilisation. Traditional techniques, such as median analysis, struggle to identify and locate these faults. To address this issue, this paper presents Automate, an automatic anomaly detection and root cause analysis framework, which makes a two-fold contribution. First, Automate improves the monitoring of Hadoop clusters by implementing AUtool to monitor cluster resources and task progress. Second, building on the enhanced monitoring, it leverages machine learning algorithms to analyse system logs, aiming to detect outliers and determine their root causes. Automate targets the issue of slow process execution, focusing on the combined impact of server heterogeneity and data locality, thereby offering a comprehensive analysis of factors affecting system efficiency. Our experimental findings demonstrate that the proposed method significantly enhances the accuracy of identifying system outliers and analysing root causes, offering an automated and more effective solution for monitoring and optimising big data system performance. © 2025 Elsevier B.V., All rights reserved.
dc.identifier.doi10.1109/ICMC60390.2024.00030
dc.identifier.endpage222
dc.identifier.isbn9798350355994
dc.identifier.scopus2-s2.0-105012166839
dc.identifier.scopusqualityN/A
dc.identifier.startpage213
dc.identifier.urihttps://doi.org/10.1109/ICMC60390.2024.00030
dc.identifier.urihttps://hdl.handle.net/11772/19397
dc.indekslendigikaynakScopus
dc.language.isoen
dc.publisherInstitute of Electrical and Electronics Engineers Inc.
dc.relation.publicationcategoryConference Item - International - Institutional Faculty Member
dc.rightsinfo:eu-repo/semantics/closedAccess
dc.snmzScopus_20251016
dc.subjectAnomaly Detection
dc.subjectBig Data
dc.subjectHadoop
dc.subjectPluggable Machine Learning
dc.subjectRoot Cause Analysis
dc.titleAutomate: Automatic Anomaly Detection and Root Cause Analysis Framework for Hadoop
dc.typeConference Object
dspace.entity.typePublication
relation.isAuthorOfPublication6197518d-2220-4e55-aa0a-5fc7d5c6606d
relation.isAuthorOfPublication.latestForDiscovery6197518d-2220-4e55-aa0a-5fc7d5c6606d

Files