Automate: Automatic Anomaly Detection and Root Cause Analysis Framework for Hadoop
| dc.contributor.author | Liu, Xinyuan | |
| dc.contributor.author | Jha, Devki Nandan | |
| dc.contributor.author | Li, Yinhao | |
| dc.contributor.author | Barika, Mutaz | |
| dc.contributor.author | Demirbaga, Ümit | |
| dc.contributor.author | Ranjan, Rajiv | |
| dc.date.accessioned | 2025-10-18T09:16:43Z | |
| dc.date.created | 2024 | |
| dc.date.issued | 2024 | |
| dc.department | Faculties, Faculty of Engineering, Architecture and Design, Department of Computer Engineering | |
| dc.description | 1st IEEE International Conference on Meta Computing, ICMC 2024 -- Qingdao -- 210263 | |
| dc.description.abstract | Big data frameworks such as Hadoop unlock immense potential, yet they come with complex challenges, such as elusive faults and straggler tasks, that disrupt the execution workflow and reduce resource utilisation. Traditional techniques, such as median analysis, struggle to identify and locate these faults. To address this issue, this paper presents Automate, an automatic anomaly detection and root cause analysis framework, which makes a two-fold contribution. First, Automate improves the monitoring of Hadoop clusters by implementing AUtool to monitor cluster resources and task progress. Second, with the enhanced monitoring, we leverage machine learning algorithms to analyse system logs, aiming to detect outliers and determine their root causes. Automate targets the issue of slow process execution, focusing on the combined impact of server heterogeneity and data locality, thereby offering a comprehensive analysis of factors affecting system efficiency. Our experimental findings demonstrate that the proposed method significantly enhances the accuracy of identifying system outliers and analysing root causes, offering an automated and more effective solution for monitoring and optimising big data system performance. © 2025 Elsevier B.V., All rights reserved. | |
| dc.identifier.doi | 10.1109/ICMC60390.2024.00030 | |
| dc.identifier.endpage | 222 | |
| dc.identifier.isbn | 9798350355994 | |
| dc.identifier.scopus | 2-s2.0-105012166839 | |
| dc.identifier.scopusquality | N/A | |
| dc.identifier.startpage | 213 | |
| dc.identifier.uri | https://doi.org/10.1109/ICMC60390.2024.00030 | |
| dc.identifier.uri | https://hdl.handle.net/11772/19397 | |
| dc.indekslendigikaynak | Scopus | |
| dc.language.iso | en | |
| dc.publisher | Institute of Electrical and Electronics Engineers Inc. | |
| dc.relation.publicationcategory | Conference Item - International - Institutional Faculty Member | |
| dc.rights | info:eu-repo/semantics/closedAccess | |
| dc.snmz | Scopus_20251016 | |
| dc.subject | Anomaly Detection | |
| dc.subject | Big Data | |
| dc.subject | Hadoop | |
| dc.subject | Pluggable Machine Learning | |
| dc.subject | Root Cause Analysis | |
| dc.title | Automate: Automatic Anomaly Detection and Root Cause Analysis Framework for Hadoop | |
| dc.type | Conference Object | |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | 6197518d-2220-4e55-aa0a-5fc7d5c6606d | |
| relation.isAuthorOfPublication.latestForDiscovery | 6197518d-2220-4e55-aa0a-5fc7d5c6606d |










