dc.contributor.author: Demirbaga, Ümit
dc.date.accessioned: 2021-06-07T12:06:37Z
dc.date.available: 2021-06-07T12:06:37Z
dc.date.issued: 2021-05-05
dc.identifier.uri: https://link.springer.com/article/10.1007/s00521-021-06046-y
dc.identifier.uri: http://hdl.handle.net/11772/6651
dc.description.abstract: Twitter produces a massive amount of data due to its popularity, which is one of the reasons underlying big data problems. One of those problems is the classification of tweets: their sophisticated and complex language makes the current tools insufficient. We present our framework HTwitt, built on top of the Hadoop ecosystem, which consists of a MapReduce algorithm and a set of machine learning techniques embedded within a big data analytics platform to efficiently address the following problems: (1) traditional data processing techniques are inadequate to handle big data; (2) data preprocessing needs substantial manual effort; (3) domain knowledge is required before the classification; (4) semantic explanation is ignored. In this work, these challenges are overcome by using different algorithms combined with a Naïve Bayes classifier to ensure reliability and highly precise recommendations in virtualization and cloud environments. These features distinguish HTwitt from other systems by giving it an effective and practical design for text classification in big data analytics. The main contribution of the paper is a framework for building landslide early warning systems by pinpointing useful tweets and visualizing them along with the processed information. We present experimental results that quantify the levels of overfitting in the training stage of the model, using real-world datasets of different sizes in the machine learning phases. Our results demonstrate that the proposed system provides high-quality results with a score of nearly 95% and meets the requirement of a Hadoop-based classification system. [tr_TR]
dc.description.sponsorship: Newcastle University [tr_TR]
dc.language.iso: eng [tr_TR]
dc.publisher: Springer [tr_TR]
dc.relation.isversionof: 10.1007/s00521-021-06046-y [tr_TR]
dc.rights: info:eu-repo/semantics/openAccess [tr_TR]
dc.subject: Big data [tr_TR]
dc.subject: MapReduce [tr_TR]
dc.subject: Machine learning [tr_TR]
dc.subject: Classification [tr_TR]
dc.subject: Monitoring [tr_TR]
dc.subject: Visualization [tr_TR]
dc.title: HTwitt: A Hadoop-based platform for analysis and visualization of streaming Twitter data [tr_TR]
dc.type: article [tr_TR]
dc.relation.journal: Neural Computing and Applications [tr_TR]
dc.contributor.department: Bartın University, Faculty of Engineering, Architecture and Design, Department of Computer Engineering [tr_TR]
dc.contributor.authorID: 0000-0001-5159-0723 [tr_TR]

