Spammer classification using ensemble methods over content-based features
Book
Authors | Makkar, A. and Goel, S. |
---|---|
Editors | Kusum Deep, Jagdish Chand Bansal, Kedar Nath Das, Arvind Kumar Lal, Harish Garg, Atulya K. Nagar and Millie Pant |
Abstract | As the number of web documents grows rapidly, it becomes increasingly difficult to access useful information. Search engines play a major role in the retrieval of relevant information and knowledge. They manage large amounts of information using efficient page-ranking algorithms. Still, web spammers try to intrude into search engine results through various web spamming techniques for their personal benefit. According to a March 2016 report from Internetlivestats, an Internet survey company, there are currently 3.4 billion Internet users in the world. This figure suggests that search engines play a vital role in information retrieval. In this research, we investigate fifteen different machine learning classification algorithms over content-based features to classify spam and non-spam web pages. An ensemble is then built from the three algorithms found to perform best on the basis of various parameters. Ten-fold cross-validation is also used. |
Keywords | Web spamming; Machine learning; Boosting; Ensemble |
ISBN | 978-981103324-7 |
ISSN | 2194-5357 |
Digital Object Identifier (DOI) | https://doi.org/10.1007/978-981-10-3325-4_1 |
Web address (URL) | http://www.scopus.com/inward/record.url?eid=2-s2.0-85018399754&partnerID=MN8TOARS |
Output status | Published |
Publication dates | 13 Apr 2017 |
Publication process dates | |
Deposited | 22 May 2023 |
Year | 2017 |
Publisher | Springer Verlag |
Series | Advances in Intelligent Systems and Computing |
Journal | Advances in Intelligent Systems and Computing |
https://repository.derby.ac.uk/item/9yx44/spammer-classification-using-ensemble-methods-over-content-based-features