Conference papers

WebGuard: Web-Based Adult Content Detection and Filtering System

Mohamed Hammami 1 Youssef Chahir 2 Liming Chen 1
2 Equipe Image - Laboratoire GREYC - UMR6072
GREYC - Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen
Abstract: We describe WebGuard, a Web filtering system that aims to automatically detect and filter adult content on the Web. WebGuard uses a Web crawler to extract relevant data from the Web and combines the textual content, the image content, and the URL name of a Web page to construct its feature vector. It then uses data mining techniques to classify URLs into two classes: suspect URLs and normal URLs. The suspect URLs are stored in a database, which is constantly and automatically updated to reflect the highly dynamic evolution of the Web. In operation, WebGuard captures the URL a user requests, matches it against the suspect URLs stored in the database, and takes an appropriate action (filtering or blocking) according to the result. Our preliminary results show that it can detect and filter adult content effectively.
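The run-time step described in the abstract (capture a URL, match it against the stored suspect URLs, act on the result) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how URLs are normalized, how the database is stored, or how filtering differs from blocking, so the in-memory set, the `normalize` rule, and the names `SuspectUrlStore` and `decide` are all assumptions made for illustration.

```python
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to lowercase host plus path.

    Assumption: scheme, query string, fragment, and a trailing slash are
    ignored when matching. The paper's abstract does not specify this.
    """
    # urlsplit only recognizes the host when a scheme is present.
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = (parts.hostname or "").lower()
    path = parts.path.rstrip("/")
    return host + path


class SuspectUrlStore:
    """Hypothetical stand-in for the suspect-URL database (an in-memory set)."""

    def __init__(self) -> None:
        self._suspect: set[str] = set()

    def add(self, url: str) -> None:
        # In the described system, entries would come from the classifier.
        self._suspect.add(normalize(url))

    def is_suspect(self, url: str) -> bool:
        return normalize(url) in self._suspect


def decide(store: SuspectUrlStore, url: str) -> str:
    """Return the action for a requested URL: 'block' if stored as suspect."""
    return "block" if store.is_suspect(url) else "allow"


store = SuspectUrlStore()
store.add("http://Example.test/adult/")
print(decide(store, "example.test/adult"))
print(decide(store, "http://example.test/news"))
```

A set lookup keeps the per-request check cheap, which fits the abstract's description of classification happening ahead of time and the run-time step being a simple match.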
Document type: Conference papers

https://hal.archives-ouvertes.fr/hal-00324858
Contributor : Dodola Greyc
Submitted on : Thursday, September 25, 2008 - 4:40:59 PM
Last modification on : Tuesday, November 19, 2019 - 2:09:27 AM

Citation

Mohamed Hammami, Youssef Chahir, Liming Chen. WebGuard: Web-Based Adult Content Detection and Filtering System. IEEE/WIC International Conference on Web Intelligence (WI'03), 2003, Halifax, Canada. pp.574-578, ⟨10.1109/WI.2003.1241271⟩. ⟨hal-00324858⟩
