Web content filtering
From Computing and Software Wiki
This page gives a brief overview of the different types of HTTP (Web) content filtering and the technologies behind them.
Web Content Filtering
Traditional Filtering
Profanity in URL approach
The most basic form of content filtering is filtering based on the URL. It is so simple that many home routers support it natively. The filter prevents a user from visiting any URL that contains one of a given set of words or phrases.
As an example, assume a system administrator wants to block an online game such as Slime Volleyball. Since the word slime is conveniently not used very often, all URLs that contain the text string slime can be easily blocked. www.SlimeVolleyBall.net would be blocked simply because it contains the word Slime.
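The check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular router implements it; the names BLOCKED_WORDS and is_blocked are hypothetical, and the match is assumed to be a case-insensitive substring test.

```python
# Hypothetical blocklist; a real filter would load this from its configuration.
BLOCKED_WORDS = ["slime"]


def is_blocked(url, blocked_words=BLOCKED_WORDS):
    """Return True if any blocked word appears anywhere in the URL.

    Assumption: matching is a case-insensitive substring test.
    """
    lowered = url.lower()
    return any(word.lower() in lowered for word in blocked_words)


if __name__ == "__main__":
    for candidate in ["www.SlimeVolleyBall.net", "www.example.com"]:
        verdict = "blocked" if is_blocked(candidate) else "allowed"
        print(candidate, "->", verdict)
```

Lowercasing both sides is what lets the entry slime match the mixed-case SlimeVolleyBall in the example URL.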
Although very simple, this type of blocking is ineffective in two ways:
- It frequently blocks sites that should not be blocked, because a filtered word can appear in the URL of an unrelated site.
- Sites can get through the content filter easily by having a URL that does not explicitly mention what is on the site. Sites with codes or only partial phrases as URLs will get through this type of filter.
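Both weaknesses are easy to reproduce with a small sketch. The function contains_blocked_word and both URLs below are hypothetical, chosen only to illustrate each failure; the filter is again assumed to be a case-insensitive substring match.

```python
def contains_blocked_word(url, blocked_words):
    """Case-insensitive substring match, as a simple URL word filter might do."""
    lowered = url.lower()
    return any(word.lower() in lowered for word in blocked_words)


blocked_words = ["slime"]

# Over-blocking: a hypothetical biology page is caught because its URL
# happens to contain the filtered word.
over_blocked = contains_blocked_word(
    "www.example.edu/biology/slime-molds", blocked_words
)

# Under-blocking: a hypothetical copy of the game hosted under a coded URL
# passes, because nothing in the URL mentions the word.
slipped_through = not contains_blocked_word("www.example.com/g/sv2", blocked_words)

print("legitimate page blocked:", over_blocked)
print("coded URL allowed:", slipped_through)
```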
This approach could be called the profanity in URL approach because it is most often used to filter out URLs with profanity in them.
URL Lookup Approach
HTML Header Scanning
Advanced Filtering
--Dasd2 12:53, 27 March 2008 (EDT)