Beyond the pages that you access on a day-to-day basis, there is a range of pages that are not indexed by standard search engines such as Google, Bing or Yahoo.
The term ‘Deep Web’ refers to these non-indexed pages, which can include webmail, online banking and services that sit behind a paywall.
For example, a staff intranet or content management system (CMS), used regularly by many employees, is part of the Deep Web because users have to enter credentials to access it and Google does not serve it up amongst other search results.
While the Deep Web cannot be searched using Google, several dedicated search tools have been cited as able to reach some of this content, including Surfwax, IceRocket, Stumpedia, Freebase and TechDeepWeb.
The main categories of Deep Web content include the contextual web, where page content varies with the access context, and dynamic content, which is returned in response to a submitted query or reached only through a form.
It also includes pages where access is blocked by a CAPTCHA, a paywall or a password, pages that are reachable only through links generated by JavaScript, unlinked content, and content encoded in multimedia files or other file formats that search engines do not handle.
Finally, it includes web archives, which preserve copies of web pages that have since become inaccessible.
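The categories described above can be read as a simple checklist. The following is a minimal sketch in Python, not a description of how any real search engine decides what to index: the `Page` class, its field names, the reason labels and the example URLs are all hypothetical and exist only to illustrate the idea that a page counts as Deep Web if any one of the conditions applies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    """Hypothetical description of a page; every field is an illustrative assumption."""
    url: str
    varies_by_access_context: bool = False       # contextual web
    generated_by_form_query: bool = False        # dynamic content
    requires_login: bool = False                 # password-protected
    behind_paywall: bool = False
    behind_captcha: bool = False
    reachable_only_via_javascript: bool = False  # links produced by scripts
    has_inbound_links: bool = True               # False means unlinked content
    in_unsupported_file_format: bool = False     # e.g. content inside multimedia files
    archived_copy_only: bool = False             # survives only in a web archive


def deep_web_reasons(page: Page) -> List[str]:
    """Return the categories from the text that apply to this page."""
    reasons = []
    if page.varies_by_access_context:
        reasons.append("contextual content")
    if page.generated_by_form_query:
        reasons.append("dynamic content")
    if page.requires_login or page.behind_paywall or page.behind_captcha:
        reasons.append("restricted access")
    if page.reachable_only_via_javascript:
        reasons.append("scripted links")
    if not page.has_inbound_links:
        reasons.append("unlinked content")
    if page.in_unsupported_file_format:
        reasons.append("unsupported format")
    if page.archived_copy_only:
        reasons.append("web archive")
    return reasons


def classify(page: Page) -> str:
    """A page with at least one matching category is treated as Deep Web."""
    return "deep web" if deep_web_reasons(page) else "surface web"


# Illustrative pages; the URLs are placeholders, not real sites.
intranet = Page(url="https://intranet.example.com/home", requires_login=True)
article = Page(url="https://example.org/article")

print(classify(intranet), deep_web_reasons(intranet))
print(classify(article), deep_web_reasons(article))
```

In this sketch a single matching condition is enough, which mirrors the staff intranet example: the page may be used every day, but the login requirement alone keeps it out of ordinary search results.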
The Deep Web is everything outside the Surface Web, which is the content that search engines index and includes the sites that non-specialists access directly, such as Wikipedia and Reddit.
Estimates of the size of the Deep Web vary, but it is often said to account for about 90 percent of pages on the web. The internet is sometimes likened to an iceberg, of which the visible part is only a small fraction.
The Deep Web is also distinct from the Dark Web, a smaller group of websites that operate within the encrypted Tor network. The two terms are often confused, but the Dark Web is a subset of the Deep Web.