The Internet, the Deep Web, and the Dark Web
If you’re into computer security at all, you may have heard terms like “Deep Web” and “Dark Web”. The terms can be confusing, so here are the basics:
The Internet: This is the easy one. It’s the common Internet everyone uses to read news, visit Facebook, and shop. Just consider this the “regular” Internet.
The Deep Web: The Deep Web is the part of the Internet that is not indexed by the major search engines. This means that you have to visit those places directly instead of being able to search for them. So there aren’t directions to get there, but they’re waiting if you have an address. Part of the Deep Web is there simply because the Internet is too large for search engines to cover completely, and part of it is content that crawlers cannot reach at all, such as pages behind a login or results generated by a database query. So the Deep Web is the long tail of what’s left out.
The Dark Web: The Dark Web (sometimes loosely called the Darknet) is a subset of the Deep Web that is not only not indexed, but that also requires something special to access, e.g., specific proxying software or authentication. The Dark Web often sits on top of overlay networks such as Tor, I2P, and Freenet, and is often associated with criminal activity of various degrees, including buying and selling drugs, pornography, gambling, etc.
While the Dark Web is definitely used for nefarious purposes more than the standard Internet or the Deep Web, there are many legitimate uses for it as well. Legitimate uses include things like using Tor to anonymize reports of domestic abuse, government oppression, and other crimes that have serious consequences for those calling out the issues.
Common Dark Web resource types are media distribution, with emphasis on specialized and particular interests, and exchanges where you can purchase illegal goods or services. These types of sites frequently require that one contribute before using them. This keeps the resource alive with new content, and (for illegal content sites) it helps ensure that everyone there shares a bond of mutual guilt, which reduces the chances that anyone will report the site to the authorities.
Summary
The Internet is where it’s easy to find things online because what you’re searching for is all in search engines.
The Deep Web is the part of the Internet that isn’t necessarily malicious, but is simply too large and/or obscure to be indexed due to the limitations of crawling and indexing software (like Google/Bing/Baidu).
The Dark Web is the portion of the non-indexed Internet (the Deep Web) used by those who purposely control access, either because they have a strong desire for privacy or because what they’re doing is illegal.
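The three definitions in the summary can be sketched as a small decision rule. This is only an illustration of the taxonomy, not a tool for measuring anything: the `Resource` class, its field names, and the `classify` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical description of an online resource (illustrative only)."""
    name: str
    indexed_by_search_engines: bool
    needs_special_access: bool  # e.g. Tor, I2P, or similar software


def classify(resource: Resource) -> str:
    """Return the layer a resource falls into under the definitions above."""
    # Anything the major search engines index counts as the "regular" Internet.
    if resource.indexed_by_search_engines:
        return "Internet"
    # Not indexed AND special software or access is required: Dark Web.
    if resource.needs_special_access:
        return "Dark Web"
    # Not indexed, but reachable with an ordinary browser and an address.
    return "Deep Web"


if __name__ == "__main__":
    examples = [
        Resource("news site", True, False),
        Resource("page with no inbound links", False, False),
        Resource("service reachable only over Tor", False, True),
    ]
    for r in examples:
        print(f"{r.name}: {classify(r)}")
```

Checking the indexed flag first reflects the definition used here, where the Dark Web is a subset of the Deep Web and the Deep Web is by definition not indexed.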
For more primers like this, check out my tutorial series.
Notes
The Wikipedia article on the Deep Web.
The Wikipedia article on the Dark Web.
Both the Deep Web and the Dark Web ride on top of Internet infrastructure, so it’s important to understand the difference between the Internet that’s searchable as an experience vs. the Internet as the set of connections and protocols that enable connectivity.
Sizes are not to scale for the image.
The Dark Web is likely to come under increased scrutiny by authorities because of its potential use by terror organizations to coordinate attacks. This could include communication forums that require special access methods, the use of encryption, and various types of strong authentication.
The use of “The Internet” above is somewhat confusing, as the Internet generally refers to the infrastructure that connects things. The usage here pertains to the user perspective, where they’re using “The Internet” (through a search engine) to find a recipe, to order a book online, etc.
Controlling access in the context of the Dark Web is not simply a matter of requiring a login to a web page. Access in this sense means you need to do something special just to be able to interact with the service in question, such as using a VPN, a proxy, or an anonymized network. Additional authentication is usually required once you arrive at the resource as well.
Not all Deep Web (or even Dark Web) resources are illicit, immoral, or illegal. There are some communities that are simply anti-establishment or pro-privacy to a degree that they believe they should be able to function without oversight or judgement by anyone.
Tor is an example of a project that can be, and is, used for both good and bad. It’s used to anonymize whistleblowers, but also to conceal criminal activity. Like encryption and even weapons, powerful tools often have dual purposes in this way, and are not intrinsically good or bad themselves.
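As a concrete illustration of the note on controlling access, one visible sign that a resource needs "something special" is its hostname: services on Tor use the `.onion` suffix and services on I2P use `.i2p`, and neither resolves through ordinary DNS. The sketch below only inspects a URL string and never connects anywhere. The function name `required_network` and the suffix table are assumptions made for this example, and the table is not exhaustive (Freenet, for instance, does not use hostnames this way).

```python
from typing import Optional
from urllib.parse import urlsplit

# Hostname suffixes that signal an overlay network (illustrative, not exhaustive).
SPECIAL_SUFFIXES = {
    ".onion": "Tor",
    ".i2p": "I2P",
}


def required_network(url: str) -> Optional[str]:
    """Return the overlay network a URL's hostname implies, or None.

    The URL must include a scheme (e.g. "http://"), otherwise urlsplit
    does not recognise a hostname and this function returns None.
    """
    host = urlsplit(url).hostname or ""
    for suffix, network in SPECIAL_SUFFIXES.items():
        if host.endswith(suffix):
            return network
    return None


if __name__ == "__main__":
    for url in (
        "https://example.com/",
        "http://exampleexample.onion/",
        "http://example.i2p/",
    ):
        print(url, "->", required_network(url) or "ordinary browser")
```

A `None` result says nothing about whether a site is indexed or requires a login. It only means the hostname itself does not call for special software.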