How the Internet Works

Vocabulary

In order to have an educated discussion of HOW the Internet works, we need to come to a common understanding of the terms used to discuss the workings of the Internet.
  1. Internet: The decentralized collection of Interconnected Networks and computers that share publicly accessible information.
  2. Domain: A collection of computers (or one) under the control of a server administrator.
  3. Domain name: The string of letters used to remember a web site (like yahoo.com).
  4. Top Level Domain (TLD): The last part of the domain name (like .com). There are 8 main US TLDs available as of this writing (com, net, org, gov, mil, biz, ws, us). There are also hundreds of others for countries (like uk, nz, au, ca, ge, fr, etc.).
  5. DNS: Domain Name System - The service (group of computers) that translates a domain name into an IP address.
  6. IP address: The Internet Protocol number for the computer connected to the Internet (like 168.1.1.1) using the Internet Protocol (IP)
  7. ISP: Internet Service Provider - The service used to connect a home or work computer to the Internet proper.
  8. HTML: HyperText Markup Language - the language that web pages are "programmed" in.
  9. HTTP: HyperText Transfer Protocol - The protocol for requesting and transmitting HTML web pages (by default on port 80).
  10. URL: Uniform Resource Locator - the address of a given web page or web site (like http://www.arrowsmithweb.com/internet.htm).
  11. SSL: Secure Sockets Layer - the most common form of secure web page encryption. It comes in different encryption strengths (commonly quoted as 40 bit, 128 bit, and 256 bit).
  12. Registrar: A company that maintains the whois record of your domain name information and shares it with the root DNS servers.
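
To see how several of these terms fit together, here is a minimal Python sketch that takes the example URL from the list apart. The function name describe_url is made up for this illustration, and the "domain name" line assumes a simple one-part TLD like .com (it would be wrong for something like .co.uk).

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Split a URL into the vocabulary terms defined above (hypothetical helper)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".")
    return {
        "protocol": parts.scheme,                # e.g. http
        "host": host,                            # full name the browser looks up in DNS
        "domain_name": ".".join(labels[-2:]),    # naive: assumes a one-part TLD
        "tld": labels[-1],                       # last part of the domain name
        # Use the default port for the protocol when the URL does not name one.
        "port": parts.port or {"http": 80, "https": 443}.get(parts.scheme),
        "path": parts.path,                      # the file being requested
    }

info = describe_url("http://www.arrowsmithweb.com/internet.htm")
for term, value in info.items():
    print(f"{term}: {value}")
```

The IP address is the one piece that cannot be read out of the URL itself; that is the job of DNS, described later.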

Browsing the web

When people access the Internet they are usually accessing the World Wide Web (WWW) using a web browser (like Internet Explorer or Netscape Navigator). Generally we use HTTP, which is why most URLs start with "http://www.". Once a URL is typed into the browser and the enter/return key is pressed, the browser attempts to retrieve the requested web page. The first thing it does is check the URL for a name (instead of an IP number). If there is a name, the name is resolved (converted into an IP address) by the user's local DNS server, which is usually provided by the ISP. The browser then sends a GET request for the desired file to that IP address. If the host server is operating normally and the file exists, it answers with a response code of 200 and then provides the file.

When the browser gets the file, it parses it into instructions for display on the screen and makes other requests as needed (for supporting images, other media files, or framed documents). When all the files have been requested, received, and displayed, communication is ended. HTTP is a stateless protocol: each request stands on its own, and in its simplest form the connection is dismantled once the communication for that request is over. The next time a page is requested, the process starts over again, typically with a new connection being created.
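
As a rough illustration of what the browser and server actually say to each other, the Python sketch below builds the text of a simple GET request and reads the response code out of a canned reply. Nothing is sent over the network here; the host www.example.com, the path /index.htm, the sample response, and both function names are assumptions made up for the example.

```python
def build_get_request(host, path):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, file, protocol version
        f"Host: {host}",          # which web site on that server we want
        "Connection: close",      # ask the server to close the connection afterwards
        "",                       # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)

def parse_status_line(response_text):
    """Extract the numeric response code and its reason text from a response."""
    status_line = response_text.split("\r\n", 1)[0]
    _version, code, reason = status_line.split(" ", 2)
    return int(code), reason

request = build_get_request("www.example.com", "/index.htm")

# A made-up reply of the kind a healthy server would send back.
sample_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html>...</html>"
)
code, reason = parse_status_line(sample_response)
print(request)
print(code, reason)
```

A real browser sends more headers than this, but the request line and the numeric response code are the parts described above.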

It's worth noting that there are other forms of web communication. When a form is filled out and "submitted", it is usually sent using a POST command instead of a GET command. The difference is that GET data is appended to the URL, while POST data travels in the body of the request, so it is less visible and can be much larger. A GET command can be saved in a bookmark along with its data, while a bookmark for a POST command will contain only the address of the web page itself.
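
The difference is easier to see side by side. This Python sketch encodes the same made-up form fields once as a GET request and once as a POST request; the /order path, the host, and the field names are all hypothetical.

```python
from urllib.parse import urlencode

form_fields = {"name": "Pat Smith", "item": "book"}  # hypothetical form data
encoded = urlencode(form_fields)                      # spaces become "+", pairs joined by "&"

# GET: the form data rides along in the URL, so it shows up in bookmarks.
get_request = (
    f"GET /order?{encoded} HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "\r\n"
)

# POST: the URL stays clean and the form data goes in the body of the request.
post_request = (
    "POST /order HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(encoded.encode('ascii'))}\r\n"
    "\r\n"
    f"{encoded}"
)

print(get_request)
print(post_request)
```

Note that POST data is only hidden from the address bar. Unless the page is sent over a secure connection, it still crosses the network unencrypted.
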
When the page is secure (SSL), the messages sent to and from any web page that accepts HTTPS communication are encrypted. Remember - it's the protocol that matters the most. You can have an insecure form, but secured POSTing and handling of that form. Many web store-fronts use this technique: they build whatever form they want on their own site for purchasing products and then use a credit card processing web site to handle the SSL transaction and the credit card information. SSL can be expensive to support on your own web site, so it is often cheaper to go with a third-party company to help you out.

Getting your own web site

Once you have an idea for a web site and you have started to design it, you need to find a web site hosting company (like WebSavvy Solutions). There are many reasons for this:
A) Usually your ISP will forbid you to host web sites (at least commercial ones) from your home computer.
B) Your ISP usually gives you a dynamic IP address, so you can't tell your DNS server where to point.
C) Most people do not leave their computer on and connected to the Internet all the time, but people expect your web site to be available all the time.
D) Your ISP's upload speed is too slow for all but the most rarely visited web sites.
E) Your home computer is not very reliable or safe from hacker attack compared to a professional web hosting company.

Your next step is to get a domain name from a registered domain name registrar or a sub-agent. There are lots of places out there now. Originally Network Solutions had a monopoly on the service, but now it has been broadened out to many companies. You will need to pay $5 to $50 per year to have your chosen (available) name registered to you.
The third step is to set up your DNS entries with your registrar to point to the DNS servers associated with your web site host. You can specify 1 to 4 DNS servers in order to provide redundancy to the system. If the primary DNS server is down, the secondary server will be consulted, and so forth.
Your host (with the DNS servers) will set up their DNS to point to the server your web site is hosted on. This completes the sequence. Now when someone requests your domain name, they will be directed to the server you are hosted on.
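
The chain just described (registrar record, then the host's DNS servers, then your web server) can be modelled in a few lines of Python. This is only a toy simulation that follows the description above: the domain, the DNS server names, and the IP address are invented, and a server marked None stands in for one that is down.

```python
# What the registrar has on file: which DNS servers speak for the domain.
REGISTRAR_NAME_SERVERS = {
    "example.com": ["ns1.examplehost.net", "ns2.examplehost.net"],
}

# What each of the host's DNS servers knows. None marks a server that is down.
HOST_DNS_SERVERS = {
    "ns1.examplehost.net": None,
    "ns2.examplehost.net": {
        "example.com": "192.0.2.10",
        "www.example.com": "192.0.2.10",
    },
}

def find_web_server(name, domain):
    """Return (ip_address, dns_server_used), trying each listed DNS server in order."""
    for server in REGISTRAR_NAME_SERVERS[domain]:
        records = HOST_DNS_SERVERS.get(server)
        if records is None:
            continue  # this server is down, so consult the next one
        if name in records:
            return records[name], server
    raise LookupError(f"no DNS server could resolve {name}")

ip_address, used = find_web_server("www.example.com", "example.com")
print(ip_address, "answered by", used)
```

Because the primary server is marked as down in this made-up data, the answer comes from the secondary one, which is the redundancy the extra DNS entries are meant to provide.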

More about DNS

OK, now you ask "well, how does the DNS entry at my host get to the DNS server of my ISP and the other DNS servers around the world?" Good question. Every computer involved in the DNS system runs a program called a DNS Resolver. Every DNS Resolver has addresses for "Root" DNS resolvers. These Root resolvers point to computers that can resolve the Top Level Domains. For example, they would point to one that knows about ".com"s, which would in turn have the address of the system that knows about "domainname". A ".com" resolver is given the information about "domainname" when "domainname" is registered as "domainname.com".
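
Here is a small Python sketch of that walk from the root down to your host's DNS. It is a simulation using plain dictionaries, not a real DNS client, and every server name and address in it is invented for the example. Like the explanation above, it assumes a simple name of the form "domainname.com".

```python
# The root knows who handles each TLD.
ROOT_SERVER = {"com": "com-server"}

# Each TLD server knows which DNS server speaks for each registered domain.
TLD_SERVERS = {"com-server": {"domainname.com": "host-dns"}}

# The host's DNS server knows the actual IP addresses.
HOST_DNS = {
    "host-dns": {
        "domainname.com": "192.0.2.25",
        "www.domainname.com": "192.0.2.25",
    },
}

def resolve_from_root(name):
    """Follow root -> TLD -> host DNS and return (ip_address, steps_taken)."""
    name = name.lower().rstrip(".")
    labels = name.split(".")
    tld = labels[-1]
    domain = ".".join(labels[-2:])   # naive: assumes a one-part TLD
    steps = []

    tld_server = ROOT_SERVER[tld]
    steps.append(f"root: .{tld} is handled by {tld_server}")

    host_dns = TLD_SERVERS[tld_server][domain]
    steps.append(f"{tld_server}: {domain} is handled by {host_dns}")

    ip_address = HOST_DNS[host_dns][name]
    steps.append(f"{host_dns}: {name} is at {ip_address}")
    return ip_address, steps

ip, steps = resolve_from_root("www.domainname.com")
for step in steps:
    print(step)
```

Each level only needs to know about the level directly below it, which is why no single computer has to hold every name on the Internet.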

DNS servers cache commonly used pairs of domain name and IP address so they do not have to forward the request to another DNS server somewhere. (The DNS server should only cache the name as long as the "Time to live" specified by the host DNS server.) If the local DNS cache is missing the requested name, a root DNS server is contacted to start finding the definitive answer. The root DNS servers lead to the TLD DNS servers (one for COM, one for EDU, etc.). More than one computer will have the complete root DNS, and so each DNS server will have a list of valid root DNS servers. Every root DNS server has a complete list of the other root DNS servers for the TLD(s) it handles.
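
The caching rule can be sketched in Python as follows. This is an illustration of the idea only: the class name DnsCache, the 300 second "Time to live", and the address are assumptions, and a fake clock is used so the example does not have to wait for real time to pass.

```python
class DnsCache:
    """Remember name -> IP answers until their "Time to live" runs out."""

    def __init__(self, lookup, clock):
        self._lookup = lookup    # function: name -> (ip_address, ttl_seconds)
        self._clock = clock      # function: () -> current time in seconds
        self._entries = {}       # name -> (ip_address, expiry_time)
        self.misses = 0          # how often we had to ask another DNS server

    def resolve(self, name):
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]      # still fresh: answer from the cache
        # Missing or expired: ask upstream and remember the answer.
        self.misses += 1
        ip_address, ttl = self._lookup(name)
        self._entries[name] = (ip_address, now + ttl)
        return ip_address

fake_time = [0]  # a clock we can move by hand

def upstream(name):
    """Stand-in for forwarding the request to another DNS server."""
    return "192.0.2.25", 300   # answer plus a 300 second time to live

cache = DnsCache(upstream, lambda: fake_time[0])
cache.resolve("www.domainname.com")   # time 0: not cached yet, asks upstream
fake_time[0] = 100
cache.resolve("www.domainname.com")   # time 100: answered from the cache
fake_time[0] = 400
cache.resolve("www.domainname.com")   # time 400: entry expired, asks upstream again
print("upstream lookups:", cache.misses)
```

A shorter time to live means changes spread faster but the host DNS server gets asked more often; a longer one does the opposite.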

Changes


When you change the web pages on your site (on the host computer) the changes are recognized as soon as the next person requests those files.

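When a DNS entry is changed it takes more time. This happens when your web site is being moved to a different IP address, but you are staying with the same host. The host DNS server records get changed, and other DNS servers pick up the new address once their cached copies expire (the "Time to live" described above).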

When a DNS entry changes at the registrar, you are choosing to use a different set of DNS servers to specify which IP address your host computer is at. You would change this if you are switching host companies.