Chapter One

We're going to start out with some of the history of the Internet and its underlying technologies, including some background on how computers work and what's actually going on when you use a browser. You don't need this stuff to create a webpage; you can go buy FrontPage or PageMill or Dreamweaver and build webpages for years without knowing a single HTML tag. But knowing what the language does and how it does it will make you a better designer. After this class you'll be able to write a web page from scratch and make it do what you want it to, not just what the programmers at Microsoft think you might want.

What the Internet is

i. History of the Internet and WWW

The Internet started out as an experiment by the US DARPA, trying to develop a computer network that could survive losing one or more parts. Why? Because the military was getting more and more computerized, and they wanted to make sure they could stay connected in case of nuclear war...

The experiment included most of the major universities, and the academics quickly figured out how to use the infant Internet to exchange information among themselves. Eventually Tim Berners-Lee at CERN, the European particle physics lab, developed HTML, and some kids at the University of Illinois at Urbana-Champaign wrote Mosaic, the first widely popular graphical browser and the ancestor of Netscape.

Early versions of HTML and Mosaic were pretty lame, and the skills and equipment needed to connect to the Internet were fairly complex, so in the late '80s and early '90s the 'Net was still mostly used by the government and academics. Then the World Wide Web, a new way of sharing linked documents over the Internet, took off, helped along by cheap PCs, the first commercial ISPs and online communities such as AOL. Suddenly anyone could access the Web with the click of a couple of buttons.

Boom. Instant explosion of possibilities. Major corporations and individuals could now reach the same enormous audiences. Right now there are an estimated 200 million people worldwide with access to the web, and that number is growing like wildfire. A new website is created every 7 seconds.

ii. How the bits connect

The World Wide Web is a collection of computers owned by governments, corporations, universities and individuals all over the world that are joined together by the Internet and are all running the same set of communications protocols. That means that even though they are all sorts of different computers speaking all sorts of different computer languages, they use the same set of instructions for talking among themselves.

Each type of computer runs its own operating system. Macs use Mac OS, PC clones mostly use Windows, Suns use Unix, and so on. The WWW is a way for all these different types of computers to communicate and share text, images and other files.

The Web standards, such as HTML, are defined by a group called W3C, the World Wide Web Consortium, which is made up of major hardware and software manufacturers, government bodies and universities. The underlying Internet protocols, such as TCP/IP, are defined by a separate group, the Internet Engineering Task Force (IETF). Now that the basics are in place, they probably won't change substantially in the near future.

iii. Server/Client Relations

Web pages live on computers called "servers", which do very little but sit there and serve up files like a waiter.

Your browser, Netscape or Internet Explorer, lives on your computer at home. It's called a "client" because it's who the "server" serves.

How do the two come together?

When you click on a link or a bookmark, your browser sends a message out through your ISP saying "The boss wants this file". The ISP's computer looks up the address and passes the request along to the server where the web page lives. That computer sends the HTML files, images and whatever else makes up the webpage back to your ISP, which sends them to your computer, where your browser reads them and displays the page properly for you.
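To make "The boss wants this file" a little more concrete, here is a rough sketch in Python (a language chosen just for illustration; it isn't part of this class) of the kind of message a browser sends. The host and file names are the made-up ones used elsewhere in this chapter, the function name is hypothetical, and nothing is actually sent over the 'Net; the code only builds the text of the request.

```python
# Sketch only: builds the text of a simple HTTP request without sending it.
# The function name and the example host/path are assumptions for illustration.

def build_request(host, path):
    """Return the request text a browser might send for http://host/path."""
    lines = [
        "GET " + path + " HTTP/1.0",  # "please send me this file"
        "Host: " + host,              # which website on that server we want
        "",                           # a blank line marks the end of the request
        "",
    ]
    return "\r\n".join(lines)         # HTTP lines end with carriage return + line feed

request = build_request("www.domainname.com", "/foldername/filename.html")
print(request)
```

A real browser adds more lines (what kind of browser it is, what file types it accepts, and so on), but the first line is the heart of it.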

Once the file or image is in your computer, it's stored in a "cache" (pronounced 'cash'), where your browser can get to it much more quickly if you need it again. A good example is the image that many webpages use as a background. It's only transferred from the server the first time it's called for. After that, if any other pages use it, your browser just loads it from the cache rather than contacting the server again.
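The idea of a cache can be sketched in a few lines of Python. This is a toy model, not how any particular browser does it: the "server" here is a stand-in function that just counts how often it gets called, and the image address is made up.

```python
# Toy model of a browser cache. No network is used; fetch_from_server is a
# hypothetical stand-in for a real download.

cache = {}          # files we already have, keyed by their address
server_trips = 0    # how many times we had to go back to the server

def fetch_from_server(url):
    """Pretend to download a file and count the trip."""
    global server_trips
    server_trips += 1
    return "contents of " + url

def get_file(url):
    """Return the file, using the cache when we already have a copy."""
    if url not in cache:
        cache[url] = fetch_from_server(url)
    return cache[url]

background = "http://www.domainname.com/images/background.gif"
get_file(background)   # first page: transferred from the server
get_file(background)   # second page: loaded from the cache
print(server_trips)    # prints 1
```

Two pages asked for the same background image, but the server was only contacted once.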

iv. How do we find what we're looking for?

Computer programmers just love obscure terms for simple things, and even more obscure acronyms for those terms. URL stands for Uniform Resource Locator. Now that you know that, you can promptly forget it. What it means is Internet Address.

In English, when we want to say where something is located, we give it an address in a form that goes from the specific to the general. When you send a letter to your mom, you write her name first, then the building number, street name, city, and so on. With computers, it's the other way around. We give websites and webpages addresses that start general, then get more specific.

The average URL looks something like this:

http://www.domainname.com/foldername/filename.html

This is called a path, or a pathname, and tells the computer what road to take to find what we want.

The first part (http) stands for HyperText Transfer Protocol, which is a complex term for a simple idea. It tells the computers involved which set of rules to use, so they know how to ask for the file and how to move it from one computer to another.

Then comes (www.domainname.com). The 'www' stands for World Wide Web and is the prefix for most domains, though you will see other prefixes, like 'members'. It means that something is accessible to the Web, though not all things that are accessible are necessarily given that prefix.

The 'domainname' part is a name chosen by the owner of the website and registered with InterNIC. InterNIC is the organization that keeps track of ALL the domain names in the country and where they are located on the web. You can choose anything you like, and if someone else hasn't already registered it, it's yours (at least as long as you go on paying the yearly fee).

The 'com' part is what's called a 'top level domain'. When the 'Net was young and they didn't realize just how huge the whole thing was going to become, they decided there were 7 different types of entities that would be using it. Business or commercial sites (.com), colleges and universities (.edu), the US government (.gov), the US military (.mil), networks and ISPs (.net), nonprofit organizations (.org) and a general category that included state and local governments, libraries, museums and non-4-year schools (.us, for United States). All the other countries got 2-letter codes as well. This was an organizational thing, to try to make finding data easier and more rational. So of course it didn't work for long...

Very recently it was decided to add 7 new top level domains. Soon you will be able to register domain names with the extensions .biz for businesses and corporations; .info for information-based services such as newspapers, libraries, etc.; .name for individuals' and personal websites; .pro for professions such as law, medicine, accounting, etc.; .aero for services and companies dealing with air travel; .coop for co-operative organizations; and .museum for museums, archival institutions, and exhibitions.

Those three parts make up the "domain name".

After that can come some more specific things, though not always. But if you don't have your own domain, or you have more than one site living on your domain, you have to say what folder the site is in.

And last of all comes the individual file name. If there's nothing after the last slash (a forward slash, not a backslash), the server automatically looks for a default file, usually named 'index.html'. That's just how most servers are set up.

So it's just like an address out here in the real world, only backwards.


Mr. Filename
On Foldername Street
In Domain Land, WWW
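If you'd like to see a computer take an address apart the same way, here is a small sketch using Python's standard urllib.parse module on the sample URL from this section. Python is just an illustration here, and the variable names are my own.

```python
from urllib.parse import urlsplit

url = "http://www.domainname.com/foldername/filename.html"
parts = urlsplit(url)

protocol = parts.scheme        # the "http" at the front
domain = parts.hostname        # everything between "//" and the next slash
labels = domain.split(".")     # the three parts of the domain name

# Split the rest at the last slash: folder on the left, file name on the right.
folder, _, filename = parts.path.rpartition("/")

print(protocol)   # http
print(labels)     # ['www', 'domainname', 'com']
print(folder)     # /foldername
print(filename)   # filename.html
```

Reading the output from top to bottom goes from general to specific, which is exactly the "backwards" address described above.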

v. Connections and Downloading

Connecting to the 'Net is just like making a normal phone call. The computers, yours and the ISP's, are just like the telephones in your house and at your friend's. Your browser is you, and your ISP's server software is your friend. The data streaming back and forth is the sound of you and your friend gossiping about what you did over the weekend. The line stays open as long as you don't hang up, even when you aren't talking. When you click on a link, or specify a URL, the ISP fetches it and sends it to you, just like your friend going to the bookshelf, picking up a book and reading out a passage to you.

Computers talk to each other in binary, endless beeps representing 1's or 0's. So they measure things in bits. A bit is one 1 or 0. A bit is shown by a lower-case b. Eight bits makes a byte. A byte is represented by a capital B. One byte is equal to one character, an "a" or "b". One thousand bytes is 1KB. One million bytes is a Megabyte, 1MB. And a billion bytes is a Gigabyte, 1GB. To further confuse the issue, there's also the abbreviation "Kb" (notice the use of lower case letters), which stands for "Kilo-bits" or one thousand bits.

This sentence is about 250 bits.
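You can check that claim yourself. The sketch below (in Python, assuming one byte per character as described above) counts the characters in the sentence and multiplies by eight.

```python
sentence = "This sentence is about 250 bits."

characters = len(sentence)   # every letter, space and period counts
bits = characters * 8        # eight bits to the byte, one byte per character

print(characters)   # 32
print(bits)         # 256
```

So "about 250" is pretty close: the exact figure is 256 bits, or 32 bytes.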

Modems are rated by speed. The average these days is 56Kb, which means they can transfer data at a maximum speed of 56,000 bits per second. Of course, just because you have a 56K modem doesn't mean you're actually going to get a 56K connection, or that the server you're connecting to can handle that speed. So when you're building a webpage, smaller is better. Plain text is easy. One character takes 8 bits, which is called a byte. So a 56K modem can transfer at most 7,000 characters per second, roughly 1,200 words or 4 to 5 pages of text. Images take a lot more. A simple graphical button runs around 1 or 2KB. Photos, even small ones, can take up 15KB or more, sometimes *lots* more.

Even if you have a 56K modem, and your ISP says you've gotten a 56K connection, download speeds are actually much lower. This is because server speeds tend to be lower, and can be made worse by high traffic sites.
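Here is the same arithmetic as a small Python sketch, so you can plug in your own file sizes. The function name is made up for this example, and the results are best-case times that assume the full 56,000 bits per second, which, as noted above, you will rarely get.

```python
def seconds_to_download(size_in_bytes, bits_per_second=56000):
    """Best-case transfer time: convert bytes to bits, then divide by the speed."""
    return size_in_bytes * 8 / bits_per_second

text_page = seconds_to_download(7000)     # 7,000 characters of plain text
button = seconds_to_download(2000)        # a 2KB graphical button
photo = seconds_to_download(15000)        # a 15KB photo

print(text_page)          # 1.0
print(round(button, 2))   # 0.29
print(round(photo, 2))    # 2.14
```

One small photo costs as much download time as two full seconds' worth of text, which is why keeping images small matters so much.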

vi. Off-line Browsing

HTML was developed to share information over the web, but it's not necessary for an HTML file to be on the 'Net for you to look at it. You can create or save HTML files on your hard drive and view them with a browser. Open your browser, but don't log on. Then use File-Open to find and open any HTML, GIF or JPG file. In fact, most of the materials for this class will be presented as HTML files on a floppy disk, rather than as printed pages.
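As a quick demonstration that no connection is needed, the Python sketch below writes a tiny HTML file to a scratch folder on your hard drive and prints its local address. The file name and page contents are made up for the example, and Python is only one of many ways to create the file; a plain text editor works just as well.

```python
import tempfile
from pathlib import Path

# A very small web page (hypothetical contents, just for illustration).
page = """<html>
<head><title>My first page</title></head>
<body>
<p>Hello from my hard drive!</p>
</body>
</html>
"""

folder = Path(tempfile.mkdtemp())    # a scratch folder; any folder on your drive works
html_file = folder / "hello.html"    # hypothetical file name
html_file.write_text(page)

# A local file has an address too; it starts with "file:" instead of "http:".
local_address = html_file.resolve().as_uri()
print(local_address)
```

Paste the printed address into your browser, or use File-Open and browse to the file, and the page displays without ever logging on.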


