File Names, URLs and Organization

Interpreting URLs

When creating a web site there are a number of details to consider and conventions to follow in organizing your files. The first thing you need to understand is how a Uniform Resource Locator (URL) is constructed. Here's an example URL:

http://www.carleton.edu/campus/ITS/index.html

This URL (like all URLs) specifies a single unique file that exists on the web. If we look at the various parts of this long string, it breaks down like this:

The first bit, "http://", is standard web boilerplate. It names the protocol (HTTP) used to fetch the file and essentially means "this is a web page", which probably doesn't come as a surprise to anyone (except maybe the computer). The next bit is "www.carleton.edu". This is the name of a particular computer (this one lives in the basement of the CMC). The next two bits, "campus/ITS", refer to folders that exist on the computer named www.carleton.edu. The folder "ITS" is inside the folder "campus" (hence the order given). The "/" is just a separator so that the folder names don't run together. Finally, we end with the name of the particular file. This file is named (for reasons we'll learn in a moment) "index.html".

Extrapolating from this example you should be able to glean from any URL you see the name of the server, the name of the file in question, and the exact hierarchy of folders you'd have to navigate to find that particular file (assuming you had direct access to the server in question--which usually you don't).
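If you'd like to see the same breakdown done by a program, here is a minimal Python sketch that uses the standard library's urllib.parse module. The variable names (server, folders, file_name) are just illustrative choices, not anything the web itself defines.

```python
from urllib.parse import urlsplit

url = "http://www.carleton.edu/campus/ITS/index.html"
parts = urlsplit(url)

scheme = parts.scheme   # the "http" boilerplate at the front
server = parts.netloc   # the name of the computer

# Everything after the server name is the path: folders, then the file.
segments = parts.path.strip("/").split("/")
folders = segments[:-1]   # the folder hierarchy, outermost first
file_name = segments[-1]  # the file itself

print(scheme)     # http
print(server)     # www.carleton.edu
print(folders)    # ['campus', 'ITS']
print(file_name)  # index.html
```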

Here's another very similar example with one important difference.

http://www.carleton.edu/campus/ITS/

This is just like the address above except we've left off the name of the file. So instead of pointing to a particular file, this URL points to a folder (which may contain many files). The convention on the web is that if one specifies a URL that indicates only a folder (and not a particular file), the web server should hand back the file named "index.html" that exists in that folder. So this URL would (if entered into a web browser) give back the very same page we mentioned first. Clearly the name "index.html" is very special in the world of web page creation.
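The folder convention can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular web server is written: the function name file_for_path is made up for this example, and real servers let the administrator configure the default file name.

```python
def file_for_path(url_path: str, default_name: str = "index.html") -> str:
    """Return the file a server would look for, given the path part of a URL.

    Hypothetical helper: a path ending in "/" names a folder, so the
    default file name is appended; any other path is returned unchanged.
    """
    if url_path.endswith("/"):
        return url_path + default_name
    return url_path


folder_request = file_for_path("/campus/ITS/")
file_request = file_for_path("/campus/ITS/index.html")

print(folder_request)  # /campus/ITS/index.html
print(file_request)    # /campus/ITS/index.html
```

Both requests end up at the same file, which is why the two example URLs give back the same page.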

Picking File Names

This brings us to the subject of creating your own web site and choosing appropriate names for your files. As is clear from our example above, it is good practice to give the "main" page in any folder the name "index.html". That way people can just refer to the folder and automatically get back the page that is central to that group of pages. There are several rules and conventions that you should follow in naming web pages beyond your initial "index.html" page:

As the number of pages you've created mounts, you'll find it's convenient to split your files across multiple folders. One common scheme is to make a folder called "images" where all your GIFs and JPEGs can sit without cluttering up your HTML files. As soon as you start spreading your files across multiple folders, you'll find a need to delve a little deeper into the mysteries of URLs in order to link everything together.

Absolute and Relative URLs

The URLs we mentioned at the top of the page are absolute addresses. They give complete information about where a file is to be found. However, when you're building your web pages and need to link between them (and also to images that need to be embedded in them), you'll find another form of URL more convenient: the relative URL. The relative URL specifies the location of a file with respect to the file that is referring to it. For example, if an "index.html" file in the folder http://www.carleton.edu/campus/ACNS had a link to a file named "staff.html" that lived in the very same folder, then the link could simply use "staff.html" as the address instead of specifying the absolute URL "http://www.carleton.edu/campus/ACNS/staff.html".

There are advantages to this beyond just saving typing time. Imagine ACNS changed its acronym to ITS. Its web pages would then live at http://www.carleton.edu/campus/ITS. If we'd specified the link in index.html pointing to staff.html using the absolute address, the URL would need to be changed (the file edited). However, if we'd used the relative address, no fixes would be necessary--the relative location of the two files is still the same, even though the absolute location has changed.

So by using relative URLs in links between closely related files on the same server, those files can be moved en masse to new locations without breaking their internal links. (Of course, people outside Carleton who had created links using the now-defunct absolute URL would still be out of luck.)
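Here is a small Python sketch of why the move is harmless. It uses urljoin from the standard library, which combines the address of the page containing a link with the relative address in that link. The variable names are only for illustration.

```python
from urllib.parse import urljoin

relative_link = "staff.html"  # what we typed into index.html

# The page that contains the link, before and after the folder is renamed.
old_page = "http://www.carleton.edu/campus/ACNS/index.html"
new_page = "http://www.carleton.edu/campus/ITS/index.html"

old_target = urljoin(old_page, relative_link)
new_target = urljoin(new_page, relative_link)

print(old_target)  # http://www.carleton.edu/campus/ACNS/staff.html
print(new_target)  # http://www.carleton.edu/campus/ITS/staff.html
```

The link text never changed, yet it points at the right file in both locations.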

But what about the aforementioned "images" folder? What if we had an image file named "logo.gif" that sat inside the images folder, which in turn was inside the ACNS folder (the images folder sits alongside the index.html file)? To reference that image in index.html we'd need to use the relative address "images/logo.gif". This URL indicates we should go to the folder where the linking file sits (in this case http://www.carleton.edu/campus/ACNS) and then travel to the images sub-folder and look for a file named "logo.gif" within that folder. Carrying this example a bit further, the "index.html" file that lives up in "http://www.carleton.edu/campus" would refer to this particular image using the relative address "ACNS/images/logo.gif".
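As a quick check, this Python sketch resolves both relative addresses with the standard library's urljoin and confirms they lead to the same image. Note that each starting address includes the file name "index.html", so that the folder containing the linking file is unambiguous.

```python
from urllib.parse import urljoin

acns_index = "http://www.carleton.edu/campus/ACNS/index.html"
campus_index = "http://www.carleton.edu/campus/index.html"

# The same image, referenced from two different starting points.
from_acns = urljoin(acns_index, "images/logo.gif")
from_campus = urljoin(campus_index, "ACNS/images/logo.gif")

print(from_acns)    # http://www.carleton.edu/campus/ACNS/images/logo.gif
print(from_campus)  # http://www.carleton.edu/campus/ACNS/images/logo.gif
```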

Likewise, if the "index.html" file in "http://www.carleton.edu/campus/ACNS" needed to link to the index.html file in "http://www.carleton.edu/campus", it would use the relative URL "../index.html". The two periods tell the browser to look "up one level in the folder hierarchy". By extension, if we put the address "../../index.html" in a file in the http://www.carleton.edu/campus/ACNS folder, it would point to a file named "index.html" at the very top level under http://www.carleton.edu/. This very same file could be accessed using the absolute address "http://www.carleton.edu/" (remind yourself why this address will work exactly the same as "http://www.carleton.edu/index.html").
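The "up one level" rule can be tried out in Python as well. The sketch below resolves the two addresses with urljoin, then works in the other direction using posixpath.relpath to figure out which relative address leads from one folder to a given file. Both functions are in the standard library; the variable names are illustrative.

```python
import posixpath
from urllib.parse import urljoin

acns_index = "http://www.carleton.edu/campus/ACNS/index.html"

# Each ".." climbs one folder before looking for the file.
up_one = urljoin(acns_index, "../index.html")
up_two = urljoin(acns_index, "../../index.html")

print(up_one)  # http://www.carleton.edu/campus/index.html
print(up_two)  # http://www.carleton.edu/index.html

# Going the other way: given the folder we are in and the file we want,
# work out the relative address to type into the link.
to_campus = posixpath.relpath("/campus/index.html", start="/campus/ACNS")
to_top = posixpath.relpath("/index.html", start="/campus/ACNS")

print(to_campus)  # ../index.html
print(to_top)     # ../../index.html
```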

Confused yet?

It all seems a bit silly at first. But if you stare at the examples for a while and pay attention to URLs while you surf and create pages, it will soon become second nature.

 


Sean Fox sfox@carleton.edu Carleton College June 2000