HTML Primer for CS115: Introduction to the Internet

Author: Scott Wegener

wegster@bitsmart.com

Disclaimer and Purpose of this Document

This document was first created as a text handout for my Internet class and is not a complete HTML reference. Not all text manipulation tags are listed yet, frames and tables are not yet covered, and I do not distinguish HTML 3.0 tags from those of earlier versions. Because this document steps through the creation of a home page, it assumes that the reader is on a Novell network running Windows for Workgroups. If you use this document as a reference, ignore any references to directories, and use a .html extension rather than .htm if you are putting your page on a UNIX host. This is a work in progress and should be completed in time. Bearing that in mind, feel free to read on and enjoy! Any comments or feedback are appreciated.

1. Overview

1.1 What is HTML Anyway?

HTML (HyperText Markup Language) is an application of SGML (Standard Generalized Markup Language). Other popular document formatting languages include LaTeX and PostScript, which are not SGML based and are both much more complicated than HTML. HTML is used to create World Wide Web (WWW) pages on the Internet and on intranets and LANs. The current version of HTML is 3.0, with both Netscape and Microsoft having added custom features on top of HTML 3.0. HTML is composed of tags, or keywords, which WWW clients/browsers interpret to format text and images properly.

1.2 Background and Web Server Platforms

The Web and the Internet in general owe much to the UNIX community and X-Windows. The majority of servers on the Internet today are still UNIX based, although Windows NT and a few other platforms are becoming more popular. With the increasing popularity of the Internet and the Web with home users, and the availability of inexpensive Internet software and Web browsers, people are now connecting to the Internet from a variety of platforms including Windows 3.x, Windows NT or '95, Macintosh/System 7, and of course UNIX based systems.

Several problems arise when dealing with a multiple platform environment, most of which are handled for users by the TCP/IP protocol. However, there are some issues that the prospective Web author needs to be aware of, such as the server settings of the site where the document is to reside. A WWW server has several options which are configurable by the webmaster. One is the directory from which to serve documents when a client requests a URL with ~<username> in the path, as in http://bitsmart.com/~wegster; another is the default document name to look for. Most servers use a default directory named public_html, and a default document name of index.html. This means that when you use your browser to open the document http://bitsmart.com/~wegster, the server looks in the home directory of user wegster, in the subdirectory public_html, for a file named index.html.

Note that Windows 3.x does not allow filenames with an extension longer than 3 characters (as in command.com, test.doc, etc.), so if you write your files on a Windows 3.x or DOS system, you will need to name your files with a .htm extension. This is fine for writing the file and using a browser to check it, but be aware that if you put your page(s) onto an Internet site you will need to rename the file with the proper .html extension and create a public_html subdirectory for your HTML pages. If the particular system you place your page on is configured unusually and you find that http://<host>/~<username> doesn't seem to work, you will have to contact the system administrator to find out the default directory and/or file name.


2. Creating Your First HTML Document

2.1: Write the text of the document

Start off by using a text editor or word processor that is able to save in text mode. There are several HTML editors available both commercially and over the Internet, but trying to use most of them before knowing the basics of HTML can be a frustrating experience. Type the text of your document into the editor, remembering not to use word processing features like fonts, bold, italics and styles; these will not be saved and are of no use for an HTML document. Include the entire textual content of the document at this point; don't worry about how to make words into links or how to insert graphics, that comes later. Include text from which you can later make links to URLs, such as your favorite bookmarks, the college's homepage, your favorite search engine or zine, or any other favorite web sites. Separate your paragraphs, and likewise your section headings, with a single blank line. Any run of spaces, tabs, or blank lines in an HTML document is treated as a single space, so at this point the blank lines will only be used by you to determine where to add your HTML tags later on. Try to keep the length to a page or two; if you want to make it longer, consider breaking it up into separate pages for different subjects. Once you are happy with the content of your document, save the file as a text file into your home directory as index.htm.

2.2: Learning Basic HTML

2.2.1: Text and Tags

HTML documents are composed of two things: text and tags. Images, sounds, animations, and other elements are all referenced by using the appropriate HTML tags and URLs or filenames. Most HTML tags consist of an 'opening tag' and an 'ending tag.' The opening tag marks the start of that tag or style, and the ending tag marks where the browser should stop treating the text according to the opening tag. Example:
<TITLE> marks the start of the page title, which will be displayed at the top of the browser window. The browser treats everything between <TITLE> and </TITLE> as text to be displayed in the browser title bar. If you did not close off the TITLE section, the browser would try to display the entire document as the title, an undesirable result. Ending tags always consist of the opening tag name prefixed by a forward slash (as in </TITLE>), and all HTML commands and tags are surrounded by the special characters < and >.
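
As a small sketch of the opening tag/ending tag pattern just described (the title text is only a placeholder, not something taken from your page):

```html
<TITLE>My Home Page</TITLE>
```

Everything between the two tags is the title; the slash in </TITLE> is what tells the browser where the title stops.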

2.2.2: The HTML, HEAD, and BODY tags

The simplest HTML document consists of the following:

<HTML>
<HEAD>
head material
</HEAD>
<BODY>
body material
</BODY>
</HTML>

The HEAD section of a document may contain several special tags, but the main one that is used is the TITLE section. The other tags are rarely if ever used in the majority of HTML documents on the Web. The TITLE text that you specify is what will appear in the title bar of the browser's window when you open the document. You should make this a short line such as "My Home Page" or something only a few words in length. The way you do this is to insert the begin <TITLE> tag and the end </TITLE> tag _around_ the text in your document that you want to be the title. Example:

<HTML>
<HEAD>
<TITLE>Welcome to Scott's Home Page!</TITLE>
</HEAD>
<BODY>
body material
</BODY>
</HTML>

If you have already saved your text file from a text editor, you can insert the above tags around your text, with the text in your file going where it says body material up above. Give the document a title if you didn't make one in your text file already.

The BODY section of a document is where all of the document's information is displayed. This includes all plain text, images, and links of any kind. The <BODY> tag takes several optional arguments for setting the background color, link color(s), and other properties but ignore these for now; they will be summarized in the reference section and discussed in class.


2.2.3: Headings and Sections

The next step in converting your text document into a 'real' HTML document is to separate your page into sections where appropriate. Usually content is divided up into like subjects or links. Each section needs a title, and this is accomplished through the use of HTML headings. Headings are displayed in a bold large font and are denoted by the use of a start and end tag. There are 6 levels of headings, with level 1 being the largest and level 6 the smallest. The start tag therefore is <Hn>, where n is the level of the heading, and the end tag is </Hn>. You should use the <H1> tag sparingly, usually only for your page title. Example:

<HTML>
<HEAD>
<TITLE>Welcome to Scott's Home Page!</TITLE>
</HEAD>
<BODY>
<H1>Scott's Home Page</H1>
</BODY>
</HTML>

Note that HTML tag names are case insensitive, and extra spaces and line breaks between tags are ignored: the tags could have been written as <h1> and </h1>, or placed on different lines, with the same result.

Next we want to choose the appropriate points in the document to insert other section titles. It is considered bad form to skip heading levels, so the next group of headings should be <H2> if your top level heading was <H1>. The use of headings is similar to an outline or table of contents in a book- the further down in a document the smaller the text. If different sections are of equal importance and not a subdivision of a parent section then use the same level for these. Go through your document now and find your headings; you may need to create some section titles/headings. Surround these with <H2> and </H2> tags. By default a line break will be inserted after each heading.
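
A minimal sketch of how the heading levels might be laid out, assuming a page with two sections of equal importance (the section names "About Me" and "My Favorite Links" are made up for illustration):

```html
<BODY>
<H1>Scott's Home Page</H1>
<H2>About Me</H2>
A few sentences about me.
<H2>My Favorite Links</H2>
A few sentences about my favorite links.
</BODY>
```

Both sections use <H2> because neither is a subdivision of the other; a subsection inside "About Me" would use <H3>.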


2.2.4: Line Breaks

Most browsers will automatically wrap text to fit the browser window by default. For most lines this is fine, but for times that you want specific formatting or only specific text on a line you need to manually insert line breaks. This is done with a single <BR> tag at the end of each line you want to force a line break. Example:

<BODY>
<H1>Scott's Online Resume</H1> Scott Wegener<BR>
101 Some Street<BR>
Bremerton, WA 98312<BR>
(360)111-1111<BR>
wegster@bitsmart.com<BR><BR>
<H2>Objectives:</H2>

will produce this:

Scott's Online Resume

Scott Wegener
101 Some Street
Bremerton, WA 98312
(360)111-1111
wegster@bitsmart.com

Objectives:

Without using a <BR> tag the text looks like this:

Scott's Online Resume

Scott Wegener 101 Some Street Bremerton, WA 98312 (360)111-1111 wegster@bitsmart.com

Objectives:


Start up your browser now and load your home page. This is accomplished by using file:///g:/students/<your account directory>/index.htm as the URL to open, substituting your own path after file:///g:. Look through the document; it may not be pretty quite yet, but make sure that your title in the browser window and your headings are correct and properly closed. If you forget to close a heading or other tag, that tag will be applied through to the end of the document. Go through your document now and see if there are any lines on which you want to force a line break, and put the <BR> tag at the end of each of these lines.


2.2.5: Styles

Styles for HTML are similar to the text formatting commands you have available on your favorite word processor. You can make text bold or italic, change the font, underline text, use subscripting and other text styles as well. There are two types of styles available in HTML: physical and logical. Both types can be used to produce the same effects, as we will see.

A physical style applies its tag/style without regard for user settings in the Options/Preferences section of the user's browser. The following tags are available for physical styles:
Physical Styles
Tags:                     Description:
<B>text</B>               renders the text in a boldface font
<I>text</I>               italicizes the text
<TT>text</TT>             renders the text in a typewriter font
<STRIKE>text</STRIKE>     renders the text in strikeout
<U>text</U>               underlines the text
<SUB>text</SUB>           subscripts the text
<SUP>text</SUP>           superscripts the text
<SMALL>text</SMALL>       renders the text in a font smaller than the current text
<BIG>text</BIG>           renders the text in a font larger than the current text
<BASEFONT SIZE=value>     sets the base font size. Value of 1-7, 1=small,
                          7=large; the default of 3 is normal. This tag has
                          no closing tag.
<FONT [SIZE=[+ | -]value] [COLOR=#<red><green><blue>] [FACE=<face>]>text</FONT>
                          sets the current font size. If + or - is used, the
                          size is relative to the current font size. COLOR
                          sets the text color, with red, green, and blue each
                          given as two hexadecimal digits, and FACE sets a
                          font type like "Roman". If the font type doesn't
                          exist, the FACE value is ignored. Note that to
                          reset the size you need a </FONT> tag.
<BLINK>text</BLINK>       makes the text blink

Note that some browsers allow nested tags and some don't: <i><b>text</b></i> may produce italicized bold text, or may apply only the inner tag (bold).
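
To make the physical styles concrete, here is a short sketch that uses several of them in ordinary body text. The sentences are invented for illustration, and the exact appearance will depend on the browser:

```html
This word is <B>bold</B>, this one is <I>italic</I>,
and this one is <TT>typewriter</TT>.<BR>
Water is H<SUB>2</SUB>O, and 2<SUP>3</SUP> is 8.<BR>
<FONT SIZE=+2 COLOR="#FF0000">Large red text</FONT><BR>
<I><B>Bold italic, on browsers that allow nested tags</B></I>
```

Note that nested tags are closed in the reverse order from which they were opened.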

Logical styles are also used to change font characteristics, but each browser may treat a logical style differently. Emphasized text (<EM>) is almost universally rendered as italicized but may not be on some older browsers, and some browsers may allow user configuration for each logical style. Logical tags include:
Logical Styles
Tags:                     Description:
<EM>text</EM>             emphasizes text, typically italics
<STRONG>text</STRONG>     used for strong emphasis, typically bold
<CODE>text</CODE>         used for program code, typically a fixed width font
<SAMP>text</SAMP>         used for sample output, typically a fixed width font
<KBD>text</KBD>           used for keyboard input (text the user types),
                          typically a fixed width font
<VAR>text</VAR>           used for a variable like a user id, typically italics
<DFN>text</DFN>           used for the defining instance of a term; rendering
                          varies by browser, often italics
<CITE>text</CITE>         used for citations, typically italics
Note that the exact rendering of each tag above may differ from one browser to another.
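
The following sketch shows how the logical styles might be used in running text. All of the sentences, the command name, and the book title are placeholders made up for illustration:

```html
<EM>Please</EM> read this: <STRONG>save your work often!</STRONG><BR>
At the prompt, type <KBD>dir</KBD> and press Enter.<BR>
The program prints <SAMP>File not found</SAMP> if
<VAR>filename</VAR> does not exist.<BR>
In C, <CODE>x = x + 1;</CODE> adds one to x.<BR>
See <CITE>My Favorite HTML Book</CITE> for more details.
```

Each tag says what the text means rather than how it should look, so the browser (or the user's settings) decides the final appearance.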


2.2.6: Other Text Tags

The following tags are for text manipulation and are considered neither logical nor physical.
Tags:                                   Description:
<BR [CLEAR=[LEFT | RIGHT | ALL]]>       line break. CLEAR causes the next line to
                                        be printed when the left, right, or both
                                        margins are clear, useful for when the
                                        next line should start after an aligned
                                        image.
<HR>                                    inserts a horizontal rule (line) for a break
<NOBR>text</NOBR>                       disallows line breaks, used for making
                                        wide lines
<CENTER>text</CENTER>                   center justifies the text between the tags
<P>text lines</P>                       paragraph break, the </P> is optional.
                                        Similar to a <BR> but drops down 2 lines
                                        instead of only 1.
<P [ALIGN=[LEFT | RIGHT | CENTER]]>paragraph</P>
                                        justifies the entire paragraph
<PRE>text</PRE>                         preformatted text, uses an opening and
                                        closing line break, useful for when you
                                        want to have several spaces between words
                                        for format. This section is using
                                        preformatting; select View Document on
                                        your browser to see the tags.
<BLOCKQUOTE>text</BLOCKQUOTE>           indents text on both sides and puts a
                                        blank line below
<WBR>                                   allows a line break within a <NOBR> area
<ADDRESS>address</ADDRESS>              used for mail or email addresses,
                                        typically rendered in italics
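
As an illustration of several of the tags in the table above working together, here is a rough sketch of a page fragment. The heading, the small table of names, and the quoted sentence are placeholders:

```html
<H2>Contact</H2>
<P ALIGN=CENTER>This paragraph is centered in the browser window.</P>
<HR>
<PRE>
Name        Extension
Scott       1234
</PRE>
<BLOCKQUOTE>This quoted text is indented on both sides.</BLOCKQUOTE>
<ADDRESS>wegster@bitsmart.com</ADDRESS>
```

Inside the <PRE> section the extra spaces between the columns are kept as typed, which is why it is handy for simple column layouts.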

Note that there are other tags as well, but they are beyond the scope of this document. For more information, conduct a Web search for 'HTML Guides' or purchase one of the available books on HTML publishing.

Go through your document now and insert <P> tags before each paragraph break. As a minimum, this will typically be immediately after each heading (<Hn>) tag. After that, go ahead and bold and italicize text using either the logical (preferred) or physical methods described above. Save your document without quitting the editor and hit the Reload button in your browser to see the changes. Continue to add text manipulation tags until you are happy with the results so far.


2.2.7: Lists

If you have a set of favorite links or other related information on your page, you may want to use a list. The most commonly used types of HTML lists are definition lists, ordered lists, and unordered lists. A list may contain another list inside of it (nested), like a table of contents with subsequent indentation, but a list should not contain any headings.

An unordered list is by far the most used on the WWW; it consists of a list indented with bullets. The opening tag for an unordered list is <UL> and the closing tag is </UL>. List items start with the tag <LI> at the start of their line. The syntax for the <UL> tag is: <UL [TYPE=[DISC | CIRCLE | SQUARE]]>, with the default being a disc. Example:
Fruits
<UL>
<LI>Bananas
<LI>Oranges
<LI>Apples
</UL>
with the browser output being:
Fruits
    * Bananas
    * Oranges
    * Apples
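
Since a list may contain another list, a nested list might look like the sketch below. The category names and items are invented for illustration, and the indentation in the source is only there to make it easier for you to read:

```html
Contents
<UL>
<LI>Fruits
  <UL>
  <LI>Bananas
  <LI>Oranges
  </UL>
<LI>Vegetables
  <UL TYPE=SQUARE>
  <LI>Carrots
  </UL>
</UL>
```

Each inner list must be closed with its own </UL> before the outer list continues.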

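An ordered list is similar to an unordered list with the exception that instead of bullets, each item in the list is numbered. The format of the ordered list tag is as follows:
<OL [START=startvalue] [TYPE=[A | a | I | i | 1]]> where START is the starting value for the first item, and TYPE sets the type of labelling/numbering as follows:
Type:   Numbering:
A       uppercase letters (A, B, C)
a       lowercase letters (a, b, c)
I       uppercase Roman numerals (I, II, III)
i       lowercase Roman numerals (i, ii, iii)
1       Arabic numbers (1, 2, 3), the default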
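
A brief sketch of an ordered list using both options follows. The list items are placeholders; with START=3 and TYPE=A a browser that supports these attributes should label the items C, D, and E:

```html
Steps to publish a page
<OL START=3 TYPE=A>
<LI>Write the text
<LI>Add the tags
<LI>Check the page in a browser
</OL>
```

Leaving out both attributes gives the default numbering of 1, 2, 3.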

A definition list can be used for terms and definitions. The opening tag for a definition list is <DL> and the closing tag is </DL>. A term to be defined uses the tag <DT> at the start of the line with no closing tag, and a term definition uses the <DD> tag at the start of a line with no ending tag. Example:
<DL>
<DT> HTML
<DD> HTML, or HyperText Markup Language, is an SGML application used for the production of WWW pages.
<DT> Internet
<DD> The Internet is a worldwide collection of computers forming a global network running the TCP/IP protocol.
</DL>
This would produce as output on your browser:
HTML
    HTML, or HyperText Markup Language, is an SGML application used for the production of WWW pages.
Internet
    The Internet is a worldwide collection of computers forming a global network running the TCP/IP protocol.

Go back through your document and implement a list of your choice, adding new text into your document if necessary, but please make the content relevant to your page and don't just include one of the example lists.


2.2.8: Inline Images

Images may be displayed in a document, either as standalone items or as anchors/links to a URL. We will discuss the simplest form now and wait until the section on anchors to use images as links. Most browsers currently can handle Graphics Interchange Format (GIF) images and JPEGs. Other image types may not be supported by all browsers. The format of the tag for images is:
<IMG SRC=<image URL> [ALIGN=[TOP | BOTTOM | MIDDLE | LEFT | RIGHT]] [VSPACE=<vspace>] [HSPACE=<hspace>] [WIDTH=<width>[%]] [HEIGHT=<height>[%]] [BORDER=<border>]>
There are other options as well; refer to a full HTML guide for those. The options listed above are:
Option:               Description:
SRC=<image URL>       location of the image to be displayed
ALIGN=                determines how the image is placed
    TOP               aligns top of image with tallest item in the line
    BOTTOM            aligns bottom of image with baseline
    MIDDLE            aligns middle of image with baseline
    LEFT              places image left justified and places text to right
    RIGHT             opposite of above
VSPACE=<vspace>       controls the vertical space in pixels above and below
                      the image
HSPACE=<hspace>       controls the horizontal space in pixels to left and
                      right of the image
WIDTH=<width>[%]      if % is used, then relative to window size, otherwise
                      is width in pixels
HEIGHT=<height>[%]    similar to above
BORDER=<border>       sets border for image in pixels
Example: <IMG SRC="https://members.tripod.com/~wegster/graphics/ie_static.gif" ALIGN=TOP> testing this out.
will produce:
[image] testing this out.
Note that this image is NOT a link to anything at this point (yet).
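
Here is a sketch that combines several of the image options. The file name graphics/mypicture.gif and the sizes are hypothetical, and the ALT attribute (text shown by browsers that cannot display the image) is an addition not covered in the table above:

```html
<IMG SRC="graphics/mypicture.gif" ALIGN=LEFT HSPACE=10 VSPACE=5
     WIDTH=100 HEIGHT=80 BORDER=2 ALT="A picture of me">
This text flows along the right side of the image.
<BR CLEAR=LEFT>
This line starts below the image.
```

The <BR CLEAR=LEFT> at the end is what pushes the last line down past the left-aligned image.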

At this point, you may not have any images to place on your web page. You can bring in a picture and scan it, saving it as a GIF for later placement on your page, and/or can search the 'net for images or icons that you want to use for your page. You can save any image on a page by right clicking with your mouse and then selecting the save option. This works for any type of image, including backgrounds. Browse a few sites or do a Web search for "Web backgrounds", "background images", or "clip art" and try to find a background that you want to use as well as any other images. Try to limit them in size to less than 10k. Keep in mind that some images may have use restrictions on them. Some graphic sites have free or public domain images, which you may use; others may be copyrighted (which you may NOT use), or may request that you put a link on your page to their site. For a good image or background that's a pretty fair deal.


2.2.9: Anchors and links

OK, at this point, your page should be fairly presentable, with the text laid out the way you want it, a few headings, some text effects, and perhaps a list or two. But what about links? That IS one of the points in making a Web page, right? You bet. Unfortunately, this is one of the areas where the most mistakes are made, so I saved this for one of the last things you will do on your page.

An anchor, simply stated, is a hyperlink. When a user clicks on an anchor, the browser may open a new document at the same or a totally different site, or may just jump to a specified point within the same document. There are several formats for the ANCHOR tag:
<A NAME=(anchor)>text</a> This will 'name' a location in the document, allowing a jump to it later from another point in this or another document.
<A HREF=#(anchor)>link text</a> Creates a link to the named anchor in the current document.
<A HREF=(URL)>link text</a> Will make a link from link text to the URL specified.
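
The three anchor formats above can be sketched together as follows. The anchor names "top" and "links" and all of the link text are made up for illustration; the URL is the example address used earlier in this document:

```html
<A NAME="top">Scott's Home Page</A>
<P>
<A HREF="#links">Jump down to my favorite links</A>
<P>
<H2><A NAME="links">My Favorite Links</A></H2>
<A HREF="http://bitsmart.com/~wegster">Scott's page at bitsmart.com</A><BR>
<A HREF="#top">Back to the top</A>
```

Putting the attribute values in quotes is a safe habit, since some values (like URLs) contain characters that can confuse a browser when left unquoted.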

HREF stands for a HyperText Reference (a link), which may use any of the protocols recognized by your browser, such as http://, ftp://, gopher://, news:, and telnet://.

There are a few options for anchors not discussed here. A sample line which would use an image as a link is as follows:
<A HREF="mailto:wegster@bitsmart.com"><IMG SRC="https://members.tripod.com/~wegster/graphics/email3.gif">Email Scott</A>
This would produce: [image] Email Scott

Hey, what's that 'mailto:' thing, you may ask? The mailto:<address> is similar to a protocol (it uses the protocol field in the 'URL') and is a built-in 'protocol' in graphical browsers like MS Internet Explorer and Netscape. When a mailto: link is selected, the browser will launch the email program specified under Options and compose a new message to the person specified in the <address>. Also notice how the image was made into the link by placing the </a> tag AFTER the full tag for the image.

Now that you know how to make links in your document(s), go ahead and try making a few links on your page to your favorite sites or to some of the URLs in your bookmark file. Make sure you save your work and reload the document in the browser to test it.
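
As one possible way to lay out your links, the sketch below puts them into an unordered list. The www.example.com and www.example.edu addresses and the link text are placeholders; substitute the real URLs of your favorite sites:

```html
<H2>My Favorite Links</H2>
<UL>
<LI><A HREF="http://www.example.com/">An example search engine</A>
<LI><A HREF="http://www.example.edu/">An example college home page</A>
<LI><A HREF="mailto:wegster@bitsmart.com">Email Scott</A>
</UL>
```

If a link does not work when you test it, check first that the URL is typed exactly and that both the opening quote and the closing </A> tag are present.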

You should now have a working Web page once you get your links working. If you want to expand your page, surf the 'net for a while looking for interesting links and save them as bookmarks or use copy and paste to place the links into your document. If you find a lot of links on a related subject, you may want to create a second document (or more) dedicated to a single subject and make a link to the new page from your home page. Most of all, have fun with your page; just keep it in good taste.


3. Design Issues: Tips and Common Errors


Last modified: Tuesday, November 19, 1996