HTML Primer for CS115: Introduction to the Internet
Author: Scott Wegener
wegster@bitsmart.com
Disclaimer and Purpose of this Document
This document was first created as a text handout for my
Internet class and is not a complete HTML reference. Not all of the tags
for text manipulation are listed yet, frames and tables are not covered, and
I do not distinguish HTML 3.0 tags from those of earlier versions. Because this document steps
through the creation of a home page, it assumes that the reader is on a
Novell network running Windows for Workgroups. If you use this document
as a reference, ignore any references to directories, and use a .html
extension rather than .htm if you are putting your page on a UNIX host. This is
a work in progress and should be completed in time. Bearing that in
mind, feel free to read on and enjoy! Any comments or feedback are appreciated.
1. Overview
1.1 What is HTML Anyway?
HTML (HyperText Markup Language) is an application of SGML (Standard
Generalized Markup Language). Other popular document formatting languages include LaTeX
and PostScript, which are not SGML based and are both much more complicated than HTML.
HTML is used to create World Wide Web (WWW) pages on the Internet
and on intranets and LANs. The current version of HTML is 3.0,
with both Netscape and Microsoft having added custom features
on top of HTML 3.0. The HTML language is composed of tags, or
keywords, which WWW clients/browsers interpret to format text
and images properly.
1.2 Background and Web Server Platforms
The Web and the Internet in general owe much to the UNIX community
and X-Windows. The majority of servers on the Internet today are still UNIX based,
although Windows NT and a few other platforms are becoming more popular. With
the increasing popularity of the Internet and the Web with home users, and the
availability of inexpensive Internet software and Web browsers, people are now
connecting to the Internet from a variety of platforms including Windows 3.x,
Windows NT or '95, Macintosh/System 7, and of course UNIX based systems.
Several problems arise when dealing with a multiple-platform environment,
most of which are handled for users by the TCP/IP protocols. However,
there are some issues that the prospective Web author needs to be aware of, such
as the server settings of the site where the document is to reside. A WWW server
has several options which are configurable by the webmaster. One is the
directory from which to serve documents when a client requests a URL
with ~<username> in the path, as in http://bitsmart.com/~wegster; another is
the default document name to look for. Most servers use a default directory named
public_html and a default document name of index.html. This means that when you
use your browser to open the document http://bitsmart.com/~wegster, the
server finds the home directory of user wegster, looks in its public_html subdirectory,
and returns the file named index.html.
Note that Windows 3.x does not allow filenames with an extension longer
than 3 characters (as in command.com, test.doc, etc.), so if you write your files
on a Windows 3.x or DOS system, you will need to name your files with a .htm extension.
This is fine for writing the file and using a browser to check it, but be aware that if you put
your page(s) onto an Internet site you will need to rename the file with the proper .html
extension and create a public_html subdirectory for your HTML pages. If the particular
system you place your page on is configured unusually and you find that
http://<host>/~<username> doesn't seem to work, you will have to contact the system
administrator to find out the default directory and/or file name.
2. Creating Your First HTML Document
2.1: Write the text of the document
Start off by using a text editor or word processor that is able
to save in text mode. There are several HTML editors available
both commercially and over the Internet, but trying to use most
of them before knowing the basics of HTML can be a frustrating
experience. Type the text of your document into the editor, remembering
not to use word processing features like fonts, bold, italics
and styles; these will not be saved and are of no use for an HTML
document. Include the entire textual content of the document
at this point. Don't worry about how to make words into links
or how to insert graphics; that comes later. Include text which
you can later turn into links to URLs, such as your favorite bookmarks,
the college's homepage, your favorite search engine or zine, or
any other favorite web sites. Separate your paragraphs with a
single blank line between them, and do the same for section headings. Any
run of spaces, tabs, and line breaks in an HTML document is treated as a single space, so at
this point the blank lines will only be used by you to determine
where to add your HTML tags later on. Try to keep the length
to a page or two; if you want to make it longer, consider breaking
it up into separate pages for different subjects. Once you are
happy with the content of your document, save the file as a text
file into your home directory as index.htm.
2.2: Learning Basic HTML
2.2.1: Text and Tags
HTML documents are composed of two things: text and tags.
Images, sounds, animations, and other elements are all referenced
by using the appropriate HTML tags and URLs or filenames. Most
HTML tags consist of an 'opening tag' and an 'ending tag.' The
opening tag marks the start of that tag or style, and the ending
tag marks where the browser should stop treating the text according
to the opening tag. Example:
<TITLE> My First WWW Page </TITLE>
<TITLE> marks the start of the page title, which
by default will be displayed at the top of the browser window. The browser treats
everything between the <TITLE> and </TITLE> as text
to be displayed in the browser title bar. If you had not closed
off the TITLE section, the browser would try to display the entire document
as the title, an undesirable result. Ending tags always consist
of the opening tag name prefixed by a forward slash (as in </TITLE>),
and all HTML commands and tags are surrounded by the special characters
< and >.
2.2.2: The HTML, HEAD, and BODY tags
The simplest HTML document consists of the following:
<HTML>
<HEAD>
head material
</HEAD>
<BODY>
body material
</BODY>
</HTML>
The HEAD section of a document may contain several special tags,
but the main one that is used is the TITLE section. The other
tags are rarely if ever used in the majority of HTML documents
on the Web. The TITLE text that you specify is what will appear
in the title bar of the browser window when you open the document. You should
make this a short line such as "My Home Page" or something
only a few words in length. The way you do this is to insert
the begin <TITLE> tag and the end </TITLE> tag _around_
the text in your document that you want to be the title. For example:
<HTML>
<HEAD>
<TITLE>Welcome to Scott's Home Page!</TITLE>
</HEAD>
<BODY>
body material
</BODY>
</HTML>
If you have already saved your text file from a text editor, you
can insert the above tags around your text, with the text in your
file going where it says body material up above. Give the document
a title if you didn't make one in your text file already.
The BODY section of a document is where all of the document's
information is displayed. This includes all plain text, images,
and links of any kind. The <BODY> tag takes several optional
arguments for setting the background color, link color(s), and
other properties but ignore these for now; they will be summarized
in the reference section and discussed in class.
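As a preview only, here is a rough sketch of what a <BODY> tag with a few of these optional arguments might look like. The attribute names BGCOLOR, TEXT, LINK, and VLINK are ones commonly supported by Netscape and Internet Explorer; the particular color values are arbitrary choices for illustration and are not taken from this document.

```html
<!-- Each color is a hex red/green/blue value; these particular values are only examples -->
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#800080">
body material
</BODY>
```

You can safely leave all of these out while working through the steps that follow; the browser will fall back to its own default colors.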
2.2.3: Headings and Sections
The next step in converting your text document into a 'real'
HTML document is to separate your page into sections where appropriate.
Usually content is divided up into like subjects or links. Sections
need to have a title for each, and this is accomplished through
the use of HTML headings. Headings are displayed in a large bold
font and are denoted by the use of a start and end tag. There
are 6 levels of headings, with level 1 being the largest and level
6 the smallest. The start tags therefore are <Hn> where
n is the level of the tag, and the end tag is </Hn>. You
should use the <H1> tag sparingly, usually only for your
page title. Example:
<HTML>
<HEAD>
<TITLE>Welcome to Scott's Home Page!</TITLE>
</HEAD>
<BODY>
<H1>Scott's Home Page</H1>
</BODY>
</HTML>
Note that HTML tag names are case insensitive and that extra spaces and line breaks are ignored; the
tags could have been written as <h1> and </h1>, or
the tags could have been on different lines, with the same result.
Next we want to choose the appropriate points in the document
to insert other section titles. It is considered bad form to
skip heading levels, so the next group of headings should be <H2>
if your top level heading was <H1>. The use of headings
is similar to an outline or table of contents in a book: the deeper
a section sits in the outline, the smaller its heading text. If different sections
are of equal importance and not a subdivision of a parent section
then use the same level for these. Go through your document now
and find your headings; you may need to create some section titles/headings.
Surround these with <H2> and </H2> tags. By default
a line break will be inserted after each heading.
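As a sketch of how the heading levels might nest, the body below uses one <H1>, two <H2> sections of equal importance, and two <H3> subsections under the second. The section names are made up for illustration; substitute your own.

```html
<BODY>
<H1>Scott's Home Page</H1>
<H2>About Me</H2>
text about me
<H2>My Favorite Links</H2>
<H3>Search Engines</H3>
text for this subsection
<H3>Zines</H3>
text for this subsection
</BODY>
```

No level is skipped on the way down, and sections of equal rank share the same heading level.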
2.2.4: Line Breaks
Most browsers will automatically wrap text to fit the browser
window by default. For most lines this is fine, but for times
that you want specific formatting or only specific text on a line
you need to manually insert line breaks. This is done with a
single <BR> tag at the end of each line you want to force
a line break. Example:
<BODY>
<H1>Scott's Online Resume</H1>
Scott Wegener<BR>
101 Some Street<BR>
Bremerton, WA 98312<BR>
(360)111-1111<BR>
wegster@bitsmart.com<BR><BR>
<H2>Objectives:</H2>
will produce this:
Scott's Online Resume
Scott Wegener
101 Some Street
Bremerton, WA 98312
(360)111-1111
wegster@bitsmart.com
Objectives:
Without using a <BR> tag the text looks like this:
Scott's Online Resume
Scott Wegener 101 Some Street Bremerton, WA 98312 (360)111-1111 wegster@bitsmart.com
Objectives:
Start up your browser now and load your home page. This is accomplished
by using file:///g:/students/<your account directory>/index.htm
for the URL to open, substituting your path after file:///g:. Look
through the document. It may not be pretty quite yet, but make
sure that your title in the browser window and your headings are
correct and properly closed. If you forget to close a heading
or other tag, that tag will be applied through to the end of the document.
Go through your document now and see if there are any lines on which
you want to force a line break. Put the <BR> tag
at the end of each of these lines.
2.2.5: Styles
Styles for HTML are similar to the text formatting commands you
have available on your favorite word processor. You can make
text bold or italic, change the font, underline text, use subscripting
and other text styles as well. There are two types of styles
available in HTML: physical and logical. Both types can be used
to produce the same effects, as we will see.
A physical style applies its tag/style without regard for user
settings in the Options/Preferences section of the user's browser.
The following tags are available for physical styles:
Physical Styles
Tags: | Description: |
<B>text</B> | renders the text in a boldface font |
<I>text</I> | italicizes the text |
<TT>text</TT> | renders the text in a typewriter font |
<STRIKE>text</STRIKE> | renders the text in strikeout |
<U>text</U> | underlines the text |
<SUB>text</SUB> | subscripts the text |
<SUP>text</SUP> | superscripts the text |
<SMALL>text</SMALL> | causes the text to be rendered in a font smaller than the current text |
<BIG>text</BIG> | causes the text to be rendered larger than the current text |
<BASEFONT SIZE=value> | sets the base font size for the document. Value of 1-7, 1=small, 7=large; the default of 3 is normal. |
<FONT [SIZE=[+ | -]value] [COLOR=#<red><green><blue>] [FACE=<face>]>text</FONT> | sets the current font size. With + or - the size is relative to the current font size. COLOR sets the text color and FACE sets a font type like "Roman"; if the type doesn't exist the FACE value will be ignored. Note that to reset the size you need a </FONT> tag. |
<BLINK>text</BLINK> | makes the text blink |
Note that some browsers allow nested tags and some don't: <i><b>text</b></i>
may either produce italicized bold text or may just
apply the inner tag (bold) only.
Logical styles also are used to change font characteristics, but
each browser may treat a logical style differently. Emphasized
text( <em> ) is almost universally rendered as italicized
but may not be on some older browsers, and some browsers may allow
user configuration for each logical style. Logical tags include:
Logical Styles
Tags: | Description: |
<EM>text</EM> | emphasizes text, typically italics |
<STRONG>text</STRONG> | used for strong emphasis, typically bold |
<CODE>text</CODE> | used for program code, typically a fixed width font |
<SAMP>text</SAMP> | used for sample output, typically a fixed width font |
<KBD>text</KBD> | used for displaying a keyboard key, typically a fixed width font |
<VAR>text</VAR> | used to mark a variable like a user id, typically italics |
<DFN>text</DFN> | marks a word being defined, typically italics |
<CITE>text</CITE> | used for citations, typically italics |
Note that the exact font used for each tag above varies from one browser to another.
2.2.6 Other Text Tags
The following tags are for text manipulation and are considered
neither logical nor physical.
Tags: | Description: |
<BR [CLEAR=[LEFT | RIGHT | ALL]]> | line break. CLEAR causes the next line to be printed when the left, right, or both margins are clear, useful for when the next line should start after an aligned image. |
<HR> | inserts a horizontal rule (line) for a break |
<NOBR>text</NOBR> | disallows line breaks, used for making wide lines |
<CENTER>text</CENTER> | center justifies the text between the tags |
<P>text lines</P> | paragraph break; the </P> is optional. Similar to a <BR> but drops down 2 lines instead of only 1. |
<P [ALIGN=[LEFT | RIGHT | CENTER]]>paragraph</P> | justifies the entire paragraph |
<PRE>text</PRE> | preformatted text, uses an opening and closing line break, useful for when you want to have several spaces between words for format. In the Web version of this document this section uses preformatting; select View Document on your browser to see the tags. |
<BLOCKQUOTE>text</BLOCKQUOTE> | indents text on both sides and puts a blank line below |
<WBR> | allows a line break within a <NOBR> area |
<ADDRESS>address</ADDRESS> | used for mail or email addresses, typically rendered in italics |
Note that there are other tags as well, but they are beyond the
scope of this document. For more information conduct a Web search
for 'HTML Guides' or purchase one of the available books on HTML publishing.
Go through your document now and insert <P> tags before
each paragraph break. Typically this will be immediately after
each heading(<Hn>) tag you have as a minimum. After that
go ahead and bold and italicize using either the logical(preferred)
or physical methods described above. Save your document without
quitting the editor and hit the Reload button in your browser
to see the updated changes. Continue to add text manipulation
tags until you are happy with the results so far.
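A short sketch of what a section might look like after this step, using <P> after the heading and the logical styles <EM> and <STRONG>. The heading and sentences are invented for illustration only.

```html
<H2>About Me</H2>
<P>I am a <STRONG>student</STRONG> in CS115, and this is my <EM>first</EM> Web page.
<P>This second paragraph is separated from the first by a paragraph break.
```

The closing </P> tags are left off here because, as noted in the table above, they are optional.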
2.2.7 Lists
If you have a list of your favorite links on your page or a list
of related information you may want to use a list. The most commonly
used types of HTML lists are definition lists, ordered lists,
and unordered lists. A list may contain another list inside of
it (nested), like a table of contents with subsequent indentation,
but a list should not contain any headings.
An unordered list is by far the most used on the WWW; it consists
of a list indented with bullets. The opening tag for an unordered
list is <UL> and the closing tag is </UL>. List
items start with the tag <LI> at the start of their line.
The syntax for the <UL> tag is: <UL [TYPE=[DISC | CIRCLE
| SQUARE]]>, with the default being a disc. Example:
Fruits
<UL>
<LI>Bananas
<LI>Oranges
<LI>Apples
</UL>
with the browser output being:
Fruits
- Bananas
- Oranges
- Apples
An ordered list is similar to an unordered list with the exception
that instead of bullets, each item in the list is numbered. The
format of the ordered list tag is as follows:
<OL [START=startvalue] [TYPE=[A | a | I | i | 1]]> where
START is the starting value for the first item, and TYPE sets
the type of labelling/numbering as follows:
A | capital letters |
a | lowercase letters |
I | large roman numerals |
i | small roman numerals |
1 | arabic numbers (default) |
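As an illustration of the syntax above, here is a small ordered list that uses both options. The list items are invented for this example. In browsers that support both attributes, the items should be labelled C, D, and E, since START=3 begins at the third label of the chosen type.

```html
Steps to publish a page
<OL TYPE=A START=3>
<LI>Write the text
<LI>Add the tags
<LI>Check it in a browser
</OL>
```

Leaving out both attributes gives the default arabic numbering starting at 1.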
A definition list can be used for terms and definitions. The
opening tag for a definition list is <DL> and the closing
tag is </DL>. A term to be defined uses the tag <DT>
at the start of the line with no closing tag, and a term definition
uses the <DD> tag at the start of a line with no ending
tag. Example:
<DL>
<DT> HTML
<DD> HTML, or HyperText Markup Language, is an application of SGML used
for the production of WWW pages.
<DT> Internet
<DD> The Internet is a worldwide collection of computers
forming a global network running the TCP/IP protocol.
</DL>
This would produce as output on your browser:
HTML
    HTML, or HyperText Markup Language, is an application of SGML used for the
    production of WWW pages.
Internet
    The Internet is a worldwide collection of computers forming a
    global network running the TCP/IP protocol.
Go back through your document and implement a list of your choice,
adding new text into your document if necessary, but please make
the content relevant to your page and don't just include one of
the example lists.
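Since lists may be nested, a list of links grouped by subject could be sketched like this. The category and item names are placeholders, not real sites.

```html
<UL>
<LI>Search Engines
  <UL>
  <LI>First search engine
  <LI>Second search engine
  </UL>
<LI>Zines
  <UL>
  <LI>First zine
  </UL>
</UL>
```

Each inner <UL> is opened and closed entirely inside its parent list, so the tags nest rather than overlap.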
2.2.8 Inline Images
Images may be displayed in a document, either as standalone
items or as anchors/links to a URL. We will discuss the simplest
form now and wait until the section on anchors to use images as
links. Most browsers currently can handle Graphics Interchange
Format (GIF) images and JPEGs. Other image types may not be supported
by all browsers. The format of the tag for images is:
<IMG SRC=<image URL> [ALIGN=[TOP | BOTTOM | MIDDLE | LEFT
| RIGHT]] [VSPACE=<vspace>] [HSPACE=<hspace>] [WIDTH=<width>[%]]
[HEIGHT=<height>[%]] [BORDER=<border>]>
There are other options as well; refer to a full HTML guide for those.
The options listed above are:
Option: | Description: |
SRC=<image URL> | location of the image to be displayed |
ALIGN= | determines how the image is placed |
  TOP | aligns top of image with the tallest item in the line |
  BOTTOM | aligns bottom of image with the baseline |
  MIDDLE | aligns middle of image with the baseline |
  LEFT | places the image left justified, with text to its right |
  RIGHT | places the image right justified, with text to its left |
VSPACE=<vspace> | controls the vertical space in pixels above and below the image |
HSPACE=<hspace> | controls the horizontal space in pixels to the left and right of the image |
WIDTH=<width>[%] | if % is used, the width is relative to the window size; otherwise it is the width in pixels |
HEIGHT=<height>[%] | similar to above |
BORDER=<border> | sets the border for the image in pixels |
Example:
<IMG SRC="https://members.tripod.com/~wegster/graphics/ie_static.gif" ALIGN=TOP> testing this out.
will produce:
[image] testing this out.
Note that this image is NOT a link to anything at this point(yet).
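To show several of the options together, here is a sketch that floats an image to the left with some horizontal spacing and a border. The file name graphics/photo.gif is hypothetical; use the name of an image you actually have.

```html
<IMG SRC="graphics/photo.gif" ALIGN=LEFT WIDTH=100 HSPACE=10 BORDER=2>
This text flows to the right of the image.
<BR CLEAR=LEFT>
This line starts below the image, once the left margin is clear.
```

Only WIDTH is given, so the browser can scale the height to match.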
At this point, you may not have any images to place on your web
page. You can bring in a picture and scan it, saving it as a
GIF for later placement on your page, and/or can search the 'net
for images or icons that you want to use for your page. You can
save any image on a page by right clicking with your mouse and
then selecting the save option. This works for any type of image,
including backgrounds. Browse a few sites or do a Web search for
"Web backgrounds", "background images", or "clip art" and try to find a background
that you want to use as well as any other images. Try to limit
them in size to less than 10 KB. Keep in mind that some images may have use restrictions
on them. Some graphic sites have free or public domain images, which you may use; others
may be copyrighted (which you may NOT use), or may request that you put a link on your page to
their site. For a good image or background that's a pretty fair deal.
2.2.9 Anchors and links
OK, at this point, your page should be fairly presentable, with
the text laid out the way you want it, a few headings, some text
effects, and perhaps a list or two. But what about links? That
IS one of the points in making a Web page right? You bet. Unfortunately,
this is one of the areas where the most mistakes are made, so
I saved this for one of the last things you will do on your page.
An anchor, simply stated, is a hyperlink. When a user clicks
on an anchor, the browser may open a new document at the same
or a totally different site, or may just jump to a specified point
within the same document. There are several formats for the ANCHOR
tag:
<A NAME=(anchor)>text</A> | names a location in the document, allowing a jump to it later from another point in this or another document |
<A HREF=#(anchor)>link text</A> | creates a link to the named anchor in the current document |
<A HREF=(URL)>link text</A> | makes a link from link text to the URL specified |
HREF stands for HyperText Reference (a link), which may use
any of the protocols recognized by your browser, such as the http://, file:///, and mailto: forms used in this document.
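The three formats can be sketched together on one page as follows. The anchor names top and links and the heading text are made up for this example; the URL is the one used earlier in this document.

```html
<H1><A NAME="top">Scott's Home Page</A></H1>
<A HREF="#links">Jump to my favorite links</A>
<P>body material
<H2><A NAME="links">My Favorite Links</A></H2>
<A HREF="http://bitsmart.com/~wegster">Scott's page on bitsmart.com</A><BR>
<A HREF="#top">Back to the top</A>
```

The attribute values are quoted here; that is the safer habit whenever a value contains characters such as #, : or /.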
There are a few options for anchors not discussed here. A sample
line which would use an image as a link is as follows:
<A HREF="mailto:wegster@bitsmart.com"><IMG SRC="https://members.tripod.com/~wegster/graphics/email3.gif">Email Scott</A>
This would produce: [image] Email Scott
Hey, what's that 'mailto:' thing, you may ask? The mailto:<address>
is similar to a protocol (it uses the protocol field in the 'URL')
and is a built-in 'protocol' in graphical browsers like MS Internet
Explorer and Netscape. When a mailto: link is selected the browser
will launch the email program specified under Options and compose
a new message to the person specified in the <address>.
Also notice how the image was made into the link by placing the
</a> tag AFTER the full tag for the image.
Now that you know how to make links in your document(s), go ahead
and try making a few links on your page to your favorite sites
or to some of the URLs in your bookmark file. Make sure you save
your work and reload the document in the browser to test it.
You should now have a working Web page once you get your links
working. If you want to expand your page, surf the 'net for a
while looking for interesting links and save them as bookmarks
or use copy and paste to place the links into your document.
If you find a lot of links on a related subject you may want to
create a second(or more) document dedicated to a single subject
and make a link to the new page from your home page. Most of
all, have fun with your page- just keep it in good taste.
3. Design Issues: Tips and Common Errors
- Try to avoid large images on your home page. Users with slow
connections won't want to wait for large image files to load.
- If you have large graphic images to display, show a smaller thumbnail
image on your page and allow the user to click on it to access
the larger image.
- Use relative pathnames when referring to other documents or images
on the same system as your page. This minimizes typing and makes
changes easier later on if you rename directories.
- Do not overlap tags. Example:
This is a bad line of <em>code.
<strong>What is wrong with it?</em></strong>
If you must use more than one style, then nest them instead:
This is much <em>better.
<strong>These tags don't overlap.</strong></em>
- Use line breaks (the <BR> tag) and paragraph breaks (<P>) where a break or separation
is important; don't rely on the browser auto-wrap feature.
- Do not type the characters <, >, " and & directly in your text. These
are special characters used for denoting HTML tags. To physically
display these, use: &lt;, &gt;, &quot; and &amp;.
- Remember to close all quotes.
- Include a trailing / when referencing a directory; some browsers
won't handle the name correctly otherwise.
- Make meaningful links. Avoid using the phrase 'click here' for
your links; embed the links within the content of your documents.
- Avoid spaces around links or the spaces will be underlined as
well. Looks bad.
- Avoid the use of the <BLINK> tag- this can be highly annoying
to people's eyes.
- Use upper level headings sparingly (<H1> and <H2>).
Try to only use <H1> for the top level title.
- Try to use logical rather than physical styles. This allows
the browser preference for each style to be used instead of forcing
a single view on the document.
- Be careful not to 'squash' images by using both the HEIGHT=
and WIDTH= options in an <IMG> tag; use one or the other
and let it autoscale.
- Don't set a border to 0 on an image that is a link; it may not
be apparent to others that the image is a link without the border.
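Two of the tips above can be sketched in a few lines: writing the special characters as entities, and linking a small thumbnail to a larger image. The file names under graphics/ are hypothetical relative paths used only for illustration.

```html
<!-- Entities: the browser displays this line as  Use the <BR> tag & the "P" tag -->
Use the &lt;BR&gt; tag &amp; the &quot;P&quot; tag<BR>
<!-- Thumbnail: a small image that links to the full-size version -->
<A HREF="graphics/photo_large.gif"><IMG SRC="graphics/photo_small.gif" WIDTH=80></A>
```

The thumbnail keeps its default border so that it is recognizable as a link, and only WIDTH is set so the image is not squashed.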
Last modified: Tuesday, November 19, 1996