HTML (feb'10)
© A.Lioy - Politecnico di Torino (2009-2010) E-1
HTML (HyperText Markup Language)
Antonio Lioy < [email protected] >
English version created by
Marco D. Aime < [email protected] >
Politecnico di Torino
Dip. Automatica e Informatica
History
HTML 2.0 (nov’95 = RFC-1866)
specifies the de-facto standard of 1994
HTML 3.2 (1996)
compatible with 2.0
adds tables, applets, superscripts, subscripts, text surrounding images, …
HTML 4.01 (dec’97 – apr’98 – dec’99)
last (?) version of HTML
XHTML 1.0 (jan’00 – aug’02)
rewriting of HTML 4.01 with XML
strict / transitional / frameset
HTML documents
are normal US-ASCII texts
therefore, letters with accents or other “extended” characters are not allowed
... enriched with hypertext and hypermedia links
... and with limited text formatting capabilities
all these additional capabilities are achieved through annotations expressed with tags
The tags
enclosed between the symbols “less than” (<) and “greater than” (>)
usually they are paired (start tag, end tag): <h1> ... </h1>
but can also be standalone: <br> ( <br /> in XHTML)
general rule: the end tag is the same as the start tag, preceded by the symbol /
they are case insensitive in HTML and lowercase in XHTML
therefore it is better to always write them in lowercase
The attributes
you can better characterise a tag by using a set of attributes
every attribute consists of a variable with an assigned value, placed inside the start tag (e.g. <hr width="90%">)
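The rules for tags and attributes can be seen together in a small fragment. This is only an illustrative sketch: the text content is invented, while the tags and the width attribute are the ones named above.

```html
<!-- paired tag: start tag, content, end tag -->
<h1>Course notes</h1>

<!-- standalone tag: no end tag in HTML (written <br /> in XHTML) -->
first line<br>
second line

<!-- attribute: name="value", placed inside the start tag -->
<hr width="90%">
```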
The browsers
visualising HTML documents (and navigating them) requires an appropriate program: an HTML browser
a browser is an interpreter:
reads the source code (HTML + extensions)
tries to understand it (hoping there are no errors …)
does its best to visualise what is described by the source code
attention! not every browser visualises a given document in the same way
textual browser: Lynx
graphical browsers: Firefox, SeaMonkey, Netscape, Opera, Internet Explorer, ...
The war of the browsers
no single browser is much more widespread than the others (personal preferences, platform, …)
you must try to write HTML pages that fit every browser
some statistics:
www.w3schools.com/browsers/browsers_stats.asp
(jan-10) IE8=14%, IE7=12%, IE6=10%, FX=46%, Chrome=11%, Safari=4%, Opera=2%
www.upsdell.com/BrowserNews/stat.htm
high variability
www.pgts.com.au/pgtsj/pgtsj0212d.html
difficult to identify various browsers with certainty
General structure of HTML documents
<!DOCTYPE HTML PUBLIC ...>
<html>
<head>
<title> title </title>
... other headers ...
</head>
<body>
text of the document
</body>
</html>
Document type declaration (DTD)
required to identify the type of HTML (from 4.01):
strict.dtd = all non-deprecated elements
loose.dtd = includes deprecated elements
frameset.dtd = loose + use of frames
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
Example
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Example of HTML page</title>
</head>
<body>
Here I can insert my document’s text, which will be visualised as simple text if I don’t use any formatting tag.
</body>
</html>
Notes
browsers do not signal errors: they ignore them!
white spaces and end of lines:
multiple spaces are treated as a single space
end-of-lines have no effect on the formatting
the title (and in general the data inside the head) is very important since it is the element most used by automatic indexing services
HTML is an extensible language
often new tags are added
browsers ignore unrecognised tags (or attributes) … but visualise the text enclosed inside the tag
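A short fragment can illustrate both notes. It is a sketch under one assumption: <sparkle> is a made-up tag, chosen only because no browser is expected to recognise it.

```html
<!-- runs of spaces and the line break collapse into single spaces -->
<p>These      words   are
separated by a   single space.</p>

<!-- <sparkle> is a hypothetical, unrecognised tag: the tag itself is
     ignored, but the text enclosed inside it is still visualised -->
<p>This is <sparkle>still visible</sparkle> text.</p>
```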
Meta-data inside the HEAD part
data useful for:
indexing the HTML page
providing information to the web server and / or to the browser
syntax:
<meta name="author" content="Antonio Lioy">
<meta name="keywords" content="html">
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<meta http-equiv="Expires" content="Sun, 28 Feb 2010 23:59:00 GMT">
HTML internationalization
HTML < 4 written with the ISO-8859-1 encoding
HTML-4 incorporates RFC-2070 that deals with internationalization (i18n in brief) of HTML
adoption of the ISO/IEC 10646 standard as the document character set
user agents determine the character encoding according to (from highest to lowest priority):
HTTP response header "Content-Type: text/html; charset=xxx"
proper META tag in the HTML header: <meta http-equiv="Content-Type" content="text/html; charset=xxx">
charset attribute of an element that designates an external resource
The “link” tag inside the HEAD part
logical connection to documents somehow related to the current one
multiple LINK tags are possible
attributes:
href=URL
rel=alternate lang=…
rel=alternate media=…
rel=stylesheet
rel=start / contents / prev / next / …
type=MIME-type
Example of LINK
<head>
<title>Chapter 2</title>
<link rel="contents" href="../toc.html">
<link rel="next" href="chapter3.html">
<link rel="prev" href="chapter1.html">
<link rel="stylesheet" type="text/css" href="mystyle.css">
</head>
Tools for checking HTML
http://validator.w3.org
allows verifying if a page fully satisfies the official syntax
can provide detailed explanations on the errors and on how to correct them
http://tidy.sourceforge.net
“cleans” the HTML code and transforms it to more recent versions
can be installed locally or used through the network: http://cgi.w3.org/cgi-bin/tidy
problems with dynamically generated HTML (cannot validate an ASP or PHP source page)
Tools for checking HTML
validation of dynamic pages must be performed on the client rather than on the server
therefore, you need:
a special plugin for the browser
to manually visit all the pages to be validated
an excellent plugin for Firefox:
http://users.skynet.be/mgueury/mozilla/index.html
configured in "SGML parser" mode to have the same results as validator.w3.org
Comments
can be inserted at every point in the text
can span multiple lines
enclosed inside <!-- and -->
examples:
<!-- this is a comment -->
<!--
this comment
spans four lines
-->
Headings
there are six levels of headings or titles:
<h1> . . . </h1>
<h2> . . . </h2>
<h3> . . . </h3>
<h4> . . . </h4>
<h5> . . . </h5>
<h6> . . . </h6>
should be used according to the logical meaning (semantics), not to achieve a specific formatting
in particular, it is not correct to use <hN> if not preceded by <hN-1>
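The nesting rule can be sketched as an outline in which no level is skipped. The heading texts are invented for illustration.

```html
<h1>HTML course</h1>
<h2>Lists</h2>
<h3>Ordered lists</h3>
<h3>Unordered lists</h3>
<h2>Tables</h2>

<!-- to avoid: an <h3> placed directly under an <h1> (skipping <h2>)
     only because a smaller font is wanted; use CSS for the size -->
```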
Text blocks
<p> . . . </p>
starts and terminates a paragraph
after terminating a paragraph, browsers break the current line (and may also insert a small vertical space)
<br> (HTML) <br/> (XHTML)
inserts a line break
<hr> (HTML) <hr/> (XHTML)
inserts a horizontal rule (line)
Horizontal rules (tag <HR>) (*)
can specify the following attributes:
size= n_pixel (height)
width= n_pixel (absolute width)
width= percentage (width as % of the container)
align= left / right / center
by default the line is centered and has a width of 100%
attention!: HTML presentational attributes have been deprecated since style sheets exist (e.g. CSS)
Lists
unordered list: <ul> ... </ul>
ordered list: <ol> ... </ol>
directory (deprecated): <dir> ... </dir>
menu (deprecated): <menu> ... </menu>
an element of (any) list: <li> ... </li>
Options for lists
symbol preceding the items in unordered lists:
type= disc / circle / square
numbering style in ordered lists:
start= index_of_the_first_item
type= A / a / I / i / 1
that is:
alphabetic list (uppercase or lowercase)
roman numbers (uppercase or lowercase)
decimal numbers
can be specified for the whole list (ol) and for the single element (li)
List example
To pass the exam:
<ol type="I">
<li>attend the lessons</li>
<li>perform the lab exercises</li>
</ol>
as shown by the browser (note the indentation):
To pass the exam:
   I. attend the lessons
  II. perform the lab exercises
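The other list options can be sketched in the same way. The item texts are invented; with type="a" and start="3" the numbering is expected to begin at the third letter.

```html
<!-- ordered list: lowercase letters, starting from the third one -->
<ol type="a" start="3">
<li>this item is labelled "c"</li>
<li>this item is labelled "d"</li>
</ol>

<!-- unordered list: square symbol before each item -->
<ul type="square">
<li>first item</li>
<li>second item</li>
</ul>
```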
Definitions
definition lists:
<dl>
<dt> term 1 </dt>
<dd> definition 1 . . . </dd>
<dt> term 2 </dt>
<dd> definition 2 . . . </dd>
. . . . .
</dl>
Text formatting
a text block can be characterised based on the role it plays in the document (logical style) ...
... or based on the way we want to visualise it physically (physical style)
best to prefer logical styles and to leave greater freedom to the final user in defining how the text should appear on the screen
with XHTML (strict), most presentational tags (e.g. <font>, <center>, <u>) have finally disappeared (you need to use CSS)
Formatting: physical styles
<b> ... </b>
bold text
<i> ... </i>
italic text
<u> ... </u>
underlined text
<tt> ... </tt>
monospace text (like typewriter)
<blink> ... </blink>
blinking text
Formatting: physical styles
<sup> ... </sup>
superscript text
<sub> ... </sub>
subscript text
<s> ... </s> <strike> ... </strike>
strikethrough text
Formatting: logical styles
<cite> citation </cite>
<code> code (program) </code>
<em> emphasis </em>
<kbd> keyboard </kbd>
<samp> example </samp>
<strong> strong emphasis </strong>
<var> variable </var>
<dfn> definition </dfn>
Other logical styles
<big> big text </big>
<small> small text </small>
can be nested to achieve an increased effect:
<big> <big> very big text </big> </big>
Formatting: text blocks
<address> . . . </address>
address (typically e-mail)
<blockquote> . . . </blockquote>
long citations
<center> . . . </center>
centered text
<pre> . . . </pre>
preformatted text (spacing is preserved)
Reference to non US-ASCII characters
HTML is normally written in US-ASCII with MSB=0
characters with code > 127 must be encoded, as must characters with special meaning
pay attention to the final ";"
to have ...    write ...
<              &lt;
>              &gt;
&              &amp;
"              &quot;
È              &Egrave;
é              &eacute;
©              &copy;
the first four (< > & ") are important since these are HTML reserved characters
Reference to non US-ASCII characters
section 24 (pg. 299) in the HTML 4.01 standard includes:
ISO 8859-1 extended characters, e.g. &raquo; = »
mathematical symbols, e.g. &exist; = ∃
Greek letters, e.g. &alpha; = α
international symbols, e.g. &euro; = €
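A few lines of source show how the references are used in running text. The sentences are invented; the entity names are the ones defined by the HTML 4.01 standard.

```html
<!-- reserved characters: shown as  <br>  and  &  in the browser -->
<p>In HTML, write &lt;br&gt; to break a line.</p>

<!-- accented letters and the euro sign, kept in pure US-ASCII source -->
<p>Caff&egrave; &amp; t&egrave;: 2 &euro;</p>

<!-- every reference ends with ";" -->
<p>&copy; 2010 &raquo; &alpha; &exist;</p>
```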
Links (hyperlinks)
by using hyperlinks you can move automatically from one resource to another
the HTML tag identifying the presence of a link is named anchor, and is identified with <a>
How to insert a hyperlink
open the anchor start tag: <a
insert a space
insert the URL of the resource, preceded by href= and enclosed in quotes
close the start tag with >
insert the text to highlight (the one associated with the anchor, called "hot word")
close the anchor: </a>
<a href="http://www.polito.it">POLITO</a>
Absolute and relative links
it is possible to omit parts of the URL
in this case, it is called a “relative” link
the missing parts assume the same value as in the URL of the current page
examples of relative links (assumed to be placed inside the page http://www.lioy.it/01eny/exam.html):
relative link    absolute link
biblio.html      http://www.lioy.it/01eny/biblio.html
../cv.html       http://www.lioy.it/cv.html
res/a1.html      http://www.lioy.it/01eny/res/a1.html
Document access points
[figure: documents doc01 and doc02, each with a title and an address; a link without specification of an access point]
Document access points
in the target document, define the access point through an anchor with the attribute NAME
<a name="cuc_ita">La cucina italiana</a>
in the origin document, include the name of the access point in the URL
<a href="doc2.html#cuc_ita">
the access point can also be any element identified through its "id"
<h1 id="cuc_ita">La cucina italiana</h1>
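The two sides of an access point can be sketched together. The file name index.html and the second access point cuc_fra are hypothetical, added only to show the "name" and "id" variants side by side.

```html
<!-- index.html (origin document) -->
<ul>
<li><a href="doc2.html">the whole document</a></li>
<li><a href="doc2.html#cuc_ita">Italian cuisine</a></li>
<li><a href="doc2.html#cuc_fra">French cuisine</a></li>
</ul>

<!-- doc2.html (target document) -->
<h1><a name="cuc_ita">La cucina italiana</a></h1>
<p> . . . </p>
<h1 id="cuc_fra">La cucina francese</h1>
<p> . . . </p>
```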
Images
<img src="polito.gif">
inserts the image contained in the file polito.gif
<img src="polito.gif" alt="Picture of Politecnico">
inserts the image polito.gif, but, if the browser does not support graphics, it visualises the text "Picture of Politecnico"
difference between linking and inserting an image:
<img src="polito.gif"> (inserts the image inside the page)
<a href="polito.gif"> (by following the link, you visit a page containing only the image)
Reciprocal positioning of text and images
<img align=left ...>
<img align=right ...>
<img align=top ...>
<img align=center ...>
<img align=texttop ...>
<img align=middle ...>
<img align=absmiddle ...>
<img align=baseline ...>
<img align=bottom ...>
<img align=absbottom ...>
Image formatting
<img width=w height=h ... >
image size
allows rapid visualisation of the page (the browser does not need to download the image before knowing how much space should be reserved for it)
<img vspace=v hspace=h ... >
minimum distance between text and the image
<img border=b ... >
size of the border
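Inserting and linking can be combined, for instance to show a small preview that leads to the full image. In this sketch the file polito_small.gif and the pixel sizes are assumptions, not taken from the slides.

```html
<!-- the preview is inserted in the page; following the link
     shows the full-size image on its own -->
<a href="polito.gif">
<img src="polito_small.gif" alt="Picture of Politecnico (preview)"
     width="160" height="120" border="0" hspace="8">
</a>
```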
Font
<font ... > ... </font>
font for the text block included within the tags
deprecated (use CSS)
attributes:
size= size
color= color
face= font-family
…
the size can be specified in various ways:
N (=1…7, default=3), +N, –N
suggested: +N, –N
Colors
some predefined colors are accessible by name:
Black, White, Gray, Silver, Yellow, Red, Purple, Fuchsia,
Maroon, Green, Lime, Olive, Aqua, Teal, Blue, Navy
other colors can be specified through their RGB hexadecimal code ( # rr gg bb )
example: <font color="#ffffff">White!</font>
Standard colors
black = #000000
silver = #C0C0C0
green = #008000
lime = #00FF00
gray = #808080
white = #FFFFFF
maroon = #800000
olive = #808000
yellow = #FFFF00
navy = #000080
red = #FF0000
purple = #800080
fuchsia = #FF00FF
blue = #0000FF
teal = #008080
aqua = #00FFFF
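As a small illustration, a name and its RGB code can be used interchangeably, while a color outside the sixteen names needs the code. The value #FF8000 (an orange shade) is chosen here only as an example.

```html
<font color="red">red, by name</font>
<font color="#FF0000">the same red, by RGB code</font>

<!-- rr=FF, gg=80, bb=00: full red, half green, no blue -->
<font color="#FF8000">an orange shade, by RGB code only</font>
```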
Tables
<table ... > ... </table>
attributes:
align= left / center / right
border= size
width= size (n_pixel or %)
cellspacing= size
cellpadding= size
summary= text
frame= void / above / below / hsides / lhs / rhs / vsides / box / border
rules= none / groups / rows / cols / all
Table data
<tr ... > ... </tr>
a row of the table
contains normal (<td>) or heading (<th>) cells
<th ... > ... </th>
<td ... > ... </td>
table data (or heading), which can span multiple cells, horizontally or vertically
colspan= number-of-columns
rowspan= number-of-rows
Optional elements of a table
<thead>
heading
<tbody>
content block
<tfoot>
footer
<caption>
caption text describing the nature of the table
Table: row, header, and data attributes
align= horizontal-alignment
left, center, right
valign= vertical-alignment
top, middle, bottom
baseline
bgcolor= color
Table: column groups
<colgroup span=n width=… align=… valign=…>
structural group of n columns, each with the specified attributes
<col span=n width=… align=… valign=…>
definition of attributes for one or more columns
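The table elements and attributes described so far can be put together in one small table. This is a sketch with invented data (course names and marks are illustrative only).

```html
<table border="1" cellpadding="4" summary="Exam results per course">
<caption>Exam results</caption>
<!-- first column left-aligned, the two mark columns right-aligned -->
<col align="left">
<col span="2" align="right">
<thead>
<tr><th>Course</th><th>Written</th><th>Oral</th></tr>
</thead>
<tbody>
<tr><td>Databases</td><td>27</td><td>30</td></tr>
<!-- one cell spanning the two mark columns -->
<tr><td>Networks</td><td colspan="2" align="center">not yet taken</td></tr>
</tbody>
</table>
```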
Frames
division of a page in zones whose content is specified by other HTML files
deprecated (use CSS and “include”)
<html>
<head> ... </head>
<frameset ...>
<frame ...>
<frame ...>
<frame ...>
<frame ...>
</frameset>
</html>
[figure: each <frame> shows a separate HTML file (<html> . . . </html>)]
Frameset and Frame
the structure of a page organised in frames is similar to the traditional one, replacing <body> with <frameset>
it is possible to nest the FRAMESET tags to create complex page subdivisions
the content of every frame is specified through:
<frame src=URI name=...>
use the <noframes> tag for the text to be visualised by browsers not supporting frames
Space spanned by frames
it is possible to specify the portion of the page occupied by each frame, by using:
percentage (of the available space)
absolute value (in pixels)
"*" to use all the remaining space
in case of “overflow” the scrollbars (H & V) are activated
example (subdivision in 3 horizontal frames):
<frameset rows="20%,50%,30%">
examples (vertical subdivision):
<frameset cols="20%,80%">
<frameset cols="100,*,100">
Frame navigation
links should indicate in which frame (or window) the target page should be visualised:
<a href=URI target="name_of_a_frame"> … </a>
special values for target:
"_blank" (new window)
"_self" (in the same frame) = default
"_parent" (higher order frameset)
"_top" (span the entire window)
Frame example (I)
<!-- initial page -->
<frameset rows="80%,20%">
<noframes>
<p>The page cannot be shown</p>
</noframes>
<frameset cols="100,*">
<frame src="menu.html">
<frame src="p1.html" name="content">
</frameset>
<frame src="footer.html">
</frameset>
Frame example (II)
<!-- menu.html -->
<html>
<head> . . . </head>
<body>
<p><a href="p1.html" target="content">Pag.1</a></p>
<p><a href="p2.html" target="content">Pag.2</a></p>
<p><a href="p3.html" target="content">Pag.3</a></p>
</body>
</html>
Inline frame (iframe)
a frame handled as a single object (e.g. as an image)
therefore, it can be placed at any point in the page
initially supported only by IE
syntax:
<iframe src=uri …> . . . </iframe>
height=… width=… name=… frameborder=… marginwidth=… marginheight=… scrolling=yes/no/auto align=… vspace=… hspace=…
the text inside the tag is ignored by browsers supporting iframe, and visualised by the others (use it for error signalling)
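A possible use of an inline frame, including the fallback text, is sketched below. The page news.html and the sizes are hypothetical.

```html
<p>Text before the inline frame.</p>

<iframe src="news.html" name="news" width="300" height="150"
        frameborder="1" scrolling="auto">
<!-- visualised only by browsers that do not support iframe -->
<p>Your browser does not support inline frames:
<a href="news.html">open the news page</a> instead.</p>
</iframe>

<p>Text after the inline frame.</p>
```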
DIV and SPAN
introduced in HTML 4.0
to group parts and apply formatting more easily
DIV identifies a block (typically, browsers place a line break before and after a block)
SPAN identifies a part inside a block
frequently used to create (with an appropriate CSS) a page layout without using tables or frames
“id” and “class” allow references from the CSS
<div id="..." class="..."> ... </div>
<span id="..." class="..."> ... </span>
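A minimal sketch of a two-column layout built with DIV, SPAN and CSS follows. The names menu, content and warning, and the style rules themselves, are assumptions chosen for illustration; the style element belongs in the HEAD part of the page.

```html
<style type="text/css">
#menu    { float: left; width: 20%; }   /* selected through id */
#content { margin-left: 22%; }          /* selected through id */
.warning { color: #FF0000; }            /* selected through class */
</style>

<div id="menu">
<p><a href="p1.html">Pag.1</a></p>
<p><a href="p2.html">Pag.2</a></p>
</div>
<div id="content">
<p>Submit the report <span class="warning">before the deadline</span>.</p>
</div>
```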
General attributes of HTML tags
id = "string"
anchor for a link
reference to an element from a script
reference to a field in a form
reference for a specific style in CSS
class = "class1 class2 …"
list of classes to be used e.g. as CSS selectors
title = "title"
visualised when pointing to the element
lang = "language"
for automatic text reading (values: en it fr de …)
Favourite icon
the little icon near the URL
a 16 x 16 pixel image
old browsers:
only in MS icon format
in a fixed position and with fixed name = /favicon.ico
first step to standardization:
<link rel="shortcut icon" href="/icons/my.ico" type="image/vnd.microsoft.icon">
new browsers support the de-facto standard:
<link rel="icon" type="image/png" href="/icons/my.png">