The basics of HTML and CSS are often the very first lessons in “teach yourself to code” tutorials. This makes sense, as these two languages build the basic visuals of a website and are much easier for a beginner to wrap their head around than, say, creating an array in Ruby. They teach just enough to get you up and running so that you can move on to the fun stuff, but rarely move beyond the simplest of elements, much less teach us why semantic HTML is important. It makes sense to start here, but the fact that these tutorials typically only teach very basic HTML, combined with the fact that many developers never return to these “basics,” means that our knowledge stays, well, basic. I hate to admit that I’m one of these people: I’d rather dedicate my time to learning cool things like flexbox (with my favorite frog-based game) and beautiful CSS gradients than learn about all the elements HTML5 gives us beyond section. But alas, my mentor here at 8th Light has (I suspect) caught me using one too many div elements, and it’s high time I went back and augmented my HTML knowledge, which at this point has been the very minimum needed to hand-code a website. This might fly at some companies, but 8th Light is all about doing things right and not cutting corners, meaning it’s time for me to give up my constant use of divs and learn to do things properly now as an apprentice, before I ever have the chance to subject a client to an abundance of divs.
Semantics is the study of the meaning of words and phrases in language, and semantic elements are simply elements that have a meaning. In HTML, a semantic element describes its own meaning. For example, the header element describes itself: “header” is both its name and its meaning. In contrast, the div element is non-semantic, because “div” is not a word that tells us anything about its content. span is the other commonly used non-semantic HTML element.
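To make the contrast concrete, here is a small sketch of the same page header written twice. The class name, site title, and link target are made up for illustration.

```html
<!-- Non-semantic: the class name is the only hint about what this is -->
<div class="header">
  <div>My Site</div>
  <div><a href="/about">About</a></div>
</div>

<!-- Semantic: the element names themselves describe the content -->
<header>
  <h1>My Site</h1>
  <nav><a href="/about">About</a></nav>
</header>
```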
Why Bother with Semantic HTML?
Using proper semantic elements in our HTML is important for a lot of reasons. First, it makes it easier for us to go back and change things later. Also, since it’s easier to understand, a new person on the project can pick up the code and learn it much faster. Second, it helps us when writing our CSS styles: we can change the look of the site without recoding all of the HTML.
In addition to making life easier for human developers, semantic HTML is also useful for browsers and search engines. Browsers look for semantic HTML, and when they find it, two important things happen. First, visually impaired people are able to have screen readers read the page to them properly, so using semantic HTML aids accessibility. Second, search engines understand what the content is about and are able to rank your site more accurately, so semantic HTML helps improve SEO.
Non-semantic HTML elements include div and span. div is short for “division,” as in, it divides your elements into chunks of code. A div creates a block-level element (so essentially there is a line break before and after it), while span tags are inline and used for a small chunk of HTML inside a line (for example, a few words in a paragraph). I was unable to figure out the official meaning behind the term “span,” but think it might have to do with the literal meaning of “span,” which is “the full extent of something from end to end,” as in “the bridge spans a quarter mile over the river.” This might make sense considering a span element is inline and spans exactly the area we assign it to (as opposed to the block-level div, which has a line break on either end, meaning it covers a bit more than the actual area we assign it to).
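A short sketch of the block versus inline difference described above. The class names and text are placeholders, not taken from any real page.

```html
<!-- span is inline: it wraps a few words without breaking the line -->
<p>
  Semantic HTML is <span class="highlight">worth learning</span> properly.
</p>

<!-- div is block-level: each one starts on a new line and fills the width -->
<div class="card">This div sits on its own line.</div>
<div class="card">So does this one.</div>
```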
I made a quick CodePen to show the advantage of using proper semantic HTML over divs. Other than being the right way to do things, better for screen readers (and therefore accessibility), and helpful for SEO, it also results in less code, which is easier to read and maintain.
Categories of Semantic HTML
The html element is the root of the document. All other elements have to be descendants of this element.
Metadata contains information about the page (as opposed to everything that comes after in this list, which actually makes something appear on the screen). Metadata includes information about styles, scripts, and data that help browsers and search engines render and use the page.
The base element specifies the base URL to use for all the relative URLs contained within a document. There can be only one base element per document.
The head element holds the rest of the metadata, including the page’s title and links to its style sheets, scripts, and Google Fonts.
The link element shows relationships between the current document and an external resource and is what is commonly used to link to style sheets. This element is explained further later on.
The title element defines the title of the document, which is shown in the browser’s title bar or the page’s tab. The title can only contain text.
The style element contains style information for the document, written in CSS directly inside the element. External style sheets are linked with the link element instead.
The meta element is for any metadata that can’t be represented by one of the other metadata elements listed above. This often includes the line
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
which designates the character set, as well as the line
<meta name="viewport" content="width=device-width, initial-scale=1.0">
which tells mobile browsers to lay the page out at the width of the device, so that CSS media queries (which are used for responsive design styles) behave as expected.
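Putting the metadata elements together, a minimal document might look like the sketch below. The base URL, title, and style sheet path are invented for the example.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <!-- Character set and viewport, as described above -->
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <!-- Base URL for every relative URL in the document (hypothetical) -->
    <base href="https://example.com/blog/">
    <title>Semantic HTML Notes</title>
    <!-- External style sheet, resolved against the base URL -->
    <link rel="stylesheet" href="css/style.css">
    <!-- CSS written directly in the document -->
    <style>
      body { font-family: sans-serif; }
    </style>
  </head>
  <body>
    <p>Everything visible on the page goes in the body.</p>
  </body>
</html>
```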
Content section semantic HTML tags include body, header, footer, section, h1 through h6, nav, aside, article, and address. These are very straightforward. The body tag encompasses the entire body of your document and is second only to the html tag. The header element specifies a header for your document or section, and the footer element specifies the footer. The section tag groups together thematically related content. The h1 through h6 elements are headings, with h1 being the most prominent and having the largest default style, and each heading decreasing in importance and size after it. The nav element is short for “navigation” and defines a block of navigation links that will take the user to other main pages. An aside element defines content aside from the content it’s placed in. The information in an aside should be related to its surrounding content. The article tag specifies independent, self-contained content, like a news article, blog post, or comment. Finally, the address element supplies contact information for its nearest article or body ancestor.
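As a rough sketch, the sectioning elements above could fit together like this. The site name, headings, links, and email address are all made up.

```html
<body>
  <header>
    <h1>Garden Diary</h1>
    <nav>
      <a href="/">Home</a>
      <a href="/archive">Archive</a>
    </nav>
  </header>

  <section>
    <h2>Latest posts</h2>
    <article>
      <!-- An article can carry its own header -->
      <header>
        <h3>Planting tomatoes</h3>
      </header>
      <p>Start the seeds indoors about six weeks before the last frost.</p>
      <aside>
        <p>Related: a guide to companion planting.</p>
      </aside>
    </article>
  </section>

  <footer>
    <!-- Contact information for the nearest article or body ancestor -->
    <address>
      Written by <a href="mailto:gardener@example.com">the gardener</a>.
    </address>
  </footer>
</body>
```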
While this is all pretty obvious, there are a few things that often get skipped over in basic HTML tutorials. First, it’s a common misconception that you should only use header once per page, but that is not the case. W3Schools states: “the header element should be used as a container for introductory content. You can have several header elements in one document.” They provide the following example:
The aside element is not only used for sidebars (though this example is so overused that you’d be forgiven for thinking this was the case). The aside tag is for a section of the page that has content that is tangentially related to the rest, but should be separate from that content. Examples outside of a sidebar include an author biography, profile information, and related links.
The article tag often gets replaced with the section tag due to misunderstanding about what the article tag is actually for, as well as confusion between these two tags in general. article is for more than just what we think of as an “article” (like a news article): it is for any independent, self-contained content, which also includes things like a blog post or a comment on an article or blog post. On the other hand, section is for grouping distinct sections of content or functionality. While the exact difference is heavily debated, my understanding is that section is broader than article. For example, a section could hold all the blog content, with an article used for each blog post.
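One way to picture that split, assuming a simple blog index page (the post titles and text are invented):

```html
<!-- The section groups all of the blog content -->
<section>
  <h2>Blog</h2>

  <!-- Each post is self-contained, so each one is an article -->
  <article>
    <h3>Learning flexbox</h3>
    <p>Notes from a week of layout experiments.</p>
  </article>

  <article>
    <h3>Why semantics matter</h3>
    <p>Screen readers and search engines both rely on them.</p>
  </article>
</section>
```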
Again, a fair few of these were new to me, despite the fact that I took a (paid, in-person) front-end dev class where I coded a few websites by hand. Lots of these get covered in the basic online tutorials, some of them don’t, and even the really basic ones have lots of interesting rules and abilities that the casual developer might not know about. Let’s dig into it.
The p tag makes a paragraph. This might be the most well-known content tag out there. But did you know that this is more structural than logical? List elements (such as ul) cannot be the children of a p element. So if we wrapped the sentence “I need groceries” and the list that follows it in a single p element styled with red text, only the first line, “I need groceries,” would be red, because the browser closes the p element as soon as it reaches the list. In order for the entire thing to be red, we would have to replace the p element with a div element. Alternatively, we could close the p tag before the list and then write our CSS to style all the elements, but that would require more code.
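The behavior described above can be sketched as follows. This is a reconstruction under assumptions: the inline style and the list items are made up, and only the sentence “I need groceries” comes from the text.

```html
<!-- What we might write: the list appears to sit inside the red paragraph.
     The browser closes the p as soon as it reaches the ul, so only
     "I need groceries" is red; the leftover </p> becomes an empty paragraph. -->
<p style="color: red;">
  I need groceries
  <ul>
    <li>Eggs</li>
    <li>Milk</li>
  </ul>
</p>

<!-- A div is allowed to contain both, so everything inside it is red -->
<div style="color: red;">
  <p>I need groceries</p>
  <ul>
    <li>Eggs</li>
    <li>Milk</li>
  </ul>
</div>
```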
It’s also important to remember that we shouldn’t use a p element if there’s something more specific that fits, for example the address element for contact information.
The hr tag is used to represent a paragraph-level thematic break, such as a transition to a new topic in a reference book or a scene change in a story. It results in a line across the width of the page with a blank line on either side. It does not have a closing tag.
Most people would probably want to create a border that they could style to the desired width and color, but the line the hr tag makes is also an option.
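A minimal sketch of a thematic break between two scenes (the story text is invented):

```html
<p>The train pulled out of the station just as the rain began.</p>
<!-- hr has no closing tag; it marks the change of scene -->
<hr>
<p>Three weeks later, the letter finally arrived.</p>
```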
The pre tag represents a block of preformatted text. Essentially, things like line breaks and spacing that you would have in a text editor like Pages or Word are preserved in your code as if it were a standard text editor. For example, when the spacing and layout of the words in a poem are an important part of the poem and need to be preserved, instead of using all sorts of CSS to get it to look right, we can just copy and paste the poem into Sublime (my preferred text editor for writing code) surrounded by the pre tags. (This is also useful when you’re writing a code example and want the example to maintain the spacing/indenting rules of that language.)
When I copied text from my Pages document into Sublime between pre tags, all the weird spacing was kept in the rendered page.
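A small sketch of preformatted text. The lines below are made up and simply stand in for any text whose spacing matters.

```html
<!-- Spaces and line breaks inside pre are rendered exactly as typed -->
<pre>
Roses are red,
      violets are blue,
            this spacing is kept
                  exactly as typed.
</pre>
```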
The blockquote tag represents a section that is quoted from another source. (Don’t get this confused with the q tag, which you can read about in the next section.) The content of a blockquote must be quoted from another source, which can be cited with the cite tag if possible. The cite tag represents the title of a work. Only works can be cited, and people are not works, so when the two are used together to quote a book, the cite tag goes around the name of the book, not the author. It’s also important to note that a citation is not a quote. (Read about the q element, which is for inline quotes rather than block quotes, in the next section.) If you’re putting quotation marks around spoken text, you wouldn’t use cite for the quote; you would use q, or plain quotation marks, instead.
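A sketch of blockquote, cite, and q used as described. The book quotation is the opening line of Pride and Prejudice, and the spoken sentence is invented.

```html
<blockquote>
  <p>It is a truth universally acknowledged, that a single man in possession
  of a good fortune, must be in want of a wife.</p>
</blockquote>
<!-- cite wraps the title of the work, not the author's name -->
<p>From <cite>Pride and Prejudice</cite> by Jane Austen.</p>

<!-- Spoken words get q (or plain quotation marks), not cite -->
<p><q>This markup is correct</q>, said the editor.</p>
```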
The best way to think about the figure tag is to think about a textbook or manual that has different figures throughout the text. The figure could be an image, chart, graph, or some other type of information. Usually the author will discuss a topic and write something like “see figure 4.1.” Figure 4.1 will relate to this writing and also have a caption, like “a variety of method cards.” The figure is the figure and the caption is the figcaption.
A figure element is usually referenced from the surrounding flow, but it should be self-contained so that it can easily be moved around, from the page to its own dedicated page or to an appendix. Also, it’s recommended that figure elements be given their own name so that they can easily be referenced (“see figure 4.1”) as opposed to identified by where they are placed on the page (“see the figure to the right”), in case the figure gets moved later.
The figcaption child represents the caption of the figure. If there’s no figcaption, there’s no caption.
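A minimal sketch of a figure with a caption. The image file name and id are hypothetical; the caption text echoes the example above.

```html
<p>The cards come in several styles (see figure 4.1).</p>

<!-- The figure is self-contained, so it could move to an appendix unchanged -->
<figure id="figure-4-1">
  <img src="method-cards.jpg" alt="Several method cards laid out on a table">
  <figcaption>Figure 4.1: A variety of method cards.</figcaption>
</figure>
```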
The ol element stands for “ordered list,” as in a list that is numbered, such as the steps in an experiment.
The ul element stands for “unordered list,” such as a grocery list.
li stands for “list item” and is the child of an ol or ul element. An li element's end tag can be omitted if the li element is followed by another li element or if there is no more content in the parent element.
An unordered list results in a list with bullet points.
By default, the list items in an ol element will be numbered starting with one.
There is also the option to “reverse” the list by adding the reversed attribute to the ol element. For example, if you were making a list of the top 5 best shows on Netflix right now, and wanted the list to start with 5 and end with 1, a reversed list would number the items from 5 down to 1.
For even more control, you can give your li elements a value attribute. Setting the values 5, 4, 3, 2, and 1 by hand has the same result as the reversed list.
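The list variations above, sketched with placeholder items (shortened to three entries to keep the example small):

```html
<!-- Unordered list: rendered with bullet points -->
<ul>
  <li>Eggs</li>
  <li>Milk</li>
</ul>

<!-- Ordered list: numbered 1, 2, 3 by default -->
<ol>
  <li>Boil the water</li>
  <li>Add the pasta</li>
  <li>Drain and serve</li>
</ol>

<!-- Reversed ordered list: numbered 3, 2, 1 -->
<ol reversed>
  <li>Third-best show</li>
  <li>Second-best show</li>
  <li>Best show</li>
</ol>

<!-- The same numbering set by hand with the value attribute -->
<ol>
  <li value="3">Third-best show</li>
  <li value="2">Second-best show</li>
  <li value="1">Best show</li>
</ol>
```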
Since these HTML elements are not commonly taught, it’s best to just say they are for creating certain types of lists and start with an example.
dl stands for “description list,” and this element represents a list of things that are associated with each other.
The dt element represents a term, name, or part of a description within a description list (the dl element). I couldn’t confirm this, but think that dt stands for “description term.”
The dd element represents a description, definition, or value.
dl requires an opening and closing tag. A dt element, on the other hand, doesn’t need an end tag if it’s immediately followed by another dt element or by a dd element. The dd element doesn’t need a closing tag if it’s immediately followed by another dd element or by a dt element.
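A short sketch of a description list, with terms borrowed from this post. The end tags are written out here even though some of them are optional.

```html
<dl>
  <!-- Each dt is a term; the dd after it is its description -->
  <dt>ol</dt>
  <dd>An ordered (numbered) list.</dd>

  <dt>ul</dt>
  <dd>An unordered (bulleted) list.</dd>
</dl>
```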
The main element is used to signify what the main content of a page is. It’s different from elements like nav in that it’s not sectioning content. This means that it doesn’t contribute to the document outline. What it does do is help screen readers understand where the main content is. It came about because people thought that since we had elements like aside, we should also have something that represents the main part of the content. Before this, people often created a div and gave it a class of “main” or “content.”
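The before and after might look like this sketch (the heading and paragraph are placeholders):

```html
<!-- Before: a div with a class standing in for the main content -->
<div class="main">
  <h1>Semantic HTML</h1>
  <p>The content people came to the page for.</p>
</div>

<!-- After: the main element says the same thing without a class -->
<main>
  <h1>Semantic HTML</h1>
  <p>The content people came to the page for.</p>
</main>
```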
Text-Level Semantics
The a element is most commonly seen with the href attribute to create a link. In that case, the a is the hypertext anchor. The a element can also be used without href as a placeholder for where a link could otherwise go, for example for the current page in a set of navigation links.
The a element can wrap around entire paragraphs, lists, tables, etc., as well as entire sections, so long as there aren’t any buttons, links, or other interactive content within it.
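Both uses can be sketched like this, with invented URLs and text:

```html
<nav>
  <!-- No href: a placeholder for the page we are already on -->
  <a>Home</a>
  <a href="/blog">Blog</a>
</nav>

<!-- One link wrapped around a heading and a paragraph -->
<a href="/blog/semantic-html">
  <h2>Semantic HTML</h2>
  <p>Why the choice of element matters.</p>
</a>
```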
The em element represents emphasis in content. The placement of emphasis on certain words in a sentence can change that sentence’s meaning, and the em tag achieves that stress in HTML. For example, wrapping only the word “You’re” in em tags puts the emphasis on “You’re.”
The em element is not the same as generic italics. For italics, i is used. The i element represents a span of text that carries a different mood or voice or indicates a different type of text, such as a taxonomic designation, a technical term, a word or phrase from another language, or a thought.
Often, a class attribute is used on the i element to identify why the element is being used (for example, a word in a different language as opposed to a taxonomic term). This is done so that if the styling for one of those uses needs to be changed at a later date, the author doesn’t have to search the entire document and annotate each use.
Although em and i result in the same style of slanted text, it’s important to use the right element, and a class name if necessary, for screen readers and for ease of change later on.
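A sketch of em next to i with a class. The sentences and class names are made up.

```html
<!-- em: stress that changes the meaning of the sentence -->
<p><em>You’re</em> the one who picked this color.</p>

<!-- i with a class: italic for a reason other than emphasis -->
<p>The house cat, <i class="taxonomy">Felis catus</i>, sleeps most of the day.</p>
<p>She shrugged and said <i class="foreign" lang="fr">c’est la vie</i>.</p>
```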
The strong element indicates strong importance, seriousness, or urgency. On the other hand, the b element is used to represent a span of text to which attention needs to be drawn for utilitarian purposes but which does not convey any extra importance. An example of this might be a keyword in a text. The b element is also often used with a class to identify why the element is being used and make changes later on easier. Again, although they both result in what we think of as “bold” text, it’s important to use whichever better fits the purpose for screen readers.
The small element represents small print and other side comments that should have smaller text than the text around them.
The s element represents content that is no longer relevant or accurate, and results in a strikethrough of the text.
The q element represents quoted text. However, using q elements to mark up quotations is optional if you’d rather use explicit quotation punctuation. The q element should not be used when quotation marks are there for something other than a quote, for example when someone is being sarcastic.
The dfn element represents the defining instance of a term. For example, when the word “Irregardless” is wrapped in dfn at the point where it is defined, “Irregardless” becomes italicized.
The abbr element represents an abbreviation or acronym. The explanation of the abbreviation/acronym is optional and is given with the title attribute. Doing this results in the explanation appearing when the user hovers their mouse over the word. For example, wrapping “WHO” in an abbr element with a title of “World Health Organization” results in a small box with the words “World Health Organization” appearing when the user hovers over “WHO”. (Unfortunately, this was impossible to get a screenshot of, but run this on your machine to see exactly what happens.)
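The elements from strong through abbr, sketched in one place. All of the sentences are invented; only “Irregardless” and “WHO” come from the text above.

```html
<p><strong>Warning:</strong> this action cannot be undone.</p>
<p>Search the page for the keyword <b class="keyword">semantic</b>.</p>
<p><small>Prices do not include tax.</small></p>
<p><s>Doors open at 7pm.</s> Doors now open at 8pm.</p>
<p>The sign said <q>back in five minutes</q>.</p>

<!-- dfn marks the place where the term is defined -->
<p><dfn>Irregardless</dfn> is a nonstandard word meaning “regardless.”</p>

<!-- The title text appears when the user hovers over the abbreviation -->
<p>The <abbr title="World Health Organization">WHO</abbr> published new guidance.</p>
```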
The ruby tag specifies a ruby annotation. A ruby annotation is a small piece of extra text attached to the main text to explain its pronunciation or meaning. The rt element gives the annotation text and rp (which is optional) defines what to show in browsers that don’t support ruby annotations. It’s common in Japanese text.
When a character needs an explanation, the character goes in the ruby element and the explanation goes in an rt element inside it.
There are lots of rules about how to handle things like compound words, nesting, ancestry, and other things, so if you’re going to use this, be sure to look at its HTML spec.
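A minimal sketch of a ruby annotation, using the two characters of the word “kanji” with their readings. The choice of word is mine, not from the original example.

```html
<ruby>
  漢 <rp>(</rp><rt>kan</rt><rp>)</rp>
  字 <rp>(</rp><rt>ji</rt><rp>)</rp>
</ruby>
<!-- Browsers without ruby support show the rp parentheses: 漢 (kan) 字 (ji) -->
```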
The data element represents its contents, along with a machine-readable form of those contents in the value attribute. The value attribute must be present and must be a representation of the element’s contents in a machine-readable format.
The data element is used when the computer needs some information but the user doesn’t. For example, a list of products could wrap each product name in a data element whose value attribute holds that product’s UPC code. The computer can understand the value given. The user doesn’t necessarily want to see this, but it needs to be there so that the computer can do other things with it.
The time element is used to mark up a time so that it is machine-readable. It doesn’t show up as anything special to users (but that doesn’t mean it’s not still important!). The content in a time element can include dates, times, time zones, and durations. We use the “datetime” attribute when specifying a date.
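A sketch of data and time side by side. The product names, code numbers, and dates are all made up for illustration.

```html
<!-- data: the value is for the computer, the text is for the reader -->
<ul>
  <li><data value="012345678905">Canned tomatoes</data></li>
  <li><data value="098765432109">Olive oil</data></li>
</ul>

<!-- time: datetime holds the machine-readable form -->
<p>The workshop is on <time datetime="2024-03-15">March 15</time>
at <time datetime="14:30">2:30 in the afternoon</time>
and lasts <time datetime="PT2H30M">two and a half hours</time>.</p>
```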
The code element changes the way a word looks to indicate that it’s code and not normal text. It represents a fragment of computer code. (Note that there is no formal way to specify which language the code is written in.)
For example, when the word “time” is written inside a code element in the middle of a sentence, it looks different from the text around it, and this difference signifies to the user that it is a fragment from a programming language.
The var element represents a variable. It could be a variable in a mathematical expression or programming language, or an identifier representing a constant, a symbol for a physical quantity, a function parameter, or just a placeholder for a variable. Text inside var tags appears italicized to the user.
The samp element represents sample output from a computer program.
The kbd element is short for “keyboard” and represents keyboard input. This is useful when giving directions, and the HTML spec gives an example of this. Just like with the code element, the text inside of kbd has a different style when rendered.
The sub element represents a subscript and the sup element represents a superscript.
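The code-related elements above, sketched with invented sentences. The function name render() and the program output are hypothetical.

```html
<p>Call <code>render()</code> after the value of <var>n</var> changes.</p>
<p>The program prints <samp>Build finished</samp> when it is done.</p>
<p>Press <kbd>Ctrl</kbd> + <kbd>S</kbd> to save.</p>
<p>Water is H<sub>2</sub>O, and E = mc<sup>2</sup>.</p>
```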
The u element represents text with non-textual annotation. This could include labeling a piece of text as a proper name in Chinese or labeling the text as misspelt. The u element makes text underlined. It’s important to not use this when a user might think that the underlined text is a link, as links are often underlined to show that they are clickable. Also, there’s usually some other markup element that’s more appropriate to use than u, such as em for emphasis. The u element is rarely used.
The mark element represents text that is marked (highlighted) for reference purposes. By default, it results in the text having a yellow background.
The bdi element represents text that needs to be isolated from its surroundings for the purposes of bidirectional text formatting. This is useful when working with user-generated content where we can’t be sure of the directionality, perhaps because it could be written in a language that reads in a different direction. The example from the HTML spec uses a name written in Arabic, in a list of user names and post counts. Because the bdi element was used, these names and post numbers output to an unordered list with bullet points as expected. But if the bdi tags weren’t there, the number “3” would jump to be next to the word “User.” This reordering is done by the browser’s bidirectional algorithm when it lays out the text, not by anything in our code.
The bdo element allows the author to override the Unicode bidirectional algorithm by explicitly specifying a direction override for its children. This is done by setting the “dir” attribute to “ltr” for left-to-right or “rtl” for right-to-left.
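A sketch of both elements. The user names and post counts are placeholders in the spirit of the example described above.

```html
<ul>
  <!-- bdi isolates each name so it cannot reorder the text around it -->
  <li>User <bdi>alice</bdi>: 12 posts.</li>
  <li>User <bdi>إيان</bdi>: 3 posts.</li>
</ul>

<!-- bdo forces a direction for its contents -->
<p><bdo dir="rtl">This text is displayed right to left.</bdo></p>
```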
The br element represents a line break. It should only be used when the text actually calls for a line break, such as in poems and addresses. It’s often used to create more space in a layout, but the correct way to do this would be to use CSS to specify things like margin and padding. It should not be used for separating thematic groups in a paragraph.
wbr represents a line break opportunity, as opposed to an actual line break. If you wanted to have a bunch of words all be together without spacing but still wanted them to be able to wrap in a readable fashion, you would use this. It keeps the text together when there is room, but leaves the opportunities for line breaks to occur if the screen size is small enough to need it.
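A sketch of the difference between the two. The address and the long URL are made up.

```html
<!-- br: the line breaks are part of the content -->
<p>
  123 Example Street<br>
  Springfield<br>
  USA
</p>

<!-- wbr: the text may wrap at these points, but only if it has to -->
<p>https://example.com/<wbr>a-very-long-path/<wbr>that-keeps-going/<wbr>index.html</p>
```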
Links and Link Types
In HTML, link types indicate the relationship between two documents. One links to the other using a link element, an a element, or an area element.
The link element specifies the relationship between the current document and an external one. One of the most popular ways to use this is to import a CSS stylesheet into an index.html page. For example:
<link rel="stylesheet" href="CSS/style.css">
The “rel” (relationship) value in this case is “stylesheet,” but it can be set to a lot of different values. For example, a rel of “help” would indicate that the link leads to a help resource for the whole page. A value of “author” indicates that the link leads to information about the author or how to contact her, and “license” leads to licensing information.
The a element is the anchor element, and defines a hyperlink to a location on any other page on the web or to a location on the same page. It can also be used to create an anchor point at various spots in a page so that links aren’t limited to connecting only to the top of the page. The HTML specs say this is obsolete, but with the current trend of “endless scroll” style pages, anchor points have made a resurgence.
There are several attributes that can be given to the a element. The “href” attribute holds the destination and is required for linking to external pages. The “target” attribute is commonly used and dictates where to display the linked resource. When “target” is set to “_blank” the link will open in a new tab.
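A sketch of the a element attributes discussed above. The URLs, the id, and the link text are invented, and the in-page link here targets an id, which is one common way to point at a spot within a page.

```html
<!-- External link that opens in a new tab -->
<a href="https://example.com/" target="_blank">Example site</a>

<!-- Link types on a elements -->
<a rel="author" href="/about">About the author</a>
<a rel="license" href="/license">License</a>

<!-- Link to a location on the same page -->
<a href="#conclusion">Skip to the conclusion</a>
<h2 id="conclusion">Conclusion</h2>
```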
The area element is less commonly used than a but still worth knowing about. It creates a clickable “hotspot” on an image. This must be done within the map element. On the image, an attribute called “usemap” points to the name of a map element, and the area elements inside that map hold the coordinates of where the image will be clickable. W3Schools has a great example of this with an image of a planet in space. The planet is clickable while the black space around it is not. This is done by setting the “usemap” attribute to the map’s name, and then defining a specific area of the image using the shape and coords attributes of an area element.
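A minimal image map sketch along the same lines. The file names, map name, image size, and coordinates are hypothetical and are not the W3Schools example itself.

```html
<!-- usemap points at the map's name, prefixed with # -->
<img src="planet.jpg" alt="A planet in space" width="400" height="300" usemap="#planetmap">

<map name="planetmap">
  <!-- A circle centered at (200, 150) with a radius of 80 pixels is clickable -->
  <area shape="circle" coords="200,150,80" href="planet.html" alt="The planet">
</map>
```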
At the end of the day, learning all the nitty-gritty of semantic HTML is dull, but extremely important. First because it’s key to screen readers and accessibility for people who are low-vision/blind, second because of factors like SEO, and third because if you want to write quality, sustainable code, you have to do things by the book. After spending a few days poring over the HTML specs, I have a piece of advice: aid your understanding of some of the weirder elements with W3Schools. Towards the end of writing this post I had one of their “Try it!” windows open and was dumping the examples from the specs into it to see what happened. A shortcoming of the specs is that they don’t actually show you what an example results in. The mark element is explained as “marked or highlighted,” but I didn’t realize until I actually rendered the example that the marked area literally has a bright yellow background. A detailed and technical explanation is all well and good, but be sure to actually run their examples or you might not actually understand what’s going on. Also, watch for some strangely political examples. There was a pro-atheism thread through the spec examples, as well as one that seems to be throwing shade at W3Schools. Regardless of all of this, understanding that HTML is a lot more than just headers and p tags is important, and spending a day reading the specs and testing out the examples will make you a much more solid front-end developer.