I’ve been dealing a bit with the WCAG accessibility guidelines at work, and one of the concepts that keeps recurring is that various kinds of information about the page is supposed to “be programmatically determined”. But what does that mean, and why is it important?
The thing is, when you can look at at a web page, the human brain is very good at figuring out the relationships between the parts of the page based just on their visual presentation. See some bold faced type? Must be there for emphasis. Those bigger runs of text? Must be headings and subheadings. And tables make sense when you can run your eyes up and down the columns, and across the rows.
HTML was originally designed as a way to mark up text documents semantically for the scientific community. But as the web became more popular, more and more presentational elements were added, and the focus of of web development turned to how pages look. For example, I distinctly remember my then-current copy of the The Non-Designer’s Web Book recommending that you create headings by using the <font>
tag to bump up the font size, and using a <br>
tag to separate it from the text body, because browsers made headings too big and put too much white space around headings, and it couldn’t be adjusted.
That’s all great if you can see the page, but pretty useless if you can’t. And many people can’t, so they have to rely on software to read the pages to them. Making sure that information about your page can be “programmatically determined” just means using the features of HTML to embed information about the page in the markup so software can figure it out.
Examples of making things able to be “programmatically determined”
Let’s start with headings and subheadings. If you just use font sizing and a break tag to denote a heading, like we used to, it looks like just a part of the paragraph to a screen reader. On the other hand, if you use one of the heading elements, like an <h3>
, the screen reader can recognize it, and indicate it to the listener.
Form controls are another problem. Typically, the labels for the controls are recognizable because visually they’re right next to the control. Not so easy for the screen reader — it doesn’t know which words are tied to which control, and if a table is used to lay out the form, the label might not even be adjacent to the control in the HTML source. This is solved by wrapping the label text in the <label>
element, and using its “for” attribute to indicate which control it’s associated with. The software can now determine the relationships programmatically.
Another problem with forms is organization. Forms can get pretty big, and hard to comprehend. You can indicate relationships between form fields by using the <fieldset>
and <legend>
elements to group sections of the form; this also has the benefit of grouping elements visually. Assistive technology can now recognize the organization, because it’s in the structure of the page.
Emphasis can also be indicated structurally. <em>
tags indicate emphasis, and are generally rendered in italic, while <strong>
elements indicate strong emphasis and are generally rendered in bold. Screen readers can pick up on these cues, and add the proper inflection.
On the other hand, the <b>
and <i>
tags have no intrinsic meaning — they just indicate bold face and italic text. The recommendation is to use <em>
and <strong>
if you want to convey emphasis, and a span
element with a style declaration for a presentational effect.
Here I differ a little from the recommendation; if my intent is strictly presentational, I’m willing to use <i>
since my intent is just presentation. For example, <p>I <em>loved</em> visiting the USS <i>Constitution</i>.</p>
, which would be rendered as:
I loved visiting the USS Constitution.
The difference is that “loved” is emphasized, whereas the italic on “Constitution” is just a typographic convention.
Finally, there are tables. Back in Ye Olde Netscape 4 days, we used to use make table headings by setting the color and background of the first row of cells, and maybe making the text bold. This is fine if you can scan up and down the rows, but sucks if the you’re listening to the table cell by cell. The right way is to use the <th>
(table header) element to mark up headers. Even better is to use the scope
attribute on the <th>
element to indicate if it’s a header for a row or a column, and to use the <thead>
, <tfoot>
and <tbody>
elements to indicate sections of the table. With this markup in place, software can then decipher the relationships between cells and communicate them to the user.
Don’t abuse semantic elements
A last part of making sure that information about your document can be “programmatically determined” is to recognize semantic elements for what they are, and to not pollute your document for their presentational side effects. The biggest offender here is the <blockquote>
element, abused far and wide because it has indented sides. Well, dummy, it’s intended to convey the fact that the enclosed text is a quotation, and if it isn’t, you’re saying something about the text that isn’t correct, and can confuse software that’s trying to figure out your content programmatically. If all you want to do is indent text, use styles to do so. Other elements that tend to get misused for their presentational effects are the list elements <ul>, <ol>, <dl> <dd><dt>
.
But Why Should I Care?
Making information about your content “programmatically available” benefits more than screen readers. It makes your content more accessible to software generally — and that includes search engines as well.