Rule n° 125 - Each page’s source code indicates the content’s main language.

A web page is written in a language that the tools analysing it cannot always identify on their own. The page's source code must therefore state this language explicitly so that automated tools can read it.

  • Support content indexing by language.
  • Facilitate machine translation.
  • Enable correct reading of the content by a synthesized speech tool.
  • Improve the accessibility of content for people with disabilities.
  • Improve the way content is taken into account by search engines and indexing tools.


Set the lang attribute of the html root element to the appropriate language code (as listed in the registry maintained by IANA: http://www.iana.org/assignments/language-subtag-registry). In practice, for French this is <html lang="fr"> (in HTML) and <html lang="fr" xml:lang="fr"> (in XHTML).

Otherwise, in more complex cases, the content language can be indicated on the relevant parent elements: head, body, title, etc.

Verification consists of checking that the lang attribute of the html element is present and relevant in the source code (or, failing that, checking its descendant elements). In the source code of each page:

  • Check that the default language of the content is indicated by the lang attribute of the html element, for example <html lang="fr"> (in HTML).
  • If not, check that each content element at least inherits a language from the lang attribute of a parent element (head, body, title, etc.).
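The first check above can be automated in part. The following is a minimal sketch in Python, assuming the page source is already available as a string; the names RootLangParser and root_lang are hypothetical and introduced only for illustration. It reports whether the html element carries a non-empty lang attribute, and does not judge whether the value is relevant to the content.

```python
from html.parser import HTMLParser


class RootLangParser(HTMLParser):
    """Records the lang attribute of the first <html> start tag, if any."""

    def __init__(self):
        super().__init__()
        self.seen_html = False
        self.lang = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        if tag == "html" and not self.seen_html:
            self.seen_html = True
            # Only the plain lang attribute counts; xml:lang is a different key.
            self.lang = dict(attrs).get("lang")


def root_lang(source):
    """Return the lang value of the html element, or None if missing or empty."""
    parser = RootLangParser()
    parser.feed(source)
    parser.close()
    value = (parser.lang or "").strip()
    return value or None
```

For example, root_lang('<html lang="fr">') returns "fr", while a page whose html element has no lang attribute, or only xml:lang, yields None and falls through to the second check on parent elements.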

Check the validity and relevance of the language code used, for example with the Language Subtag Lookup Tool by Richard Ishida: https://r12a.github.io/app-subtags/.

Common cases of incorrect language codes include jp instead of ja for Japanese, lu instead of lb for Luxembourgish, gr instead of el for Greek, lat instead of la for Latin, and oci instead of oc for Occitan. Additionally, the codes mul for "multiple languages" and und for "undetermined language" must not be used in web content. Finally, the xml:lang attribute can be set in addition to the lang attribute, but on its own it is not sufficient to comply with this best practice.
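The common mistakes listed above lend themselves to a simple lookup. The sketch below, in Python, encodes only the cases named in this rule; the names COMMON_MISTAKES, DISALLOWED, suggest_correction and is_disallowed are hypothetical. It is a hint for manual review, not a validation against the IANA registry: a flagged code can itself be a registered subtag (lu, for instance, designates Luba-Katanga), so the page content still has to be checked by a person.

```python
# Incorrect code -> code the rule says was probably intended.
COMMON_MISTAKES = {
    "jp": "ja",    # Japanese
    "lu": "lb",    # Luxembourgish
    "gr": "el",    # Greek
    "lat": "la",   # Latin
    "oci": "oc",   # Occitan
}

# Codes the rule says must not be used in web content.
DISALLOWED = {"mul", "und"}


def primary_subtag(code):
    """Return the lowercased first subtag of a language tag such as 'fr-CA'."""
    return code.strip().lower().split("-")[0]


def suggest_correction(code):
    """Return the suggested primary subtag for a known mistake, else None."""
    return COMMON_MISTAKES.get(primary_subtag(code))


def is_disallowed(code):
    """Return True if the primary subtag is mul or und."""
    return primary_subtag(code) in DISALLOWED
```

For example, suggest_correction("jp") returns "ja", suggest_correction("fr") returns None, and is_disallowed("und") returns True.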

