Since I have stopped my current project with the new theme, I am now allowing myself to blog about other topics again. This week, I want to write about an important global HTML attribute: the lang attribute.
Maybe some of you have never (actively) used this attribute, but it’s quite important. Not only does it tell a search engine what language the content on your website is written in, it also tells assistive technology, like a screen reader, which voice to use when reading the text. It can even be set inline for a single word or part of a text.
The main lang attribute, however, is set on the <html> tag. In WordPress, this is handled by the language_attributes() function, which you would usually find in the header.php file of a classic theme. In a block theme, this is handled automatically in Core.
Reasons to overwrite the lang attribute
You would usually not want to change the value of the lang attribute, since WordPress will use the correct one, based on the language setting of your website. But there are cases in which you might want to change it.
Multilingual websites
WordPress can only have one frontend language, unless you install a multilingual plugin. I use MultilingualPress, which is based on multisite. In a WordPress multisite, you can set the language per sub-site. This will automatically use the correct lang attribute on each site.
If your website does not use a multilingual plugin, but you have one page in a different language, you could overwrite the lang attribute with some code.
Loading of external code
Another use-case is when you use plugins that rely on the lang attribute to load some external data. I came across a cookie banner plugin this week, which loads the text for the banner from an external resource. It uses the exact value of the lang attribute, but it expects a value like en, so only the two-letter language code. WordPress, however, uses a value like en-US, which does not work for this cookie banner. So we need to strip the second part of the value.
CSS using the attribute
A good example of a use-case in CSS is the quotes property. Different languages use different quotation marks. When you want to use the proper quotation marks in a <q> HTML tag, you usually don’t have to do anything: the default value is quotes: auto, so the browser picks them for you. But if you want to overwrite this, you could do the following:
q {
    quotes: "«" "»" "‹" "›";
}
This would always use the quotation marks that are used in French and some other languages, even if your lang attribute is set to en.
Some CSS libraries use the lang attribute to change styles, but they might be doing it like this:
[lang="en"] q {
    /* Some styles */
}
This would not work if the lang attribute has the value en-US, because this attribute selector only matches the exact value en. The CSS :lang() pseudo-class would work here:
:lang(en) {
    /* Some styles */
}
If you use en here, it also matches en-US, en-GB, etc. But if you use en-US, it does not match a plain en.
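To make that matching rule concrete, here is a small standalone sketch in plain PHP. It is only a simplified model of the prefix matching described above, not the full algorithm browsers implement, and the function name lang_matches is made up for illustration.

```php
<?php
// Hypothetical helper: a simplified model of how :lang(range) compares
// its argument with the language of an element. Real browsers apply
// more elaborate rules; this only covers the cases discussed in the text.
function lang_matches( string $range, string $lang ): bool {
    $range = strtolower( $range );
    $lang  = strtolower( $lang );

    // Match the exact value, or the value followed by "-" and more subtags.
    return $lang === $range || strpos( $lang, $range . '-' ) === 0;
}

var_dump( lang_matches( 'en', 'en-US' ) );    // bool(true)
var_dump( lang_matches( 'en', 'en-GB' ) );    // bool(true)
var_dump( lang_matches( 'en-US', 'en' ) );    // bool(false)

// The attribute selector [lang="en"] behaves like a plain string comparison:
var_dump( 'en' === 'en-US' );                 // bool(false)
```

The check for the trailing hyphen is what keeps en from matching an unrelated value that merely starts with the same letters.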
As the CSS from such a framework might be static, overwriting it might be too complicated, so you might instead want to change the value of the global lang attribute on the <html> tag.
How to change the value?
Let’s say we want to change the value to a different, static value for a specific page. You could do something like this:
function my_static_lang_attribute( $output ) {
    $object = get_queried_object();
    // Only posts and pages have a post_name; terms, authors and archives do not.
    if ( $object instanceof WP_Post && str_contains( $object->post_name, 'english' ) ) {
        return 'lang="en-US"';
    }
    return $output;
}
add_filter( 'language_attributes', 'my_static_lang_attribute' );
This would overwrite the lang attribute of any page/post with “english” in its slug (the last part of the permalink) to lang="en-US" for the <html> tag.
As you can see from the function, the filter does not only return the value, but also the attribute name. If you look at the full code of the get_language_attributes() function, you can see that it may return other attributes, like dir, as well:
function get_language_attributes( $doctype = 'html' ) {
    $attributes = array();
    if ( function_exists( 'is_rtl' ) && is_rtl() ) {
        $attributes[] = 'dir="rtl"';
    }
    $lang = get_bloginfo( 'language' );
    if ( $lang ) {
        if ( 'text/html' === get_option( 'html_type' ) || 'html' === $doctype ) {
            $attributes[] = 'lang="' . esc_attr( $lang ) . '"';
        }
        if ( 'text/html' !== get_option( 'html_type' ) || 'xhtml' === $doctype ) {
            $attributes[] = 'xml:lang="' . esc_attr( $lang ) . '"';
        }
    }
    $output = implode( ' ', $attributes );
    /**
     * Filters the language attributes for display in the 'html' tag.
     *
     * @since 2.5.0
     * @since 4.3.0 Added the `$doctype` parameter.
     *
     * @param string $output A space-separated list of language attributes.
     * @param string $doctype The type of HTML document (xhtml|html).
     */
    return apply_filters( 'language_attributes', $output, $doctype );
}
Plugins could also hook into this filter, so overwriting the $output with something static might not work as expected: it can discard what they added, or they can change it again. Unfortunately, there is no filter to change only the $lang value, and hooking into get_bloginfo() to overwrite the language might break other places where that value is used. If you want to strip the second part of the value, you could use a regular expression like this:
function my_dynamic_lang_attribute( $output ) {
    // [^"]* (not +) so a value without a second part, like lang="ja", stays intact.
    return preg_replace( '/lang="([a-zA-Z]+)[^"]*"/', 'lang="$1"', $output );
}
add_filter( 'language_attributes', 'my_dynamic_lang_attribute' );
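One detail worth checking with a pattern like this is a value that has no second part at all, since some locales produce a plain two-letter value. The following standalone sketch (plain PHP, no WordPress needed; the function name strip_region_subtag is made up for illustration) allows zero or more characters after the first part and prints the result for a few sample strings.

```php
<?php
// Hypothetical helper, not part of WordPress. It keeps the leading
// letters of each lang value and drops everything up to the closing quote.
function strip_region_subtag( string $output ): string {
    // "[^"]*" also matches an empty rest, so lang="ja" is left unchanged.
    return preg_replace( '/lang="([a-zA-Z]+)[^"]*"/', 'lang="$1"', $output );
}

echo strip_region_subtag( 'lang="en-US"' ), PHP_EOL;           // lang="en"
echo strip_region_subtag( 'dir="rtl" lang="ar-EG"' ), PHP_EOL; // dir="rtl" lang="ar"
echo strip_region_subtag( 'lang="ja"' ), PHP_EOL;              // lang="ja"
```

If the pattern required at least one character after the first part, the regular expression engine would backtrack on lang="ja" and return lang="j", which is not a valid language code. Note that the pattern also shortens an xml:lang attribute, because it contains the same lang="..." text.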
If you need something even more complex, it’s probably best to just overwrite the whole function.
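Short of that, a middle ground between a static return value and stripping part of the value is to swap only the lang value and leave the rest of the string alone. This is a minimal sketch in plain PHP so it can be tried without WordPress. The function name replace_lang_value is hypothetical, and inside a real filter callback you would probably use esc_attr() instead of htmlspecialchars().

```php
<?php
// Hypothetical helper, not part of WordPress. It replaces the value of
// every lang="..." occurrence (including xml:lang) and keeps all other
// attributes, such as dir, exactly as they were.
function replace_lang_value( string $output, string $new_lang ): string {
    // A callback avoids having to escape "$" or "\" in a replacement string.
    return preg_replace_callback(
        '/lang="[^"]*"/',
        function ( array $matches ) use ( $new_lang ) {
            return 'lang="' . htmlspecialchars( $new_lang, ENT_QUOTES ) . '"';
        },
        $output
    );
}

echo replace_lang_value( 'dir="rtl" lang="ar"', 'he-IL' ), PHP_EOL; // dir="rtl" lang="he-IL"
echo replace_lang_value( 'lang="de-DE"', 'en-US' ), PHP_EOL;        // lang="en-US"
```

Because dir is left untouched, this only fits cases where the new language has the same text direction as the site language.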
Conclusion
The lang attribute is a very important attribute that every website should always set. But the value might not always be what you need it to be. In those cases, you have a filter you can use to overwrite its value, but always make sure not to return something invalid.