Experimenting with HTML5 image markup and shortcodes in WordPress

Recently I have been experimenting with the markup and styling of images in WordPress. Unbeknownst to me, the core WordPress developers have also been working on this. Version 3.9 has just hit the airwaves and one of its leading features (for developers, anyway) is the introduction of semantic HTML5 markup for caption and gallery shortcodes. This formalizes some of the hacks and snippets WordPress developers have been using for some time now.

I took things one step further by developing a general “image” shortcode, initially to apply the same markup to images with or without captions. Later I began to mess around with dynamic image and attachment link generation, something that isn’t really feasible when images are hard-coded into post contents.

Let’s start with how captions were handled prior to WordPress 3.9, when the output looked something like this (adapted from Joost Kiens):

<div id="attachment_100" class="wp-caption alignnone" style="width: 150px">
    <img src="http://example.com/wp-content/uploads/2014/05/example.jpg" alt="A cool image" width="150" height="150" class="size-thumbnail wp-image-100">
    <p class="wp-caption-text">A caption describing the image above.</p>
</div>

With the introduction and popularization of the HTML5 figure and figcaption elements it made sense to wrap the image and caption like so:

<figure id="attachment_100" class="wp-caption alignnone" style="width: 150px">
    <img src="http://example.com/wp-content/uploads/2014/05/example.jpg" alt="A cool image" width="150" height="150" class="size-thumbnail wp-image-100">
    <figcaption class="wp-caption-text">A caption describing the image above.</figcaption>
</figure>

This is more or less what you get in WordPress 3.9+ when you add this to your theme’s functions.php:

add_theme_support( 'html5', array( 'gallery', 'caption' ) );
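Theme support is usually declared on the after_setup_theme hook rather than at the top level of functions.php. A minimal sketch, assuming a hypothetical function name (example_theme_setup); the function_exists guard is only there so the snippet does nothing when loaded outside WordPress:

```php
<?php
// Hypothetical setup function; the name is an assumption for illustration
function example_theme_setup() {
  // Ask WordPress 3.9+ for figure/figcaption markup in captions and galleries
  add_theme_support( 'html5', array( 'gallery', 'caption' ) );
}

// Guarded so the file is harmless when loaded outside WordPress
if ( function_exists( 'add_action' ) ) {
  add_action( 'after_setup_theme', 'example_theme_setup' );
}
```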

Not bad, but I think we can do better.

For starters, in-line styling has to go. Leave that to the theme. Next, we might want to get rid of explicit height and width attributes altogether. While we’re at it, why not add structured data and accessibility attributes?

Of course, any changes made to captions won’t have any impact on other images. As such, I feel like there is a case to be made for a generalized image shortcode. Allow me to explain.

Once we start changing markup for captioned images we are leaving images without captions in the dust. This means that we will have two kinds of images in our posts: captioned images wrapped in a figure element with all the extra HTML5 goodness we care to introduce… and a bunch of naked img elements without any additional markup. This is less than ideal, particularly if we as theme developers wish to apply the same styling to all our images.

But wait, there’s more.

Say, for instance, that we wish to display different-sized images in different contexts. (I don’t mean responsive images; that’s another topic entirely.) To give an example, my blog is designed to display a sidebar except when viewing full-width content, mainly big, beautiful images. With vanilla WordPress I would have to insert actual HTML into my posts referencing the largest image size I want to display. But on my main page, where I display a sidebar, these large images will be downsized by the CSS declaration max-width: 100%. This works, but the browser still downloads the full-size file, which consumes far more bandwidth than necessary.

A solution to both these issues is to use a shortcode to dynamically generate image markup. This way we can use whatever custom markup we want!

A few advantages to using an image shortcode:

  • We gain total control of how images are displayed in post contents.
  • When drafting posts it will be much easier to make small changes (e.g. to alignment, sizing, and captions).
  • We will never need to alter post contents should we wish to change how images are displayed.

And a few disadvantages to using an image shortcode:

  • Additional server-side processing is required to display posts. Caching will help reduce the impact, of course. You do cache, right?
  • Plugin lock-in (since this kind of code should never be added to a theme). Once you start using any new shortcode you are obligated to maintain the code that handles it forever—and this is no exception.
  • Making this play nice with the visual editor will take some additional effort beyond what I am investing here.

Now, this is not something most WordPress users are going to want to mess around with. If you don’t know your way around the codebase I’d recommend sticking with vanilla WordPress and doing things the WordPress way. You’ll save yourself a lot of headaches. If, however, you’d like to take control of single images in WordPress, read on.

To explore how an image shortcode might be implemented I am going to walk you through some example code from Ubik Imagery. I am still working on the code and would strongly recommend browsing the source on GitHub if you are serious about implementing this on your own blog. I don’t have any plans to spin this code into a publicly available plugin so you’re also welcome to do so if you wish.

To begin with we need to be able to generate the actual shortcode when inserting images in the post editor. This is very straightforward:

// Generate an image shortcode when inserting images into a post
function ubik_image_send_to_editor( $html, $id, $caption = '', $title = '', $align = '', $url = '', $size = 'medium', $alt = '' ) {

  // Start with an empty attribute string so later concatenation never hits an undefined variable
  $content = '';

  if ( !empty( $id ) )
    $content .= ' id="' . esc_attr( $id ) . '"';

  if ( !empty( $align ) && $align !== 'none' )
    $content .= ' align="' . esc_attr( $align ) . '"';

  // Allows for dynamic attachment URL generation
  if ( !empty( $url ) ) {
    if ( strpos( $url, '?attachment_id=' ) === false ) {
      $content .= ' url="' . esc_attr( $url ) . '"';
    } else {
      $content .= ' url="attachment"';
    }
  }

  if ( !empty( $size ) && $size !== 'medium' )
    $content .= ' size="' . esc_attr( $size ) . '"';

  // Alt attribute defaults to caption contents which may contain shortcodes and markup; process shortcodes and strip out any resulting markup
  $alt = esc_attr( strip_tags( do_shortcode( $alt ) ) );

  // Set the alt attribute if it isn't identical to the caption contents
  if ( !empty( $alt ) && $alt !== $caption )
    $content .= ' alt="' . esc_attr( $alt ) . '"';

  // Enclosing form when there is a caption, self-closing form otherwise
  if ( !empty( $caption ) ) {
    $content = '[image' . $content . ']' . $caption . '[/image]';
  } else {
    $content = '[image' . $content . '/]';
  }

  return $content;
}
add_filter( 'image_send_to_editor', 'ubik_image_send_to_editor', 10, 9 );

This code is optimized to produce image shortcodes without superfluous information. Most values are optional and include sensible defaults.

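The image shortcode optionally includes a URL and a caption, among other things. This means you will never need to use the WordPress core caption shortcode for images ever again! To accommodate this functionality the shortcode works in two forms: self-closing when there is no caption, and enclosing (with the caption between the opening and closing tags) when there is one.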

Additionally, attachment URLs are dynamically generated when post contents are displayed, eliminating the need to correct links at any point in the future. Why would you ever need to correct an attachment link? Usually, if you attach an image to another post, the link will change—but not in your post contents, where it is hard-coded. The same is true when you change the slug of a post.

Of course, if you explicitly set some other URL you are on your own!

Here’s an example of the image shortcode in its most basic form:

[image id="1981"/]

The actual output when inserting an image in the editor is a bit longer:

[image id="1981" size="large" alt="Safety first"/]

The alt attribute is generated from information available on the media editor panel: alt text, caption, or title, whichever is available. Since I sometimes like to use shortcodes in captions I have implemented some code to process them and strip out any resulting markup, as seen in the function above.

Now, for a more elaborate example of image shortcode usage, have a look at this:

[image id="1981" align="right" url="attachment" size="large"]An unusually grungy example of the characters seen on construction barriers in Japan.[/image]
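When WordPress parses a shortcode like this one, the handler receives the attributes as an array of strings and the enclosed caption as a second argument. The sketch below is a plain-PHP stand-in for the merge that shortcode_atts() performs against a set of defaults; example_merge_atts() is a hypothetical helper written for illustration, not a WordPress function:

```php
<?php
// Plain-PHP stand-in for the defaults merge done by shortcode_atts();
// example_merge_atts() is a hypothetical helper, not part of WordPress
function example_merge_atts( array $defaults, array $atts ) {
  $out = array();
  foreach ( $defaults as $name => $default ) {
    // Keep only known attributes; fall back to the default when one is missing
    $out[ $name ] = array_key_exists( $name, $atts ) ? $atts[ $name ] : $default;
  }
  return $out;
}

// Attributes and enclosed content as the handler would receive them for the example above
$atts = array( 'id' => '1981', 'align' => 'right', 'url' => 'attachment', 'size' => 'large' );
$caption = 'An unusually grungy example of the characters seen on construction barriers in Japan.';

$merged = example_merge_atts( array(
  'id'    => '',
  'title' => '',
  'align' => 'none',
  'url'   => '',
  'size'  => 'medium',
  'alt'   => ''
), $atts );
// $merged holds all six keys; 'title' and 'alt' fall back to empty strings
```

Anything not listed in the defaults is silently dropped, which is why a typo in an attribute name has no visible effect.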

This is pretty much everything we need to work with. Now, here is the code that registers the shortcode:

// Create a really simple image shortcode based on HTML5 image markup requirements
function ubik_image_shortcode( $atts, $caption = '' ) {
  extract( shortcode_atts( array(
    'id'            => '',
    'title'         => '',
    'align'         => 'none',
    'url'           => '',
    'size'          => 'medium',
    'alt'           => ''
  ), $atts ) );

  return apply_filters( 'ubik_image_shortcode', ubik_image_markup( $html = '', $id, $caption, $title, $align, $url, $size, $alt ) );
}
add_shortcode( 'image', 'ubik_image_shortcode' );

As you can see this is just a wrapper for another function, the one that actually generates the markup. This way other image-displaying functions in WordPress can hook into a single function. Consider image format posts with featured images, attachment templates, or existing caption shortcodes, for example. We can beautify the markup of all these images by calling one master function.

Of course, the moment we start playing around with code that rightfully belongs in a theme we introduce a new layer of complexity: fallback code in case the plugin housing our image shortcode is deactivated. This is a bit out of scope for this article but if you’re interested in how I have gone about this you are welcome to browse the source for Pendrell, the theme I use on this blog. The file you are looking for is presently located at pendrell/lib/media.php, but be warned: this theme is under active development and things may change or break.
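To give a rough idea of what such a fallback could look like, here is a minimal sketch. It is not Pendrell's code: every function name starting with example_ is an assumption made for illustration, and the markup is deliberately bare. The theme only registers its own handler if nothing else has claimed the shortcode:

```php
<?php
// Hypothetical theme-side fallback; these names are assumptions, not Pendrell's code

// Pure helper: wrap an img element and an optional caption in bare-bones markup
function example_fallback_figure( $img_html, $caption = '' ) {
  $content = '<figure class="wp-caption">' . $img_html;
  if ( $caption !== '' )
    $content .= '<figcaption class="wp-caption-text">' . $caption . '</figcaption>';
  return $content . '</figure>';
}

// Shortcode handler used only when no plugin has registered [image]
function example_image_shortcode_fallback( $atts, $caption = '' ) {
  $atts = shortcode_atts( array( 'id' => '', 'size' => 'medium' ), $atts );
  if ( empty( $atts['id'] ) )
    return '';
  $img = wp_get_attachment_image( (int) $atts['id'], $atts['size'] );
  return example_fallback_figure( $img, trim( (string) $caption ) );
}

// Register on init, after plugins have had their chance to claim the shortcode
if ( function_exists( 'add_action' ) ) {
  add_action( 'init', function () {
    if ( !shortcode_exists( 'image' ) )
      add_shortcode( 'image', 'example_image_shortcode_fallback' );
  }, 20 );
}
```

With something like this in place, posts containing image shortcodes still render an image and caption when the plugin is switched off, just without the structured data and ARIA extras.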

Moving right along, existing caption shortcodes are handled with this function (again, adapted from Joost Kiens):

// WordPress core caption shortcode wrapper
function ubik_media_caption_shortcode( $val, $attr, $html = '' ) {
  extract( shortcode_atts( array(
    'id'      => '',
    'align'   => 'none',
    'width'   => '',
    'caption' => '',
    'class'   => ''
  ), $attr ) );

  // Default back to WordPress core if we aren't provided with an ID, a caption, or if no img element is present
  // (returning an empty string from this filter tells core to generate its usual caption markup)
  if ( empty( $id ) || empty( $caption ) || strpos( $html, '<img' ) === false )
    return '';

  // Pass whatever we have to the general image markup generator
  return ubik_image_markup( $html, $id, $caption, $title = '', $align, $url = '', $size = '', $alt = '' );
}
add_filter( 'img_caption_shortcode', 'ubik_media_caption_shortcode', 10, 3 );

The HTML contents of the caption shortcode sent to ubik_image_markup are passed through more or less unaltered. This allows for backwards compatibility with posts that already contain caption shortcodes. (Update: as Justin Tadlock noted in the comments below, the caption shortcode can handle much more than just images. The code above has been updated to check for the presence of an img element.)
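The guard at the top of that wrapper can be exercised on its own. In this sketch example_should_handle_caption() is a hypothetical helper that mirrors the same three checks; the sample IDs and file names are made up:

```php
<?php
// Hypothetical helper mirroring the guard in the caption wrapper above
function example_should_handle_caption( $id, $caption, $html ) {
  // Only take over when there is an ID, a caption, and an actual img element
  return !empty( $id ) && !empty( $caption ) && strpos( $html, '<img' ) !== false;
}

// A captioned image: handed to the custom markup generator
$image = example_should_handle_caption( 'attachment_100', 'A caption.', '<img src="example.jpg" alt="">' );

// A captioned video: left to WordPress core
$video = example_should_handle_caption( 'attachment_101', 'A caption.', '<video src="example.mp4"></video>' );
```

Whenever the helper would return false, the real filter returns an empty string and WordPress falls back to its own caption output.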

Now to cut to the heart of the matter. Here is the master function used to generate HTML5 image markup:

// Generalized image markup generator; used by captioned images and image shortcodes; alternate markup presented on feeds is intended to validate
// Note: the $title variable is not used at all; it's WordPress legacy code; images don't need titles, just alt attributes
function ubik_image_markup( $html = '', $id, $caption, $title = '', $align = 'none', $url = '', $size = 'medium', $alt = '', $rel = '' ) {

  // If the $html variable is empty let's generate our own markup from scratch
  if ( empty( $html ) ) {

    // No fancy business in the feed
    if ( is_feed() ) {

      // The get_image_tag function requires a simple alignment e.g. "none", "left", etc.
      $align = str_replace( 'align', '', $align );

      // Default img element generator from WordPress core
      $html = get_image_tag( $id, $alt, $title, $align, $size );

    } else {

      // Dynamic image size hook; see Pendrell for an example of usage
      // Use case: you have full-width content on a blog with a sidebar but you don't want to waste bandwidth by loading those images in feeds or in the regular flow of posts
      // Just filter this and return 'medium' when $size === 'large'
      $size = apply_filters( 'ubik_image_markup_size', $size );

      // Custom replacement for get_image_tag(); roll your own instead of using $html = get_image_tag( $id, $alt, $title, $align, $size );
      list( $src, $width, $height, $is_intermediate ) = image_downsize( $id, $size );

      // If the image isn't resized then it is obviously the original; set $size to 'full' unless $width matches medium or large
      if ( $is_intermediate === false ) {

        // Test to see whether the presumably "full" sized image matches medium or large for consistent styling
        // Options come back from the database as strings, so cast to int before the strict comparison with $width
        $medium = (int) get_option( 'medium_size_w' );
        $large = (int) get_option( 'large_size_w' );

        if ( $width === $medium ) {
          $size = 'medium';
        } elseif ( $width === $large ) {
          $size = 'large';
        } else {
          $size = 'full';
        }
      }

      // With all the pieces in place let's generate the img element
      $html = '<img itemprop="contentUrl" src="' . esc_attr( $src ) . '" ' . image_hwstring( $width, $height ) . 'class="wp-image-' . esc_attr( $id ) . ' size-' . esc_attr( $size ) . '" alt="' . esc_attr( $alt ) . '" />';
    }

    // Generate a link wrapper from the $url variable; optionally generates URL and rel attribute for images explicitly identified as attachments
    if ( !empty( $url ) ) {
      if ( $url === 'attachment' ) {
        $url = get_attachment_link( $id );
        $rel = ' rel="attachment wp-att-' . esc_attr( $id ) . '"';
      }
      // Now wrap everything in a link
      $html = '<a href="' . esc_attr( $url ) . '"' . $rel . '>' . $html . '</a>';
    }

  // If the $html variable has been passed (e.g. from caption shortcode, post thumbnail functions, or legacy code); we don't do much here
  } else {
    // Add itemprop="contentUrl" to image element when $html variable is passed to this function; ugly hack but it works
    if ( !is_feed() )
      $html = str_replace( '<img', '<img itemprop="contentUrl"', $html );
  }

  // Sanitize $id, not that this should really be a problem
  $id = esc_attr( $id );

  // Initialize ARIA attributes
  $aria = '';

  // Caption processing
  if ( !empty( $caption ) ) {
    // Strip tags from captions but preserve some text formatting elements; this is mainly used to get rid of stray paragraph and break tags
    $caption = strip_tags( $caption, '<a><abbr><acronym><b><bdi><bdo><cite><code><del><em><i><ins><mark><q><rp><rt><ruby><s><small><strong><sub><sup><time><u>' );

    // Replace excess white space and line breaks with a single space to neaten things up
    $caption = trim( str_replace( array( "\r\n", "\r", "\n" ), ' ', $caption ) );

    // Do shortcodes and texturize (since shortcode contents aren't texturized by default)
    $caption = wptexturize( do_shortcode( $caption ) );

    // If the caption isn't empty generate ARIA attributes for the figure element
    if ( !is_feed() )
      $aria = 'aria-describedby="figcaption-' . $id . '" ';
  }

  // Prefix $align with "align"; saves us the trouble of writing it out all the time
  if ( $align === 'none' || $align === 'left' || $align === 'right' || $align === 'center' )
    $align = 'align' . $align;

  // There's a chance $size will have been wiped clean by the `ubik_image_markup_size` filter
  if ( !empty( $size ) )
    $size = ' size-' . esc_attr( $size );

  // Return stripped down markup for feeds
  if ( is_feed() ) {
    $content = $html;
    if ( !empty( $caption ) )
      $content .= '<br/><small>' . $caption . '</small>';

  // Generate image wrapper markup used everywhere else
  } else {
    $content = '<figure id="attachment-' . $id . '" ' . $aria . 'class="wp-caption wp-caption-' . $id . ' ' . esc_attr( $align ) . $size . '" itemscope itemtype="http://schema.org/ImageObject">' . $html;
    if ( !empty( $caption ) )
      $content .= '<figcaption id="figcaption-' . $id . '" class="wp-caption-text" itemprop="caption">' . $caption . '</figcaption>';
    $content .= '</figure>' . "\n";
  }

  return $content;
}

This is a big chunk of code. I have been liberal in commenting the code—hopefully it isn’t too hard to follow. A high-level overview of what’s going on here:

  • Check whether or not the $html variable is already filled (e.g. when this function is called from the caption shortcode, the image post format template in a theme, etc.). This allows for backwards compatibility with existing caption shortcodes in post contents.
  • Generate the img element and optionally wrap it in a link if the $html variable is empty. This proceeds through several steps:
      • Hook the $size variable so that other functions can alter the output. This allows for context-dependent image sizing. So, for instance, my theme checks whether the current view is full-width or not and will adjust the size accordingly.
      • Generate classes to apply to the image based on its size. There is some code here that checks the width of an unresized image and attempts to match it against the site’s medium and large width settings.
      • Optionally generate a link to wrap the image with. If the $url variable equals “attachment” a dynamic link will be generated with the rel attribute set to WordPress defaults.
  • Captions are sanitized and neatened. I like being able to use basic formatting and shortcodes in my captions. This section is also easily removed if it is of no use to you.
  • Whatever HTML was generated or passed is then wrapped in a figure element spiced up with structured data and ARIA attributes. The caption, if one exists, is wrapped in a figcaption element.
  • The markup generated for feeds is very basic to ensure that the feed will still validate.
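The size hook described in the overview above can be used from a theme along these lines. This is only a sketch of the idea, not the code Pendrell uses: the example_ function names are hypothetical, and is_singular() merely stands in for whatever full-width check a theme actually has:

```php
<?php
// Hypothetical decision helper: step 'large' down to 'medium' unless the view is full-width
function example_contextual_size( $size, $is_full_width ) {
  if ( $size === 'large' && !$is_full_width )
    return 'medium';
  return $size;
}

// Filter callback; is_singular() stands in for a theme's own full-width check (an assumption)
function example_filter_image_size( $size ) {
  return example_contextual_size( $size, is_singular() );
}

// Guarded so the file is harmless when loaded outside WordPress
if ( function_exists( 'add_filter' ) ) {
  add_filter( 'ubik_image_markup_size', 'example_filter_image_size' );
}
```

Keeping the decision in its own function makes it easy to check without loading WordPress: only requests for 'large' in a narrow view are changed, and every other size passes through untouched.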

Here’s an example of the markup generated by this function (with spaces and tabs added for readability):

<figure id="attachment-2044" aria-describedby="figcaption-2044" class="wp-caption wp-caption-2044 alignnone size-large" itemscope itemtype="http://schema.org/ImageObject">
  <img itemprop="contentUrl" width="960" height="640" src="http://synapticism.com/x/taiwan-taipei-wanhua-motorbike-tunnel-960x640.jpg" class="attachment-large wp-post-image" alt="From the underbelly of the future city we dreamt into being."/>
  <figcaption id="figcaption-2044" class="wp-caption-text" itemprop="caption">From the underbelly of the future city we dreamt into being.</figcaption>
</figure>

The corresponding shortcode:

[image id="2044" size="large"]From the underbelly of the future city we dreamt into being.[/image]

This is still a work in progress so my approach may change… but for now I am very happy with how this works. The code may be a bit jumbled but the actual user experience is exactly what I was looking for. I upload images, insert them into the editor, and can easily experiment with different layouts or change captions as needed.

Hopefully this experiment will be of use to someone out there!

3 responses

  1. I just wanted to note that the “caption” shortcode is not just for images. You can put anything in it (video, audio, etc.), so you’ll want to make sure that when overwriting it, non-images still work.

  2. I know this post is almost 2 years old, but thanks so much for detailing all of this. I’ve continued to be frustrated with WP’s lack of support for responsive markup and/or developers’ constant tendency to include extraneous markup in the HTML that is output to the page. Even the current (as of 2016) support for responsive HTML image markup is implemented in a way that leaves a lot to be desired, and so I appreciate being able to use your post as a starting point for customizing my theme.