I wanted to restore "structured data" to this site using some of the basic schema.org types that Google Search supports in its search results. The process was fairly straightforward...
- Add structured-data Block within baseof.html for shared template definitions.
- Add structured-data.html Partial for generation of default types.
  - Use dict and jsonify for safely encoding JSON data.
  - Add Global Properties based on Hugo's common Page metadata.
  - Add Detailed Properties based on site- and type-specific metadata.
- Update Content Templates to include the partial on relevant pages.
- Custom Pages with unique structured data requirements.
- Validate Structured Data to bulk test the generated site before publishing.
Continue reading for more commentary, review the original commit, or view the latest code.
Add structured-data Block
I chose JSON-LD as the easiest method to generate and embed data. Google Search requires this data to be embedded within the head or body tag, so I created a new block within the layouts/baseof.html template.
<html {{/* ... */}}>
<head>
{{/* ... */}}
{{ block "structured-data" . }}{{ end }}
</head>
{{/* ... */}}
Add structured-data.html Partial
Most pages can be generated using the same generic template, so I decided to use a partial named structured-data.html.
Use dict and jsonify
Rather than use string interpolation, which can lead to JSON encoding issues, I used Hugo's dict function to build in-memory data structures. Within structured-data.html, I initialized a $sd variable with basic properties...
{{- $sd := dict
"@context" "https://schema.org"
"url" .Permalink
"author" ( dict
"@type" "Person"
"name" "Danny Berger"
"url" .Site.BaseURL
)
"copyrightHolder" ( dict
"@type" "Person"
"name" "Danny Berger"
"url" .Site.BaseURL
)
"copyrightYear" ( time.Format "2006" ( or .PublishDate .Lastmod ) )
-}}
Then, at the end of the file, after any modifications to $sd, I used the jsonify function to encode it as JSON (and the safeJS function to keep the output from being escaped a second time).
<script type="application/ld+json">{{- $sd | jsonify | safeJS -}}</script>
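To see why interpolation is fragile, here is a minimal sketch of the same idea in Python rather than Hugo templates. The function names and the sample title are made up for illustration, and json.dumps only stands in for jsonify by analogy; it says nothing about how Hugo implements it.

```python
import json


def interpolated(title: str) -> str:
    # Naive string interpolation: the title is pasted in verbatim.
    return '{"@type": "WebPage", "name": "%s"}' % title


def encoded(title: str) -> str:
    # Build the data structure first, then let the encoder handle escaping.
    return json.dumps({"@type": "WebPage", "name": title})


# Hypothetical page title containing characters that need escaping in JSON.
title = 'Notes on "structured data"'

try:
    json.loads(interpolated(title))
    interpolation_ok = True
except json.JSONDecodeError:
    # The unescaped quotes end the string early and break the document.
    interpolation_ok = False

# The encoded variant survives a round trip unchanged.
round_tripped = json.loads(encoded(title))["name"]
```

The interpolated string fails to parse as soon as a title contains a double quote, while the encoded one round-trips unchanged. That is the reason for building a dictionary and serializing it once at the end.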
Add Global Properties
There are a few schema.org properties that can be applied to any type, so I add them with the following snippet. The 2006-01-02T15:04:05Z07:00 layout is Go's reference-time notation; it formats the dates in the ISO 8601 format recommended by schema.org.
{{- with .PublishDate -}}
{{- $sd = merge $sd ( dict "datePublished" ( time.Format "2006-01-02T15:04:05Z07:00" . ) ) -}}
{{- end -}}
{{- with .Lastmod -}}
{{- $sd = merge $sd ( dict "dateModified" ( time.Format "2006-01-02T15:04:05Z07:00" . ) ) -}}
{{- end -}}
{{- with .Params.description -}}
{{- $sd = merge $sd ( dict "description" . ) -}}
{{- end -}}
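For a feel of what that layout produces, here is a rough Python equivalent. The rfc3339 helper and the sample dates are hypothetical, and the sketch assumes timezone-aware datetimes. With the Z07:00 layout, Go prints Z for a zero UTC offset and a numeric offset otherwise, which the helper imitates.

```python
from datetime import datetime, timedelta, timezone


def rfc3339(dt: datetime) -> str:
    # Assumes dt is timezone-aware; isoformat() then ends with an offset
    # such as +00:00 or -06:00.
    text = dt.isoformat(timespec="seconds")
    # Mirror the Z07:00 layout: a zero offset is written as "Z".
    return text[:-6] + "Z" if text.endswith("+00:00") else text


# Hypothetical publish dates, one in UTC and one six hours behind UTC.
utc_example = rfc3339(datetime(2020, 4, 11, 9, 30, 0, tzinfo=timezone.utc))
offset_example = rfc3339(
    datetime(2020, 4, 11, 9, 30, 0, tzinfo=timezone(timedelta(hours=-6)))
)
```

Both forms are valid ISO 8601 timestamps, so either is acceptable as a datePublished or dateModified value.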
For hierarchical, section-based content, I use the following snippet to include the isPartOf property.
{{- if and ( ne . .CurrentSection ) .CurrentSection .CurrentSection.RelPermalink -}}
{{- $sd = merge $sd ( dict "isPartOf" .CurrentSection.Permalink ) -}}
{{- else if and .Parent.RelPermalink ( ne .Parent.RelPermalink "/" ) ( ne .Parent.RelPermalink .RelPermalink ) -}}
{{- $sd = merge $sd ( dict "isPartOf" .Parent.Permalink ) -}}
{{- end -}}
I also have some site-specific, global parameters that I wanted to include. The following snippet generates the keywords property based on my custom tagging conventions.
{{- with .Params.nav.tag -}}
{{- $keywords := slice -}}
{{- range $tag, $_ := . -}}
{{- $keywords = $keywords | append (replace $tag "-" " ") -}}
{{- end -}}
{{- if gt (len $keywords) 0 -}}
{{- $sd = merge $sd ( dict "keywords" $keywords ) -}}
{{- end -}}
{{- end -}}
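The transformation itself is small enough to sketch in Python. This assumes the tag parameter is a map keyed by tag slug (which the two-variable range above suggests). The function name and sample tags are invented for illustration.

```python
def keywords_from_tags(tags: dict) -> list:
    # Go templates range over map keys in sorted order; sorted() mimics that.
    # Each slug becomes a human-readable keyword by swapping hyphens for spaces.
    return [slug.replace("-", " ") for slug in sorted(tags)]


def with_keywords(sd: dict, tags: dict) -> dict:
    keywords = keywords_from_tags(tags)
    # Only add the property when there is something to say, so pages without
    # tags do not end up with an empty "keywords" list.
    return {**sd, "keywords": keywords} if keywords else dict(sd)


# Hypothetical tag map and base structured data.
tagged = with_keywords({"@type": "WebPage"}, {"static-site": True, "hugo": True})
untagged = with_keywords({"@type": "WebPage"}, {})
```

Skipping the property when the list is empty keeps the output free of placeholder values, which is the same reason the template wraps the merge in a length check.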
The following snippet uses a nested template to generate structured data for my "place" taxonomies. I happened to use a partial, though it could probably be a type-specific template, too. I used the unmarshal function to convert the rendered template string back into a dictionary that can be added to $sd.
{{- with .Params.nav.place -}}
{{- $contentLocation := slice -}}
{{- range $place, $_ := . -}}
{{- $contentLocation = $contentLocation | append ( partial "nav-place/structured-data-ref.txt" ( $.GetPage ( printf "/nav-place/%s" $place ) ) | unmarshal ) -}}
{{- end -}}
{{- if gt (len $contentLocation) 0 -}}
{{- $sd = merge $sd ( dict "contentLocation" $contentLocation ) -}}
{{- end -}}
{{- end -}}
Add Detailed Properties
Next, I added more conditional logic based on the page type. I use the post type as a generic blog post, so I include the BlogPosting type and wordCount property.
{{- if eq .Type "post" -}}
{{- $sd = merge $sd ( dict
"@type" "BlogPosting"
"headline" .Title
"wordCount" .WordCount
) -}}
I use a media content type for my images, so I include some logic for that. This implementation is very specific to my site's metadata conventions. The full source has some good examples of building nested objects such as wrapping the ImageObject type as the mainEntity, PropertyValue types for the exifData property, and several other image-related properties.
{{- else if eq .Type "media" -}}
{{- $mainEntity := dict "@type" "ImageObject" -}}
{{- with .Params.mediaType -}}
{{- with .captureTime.time -}}
{{- $mainEntity = merge $mainEntity ( dict "dateCreated" . ) -}}
{{- end -}}
{{- with .exifProfile -}}
{{- $exifData := slice -}}
{{- with .apertureValue.number -}}
{{- $exifData = $exifData | append ( dict
"@type" "PropertyValue"
"name" "fNumber"
"value" .
) -}}
{{- end -}}
{{/* ... */}}
{{- if gt (len $exifData) 0 -}}
{{- $mainEntity = merge $mainEntity ( dict "exifData" $exifData ) -}}
{{- end -}}
{{- end -}}
{{ end }}
{{- with .width -}}
{{- $mainEntity = merge $mainEntity ( dict "width" . ) -}}
{{- end -}}
{{/* ... */}}
{{- $sd = merge $sd ( dict
"@type" "WebPage"
"name" .Title
"mainEntity" $mainEntity
) -}}
Finally, if no other type matches, I make sure it falls back to a WebPage type. Google doesn't specifically use this type for their search enhancements, but it helps complete a valid, typed entity.
{{- else -}}
{{- $sd = merge $sd ( dict
"@type" "WebPage"
"name" .Title
) -}}
{{- end -}}
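Stripped of template syntax, the branching above boils down to "pick type-specific properties, then merge them over the shared ones". A minimal Python sketch of that shape follows, with a hypothetical function name and sample values. It covers only the post and fallback branches, and it relies on the rightmost map winning on key conflicts, as it does with Hugo's merge.

```python
def typed_properties(page_type: str, title: str, word_count: int = 0) -> dict:
    # Type-specific properties; anything unrecognized falls back to WebPage.
    if page_type == "post":
        return {"@type": "BlogPosting", "headline": title, "wordCount": word_count}
    return {"@type": "WebPage", "name": title}


# Hypothetical shared properties, standing in for the initial $sd dictionary.
base = {"@context": "https://schema.org", "url": "https://example.com/entries/hello/"}

# Later keys override earlier ones, so the typed properties are layered on top.
post_sd = {**base, **typed_properties("post", "Hello", 120)}
other_sd = {**base, **typed_properties("gallery", "Photos")}
```

Because the fallback branch always supplies an @type, every page ends up as a typed entity even when no specific branch matches.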
Update Content Templates
With the _partials/structured-data.html file available, I can now include it in my existing templates, namely page.html and section.html.
{{ define "structured-data" }}
{{ partial "structured-data" . }}
{{ end }}
Custom Pages
The generic partial is great for most pages, but some pages need one-off structured data. For example, in home.html I wanted to use the ProfilePage type with a Person entity to describe some basics about myself.
{{ define "structured-data" }}
<script type="application/ld+json">{{ ( dict
"@context" "https://schema.org"
"@type" "ProfilePage"
"url" .Permalink
"name" .Title
"mainEntity" ( dict
"@type" "Person"
"name" "Danny Berger"
"familyName" "Berger"
"givenName" "Danny"
"sameAs" ( slice
.Permalink
"https://www.linkedin.com/in/dpb587/"
"https://twitter.com/dpb587"
"https://github.com/dpb587"
"https://gitlab.com/dpb587"
)
"homeLocation" ( dict
"@type" "Place"
"address" ( dict
"@type" "PostalAddress"
"addressLocality" "Albuquerque"
"addressRegion" "NM"
"addressCountry" "US"
)
"name" "Albuquerque, New Mexico"
)
"image" ( absURL "/assets/images/dpb587-20221205b~256.jpg" )
) ) | jsonify | safeJS }}</script>
{{ end }}
Validate Structured Data
While prototyping, I validated my changes using Google's Rich Results Test. It only supports testing live web pages or manually pasted code, so I copy-pasted a few pages manually. The only errors/warnings it found were related to using localhost URLs.
To bulk test all my new structured data, I tried to "dog food" the Structured Data API and hacked together a script that iterates over every generated file containing JSON-LD after hugo build...
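# dovalidate (defined elsewhere in the script) receives one file path; if it is
# a shell function, it must be exported with `export -f dovalidate` for bash -c.
grep -lsr '<script type="application/ld+json">' . \
| sort \
| xargs -P8 -I{} -- bash -c 'dovalidate "$1"' _ {}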
Each file is then sent to the endpoint, and any messages that come back are printed for manual review...
curl --fail -sS -o- \
'https://api.namedgraph.com/toolkit.v0/structuredData.process' \
--header "Authorization: Bearer ${NG_API_TOKEN}" \
--form sourceFile=@"${file}" \
--form experimental=web.googlesearch.validator=true \
| jq -r --arg file "${file}" '
"\($file)\n\([
.. | .messages? | select(.)[]
] | map("+++ \(.info.severity) [\(.info.title)] \(.info.message)")[]
)"
'
It found a typo that I was able to fix (Google silently autocorrects these), but otherwise no surprises.
./entries/minikube-and-bridged-networking-20200411/index.html
+++ ERROR [Unknown Property] The "datePUblished" term is not a known schema.org/Property.
...
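A typo like that can also be caught with a quick offline pre-check before calling any API. The following Python sketch is not the script described above. It pulls JSON-LD blocks out of an HTML string and flags top-level keys that are missing from a small allow-list. The allow-list and all names are hypothetical, and a real check would need the full schema.org vocabulary and would also walk nested objects.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._buffer = None  # None while outside a JSON-LD script element

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buffer is not None:
            self.blocks.append("".join(self._buffer))
            self._buffer = None


# Hypothetical allow-list; only a handful of schema.org property names.
KNOWN_PROPERTIES = {
    "url", "name", "headline", "description", "datePublished", "dateModified",
}


def unknown_properties(html: str) -> list:
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    unknown = []
    for block in parser.blocks:
        data = json.loads(block)  # raises ValueError on malformed JSON
        if isinstance(data, dict):  # only top-level object keys are checked
            unknown += [
                key for key in data
                if not key.startswith("@") and key not in KNOWN_PROPERTIES
            ]
    return unknown
```

Running unknown_properties over each generated file would surface misspelled property names early, leaving the API call to cover the checks that need real vocabulary knowledge.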
I committed the audit-structured-data.sh script so I can rerun it manually if I make further changes to the structured data templates.
Monitor Results
Once published, Googlebot and other crawlers will take some time to discover the new structured data. Search Console offers basic insights through its "Enhancements" and "Core Web Vitals" reports.
Of course, whether or not any structured data actually makes a difference to search results and rankings is ultimately up to the search engines.