The goal of this post is to equip you with the minimum viable knowledge required to do so. It won’t go into the nitty-gritty details, describe the history, or dwell on specifics. There are a lot of incredible write-ups that already do this, and I suggest giving them a read if you want to dive deeper (I’ll link out to my favorites at the bottom).
- If not, what is the ideal solution?
I’m going to be using Sitecore’s Symposium landing page through each of these talking points to illustrate how to answer the questions above.
We’ll cover the “how do I do this” aspect first, and at the end I’ll expand on a few core concepts and link to further resources.
How do we go about doing this?
Ask the client
Ask, and you shall receive! Seriously though, one of the quickest/easiest things you can do as a consultant is contact your POC (or developers on the account) and ask them. After all, these are the people who work on the website day-in and day-out!
As you make progress, jot down notes about content that isn’t being loaded in, is being loaded in wrong, or any internal linking that isn’t working properly.
In order to get a more accurate depiction of what Googlebot is seeing, we need to attempt to mimic how it crawls the page.
How do we do that?
Use Google’s new mobile-friendly testing tool
At the moment, the quickest and most accurate way to try and replicate what Googlebot is seeing on a site is by using Google’s new mobile friendliness tool. My colleague Dom recently wrote an in-depth post comparing Search Console Fetch and Render, Googlebot, and the mobile friendliness tool. He found that, most of the time, Googlebot and the mobile friendliness tool produced the same output.
In Google’s mobile friendliness tool, simply input your URL, hit “run test,” and then once the test is complete, click on “source code” on the right side of the window. You can take that code and search for any on-page content (title tags, canonicals, etc.) or links. If they appear here, Google is most likely seeing the content.
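If you’d rather not eyeball the source by hand, here’s a minimal sketch of that search in Python. It assumes you’ve pasted the rendered source code from the tool into a string. The names `OnPageElements` and `summarize` are my own, and it only relies on the standard library’s `html.parser`.

```python
from html.parser import HTMLParser


class OnPageElements(HTMLParser):
    """Collects the title, canonical URL, <h1> text, and link targets from HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.canonical = None
        self.h1 = []
        self.links = []
        self._current = None  # tag whose text we are currently capturing

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "h1"):
            self._current = tag
            if tag == "h1":
                self.h1.append("")
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1[-1] += data


def summarize(rendered_html):
    """Return the on-page elements found in a string of rendered HTML."""
    parser = OnPageElements()
    parser.feed(rendered_html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "canonical": parser.canonical,
        "h1": [text.strip() for text in parser.h1],
        "links": parser.links,
    }
```

An empty title, no `<h1>`, and an empty list of links in the summary are a strong hint that the content isn’t in the code Google rendered.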
Search for visible content in Google
It’s always good to sense-check. Another quick way to check whether Googlebot has indexed content on your page is to select some visible text on the page and do a site: search for it in Google, with quotation marks around the text.
In our example there is visible text on the page that reads…
“Whether you are in marketing, business development, or IT, you feel a sense of urgency. Or maybe opportunity?”
When we do a site: search for this exact phrase, for this exact page, we get nothing. This strongly suggests Google hasn’t indexed the content.
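For reference, here’s a small sketch of how that query is put together. The function names are hypothetical and `example.com` is a placeholder for the page you’re checking; the only assumption about Google is the standard `q` parameter on its search URL.

```python
from urllib.parse import urlencode


def site_search_query(page_url, phrase):
    """Build a site: query that looks for an exact phrase on one page."""
    # Drop the scheme and trailing slash so the operator reads site:example.com/path
    target = page_url.split("://", 1)[-1].rstrip("/")
    return f'site:{target} "{phrase}"'


def site_search_url(page_url, phrase):
    """Return a Google search URL for the query (to open in a browser)."""
    return "https://www.google.com/search?" + urlencode(
        {"q": site_search_query(page_url, phrase)}
    )


# Placeholder URL and phrase, for illustration only.
print(site_search_query("https://example.com/landing/", "visible text from the page"))
```

The quotation marks matter here: without them Google matches the words loosely, and a result no longer tells you that this exact copy was indexed.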
Crawling with a tool
From here you can input your domain/URL and see the rendered page/code once your tool of choice has completed the crawl.
When attempting to answer this question, my preference is to start by inputting the domain into Google’s mobile friendliness tool, copying the source code, and searching for important on-page elements (think title tag, <h1>, body copy, etc.). It’s also helpful to use a tool like a diff checker to compare the rendered HTML with the original HTML (Screaming Frog also has a function where you can do this side by side).
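If you don’t have a diff tool handy, the comparison can be roughly approximated with Python’s built-in `difflib`. This is a sketch, assuming you’ve saved both versions of the HTML as strings; `added_by_rendering` is a name I made up, and the two snippets at the bottom are toy examples rather than real output.

```python
import difflib


def added_by_rendering(original_html, rendered_html):
    """Return the lines that only show up in the rendered HTML."""
    diff = difflib.ndiff(original_html.splitlines(), rendered_html.splitlines())
    # ndiff prefixes lines unique to the second input with "+ "
    return [line[2:] for line in diff if line.startswith("+ ")]


# Toy example: the original source is an empty shell,
# and the heading only exists after rendering.
original = '<html>\n<body>\n<div id="app"></div>\n</body>\n</html>'
rendered = '<html>\n<body>\n<div id="app">\n<h1>Welcome</h1>\n</div>\n</body>\n</html>'

for line in added_by_rendering(original, rendered):
    print(line)
```

Anything this returns is content that depends on rendering, so that’s where I’d look first for missing titles, headings, or body copy.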
For our example, here is what the output of the mobile friendliness tool shows us.
After a few searches, it becomes clear that important on-page elements are missing here.
We also did the second test and confirmed that Google hasn’t indexed the body content found on this page.
The implication at this point is that Googlebot is not seeing our content the way we want it to, which is a problem.
Let’s jump ahead and see what we can recommend to the client.
Question 3: If we’re confident Googlebot isn’t seeing our content properly, what should we recommend?
How do we do that?
You want server-side rendering
The fix here is to have Sitecore’s landing page render on their server. In other words, we want to take the heavy lifting off of Googlebot and put it on Sitecore’s servers. That way, when Googlebot comes to the page, it doesn’t have to render anything itself and can simply crawl the finished HTML.
In this scenario, Googlebot lands on the page and already sees the HTML (and all the content).
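To make the difference concrete, here’s a deliberately simplified sketch of the two kinds of initial response. It isn’t how Sitecore (or any particular framework) does it; the page data, the `/app.js` bundle, and the function names are all placeholders.

```python
from html import escape

# Placeholder page data, for illustration only.
PAGE = {
    "title": "Example landing page",
    "body": "Body copy that we want search engines to see.",
}


def client_side_response():
    """Initial HTML for a client-rendered page: an empty shell plus a script."""
    # The content only appears after the browser (or bot) runs /app.js.
    return (
        "<html><head><title></title></head>"
        '<body><div id="app"></div>'
        '<script src="/app.js"></script></body></html>'
    )


def server_side_response(page):
    """Initial HTML for a server-rendered page: the content is already there."""
    return (
        "<html><head><title>{title}</title></head>"
        '<body><div id="app"><h1>{title}</h1><p>{body}</p></div>'
        '<script src="/app.js"></script></body></html>'
    ).format(title=escape(page["title"]), body=escape(page["body"]))


print(PAGE["body"] in client_side_response())      # False
print(PAGE["body"] in server_side_response(PAGE))  # True
```

The second response is what we’re asking for: the title and body copy are in the very first HTML the crawler receives.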
There are more specific options (like isomorphic setups)
This is where it gets a bit in the weeds, but there are hybrid solutions. The best one at the moment is called an isomorphic setup.
In this model, we’re asking the client to load the first request on their server, and then any future requests are made client-side.
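Here’s a rough sketch of that split, again in Python purely for illustration (real isomorphic setups share JavaScript between server and browser). The routes, the `/api/pages` prefix, and the page data are assumptions of mine, not anything from a specific framework.

```python
import json
from html import escape

# Placeholder content, keyed by path.
PAGES = {
    "/": {"title": "Home", "body": "Welcome copy."},
    "/agenda": {"title": "Agenda", "body": "Agenda copy."},
}


def handle(path):
    """Return (content_type, body) for a request path."""
    prefix = "/api/pages"
    if path.startswith(prefix):
        # Later navigations: the client-side app asks for data only
        # and updates the page itself.
        page = PAGES[path[len(prefix):] or "/"]
        return "application/json", json.dumps(page)
    # First request (a user or a bot landing on a URL): send full HTML.
    page = PAGES[path]
    html = (
        "<html><head><title>{title}</title></head>"
        '<body><div id="app"><h1>{title}</h1><p>{body}</p></div>'
        '<script src="/app.js"></script></body></html>'
    ).format(title=escape(page["title"]), body=escape(page["body"]))
    return "text/html", html
```

Every URL a crawler can land on returns complete HTML, while people clicking around after the first load still get the speed of client-side updates.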
If you’re looking to recommend this as a solution, please read this post from the Airbnb team, which covers isomorphic setups in detail.
AJAX crawling = no go
(However, I am interested to hear any case studies from anyone who has implemented this solution recently. How has Google responded? Also, here’s a great write-up on this from my colleague Rob.)
- Ask the developers.
- Check to see if Googlebot is seeing content the way we intend it to.
  - Google’s mobile friendliness checker.
  - Doing a site: search for visible content on the page.
- Give an ideal recommendation to the client.
  - Server-side rendering.
  - Hybrid solutions (isomorphic).
  - Not AJAX crawling.