Using Fetch & Render and Log File Analysis To (Try And) Understand When Google Indexes JS

29 December 2017 Andy Allen Company Updates

If you read our last post you’ll remember that we discussed how long it takes Google to index JS changes made to a webpage (or, more specifically, the changes you can make through our Onsite Optimiser tool, such as metadata changes).

This is very important to understand for anyone who is going to update SEO elements using JavaScript. Remember that we are talking about ‘updating’ an element of the page, meaning that the raw un-rendered HTML content differs from the rendered HTML content.
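To make that raw-versus-rendered distinction concrete, here is a minimal Python sketch that compares the page title in two HTML strings. It assumes you have already saved both versions yourself (for example via ‘view source’ for the raw HTML and a browser’s rendered DOM for the other). The names `TitleParser`, `page_title` and `title_changed_by_js` are made up for illustration and are not part of our tool.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside <title>...</title>."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def page_title(html):
    """Return the stripped <title> text of an HTML string ('' if none)."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip()


def title_changed_by_js(raw_html, rendered_html):
    """True when the rendered title differs from the raw (un-rendered) one."""
    return page_title(raw_html) != page_title(rendered_html)


# Hypothetical example: the raw HTML ships one title, JS swaps in another.
raw = "<html><head><title>Old title</title></head><body></body></html>"
rendered = "<html><head><title>New JS title</title></head><body></body></html>"
print(title_changed_by_js(raw, rendered))
```

The same idea extends to other elements (H1, meta description), but the title is the simplest case to check.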

In that post we shared the timings that users of our tool experience when having their changes indexed by Google, and these differed greatly depending on whether it was the first or not-the-first JS change made since Google crawled the page.

In this post we’re going to expand on that a little further by looking specifically at which bots Google is using to discover JS, and seeing if this differs between pages that do and do not have JS-made changes applied**.

Spoiler alert! It kind of does / doesn’t. Meh.

 

Methodology

In order to discover how Google’s crawling behaviour differs when presented with JS content we did the following:

  1. Created two sets of pages; one group that has JS-updated content and another group that does not
    • For those that do, we created a smaller sub-set of pages: first or not-the-first JS change made to the page
  2. Submitted these pages for indexing via the ‘Fetch + Render’ tool in Search Console
  3. Monitored our server logs to see how Google crawled these pages and to see if there was a difference between the two sets of pages
  4. Monitored the SERPs to see if the content was indexed correctly
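If you want to try step 3 yourself, the log monitoring can be sketched roughly as below. This is not the exact script we used; it assumes your server writes the common ‘combined’ access log format, and it assumes the Chrome/41 bot can be spotted by a user agent containing both "Chrome/41" and "Google". The sample log lines, the path `/example-page/` and the function names are invented for illustration, so adjust them to your own setup.

```python
import re
from datetime import datetime

# Combined Log Format:
# ip - - [time] "METHOD path PROTOCOL" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"$'
)


def classify_agent(agent):
    """Label a user agent as 'chrome41', 'googlebot' or None (heuristic)."""
    # Assumption: the rendering bot reports Chrome/41 plus a Google token.
    if "Chrome/41" in agent and "Google" in agent:
        return "chrome41"
    if "Googlebot/2.1" in agent:
        return "googlebot"
    return None


def google_hits(lines, test_paths):
    """Return sorted (timestamp, path, bot) tuples for hits on test_paths."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that are not in the expected format
        bot = classify_agent(match["agent"])
        if bot is None or match["path"] not in test_paths:
            continue
        when = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z")
        hits.append((when, match["path"], bot))
    return sorted(hits)


# Made-up sample lines (NOT taken from our real logs).
SAMPLE_LOG = [
    '203.0.113.5 - - [02/Jan/2018:09:00:00 +0000] "GET /example-page/ HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36"',
    '192.0.2.10 - - [02/Jan/2018:10:15:02 +0000] "GET /example-page/ HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
    '+http://www.google.com/bot.html)"',
    '192.0.2.11 - - [03/Jan/2018:08:30:45 +0000] "GET /example-page/ HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) '
    'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile '
    'Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

for when, path, bot in google_hits(SAMPLE_LOG, {"/example-page/"}):
    print(when.isoformat(), path, bot)
```

Matching on the user agent alone is a shortcut: anyone can spoof a Googlebot user agent, so treat the output as a first pass and not as proof of who visited.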

Pretty simple really. In fact it’s probably already been done and detailed in another blog post somewhere! If you know of one please share with us on social media 🙂

Test Results

So the results were fairly interesting, although we have to admit they were not as conclusive as we’d hoped. This could be due to the small sample size we’re playing with, so we’d love to hear from anyone else who has additional findings (whether they show the same or different results).

There was an obvious pattern of Googlebot activity that aligned with when we hit ‘Fetch & Render’ and then ‘Request Indexing’ for a webpage: our pages were hit by Googlebot/2.1 twice in quick succession.

Every page was also hit by the WRS Googlebot (the bot that renders as Chrome/41 would), either later the same day or the next day.

For pages where a JS change was made to the page, and it was the first time a JS change had ever been done, this hit by the Chrome/41 Googlebot aligned very closely with when we saw the updated changes indexed (i.e. appear in the SERPs).

However, for pages where a JS change was made, but it was not the first JS change Google had encountered, this visit by the Chrome/41 Googlebot did not necessarily align with an update in the SERPs.
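One rough way to check this kind of alignment on your own pages is to record, per page, the date of the Chrome/41 hit (from your logs) and the date you first saw the change in the SERPs (noted by hand), then look at the gap. The sketch below assumes exactly that; the dates and page names are hypothetical placeholders, not our test data, and `render_to_index_lag` is a name we made up.

```python
from datetime import date


def render_to_index_lag(chrome41_hit, serp_updated):
    """Days between the Chrome/41 hit and the SERP update.

    Returns None when the change has not (yet) appeared in the SERPs.
    """
    if serp_updated is None:
        return None
    return (serp_updated - chrome41_hit).days


# Hypothetical observations: page -> (Chrome/41 hit, SERP update or None).
observations = {
    "/first-js-change/": (date(2018, 1, 10), date(2018, 1, 10)),
    "/not-first-js-change/": (date(2018, 1, 10), date(2018, 1, 14)),
    "/still-waiting/": (date(2018, 1, 10), None),
}

for page, (hit, updated) in observations.items():
    lag = render_to_index_lag(hit, updated)
    status = "not indexed yet" if lag is None else f"{lag} day(s) after the hit"
    print(page, "->", status)
```

A lag of zero or one day would match what we saw for first-ever JS changes; larger or missing values are the cases where the Chrome/41 visit and the SERP update do not line up.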

So, here are the tests… (these tests were done at different times so excuse the inconsistency there. However, we’ve included as much log data as possible so you guys can analyse for yourselves too)

Test 1 – Page with no JS-made changes

The example page is https://thewebshed.co/reputation-monitoring/. We’ve made no changes to this page using JS before submitting this page to fetch & render (‘F+R’) and requesting indexing (‘I’) on the 20th December.

So we saw one hit from the F+R request and another from the I request. Straightforward. Notice though that the Chrome/41 bot hit the page the next day.

Test 2 – Page with JS-made change but not updated

The example page is https://thewebshed.co/citation-audit-alignment/. There is an update to the page title of this page made via JS, but this was done (and indexed) before this test on the 20th December.

Pretty much the exact same behaviour as the test above.

Test 3 – Page with first ever JS-made change 

The example page is https://thewebshed.co/csv-uploads-now-within-onsite-optimiser/. We updated the page title and H1 via JS just before submitting this page on 27th December.

So, this page saw a similar pattern of hits to the above two pages. One thing to note though was that the JS content was not indexed until after the Chrome/41 bot had visited the page (on the 28th). This leads us to think it was only through this bot that Google was aware of the content change (i.e. sent it to the indexer ‘Caffeine’; more on that in the conclusion to this post).

Test 4 – Page with not-the-first ever JS-made change

The example page is https://thewebshed.co/free-trial/. We updated the page title via JS before submitting this on 28th December.

Almost identical pattern to the above (although notice no additional visit of ‘regular’ Googlebot before the Chrome/41 bot hit on the 29th), and as with the previous page the new JS content change was only indexed after the Chrome/41 bot hit.

Another example in this test is https://thewebshed.co/keyword-tracker/. We updated the page title again via JS before submitting it on the 20th December.

This one’s a bit of an anomaly as Google (despite crawling with Chrome/41 bot) has still not indexed the new JS-updated page title. However, we’ve made quite a few JS-made changes to this page over the last few weeks so we knew this one would take longer to be indexed.

Conclusion

So – as with any Google test it seems – the results did vary between individual page tests so we can only conclude “this is what it looks like Google does” rather than “this is exactly what Google does”!

There’s one big piece of the jigsaw missing from this test: Google’s indexer ‘Caffeine’. Barry Adams does a great job of explaining the difference between crawling and indexing in terms of JavaScript (read it here). He pointed out to me on Twitter recently that even though a page may have been crawled, that doesn’t necessarily translate to it being rendered and indexed.

Understanding how Caffeine operates in more detail will, I’m sure, fill in a few of the gaps missing from our analysis. If you know of any great resources about Caffeine do let us know on Twitter!

 

Next Steps

This is a really easy test to replicate, so it would be great if anyone else who has done (or will do) the same could share their results with the community. The more data we have the better we’ll understand how Google indexes JS.

We’ll continue to test in the same way, over a larger set of pages, to try and get a better understanding. It’s also very likely that Google will continue to improve the speed at which it indexes JS, and hopefully also come out with some documentation around the topic!

Hope you enjoyed the post. Please follow us on Twitter for the latest updates.

 

**NB – This post is by no means a definitive guide/answer. It is merely what we have seen from our testing in the wild, using a very small subset of pages from this website. Extrapolate if you wish, but we advise that everyone tests for themselves. The more data we have as an industry the more knowledgeable we’ll be 🙂
