Google indexing
WordPress has just announced a significant change to the way it handles blocking Google bots and website indexing in general. The new 5.3 release takes a new approach: the Robots Meta Tag will replace Robots.txt.
How does it work?
There is currently a setting in WordPress, "Discourage search engines from indexing this site" (found in Settings→Reading), which uses robots.txt and adds a Disallow line to it. This prevents search engines from crawling the site and, in turn, from indexing it in Google.
Using Robots.txt to block the indexing of a website is standard practice.
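The Disallow mechanism described above can be sketched with Python's standard `urllib.robotparser` module. This is only an illustration: the robots.txt content, the example.com URL and the helper name `is_crawl_allowed` are assumptions made for this sketch, not the exact output of WordPress.

```python
from urllib.robotparser import RobotFileParser

# Assumed example of a robots.txt that blocks every crawler from the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A well-behaved crawler checks the rules before downloading a page.
    print(is_crawl_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/any-page/"))
```

Note that this check only answers "may I download this page?". It says nothing about whether the URL can still end up in a search index.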
But what exactly does indexing mean? Googlebot first crawls (downloads) the pages of a site, and Google then adds them to its index. This is how Google knows about your website and shows it in its search results. You can stop Google from downloading a webpage using Robots.txt and its blocking mechanism. In theory, this should stop Google from showing your pages in its search results.
However, the robots.txt mechanism only stops Google from crawling a page. Google could still add that page to its index if it discovers the URL some other way, such as a link from another site, which makes this option rather unreliable.
WordPress 5.3 will indeed prevent Google indexing
WordPress 5.3 will abandon the Robots.txt approach, in which choosing the option "Discourage search engines from indexing this site" only prohibited Google from crawling the site.
With the new WordPress 5.3, the far more reliable Robots Meta Tag approach will be adopted to prevent a website from being indexed. The change is a great improvement and will assure users that their blocked pages will not be shown in Google Search.
Why stop your site from being indexed?
A lot of people may start to wonder: why would I stop my site from being visible on Google? Isn't that the main purpose of writing content, creating new designs and making the site user friendly for visitors?
The answer is yes and no. If you have a clone of your website, such as a beta version or a staging site where you first make changes that you later implement on your live website, then you certainly do not want that content to be visible. It would make the search results on Google look strange or duplicate your content, which in turn can lead to a penalty from Google.
Robots.txt
Using Robots.txt to block pages from Google was the way to do it until now. However, despite its popularity, this method was not very reliable. As of September 2019, Google no longer supports the noindex directive in robots.txt, which prompted the change in WordPress.
Robots.txt was never "official", despite being honored by most search engines, and as things have evolved, it is no longer fit for this purpose.
Robots.txt vs. Robots Meta Tag
The idea behind Robots.txt was to keep a page out of Google's index. This is now the Robots Meta Tag's job. The Robots Meta Tag is a specific tag which tells search engines such as Google which pages to index and which links on your website to follow or not to follow. Keep in mind that having a lot of links going out of your site can lower your page rank. Keep some of the Google juice for yourself by preserving some links.
The Robots Meta Tag has 4 main functions:
Follow – this command tells the search engine crawler to follow the links in the webpage.
Index – this command tells the search engine crawler to index the webpage.
Nofollow – this command tells the crawler NOT to follow the links in the webpage.
Noindex – this command tells the crawler NOT to index that specific webpage.
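As a rough illustration of how a crawler might read these commands, the sketch below parses a robots meta tag out of a page using Python's standard `html.parser` module. The sample HTML and the names `RobotsMetaParser`, `robots_directives`, `may_index` and `may_follow_links` are assumptions made for this example. The exact tag WordPress 5.3 prints may differ.

```python
from html.parser import HTMLParser

# Assumed sample page carrying a robots meta tag in its <head>.
SAMPLE_PAGE = """
<html>
  <head>
    <meta name="robots" content="noindex, nofollow">
  </head>
  <body>Staging site</body>
</html>
"""


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of commands.
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def may_index(html: str) -> bool:
    # Indexing is allowed unless the page explicitly says noindex.
    return "noindex" not in robots_directives(html)


def may_follow_links(html: str) -> bool:
    # Links are followed unless the page explicitly says nofollow.
    return "nofollow" not in robots_directives(html)


if __name__ == "__main__":
    print(sorted(robots_directives(SAMPLE_PAGE)))
    print(may_index(SAMPLE_PAGE), may_follow_links(SAMPLE_PAGE))
```

Unlike robots.txt, the crawler has to download the page to see this tag, so the page must not be blocked from crawling for the noindex command to take effect.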
If you are relying solely on robots.txt to prevent certain pages of your website from being indexed, now is the time to take action! WordPress will handle this for you, but make sure to contact our Fixed team should you encounter any issues!