
A meta robots tag (often just called "meta robots") is an HTML element that gives search engine crawlers instructions to follow. Crawlers use these instructions to determine how a page's URL should be crawled, indexed, and displayed in search results.
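For example, here is a minimal sketch of a page's <head> carrying a meta robots tag (the directives shown are just one common combination; the full list is covered below):

  <head>
    <!-- Tells all crawlers: do not index this page and do not follow its links -->
    <meta name="robots" content="noindex, nofollow">
  </head>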
Why are Meta Robots Tags Important to SEO?
Meta robots tags are important because they give SEO managers, product managers, and developers control (in addition to the robots.txt file) over how crawlers treat your pages. This control comes in the form of directives that allow or disallow crawlers from doing certain things with your page's content.
Using Meta Robots Tags
Meta robots tags can be placed either in the <head> section of a page's HTML (as a meta tag) or in the HTTP header (as an x-robots-tag). The two can be used interchangeably, and Google's search engine does not prefer one over the other. A meta robots tag is made up of two parts: the name field, which specifies which crawler should follow the directive, and the content field, which specifies what the crawler should do.

Below is a list of the directives that crawlers accept in the content field. Use these to guide search crawlers while they are reviewing your website.

- All - No restrictions; the crawler may discover anything on the page. This is the default if no directive exists.
- Noindex - Does not add this URL to search engine results pages (SERPs).
- Follow - Follows links on this page and passes link juice from your site to the linked pages.
- Nofollow - Does not allow link juice to pass from your site to any link on the page (including navigation and internal links).
- Noimageindex - Does not add images within this URL to SERPs.
- None - Shortcut for noindex, nofollow.
- Noarchive - Does not show a cached version of this page in SERPs.
- Nosnippet - Does not show a text snippet or video preview in SERPs.
- Notranslate - Does not offer a translated version of this page in SERPs.
- Max-snippet: [length] - Sets the maximum length, in characters, of a text snippet in SERPs.
- Max-image-preview: [setting] - Sets the maximum size (none, standard, or large) of an image preview in SERPs.
- Max-video-preview: [time] - Sets the maximum length, in seconds, of a video preview in SERPs.
- Noodp - Does not allow the DMOZ/Open Directory Project description to be used in SERPs.
- Unavailable_after: [date/time] - Does not show this URL within SERPs after the specified date/time.

Not all search engines respect every directive, and some search engines have unique ones (e.g., noyaca for Yandex). Since those unique values are rare, the list above covers the directives you will encounter 99% of the time.

In addition to these directives, you can give instructions to a specific crawler by putting its user agent in the name field. For example, <meta name="robots" content="noindex"> instructs all robot crawlers not to show the page in SERPs, while <meta name="googlebot" content="noindex"> instructs only Google not to show the page in SERPs. It's possible to give different directives to many different user agents/crawlers on the same page.
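As a sketch, here is how a page's <head> might combine a directive for all crawlers with additional instructions aimed only at Googlebot (the specific values are illustrative):

  <head>
    <!-- All crawlers: do not show a cached copy of this page in SERPs -->
    <meta name="robots" content="noarchive">
    <!-- Googlebot only: cap text snippets at 50 characters, allow large image previews -->
    <meta name="googlebot" content="max-snippet:50, max-image-preview:large">
  </head>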
Benefits of Meta Robots Tags
Meta robots tags are useful for guiding crawlers, most often by directing them away from content that should not appear in organic rankings. While it might sound appealing to have everything indexed, large dynamic websites often contain many pages that are not useful landing pages from search: a print version of a page, a thank-you page, or a thin page that only makes sense in the context of a user's on-site experience. Instead of letting search engines spend crawl time on these pages, meta robots tags give you control to reinforce the pages you do want indexed and shown in SERPs.

Another benefit is that these tags let you pass link juice from a page you don't want to appear in search results to the pages it links to. The follow directive is especially useful here (see the sketch below).
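For instance, a sketch of the tag you might place on a thank-you page so it stays out of SERPs while still passing link juice to the pages it links to:

  <meta name="robots" content="noindex, follow">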
Meta Robots Best Practices
Use robots.txt and meta tags together. Some SEO teams rely on robots.txt alone to disallow certain pages, but that's not always enough. Crawlers can still find their way to pages you don't want indexed via links on other websites, so a robots.txt directive doesn't always keep a page out of the index. These backlinks give crawlers a backdoor, and using meta robots tags to noindex such pages is a safer way to keep low-value pages out of organic search indexes.

Use HTTP header tags for non-HTML files. For files that are not webpages but whose treatment by search engines you want to control, use the x-robots-tag within the HTTP header. PDFs, images, videos, and other non-HTML content can be controlled with HTTP header tags (see the sketch below).
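For example, here is a sketch of how the x-robots-tag could be set for every PDF on a site, assuming an Apache server with mod_headers enabled (nginx has an equivalent add_header rule); the crawler then sees the directive in the HTTP response rather than in any HTML:

  <FilesMatch "\.pdf$">
    Header set X-Robots-Tag "noindex, nofollow"
  </FilesMatch>

A crawler requesting one of those PDFs would receive a response header like:

  X-Robots-Tag: noindex, nofollow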
Meta Robots vs. Robots.txt
Robots.txt files are often cited as a great way for SEO managers to direct Google (and other crawlers) away from large chunks of unimportant content, and they do make it easy to disallow entire directories quickly. However, robots.txt only controls crawling, not indexing: a disallowed URL can still end up in the index if other sites link to it. To make sure a page stays out of SERPs, you still need a noindex directive on that page, delivered either as a meta robots tag or as an x-robots-tag HTTP header. Keep in mind that a crawler can only see a noindex directive if it is allowed to crawl the page, so don't rely on a noindex tag for a URL that robots.txt blocks. Robots.txt is not equivalent to meta robots; each has its own pros and cons.
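For comparison, a robots.txt rule that keeps compliant crawlers away from an entire directory looks like this (the /print/ path is a placeholder):

  User-agent: *
  Disallow: /print/

This stops compliant crawlers from fetching those URLs, but on its own it does not guarantee the URLs stay out of the index.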
Tracking Changes with Meta Robots Tags
Once you've set up your meta robots tags, making sure they don't change is tricky. There are three main ways to track meta robots changes: traditional SEO audits, internal tools, and an SEO change monitoring tool.

Traditional SEO audits have to be re-run after any change to your website. For example, you would compare the noindex and/or nofollow tags found in an audit against what your records say should be there. Changes sometimes happen without your knowledge (e.g., a third-party plugin), so staying up to date is a challenge, and the audit has to be repeated regularly (at least every week for most sites).

Internal tools can reduce the manual work of traditional SEO audits by maintaining a system of record that compares which tags should be present against what is actually in the code. Maintaining these systems is complex.

An SEO change monitoring tool, like SEORadar, tracks and compares what your site is currently serving against what it served the last time it was fetched. High organic traffic and zero organic traffic pages can be equally affected by a meta robots change (no one wants a high-traffic page to suddenly become noindex and drop out of SERPs), so make sure your SEO change tracking tool lets you control which URLs are tracked and captures meta robots tag changes on both kinds of pages.
Common Questions About Meta Robots Tags
Will all crawlers respect the meta robots tags?
It depends on the crawler. Malicious crawlers will not necessarily respect these directives, so meta robots tags should not be relied on to protect private or sensitive information.
Do I need to use HTTP headers and meta robots tags?
Depending on the page type, one or the other works. The best practice is not to use HTTP headers and meta robots tags together on the same page; you don't want a search crawler to encounter conflicting instructions if the two directives ever differ.
Can I control the exact snippet that Google will display?
Yes. The data-nosnippet attribute can be applied to span, div, and section elements to mark content that should not be used in snippets. Additionally, you can set a maximum snippet length with the max-snippet directive.
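For example, a sketch of how both controls might look on a page (the 160-character limit and the wrapped sentence are illustrative):

  <meta name="robots" content="max-snippet:160">
  ...
  <p>This sentence may appear in a snippet. <span data-nosnippet>This sentence is excluded from snippets.</span></p>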
Future of Meta Robots Tags
Since meta robots tags have been a standard for many years, we expect major search engines to continue adhering to the directives, though over time they may be followed either more or less stringently. For example, the data-nosnippet attribute exists in part because of the European Copyright Directive, which placed new requirements on how search engines display publishers' content.
Additional Resources
Below are relevant resources that delve even further into meta robots from a technical SEO perspective.