Prevent staging website from being indexed by search engines

Working on a website, whether it has already gone live or is a new project, you should always have a copy of the website in a staging environment. But you probably don’t want that website to be indexed by search engines. In this blog post, I will explain some ways to prevent indexing.

Method 1: Use the default setting

In the dashboard of your website, navigate to “Settings | Reading”. There you will find the setting “Search Engine Visibility” with a checkbox to “Discourage search engines from indexing this site”. Checking it will dynamically create a robots.txt file telling search engines not to index your site. But as the note says, “It is up to search engines to honor this request”, so some search engines might still index the page.
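With the box checked, the dynamically generated robots.txt that WordPress serves disallows everything, roughly like this:

```
User-agent: *
Disallow: /
```

Keep in mind that robots.txt only asks crawlers not to crawl; a page can still show up in results if other sites link to it.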

Method 2: Prevent indexing in the server config

The first method usually works pretty well. There is only one big issue: the setting is stored in the database and can easily be overwritten. How? Well, at some point you might want to import the database from the live website to see the latest content on your staging site. Do you always remember to go back to that setting and enable it again on staging? You might forget it.

Therefore, preventing indexing in the server configuration is a safer way, as it persists even when you import a database from live. Simply add the following lines to the configuration:

Header set X-Robots-Tag "noindex"
Header merge X-Robots-Tag "noarchive"

You can do that in the .htaccess of the staging website. This is most useful when you host staging on the same server as the live site or cannot change the global configuration. But you have to be careful not to overwrite the staging .htaccess file with the one from the live server.
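If the mod_headers module is not enabled, the Header directives in an .htaccess file would cause a server error, so it is safer to wrap them in an availability check. A minimal .htaccess sketch:

```
<IfModule mod_headers.c>
	Header set X-Robots-Tag "noindex"
	Header merge X-Robots-Tag "noarchive"
</IfModule>
```

With the IfModule wrapper, the site simply stays unprotected instead of breaking when the module is missing, so verify the header actually appears in the response.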

If you have a dedicated staging server and want none of the websites hosted there to be indexed, just add the lines to the global configuration, like /etc/apache2/apache2.conf or a similar file.
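On Debian or Ubuntu, instead of editing apache2.conf directly, you can put the directives into a separate config snippet and enable it; the file name here is just an example:

```
# /etc/apache2/conf-available/staging-noindex.conf (example name)
Header set X-Robots-Tag "noindex"
Header merge X-Robots-Tag "noarchive"
```

Enable it with “a2enconf staging-noindex” and reload Apache afterwards. A separate snippet survives package upgrades of the main config file and is easy to disable again.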

Method 3: Use a maintenance plugin

You could also protect your website from search engines and from other visitors by using a maintenance plugin. Such a plugin usually adds a password protection to your site, or you have to log in in order to see the website’s content. This can also be useful to give some people access (like the client), but not everyone. This method, however, has the same issue: once you import the live database, you have to make sure to reactivate the plugin, as the list of active plugins is also stored in the database.

Method 4: Use “basic access authentication”

With “basic access authentication”, a.k.a. htaccess protection, you can prevent access to the website without the need of a plugin. With an Apache webserver, you can add a few lines to your .htaccess file, and any visitor to your page has to enter a username and password. This method is safe against imported live databases, but when you overwrite the .htaccess with the one from the live website, you have to restore the protection again. Therefore you can also store it in your global (per site) server configuration.
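A minimal sketch of such a protection; the password file path and realm name are just examples:

```
AuthType Basic
AuthName "Staging"
AuthUserFile /etc/apache2/.htpasswd
Require valid-user
```

Create the password file once with “htpasswd -c /etc/apache2/.htpasswd staginguser” (omit -c when adding further users), and keep it outside the document root so it cannot be downloaded.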

Conclusion

There are many different ways to protect your staging website from being indexed. Whatever method you use, always make sure that it is still working after an import from live. You can remove already indexed pages from search engines, but you have to do it per domain and for every single search engine, using their individual tools.

Posted by

Bernhard is a full time web developer who likes to write WordPress plugins in his free time and is an active member of the WP Meetups in Berlin and Potsdam.
