When working on a website, whether it has already gone live or is a new project, you should always have a copy of the website in a staging environment. But you probably don’t want that copy to be indexed by search engines. In this blog post, I will explain some ways to prevent the indexing.
Method 1: Use the default setting
In the dashboard of your website, navigate to “Settings | Reading”. Here you will find the setting “Search Engine Visibility” with a checkbox to “Discourage search engines from indexing this site”. Depending on your WordPress version, this adds a rule to the dynamically created robots.txt file or a robots meta tag to every page, telling search engines not to index your site. But as the note says, “It is up to search engines to honor this request”, so some search engines might still index the page.
Method 2: Prevent indexing in the server config
The first method usually works pretty well. There is only one big issue: the setting is stored in the database and can easily be overwritten. How? Well, at some point you might want to import the database from the live website to see the latest content on your staging site. Do you always remember to go back to that setting and enable it again on staging? You might forget it.
Therefore, preventing the indexing in the server configuration is safer, as it stays in place even when you import a database from live. Simply add the following lines to the configuration:
Header set X-Robots-Tag "noindex"
Header merge X-Robots-Tag "noarchive"
You can do that in the .htaccess file of the staging website. This is most useful when you host it on the same server as the live website or cannot change the global configuration. But you have to be careful not to overwrite the .htaccess file with the one from the live server.
If you have a dedicated staging server and want none of the websites hosted there to be indexed, just add the lines to the global configuration, such as /etc/apache2/apache2.conf or a similar file.
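If the server also hosts other sites, the same headers can be scoped to a single virtual host instead of the whole server. The following is only a sketch: the host name staging.example.com and the document root /var/www/staging are placeholders I made up for illustration, and it assumes that mod_headers is enabled on the server.

```apache
<VirtualHost *:80>
    # Hypothetical host name and path, replace with your own
    ServerName staging.example.com
    DocumentRoot /var/www/staging

    # Only applied when mod_headers is loaded
    <IfModule mod_headers.c>
        # Ask search engines not to index or archive anything on this host
        Header set X-Robots-Tag "noindex"
        Header merge X-Robots-Tag "noarchive"
    </IfModule>
</VirtualHost>
```

Because this lives in the server configuration and not in the site's files or database, it should survive both a database import and an overwritten .htaccess file.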
Method 3: Use a maintenance plugin
You could also protect your website from search engines and from other users by using a maintenance plugin. This will usually add password protection to your site, or require visitors to log in before they can see the website’s content. This can also be useful for giving selected people (like the client) access, but not everyone. However, this method has the same issue as the first one: once you import the live website, you have to make sure to reactivate the plugin, as the state of activated plugins is also stored in the database.
Method 4: Use “basic access authentication”
With “basic access authentication”, also known as htaccess protection, you can prevent access to the website without the need for a plugin. With an Apache web server, you can add a few lines to your .htaccess file, and any visitor to your page has to enter a username and password. This method is safe against imported live databases, but when you overwrite the .htaccess file with the one from the live website, you have to restore the protection. Therefore, you can also store it in your global (per-site) server configuration.
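The “few lines” could look roughly like the following. Treat this as a minimal sketch: the realm name “Staging” and the path /path/to/.htpasswd are placeholders, and it assumes that your server allows authentication directives in .htaccess files (AllowOverride AuthConfig).

```apache
# Ask every visitor for a username and password
AuthType Basic
# Text shown in the browser's login dialog (placeholder)
AuthName "Staging"
# Placeholder path, keep the password file outside the web root
AuthUserFile /path/to/.htpasswd
# Any user listed in the password file may log in
Require valid-user
```

The password file itself can be created with Apache's htpasswd tool, for example with htpasswd -c /path/to/.htpasswd username. The same directives also work inside a Directory block of the per-site server configuration, where an overwritten .htaccess file cannot remove them.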
Conclusion
There are many different ways to protect your staging website from being indexed. Whatever method you use, always make sure that it is still working after you have done an import from live. If the site gets indexed anyway, you can remove it from search engines, but you have to do that per domain and for every single search engine, using their individual tools.