I am running Nginx 1.1.19 on an Ubuntu 12.04 server and I'm having trouble getting Googlebot to see the robots.txt file. I used the examples from this post, but without success. To test it, I go into Webmaster Tools and click "Health > Fetch as Googlebot"... but I only get messages like "Not Found", "Page not available", and "robots.txt file is not accessible"....

I would also like to confirm whether this configuration should go in nginx.conf or in the "default" file under /etc/nginx/sites-enabled, because I've noticed this can differ between versions. These are my basic settings:

root /usr/share/nginx/www;
index index.php;

# Rewrite the URLs.
location / {
    try_files $uri $uri/ /index.php;
}
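
For context, these directives only take effect inside a server block. Below is a minimal sketch of how the "default" file in /etc/nginx/sites-enabled typically wraps them (the server_name is hypothetical); note that robots.txt must physically exist in the root directory for the $uri branch of try_files to serve it:

server {
    listen 80;
    server_name example.com.br;   # hypothetical

    root /usr/share/nginx/www;    # robots.txt must exist in this directory
    index index.php;

    # Rewrite the URLs.
    location / {
        try_files $uri $uri/ /index.php;
    }
}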

3 Answers


I managed to solve my problem by adding a "rewrite" directive to the server block, as in the code below. After that, I went back to Google Webmaster Tools, redid the fetch with Googlebot, and it worked. I'll take the opportunity to leave my code here, since it also redirects port 80 to 443 and the non-www prefix to www.

# Redirect HTTP to HTTPS and non-www to www
server {
    listen 80;
    server_name domain.com.br;
    # Capture the path in ^(.*)$ so that $1 carries it into the redirect;
    # a bare "rewrite ^ ...$1" leaves $1 empty.
    rewrite ^(.*)$ https://www.domain.com.br$1 permanent;
    # (No location block is needed here: the rewrite above already
    # sends every request to the HTTPS server.)
}

server {
    listen 443;
    server_name www.domain.com.br;

    root /usr/share/nginx/www;
    index index.php;

    # Rewrite the URLs.
    location / {
        try_files $uri $uri/ /index.php;
    }

    [...] the code continued here
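
As an aside, the same redirect can be written with a return, which avoids the regex entirely; a sketch for the same placeholder domain, assuming an nginx new enough to support return with a URL (0.8.42+, so 1.1.19 qualifies):

server {
    listen 80;
    server_name domain.com.br;
    # 301 = permanent; $request_uri preserves the path and query string
    return 301 https://www.domain.com.br$request_uri;
}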


If you're managing an environment that mirrors production (staging, dev, etc.) and want to keep bots from indexing it, it's customary to add a robots.txt file at the root of your website that disallows everything. Instead of creating a two-line plain-text file, you can do this with nginx alone:

location = /robots.txt {
  add_header  Content-Type  text/plain;
  return 200 "User-agent: *\nDisallow: /\n";
}

Add this via your configuration management, gated by environment, or add it by hand, and stop worrying that Google might start broadcasting your dev site to the world.

https://alan.ivey.dev/posts/2017/robots.txt-disallow-all-with-nginx/
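
For the flip side, on the production host an empty Disallow field permits crawling of everything; a sketch of the same trick inverted (not from the linked post, just the standard allow-all robots.txt):

location = /robots.txt {
  add_header  Content-Type  text/plain;
  # Empty Disallow value = allow all crawlers everywhere
  return 200 "User-agent: *\nDisallow:\n";
}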

1 Comment

You are my lifesaver!

Check out my answer here.

As for adding it to your main nginx.conf file or to your /etc/nginx/sites-available files: that's up to you, depending on whether you want it to be global or site-specific, respectively.
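
One caveat: a location block is only valid inside a server block, so "global" here really means a shared snippet that every site includes. A sketch with hypothetical paths:

# /etc/nginx/snippets/robots.conf  (hypothetical path)
location = /robots.txt {
  add_header  Content-Type  text/plain;
  return 200 "User-agent: *\nDisallow: /\n";
}

# Then, in each server block in /etc/nginx/sites-available:
server {
    listen 80;
    server_name example.com;   # hypothetical
    include /etc/nginx/snippets/robots.conf;
}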

