What Is Robots.txt in SEO and How to Use It in Blogger - Detailed Explanation of the robots.txt File
Guys, do you love SEO?
Sure you do; that's why you are here.
So today I am going to talk about a legitimate SEO tip that can help you increase your traffic right away. Just kidding; there is no tip that can increase your traffic right away.
But this tip can help you a lot, and there is no rocket science behind implementing it.
The robots.txt file is designed to work with search engines, but surprisingly, it's also a source of SEO juice waiting to be unlocked.
There are many methods for improving SEO that require no technical experience and are neither difficult nor time-consuming to implement, and this is one of them.
So, when you are ready, come along with me; I'll show you how to change your robots.txt file so that your website gets loved by Google.
What Is Robots.txt File?
A robots.txt file (also known as the robots exclusion protocol or standard) is a tiny text file containing a set of instructions that tell search engine crawlers which pages or posts to crawl (and so index) and which pages not to crawl.
In Blogger, if you do not add a custom robots.txt file, Blogger uses a default robots.txt file, which can be modified.
The robots.txt file of any site can be accessed by adding /robots.txt to the site's domain.
The file we use in this blog can be accessed from:
https://www.stealprice.online/robots.txt
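If you want to work out the robots.txt address for any page, here is a minimal Python sketch. The helper name robots_url is made up for this example; it only assumes that the file sits at the root of the site, as described above.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the site that hosts page_url.

    robots_url is a hypothetical helper written for this post.
    """
    parts = urlsplit(page_url)
    # Keep only the scheme and the host; robots.txt lives at the site root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("https://www.stealprice.online/2018/03/something.html"))
# https://www.stealprice.online/robots.txt
```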
Let's say a search engine is about to visit your page: it will check the instructions in the robots.txt file before crawling.
Let's say the search engine finds this kind of robots.txt on your website:
User-agent: *
Disallow: /
The basic skeleton of a robots.txt file.
This is a very basic type of robots.txt.
Let's understand what it says.
The asterisk (*) after “User-agent” means that the robots.txt file applies to all web robots that visit the site.
The slash after “Disallow” means that the robot will not visit any page on the site, posts included.
If you want search engine bots to crawl only specific posts or content, you can narrow these rules so that the bots crawl and index your site based on that content alone.
As Google Says:
“You don’t want your server to be overwhelmed by Google’s crawler or to waste crawl budget crawling unimportant or similar pages on your site.”
How to Customize the robots.txt file
To customize the robots.txt file for your blog, log in to Blogger and go to Settings → Search preferences → Crawlers and indexing → Custom robots.txt, click Edit, select Yes, and finally add the lines you want to use.
Read on for more information on customizing the file.
The default robots.txt file
If you don't customize the robots.txt file, then Blogger uses this:
User-agent: Mediapartners-Google
Disallow:
User-agent: *
Disallow: /search
Allow: /
Sitemap: https://www.stealprice.online/sitemap.xml
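To see what this default file actually permits, you can feed it to Python's built-in urllib.robotparser module. This is only a sketch: the blank lines between groups are added for readability, the post URL is a sample, and Python's parser is not guaranteed to behave exactly like every search engine.

```python
from urllib.robotparser import RobotFileParser

DEFAULT_ROBOTS = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: https://www.stealprice.online/sitemap.xml
"""

def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

rules = load_rules(DEFAULT_ROBOTS)
site = "https://www.stealprice.online"

# Ordinary posts are open to every crawler.
print(rules.can_fetch("Googlebot", site + "/2018/03/something.html"))  # True
# Search and label pages are closed to crawlers in the * group...
print(rules.can_fetch("Googlebot", site + "/search/label/something"))  # False
# ...but the AdSense crawler has its own group with an empty Disallow.
print(rules.can_fetch("Mediapartners-Google", site + "/search/label/something"))  # True
# The Sitemap line is exposed separately (Python 3.8 or newer).
print(rules.site_maps())
```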
This file contains important information that directs the crawlers. There are different types of crawlers: search spiders, which specialize in finding content to index; crawlers that specialize in fetching information, such as Facebook's facebookexternalhit/1.1, whose function is to crawl links that have been shared on Facebook and bring back the pictures and information stored in them; and the Alexa crawler ia_archiver, whose function is to browse sites for content analysis, related sites, and other analytical functions.
There are many types of crawlers, each developed to perform a specific function.
Robots.txt file components
User-agent
It determines the type of crawler to which the commands will be given. To select the Facebook crawler facebookexternalhit/1.1, for example, we will use these lines:
User-agent: facebookexternalhit/1.1
Disallow: /p
Disallow: /search/label
Allow: /
User-agent: *
Allow: /
Sitemap: https://www.stealprice.online/sitemap.xml
The commands in the first group apply to the Facebook crawler only (the commands themselves are explained below). If we use the * sign as the user agent, the group applies to all other crawlers that are not named elsewhere in the file. So the Allow: / found under User-agent: * has no effect on the Facebook crawler, because the Facebook crawler has been assigned its own commands.
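Here is a rough way to check that each crawler only follows its own group, again using Python's urllib.robotparser. One assumption to flag: Python's parser compares only the part of the crawler name before the slash, so this sketch writes the group name as the bare token facebookexternalhit, without the /1.1 version. The helper who_can_fetch is hypothetical.

```python
from urllib.robotparser import RobotFileParser

ROBOTS = """\
User-agent: facebookexternalhit
Disallow: /p
Disallow: /search/label
Allow: /

User-agent: *
Allow: /
"""

def who_can_fetch(robots_text: str, url: str, agents) -> dict:
    """Map each crawler name to True/False for one URL (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}

result = who_can_fetch(
    ROBOTS,
    "https://www.stealprice.online/p/contact.html",
    ["facebookexternalhit/1.1", "Googlebot"],
)
print(result)
# {'facebookexternalhit/1.1': False, 'Googlebot': True}
```

The Facebook crawler is stopped by its own Disallow: /p, while Googlebot falls through to the * group and is allowed.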
User-agent commands
Disallow -
You can add a Disallow line to prevent a crawler from accessing all or some of your blog pages.
Example:
User-agent: *
Disallow: /
This line will block all kinds of crawlers from accessing your site.
Note: the * stands for all kinds of crawlers.
Disallow prevents access to any page whose path starts with the given value, and since every path starts with /, that means all pages. If you want to prevent access to only some pages, you can specify them as in this example:
User-agent: *
Disallow: /p
Disallow: /2018/03/something.html
Disallow: /search
Thus, crawlers will not be able to access the mentioned pages, and these are examples of some links that cannot be reached when using these lines:
https://www.stealprice.online/p/contact.html
https://www.stealprice.online/2018/03/something.html
https://www.stealprice.online/search/label/something
https://www.stealprice.online/search?q=something
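The reason all four links are blocked is that Disallow works as a prefix match on the path. A tiny sketch of that idea, assuming plain prefixes only (no wildcards) and using a made-up helper named is_blocked:

```python
from urllib.parse import urlsplit

# The three Disallow values from the example above.
DISALLOWED = ("/p", "/2018/03/something.html", "/search")

def is_blocked(url: str, prefixes=DISALLOWED) -> bool:
    """True if the path (plus query string) starts with any Disallow prefix."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return any(target.startswith(prefix) for prefix in prefixes)

for link in (
    "https://www.stealprice.online/p/contact.html",
    "https://www.stealprice.online/search?q=something",
    "https://www.stealprice.online/2018/04/another-post.html",  # sample post, not blocked
):
    print(link, is_blocked(link))
```

Keep in mind that a short prefix such as /p would also catch any other path that happens to begin with those characters.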
However, if Disallow is empty, like this:
User-agent: *
Disallow:
Crawlers will access all of your pages.
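The difference between Disallow: / and an empty Disallow: is easy to miss, so here is a quick side-by-side check with Python's urllib.robotparser. The crawler name ExampleBot and the helper can_crawl are made up for this sketch.

```python
from urllib.robotparser import RobotFileParser

def can_crawl(robots_text: str, url: str, agent: str = "ExampleBot") -> bool:
    """Check one URL against robots.txt text (ExampleBot is a made-up crawler)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(agent, url)

url = "https://www.stealprice.online/2018/03/something.html"
print(can_crawl("User-agent: *\nDisallow: /", url))  # False: everything is blocked
print(can_crawl("User-agent: *\nDisallow:", url))    # True: nothing is blocked
```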
Allow -
You can add an Allow line to let the crawler access specific pages; for example:
User-agent: *
Disallow: /p
Allow: /p/important.html
Here Disallow prevents access to every static page in the blog, because Blogger hosts all static pages at links beginning with /p,
except that Allow permits one page, /p/important.html.
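How does the crawler decide between a Disallow and an Allow that both match? Google documents that the most specific (longest) matching rule wins, and that Allow wins a tie. Python's own urllib.robotparser applies rules in file order instead, so the sketch below implements the longest-match idea by hand. It assumes plain prefixes with no wildcards, and is_allowed is a hypothetical helper, not part of any library.

```python
def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Longest-match check: the most specific matching prefix decides."""
    best_len = -1
    allowed = True  # if no rule matches, the path may be crawled
    for directive, prefix in rules:
        if not prefix or not path.startswith(prefix):
            continue  # empty or non-matching rules are ignored
        is_allow = directive.lower() == "allow"
        # A longer prefix is more specific; on equal length, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed

RULES = [("Disallow", "/p"), ("Allow", "/p/important.html")]
print(is_allowed("/p/important.html", RULES))        # True
print(is_allowed("/p/contact.html", RULES))          # False
print(is_allowed("/2018/03/something.html", RULES))  # True
```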
Sitemap
It tells the search spiders where to find the sitemap of your blog, which makes it easier for them to reach your blog's links. In Blogger, the sitemap can be accessed in different ways:
https://www.stealprice.online/feeds/posts/default?orderby=UPDATED
https://www.stealprice.online/feeds/posts/default?alt=atom
https://www.stealprice.online/sitemap.xml
https://www.stealprice.online/atom.xml
From my perspective, the best is the first, because it lists posts in order of last update, meaning any post you update moves to the top of the map.
You add it at the bottom of the robots.txt file like this:
User-agent: *
Allow: /
Sitemap: https://www.stealprice.online/feeds/posts/default?orderby=UPDATED
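If you prefer to generate the text before pasting it into Blogger, a small Python sketch like this one can assemble it. The function build_robots_txt is made up for this post, and it handles a single user-agent group only.

```python
def build_robots_txt(rules, sitemap_url, agent="*"):
    """Render one user-agent group followed by a Sitemap line at the bottom."""
    lines = [f"User-agent: {agent}"]
    lines += [f"{directive}: {path}" for directive, path in rules]
    lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"

text = build_robots_txt(
    [("Allow", "/")],
    "https://www.stealprice.online/feeds/posts/default?orderby=UPDATED",
)
print(text)
```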
Comments #
As in various programming languages, you can add comments to a robots.txt file. You may want a comment to remember something, to explain why you added a command, or just as a note.
To write a comment, start it with #, as in this example:
User-agent: *
Allow: /
# The page below has been denied access because it contains important information that we do not want to archive
# this page contains sensitive info
Disallow: /p/secret.html
Sitemap: https://www.stealprice.online/feeds/posts/default?orderby=UPDATED
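Crawlers simply skip comments. A minimal sketch of how a parser might drop them, assuming everything from the first # to the end of the line is a comment; both helper names are made up for this post.

```python
def strip_comment(line: str) -> str:
    """Drop everything from the first # onward and trim the whitespace."""
    return line.split("#", 1)[0].strip()

def active_lines(robots_text: str) -> list:
    """Return only the lines that still carry a command after comments are removed."""
    return [s for s in map(strip_comment, robots_text.splitlines()) if s]

sample = """\
User-agent: *
Allow: /
# The page below has been denied access
Disallow: /p/secret.html  # this page contains sensitive info
"""
print(active_lines(sample))
# ['User-agent: *', 'Allow: /', 'Disallow: /p/secret.html']
```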
How to use the wildcard in links
The wildcard * matches any sequence of characters, of any length.
It can mean "everything," for example:
User-agent: *
# In the code below, we have blocked access to the entire blog
Disallow: /
# but we have enabled access to a group of other pages.
Allow: /p/*.html
Every page under /p/ that ends in .html is now accessible, because * can mean everything; a single line can cover an unlimited number of pages. These are some examples:
https://www.stealprice.online/p/contact.html
https://www.stealprice.online/p/usage-agreement.html
https://www.stealprice.online/p/privacy-policy.html
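One way to picture the wildcard is to translate the pattern into a regular expression. This Python sketch handles only the * character described here, and the helper names are made up for the example.

```python
import re

def pattern_to_regex(pattern: str):
    """Turn a robots.txt path pattern into a regex; each * becomes '.*'."""
    # Escape the literal pieces so that '.' in '.html' is not treated as a regex dot.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body)

def matches(pattern: str, path: str) -> bool:
    """True if the path matches the pattern from its first character."""
    return pattern_to_regex(pattern).match(path) is not None

print(matches("/p/*.html", "/p/contact.html"))          # True
print(matches("/p/*.html", "/p/privacy-policy.html"))   # True
print(matches("/p/*.html", "/2018/03/something.html"))  # False
```

The Allow line wins over Disallow: / for these pages because its pattern is longer, and so more specific, than the single slash.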
We have come to the end of the explanation. If you have questions about the robots.txt file, you can ask them in a comment.
I hope the topic was as understandable as possible.