Wednesday, October 5, 2011

2:03 AM

This is a continuation of the 20 Tips to SEO series. Today, I will talk to you about optimizing the Robots.txt file for your website/blog.

1. Include a link to your sitemap file in your Robots.txt.
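For example, a typical Sitemap line looks like this (the URL is just a placeholder; use your own sitemap location):
Sitemap: http://www.example.com/sitemap.xml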

2. If you have lots of images on your website, add an image sitemap, and if you host videos on the website, consider adding a video sitemap alongside the regular one.
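You can list more than one sitemap in the same Robots.txt; here is a sketch with placeholder URLs for a regular, an image, and a video sitemap:
Sitemap: http://www.example.com/sitemap.xml
Sitemap: http://www.example.com/image-sitemap.xml
Sitemap: http://www.example.com/video-sitemap.xml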

3. Make sure you exclude all script files, the admin folder, and other backend content from indexing.
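A minimal sketch, assuming a typical setup where backend files live in /cgi-bin/ and /wp-admin/ (adjust the paths to your own site):
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/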

4. Open your Google Webmaster Tools account and check that your Robots.txt file is set up correctly.

5. There are loads of online tools available on the internet, but many of them are poor. Stick with Google Webmaster Tools, which provides the best Robots.txt support. There is an effective Robots.txt generator tool under Site Configuration > Crawler Access > Generate Robots.txt.

6. The universal syntax used in a Robots.txt file is this:
User-agent: *
Disallow: /yourfile or your url/
Here, User-agent: * addresses all search agents (Google, Bing, Ask, etc.).
Disallow: /yourfile/ blocks that folder from being crawled; its inner folders will not be crawled either.

7. Specify the image folder location for the Google Image robot.
User-agent: Googlebot-Image
Allow: /wp-content/uploads/
If you have a lot of images, specifying a particular folder or file for the Google Image robot to crawl is an excellent idea. In the above example, /wp-content/uploads/ is the folder where the images are stored.

8. Typically you set the user-agent to *, making the rules applicable to all search engine bots/user agents. But there are several different user agents, and Google itself has multiple spiders. To see the full list of user agents, check this out.
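As a rough sketch, separate rule groups can target different user agents; the folder names here are just placeholders:
User-agent: Googlebot
Disallow: /private/

User-agent: Bingbot
Disallow: /private/
Disallow: /testing/

User-agent: *
Disallow: /temp/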

9. Remove unneeded URLs using Robots.txt.

For any URLs you do not want search engines to crawl/index, use the following syntax:
User-agent: *
Disallow: /directory/folder or file/
In the example above, all URLs beginning with /directory/folder won't be crawled.

10. If you find that a file excluded via Robots.txt is still indexed on Google (probably via backlinks from other pages or websites), apply the No-Index meta tag to get it excluded.
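The standard robots meta tag goes in the <head> section of the page you want de-indexed; a minimal example:
<meta name="robots" content="noindex">
Note that the crawler has to be able to fetch the page to see this tag, so you may need to lift the Robots.txt block on that page first.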

11. Robots.txt is not the surest way to remove or include a file or folder on search engines. Use the meta index/noindex tags as well, and also check the URL's status in Google Webmaster Tools to cross-check.

12. Make sure you follow the syntax as specified by the robots exclusion standard.

13. Robots.txt alone is not the surest way to stop or remove a URL from being indexed on Google. Use the URL Removal Tool option inside your Google Webmaster Tools account to get this done.

14. To comment within the Robots.txt file use the hash (#) symbol.
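For example (the folder name is just a placeholder):
# Block crawlers from the staging area
User-agent: *
Disallow: /staging/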

15. Don't put all the file or folder names on a single line. The correct syntax is to list each file or folder on its own line:

User-agent: *
Disallow: /donotcrawl/
Disallow: /donotindex/
Disallow: /scripts/

16. Use the "Allow" directive.
Some crawlers, such as Google, support an additional directive called "Allow". It lets you specify exactly which files/folders should be crawled. However, this field is not part of the original robots.txt protocol.
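A minimal sketch for crawlers that understand Allow, blocking a folder while still permitting its uploads sub-folder:
User-agent: Googlebot
Disallow: /wp-content/
Allow: /wp-content/uploads/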

17. Robots.txt for Blogger and wordpress.com users.

Blogger and wordpress.com users cannot upload a robots.txt file, but they can use the robots meta tag to control how bots crawl particular pages.

18. If your website lives in a sub-directory, make sure your Robots.txt always sits in the root directory of the domain; crawlers only look for it at the domain root (for example, http://www.example.com/robots.txt, not http://www.example.com/blog/robots.txt).

19. Make sure your robots.txt file has the correct access permissions and is not writable by everyone.

20. You can also experiment with other Robots.txt directives, but do so carefully; one wrong rule can block the wrong pages from being crawled.