What robots.txt do you use to maximise your K2 spidering?

14 years 5 months ago #81787 by Jack Bremer
Having watched Brian Teeman's excellent "Hidden Joomla Secrets" talk, I was inspired to try to achieve better Google Image Search traffic (most Joomla sites receive none, as you'll understand if you've listened to the talk!).

I came up with the following amendments to the standard Joomla! robots.txt file and would welcome your thoughts... (e.g. some places tell me that there is no such thing as an Allow command, but Google themselves say there is)

# Allow Google Images to spider the images folder as well as the largest K2 image files
User-agent: Googlebot-Image
Allow: /images/stories/
Allow: /media/k2/items/cache/*_XL.jpg

# Allow the Google AdSense spider to spider the entire site for the purpose of applying appropriate ads.
# This does not communicate with other Google bots, so won't put your pages into the Google listings
User-agent: Mediapartners-Google
Allow: /

User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /images/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
Disallow: /xmlrpc/
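
For anyone who wants to check how a wildcard Allow rule like the K2 one above would be evaluated, here is a minimal Python sketch of the matching behaviour as Google documents it: `*` matches any run of characters, a trailing `$` anchors the end of the path, the longest matching pattern wins, and Allow wins a tie. The function names, the sample file names and the single combined rule list are my own assumptions for illustration; this is not Google's code, and other crawlers may behave differently.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(rules, path):
    """Apply longest-match precedence to a list of (directive, pattern) rules.

    A path that matches no rule is allowed. On equal pattern length,
    Allow takes precedence over Disallow.
    """
    best_len = -1
    allowed = True
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


# Hypothetical single group, combining rules quoted in this thread
# purely to illustrate precedence.
rules = [
    ("Disallow", "/images/"),
    ("Disallow", "/media/"),
    ("Allow", "/images/stories/"),
    ("Allow", "/media/k2/items/cache/*_XL.jpg"),
]

# The file names below are made up for the example.
print(is_allowed(rules, "/media/k2/items/cache/abc123_XL.jpg"))  # True
print(is_allowed(rules, "/media/k2/items/cache/abc123_S.jpg"))   # False
print(is_allowed(rules, "/images/stories/photo.jpg"))            # True
```

Under this longest-match rule, the order of the lines inside a group does not change the result.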

13 years 6 months ago #81788 by Odin Mayland
I have to agree that Google uses Allow.

In GWT (Google Webmaster Tools), if you type in the directory and click add rule, Google will create the following:

User-agent: Googlebot-Image
Allow: /images/

13 years 5 months ago #81789 by Jack Bremer
I've been advised to put the Disallow rules before the Allow rules. Also, because the L image is usually the one shown on an article's page, I'm now allowing both that and the XL image to be listed in Google Images:

User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /images/
Disallow: /media/
Disallow: /modules/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
Disallow: /xmlrpc/

# Allow Google Images to spider the images folder as well as the largest K2 image files
User-agent: Googlebot-Image
Allow: /images/stories/
Allow: /media/k2/items/cache/*_XL.jpg
Allow: /media/k2/items/cache/*_L.jpg

# Allow the Google AdSense spider to spider the entire site for the purpose of applying appropriate ads.
# This does not communicate with other Google bots, so won't put your pages into the Google listings
User-agent: Mediapartners-Google
Allow: /
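
One thing worth checking with a file laid out like this is which group a given crawler actually reads. As I understand Google's documentation and the Robots Exclusion Protocol, a crawler follows only the group with the most specific matching User-agent and falls back to `*` only when no named group matches, so the order of the groups in the file should not matter. The rough Python sketch below illustrates that selection step only; the parser, the helper names and the `ExampleBot` crawler are assumptions of mine, and the sample text is a shortened copy of the rules above.

```python
def parse_groups(text):
    """Split robots.txt text into a list of (user_agents, rules) groups."""
    groups = []
    agents, rules = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if rules:  # a User-agent line after rules starts a new group
                groups.append((agents, rules))
                agents, rules = [], []
            agents.append(value.lower())
        elif field in ("allow", "disallow") and agents:
            rules.append((field, value))
    if agents:
        groups.append((agents, rules))
    return groups


def match_length(agent, crawler):
    """Score how specifically a User-agent value matches a crawler name."""
    if agent == "*":
        return 0
    return len(agent) if crawler.startswith(agent) else -1


def rules_for(groups, crawler):
    """Return the rules of the most specific group matching the crawler."""
    crawler = crawler.lower()
    best_len, chosen = -1, []
    for agents, rules in groups:
        length = max(match_length(agent, crawler) for agent in agents)
        if length < 0:
            continue
        if length > best_len:
            best_len, chosen = length, list(rules)
        elif length == best_len:
            chosen.extend(rules)  # groups for the same agent are combined
    return chosen


# Shortened copy of the file in this post.
sample = """
User-agent: *
Disallow: /images/
Disallow: /media/

User-agent: Googlebot-Image
Allow: /images/stories/
Allow: /media/k2/items/cache/*_XL.jpg
Allow: /media/k2/items/cache/*_L.jpg
"""

groups = parse_groups(sample)
print(rules_for(groups, "Googlebot-Image"))  # only the three Allow rules
print(rules_for(groups, "ExampleBot"))       # only the two Disallow rules
```

If that reading is right, Googlebot-Image never sees the Disallow lines in the `*` group, so any folders that should stay closed to it would need their Disallow lines repeated inside the Googlebot-Image group.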
