As mentioned in another article, I have checked my server logfiles. I want to know how often the Googlebot and the Google image bot visit my sites per day. The questions are: „How often does Google crawl my sites?“ and „What is my daily crawl rate?“ This article shows how to analyse your own logfiles: which programs you need and what to do, step by step. Here is the way I do it. The following method is manual, so you can be sure that no script or program has made a mistake along the way.
First of all you have to download the logfiles. Use an FTP program, e.g. FileZilla. It is a free tool, easy and comfortable to use, and it has worked well for me for years.
Usually the logfiles are saved in a „logs“ folder in the root of your webspace. Most of the time these files are compressed as zip archives, so you need to unzip them. For this I use „IZArc“, a nice little program that you can use from the right-click context menu.
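If you prefer not to unzip every archive by hand, the step can also be scripted. The following is only a minimal sketch in Python, not the way I do it above: it assumes the downloaded archives are ordinary `.zip` files sitting in one local folder, and the function name `unzip_logs` and both folder arguments are made up for this example.

```python
import zipfile
from pathlib import Path


def unzip_logs(archive_dir, target_dir):
    """Extract every .zip archive found in archive_dir into target_dir.

    Returns the names of all extracted files. Both folder names are
    placeholders (assumptions); use whatever your own download looks like.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    # sorted() keeps the order stable, e.g. oldest archive first
    for archive in sorted(Path(archive_dir).glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
            extracted.extend(zf.namelist())
    return extracted
```

Calling `unzip_logs("logs", "logs_unzipped")` would unpack everything from a local „logs“ folder into a new „logs_unzipped“ folder. If your host compresses logs as `.gz` instead of `.zip`, this sketch does not apply as written.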
Open logfiles with UltraEdit
Open the logfile with a fast text editor. I use UltraEdit. This little program costs a bit (about $60), but I would not want to be without it. I don’t know any other program that works so fast and comfortably with long text lists, and a logfile is usually a very long list.
The opened logfile contains all activity on your website(s), one request per line. In this tutorial I only want to know how many images are crawled per day. For this I search under „Search in Files“ for „googlebot“. I have only installed the German version, sorry, but I think it looks similar in the English version.
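The same search can be sketched in a few lines of Python. This is just an illustration of the idea, assuming a plain-text logfile with one request per line; the helper names `read_log` and `googlebot_lines` are invented for this example.

```python
def read_log(path):
    """Read a logfile and return its lines without line endings."""
    # errors="replace" keeps the script running if a line has odd bytes
    with open(path, encoding="utf-8", errors="replace") as handle:
        return handle.read().splitlines()


def googlebot_lines(lines):
    """Keep only the lines that mention "googlebot", ignoring case."""
    return [line for line in lines if "googlebot" in line.lower()]
```

Like the editor search, this matches every line containing the word, so it catches both the normal Googlebot and the image bot.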
Now we have a list of all Googlebot activity. The next step is to separate the list manually into single days (if it isn’t already).
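Splitting by day can be sketched in Python as well. The big assumption here is the timestamp format: the sketch expects the usual Apache style, e.g. `[12/Mar/2010:06:25:24 +0100]`. If your server writes dates differently, the pattern has to be changed. The names `DAY_PATTERN` and `split_by_day` are made up for this example.

```python
import re
from collections import defaultdict

# Assumption: Apache-style timestamps such as [12/Mar/2010:06:25:24 +0100]
DAY_PATTERN = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")


def split_by_day(lines):
    """Group log lines by their date, e.g. "12/Mar/2010"."""
    days = defaultdict(list)
    for line in lines:
        match = DAY_PATTERN.search(line)
        # Lines without a recognisable date are collected under "unknown"
        key = match.group(1) if match else "unknown"
        days[key].append(line)
    return dict(days)
```

The number of lines per day is then simply `len(...)` of each list, which gives the daily crawl rate for whatever lines you passed in.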
During these steps you should delete the path name at the beginning of each line of the result list by using „search and replace“. You can also delete other phrases that always repeat.
The next step is to separate the images. First create a new UltraEdit file. Then search for each image type via „Search in Files“ and copy each result list into the new file.
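As a rough Python sketch of the image filter: the list of file extensions below is my assumption for this example (adjust it to the image types you actually use), and the pattern assumes that each line contains the request in the usual form `"GET /path/file.jpg HTTP/1.1"`. The names `IMAGE_EXTENSIONS`, `REQUEST_PATTERN` and `image_requests` are invented here.

```python
import re

# Assumption: these are the image types of interest; edit as needed
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")

# Assumption: the request appears as "GET /some/path HTTP/1.1"
REQUEST_PATTERN = re.compile(r'"(?:GET|HEAD) (\S+)')


def image_requests(lines):
    """Keep only the lines whose requested path ends in an image extension."""
    hits = []
    for line in lines:
        match = REQUEST_PATTERN.search(line)
        if not match:
            continue
        # Drop a query string like ?size=2 and ignore upper/lower case
        path = match.group(1).split("?", 1)[0].lower()
        if path.endswith(IMAGE_EXTENSIONS):
            hits.append(line)
    return hits
```

Checking only the requested path avoids counting lines where an image name merely shows up somewhere else in the line, for example in the referrer.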
Separating the Googlebots
This clean list contains all images that were crawled by a Googlebot. Finally you can separate the two Googlebots. Search for each of them via „Search in Files“.
In the end you get two lists that show exactly how often each bot has crawled your images. It would be nice to know your numbers: if somebody wants to share them here, that is interesting for all of us.
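Counting the two bots could look roughly like this in Python. The search strings are an assumption on my part: the sketch treats every line containing „Googlebot-Image“ as the image bot and every other line containing „Googlebot“ as the normal bot. The function name `count_by_bot` is made up for this example.

```python
from collections import Counter


def count_by_bot(lines):
    """Count how many lines belong to each of the two Googlebots."""
    counts = Counter()
    for line in lines:
        lowered = line.lower()
        # Check the image bot first: "googlebot" is part of its name too
        if "googlebot-image" in lowered:
            counts["Googlebot-Image"] += 1
        elif "googlebot" in lowered:
            counts["Googlebot"] += 1
    return counts
```

If you run this on the lines of a single day, the two numbers are the daily crawl counts for each bot.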
If somebody has a little script that does this directly on the server, please feel free to post the link in the comments.
Category: Google imageSearch | Author: Martin Missfeldt | 2 comments
On a high-traffic site, you may need to stick with command-line tools instead of a GUI editor like UltraEdit. See http://www.dynamicalsoftware.com/analytics/selection for a gentle introduction to some tools that can really help you analyze your logfiles.