How to analyse a logfile (e.g. check the crawling rate of images)

Logfile analysis

As mentioned in this article, I have checked my server logfiles. I want to know how often the Googlebot and the Google Images bot visit my sites per day. The questions are: "How often does Google crawl my sites?" and "What is my daily crawling rate?" This article is about exactly that: how can I analyse my own logfiles? What programs do I need? And what should I do, step by step? Here is the way I do it. The following method of analysing the logfiles is manual, so you can be sure that no script or program makes a mistake.

Download logfiles


First of all you have to download the logfiles. Use an FTP program, e.g. FileZilla. This is a freeware tool, easy and comfortable, and it has worked for me for years.

Usually the logfiles are saved in a "logs" folder in the root of your webspace. Most of the time these files are compressed, so you need to unpack them first. For that I use "IZArc", a nice little program that you can use via the right-click context menu.
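If you want to script the unpacking step instead, here is a minimal Python sketch. It assumes the downloaded archives are gzip files (e.g. access.log.1.gz, which is what many hosts use); if your host delivers .zip files, Python's zipfile module works similarly. The "logs" folder name is just a placeholder:

```python
import gzip
import shutil
from pathlib import Path

# Placeholder folder where the downloaded archives sit; adjust to your setup.
LOG_DIR = Path("logs")

# Unpack every .gz archive next to itself, so the plain-text logfile
# can be opened in a text editor afterwards.
for archive in LOG_DIR.glob("*.gz"):
    target = archive.with_suffix("")  # e.g. access.log.1.gz -> access.log.1
    with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    print(f"unpacked {archive} -> {target}")
```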

Open logfiles with UltraEdit

Open the logfile with a fast text editor. I use UltraEdit. This little program costs a bit (about $60), but I would not want to miss it. I don't know any other program that works so fast and comfortably with long text lists, and a logfile usually is a very long list.

The opened logfile contains all activities on your website(s), one per line. In this tutorial I only want to know how many images are crawled per day. So I search via "Search in Files" for "googlebot". I have only installed the German version, sorry, but I think it is similar in the English version:
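If you prefer not to click through the editor, the same filtering can be done with a few lines of Python. This is only a sketch; "access.log" and "googlebot.log" are placeholder file names:

```python
# Collect every logfile line that mentions Googlebot in the user agent.
# "access.log" is a placeholder; point it at your own logfile.
with open("access.log", encoding="utf-8", errors="replace") as log:
    googlebot_lines = [line for line in log if "googlebot" in line.lower()]

# Save the filtered lines for the next steps.
with open("googlebot.log", "w", encoding="utf-8") as out:
    out.writelines(googlebot_lines)

print(f"{len(googlebot_lines)} Googlebot hits found")
```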

Logfile analysis with UltraEdit

Now we have a list of all Googlebot activities. The next step is to separate the list manually into single days (if it isn't already).

During these steps you should delete the path name at the beginning of each line of the result list by using "search and replace". You can also delete other phrases that always repeat.
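The splitting into days can also be scripted. The following sketch assumes the common Apache log format, where each line contains a timestamp like [10/Oct/2018:13:55:36 +0200], and it reads the "googlebot.log" file from the previous step; adjust the pattern if your server logs dates differently:

```python
import re
from collections import defaultdict
from datetime import datetime

# Timestamp in the common Apache log format, e.g. [10/Oct/2018:13:55:36 +0200].
DATE_RE = re.compile(r"\[(\d{2}/\w{3}/\d{4})")

# Group the filtered Googlebot lines by day.
hits_per_day = defaultdict(list)
with open("googlebot.log", encoding="utf-8") as log:
    for line in log:
        match = DATE_RE.search(line)
        if match:
            hits_per_day[match.group(1)].append(line)

# Print one line per day, in chronological order.
for day in sorted(hits_per_day, key=lambda d: datetime.strptime(d, "%d/%b/%Y")):
    print(f"{day}: {len(hits_per_day[day])} Googlebot requests")
```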

Separating images


The next step is to separate the images. First create a new UltraEdit file. Then search for the image types via "Search in Files":

  • .jpg
  • .jpeg
  • .gif
  • .png

and copy each result list into the new file.
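Scripted, this step is a simple filter over the four extensions from the list above ("googlebot.log" and "googlebot-images.log" are again placeholder names):

```python
# Keep only the requests for image files, mirroring the manual
# "Search in Files" run for each extension.
IMAGE_TYPES = (".jpg", ".jpeg", ".gif", ".png")

with open("googlebot.log", encoding="utf-8") as log:
    image_lines = [
        line for line in log
        if any(ext in line.lower() for ext in IMAGE_TYPES)
    ]

with open("googlebot-images.log", "w", encoding="utf-8") as out:
    out.writelines(image_lines)

print(f"{len(image_lines)} crawled image requests")
```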

Separating the googlebots

This clean list contains all images that were crawled by a Googlebot. Finally, you can separate the two Googlebots. Search for them via "Search in Files" for:

  • Googlebot/2.1
  • Googlebot-Image/1.0

In the end you get two lists with the concrete numbers of how often the bots have crawled your images. It would be nice to know these numbers: if somebody wants to share theirs here, it is interesting for all of us.
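For completeness, here is a sketch of the final counting step as well. It reads the image list from the previous step and splits it by the two user-agent strings:

```python
from collections import Counter

# The two crawlers we want to tell apart, by their user-agent strings.
BOTS = ("Googlebot/2.1", "Googlebot-Image/1.0")

counts = Counter()
with open("googlebot-images.log", encoding="utf-8") as log:
    for line in log:
        for bot in BOTS:
            if bot in line:
                counts[bot] += 1

for bot in BOTS:
    print(f"{bot}: {counts[bot]} image requests")
```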

If somebody has a little script that does this directly on your server, please feel free to post the link in the comments.

Category: Google imageSearch | Author: Martin Missfeldt

2 Comments on "How to analyse a logfile (e.g. check the crawling rate of images)"

  1. Glenn

With a high-traffic site, you may need to stay with command-line tools instead of a GUI like UltraEdit. See http://www.dynamicalsoftware.com/analytics/selection for a gentle introduction to some tools that can really help you analyze your log files.
