Parse Apache Logs
Let's write a simple script that parses Apache logs and prints the 20 most frequently visited URLs. I find it very useful, and it is easy to adapt to other cases:
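import collections
import glob
import gzip
import heapq
import re
from pprint import pprint

counter = collections.Counter()
for filename in glob.glob('/var/log/apache2/*.gz'):  # rotated, compressed logs
    with gzip.open(filename, 'rt', errors='replace') as f:  # text mode: lines are str
        for line in f:
            mo = re.search(r'GET (.*) HTTP/1', line)  # capture the requested URL
            if mo is not None:
                url = mo.group(1)
                counter[url] += 1
# Pick the 20 (url, count) pairs with the highest counts
result = heapq.nlargest(20, counter.items(), key=lambda item: item[1])
pprint(result)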
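To make the script easier to port, the counting logic can be separated from the file handling. The following is only a sketch under a few assumptions: the names open_log, top_urls and REQUEST_RE are made up for this example, the sample log lines are invented for illustration, and the pattern uses \S+ instead of .* so that the URL stops at the first space. It also swaps heapq.nlargest for Counter.most_common, which returns the same kind of (url, count) pairs.

```python
import collections
import gzip
import re

# Matches the request part of an access-log line, e.g. "GET /index.html HTTP/1.1"
REQUEST_RE = re.compile(r'GET (\S+) HTTP/1')


def open_log(path):
    """Open a log file as text, handling both plain and .gz files (hypothetical helper)."""
    if path.endswith('.gz'):
        return gzip.open(path, 'rt', errors='replace')
    return open(path, 'r', errors='replace')


def top_urls(lines, n=20):
    """Return the n most requested URLs as (url, count) pairs."""
    counter = collections.Counter()
    for line in lines:
        mo = REQUEST_RE.search(line)
        if mo is not None:
            counter[mo.group(1)] += 1
    return counter.most_common(n)


# Hypothetical sample lines, invented for illustration only
sample = [
    '127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '127.0.0.1 - - [10/Oct/2023:13:55:37 +0000] "GET /about.html HTTP/1.1" 200 512',
    '127.0.0.1 - - [10/Oct/2023:13:55:38 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '127.0.0.1 - - [10/Oct/2023:13:55:39 +0000] "POST /login HTTP/1.1" 302 0',
]
print(top_urls(sample, n=2))  # [('/index.html', 2), ('/about.html', 1)]
```

Because top_urls accepts any iterable of lines, the same function should work on an open file returned by open_log, on a list of strings as above, or on lines read from standard input.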