Bots, Spiders, and Crawlers
If you've ever looked at a log file on a server, you know that a lot of the traffic is not generated by humans. Depending on who you ask, anything from 40% to 64% of total traffic is generated by machines. And lately, I've been wondering more and more whether I should just try to block the traffic coming from all those spiders and crawlers. Just to see what happens.
So here's the plan. I downloaded my access.log from the server and created two reports using goaccess. One is a report with all the data. The other is a report with the --ignore-crawlers option turned on, so crawler traffic is left out. I'll now update my robots.txt and disallow all bots for the next 7 days. In a week I'll download the log again and generate the same two reports, so we can compare them and see if anything has changed.
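For reference, a "disallow all bots" robots.txt is just two lines, and it's easy to sanity-check before uploading it. Below is a minimal sketch using Python's standard urllib.robotparser. The helper name is_blocked, the example.com URL, and the user agent strings are placeholders I made up for illustration, not something taken from my actual setup. It only tells you what a crawler that respects robots.txt would do.

```python
from urllib.robotparser import RobotFileParser

# The rule set described above: every user agent, every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler obeying robots_txt may NOT fetch url.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder user agents and URL, assumed for the example.
    for agent in ("Googlebot", "bingbot", "SomeRandomCrawler"):
        blocked = is_blocked(ROBOTS_TXT, agent, "https://example.com/blog/")
        print(f"{agent}: {'blocked' if blocked else 'allowed'}")
```

If every agent comes back as blocked, the file says what it's supposed to say. Whether the bots actually listen is exactly what the logs should show after the 7 days.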