Bots, Spiders, and Crawlers

If you've ever looked at a log file on a server, you know that a lot of the traffic is not generated by humans. Depending on who you ask, anything from 40% to 64% of total traffic is generated by machines. And lately, I've been wondering more and more whether I should just try to block the traffic coming from all those spiders and crawlers. Just to see what happens.

So here's the plan. I downloaded my access.log from the server and created two reports using goaccess. One is a report with all the data. The other is a report with the ignore-crawlers option turned on, so bot traffic is excluded. I'll now update my robots.txt and disallow all bots for the next 7 days to see what happens. In a week I'll download the log again and we can compare the two and see if anything has changed.
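For reference, here's a sketch of how the two reports can be generated with goaccess. The log format and file names are assumptions; adjust them to match your server's setup:

```shell
# Report 1: all traffic, bots included (default behavior)
goaccess access.log --log-format=COMBINED -o report-all.html

# Report 2: same log, but with known crawlers and spiders filtered out
goaccess access.log --log-format=COMBINED --ignore-crawlers -o report-no-bots.html
```

Comparing the two HTML reports side by side gives a rough idea of how much of the traffic goaccess attributes to bots.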
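Disallowing all bots in robots.txt is a two-line change:

```
User-agent: *
Disallow: /
```

Worth keeping in mind: robots.txt is purely advisory, so only well-behaved crawlers that honor the robots exclusion standard will actually stay away.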
