It's a lot less resource-intensive not to use a hosts file at all. That might not be a concern on modern machines, but a hosts file with 12,000 lines in it does take a certain amount of processing.
If I might mention my own site for a minute, I maintain a list of ad server (and tracking server) hostnames: http://pgl.yoyo.org/adservers/
You can view the list as a dnsmasq config file, a BIND config file, and a bunch of other formats.
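For the curious, the dnsmasq version is essentially a set of address= lines, which tell dnsmasq to answer for those names (and their subdomains) itself instead of forwarding the queries upstream. It looks something along these lines, with placeholder hostnames here and whatever target address you prefer:

    # dnsmasq block-list style: resolve these names (and any subdomains)
    # locally instead of forwarding the query to an upstream server
    address=/ads.example.com/127.0.0.1
    address=/tracker.example.net/127.0.0.1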
The issue with that, in Windows at least, is that host lookups become a lot slower with a larger hosts file; a local caching DNS server with a block list is possibly a better solution, and one I think is already adopted by some.
Huh? I used to have ~100 hosting clients per IP address, none of whom were in any way related to each other (other than in having chosen me as a hosting provider).
How big would the list need to get before it starts affecting performance? Every HTTP request obviously involves some kind of lookup against the hosts file. I assume the hosts file is converted into some sort of hash list?
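I don't know what Windows actually does internally, but hashing would make the lookup itself cheap no matter how long the file gets. A rough sketch of that idea in Python (the path and parsing rules here are assumptions for illustration, not what the OS resolver really does):

    # Sketch only: load hosts-file entries into a hash set so each
    # lookup is O(1) on average, regardless of how many lines there are.
    def load_hosts(path="/etc/hosts"):
        blocked = set()
        with open(path) as f:
            for line in f:
                fields = line.split("#", 1)[0].split()
                # first field is the address, the rest are hostnames
                if len(fields) >= 2:
                    blocked.update(name.lower() for name in fields[1:])
        return blocked

    hosts = load_hosts()
    print("ads.example.com" in hosts)   # constant-time membership test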
If you want to put in the effort, you can sniff the hostname lookups and, whenever a name turns out to be dedicated to serving ads, add it as a 0.0.0.0 entry in the hosts file.
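The sniffing part could look something like this, assuming Python with scapy and enough privileges to capture traffic; deciding which of the logged names are actually ad servers is still a manual step:

    # Sketch: log every hostname the machine looks up over plain DNS (UDP/53).
    # Requires scapy and packet-capture privileges.
    from scapy.all import sniff, DNSQR

    def log_query(pkt):
        if pkt.haslayer(DNSQR):
            name = pkt[DNSQR].qname.decode().rstrip(".")
            print(name)   # review these and add ad hosts as 0.0.0.0 entries

    sniff(filter="udp port 53", prn=log_query, store=0)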
HOSTS files are static. They were never designed for blocking ads or tracking. And for all we know, every connection does a linear search through the HOSTS file, so the larger it gets, the more time is wasted; it was never designed to hold millions of entries.
I tried this in the past, but my machine slowed to a crawl. I guess it was down to the algorithm used for handling the list of hosts (this sounds like a job for a Bloom filter).
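For what it's worth, a Bloom filter for this would be small and fast. A minimal sketch, with the bit-array size and hash count picked arbitrarily; note that it can give false positives, i.e. very occasionally treat a host as blocked when it isn't on the list:

    # Minimal Bloom filter sketch: K hash positions per hostname in an M-bit array.
    # False positives are possible; false negatives are not.
    import hashlib

    M = 1 << 20          # number of bits (arbitrary choice)
    K = 7                # number of hash functions (arbitrary choice)
    bits = bytearray(M // 8)

    def _positions(host):
        for i in range(K):
            digest = hashlib.sha256(f"{i}:{host}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % M

    def add(host):
        for pos in _positions(host):
            bits[pos // 8] |= 1 << (pos % 8)

    def probably_blocked(host):
        return all(bits[pos // 8] & (1 << (pos % 8)) for pos in _positions(host))

    add("ads.example.com")
    print(probably_blocked("ads.example.com"))   # True
    print(probably_blocked("example.org"))       # almost certainly False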
I've just tried the list you provided and it seems to be ok. Will try it for a while to see how I get on.
I've always wondered whether Windows or other operating systems read the entire hosts file every time they want to resolve an address. Maybe a big hosts file is bad for network performance?