17 points | by datafreak_ 2 hours ago
2 comments
I've been doing the same but with a wider scope: the whole CrUX list, pruned to apex domains, looking for CMS signals. How's your throughput?
I'm not doing any headless browser stuff, or making many requests, so it's hyper-optimised for speed.
I do grab robots.txt, but didn't really see much llms.txt or humans.txt in the wild. Does yours?
I'm seeing about a 6.6% block rate, but that does climb over time.
Fab project otherwise!
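For anyone wanting to compare numbers, here's a rough sketch of how a block rate like that could be tallied. It's an assumption on my part that a "block" means a 403, 429, or 503 response; the `BLOCK_STATUSES` set and the `block_rate` helper are hypothetical names for illustration, not anything from either project.

```python
# Assumption: these HTTP status codes count as "blocked".
# Real crawls may also need to detect challenge pages served with a 200.
BLOCK_STATUSES = {403, 429, 503}


def block_rate(statuses):
    """Return the fraction of responses whose status code counts as a block."""
    if not statuses:
        return 0.0
    blocked = sum(1 for status in statuses if status in BLOCK_STATUSES)
    return blocked / len(statuses)


# Usage: one status code per fetched domain.
sample = [200, 200, 403, 200, 429, 200, 200, 200]
print(f"{block_rate(sample):.1%}")
```

Tracking this per batch rather than as one running total would make the climb over time easier to spot.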