Thank you for the guide. What are the main advantages of this method compared to a simple add-on like https://github.com/tarampampam/random-user-agent?
The add-on is aimed at browser users who want to anonymize their user agent while manually browsing the web. The User-Agents package is a lower-level library that's more useful for people doing web scraping or large-scale data collection. The add-on generates user agents that look realistic based on simple patterns and heuristics, while User-Agents generates real user agents and browser fingerprints weighted by how frequently they appear in the wild.
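For context, the core idea behind frequency-weighted generation can be sketched in a few lines of plain JavaScript. This is not the package's actual implementation or data — the agent strings and weights below are invented for illustration:

```javascript
// Made-up sample of user agents with observed-frequency weights.
// In the real package, this data comes from measured traffic.
const agents = [
  { userAgent: 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36', weight: 0.55 },
  { userAgent: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15', weight: 0.30 },
  { userAgent: 'Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0', weight: 0.15 },
];

// Weighted random selection: draw a threshold in [0, total weight),
// then walk the list subtracting weights until the threshold is crossed.
function randomUserAgent(list) {
  const total = list.reduce((sum, a) => sum + a.weight, 0);
  let r = Math.random() * total;
  for (const a of list) {
    r -= a.weight;
    if (r <= 0) return a.userAgent;
  }
  return list[list.length - 1].userAgent; // guard against float rounding
}

console.log(randomUserAgent(agents));
```

With enough records and accurate weights, the agents you emit are indistinguishable in aggregate from real traffic, which is the advantage over pattern-based generators.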
I'm going with a simpler solution to get around bot detection, and so far it's working pretty well: https://gist.github.com/endel/b75d3a6af0fcae066dfa09cc066f56ee
What I do is basically increment the Chrome / WebKit version and change the operating system identifier. Even though this can generate a version that doesn't exist, the server that was blocking my requests isn't blocking them anymore.
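The approach described above can be sketched roughly like this — note this is a reconstruction of the described idea, not the gist's actual code, and the base string and OS tokens are illustrative assumptions:

```javascript
// Base user agent to mutate (illustrative, not taken from the gist).
const base =
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
  '(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36';

// A few OS identifiers to rotate through (also illustrative).
const osTokens = [
  'Windows NT 10.0; Win64; x64',
  'Macintosh; Intel Mac OS X 10_15_7',
  'X11; Linux x86_64',
];

function randomizedUserAgent(ua) {
  const os = osTokens[Math.floor(Math.random() * osTokens.length)];
  return ua
    // swap the first parenthesized group, which holds the OS identifier
    .replace(/\(([^)]+)\)/, `(${os})`)
    // bump the Chrome major version by 1-3; this may yield a build
    // number that doesn't actually exist, as the commenter notes
    .replace(/Chrome\/(\d+)/, (_, major) =>
      `Chrome/${Number(major) + Math.floor(Math.random() * 3) + 1}`
    );
}

console.log(randomizedUserAgent(base));
```

This is cheap and defeats naive blocklists keyed on exact UA strings, but the impossible version numbers it can emit are exactly what frequency-based generation avoids.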
Thanks for the great project and repo! Could you please collect statistics on the HTTP "Accept-Language" header from your traffic and share them with the public — the same way you did with user agents, but with the records grouped by country and a weight set for each record? That would be a fantastic contribution from your project to the community!
An example record could look like this:
```
{ country: 'us', value: 'en-us', weight: 76.2 }
{ country: 'us', value: 'fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7, *;q=0.5', weight: 0.233 }
{ country: 'de', value: 'de;q=0.9, *;q=0.1', weight: 21.097 }
```
A lot of people need this, and it should be easy for you to produce. Thanks a lot!