I’d like to generate previews for my app from user-generated links, to show them the same way Twitter and other social websites do it.
Of course, I could bypass that easily with one of the many libraries available for this very task, but I would like to play by the rules and write a proper crawler.
I tried changing the IP I’m requesting from (my 4G hotspot, my home connection and a couple of different servers), but every request still ends up on that page.
I set my User-Agent correctly: “mypreview-bot/1.2 (+https://mywebsite/bot.html)”. I’ve even tried stating my intentions by adding “(like TwitterBot)”, the same thing Telegram bots do. I always read and obey the robots.txt file and make no more than one request per minute to any given website.
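To make that behaviour concrete, here is a minimal sketch of the politeness logic: check robots.txt, then allow at most one request per minute per host. It is Python with the standard library only, purely for illustration. `PoliteGate`, `load_robots` and `may_fetch` are made-up names, and the robots.txt body is passed in as a string so the sketch runs without touching the network; a real crawler would download it first.

```python
import time
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

USER_AGENT = "mypreview-bot/1.2 (+https://mywebsite/bot.html)"
MIN_INTERVAL = 60.0  # seconds between two requests to the same host


class PoliteGate:
    """Hypothetical helper: decides whether a URL may be fetched right now."""

    def __init__(self, user_agent=USER_AGENT, min_interval=MIN_INTERVAL,
                 clock=time.monotonic):
        self.user_agent = user_agent
        self.min_interval = min_interval
        self.clock = clock      # injectable so the logic can be tested
        self.robots = {}        # host -> parsed robots.txt
        self.last_hit = {}      # host -> time of the last allowed request

    def load_robots(self, host, robots_txt):
        """Parse an already-downloaded robots.txt body for a host."""
        parser = RobotFileParser()
        parser.parse(robots_txt.splitlines())
        self.robots[host] = parser

    def seconds_to_wait(self, host):
        """How long until the next request to this host is allowed."""
        last = self.last_hit.get(host)
        if last is None:
            return 0.0
        return max(0.0, self.min_interval - (self.clock() - last))

    def may_fetch(self, url):
        """True only if robots.txt allows the URL and the host is not rate limited."""
        host = urlsplit(url).netloc
        parser = self.robots.get(host)
        # Assumption: no robots.txt loaded yet means "do not fetch".
        if parser is None or not parser.can_fetch(self.user_agent, url):
            return False
        if self.seconds_to_wait(host) > 0:
            return False
        self.last_hit[host] = self.clock()
        return True
```

Usage would look roughly like `gate.load_robots("example.com", body)` followed by `gate.may_fetch("https://example.com/some/page")` before each request. Refusing to fetch when no robots.txt has been loaded is a conservative choice made for this sketch, not something any standard requires.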
So, what am I doing wrong? The websites don’t even have “I’m Under Attack” mode enabled, so I guess Cloudflare just hates me.
The request is very simple:
**Host:** thewebsite
**Connection:** close
**Accept-Encoding:** gzip
**User-Agent:** mypreview-bot/1.2 (+https://website/bot.html)
**Accept:** \*/\*
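For anyone who wants to reproduce this, the request above can be sketched as raw bytes on the wire. This is Python for illustration only; the `GET` method, the HTTP/1.1 version, port 443 and the function names `build_request` and `send` are assumptions, since only the headers are listed above.

```python
import socket
import ssl

USER_AGENT = "mypreview-bot/1.2 (+https://website/bot.html)"


def build_request(host, path="/"):
    """Serialise the request with exactly the headers listed above."""
    lines = [
        f"GET {path} HTTP/1.1",   # assumed method and protocol version
        f"Host: {host}",
        "Connection: close",
        "Accept-Encoding: gzip",
        f"User-Agent: {USER_AGENT}",
        "Accept: */*",
    ]
    # Header lines end with CRLF; an empty line terminates the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def send(host, path="/", port=443):
    """Send the request over TLS and return the response status line."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_request(host, path))
            chunks = []
            while True:  # "Connection: close" means we can read until EOF
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks).split(b"\r\n", 1)[0].decode("latin-1")
```

Calling `send("thewebsite")` returns only the status line, which is enough to see whether the response is the page itself or a block.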
Am I doing something wrong with the headers? Do I have to apply somewhere for approval? Or should I just resign myself to running cfscrape like all the malicious bots do?