It sounds like you haven’t properly secured your origin server. 185.220.101.164 is not a Cloudflare IP; it belongs to a Tor exit node.
At the very least, you should only allow Cloudflare IPs.
There are many automated tools and integrations that can set this up and keep the list of Cloudflare IPs updated for you. You could also use a Cloudflare Tunnel to achieve a similar goal.
The above assumes your DNS records are proxied. If any records are DNS-only for some reason (e.g. you want to serve media, or your application requires it), then restricting access to Cloudflare IPs would break your website.
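To illustrate the "only allow Cloudflare IPs" idea, here is a minimal Python sketch that checks whether a connecting address falls inside Cloudflare's IPv4 ranges. The hard-coded list is a snapshot and an assumption on my part; the current list is published at https://www.cloudflare.com/ips/ and should be fetched from there rather than trusted from this post. The function name `is_cloudflare_ip` is just for illustration.

```python
import ipaddress

# Snapshot of Cloudflare's published IPv4 ranges (assumption: may be outdated,
# check https://www.cloudflare.com/ips/ for the current list).
CLOUDFLARE_IPV4 = [
    "173.245.48.0/20", "103.21.244.0/22", "103.22.200.0/22",
    "103.31.4.0/22", "141.101.64.0/18", "108.162.192.0/18",
    "190.93.240.0/20", "188.114.96.0/20", "197.234.240.0/22",
    "198.41.128.0/17", "162.158.0.0/15", "104.16.0.0/13",
    "104.24.0.0/14", "172.64.0.0/13", "131.0.72.0/22",
]
NETWORKS = [ipaddress.ip_network(cidr) for cidr in CLOUDFLARE_IPV4]


def is_cloudflare_ip(addr: str) -> bool:
    """Return True if addr lies inside one of the listed Cloudflare ranges."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in NETWORKS)


print(is_cloudflare_ip("185.220.101.164"))  # False: the address from the log
print(is_cloudflare_ip("104.16.0.1"))       # True: inside 104.16.0.0/13
```

In practice the same check belongs in the firewall or web server rather than in application code; the sketch only shows the logic.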
dpgworld.me doesn’t load for me, but it does point at 54.38.158.61. If that is your Web Server’s IP, you should try to get a new IP. If it isn’t your Web Server’s IP, the connecting IP of that request, 185.220.101.164, is a TOR Exit, so it could just be a malicious scanner faking host headers.
With or without proxy, you should only serve requests matching the hostname/servername you expect. You can do this in nginx for example using server blocks and the server_name directive.
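The hostname check that nginx does with server blocks and `server_name` boils down to comparing the Host header against a list of names you expect and rejecting everything else. A rough Python sketch of that logic follows; it is not nginx configuration, and `ALLOWED_HOSTS`, `example.com` and `is_expected_host` are placeholder names I made up for the example.

```python
# Hypothetical list of hostnames this site is supposed to answer for.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def is_expected_host(host_header: str, allowed=ALLOWED_HOSTS) -> bool:
    """Return True only if the Host header names one of the expected hosts."""
    host = host_header.strip().lower()
    # Drop an optional ":port" suffix (bracketed IPv6 literals are left alone).
    if ":" in host and not host.startswith("["):
        host = host.rsplit(":", 1)[0]
    # A trailing dot is a valid way to write a fully qualified name.
    return host.rstrip(".") in allowed


print(is_expected_host("Example.com:80"))  # True
print(is_expected_host("dpgworld.me"))     # False
```

Requests that fail the check would get an error response or a closed connection instead of the site content.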
Edit: This was mostly based on the assumption that you didn’t properly configure restoring visitor IPs. If you have, ignore this and read cbrant’s message.
Thank you for your comprehensive response. The 54.38.158.61 is not my IP address.
Also my nginx is configured to only respond to my actual domain name.
My registrar is Cloudflare, but unless you use a tunnel to reach Cloudflare or restrict inbound connections to their IP addresses, you will get a lot of funky traffic.
My main objective in opening this thread is to learn how and why this is happening.
The dpgworld.me domain and the others that I see in my access logs with a 200 HTTP status code really baffled me. Who would spin up domains and link to mine, and why? What would they gain here?
The reason that it is not working for you or anyone else for that matter is that when I posted this, I just blocked all HTTP access to my website, and by doing that, those pesky domains stopped working. However, prior to that, visiting them would just land you on my website using HTTP.
If the server has been configured to restore original visitor IPs, as recommended, then the IP on an access log would be of the visitor, not a Cloudflare IP.
Actually, it’s their site with your content on it.
It looks to me like this is a case of someone who’s copied your site content but done a poor job of it, so some links to your domain are still there.
You can check by googling their domain with the site operator, as in site:example.com. You’ll see a few results, and if you click on the ellipsis (vertical 3 dots), then on “Cached”, you will get to the content they have scraped off your site, and some still carry links to your domain, as opposed to theirs. That’s probably why you are getting the referral links on your access logs.
I got confused by the same thing (and thus was wrongly confident that he didn’t secure his origin and that these were requests from someone reaching his web server through that site pointing at it), but in the default nginx logging format, that field is the Referer header, not the requested host.
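For anyone else tripped up by this, a small Python sketch that splits a line in nginx's default `combined` log format into named fields makes it easier to see which value is the referer. The sample line is hypothetical: the timestamp, byte count and user agent are invented, and only the IP and domain are taken from this thread.

```python
import re

# Field order of nginx's predefined "combined" log format:
# $remote_addr - $remote_user [$time_local] "$request" $status
# $body_bytes_sent "$http_referer" "$http_user_agent"
COMBINED = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)

# Hypothetical log line, made up for illustration.
SAMPLE = (
    '185.220.101.164 - - [01/Jan/2024:00:00:00 +0000] "GET / HTTP/1.1" '
    '200 1234 "http://dpgworld.me/" "ExampleBot/1.0"'
)


def parse_line(line: str):
    """Return the log fields as a dict, or None if the line does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


entry = parse_line(SAMPLE)
print(entry["remote_addr"])   # 185.220.101.164 (the connecting client)
print(entry["http_referer"])  # http://dpgworld.me/ (the Referer header)
```

Note that the Host header is not part of the combined format at all, so a foreign domain showing up in a line like this is the page the client claims to have come from, not the name it asked for.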
I suspect this is bot activity rather than actual people doing it, because I see a lot of domains in my access logs that are similar to the one I initially shared.
Just wondering what would they gain by doing it since my blog is just a small non-profit hobby for me.
It just looks like they are proxying my site for some reason. Nothing has changed on their end, and there are quite a few *.me sites in my logs. There are non-.me domains as well, but far fewer, and all of them are HTTP, not HTTPS.
Since your site is available under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0), these may be IT students working on projects (“Go out there, create a bot and copy a whole site’s content, but only if under CC license”). Of course it could be something more nefarious. You never know. As a matter of fact, it’s a bit of a waste of time to try to figure out why hackers do what they do. If their action is a threat to your resources, devise a way to block or challenge (captcha) them.