Need to block someone stealing content; firewall rules not working

A website named JustTheRecipe.com is scraping my content and stealing it. Server logs look like this:

www.christmas-cookies.com 172.70.34.248 - - [05/Dec/2022:11:05:59 -0500] “GET /recipes/treats-for-animals/peanut-butter-dog-treats/ HTTP/2.0” 200 33783 “https://www.justtherecipe.com/” “Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36” “xxx.xxx.59.249” “cache:BYPASS”

I have created a firewall rule to block hostname “justtherecipe.com” like so (I also added their parent companies, just in case, for the future):

(http.host contains "justtherecipe") or (http.host contains "loopedin") or (http.host contains "streamline")

In the log above, “xxx.xxx.59.249” is my personal IP address. That was me testing JustTheRecipe to see if it would scrape Peanut Butter Dog Treats. It did. JustTheRecipe scraped the content, but the log shows my IP address rather than theirs, which confuses me.

How can I configure the firewall to block this service?

Answer these questions to help the Community help you with Security questions.

What is the domain name?

Have you searched for an answer?
Yes

Please share your search results url:

I implemented this workaround from 2021, but it is not working, as described above.

When you tested your domain using the Cloudflare Diagnostic Center, what were the results?
n/a

Describe the issue you are having:
see above

What error message or number are you receiving?
The website continues to scrape the content regardless of the firewall rules implemented

What steps have you taken to resolve the issue?

  1. See above

Was the site working with SSL prior to adding it to Cloudflare?
n/a

What are the steps to reproduce the error:

  1. Go to JustTheRecipe.com
  2. Input any URL from my website christmas-cookies.com.
  3. You will see it scraped my content

Have you tried from another browser and/or incognito mode?
Yes

Please attach a screenshot of the error:
Screenshot attached of Justtherecipe.com having scraped the content of this page:
https://www.christmas-cookies.com/recipes/low-carb-cookies/keto-forgotten-cookies/

Since it’s WordPress, as a first step I’d try disabling “feed” and “rss”. I’d also experiment a bit with restricting WP-JSON and try out the Rate Limiting feature to see if it helps :thinking:
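As a rough sketch of what restricting those endpoints could look like in a firewall rule expression, assuming the default WordPress paths `/feed/` and `/wp-json/` (your site may use different ones) and a Block or Managed Challenge action:

```
(http.request.uri.path contains "/feed/") or (http.request.uri.path contains "/wp-json/")
```

Blocking `/wp-json/` outright can break the block editor and plugins that rely on the REST API, so it is probably worth testing this with a challenge action first.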

Unfortunately, this happens :confused:

If it’s also stealing your copyrighted images, you could file an abuse report, I guess :thinking:

Yes, if it’s copyrighted material, slap them with a lawsuit. As for stopping them, first of all be sure to report what they’re doing at https://www.abuseipdb.com/ every time. Eventually this will get their IP(s) blacklisted so others know, which helps the community. On the technical side, try using the ‘referrer’ rule instead of the ‘hostname’ rule. The hostname is the host being requested, which is your own domain, whereas your log shows justtherecipe.com in the referrer. I’d be very curious to know the UA that’s being used as well.
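A minimal sketch of the referrer-based rule, assuming the requests keep arriving with the `https://www.justtherecipe.com/` referrer seen in the log above. The “loopedin” and “streamline” terms are simply carried over from the original rule, not something the log confirms:

```
(http.referer contains "justtherecipe") or (http.referer contains "loopedin") or (http.referer contains "streamline")
```

The Referer header is set by the client, so this only helps for as long as the service keeps sending it. If the header is dropped or changed, the rule will stop matching.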

Thank you to everyone for your input!
