Sitemap blocking not working

What is the name of the domain?

example.com

What is the issue you’re encountering?

My sitemap blocking rule blocks everyone, including Google.

What steps have you taken to resolve the issue?

With this rule:

(http.request.uri contains "/sitemap_index.xml") or (http.request.uri contains "/page-sitemap.xml") or (http.request.uri contains "/post-sitemap1.xml") or (http.request.uri contains "/post-sitemap2.xml") or (http.request.uri contains "/post-sitemap3.xml") or (http.request.uri contains "/post-sitemap4.xml") or (http.request.uri contains "/post-sitemap5.xml") or (http.request.uri contains "/category-sitemap.xml") or (not cf.client.bot) or (not ip.src.asnum in {15169})

What are the steps to reproduce the issue?

Matched service

Service: Custom rules
Action taken: Block
Ruleset: default
Rule:

Request details

Ray ID: 908297b00f463ac0
IP address: 66.249.65.204
ASN: AS15169 GOOGLE
Country: United States
User agent: Mozilla/5.0 (compatible; Google-InspectionTool/1.0;)
HTTP Version: HTTP/1.1
Referer: None (direct)
Method: GET
Host: printerdrivers.com
Path: /post-sitemap2.xml
Query string: Empty query string

Your rule logic appears to be:

if path contains a sitemap variant
or
not a bot
or not google

So you are blocking Google because its request is caught by the first sitemap test, and with "or" a single true clause is enough to trigger the block. You need "and", not "or".
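To make the "or" versus "and" difference concrete, here is a minimal Python sketch of the two rule shapes. It is only a model of the boolean logic, not Cloudflare's rule engine: the function names are made up for illustration, the sitemap test is simplified to a substring check, the `known_bot` flag stands in for `cf.client.bot`, and ASN 64500 is a reserved documentation ASN used as a stand-in for an ordinary visitor.

```python
GOOGLE_ASN = 15169  # taken from the rule in this thread


def is_sitemap(path: str) -> bool:
    # Simplified stand-in for the chain of "contains" tests on sitemap names.
    return "sitemap" in path


def blocks_with_or(path: str, known_bot: bool, asn: int) -> bool:
    # Shape of the original rule: any single true clause triggers the block.
    return is_sitemap(path) or (not known_bot) or (asn != GOOGLE_ASN)


def blocks_with_and(path: str, known_bot: bool, asn: int) -> bool:
    # Shape of the suggested rule: every clause must be true to block.
    return is_sitemap(path) and (not known_bot) and (asn != GOOGLE_ASN)


# Hypothetical requests: (path, known_bot, asn)
google = ("/post-sitemap2.xml", True, GOOGLE_ASN)
visitor = ("/post-sitemap2.xml", False, 64500)
homepage = ("/index.html", False, 64500)

for label, req in (("google", google), ("visitor", visitor), ("homepage", homepage)):
    print(label, "or:", blocks_with_or(*req), "and:", blocks_with_and(*req))
```

In this model the "or" version blocks the Google request on the sitemap clause alone, and it also blocks an ordinary visitor on any page because `not known_bot` is true. The "and" version blocks only non-bot, non-Google requests for sitemap paths.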

I think what you need is

(http.request.uri contains “sitemap” and not cf,client.bot and not ip.src.asnum in {15169})
then block

When I use this I get:

‘(http.request.uri contains “sitemap” and not cf,client.bot and not ip.src.asnum in {15169})’ is not a valid value for expression because the expression is invalid: Could not parse filter expression: Filter parsing error (1:28): (http.request.uri contains “sitemap” and not cf,client.bot and not ip.src.asnum in {15169}) ^^^^ invalid digit found in string while parsing with radix 16

I also tried (http.request.uri contains "sitemap" and not cf.client.bot), but that does not block anyone.

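I think the "not a valid expression" issue is quote conversion: the forum turned the straight quotes into curly ones (and my earlier post also had a comma in cf,client.bot where a dot belongs). With both fixed:

(http.request.uri.path contains "sitemap" and not cf.client.bot and not ip.src.asnum in {15169})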

It blocks me if I try to access sitemap.xml on my site.
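The quote-conversion problem can be caught before pasting an expression into the rule editor. The sketch below is a small, hypothetical Python helper (not part of any Cloudflare tooling) that swaps typographic quotes for straight ones. It assumes the only damage is the quote characters, so it would not repair something like a comma typed in place of a dot.

```python
# Map typographic (curly) quotes to their plain ASCII equivalents.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})


def normalize_quotes(expression: str) -> str:
    """Return the expression with curly quotes replaced by straight quotes."""
    return expression.translate(CURLY_TO_STRAIGHT)


# An expression as it might look after being copied from a forum post.
pasted = "(http.request.uri contains \u201csitemap\u201d and not cf.client.bot)"
cleaned = normalize_quotes(pasted)

# 1-based column of the first curly quote in the pasted text.
first_curly_column = pasted.index("\u201c") + 1

print(cleaned)
print(first_curly_column)
```

If the (1:28) in the parser error is a 1-based line and column, the first curly quote in this expression sits at exactly that column, which is consistent with the quote being what the parser rejected.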

What other rules do you have?
Is there an earlier rule that has a skip option in it?

Your site blocks me if I try to access post-sitemap2.xml - Ray Id: 908307f63eafcd25
sitemap.xml - Ray Id: 9083096bfe5b385c

Is it working as expected now?


The rule order was the problem.
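The thread does not say which rules were reordered, so the following is only a simplified Python model of why order can matter when rules are evaluated top to bottom and a matching block or skip ends evaluation. The `Rule` class, the `evaluate` function, and both example rules are hypothetical and are not taken from the poster's configuration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Request = Dict[str, object]


@dataclass
class Rule:
    name: str
    matches: Callable[[Request], bool]
    action: str  # "block" or "skip" in this simplified model


def evaluate(rules: List[Rule], request: Request) -> str:
    # Walk the rules in order; the first matching rule decides the outcome.
    for rule in rules:
        if rule.matches(request):
            if rule.action == "block":
                return f"blocked by {rule.name}"
            if rule.action == "skip":
                return f"allowed by {rule.name}"
    return "allowed (no rule matched)"


# Hypothetical rules for illustration only.
allow_google = Rule("allow-google", lambda r: r["asn"] == 15169, "skip")
block_sitemaps = Rule("block-sitemaps", lambda r: "sitemap" in str(r["path"]), "block")

google = {"path": "/post-sitemap2.xml", "asn": 15169}
visitor = {"path": "/post-sitemap2.xml", "asn": 64500}  # reserved documentation ASN

print(evaluate([block_sitemaps, allow_google], google))
print(evaluate([allow_google, block_sitemaps], google))
print(evaluate([allow_google, block_sitemaps], visitor))
```

In this model, placing the block rule first blocks the Google request before the skip rule is ever consulted, while placing the skip rule first lets Google through and still blocks the other visitor.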

