WAF rule with lowercase and uppercase letters

If there are two user agents:
Yoono
yoono
can I use just one of them (either) in a WAF rule to block both?

Hi,

You can use the function lower() and then match against the lowercase version:

lower(http.user_agent) eq "yoono"

would match both Yoono and yoono.

Functions must be used in the Expression Editor, as they are not available in the Expression Builder.

If I have several hundred user agents in one rule, how can I combine this pattern

(http.user_agent contains "crawler") or (http.user_agent contains "Extractor")

with your example

lower(http.user_agent) eq "yoono"

to minimize the length of the rule?

Hmm… there’s not much that can be done when the operator is “contains”, other than repeating the full comparison for each pattern:

(lower(http.user_agent) contains "crawler") or (lower(http.user_agent) contains "extractor" or…)
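If the list is long, typing each comparison by hand is error-prone. Here is a minimal Python sketch that generates the repeated expression from a list of patterns. The helper name `build_contains_rule` is made up for illustration, and it assumes the patterns contain no double quotes or backslashes.

```python
def build_contains_rule(patterns, field="http.user_agent"):
    """Join one lowercase `contains` comparison per pattern with `or`."""
    clauses = []
    for pattern in patterns:
        # lower() never yields uppercase letters, so the value must be lowercase too.
        value = pattern.lower()
        clauses.append(f'(lower({field}) contains "{value}")')
    return " or ".join(clauses)


rule = build_contains_rule(["crawler", "Extractor"])
print(rule)
print(len(rule), "characters")
```

Printing the length is handy for checking the expression against whatever size limit your plan has before pasting it into the Expression Editor.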

Unless you are on a Business Plan or higher, where a “matches regex” operator can be used.

(lower(http.user_agent) matches "(string1|string2|...)")
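For anyone who does have the regex operator, the alternation can be generated the same way. This is only a sketch: `build_regex_rule` is a hypothetical helper, and it deliberately accepts only plain letters and digits, because escaping rules for other characters are not covered here and would need to be checked by hand.

```python
import re


def build_regex_rule(patterns, field="http.user_agent"):
    """Build a single `matches` comparison with one alternative per pattern."""
    lowered = []
    for pattern in patterns:
        value = pattern.lower()
        # Assumption: refuse anything that might need regex or string escaping.
        if not re.fullmatch(r"[a-z0-9]+", value):
            raise ValueError(f"pattern needs manual escaping: {pattern!r}")
        lowered.append(value)
    alternation = "|".join(lowered)
    return f'(lower({field}) matches "({alternation})")'


print(build_regex_rule(["Crawler", "Extractor"]))
```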

I think that the following will not work:

(lower(http.user_agent) contains "crawler" or "extractor" or…)

What do you think?

Negative. I am on the free plan.

Test and see what you get.

It will be complicated to reproduce. If nobody has the answer, I will test it.

Chrome > Dev Tools > Network Tab > Network Conditions > pick custom user agent

or

curl -sI https://example.com -A 'my user agent'
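If you want to try many user agents in a row, the same HEAD request can be scripted. A rough Python sketch using only the standard library follows; `build_request` and `status_for` are names made up for this example, and the assumption that a blocked request shows up as an HTTP error status (typically 403) should be verified against your own zone.

```python
import urllib.error
import urllib.request


def build_request(url, user_agent):
    """Create a HEAD request that carries a custom User-Agent header."""
    return urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": user_agent}
    )


def status_for(url, user_agent):
    """Send the request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises for 4xx/5xx responses; the status code is still available.
        return error.code


# Example (performs a real network request, so it is left commented out):
# print(status_for("https://example.com", "my user agent"))
```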

Also, will

(lower(http.user_agent) contains "Active")

match both of these user agents:
ActiveBookmark
ActiveWorlds

The function works by transforming the value of the field passed as its parameter before the comparison takes place.

So if a request comes in with user agent “Active”, the result of the function lower(http.user_agent) would be “active”, and that result is what gets compared with the value you provide.

Therefore, you should instead use:

(lower(http.user_agent) contains "active")

That was not clear to me. Should I always use lowercase letters to match all of these:
ActiveBookmark
ActiveWorlds
activebookmark
activeworlds
?

Yes.

When you use the lower() function as suggested, it’s like you’re asking the WAF:

— If you transform all letters from user agent to lowercase, would the result match “yoono”?

And the answer is yes, both for “Yoono” and “yoono”.

If you use the operator “contains”, the expression would also match “YoonoMatrix”, “yoonoCrazy”, “Yoono is not yoono”, etc. etc.
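The difference between the two operators can be simulated locally. This is a rough Python model of the behaviour described above, not the WAF itself: `matches_eq` and `matches_contains` are made-up names, and the sample user agents are plain ASCII.

```python
def matches_eq(user_agent, value):
    """Model of: lower(http.user_agent) eq "<value>"."""
    return user_agent.lower() == value


def matches_contains(user_agent, value):
    """Model of: lower(http.user_agent) contains "<value>"."""
    return value in user_agent.lower()


for ua in ["Yoono", "yoono", "YoonoMatrix", "Yoono is not yoono"]:
    print(f"{ua!r}: eq={matches_eq(ua, 'yoono')} contains={matches_contains(ua, 'yoono')}")
```

In this model, "eq" is true only for the first two user agents, while "contains" is true for all four.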

When I blocked 1600 AS numbers, it took 3 of the 5 rules available in the WAF. So I have only 2 rules left on the free plan to block 1000 user agents of bad bots. It seems that the task is not feasible. The free plan is more or less useless.

“There was an error fetching your rules.”
It seems that I have 739 bot names that I want to block, and I cannot use a short rule like

(lower(http.user_agent) contains "crawler" or "extractor")

I tested the following rule with the action Block:

(lower(http.user_agent) contains "bot" and not lower(http.user_agent) contains "google" and not lower(http.user_agent) contains "bing" and not lower(http.user_agent) contains "VKRobot") or (lower(http.user_agent) contains "bet") 

And got the result:


So, why was VKRobot blocked?

In a rule, logical operators (like or) combine two or more expressions, not two or more values. So, it should be:

(lower(http.user_agent) contains "crawler" or lower(http.user_agent) contains "extractor")

So, why was VKRobot blocked?

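It was blocked because "VKRobot" can never match. Once you lowercase the whole UA with the lower() function, the result has no uppercase letters, so the test not lower(http.user_agent) contains "VKRobot" is always true and the exclusion never applies. You must match the result against a lowercase value, in this case "vkrobot".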
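To see why, here is a small Python model of the tested rule. It is only a sketch: `is_blocked` is a made-up name, the exclusion value is passed in as a parameter so both spellings can be compared, and the bare user agent "VKRobot" is a simplified stand-in for the real header.

```python
def is_blocked(user_agent, vk_value):
    """Model of the tested rule, with the VKRobot exclusion value as a parameter."""
    ua = user_agent.lower()  # what lower(http.user_agent) would produce
    bot_clause = (
        "bot" in ua
        and "google" not in ua
        and "bing" not in ua
        and vk_value not in ua  # always true if vk_value has uppercase letters
    )
    return bot_clause or "bet" in ua


print(is_blocked("VKRobot", "VKRobot"))  # True: the exclusion can never match
print(is_blocked("VKRobot", "vkrobot"))  # False: the exclusion now applies
```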

I finished creating the rule, but it exceeds the limit on the free plan: “size 33330 exceeded the maximum allowed of 4096 (Code: 20127)”


Is there a way to minimize the code?
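One thing that may help with a chain of "contains" comparisons joined by "or": any pattern that already contains a shorter pattern from the list is redundant, because the shorter one matches every user agent the longer one would. Below is a minimal Python sketch that drops such entries. `drop_redundant` is a hypothetical helper, it applies only to the blocking patterns (not to "not ... contains" exclusions), and it may well not be enough on its own to get under 4096 characters.

```python
def drop_redundant(patterns):
    """Keep only patterns that do not contain another, shorter kept pattern."""
    # Lowercase, de-duplicate, and skip empty strings; sort shortest first.
    lowered = sorted({p.lower() for p in patterns if p}, key=lambda p: (len(p), p))
    kept = []
    for candidate in lowered:
        # If a kept pattern is a substring of the candidate, the candidate adds nothing.
        if not any(short in candidate for short in kept):
            kept.append(candidate)
    return kept


names = ["Crawler", "webcrawler", "bot", "AhrefsBot", "extractor"]
print(drop_redundant(names))
```

With the sample list above, "webcrawler" and "AhrefsBot" are dropped because "crawler" and "bot" already cover them.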