Ingesting and analyzing Cloudflare logs in ClickHouse?

I’ve got a fair number of zones and a lot of traffic going through Cloudflare. I would like to do analysis and periodic reporting on zones, custom domains, specific paths, error rates, and so on.

As Cloudflare’s out-of-the-box reporting is a bit limited, I probably need to roll my own. Ideally I would ingest all logs into a database and use that to run queries. I have used other OLAP databases like Azure Data Explorer for analyzing high-volume request logs, so I’ve got an idea of what it would take. (Also, I have read blog.cloudflare.com/http-analytics-for-6m-requests-per-second-using-clickhouse/.)

For a number of zones I have set up Logpush to S3 so the raw logs are available.
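As a sanity check, something like the following should read the Logpush output (gzipped newline-delimited JSON) straight from S3 with ClickHouse’s s3() table function — the bucket, path, and credentials here are placeholders, not my real setup:

```sql
-- Read Logpush files (gzipped NDJSON) directly from S3.
-- Bucket name, path, and keys below are placeholders.
SELECT count()
FROM s3(
    'https://example-bucket.s3.amazonaws.com/cf-logs/**/*.log.gz',
    'AWS_ACCESS_KEY', 'AWS_SECRET_KEY',
    'JSONEachRow'
);
```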

My question: has anybody done something similar? I am looking for:

  • table/index definition code (e.g. which column types are best for storing Cloudflare’s HTTP logs)
  • any existing automation to ingest new logs into ClickHouse (and prevent duplicates)
  • any other plumbing
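For context, this is the kind of thing I’ve been sketching myself. The column names are real Logpush HTTP request fields, but the type choices are just my guesses, not a vetted design; a ReplacingMergeTree keyed on RayID (which is unique per request) is one way I can imagine collapsing duplicate ingests:

```sql
-- Candidate schema sketch, not a finished design.
-- ReplacingMergeTree deduplicates rows with the same ORDER BY key,
-- so including RayID there should absorb re-ingested files.
CREATE TABLE cf_http_logs
(
    EdgeStartTimestamp   DateTime64(3),
    RayID                String,
    ClientRequestHost    LowCardinality(String),
    ClientRequestMethod  LowCardinality(String),
    ClientRequestURI     String,
    EdgeResponseStatus   UInt16,
    ClientIP             String,
    ClientCountry        LowCardinality(String),
    EdgeResponseBytes    UInt64
)
ENGINE = ReplacingMergeTree
PARTITION BY toDate(EdgeStartTimestamp)
ORDER BY (ClientRequestHost, EdgeStartTimestamp, RayID);
```

I’d love to hear whether anyone has refined the ordering key or partitioning for this kind of workload.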

My preference would be to not start from scratch :slight_smile: