Cloudflare WARP DoH breaks gethostbyname on Linux

Hi,

When WARP is running, gethostbyname fails for some hostnames on Linux. This doesn’t only affect the Linux client: I was able to reproduce the issue from a Docker container running on a Windows host that was running WARP. The getaddrinfo function is not affected by this bug.

To test this, run python3 -c "import socket; print(socket.gethostbyname('test.s3.amazonaws.com'))". When WARP is running, the following exception is raised:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
socket.gaierror: [Errno -5] No address associated with hostname
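
To see both behaviours side by side, something like the following rough sketch could be used. It calls gethostbyname and getaddrinfo for the same name and records either the result or the error. The function name compare_resolvers and the by_name / by_addrinfo parameters are my own and purely illustrative; they only exist so the two lookups can be swapped out.

```python
import socket


def compare_resolvers(hostname, by_name=socket.gethostbyname,
                      by_addrinfo=socket.getaddrinfo):
    """Report what each resolver API returns for hostname.

    by_name and by_addrinfo are hypothetical hooks so the two lookups can be
    swapped out; by default they are the real socket functions.
    """
    report = {}
    try:
        report["gethostbyname"] = by_name(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        report["gethostbyname"] = f"error: {exc}"
    try:
        # Restrict to IPv4 stream sockets to mirror what gethostbyname returns.
        infos = by_addrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the IP address string.
        report["getaddrinfo"] = sorted({info[4][0] for info in infos})
    except OSError as exc:
        report["getaddrinfo"] = f"error: {exc}"
    return report
```

Calling compare_resolvers("test.s3.amazonaws.com") with WARP running should, if the behaviour described here holds, show an error string for gethostbyname and a list of addresses for getaddrinfo.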

When testing gethostbyname natively from C, the error is NO_RECOVERY. The man page describes this as "A nonrecoverable name server error occurred." I checked the source code for this, but there are a number of reasons this error could occur and I couldn’t figure out the exact cause.

// TEST C CODE FOR WARP DOH ERRORS
#include <netdb.h>
#include <stdio.h>
#include <unistd.h>
#include <limits.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(int argc, char* argv[])
{
        struct hostent* host = gethostbyname("test.s3.amazonaws.com");
        if (!host) {
                printf("error %d \n", h_errno);
                switch (h_errno) {
                        case HOST_NOT_FOUND:
                                printf("HOST_NOT_FOUND\n");
                                break;
                        case NO_DATA:
                                printf("NO_DATA\n");
                                break;
                        case NO_RECOVERY:
                                printf("NO_RECOVERY\n");
                                break;
                        case TRY_AGAIN:
                                printf("TRY_AGAIN\n");
                                break;
                        default:
                                printf("unknown\n");
                }
        } else {

                printf ("Name: %s \n", host->h_name);
                /* h_aliases is a NULL-terminated array of strings */
                for (char** alias = host->h_aliases; *alias; alias++) {
                        printf ("\tAlias: %s \n", *alias);
                }
        }
        /* Restrict to IPv4 so the sockaddr_in cast below is valid */
        struct addrinfo hints = {0}, *addrs;
        hints.ai_family = AF_INET;
        hints.ai_flags = AI_CANONNAME;
        int err = getaddrinfo("test.s3.amazonaws.com", NULL, &hints, &addrs);
        if (err != 0)
        {
                fprintf(stderr, " %s\n", gai_strerror(err));
        } else {
                char address[INET_ADDRSTRLEN];
                inet_ntop(AF_INET, &(((struct sockaddr_in *)addrs->ai_addr)->sin_addr), address, sizeof(address));
                printf("addr: %s\ncanonname: %s\n", address, addrs->ai_canonname);
                freeaddrinfo(addrs);
        }
        return 0;
}

This also breaks getent hosts test.s3.amazonaws.com, which returns no results.

Anyone else experiencing this issue? This makes WARP DoH absolutely unusable on Linux.

EDIT:

I have investigated a bit, and this happens because WARP strips CNAME entries from the ANSWER section of DNS responses. The remaining A record is owned by the canonical name rather than the name that was queried, and with no CNAME record left to link the two, glibc rejects the entry.

Example DNS response from WARP:

; <<>> DiG 9.18.1-1ubuntu1.1-Ubuntu <<>> test.s3.amazonaws.com

;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id:
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: 3e9e560d77e7f8bc (echoed)
;; QUESTION SECTION:
;test.s3.amazonaws.com. IN A

;; ANSWER SECTION:
s3-w.us-east-1.amazonaws.com. 5 IN A 52.217.202.73

;; Query time: 50 msec
;; SERVER: 192.168.x.x#53(192.168.x.x) (UDP)
;; WHEN: Fri Aug 05 16:53:35 UTC 2022
;; MSG SIZE rcvd: 149

Relevant glibc source code:
https://code.woboq.org/userspace/glibc/resolv/nss_dns/dns-host.c.html#917
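
To make the rejection easier to follow, here is a much simplified model of the name check, written in Python. It is not glibc's actual code, only a sketch of the idea: an A record is accepted only if its owner name is the queried name or can be reached from it through CNAME records in the same answer. The function name accepted_addresses and the (owner, type, data) tuple layout are made up for illustration, and it assumes CNAME records come before the A records they point to.

```python
def accepted_addresses(qname, answers):
    """Simplified model of the owner-name check on a DNS answer section.

    answers is a list of (owner, rtype, data) tuples. An A record is kept
    only when its owner matches the name currently expected, which starts
    as qname and is updated each time a matching CNAME is followed.
    """
    def norm(name):
        # DNS names are case-insensitive; ignore the trailing root dot.
        return name.rstrip(".").lower()

    expected = norm(qname)
    addresses = []
    for owner, rtype, data in answers:
        if norm(owner) != expected:
            continue  # owner does not match: the record is skipped
        if rtype == "CNAME":
            expected = norm(data)  # follow the alias to its target
        elif rtype == "A":
            addresses.append(data)
    return addresses
```

With only the A record from the WARP response, accepted_addresses("test.s3.amazonaws.com", [("s3-w.us-east-1.amazonaws.com.", "A", "52.217.202.73")]) returns an empty list, which corresponds to the lookup failing. If a CNAME from test.s3.amazonaws.com. to s3-w.us-east-1.amazonaws.com. is put in front of it (my assumption about what the unstripped response looks like), the address is accepted.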

Version 2022.4.235 does not have this problem.