[SOLVED] 1.1.1.1 over cloudflared, Linux, nonstandard port fails after some time


#1

I am running Debian stretch on a LAN server that also runs local authoritative DNS, DHCP, and DDNS for LAN clients. I installed cloudflared listening on a non-standard port using --port NNN so it does not interfere with the existing DNS server, and I set the DNS server to forward to cloudflared on this port. dig responses show it is working. After some time, sometimes minutes and sometimes hours, something goes wrong: I get many log messages such as the one below, and DNS forwarding stops working. After some further time (hours?) the issue appears to resolve itself and it’s working again, at least as far as dig requests show.

In the message below, 192.168.X.Y is the server in question. What could be the issue here?

cloudflared[8818]: time="2018-04-02T06:52:03-07:00" level=error msg="failed to connect to an HTTPS backend \"https://cloudflare-dns.com/.well-known/dns-query\"" error="failed to perform an HTTPS request: Post https://cloudflare-dns.com/.well-known/dns-query: dial tcp: lookup cloudflare-dns.com on 192.168.X.Y:53: read udp 192.168.X.Y:50592->192.168.X.Y:53: i/o timeout"
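
One way to read this line: cloudflared is resolving its own upstream hostname through 192.168.X.Y:53, which is the same server that forwards to cloudflared. A rough sketch for pulling the hostname and resolver out of such errors follows. The helper name lookup_resolver is made up for illustration and is not part of cloudflared.

```shell
# Sketch: extract "HOST RESOLVER:PORT" from Go "lookup HOST on ADDR:PORT" errors.
# lookup_resolver is a hypothetical helper name, not a cloudflared feature.
lookup_resolver() {
    # Reads log text on stdin; prints one "HOST ADDR:PORT" pair per matching line.
    # Lines without a "lookup ... on ...:" fragment are silently skipped.
    sed -n 's/.*lookup \([^ ]*\) on \([^ ]*:[0-9][0-9]*\):.*/\1 \2/p'
}
```

Piping the service log through it, for example `journalctl -u cloudflared | lookup_resolver | sort | uniq -c` (the unit name is an assumption), shows which resolver the failing lookups were sent to.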


#2

Hi mike,

This has only happened once so far, but I experienced a similar problem.

I’m on LinuxMint 18.3 running cloudflared as a service on localhost at port 55. Port 53 is used by dnsmasq, and I set dnsmasq to forward to port 55. The cloudflared service works, but after some time it stopped working with this error message:

Apr 04 11:00:44 Mint cloudflared[18673]: time="2018-04-04T11:00:44+07:00" level=error msg="failed to connect to an HTTPS backend \"https://cloudflare-dns.com/dns-query\"" error="failed to perform an HTTPS request: Post https://cloudflare-dns.com/dns-query: dial tcp [2400:cb00:2048:1::6810:6f19]:443: socket: address family not supported by protocol"

Edit: It happened again a few times, before and after reboot.


#3

Do you have a dynamic IP address? Could it be happening after the IP changes?


#4

Hi, jedisct1!

Thanks for dnscrypt :grin:

You’re absolutely correct! I have a dynamic IP address. Reconnecting my internet connection triggers the error. Previously I was using Google’s DNS-over-HTTPS with secure-operator and didn’t get this error. Is there some config that I missed?

Edit: I should have mentioned what my config looks like.
Content of cloudflared.service:

[Unit]
Description=Cloudflare DNS 1.1.1.1
After=network.target

[Service]
Type=simple
ExecStart=/usr/local/bin/cloudflared proxy-dns --port 55

[Install]
WantedBy=multi-user.target

Also these two lines in /etc/dnsmasq.conf:

no-resolv
server=127.0.0.1#55

#5

For my situation both LAN and WAN IPs are static and are connected 24/7.


#6

Maybe something that needs to be fixed in cloudflared. I haven’t looked at how keepalive was implemented there yet.

Can you try dnscrypt-proxy and check if you have the same issue with it? Do not change the default parameters (especially the keepalive value).


#8

I used dnscrypt-proxy for a long time but moved to secure-operator since it was more convenient for my situation. The main use for me is to bypass blocks from the country’s regulator. The restriction with dnscrypt-proxy is that I have to install it on a device that connects directly to the internet, which I often don’t have control of. I can’t just install it on my laptop, for example.

I did try the latest dnscrypt-proxy, though. Setting dnscrypt-proxy to use Cloudflare DNS works; it was even picked automatically, before I manually chose it as the default server, because it was the fastest. But I didn’t try 1.1.1.1 with dnscrypt-proxy for long because of the restriction I mentioned above.


#9

Could you try with:

/usr/local/bin/cloudflared proxy-dns --port 55 --upstream https://1.1.1.1/dns-query

Or see https://developers.cloudflare.com/1.1.1.1/dns-over-https/cloudflared-proxy/ for a service example.
I think the problem in your case is that you’re using the default hostname endpoint with floating addresses, which won’t work if cloudflared is your only DNS proxy. I think I should make the IP-based upstream the default going forward.
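
To spot this in an existing setup, a small check along these lines could tell whether a given --upstream value needs a DNS lookup before it can be reached. This is only a sketch: upstream_host and needs_dns are hypothetical helper names, and only dotted IPv4 literals are recognised (bracketed IPv6 hosts are not handled).

```shell
# Sketch: does a DoH upstream URL depend on DNS to be reachable?
# upstream_host and needs_dns are hypothetical helper names.
upstream_host() {
    # Strip the scheme, then the path, then any :port suffix.
    h=${1#*://}
    h=${h%%/*}
    printf '%s\n' "${h%%:*}"
}

needs_dns() {
    # Returns 0 (true) when the host contains anything other than digits and dots,
    # i.e. it is a name; returns 1 for a dotted IPv4 literal such as 1.1.1.1.
    case $(upstream_host "$1") in
        *[!0-9.]*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With this, `needs_dns https://cloudflare-dns.com/dns-query` succeeds, while `needs_dns https://1.1.1.1/dns-query` does not, which matches the suggestion above to point cloudflared at an address rather than a name when it is the only resolver on the box.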


#10

That’s it!

Adding the --upstream option to the command solves the failed request issue. This is what the command looks like now:

/usr/local/bin/cloudflared proxy-dns --port 55 --upstream https://1.1.1.1/dns-query --upstream https://1.0.0.1/dns-query

It’s also going along great with dnsmasq.

Thanks @mvavrusa and sorry @mike7 for hijacking your thread.


#11

Well, hijacked thread or not, using the --upstream option in my case, along with an update to version 2018.4.2, seems to have solved my problem. It has been running for 5 hours in a production environment without issue, meaning the local DNS forwards to localhost port NNN on which cloudflared is listening. I’ll mark the post solved if it remains running for at least 24 hours, and ideally I’ll also determine whether it’s the software update or the --upstream option that solved it.

UPDATE: running perfectly for the last ~60 hours.


#12

Hi, I’m having the same problem (cloudflared is running on a RaspberryPi where PiHole is running too) and I tried the solution you suggest, but I keep having exactly the same problem:

[email protected]:/var/log $ sudo systemctl status dnsproxy.service
● dnsproxy.service - CloudFlare DNS over HTTPS Proxy
Loaded: loaded (/etc/systemd/system/dnsproxy.service; enabled; vendor preset: enabled)
Active: active (running) since Sat 2018-05-12 14:22:42 UTC; 2min 52s ago
Main PID: 26215 (cloudflared)
CGroup: /system.slice/dnsproxy.service
└─26215 /home/pi/argo-tunnel/cloudflared proxy-dns --port 54 --upstream https://1.1.1.1/dns-query --upstream https://1.0.0.1/dns-query

May 12 14:25:28 raspberrypi cloudflared[26215]: time="2018-05-12T14:25:28Z" level=error msg="failed to connect to an HTTPS backend \"https://1.0.0.1/dns-query\"" error="failed to perform an HTTPS request: Post https://1.0.0.1/dns-query: net/http: request canceled while wa
May 12 14:25:28 raspberrypi cloudflared[26215]: time="2018-05-12T14:25:25Z" level=error msg="failed to connect to an HTTPS backend \"https://1.1.1.1/dns-query\"" error="failed to perform an HTTPS request: Post https://1.1.1.1/dns-query: net/http: request canceled while wa


#13

I noticed that my raspberrypi clock / timezone was 1 hour behind. I updated my timezone and clock synced correctly. After that I restarted the service and… it looks like the cloudflared daemon is working correctly. Is it a pure coincidence or was it the real problem? Cheers


#14

I think it was the real problem; clock errors cause TLS failures.


#15

I spoke too quickly. After it initially worked, it stopped working after 10 minutes. I tried to restart the service but it didn’t help, so I have deactivated it again :confused:

The error was exactly the same:

May 12 15:00:20 raspberrypi cloudflared[31909]: time="2018-05-12T16:00:20+01:00" level=error msg="failed to connect to an HTTPS backend \"https://1.1.1.1/dns-query\"" error="failed to perform an HTTPS request: Post https://1.1.1.1/dns-query: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
May 12 15:00:21 raspberrypi cloudflared[31909]: time="2018-05-12T16:00:21+01:00" level=error msg="failed to connect to an HTTPS backend \"https://1.0.0.1/dns-query\"" error="failed to perform an HTTPS request: Post https://1.0.0.1/dns-query: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"


#16

Hi - I am having the same problem as well. Did you ever get this resolved? If so, could you please share?

I just see continuous TCP connections out to 1.1.1.1 that do not ever get a connection established.

tail /var/log/syslog

pihole cloudflared[490]: time="2018-05-23T04:09:52Z" level=error msg="failed to connect to an HTTPS backend \"https://1.1.1.1/.well-known/dns-query\"" error="failed to perform an HTTPS request: Post https://1.1.1.1/.well-known/dns-query: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"

netstat -ant
tcp 0 1 192.168.0.111:33318 1.1.1.1:443 SYN_SENT
tcp 0 1 192.168.0.111:33320 1.1.1.1:443 SYN_SENT
tcp 0 1 192.168.0.111:33314 1.1.1.1:443 SYN_SENT
tcp 0 1 192.168.0.111:33316 1.1.1.1:443 SYN_SENT
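
To make output like this easier to compare across upstreams, the lines can be grouped by remote endpoint and TCP state. A minimal sketch, assuming the usual six-column `netstat -ant` layout shown above; conn_states is a made-up helper name.

```shell
# Sketch: summarise `netstat -ant` output by remote endpoint and TCP state.
# conn_states is a hypothetical helper name.
conn_states() {
    # Reads netstat -ant lines on stdin. Column 5 is the foreign address and
    # column 6 the state; header lines do not start with "tcp" and are skipped.
    # Prints "COUNT REMOTE STATE", sorted by remote address.
    awk '$1 ~ /^tcp/ && NF >= 6 { n[$5 " " $6]++ }
         END { for (k in n) print n[k], k }' | sort -k2
}
```

Running `netstat -ant | conn_states` then gives one line per endpoint and state, so a pile of SYN_SENT entries toward a single address stands out.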


#17

Hi, no I never resolved this. I switched back to using 9.9.9.9 without any TLS on my PiHole :confused:


#18

Not really solved then :frowning:

Can you try dnscrypt-proxy and see if you experience the same issue?


#19

Hi,

I just tried using dnscrypt-proxy and it doesn’t work at all :confused:

I followed the instructions here https://github.com/pi-hole/pi-hole/wiki/DNSCrypt-2.0 for the part regarding the installation of the client. Then I used Scott Helme’s tutorial for the part about modifying the pihole settings: https://scotthelme.co.uk/securing-dns-across-all-of-my-devices-with-pihole-dns-over-https-1-1-1-1/

Before I switched pihole settings from 9.9.9.9 to 127.0.0.1#5353 I could resolve hosts using the dnscrypt-proxy client:

[email protected]:/opt/dnscrypt-proxy# ./dnscrypt-proxy -resolve google.com
Resolving [google.com]

Domain exists:  yes, 4 name servers found
Canonical name: google.com.
IP addresses:   2a00:1450:4009:800::200e, 172.217.23.14
TXT records:    v=spf1 include:_spf.google.com ~all facebook-domain-verification=22rm551cu4k0ab0bxsw536tlds4h95 docusign=05958488-4752-4ef2-95eb-aa7ba8a3bd0e
Resolver IP:    74.63.26.244 (res300.lhr.rrdns.pch.net.) 

but after I switched to 127.0.0.1#5353 and restarted the dnsmasq.service I got this:

[email protected]:/opt/dnscrypt-proxy# ./dnscrypt-proxy -resolve google.com
Resolving [google.com]

Domain exists:  probably not, or blocked by the proxy
Canonical name: -
IP addresses:   -
TXT records:    -

Is there anything I can do to give you more information?

Cheers


#20

UPDATE
I noticed in the netstat output that the TCP connections to 1.0.0.1 were getting established, and there were none of the “failed to connect to an HTTPS backend” messages in the logs for 1.0.0.1.

netstat -ant
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 192.168.0.111:36254 1.0.0.1:443 ESTABLISHED

So, I changed the cloudflared connection to use 1.0.0.1 first, and now I am getting the DNS-over-HTTPS functionality on the PiHole. I am not sure what’s going on, or why it’s not working with 1.1.1.1.

ExecStart=/home/pi/argo-tunnel/cloudflared proxy-dns --port 54 --upstream https://1.0.0.1/.well-known/dns-query --upstream https://1.1.1.1/.well-known/dns-query

My PiHole still will not connect to 1.1.1.1, but at least it’s working over 1.0.0.1!
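
For anyone wanting to check which upstreams connect before deciding on the order, something like the following might help. It is only a sketch: reachable_upstreams is a hypothetical helper, the timeouts are arbitrary, and it only tests that an HTTPS connection can be set up, not that DNS answers come back.

```shell
# Sketch: print the candidate DoH upstreams that accept an HTTPS connection.
# reachable_upstreams is a hypothetical helper; the timeouts are arbitrary choices.
reachable_upstreams() {
    for url in "$@"; do
        # curl exits non-zero when the connection cannot be set up in time.
        # An HTTP error status still counts as reachable because -f is not used.
        if curl -s -o /dev/null --connect-timeout 5 --max-time 10 "$url"; then
            printf '%s\n' "$url"
        fi
    done
}
```

Calling `reachable_upstreams https://1.1.1.1/dns-query https://1.0.0.1/dns-query` prints only the URLs that connected, which can then be passed to cloudflared as --upstream values in that order.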