Problem with CloudFlare automatic TTL being too short for Google DNS
I moved one site's DNS to CloudFlare and noticed that my mail client often timed out on DNS lookups. The problem seems to be with Google DNS (8.8.8.8 and 8.8.4.4), which returned SERVFAIL to nslookup most of the time.
nslookup -debug -type=A mail.mydomain.com 8.8.8.8
Server:		8.8.8.8
Address:	8.8.8.8#53

------------
    QUESTIONS:
	mail.mydomain.com, type = A, class = IN
    ANSWERS:
    AUTHORITY RECORDS:
    ADDITIONAL RECORDS:
------------
** server can't find mail.mydomain.com: SERVFAIL
Querying CloudFlare, OpenDNS, or tools like dnschecker.org, on the other hand, always worked fine.
I use my router as the DNS server, and I believe the ISP internally uses Google DNS (I do not see it in the settings, which just say "Automatic DHCP configuration"). That would explain why the mail client has been acting up.
8.8.8.8 and 8.8.4.4 are not single nodes but load-balanced pools, and a different node answers each query. I can see this from the TTL jumping around in the debug output on the occasions when I do get an answer. The TTL is always very low; the highest value I have seen was 150.
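To make the "different node each time" observation easier to reproduce than with repeated manual nslookups, here is a minimal Python sketch that sends the same A query to one resolver many times and tallies the response codes and the TTLs seen. It uses only the standard library and builds the DNS packet by hand; the function names (`build_query`, `parse_response`, `probe`) are my own, and the parser only handles the simple case of a normal UDP reply, so treat it as an illustration rather than a full DNS client.

```python
import socket
import struct
from collections import Counter

# Only the response codes relevant here; others are reported as numbers.
RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN"}

def build_query(name, qid=0x1234, qtype=1):
    """Build a minimal DNS query (recursion desired, class IN, type A by default)."""
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(p)]) + p.encode("ascii") for p in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)

def skip_name(msg, pos):
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = msg[pos]
        if length == 0:
            return pos + 1
        if length & 0xC0 == 0xC0:  # compression pointer occupies 2 bytes
            return pos + 2
        pos += 1 + length

def parse_response(msg):
    """Return (rcode_name, [ttl, ...]) taken from the answer section."""
    _id, flags, qdcount, ancount, _ns, _ar = struct.unpack("!HHHHHH", msg[:12])
    rcode = flags & 0x000F
    pos = 12
    for _ in range(qdcount):
        pos = skip_name(msg, pos) + 4  # skip QTYPE and QCLASS
    ttls = []
    for _ in range(ancount):
        pos = skip_name(msg, pos)
        _type, _cls, ttl, rdlen = struct.unpack("!HHIH", msg[pos:pos + 10])
        ttls.append(ttl)
        pos += 10 + rdlen
    return RCODES.get(rcode, str(rcode)), ttls

def probe(server, name, count=20, timeout=3.0):
    """Query `server` `count` times; return (Counter of outcomes, list of TTLs)."""
    outcomes = Counter()
    seen_ttls = []
    for i in range(count):
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.settimeout(timeout)
            try:
                s.sendto(build_query(name, qid=i + 1), (server, 53))
                data, _ = s.recvfrom(4096)
            except OSError:  # includes socket timeouts
                outcomes["NO_REPLY"] += 1
                continue
        rcode, ttls = parse_response(data)
        outcomes[rcode] += 1
        seen_ttls.extend(ttls)
    return outcomes, seen_ttls
```

Calling `probe("8.8.8.8", "mail.mydomain.com")` against Google Public DNS would return the tally of NOERROR, SERVFAIL and NO_REPLY outcomes plus every TTL observed. If the resolver really is a pool of independent caches, the TTLs should jump between values instead of counting down steadily.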
I believe that CloudFlare's automatic TTL is too short for the record to stay cached across the Google DNS network. With the number of queries Google handles, my insignificant, low-priority domain probably spends most of its time waiting in a queue to be refreshed from its authoritative DNS. After changing the TTL to 3600 seconds, I started seeing mostly good results from Google DNS.
Btw., when testing Google 8.8.8.8, CloudFlare 1.1.1.1 and OpenDNS 208.67.222.222 for latency, Google was the slowest and the other two were comparable.
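For anyone who wants to repeat the latency comparison, here is a rough Python sketch that times a handful of UDP queries per resolver and reports the median. The helper names (`udp_dns_query`, `median_latency_ms`) are hypothetical, and it measures only the round trip from your own machine, so cached answers and your network path will dominate the numbers.

```python
import socket
import statistics
import struct
import time

def udp_dns_query(server, name, timeout=3.0):
    """Send one A query over UDP and wait for any reply (the content is ignored)."""
    qname = b"".join(bytes([len(p)]) + p.encode("ascii") for p in name.split("."))
    packet = (struct.pack("!HHHHHH", 1, 0x0100, 1, 0, 0, 0)
              + qname + b"\x00" + struct.pack("!HH", 1, 1))
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.sendto(packet, (server, 53))
        s.recvfrom(4096)

def median_latency_ms(server, name, rounds=10, query=udp_dns_query):
    """Median round-trip time in milliseconds; failed queries are skipped."""
    samples = []
    for _ in range(rounds):
        start = time.perf_counter()
        try:
            query(server, name)
        except OSError:  # timeout or network error: no sample recorded
            continue
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples) if samples else None
```

Running `median_latency_ms(ip, "mail.mydomain.com")` for each of the three resolver addresses gives figures that can be compared directly. The median is used instead of the mean so that a single slow or retried query does not skew the result.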
UPDATE: Google offers public DNS cache flushing tool here: https://developers.google.com/speed/public-dns/cache
After I flushed my main domain, I did about 20-30 nslookups against each of 8.8.8.8 and 8.8.4.4, and it now works 100% of the time with a 1-hour TTL.