Dig on www.wikipedia.org fails, symptom of larger issues?

Good Afternoon,

I was trying to set up a squid configuration, and was successful, except that I noticed a lot of sites didn’t load (with the URL filter disabled).

I began investigating, thinking Wikipedia had somehow ended up on one of my local DNS blocklists (pihole), but no, it was perfectly happy to resolve www.wikipedia.org. Puzzled, I SSHed into my ipfire box, ran dig on www.wikipedia.org, and got back a SERVFAIL. Resolving other domains worked fine, but occasionally I would get a SERVFAIL on a site I would expect to be fine.

The only thing that fixed browsing Wikipedia was disabling the proxy on my client entirely, but that's not really the solution I want.

What should I do to figure out why using the proxy breaks wikipedia?

Howzit Wes,
and welcome…

If you get SERVFAIL occasionally on other domains as well, then the problem is not with Wikipedia but more likely something local, or with your ISP.
Assumptions is the mo… so in the spirit of things I will assume you are running the DNS on the ipfire and not on a pi-hole.
You will have to provide a bit more details on your configuration.

  1. Are you on TCP, UDP or TLS?
  2. Which DNS servers are you using, or are you in recursor mode?
  3. Are your default Firewall Options set to forward, or have you changed to blocked?
  4. And if blocked, what FW rules have you set?

That should do to get the ball rolling…

Hey Andreas, thanks for responding to me.

  1. I was experiencing more SERVFAILs when I was using UDP, so I switched over to TCP since I have a relatively small number of clients
  2. My upstream DNS servers are quad9 and openDNS
  3. My default firewall options are allowed, not blocked
  4. N/A

Thanks a bunch!

No worries…
:thinking: I should probably have started off by asking which Core you are on, Core 151 was just released today.

If you are on anything 150+ you should be fine. Prior to 150 I was also experiencing weird DNS problems, and the only way it worked was doing DNS via TCP. My system is now on DNS over TLS and works fine.
If you are still on the 140+ side of things, please update.

If you have a look at the “not recommended“ section lower down on the List of Public DNS Servers, you will find both Quad9 and OpenDNS are listed.

For the sake of going through a process of elimination, change the ones you currently use to one of the recommended ones. Google DNS for example, as we know it works without any reported hiccups.
See how that goes and give feedback…

I was on core 150, I just upgraded to 151 this morning.

I added in cloudflare as the primary DNS to hopefully fix things. However when I run dig through ssh on the firewall, I am still getting the below SERVFAIL:

[root@USS-Defiant ~]# dig www.wikipedia.org

; <<>> DiG 9.11.21 <<>> www.wikipedia.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 48275
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;www.wikipedia.org.             IN      A

;; Query time: 287 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Oct 27 08:10:36 EDT 2020
;; MSG SIZE  rcvd: 46

But, if I dig from a client machine that isn’t connected to the proxy, but uses the nameservers handed out via DHCP, the dig works fine:

captain@USS-Excelsior:~/WSL2-Linux-Kernel-4.19.128-microsoft-standard$ dig www.wikipedia.org

; <<>> DiG 9.16.1-Ubuntu <<>> www.wikipedia.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 987
;; flags: qr rd ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;www.wikipedia.org.             IN      A

;; ANSWER SECTION:
www.wikipedia.org.      0       IN      CNAME   dyna.wikimedia.org.
dyna.wikimedia.org.     0       IN      A       208.80.154.224

;; Query time: 110 msec
;; SERVER: 172.21.64.1#53(172.21.64.1)
;; WHEN: Tue Oct 27 12:14:57 EDT 2020
;; MSG SIZE  rcvd: 118

Any ideas Andreas?
Thanks again for the help

Try: dig +dnssec wikipedia.org @YOUR-UPSTREAM-DNS-IP
to test the servers that you have configured for unbound.

It seems like Cloudflare is the problem (had a look online, you are not alone). I would suggest doing another check, but replace 1.1.1.1 with 8.8.8.8. You don’t need to change the DNS server entries, you can just change the dig command to reflect it.
For example:
dig @8.8.8.8 ipfire.org
or
dig +dnssec @8.8.8.8 ipfire.org

Your SERVFAIL looks like it’s trying to do DNSSEC, and ends up forgetting what it was doing, or rather Cloudflare is not responding as expected.
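
To make the comparison between resolvers easier to read, here is a rough sketch of a helper that pulls just the status field out of dig's header line. The function name dig_status is made up for this example, and the resolver list in the usage comment is only an assumption based on the addresses mentioned in this thread.

```shell
# Print only the status field (NOERROR, SERVFAIL, ...) from dig output on stdin.
dig_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p'
}

# Hypothetical usage on the firewall (needs network access):
# for s in 127.0.0.1 10.0.0.3 1.1.1.1 8.8.8.8; do
#     printf '%s\t' "$s"
#     dig +dnssec www.wikipedia.org "@$s" | dig_status
# done
```

If only the 127.0.0.1 line shows SERVFAIL, that would point at the local resolver on the firewall rather than at any one upstream.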

Using 1.1.1.1 in dig gives no SERVFAIL, using 8.8.8.8 gives no SERVFAIL, even using my local DNS (pihole) gives no SERVFAIL. It is only when I run dig without specifying the DNS server that I get a SERVFAIL response.

So to me, this says that unbound has an invalid default DNS. How do I change it to something better?

Below are the commands I ran from my firewall:

[root@USS-Defiant ~]# dig +dnssec www.wikipedia.org @1.1.1.1

; <<>> DiG 9.11.21 <<>> +dnssec www.wikipedia.org @1.1.1.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 31546
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 1232
;; QUESTION SECTION:
;www.wikipedia.org.             IN      A

;; ANSWER SECTION:
www.wikipedia.org.      85879   IN      CNAME   dyna.wikimedia.org.
dyna.wikimedia.org.     79      IN      A       208.80.154.224

;; Query time: 9 msec
;; SERVER: 1.1.1.1#53(1.1.1.1)
;; WHEN: Tue Oct 27 21:30:04 EDT 2020
;; MSG SIZE  rcvd: 91

[root@USS-Defiant ~]# dig +dnssec www.wikipedia.org @8.8.8.8

; <<>> DiG 9.11.21 <<>> +dnssec www.wikipedia.org @8.8.8.8
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 6475
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 512
;; QUESTION SECTION:
;www.wikipedia.org.             IN      A

;; ANSWER SECTION:
www.wikipedia.org.      21037   IN      CNAME   dyna.wikimedia.org.
dyna.wikimedia.org.     208     IN      A       208.80.154.224

;; Query time: 4 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Tue Oct 27 21:30:11 EDT 2020
;; MSG SIZE  rcvd: 91

[root@USS-Defiant ~]# dig +dnssec www.wikipedia.org @10.0.0.3

; <<>> DiG 9.11.21 <<>> +dnssec www.wikipedia.org @10.0.0.3
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35081
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
;; QUESTION SECTION:
;www.wikipedia.org.             IN      A

;; ANSWER SECTION:
www.wikipedia.org.      45343   IN      CNAME   dyna.wikimedia.org.
dyna.wikimedia.org.     549     IN      A       208.80.154.224

;; Query time: 3 msec
;; SERVER: 10.0.0.3#53(10.0.0.3)
;; WHEN: Tue Oct 27 21:30:17 EDT 2020
;; MSG SIZE  rcvd: 94

[root@USS-Defiant ~]# dig +dnssec www.wikipedia.org

; <<>> DiG 9.11.21 <<>> +dnssec www.wikipedia.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 14269
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 1232
;; QUESTION SECTION:
;www.wikipedia.org.             IN      A

;; Query time: 157 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Tue Oct 27 21:30:28 EDT 2020
;; MSG SIZE  rcvd: 46

Wes,

I get a different IP for wikipedia.org (198.35.26.96), but your output shows 208.80.154.224.

(my DNS settings, 8.8.8.8, standard/udp)

Both IP addresses reverse WHOIS to Wikipedia. What exactly are you alluding to?

I unfortunately cannot recreate the error you get.

The ; EDNS: version: 0, flags:; udp: 1232 line would indicate that one of the DNS services your FW is speaking to maybe has a wrong UDP size limit set. There is no way for me to check or prove this, but if you Google it, Cloudflare keeps coming up as having problems since September 2020. Maybe unrelated, maybe not.

Edit/UPDATE: every dig +dnssec www.wikipedia.org @IP_OF_DNS_SERVER
returns output containing

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 512

Only Cloudflare returns

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 1232

If no DNS server is defined, the FW will operate in Recursor Mode. For lack of a better description, it acts as its own DNS server.
In recursor mode the FW resolves a query by first asking a root nameserver, then a TLD nameserver, and lastly an authoritative nameserver, before returning the result.
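
The root, TLD, authoritative walk can be watched with dig +trace, which does the iterative lookups itself instead of handing the question to a forwarder. A minimal sketch, assuming the usual ";; Received N bytes from ..." line that dig prints after each step; trace_hops is a made-up name.

```shell
# List the server that answered each step of a `dig +trace` run (output on stdin).
trace_hops() {
    sed -n 's/^;; Received [0-9]* bytes from \([^ ]*\) in .*/\1/p'
}

# Hypothetical usage (needs network access):
# dig +trace www.wikipedia.org | trace_hops
```

Each printed line is one hop in the chain, so a trace that stops early hints at where the lookup breaks.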

Which nameservers those are has to do with which ISP you are using, and I don’t mean the assigned DNS servers the ISP owns, but rather the authoritative nameservers they connect to.

To have some measure of control, you can define which nameservers your FW will use. You do this from within the Web GUI, under Networking > Domain Name System. You still have no clue which authoritative nameservers those DNS servers connect to, though, and this now turns into a trust game. Who do you trust more? :thinking: The DNS roulette… 007, coming to a cinema near you… sorry getting side tracked… would make a nice movie title. :smile:

So… from where I am sitting, one of these seems to be the problem. And I am only talking about the FW.
You said you are using quad9 and openDNS on the FW. Has this changed?
If not, then one of those DNS entries is causing it.
If you are in Recursor Mode, then an outside authoritative DNS server your FW has decided to talk to is causing this. And that most likely has something to do with your ISP.

The above will affect all DNS-related queries, and that includes dig.

Sorry mate, maybe someone else has a better solution.

Have you enabled IDS (suricata)? If yes, try disabling it. Sometimes suricata blocks unbound's DNS queries without logging. (I have not found the reason for this.)

Here is my original DNS configuration in ipfire:

The only DNS server my network should be using is my local one, which also enforces DNSSEC. I am currently only using green and red, so it is accessible from my entire network. If you look at an earlier post, when I run dig +dnssec www.wikipedia.org @10.0.0.3, Wikipedia resolves just fine. However, if I omit the @10.0.0.3, ipfire returns a SERVFAIL.

My network clients only get a SERVFAIL when trying to resolve www.wikipedia.org if they are configured to use ipfire’s squid proxy, even if I have clamscan, suricata, and the URL filter all disabled.

It seems like at some point this thread ended up in the weeds (most likely my doing) but I am going to repeat my problem again so hopefully it becomes more clear.

Issue:
When a network client is configured to use ipfire’s squid proxy, it becomes unable to connect to www.wikipedia.org. The only related behavior I have found is that running dig +dnssec www.wikipedia.org from an SSH session on the firewall returns a SERVFAIL.

Expected Behavior:
Since using the squid proxy shouldn’t change the DNS that clients use (all clients should still use the 10.0.0.3:53 that the firewall gave them during DHCP), whether or not the clients are using squid, they should be able to connect to www.wikipedia.org

Does that make my issue more clear? I am not certain it is a DNS issue. It just seems like if squid uses unbound by default, and unbound is returning SERVFAIL, that would be the problem. But IDK if that's the issue.

No. If squid is used in non-transparent mode, the client sends "GET wikipedia.org" to the proxy and the proxy does the DNS resolution. The client needs no DNS in this mode.
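
Since the proxy does the lookup, the resolver that matters is the one the firewall itself uses. As far as I know, Squid takes its nameservers from /etc/resolv.conf unless its dns_nameservers directive is set, and the dig output earlier in the thread shows a plain dig on the firewall asking 127.0.0.1. A small sketch to list what is configured there; resolv_nameservers is a made-up helper name.

```shell
# Print the nameserver addresses from resolv.conf-style text on stdin.
resolv_nameservers() {
    awk '$1 == "nameserver" { print $2 }'
}

# Hypothetical usage on the firewall:
# resolv_nameservers < /etc/resolv.conf
```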

Oh, I did not know that, thanks.

I do not see an option in the web GUI to set the upstream DNS for the proxy, how can I set the proxy’s upstream DNS?

So your pathway is:
LAN -> fw -> pi-hole -> DNS and return.

If you do a dig +dnssec www.wikipedia.org @10.0.0.3 or dig +dnssec www.wikipedia.org from the FW the pathway according to your Domain Name System ends up being the same. The only difference is that the second one starts querying the local DNS cache first.

This means you should probably flush the cache on the FW:

unbound-control -c /var/unbound/unbound.conf reload
Not sure if that actually clears the DNS cache or just restarts unbound, so I guess you can run
unbound-control flush www.wikipedia.org
if you only want to flush that one domain, and/or, for everything,
unbound-control flush_zone .

Note the trailing dot (.) at the end of that last command.

Check your pi-hole :face_with_raised_eyebrow: …not meant the way it reads :smile:… it might be using unbound as well.

Correct, DNS queries should follow this path:
LAN -> FW -> pihole (still on LAN) -> cloudflare or google or opendns or quad9 or whoever
So I ran the commands you gave to clear and restart unbound, but that doesn’t appear to have worked. If I re-run the dig I get the following:

[root@USS-Defiant ~]# dig +dnssec www.wikipedia.org

; <<>> DiG 9.11.21 <<>> +dnssec www.wikipedia.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 58201
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 1232
;; QUESTION SECTION:
;www.wikipedia.org.             IN      A

;; Query time: 387 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Oct 28 09:12:52 EDT 2020
;; MSG SIZE  rcvd: 46

[root@USS-Defiant ~]# dig +dnssec www.wikipedia.org @10.0.0.3

; <<>> DiG 9.11.21 <<>> +dnssec www.wikipedia.org @10.0.0.3
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46301
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 4096
;; QUESTION SECTION:
;www.wikipedia.org.             IN      A

;; ANSWER SECTION:
www.wikipedia.org.      86036   IN      CNAME   dyna.wikimedia.org.
dyna.wikimedia.org.     236     IN      A       208.80.154.224

;; Query time: 2 msec
;; SERVER: 10.0.0.3#53(10.0.0.3)
;; WHEN: Wed Oct 28 09:12:59 EDT 2020
;; MSG SIZE  rcvd: 94

So it seems like unbound is not forwarding DNS queries to the pihole, and I am not sure why.

If you want to check the comms between the FW and pi-hole, open two terminal sessions. In the first one, run
conntrack -E -p udp
or
conntrack -E -d 10.0.0.3

The first will show any UDP traffic. If you add --dport 53 to the end it will cough up only DNS requests, but that may limit some info. The second one uses your pi-hole as the destination machine. As you only use it for DNS queries, there is no need to mention protocols or ports.

In the second terminal, do the dig again.
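
If the event stream gets too noisy, the output can be narrowed down afterwards. A rough sketch, assuming the usual src=/dst=/sport=/dport= fields in conntrack's event lines; dns_events_to is an invented name, and 10.0.0.3 in the usage comment is the pi-hole address from this thread.

```shell
# Keep only conntrack event lines for DNS (destination port 53) sent to one host.
# Reads `conntrack -E` output on stdin; the first argument is the DNS server's IP.
dns_events_to() {
    awk -v ip="$1" '{
        # Only the first dst=/dport= pair (the original direction) is checked.
        dst = ""; dport = ""
        for (i = 1; i <= NF; i++) {
            if (dst == "" && index($i, "dst=") == 1) dst = substr($i, 5)
            if (dport == "" && index($i, "dport=") == 1) dport = substr($i, 7)
        }
        if (dst == ip && dport == "53") print
    }'
}

# Hypothetical usage on the firewall (run as root):
# conntrack -E -p udp | dns_events_to 10.0.0.3
```

If nothing shows up here while the dig runs in the other terminal, the firewall is probably not sending its queries to the pi-hole at all.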

Alternatively…
Assign the ipfire an external DNS only, and take the pi-hole out of the equation. If it all works correctly, then you know that the problem has to be with the pi-hole. …but then you are on your own.

I did some more searching this morning and found this older forum post that I think is related: Pi-Hole and IPFire, which way round?

I will investigate more tonight and get back to you, thanks for the help

Okay yeah, IDK how that forum thread didn’t appear when I was first doing my research, but that appears to be the problem.

I have since done some experimentation by removing my pihole from the network and just using squid+unbound, but it didn’t yield as good of a result, although it is more secure.

Anyways, for my home network I think I will experiment with other solutions that may not be as hardened, and keep ipfire in mind for when I need a firewall on a less trustworthy network.

Thanks for sticking with me all.