Domain Name System Status: Broken | 141

Hello everybody,

I installed IPFire Core Update 141 today and I have problems with the new Domain Name System. The status is always shown as “Broken”.

I can’t use the ISP-assigned DNS servers, because they don’t support DNSSEC. That’s why, long before Core 141, I added two other DNS servers from the IPFire list.

There was no problem with the migration to 141, and both DNS servers were migrated, too. “Check DNS Servers” shows “OK” for both, but the status of the DNS system does not change from “Broken” to anything else.

I tried more DNS servers from the IPFire list. All of them showed the state “OK”, but the DNS system status is still “Broken”. I also restarted IPFire after the changes and stopped several services, so there is enough memory available.

What else can I do or check?

The first two entries are the DNS servers from my ISP (not used), the next two entries are the migrated ones, and the last two entries are ones I added after the migration:


I just upgraded my test IPFire to 141, and I no longer see the DNS servers in System > Home (they used to be below Gateway). Rebooted a few times, no change.

The DNS-Servers have moved to Network -> Domain Name System

Did you change the servers shortly before checking?
unbound needs a few seconds to be ready after config changes.

If you reload the page, it should be working as long as at least one server shows “OK”, as in your picture. Otherwise there is another problem in your unbound config.

In that case, try restarting unbound on the command line and check the logs.
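
A minimal sketch of that advice, assuming `unbound-checkconf` is installed alongside unbound and that the init script accepts a `restart` action. The `safe_restart` helper and the `CHECK_CMD` / `RESTART_CMD` variables are made up for illustration. The configuration is validated first, and the service is only restarted if it parses:

```shell
#!/bin/sh
# Validate the unbound configuration before restarting the service.
# CHECK_CMD and RESTART_CMD are illustrative, overridable defaults so the
# logic can be tried without touching a live resolver.
CHECK_CMD="${CHECK_CMD:-unbound-checkconf}"
RESTART_CMD="${RESTART_CMD:-/etc/init.d/unbound restart}"

safe_restart() {
    # unbound-checkconf reports the file and line of a syntax error and
    # exits non-zero when the configuration cannot be parsed.
    # The variables are left unquoted on purpose so that a command with
    # arguments ("/etc/init.d/unbound restart") is split into words.
    if ! $CHECK_CMD; then
        echo "configuration check failed, not restarting" >&2
        return 1
    fi
    $RESTART_CMD
}
```

After calling `safe_restart`, recent unbound messages can be looked up in the system log, for example with `grep unbound /var/log/messages | tail -n 20` (the log path is an assumption and may differ on your system).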


Hi Dieter,

I have exactly the same problem. My new DNS page looks like yours, but in my case DNS on the router is really not working; for example, a shell command like “ping google.com” just fails. Router functions like Pakfire are not working either. I’m a bit at a loss here, and I’m not certain whether this is working for other people. I did a fresh install of version 141 from scratch and then restored a backup from the previous version 139.
I would be grateful if anybody could give me some hints.

Thanks and best regards

Christoph

Hello Arne,
thanks for your answer, that helped. Reloading the page didn’t change the state, so I checked the state of unbound on the command line:

# unbound-control status
/etc/unbound/local.d/blocked.conf:1: error: syntax error
read /etc/unbound/unbound.conf failed: 1 errors in configuration file
[1582704436] unbound-control[22608:0] fatal error: could not read config file

I’ve now renamed the blocked.conf file and started unbound:
# unbound-control start

The state has now changed to “working”.

Thanks, Dieter

Hi Christoph,

try checking the state of unbound on the command line, like I did. That helped me find the problem.

I hope you will find your problem, too.

Regards, Dieter

And I’ve finally found the mistake that caused the syntax error.

I had blocked some domains in the blocked.conf file, but I forgot to start the file with a “server:” line. So it really was a syntax error. After adding this line, renaming the file back and restarting unbound, it is now working with the blocking list.
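
For anyone hitting the same error, here is a rough sketch of a check for this mistake. It assumes files in /etc/unbound/local.d/ are included in a place where option lines need their own clause header, as the error above suggests. The `has_clause_header` helper and the blocked domain are made up for illustration:

```shell
#!/bin/sh
# Report whether an unbound include file starts with a clause header
# such as "server:" before any option lines.
# has_clause_header is an illustrative helper, not part of unbound.
has_clause_header() {
    # Strip leading whitespace, drop comments and blank lines,
    # then look at the first remaining line.
    first=$(sed -e 's/^[[:space:]]*//' -e '/^#/d' -e '/^$/d' "$1" | head -n 1)
    case "$first" in
        server:*|forward-zone:*|stub-zone:*|remote-control:*) return 0 ;;
        *) return 1 ;;
    esac
}

# Demo on a throwaway file; "ads.example.com." is a placeholder domain.
tmp=$(mktemp)
printf 'server:\n    local-zone: "ads.example.com." refuse\n' > "$tmp"
if has_clause_header "$tmp"; then
    echo "clause header present"
else
    echo "missing clause header (e.g. server:)"
fi
rm -f "$tmp"
```

This only looks at the first meaningful line; `unbound-checkconf` remains the real test of whether the whole configuration parses.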

OK, I progressed a bit in the meantime:
The real problem is that unbound does not start correctly when the router boots. After the router is completely up, you need to start unbound manually from the command line; only then does DNS work. If you subsequently change anything in the DNS configuration, unbound will crash. But after the configuration change, you can start it manually again.
In the log I get the following message when unbound quits: “fatal error: Could not read config file: /etc/unbound/unbound.conf. Maybe try unbound -dd, it stays on the commandline to see more errors, or unbound-checkconf”.
The unbound.conf file seems to be correct; I don’t think that’s the real problem.

That is about as far as I got, I hope this additional info is somewhat helpful.

Christoph


What does “unbound-control status” say? I got the same fatal error as you, but this command gave me more details about the reason.

It says:
[1582708533] unbound-control[24939:0] error: connect: Connection refused for 123
unbound is stopped

O.K., that looks like a different problem. Sorry, but I can’t help with this.

Thanks anyway for your help - much appreciated.

It has something to do with the restore of the Version-139 backup file onto the newly installed Version-141 system.
In the meantime I have set up another unit with version 141 from scratch without restoring anything and, voilà, DNS is working as it should.

For the moment I will just stay on version 139. Having to recreate everything (especially all the VPN settings) manually would be a big chore.

Thank you again and kind regards

Christoph


Just to conclude: I finally figured out the problem. It is the backup and restore process.
Originally, I made a backup of version 139, then did a clean install of version 141 and restored the backup. This does not work; something goes wrong somewhere if you do this.
You first have to do just the update from 139 to 141. If you want the new disk layout, make a backup on 141, then do the fresh install and restore the 141 backup. Now everything works great!


Hi,

I just upgraded my test machine from core 138 to core 141.

Unbound simply does not resolve any request.

Ex1 - boot error - the NTP servers can’t be resolved to perform the clock update at boot:

Setting time on boot…
Error resolving 0.ipfire.pool.ntp.org: Name or service not known (-2)
Error resolving 1.ipfire.pool.ntp.org: Name or service not known (-2) [ OK ]

At the command prompt, neither google.com nor the NTP servers are resolved by unbound:

[root@black-x86-64 ~]# nslookup google.com
Server: 127.0.0.1
Address: 127.0.0.1#53

** server can’t find google.com: SERVFAIL

[root@black-x86-64 ~]# nslookup 0.ipfire.pool.ntp.org
Server: 127.0.0.1
Address: 127.0.0.1#53

** server can’t find 0.ipfire.pool.ntp.org: SERVFAIL

The logs then show that unbound is unable to resolve any request from clients in the network:

Mar 15 13:37:59 black-x86-64 unbound: [18575:1] info: generate keytag query _ta-4a5c-4f66. NULL IN
Mar 15 13:38:00 black-x86-64 unbound: [18575:1] info: validation failure dns.msftncsi.com. A IN
Mar 15 13:38:00 black-x86-64 unbound: [18575:0] info: validation failure www.msftncsi.com. A IN
Mar 15 13:38:00 black-x86-64 ntpd[17974]: new interface(s) found: waking up resolver
Mar 15 13:38:00 black-x86-64 unbound: [18575:1] info: validation failure client.wns.windows.com. A IN
Mar 15 13:38:00 black-x86-64 unbound: [18575:1] info: validation failure ipv6.msftncsi.com. A IN
Mar 15 13:38:00 black-x86-64 unbound: [18575:1] info: validation failure skydrive.wns.windows.com. A IN
Mar 15 13:38:03 black-x86-64 unbound: [18575:0] info: validation failure win8.ipv6.microsoft.com. A IN
Mar 15 13:38:03 black-x86-64 unbound: [18575:1] info: validation failure www.microsoft.com. A IN
Mar 15 13:38:03 black-x86-64 unbound: [18575:1] info: validation failure www.facebook.com. A IN
Mar 15 13:38:04 black-x86-64 unbound: [18575:1] info: validation failure fgd1.fortigate.com. A IN
Mar 15 13:38:04 black-x86-64 unbound: [18575:1] info: validation failure www.bing.com. A IN
Mar 15 13:38:05 black-x86-64 unbound: [18575:1] info: validation failure www.google.com. A IN
Mar 15 13:38:36 black-x86-64 unbound: [18575:1] info: validation failure dns.msftncsi.com. A IN
Mar 15 13:38:49 black-x86-64 unbound: [18575:1] info: validation failure client.wns.windows.com. A IN
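
A log excerpt like the one above can be condensed into a per-name count, to see whether validation fails for a few names or across the board. This is only a sketch: `summarize_validation_failures` is a made-up helper, and it assumes syslog-style lines in the format shown here.

```shell
#!/bin/sh
# Count "validation failure" log entries per queried name.
# Reads syslog-style lines on stdin and prints "<count> <name>",
# highest count first.
summarize_validation_failures() {
    awk '/unbound: .* info: validation failure / {
             # The queried name is the field right after "failure".
             for (i = 1; i < NF; i++)
                 if ($i == "failure") { count[$(i + 1)]++; break }
         }
         END { for (name in count) print count[name], name }' |
        sort -k1,1nr -k2
}
```

It could be fed with something like `grep unbound /var/log/messages | summarize_validation_failures` (the log path is an assumption). If practically every queried name shows up, that may point to a resolver-wide DNSSEC problem, such as a wrong system clock or an upstream server that does not pass on signatures, rather than to individual broken domains.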

The unbound process exists:

[root@black-x86-64 ~]# /etc/init.d/unbound status
unbound is running with Process ID(s) 15195.

dns.cgi also shows it OK:

unbound-control also shows it OK:

I restarted unbound, but this did not fix the problem above.

[root@black-x86-64 ~]# unbound-control status

version: 1.9.6
verbosity: 1
threads: 2
modules: 2 [ validator iterator ]
uptime: 1258 seconds
options: reuseport control
unbound (pid 18575) is running…

Any idea how to solve it?

PS: In Core 12x I had the DNS servers set up manually, and later moved red0 to DHCP. This is why dns.cgi lists two pairs of DNS servers: one coming from DHCP on red0 and one that I had added statically in the past. These values are in /etc/ppp/resolv.conf and seem to be copied to /etc/unbound/forward.conf.

The “ISP assigned servers” are taken from DHCP or PPP. You can disable them with the checkbox below.
In your screenshot they are listed but not used. The conversion of the settings looks OK.

What is strange is that the CGI shows “working”, because this test uses unbound to resolve “ping.ipfire.org”.

Do you run Suricata? I had serious problems with unbound and Suricata in Core 141. (This should be fixed in Core 142.)


Yes, I use Suricata…with both ET & Talos sources (merged) and all possible rules active…

Ok, I will wait for 142…my production IPFire is core 139…

Which, by the way, still needs an unbound restart to make it work (there was a fix between Core 138 and 139 to make unbound work, but on my machine I still need to restart unbound).

So many things fail because unbound does not work: Pakfire, the DNS blocking script, the Talos and ET downloads, etc.

@arne_f Arne,

DNS resolution fails with Norton DNS 199.85.127.30 and 199.85.126.30,
but works fine with Cloudflare…

It seems that the Norton DNS service does not work with the new unbound (it worked fine for the past couple of years).

H&M

Late edit: it seems the init.d/unbound script has some issues:

/etc/init.d/unbound test-name-server 199.85.127.30
199.85.127.30 is validating
/etc/init.d/unbound: line 453: [ off: command not found
199.85.127.30 supports TCP fallback
EDNS buffer size for 199.85.127.30: 4096

It’s a known bug: git.ipfire.org Git - ipfire-2.x.git/commit

But the whole function was removed in Core 141.

Arne,

I am at: Core-Update-Level: 145

I ran into this thread while trying to find an answer for pretty much the same issue as in the screenshot posted above: the DNS servers show OK, but the status is Broken. rDNS does not work. I think it may boil down to a DNSSEC issue, but thought I would get your opinion. I considered a double PAT situation, as IPFire is sitting behind an ISP modem/router, but the firewall component is disabled on the ISP equipment. I even tried enabling a status passthrough on TCP connections for a static/dynamic IP I assigned to IPFire for the MAC of the Red interface. I figured heck, what’s the point… Might as well assign it statically so it’s not a guessing game.

In any event, I have a Green to Red rule to allow free passage of traffic for port 53 TCP/UDP, although I use UDP for DNS. I thought I would allow it in case I need to test TCP.

DHCP on Green is intentionally pointing at 8.8.8.8 and 8.8.4.4, although if I can get it working, I would rather point DHCP at the Green interface of IPFire and let it handle DNS for me. Note: I have checked the box for ISP-assigned servers, saved, and checked again, with still the same issue.

The darn weird thing is that, with my IPFire Red address in passthrough mode at the ISP modem/router, I thought I would test unbound as well, since some of the folks up top here had some messages.

Here is the shell session, line by line, so you can see what I did when I put the second leg of my network edge into transparent passthrough. I find it strange that the page says “Broken” when not a single message in my testing below says anything is really broken. The message saying it can’t open ports appears because I tried to start unbound -dd before stopping the service; after that I stopped unbound and tried again manually on the CLI. Still, no real error.

[root@ipfire ~]# unbound-control status
version: 1.10.1
verbosity: 1
threads: 1
modules: 2 [ validator iterator ]
uptime: 343085 seconds
options: reuseport control
unbound (pid 3310) is running…
[root@ipfire ~]# cd /etc/unbound/local.d/
[root@ipfire local.d]# ls
[root@ipfire local.d]# unbound -dd
Jun 27 20:40:53 unbound[7736:0] error: can’t bind socket: Address already in use for 127.0.0.1 port 8953
Jun 27 20:40:53 unbound[7736:0] error: cannot open control interface 127.0.0.1 8953
Jun 27 20:40:53 unbound[7736:0] fatal error: could not open ports
[root@ipfire local.d]# unbound-control stop
ok
[root@ipfire local.d]# unbound -dd
Jun 27 20:41:12 unbound[7787:0] notice: init module 0: validator
Jun 27 20:41:12 unbound[7787:0] notice: init module 1: iterator
Jun 27 20:41:12 unbound[7787:0] info: start of service (unbound 1.10.1).
^CJun 27 20:42:41 unbound[7787:0] info: service stopped (unbound 1.10.1).
Jun 27 20:42:41 unbound[7787:0] info: server stats for thread 0: 3 queries, 3 answers from cache, 0 recursions, 0 prefetch, 0 rejected by ip ratelimiting
Jun 27 20:42:41 unbound[7787:0] info: server stats for thread 0: requestlist max 0 avg 0 exceeded 0 jostled 0
[root@ipfire local.d]# unbound-control start
[root@ipfire local.d]# unbound-control status
version: 1.10.1
verbosity: 1
threads: 1
modules: 2 [ validator iterator ]
uptime: 5 seconds
options: reuseport control
unbound (pid 8051) is running…
[root@ipfire local.d]#