Keepalived & DHCP fail-over issues

Hi there,

I have keepalived running on two IPFire machines, and the fail-over works fine.
Unfortunately, both DHCP servers keep responding to DHCP requests on their green0 IP addresses.
(They also put their green0 IP addresses, not the virtual IP generated by keepalived, in the “DHCP Server Identifier” option of their DHCP responses.)

Master ipfire IP ends with .2 on green.
Backup ipfire IP ends with .3 on green
Virtual ipfire IP on green is .1
/etc/dhcpd.conf has option routers set to the .1 virtual IP, which gets ignored.

I have not been able to find a way to tell the dhcp processes to listen and respond on the .1 virtual interface instead of green0. This seems to be the root cause.

Any idea?

Thanks in advance!
Peter

OK, now I am back again :slight_smile: Maybe we have similar problems?

This is related to the issues I had in the thread “DHCP web interface shows wrong status”, and I have now realised why dhcpd.conf.local was not empty.

  • Yes, DHCP now starts since I deleted the content of dhcpd.conf.local
  • But DHCP via Keepalived no longer seems to work right without the content in dhcpd.conf.local: I cannot set up the same virtual IP 192.168.222.254 as router/gateway for DHCP on both Keepalived servers.
  • How should I set things up so that both Keepalived servers hand out the same virtual IP 192.168.222.254 as router/gateway via DHCP?

When I use Keepalived, I want both servers to use the virtual IP 192.168.222.254 as the router/gateway for DHCP as well. The Keepalived page on wiki.ipfire.org says this under step 4/6, “Make “same” DHCP work during failover”:

On both IPFires. Add the same “virtual” IPFire failover firewall IP to /var/ipfire/dhcp/dhcpd.conf.local (copy from /var/ipfire/dhcp/dhcpd.conf and edit)

E.g

subnet 192.168.0.0 netmask 255.255.255.0 #GREEN
{
      range 192.168.0.50 192.168.0.150;
      option subnet-mask 255.255.255.0;
      option domain-name "localdomain";
      option routers 192.168.0.254;
      option domain-name-servers 192.168.0.254, 8.8.8.8;
      option ntp-servers 192.168.0.254;
      default-lease-time 3600;
      max-lease-time 7200;
} #GREEN

That is why dhcpd.conf.local had contents. I have tried adding content to dhcpd.conf.local again, but then the DHCP service will not start. This is what I have now (/var/ipfire/dhcp/dhcpd.conf.local is now empty):

My Primary IPFire

[root@ipfire ~]# cat  /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
     someemail@gmail.com
        }
   notification_email_from someemail@gmail.com
   smtp_server localhost
   smtp_connect_timeout 30
   router_id LVS_DEVEL
}

vrrp_instance VI_1 {
    state MASTER
    interface green0
    virtual_router_id 22
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass secretforfailover
    }
    virtual_ipaddress {
        192.168.222.254
    }
}


[root@ipfire ~]# cat /var/ipfire/dhcp/dhcpd.conf
deny bootp;     #default
authoritative;
ddns-update-style none;

subnet 192.168.222.0 netmask 255.255.255.0 #GREEN
{
pool {
        range 192.168.222.50 192.168.222.150;
     }
        option subnet-mask 255.255.255.0;
        option domain-name "localdomain";
        option routers 192.168.222.251;
        option domain-name-servers 192.168.222.251, 8.8.8.8;
        option ntp-servers 192.168.222.254;
        default-lease-time 3600;
        max-lease-time 7200;
} #GREEN

include "/var/ipfire/dhcp/dhcpd.conf.local";


Backup IPFire (it also has a blue Wifi DHCP, not used by normal users)

[root@ipfire2 ~]# cat  /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   notification_email {
     someemail@gmail.com
        }
   notification_email_from someemail@gmail.com
   smtp_server localhost
   smtp_connect_timeout 30
   router_id LVS_DEVEL
}

vrrp_instance VI_1 {
    state BACKUP
    interface green0
    virtual_router_id 22
    priority 50
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass secretforfailover
    }
    virtual_ipaddress {
        192.168.222.254
    }
}


[root@ipfire2 ~]#  cat /var/ipfire/dhcp/dhcpd.conf
deny bootp;     #default
authoritative;
ddns-update-style none;

subnet 192.168.222.0 netmask 255.255.255.0 #GREEN
{
pool {
        range 192.168.222.50 192.168.222.150;
     }
        option subnet-mask 255.255.255.0;
        option domain-name "localdomain";
        option routers 192.168.222.252;
        option domain-name-servers 192.168.222.254, 8.8.8.8;
        option ntp-servers 192.168.222.254;
        default-lease-time 3600;
        max-lease-time 7200;
} #GREEN

subnet 192.168.234.0 netmask 255.255.255.0 #BLUE
{
pool {
        range 192.168.234.100 192.168.234.150;
     }
        option subnet-mask 255.255.255.0;
        option domain-name "localdomain";
        option routers 192.168.234.254;
        option domain-name-servers 192.168.222.252, 8.8.8.8;
        default-lease-time 3600;
        max-lease-time 7200;
} #BLUE

include "/var/ipfire/dhcp/dhcpd.conf.local";

Any ideas?

Hi @raffe

I set up an IPFire clone in my vm testbed which had a green ip of 192.168.200.254 with a dynamic range of 192.168.200.50 to 192.168.200.150

I then tried copying the green section from dhcpd.conf into dhcpd.conf.local and changing the router and dns to 192.168.200.251, similar to what you did and what is in the keepalived wiki section.

When I tried restarting dhcpd using either the save button in the cgi WUI page or via
/etc/init.d/dhcp restart
I also found that dhcp stopped running and would not start.

In the logs were the following messages

Oct 26 21:26:22 ipfire dhcpd: /var/ipfire/dhcp/dhcpd.conf.local line 4: lease 192.168.200.50 is declared twice!
Oct 26 21:26:22 ipfire dhcpd:         range 192.168.200.50 192.168.200.150;
Oct 26 21:26:22 ipfire dhcpd:                                              ^
Oct 26 21:26:22 ipfire dhcpd: /var/ipfire/dhcp/dhcpd.conf.local line 4: lease 192.168.200.51 is declared twice!
Oct 26 21:26:22 ipfire dhcpd:         range 192.168.200.50 192.168.200.150;
Oct 26 21:26:22 ipfire dhcpd:                                              ^
Oct 26 21:26:22 ipfire dhcpd: /var/ipfire/dhcp/dhcpd.conf.local line 4: lease 192.168.200.52 is declared twice!
Oct 26 21:26:22 ipfire dhcpd:         range 192.168.200.50 192.168.200.150;
Oct 26 21:26:22 ipfire dhcpd:                                              ^
Oct 26 21:26:22 ipfire dhcpd: /var/ipfire/dhcp/dhcpd.conf.local line 4: lease 192.168.200.53 is declared twice!
Oct 26 21:26:22 ipfire dhcpd:         range 192.168.200.50 192.168.200.150;
Oct 26 21:26:22 ipfire dhcpd:                                              ^
Oct 26 21:26:22 ipfire dhcpd: /var/ipfire/dhcp/dhcpd.conf.local line 4: lease 192.168.200.54 is declared twice!
Oct 26 21:26:22 ipfire dhcpd:         range 192.168.200.50 192.168.200.150;
Oct 26 21:26:22 ipfire dhcpd:                                              ^

and it kept going all the way to 192.168.200.150

Basically dhcp does not like there being two subnet definitions for green with the same range of IPs.

This would suggest that the entry in the keepalived wiki regarding dhcp is either incorrect or not clear enough about what needs to be edited. Either way, a duplicate subnet and range is not acceptable to dhcp.
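
The clash reported in the log can be spotted before restarting dhcpd. What follows is only a rough sketch: `find_duplicate_ranges` is a made-up helper name, and it merely looks for identical `range` lines across the files it is given. It does not parse dhcpd.conf properly and will miss ranges that overlap without being identical.

```shell
#!/bin/sh
# Hypothetical helper (not part of IPFire or dhcpd): list "range" statements
# that appear more than once across the given configuration files.
find_duplicate_ranges() {
    # Strip indentation and collapse whitespace so formatting
    # differences do not hide a duplicate, then keep repeated lines only.
    grep -hE '^[[:space:]]*range[[:space:]]' "$@" 2>/dev/null \
        | sed -E 's/^[[:space:]]+//; s/[[:space:]]+$//; s/[[:space:]]+/ /g' \
        | sort \
        | uniq -d
}

# Example call (paths as used in this thread):
#   find_duplicate_ranges /var/ipfire/dhcp/dhcpd.conf /var/ipfire/dhcp/dhcpd.conf.local
```

If this prints anything, the same range is declared in more than one place. ISC dhcpd can also check a configuration without starting the service via `dhcpd -t -cf /var/ipfire/dhcp/dhcpd.conf`, which should report the same kind of “declared twice” errors.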

Hmm, now I’m just guessing, but maybe the above line from /var/ipfire/dhcp/dhcpd.conf should be deleted or commented out? It was working fine before, but I don’t have any old copy of dhcpd.conf to compare against for differences.

EDIT: No, the snippet below breaks DHCP!
Or maybe only have this in dhcpd.conf.local:

subnet 192.168.222.0 netmask 255.255.255.0 #GREEN
{
        option routers 192.168.222.254;
} #GREEN

I have checked through the history of dhcp.cgi in the IPFire git repository and the include statement has been present since at least 2014.

It could be that dhcpd works differently now and in the past allowed duplicate entries, taking only the last one, but I have not been able to find anything about that in the dhcpd changelogs I have searched.

dhcp is currently on version 4.4.1 and has been there since 2018. Previous to that it was 4.3.1 since 2015.

Huh, so not really any changes? This is strange.

Manually editing /var/ipfire/dhcp/dhcpd.conf from

        option routers 192.168.222.251;
        option domain-name-servers 192.168.222.251, 8.8.8.8;

To

        option routers 192.168.222.254;
        option domain-name-servers 192.168.222.254, 8.8.8.8;

works after a /etc/init.d/dhcp restart.
Maybe a script could check for this when keepalived is started, make the change, and restart DHCP? Hmm…
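
A rough sketch of what such a script might look like. Everything here is an assumption rather than something IPFire ships: the function name `set_router_option` is made up, the subnet is assumed to be a /24 (only `option routers` lines whose first three octets match the virtual IP are touched, so a BLUE subnet is left alone), and `option domain-name-servers` is not handled.

```shell
#!/bin/sh
# Hypothetical helper: point "option routers" at the keepalived virtual IP.
# Exit status: 0 = file was changed, 1 = nothing to change, 2 = error.
set_router_option() {
    conf="$1"   # path to dhcpd.conf
    vip="$2"    # virtual IP managed by keepalived, e.g. 192.168.222.254

    # Assumption: /24 network, so the first three octets identify the subnet.
    prefix_re=$(printf '%s' "${vip%.*}" | sed 's/\./\\./g')

    tmp=$(mktemp) || return 2
    sed -E "s/^([[:space:]]*option routers )${prefix_re}\.[0-9]+;/\1${vip};/" \
        "$conf" > "$tmp" || { rm -f "$tmp"; return 2; }

    if cmp -s "$conf" "$tmp"; then
        rm -f "$tmp"
        return 1
    fi
    # Write back through the original file to keep its owner and permissions.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
    return 0
}

# Example call (restart only when something actually changed):
#   set_router_option /var/ipfire/dhcp/dhcpd.conf 192.168.222.254 && /etc/init.d/dhcp restart
```

Since the WUI may rewrite dhcpd.conf on every save, a helper like this would have to be run again after each change made there.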

EDIT: As long as I don’t do any changes, I guess it could work with manually editing option routers.

There is also the possibility of using declaration blocks for failover peers in dhcpd.conf, but if those also get replaced whenever changes are made in the WUI, a simple change to option routers is easier. As the man page at https://linux.die.net/man/5/dhcpd.conf says:

The server currently does very little sanity checking, so if you configure it wrong, it will just fail in odd ways. I would recommend therefore that you either do failover or don’t do failover, but don’t do any mixed pools. Also, use the same master configuration file for both servers, and have a separate file that contains the peer declaration and includes the master file. This will help you to avoid configuration mismatches. As our implementation evolves, this will become less of a problem.

About failover peers:

https://www.madboa.com/geek/dhcp-failover/
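
For illustration only, here is a small sketch that prints a failover peer declaration for either side of a pair. The peer name "ipfire-failover", port 647 and the timing values are placeholder assumptions, not values taken from IPFire, and `failover_peer_block` is a made-up function name. The keywords themselves are the ones described in the dhcpd.conf man page; `mclt` and `split` are only emitted for the primary because they belong on that side.

```shell
#!/bin/sh
# Hypothetical generator for an ISC dhcpd "failover peer" declaration.
# Peer name, port and timing values are illustrative assumptions.
failover_peer_block() {
    role="$1"       # "primary" or "secondary"
    local_ip="$2"   # this server's own address
    peer_ip="$3"    # the other server's address
    cat <<EOF
failover peer "ipfire-failover" {
    $role;
    address $local_ip;
    port 647;
    peer address $peer_ip;
    peer port 647;
    max-response-delay 60;
    max-unacked-updates 10;
    load balance max seconds 3;
EOF
    # mclt and split are set on the primary only.
    if [ "$role" = "primary" ]; then
        printf '    mclt 3600;\n    split 128;\n'
    fi
    printf '}\n'
}

# Example call (addresses assumed to be the two real green IPs):
#   failover_peer_block primary 192.168.222.251 192.168.222.252
```

This is not a complete setup on its own: each pool that should take part also needs a `failover peer "ipfire-failover";` statement, which the dhcpd.conf files shown in this thread do not contain.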

As long as I don’t make any changes, I guess it could work with manually editing option routers. But for an easy setup, maybe this would be best:

  • Possibility to add/edit router/gateway directly in the web page /cgi-bin/dhcp.cgi
  • And then just setup the same router/gateway for both servers
  • Also maybe think about using different pools on the servers, e.g.:
    primary server range 192.168.0.50 - 192.168.0.150
    and
    backup server range 192.168.0.151 - 192.168.0.199

Do you think the possibility to add/edit router/gateway directly in the web page /cgi-bin/dhcp.cgi would be considered by the developers?

I am not sure.
While experimenting on my vm setup to reproduce your issue, I changed the router/gateway IP to a made-up one that did not exist, and at one point I lost my WUI connection to IPFire. I had to use the console connection to correct it.

I have the feeling that the developers like to avoid, where possible, setups in the WUI that someone can easily lock themselves out with.

I would suggest that you join the developers mailing list
https://wiki.ipfire.org/devel/mailing-lists
and ask the question there.

Hi all,

thanks for the interesting information and for trying to help. Unfortunately this is not addressing my issue.

My dhcp processes work fine, and so does the fail-over, on both the master and the backup IPFire.
Using Wireshark, I can also see that the router attribute in the DHCP ACK responses is set to the virtual IP (.1 in my case). That is correct, too.

It seems that my issue is the fact that the backup IPFire responds to DHCP requests. I would expect it to keep quiet until the master IPFire goes down.

I see two possible ways to achieve this, but cannot get either to work:
Option 1 (preferred): Bind dhcpd to the virtual IP instead of green0. This seems impossible, because the virtual IP is not a real interface on IPFire, so it cannot be passed to dhcpd at startup (IPFire starts it as “dhcpd -q green0”).
Option 2: Find a way to disable DHCP on green0 of the backup IPFire while the master IPFire is up.
I have no idea how this could be done.
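
One direction that might be worth exploring for Option 2 is keepalived's notify hook, which runs a script on every VRRP state change and passes the new state as the third argument. The sketch below is untested on IPFire and rests on assumptions: that /etc/init.d/dhcp accepts `start` and `stop`, that the script is saved at some path of your choosing and referenced by a `notify` line inside `vrrp_instance VI_1`, and that `handle_vrrp_state` is just an illustrative name.

```shell
#!/bin/sh
# Hypothetical notify handler: run dhcpd only while this node holds the virtual IP.
# keepalived invokes a notify script as: <script> TYPE NAME STATE [PRIORITY]

# Assumption: the IPFire init script understands "start" and "stop".
DHCP_INIT="${DHCP_INIT:-/etc/init.d/dhcp}"

handle_vrrp_state() {
    case "$1" in
        MASTER)
            "$DHCP_INIT" start
            ;;
        BACKUP|FAULT)
            "$DHCP_INIT" stop
            ;;
        *)
            echo "unhandled VRRP state: $1" >&2
            return 1
            ;;
    esac
}

# When called by keepalived, the target state is the third argument.
if [ "$#" -ge 3 ]; then
    handle_vrrp_state "$3"
fi
```

Note that this does not synchronise leases: the node that takes over starts with its own lease file and knows nothing about addresses handed out by the other one.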

Best Regards

Peter

Hi Peter!

First, I am not an expert, so I am probably wrong :slightly_smiling_face:

I think keepalived is only for failing-over an IP address, from one machine to another. Nothing more.

If you want to use real failover for DHCP, I think you have to use declaration blocks for failover peers in dhcpd.conf. See post #6 above in this thread (by raffe), under “EDIT:”.

Otherwise I think both IPFires will just be up and functioning, and if both have normal DHCP both will give IP numbers. That is why I have “solved” it with two different pools on my IPFire servers and manually edit “option routers”:

  • primary server range 192.168.0.50 - 192.168.0.150
    and
    backup server range 192.168.0.151 - 192.168.0.199
  • And I have manually changed “option routers” in dhcpd.conf on both servers to the virtual IP 192.168.0.254

Then it does not matter which server gives IP numbers, as the DHCP clients will all use the virtual failover IP as router/gateway and that is the “server” that is working “at the time”.

If dhcpd.conf gets overwritten, option routers is easier to re-edit than a full dhcpd.conf. But I think another solution could be to use the include "/var/ipfire/dhcp/dhcpd.conf.local"; line in dhcpd.conf and keep the declaration blocks for failover peers in dhcpd.conf.local. I have not had time to test that; if dhcpd.conf gets overwritten often, I may look into it.