Abstract
The previous article dealt with redundancy for local area network gateways. The script proposed there solved the problem at the time but had several shortcomings. Since then I have managed to eliminate them, partially rewrite the code and end up with something acceptable. The scripts have now been tested enough to be called stable. To make the whole system easier to understand, the key points of configuring the auxiliary services (as far as they concern the topic of the article) are partly repeated below. The reason is simple: in the meantime the ipfw rules were reworked, DNS moved into AD on Samba4 with a bind front end and secure record updates from isc-dhcpd via Kerberos, secondary DNS servers in the form of bind appeared on the gateways, CARP was tuned, and so on. Overall it has become much more interesting; the details of what works and how are below. Wherever a source can be linked, I link to it rather than duplicate it. Material taken from places that are no longer available is reproduced here with appropriate comments.
Introduction
So, a consumer can make the link to the outside world more fault-tolerant in two ways: by making the gateway redundant and by making the uplink redundant. In the first case a second gateway is installed in case the first one fails; in the second, a backup Internet channel is arranged in case the main one has problems, and the more the two approaches overlap, the better. If on FreeBSD the first task is solved by CARP, the second, once a second external channel is in place, can again be solved in several ways. At the very least, one can set up either traffic balancing or channel switching. Because the bandwidth of my external channels differs greatly, the first option did not suit me, so the main subject of this article was written:
ToFoIn is a set of bash scripts for diagnosing external channels and switching to a working one. After the rework it can be used with n gateways and m channels. I have trouble picturing a setup where n and m exceed 2, but the scripts should work for large n and m, since no logical limit is built in. In general, I suspect these scripts can handle a fairly wide range of tasks that depend on connection state, limited perhaps only by your imagination.

In its simplest form, the ToFoIn script set assumes a topology of roughly this kind: two gateways tied together by CARP, each attached to the same two external channels. Of course, the scripts should also work with a single router, but then the Daemon module would need substantial changes to remove the dependence of its sequence of actions on the CARP state, which would simply be absent from the system. Whether to add further redundancy for these and other nodes depends only on how important the respective services are.
Goals and objectives
The goal of the project, as before, is a universal and easily scalable software package that detects problems with external and internal connections and automatically switches to working ones. In general, the logic is as follows:
- There are n "routers" with m external channels each. All n routers form a strict hierarchy and are connected to one another through CARP on all the necessary interfaces.
- An agent runs independently on every machine. Depending on the current CARP status of its machine, its task is:
  - If backup: determine which router is currently the master and configure this machine to use it;
  - If master: check the current state of the connections and, if necessary, switch between external channels.
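The branching on CARP status described in this list can be sketched roughly as follows. This is only an illustration, not the code from the ToFoIn repository: the function names carp_state and choose_role are hypothetical, and the parsing assumes the "carp: STATE vhid N ..." line that ifconfig prints on recent FreeBSD releases.

```sh
# Hypothetical helper: read ifconfig output on stdin and print the CARP
# state (MASTER, BACKUP or INIT) reported for the given vhid.
carp_state() {
    local vhid="$1"
    awk -v vhid="$vhid" \
        '$1 == "carp:" && $3 == "vhid" && $4 == vhid { print $2; exit }'
}

# Hypothetical helper: map a CARP state to the branch of the agent logic.
choose_role() {
    case "$1" in
        MASTER) echo "test-and-switch" ;;  # check channels, switch if needed
        BACKUP) echo "scout" ;;            # find out who the master is
        *)      echo "wait" ;;             # INIT or unknown: do nothing yet
    esac
}

# Usage on a live system (interface and vhid as in the example config):
#   state=$(ifconfig vlan666 | carp_state 1)
#   choose_role "$state"
```

Keeping the parsing separate from the decision makes it easy to check both parts against captured ifconfig output without touching a live interface.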
Solution
CARP runs on the internal network; it provides gateway redundancy and means that the settings of other devices on the internal network do not have to change when channels are switched.
Dhcpd runs in primary-secondary mode, and in general it does not matter what other roles its machine plays: the dhcpd peers talk to each other over the internal network, to which the routers are always attached.
The master bind has been moved into AD, which is hidden inside the local network, while secondary bind servers run on the gateway routers as equals.
The ipfw rules differ depending on which channel is currently considered primary and are reloaded by the Daemon module when the role changes.
Finally, the scripts themselves. The files now live in the appropriate directories, run under their own user and have a startup script in rc.d. Tasks that require root access are handled through sudo. There is an installation script that takes a possibly existing installation into account, as well as a fairly detailed configuration file. The modules are the same as before with minor changes; some are almost unchanged in functionality:
Daemon - as the name implies, the main process; it starts the testing and switching modules on a timer and also tracks CARP.
Tester - tests connectivity through the external channels, still using the ping command (if it is running, it assumes that the machine's CARP state is Master).
Judge - based on the test results, determines which external channel is working and whether a switch is needed, and performs the switch (if it is running, it assumes that the machine's CARP state is Master).
Scout - a new module. It runs when CARP is in the Backup state and determines which of the other routers is currently the primary one.
Logger - responsible for logging events. It is needed so that event information is not duplicated and the log is easier to read.
Watchdog - runs on a schedule from crontab. It detects "hung" modules and, where possible, tries to fix the problem. Put simply, it kills them all.
Besides the scripts themselves, a few more important files are worth mentioning:
tofoin.conf - the single configuration file.
tofoin.log - the single event log file.
Result_<internal channel number> - a working file to which test results are appended; it is created in /tmp alongside the .pid and other working files.
I am happy to answer questions about how the modules work and to explain the design decisions in the comments.
Technical part
Equipment
Compared to last time, the gateways have moved to P4 machines and received 1536 MB of RAM and three 40 GB HDDs each (mirror + spare). The network cards are still PCI, the power supplies are ordinary, and naturally there is a UPS.
The increase in capacity is due to hardware that became free and to the needlessly tedious updating from source, but mostly the former. The OS is FreeBSD 11.1, the file system is ZFS.
System Component Settings
The kernel is built with the following additional options (some can be set in the loader instead, but this way is better):
options IPFIREWALL
/boot/loader.conf settings:
geom_mirror_load="YES"
zfs_load="YES"
kern.geom.label.gptid.enable="0"
vm.kmem_size="1024M"
vm.kmem_size_max="1024M"
vfs.zfs.arc_max="512M"
vfs.zfs.vdev.cache.size="30M"
vfs.zfs.prefetch_disable=1
kern.vty=vt
The /etc/rc.conf settings on the first machine (the CARP setting is of primary interest):
ifconfig_eth0="up"
vlans_eth0="vlan111 vlan222"
create_args_vlan111="vlan 111"
create_args_vlan222="vlan 222"
ifconfig_eth1="up"
vlans_eth1="vlan333 vlan444 vlan555"
create_args_vlan333="vlan 333"
create_args_vlan444="vlan 444"
create_args_vlan555="vlan 555"
ifconfig_eth2="up"
vlans_eth2="vlan666 vlan777 vlan888"
create_args_vlan666="vlan 666"
create_args_vlan777="vlan 777"
create_args_vlan888="vlan 888"
ifconfig_vlan666="inet 192.168.0.1/24"
ifconfig_vlan666_alias0="vhid 1 advskew 100 pass MyPassword alias 192.168.0.5/32"
ifconfig_vlan777="inet 192.168.1.1/24"
ifconfig_vlan777_alias0="vhid 1 advskew 100 pass MyPassword alias 192.168.1.5/32"
ifconfig_vlan888="inet 192.168.2.1/24"
ifconfig_vlan888_alias0="vhid 1 advskew 100 pass MyPassword alias 192.168.2.5/32"
ifconfig_vlan111="inet 192.168.3.1/30"
ifconfig_vlan111_alias0="vhid 1 advskew 100 pass MyPassword alias 1.1.1.2/24"
ifconfig_vlan222="inet 192.168.4.1/30"
ifconfig_vlan333="inet 192.168.5.1/30"
ifconfig_vlan333_alias0="vhid 1 advskew 100 pass MyPassword alias 2.2.2.2/30"
ifconfig_vlan444="inet 192.168.6.1/30"
ifconfig_vlan444_alias0="vhid 1 advskew 100 pass MyPassword alias 3.3.3.2/30"
ifconfig_vlan555="inet 192.168.7.1/30"
defaultrouter="1.1.1.1"
setfib1_enable="YES"
setfib1_defaultrouter="3.3.3.1"
setfib2_enable="YES"
setfib2_defaultrouter="2.2.2.1"
zfs_enable="YES"
named_enable="YES"
dhcpd_enable="YES"
firewall_enable="YES"
firewall_logging="YES"
firewall_script="/etc/firewall.sh"
gateway_enable="YES"
tofoin_enable="YES"
Legend:
eth0, eth1, eth2 - physical adapters
vlan666, vlan777, vlan888 - virtual LAN adapters
vlan222 and vlan555 - adapters for backup communication between the external network cards (probably no longer needed; they were actively used earlier)
vlan111 - main external channel
vlan444 - backup external channel
vlan333 - telephony
/etc/rc.conf settings on the second machine (the CARP setting is of primary interest; some of the repeated lines are omitted):
ifconfig_vlan666="inet 192.168.0.2/24"
ifconfig_vlan666_alias0="vhid 1 advskew 0 pass MyPassword alias 192.168.0.5/32"
ifconfig_vlan777="inet 192.168.1.2/24"
ifconfig_vlan777_alias0="vhid 1 advskew 0 pass MyPassword alias 192.168.1.5/32"
ifconfig_vlan888="inet 192.168.2.2/24"
ifconfig_vlan888_alias0="vhid 1 advskew 0 pass MyPassword alias 192.168.2.5/32"
ifconfig_vlan111="inet 192.168.3.2/30"
ifconfig_vlan111_alias0="vhid 1 advskew 0 pass MyPassword alias 1.1.1.2/24"
ifconfig_vlan222="inet 192.168.4.2/30"
ifconfig_vlan333="inet 192.168.5.2/30"
ifconfig_vlan333_alias0="vhid 1 advskew 0 pass MyPassword alias 2.2.2.2/30"
ifconfig_vlan444="inet 192.168.6.2/30"
ifconfig_vlan444_alias0="vhid 1 advskew 0 pass MyPassword alias 3.3.3.2/30"
ifconfig_vlan555="inet 192.168.7.2/30"
defaultrouter="1.1.1.1"
setfib1_enable="YES"
setfib1_defaultrouter="3.3.3.1"
setfib2_enable="YES"
setfib2_defaultrouter="2.2.2.1"
Some rules that will come in handy when configuring ipfw (nat):
To allow CARP traffic:
/sbin/ipfw -q add allow carp from any to any
In-kernel NAT:
/sbin/ipfw -q nat 1 config log ip vlan111 reset same_ports deny_in unreg_only
/sbin/ipfw -q add nat 1 ip from any to any in
To use specific routing tables with specific adapters:
/sbin/ipfw -q add setfib 0 all from any to any via vlan666
In general, I could easily write a separate article about my ipfw settings, but that is for another time.
Third Party Software
Since two or more external channels have to be used simultaneously, it is convenient to have several routing tables, one per channel. It would also be nice if these tables were filled in automatically at startup. The rc.d setfib script helps with this. The logic used in ToFoIn assumes that the number in the file name (setfib1, setfib2, etc.) matches the number of the table into which that script adds the default route. The default table has the number 0.
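As a rough illustration of that naming convention, the sketch below maps an rc.d script name to its table number and builds the command that would install a default route into that table. The helper names are hypothetical and the command is only printed, not executed; setfib(1) and route(8) are the standard FreeBSD utilities, and the gateway address comes from the rc.conf example.

```sh
# Hypothetical helper: "setfib2" -> "2", following the convention that the
# script name carries the routing table number.
fib_from_script_name() {
    local name="$1"
    printf '%s\n' "${name#setfib}"
}

# Hypothetical helper: print (not run) the command that adds a default
# route to routing table $1 via gateway $2.
fib_default_route_cmd() {
    local fib="$1" gateway="$2"
    printf 'setfib %s route add default %s\n' "$fib" "$gateway"
}

# Example with values from the rc.conf above:
fib_default_route_cmd "$(fib_from_script_name setfib1)" 3.3.3.1
# prints: setfib 1 route add default 3.3.3.1
```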
The DNS servers on the gateways run bind in secondary mode; the primary role belongs to samba4 + bind, hidden in the local network. Setting up a secondary bind is covered beautifully in the book DNS and BIND by Cricket Liu and Paul Albitz. I do not recall any special requirements for secondary servers arising from the use of samba4, and nothing in my settings file mentions them. The one exception: for different Internet channels you may need to create two different files, which the ToFoIn script then copies to the place from which bind reads its configuration. The reason is that if both providers' addresses are listed in the same file, then, since bind works with only one routing table, name resolution runs into upstream servers that are unreachable at that moment.
Failover isc-dhcpd. Dhcpd is not critical for ToFoIn; in fact, its absence does not affect the scripts at all. Still, it seems reasonable to me to place the DHCP server on the gateways, and then the question of failover arises anyway. Here, compared with last time, things have become more interesting. You need the failover settings that I described last time (the beginning of the "pre-setting" section inside the drop-down block).
You also need a script to securely update DNS records in AD using samba4. The samba4 server itself only needs to be installed; it does not have to be configured or started, since we are only interested in the management tools that come with it. Further information can be found in the section "DHCP with dynamic DNS updates" of the referenced guide.
It looks scary, but it works.
This completes the configuration of third-party software.
A little about ToFoIn
The full text of the project, along with the installation script, is available on gitlab.
Finally, let us look at an example of the parameters in the ToFoIn settings file.
Number of routers used in the system:
RNUMBER=2
When additional subnets are used, a default route has to be set when the router becomes the main one. Here you can specify the number of the corresponding setfib file to be restarted. In this example, setfib2:
ADDITLAN=2
Internal adapter name:
INT_IF=vlan666
All other interfaces on which routers are connected through CARP. Required to control and maintain the same state of all interfaces:
ALL_IF="vlan111 vlan333 vlan444 vlan666 vlan777 vlan888"
The vhid that was used when setting up CARP:
CARP_VHID=1
IP addresses of the other routers in the internal network, in order of importance; if necessary, continue with ASERV_IP_2, ASERV_IP_3, etc.:
ASERV_IP_1=192.168.0.2
Number of external connection channels:
CNUMBER=2
Settings for the main external connection channel:
Adapter Name:
EXT_0_IF=vlan111
Routing table number:
RTABLE_0=0
Default Gateway:
DEFAULT_GATEWAY=1.1.1.1
Settings for backup external connection channel:
Adapter Name:
EXT_1_IF=vlan444
Routing table number:
RTABLE_1=1
No default gateway is specified here: for every routing table except the main one, the rc.d script setfib<table number> is used, and its number is assumed to match the table number.
Tester module parameters:
The number of addresses that will be checked:
TNUMBER=2
Addresses of the hosts to which ping requests are sent. It is best to use a domain name first and only then an IP address:
PTARGET_0=ya.ru
PTARGET_1=8.8.8.8
The number of ping packets sent to each target:
PNUMBER=2
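A minimal sketch of how a tester might use these parameters, assuming it simply pings every target and counts the ones that answered. The function names and the PING_CMD override are hypothetical and are not taken from the ToFoIn sources; the real Tester may evaluate results differently.

```sh
# Command used for probing; overridable so the logic can be exercised
# without network access (hypothetical variable).
PING_CMD="${PING_CMD:-ping}"

# Return 0 if the target answers when sent $2 echo requests.
probe_target() {
    local target="$1" count="$2"
    "$PING_CMD" -c "$count" "$target" >/dev/null 2>&1
}

# Probe every target given after the packet count and print how many
# of them answered.
count_alive() {
    local count="$1"
    shift
    local alive=0 target
    for target in "$@"; do
        if probe_target "$target" "$count"; then
            alive=$(( $alive + 1 ))
        fi
    done
    echo "$alive"
}

# Usage with the parameters above:
#   count_alive "$PNUMBER" "$PTARGET_0" "$PTARGET_1"
```

Listing a domain name first, as recommended above, means the same probe also exercises DNS resolution, while the plain IP address still shows whether the channel itself is up.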
Judge module settings
The number of successful tests of the main channel required before returning to it. The time to return to the main channel after it recovers is approximately (WNUMBER + 1) * JUDGEPERIOD seconds.
WNUMBER=3
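As a quick check of that formula with the values from this example configuration (WNUMBER=3 and JUDGEPERIOD=300), the helper below, whose name is hypothetical, does the arithmetic:

```sh
# Approximate delay in seconds before returning to the main channel:
# (WNUMBER + 1) * JUDGEPERIOD.
return_delay() {
    local wnumber="$1" judgeperiod="$2"
    echo $(( ($wnumber + 1) * $judgeperiod ))
}

return_delay 3 300   # prints 1200, i.e. about 20 minutes
```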
Logger module settings
These two parameters set how often the Logger records repeated events. After an event has been recorded, it is next reported after LOGFREQ1 repetitions, and then after LOGFREQ2 repetitions. Only consecutive events are counted.
LOGFREQ1=5
LOGFREQ2=20
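One possible reading of this throttling rule, sketched under stated assumptions: the first occurrence is written, the next entry appears after LOGFREQ1 further repetitions, and from then on one entry is written per LOGFREQ2 repetitions. The function should_log is hypothetical and the actual Logger may count differently.

```sh
# Decide whether the n-th consecutive occurrence of the same event
# (n starts at 1) should be written to the log.
should_log() {
    local n="$1" f1="$2" f2="$3"
    # First occurrence is always logged.
    if [ "$n" -eq 1 ]; then return 0; fi
    # Logged again after LOGFREQ1 repetitions.
    if [ "$n" -eq $(( 1 + $f1 )) ]; then return 0; fi
    # After that, once per LOGFREQ2 repetitions.
    if [ "$n" -gt $(( 1 + $f1 )) ] && [ $(( ($n - 1 - $f1) % $f2 )) -eq 0 ]; then
        return 0
    fi
    return 1
}

# With LOGFREQ1=5 and LOGFREQ2=20 this logs occurrences 1, 6, 26, 46 and so on.
```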
Module start timers in seconds
Tester module launch period. It makes sense to derive it from the time that unsuccessful attempts to test all targets would take.
TESTERPERIOD=240
The launch period of the Judge module. Do not set it lower than TESTERPERIOD.
JUDGEPERIOD=300
Scout module launch period.
SCOUTPERIOD=360
The waiting period before checking the timers for the Tester and Judge modules. It is logical to set less than or equal to the value of TESTERPERIOD.
SENSITIVITY=60
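The interplay of these timers can be sketched as a loop that wakes up every SENSITIVITY seconds and starts a module once its period has elapsed. This is an assumption about how the Daemon might be organized, not its actual code; is_due and the loop variables are hypothetical names.

```sh
# Return 0 when at least $3 seconds have passed between the last start
# ($2, epoch seconds) and now ($1, epoch seconds).
is_due() {
    local now="$1" last="$2" period="$3"
    [ $(( $now - $last )) -ge "$period" ]
}

# Loop sketch (commented out so that the block has no side effects):
#   last_tester=0
#   while true; do
#       now=$(date +%s)
#       if is_due "$now" "$last_tester" "$TESTERPERIOD"; then
#           "$TESTER" &
#           last_tester=$now
#       fi
#       sleep "$SENSITIVITY"
#   done
```

With SENSITIVITY=60 and TESTERPERIOD=240, a loop of this kind would start the Tester at most about 60 seconds later than the nominal period, which is why a value no larger than TESTERPERIOD makes sense.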
The time after which a working module is considered to be hung. Used by the Watchdog module.
TESTERLIMIT=40
JUDGELIMIT=30
LOGGERLIMIT=20
SCOUTLIMIT=120
WATCHDOGLIMIT=150
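A hedged sketch of the kind of check a watchdog could make with these limits, assuming they are given in seconds and that each module writes its PID to a file. The helpers is_hung and pid_alive are hypothetical; how the real Watchdog determines a module's start time is not shown here.

```sh
# Return 0 if a module started at $2 (epoch seconds) has, at time $1,
# been running longer than its limit $3.
is_hung() {
    local now="$1" started="$2" limit="$3"
    [ $(( $now - $started )) -gt "$limit" ]
}

# Return 0 if the PID stored in the given file belongs to a live process.
pid_alive() {
    local pidfile="$1" pid
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Possible use: if pid_alive "$TESTER_PID" and is_hung reports that
# TESTERLIMIT is exceeded, terminate the process and remove the PID file.
```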
Paths to files and directories
The path to the ipfw script.
FIRESCRIPT=/etc/firewall.sh
Ipfw settings. If the ipfw settings are not kept in a separate file, set FIRESETDEF to the same value as FIRESCRIPT.
FIRESETDEF=/etc/firewall/config
Path to ipfw settings for the main external channel:
FIRESET_0=/etc/firewall/config_0
The path to the ipfw settings for the backup external channel; if necessary, continue with FIRESET_2, etc.:
FIRESET_1=/etc/firewall/config_1
Bind settings paths
BINDSETDEF=/usr/local/etc/namedb/named.conf
Bind settings for the main external channel:
BINDSET_0=/usr/local/etc/namedb/named.conf.0
Bind settings for the backup external channel; if necessary, continue with BINDSET_2, etc.:
BINDSET_1=/usr/local/etc/namedb/named.conf.1
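The per-channel files (FIRESET_n, BINDSET_n) suggest a simple switching step: copy the file for the active channel over the default one and let the service re-read it. The sketch below shows only the copy step, with a hypothetical function name; the reload itself is left out because it depends on the local setup.

```sh
# Copy the settings file prepared for channel $2 over the active file.
# $1 is the variable-name prefix, e.g. FIRESET or BINDSET, so that
# FIRESET_1 is copied to FIRESETDEF and BINDSET_1 to BINDSETDEF.
activate_channel_config() {
    local prefix="$1" channel="$2" src dst
    # Indirect lookup of the <prefix>_<channel> and <prefix>DEF variables.
    eval "src=\${${prefix}_${channel}:-}"
    eval "dst=\${${prefix}DEF:-}"
    [ -n "$src" ] && [ -n "$dst" ] && [ -r "$src" ] || return 1
    cp "$src" "$dst"
}

# Usage when switching to the backup channel (index 1):
#   activate_channel_config FIRESET 1
#   activate_channel_config BINDSET 1
```

The function fails without touching the active file when the per-channel file is missing, so a misconfigured channel index does not leave the service without a configuration.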
Paths to all executable ToFoIn files:
DAEMON=/usr/local/sbin/tofoin/daemon.sh
TESTER=/usr/local/sbin/tofoin/tester.sh
JUDGE=/usr/local/sbin/tofoin/judge.sh
LOGGER=/usr/local/sbin/tofoin/logger.sh
SCOUT=/usr/local/sbin/tofoin/scout.sh
WATCHDOG=/usr/local/sbin/tofoin/watchdog.sh
The event log. This file is now NOT created during installation:
LOGFILE=/var/log/tofoin.log
Temporary files and directories are created when the respective modules are started, some are deleted when stopped:
DIR_TMP=/tmp/tofoin
DIR_PID=/var/run/tofoin
JUDGEMETER=/tmp/tofoin/judgemeter
PREVSTATE=/tmp/tofoin/prevstate
SCOUTGATE=/tmp/tofoin/scoutgate
LOGTMP=/tmp/tofoin/logger.tmp
LOGMETER=/tmp/tofoin/logmeter
DAEMON_PID=/var/run/tofoin/daemon.pid
TESTER_PID=/var/run/tofoin
SCOUT_PID=/var/run/tofoin/scout.pid
JUDGE_PID=/var/run/tofoin/judge.pid
LOGGER_PID=/var/run/tofoin/logger.pid
WATCHDOG_PID=/var/run/tofoin_watchdog.pid
Summary
The result is a quite workable and reliable set of scripts that copes well with switching to a working channel in the case of 2 routers with 2 external communication channels.
Plans
My plans for this project include, perhaps, a rewrite from bash to pure sh, to get rid of the extra software on the server. On the other hand, everything works wonderfully now and I do not really want to interfere; besides, moving to sh would mean uglier language constructs to achieve the same result.
Beyond that, it would probably be worth thinking about a better implementation of the testing modules.
References:
Previous article
ToFoIn project page on gitlab