MikroTik: a script for switching to a backup Internet channel

I want to share my script for switching to a backup Internet channel when the main one goes down, and for returning to the main one as soon as it works again. I'll say right away that the channels are used one at a time; there is no load balancing here. Both channels are PPP connections (in my case one is wired, the other is a 3G USB modem). The script is designed as a flexible monitoring tool, because the other options, check-gateway in particular, do not behave correctly enough for my setup.

The basic principle is simple: an established VPN channel does not mean that the Internet works through it. I check by pinging several external addresses. You can think of cases where ping is not a reliable indicator, but I leave them out; the script lets you substitute any other check that fits your situation. Other features: the backup channel is a mobile network, so it connects only when the main channel is down, and its interface is disabled the rest of the time. When returning to the main channel, its operability is verified properly, by a method other than ping with the interface parameter. Finally, the route distances of the two interfaces change dynamically and are never equal, which lets both channels be up at the same time while traffic is sent through only one of them.

If you understand how it works, you can easily adapt the script for the case where one or both providers give you a static address.

I will first describe the configuration the script needs, and then go through the main parts of the script piece by piece. The whole script is given at the end.
Suppose there are two PPP connections: ISP1 is the primary and ISP2 is the backup; both are configured and work on their own. We set dial-on-demand=no and add-default-route=yes on them, then set default-route-distance for ISP2 to one more than for ISP1. We configure the standard things: NAT, marking of connections so that replies leave through the same interface the request came in on, and routes for the marked packets:

Preparation
    /ip firewall mangle
    add action=mark-connection chain=forward connection-mark=no-mark \
        in-interface=ISP1 new-connection-mark=ISP1 passthrough=no
    add action=mark-routing chain=prerouting connection-mark=ISP1 in-interface=\
        bridge-local new-routing-mark=to_ISP1 passthrough=no
    add action=mark-connection chain=forward connection-mark=no-mark \
        in-interface=ISP2 new-connection-mark=ISP2 passthrough=no
    add action=mark-routing chain=prerouting connection-mark=ISP2 in-interface=\
        bridge-local new-routing-mark=to_ISP2 passthrough=no
    /ip firewall nat
    add action=masquerade chain=srcnat out-interface=ISP1
    add action=masquerade chain=srcnat out-interface=ISP2
    /ip route
    add distance=1 gateway=ISP1 routing-mark=to_ISP1
    add distance=1 gateway=ISP2 routing-mark=to_ISP2

Also assume that the router's local address is 192.168.xx.yy and the subnet is 192.168.xx.0/24. Replace these values, like the interface names, with your own. This is not the entire configuration yet; the rest comes in order below.

Variables
    global FailoverTimes;
    global FailoverLastTime;
    global FailoverLastBackTime;
    local ifMain "ISP1";
    local ifRes "ISP2";
    local scriptName "Failover";
    local state 0;
    local pingNum 0;
    local pingRes;
    local routeDist;
    local routeDist2;
    local tmp;
    local ip { xxxx; yyyy; zzzz };
    local pingSrcAddr 192.168.xx.yy;

We define the variables: the interface names go in ifMain and ifRes, the router's local address in pingSrcAddr (why it is needed will become clear later), and the three external addresses that are pinged to check the channel go in the ip array.

Single instance
    if ( [len [/system script job find where script=$scriptName]] > 1) do={ error "single instance" };
    delay 15;

This allows only one copy of the script to run. The delay is for the case where the script is launched at RouterOS startup: it gives the connections time to come up.

I'll skip a bit and move on to the main part. The script runs endlessly, or rather until it is stopped or an error occurs. In an infinite loop it examines the current value of the state variable and performs the corresponding actions. Let's look at them.

State 0
    if ($state = 0) do={
        do {
            if ($pingNum >= 3) do={ set $pingNum 0; }
            if ([ping ($ip->$pingNum) count=1] = 0) do={
                set $pingRes [ping ($ip->0) count=2];
                set $pingRes ($pingRes+[ping ($ip->1) count=2]);
                set $pingRes ($pingRes+[ping ($ip->2) count=2]);
                if ($pingRes = 0) do={
                    set $FailoverLastTime "$[/system clock get date] $[/system clock get time]";
                    set $FailoverTimes ([tonum $FailoverTimes] + 1);
                    set $state 1;
                    log info "$scriptName: state changed 0->1";
                }
            }
            set $pingNum ($pingNum + 1);
            if ($state = 0) do={ delay 15 };
        } while ($state = 0);
    }

State 0: the main channel is working. Once every 15 seconds we ping one of the three specified addresses in turn; if there is no reply, we ping all three addresses. If none of them replies, we initiate the switch to the backup channel. The script is hard-coded for exactly three addresses in the array; if you use a different number, you will have to adjust it.
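The check described above can be modelled off the router. Below is a minimal Python sketch (not RouterOS code) of one iteration of state 0, written so that it works for any number of addresses rather than exactly three. The function name main_channel_failed, the ping callable and the 192.0.2.x documentation addresses are assumptions made for illustration; ping(address, count) is assumed to return the number of replies, as the ping command does in the script.

```python
from typing import Callable, Sequence, Tuple


def main_channel_failed(
    addresses: Sequence[str],
    ping_num: int,
    ping: Callable[[str, int], int],
) -> Tuple[bool, int]:
    """Model one pass of the state 0 loop.

    Returns (failed, next_ping_num). The 15 second delay is left to the caller.
    """
    # Wrap the round-robin index, like "if ($pingNum >= 3)" in the script.
    if ping_num >= len(addresses):
        ping_num = 0
    failed = False
    # Cheap probe: a single ping to one address.
    if ping(addresses[ping_num], 1) == 0:
        # Confirmation: two pings to every address; fail only if all are silent.
        total_replies = sum(ping(addr, 2) for addr in addresses)
        failed = total_replies == 0
    return failed, ping_num + 1


def fake_ping(alive):
    """Hypothetical stand-in for ping: addresses in `alive` always reply."""
    return lambda addr, count: count if addr in alive else 0


addrs = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
print(main_channel_failed(addrs, 0, fake_ping({"192.0.2.2"})))  # (False, 1)
print(main_channel_failed(addrs, 3, fake_ping(set())))          # (True, 1)
```

The first call shows why a single missed ping does not trigger a switch: one address is silent, but the confirmation round still gets replies from another.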

State 1
    if ($state = 1) do={
        if ( [/interface l2tp-client get $ifMain default-route-distance] > 10) do={
            /interface ppp-client set $ifRes default-route-distance=1;
        }
        /interface enable $ifRes;
        beep frequency=2000 length=250ms;
        delay 500ms;
        beep frequency=2000 length=250ms;
        delay 500ms;
        delay 6;
        /interface disable $ifMain;
        set $routeDist ([/interface ppp-client get $ifRes default-route-distance] + 1);
        /interface l2tp-client set $ifMain default-route-distance=$routeDist;
        /interface enable $ifMain;
        set $state 2;
        log info "$scriptName: state changed 1->2";
    }

State 1: switching channels. The type of the PPP connections matters here. In the example ISP1 is an l2tp-client and ISP2 is a ppp-client. If yours are different, correct the lines that work with default-route-distance.

After enabling the backup channel we wait 7 seconds. In my case that is enough for the 3G connection to come up. During this time existing and new connections simply wait in timeouts, because the main VPN has not been torn down yet, and this minimizes replies from the router such as dest unreachable.

The sound indication is a matter of taste, since it may well go off at night. If you don't need it, remove it.

Next, the main channel is disabled, its default-route-distance is set to one more than that of the backup channel, and it is enabled again. This lets us wait for the main channel to come back without disturbing Internet access through the backup.

Looking ahead: when switching back to the main channel and disconnecting the backup, the backup's default-route-distance is in turn set to one more than the main's. So with each switch the route distances of the PPP connections keep growing. To keep them from growing too far, the current value is checked here and reset to 1 once it exceeds 10 (the exact number does not matter, it is just an example; the theoretical maximum is about 250).
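To see how the distances evolve over several failovers, here is a small Python simulation of just the distance arithmetic from states 1 and 3. It is a sketch under the assumption that the starting distances are 1 for the main channel and 2 for the backup, as in the preparation step; the function names are made up for illustration.

```python
def switch_to_backup(main_dist, res_dist, limit=10):
    """State 1: reset once the main distance exceeds the limit,
    then make the main distance one more than the backup's."""
    if main_dist > limit:
        res_dist = 1
    return res_dist + 1, res_dist


def switch_to_main(main_dist, res_dist):
    """State 3: make the backup distance one more than the main's."""
    return main_dist, main_dist + 1


main, res = 1, 2  # assumed starting values
history = []
for _ in range(8):  # eight full failover cycles
    main, res = switch_to_backup(main, res)
    history.append((main, res))
    main, res = switch_to_main(main, res)
    history.append((main, res))

for step, (m, r) in enumerate(history):
    print(step, "main:", m, "backup:", r)
```

In this run the two distances are never equal, they climb by two per cycle, and the reset kicks in on the sixth switch to the backup, when the main distance has reached 11.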

State 2
    if ($state = 2) do={
        do {
            if ( [len [/interface find where name=$ifMain and running] ] = 1) do={
                set $pingRes [ping ($ip->0) src-address=$pingSrcAddr count=2];
                set $pingRes ($pingRes+[ping ($ip->1) src-address=$pingSrcAddr count=2]);
                set $pingRes ($pingRes+[ping ($ip->2) src-address=$pingSrcAddr count=2]);
                if ($pingRes > 0) do={
                    set $state 3;
                    log info "$scriptName: state changed 2->3";
                }
            }
            if ($state = 2) do={ delay 15 };
        } while ($state = 2);
    }

State 2: waiting for the main channel to recover. Note that the state of the backup is of no interest here. If it has not connected, there is nothing we can do: all the conditions for it have been created, and we really only care about the main channel.

We wait for the main channel's VPN to come up, and then try to ping external addresses through it while the backup is still active. It is done in a complicated way, but a correct one. If you write ping xx.xx.xx.xx interface=$ifMain, then according to the developers this may or may not work. Instead, the ping is sent from the router's local address. It is assumed that this address always exists, otherwise why have a router at all. I did not use the external address of the main channel because the provider assigns it dynamically. Here is how to tell the router to send such pings through the main channel even when its route is inactive (its route distance is larger than the backup's):

Additional configuration
    /ip firewall mangle
    add action=mark-routing chain=output comment=Failover_script_rule \
        dst-address=!192.168.xx.0/24 new-routing-mark=to_ISP1 passthrough=no \
        protocol=icmp src-address=192.168.xx.yy
    /ip route rule
    add action=lookup-only-in-table routing-mark=to_ISP1 src-address=\
        192.168.xx.yy/32 table=to_ISP1

The ping traffic used here is non-standard. It is output traffic going from the router itself to an external address. Normally in such cases the router uses as src-address the address of the interface the packet leaves through. By specifying the router's local address as src-address, we effectively put the router behind the same NAT as the local network. Such traffic is then marked with the routing mark of the main channel, and the packets go through the main channel thanks to the marked route.

The second rule is also necessary. Without it, if the main channel suddenly goes down again, the pings, even though marked to_ISP1, will follow the unmarked route through the backup channel, which leads to a false return to the main channel. This is how RouterOS works: if a channel is not connected, its routes, even the marked ones, are inactive. To make this clearer, imagine that state = 2 and the main channel is up, but no traffic passes through it. The ping check then takes 6 seconds. If the main channel drops during that time, the pings start going through the backup. The second rule prevents this.
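The effect of the second rule can be illustrated with a deliberately simplified Python model. It is not how RouterOS implements routing; it only encodes the behavior described above: a marked packet uses an active marked route if there is one, and falls back to the unmarked table unless a lookup-only-in-table rule forbids it. The Route type and pick_gateway function are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class Route(NamedTuple):
    gateway: str
    distance: int
    mark: Optional[str]  # None means the main (unmarked) table
    active: bool         # False when the interface is not connected


def pick_gateway(routes: List[Route], mark: str,
                 lookup_only_in_table: bool) -> Optional[str]:
    """Return the gateway a packet with routing mark `mark` would use."""
    marked = [r for r in routes if r.mark == mark and r.active]
    if marked:
        return min(marked, key=lambda r: r.distance).gateway
    if lookup_only_in_table:
        return None  # no fallback: the ping is lost and counts as a failure
    unmarked = [r for r in routes if r.mark is None and r.active]
    return min(unmarked, key=lambda r: r.distance).gateway if unmarked else None


# The main channel has just dropped: its marked route is inactive,
# while the backup still provides an active default route.
routes = [
    Route("ISP1", 1, "to_ISP1", active=False),
    Route("ISP2", 1, None, active=True),
]
print(pick_gateway(routes, "to_ISP1", lookup_only_in_table=False))  # ISP2
print(pick_gateway(routes, "to_ISP1", lookup_only_in_table=True))   # None
```

Without the rule the test ping leaves through ISP2 and gets a reply, so the script would wrongly conclude that the main channel is back.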

Note that pings from the router to the LAN are not marked and work as usual.

State 3
    if ($state = 3) do={
        /interface disable $ifRes;
        set $routeDist ([/interface l2tp-client get $ifMain default-route-distance] + 1);
        /interface ppp-client set $ifRes default-route-distance=$routeDist;
        set $state 0;
        set $FailoverLastBackTime "$[/system clock get date] $[/system clock get time]";
        log info "$scriptName: state changed 3->0";
        beep frequency=500 length=500ms;
    }

State 3: returning to the main channel. Once pings through the main channel succeed, it is enough to disable the backup VPN and the main one will be used. Then we set the backup's default-route-distance to one more than the main's and give a sound signal. Pay attention to the type of the PPP connections and change it if necessary.

This closes the cycle, and the script returns to state 0.

Now about how the script determines the current state when it starts:

Initial state
    set $routeDist [/interface l2tp-client get $ifMain default-route-distance];
    set $routeDist2 [/interface ppp-client get $ifRes default-route-distance];
    if ($routeDist < $routeDist2) do={
        if ( [/interface get $ifMain running] = true) do={ set $state 0; } else={ set $state 1; }
    } else={
        if ( [/interface get $ifMain disabled] = true) do={ /interface enable $ifMain; }
        if ($routeDist > $routeDist2 and [/interface get $ifRes disabled] = false) do={ set $state 2; } else={ set $state 3; }
    }
    log info "$scriptName: initial state $state";

The logic here also looks complicated at first glance. Three things are analyzed: the state of ISP1, the state of ISP2, and the relation between their default route distances. Initial states 1 and 3 are non-standard and indicate a wrong configuration, but in that case the script restores everything by itself, even if it sometimes takes an unnecessary switch.
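The decision can be written out as a small pure function, which makes it easy to check every combination. This Python sketch mirrors the conditions of the script fragment above; the function and parameter names are invented for illustration, and the side effect of re-enabling a disabled main interface is left out.

```python
def initial_state(main_dist: int, res_dist: int,
                  main_running: bool, res_disabled: bool) -> int:
    """Pick the starting state from route distances and interface flags."""
    if main_dist < res_dist:
        # The main channel is preferred: either it works (0),
        # or we have to switch to the backup (1).
        return 0 if main_running else 1
    # The backup is preferred (or the distances are equal). The real script
    # also enables the main interface here if it was disabled.
    if main_dist > res_dist and not res_disabled:
        return 2  # on backup, waiting for the main channel
    return 3      # inconsistent setup: go back to the main channel


for args in [(1, 2, True, True), (1, 2, False, True),
             (3, 2, False, False), (3, 2, True, True), (2, 2, True, False)]:
    print(args, "->", initial_state(*args))
```

Equal distances should not occur in normal operation; the sketch sends that case to state 3, exactly as the else branch of the script does.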

Excluded state
There is one more state that I left out, since most people will hardly need it. My ISP1 VPN connects by name rather than by IP, and resolving this name requires the same provider's DNS, because it resolves to a local address. If you do not help the script with resolution by specifying a particular DNS server, then even after the ISP1 network becomes available the VPN will never connect: it cannot resolve the domain name because it keeps using the backup's DNS. Here is that state:

    if ($state = 2) do={
        do {
            if (([ping DNSip1 count=1] > 0) or ([ping DNSip2 count=1] > 0)) do={
                set $tmp 0;
                do { resolve VPNaddress server=DNSip1; } on-error={ };
                do { resolve VPNaddress server=DNSip2; } on-error={ };
                do { resolve VPNaddress } on-error={ set $tmp 1; };
                if ($tmp = 0) do={
                    set $state 3;
                    log info "$scriptName: state changed 2->3";
                    delay 5;
                }
            }
            if ($state = 2) do={ delay 15 };
        } while ($state = 2);
    }

Substitute your own values for DNSip1, DNSip2 and VPNaddress. All subsequent states shift by +1 accordingly.


That is basically all. It was developed and debugged on RouterOS 6.26 on an RB951G-2HnD. I make no promises for other versions, and apologies for the missing ':' in front of the commands.

In my configuration another script works together with this one; the scheduler runs it once a minute. It checks whether this script is running and, in addition, emails me the IP address when it changes. Here is a small example, only the first part:

Script monitor
    global FailoverDisabled;
    if ( [len [/system script job find where script="Failover"]] = 0 and $FailoverDisabled != 1) do={
        do { execute script="Failover"; } on-error={ log info "Failover monitor: failed to execute Failover" };
    }

The global variable makes it possible to disable launching of the failover script. Also, thanks to the scheduler, if the router reboots unexpectedly, the script is started again automatically.
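The condition the monitor evaluates is small enough to state as a truth function. In this Python sketch (names are illustrative) an unset global is modelled as None, on the assumption that an undefined FailoverDisabled does not compare equal to 1, so the failover script is started by default.

```python
from typing import Optional


def should_start_failover(running_jobs: int,
                          failover_disabled: Optional[int]) -> bool:
    """True when no Failover job is running and the kill switch is not set to 1."""
    return running_jobs == 0 and failover_disabled != 1


print(should_start_failover(0, None))  # True: not running, switch never set
print(should_start_failover(1, None))  # False: already running
print(should_start_failover(0, 1))     # False: disabled by the global variable
```

Only the exact value 1 blocks the launch, so setting the variable to anything else, or leaving it undefined, keeps the watchdog active.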

Full Failover Script
    global FailoverTimes;
    global FailoverLastTime;
    global FailoverLastBackTime;
    local ifMain "ISP1";
    local ifRes "ISP2";
    local scriptName "Failover";
    local state 0;
    local pingNum 0;
    local pingRes;
    local routeDist;
    local routeDist2;
    local tmp;
    local ip { xxxx; yyyy; zzzz };
    local pingSrcAddr 192.168.xx.yy;

    if ( [len [/system script job find where script=$scriptName]] > 1) do={ error "single instance" };
    delay 15;

    set $routeDist [/interface l2tp-client get $ifMain default-route-distance];
    set $routeDist2 [/interface ppp-client get $ifRes default-route-distance];
    if ($routeDist < $routeDist2) do={
        if ( [/interface get $ifMain running] = true) do={ set $state 0; } else={ set $state 1; }
    } else={
        if ( [/interface get $ifMain disabled] = true) do={ /interface enable $ifMain; }
        if ($routeDist > $routeDist2 and [/interface get $ifRes disabled] = false) do={ set $state 2; } else={ set $state 3; }
    }
    log info "$scriptName: initial state $state";

    do {
        if ($state = 0) do={
            do {
                if ($pingNum >= 3) do={ set $pingNum 0; }
                if ([ping ($ip->$pingNum) count=1] = 0) do={
                    set $pingRes [ping ($ip->0) count=2];
                    set $pingRes ($pingRes+[ping ($ip->1) count=2]);
                    set $pingRes ($pingRes+[ping ($ip->2) count=2]);
                    if ($pingRes = 0) do={
                        set $FailoverLastTime "$[/system clock get date] $[/system clock get time]";
                        set $FailoverTimes ([tonum $FailoverTimes] + 1);
                        set $state 1;
                        log info "$scriptName: state changed 0->1";
                    }
                }
                set $pingNum ($pingNum + 1);
                if ($state = 0) do={ delay 15 };
            } while ($state = 0);
        }
        # endof if state = 0
        if ($state = 1) do={
            if ( [/interface l2tp-client get $ifMain default-route-distance] > 10) do={
                /interface ppp-client set $ifRes default-route-distance=1;
            }
            /interface enable $ifRes;
            beep frequency=2000 length=250ms;
            delay 500ms;
            beep frequency=2000 length=250ms;
            delay 500ms;
            delay 6;
            /interface disable $ifMain;
            set $routeDist ([/interface ppp-client get $ifRes default-route-distance] + 1);
            /interface l2tp-client set $ifMain default-route-distance=$routeDist;
            /interface enable $ifMain;
            set $state 2;
            log info "$scriptName: state changed 1->2";
        }
        if ($state = 2) do={
            do {
                if ( [len [/interface find where name=$ifMain and running] ] = 1) do={
                    set $pingRes [ping ($ip->0) src-address=$pingSrcAddr count=2];
                    set $pingRes ($pingRes+[ping ($ip->1) src-address=$pingSrcAddr count=2]);
                    set $pingRes ($pingRes+[ping ($ip->2) src-address=$pingSrcAddr count=2]);
                    if ($pingRes > 0) do={
                        set $state 3;
                        log info "$scriptName: state changed 2->3";
                    }
                }
                if ($state = 2) do={ delay 15 };
            } while ($state = 2);
        }
        # endof if state = 2
        if ($state = 3) do={
            /interface disable $ifRes;
            set $routeDist ([/interface l2tp-client get $ifMain default-route-distance] + 1);
            /interface ppp-client set $ifRes default-route-distance=$routeDist;
            set $state 0;
            set $FailoverLastBackTime "$[/system clock get date] $[/system clock get time]";
            log info "$scriptName: state changed 3->0";
            beep frequency=500 length=500ms;
        }
        # bad programming protection
        delay 1;
    } while=( true );

Source: https://habr.com/ru/post/252729/

