Mikrotik automatic failover using netwatch
Mikrotik automatic failover is a topic with plenty of material on the internet. However, much of that material, created by inexperienced network engineers, does not address users’ requirements. In this article, I will not only simplify the configuration of Mikrotik automatic failover using netwatch, I will also make sure the solution meets your needs.
Why Mikrotik Automatic failover?
If you have a dual-ISP connection where one link is preferred over the other, you need a system that routes all user traffic out of the primary link but automatically switches to the secondary link whenever a failure is detected on the primary. The system must also return control to the primary link as soon as the failure is resolved.
Mikrotik automatic failover configuration
On my router, ether1 is connected to ISP1 while ether2 is connected to ISP2. The bridge port is used for the LAN. The static address 192.168.11.4/24 is assigned to ether1 for ISP1, 10.0.0.19/16 is statically assigned to ether2 for ISP2, and 192.168.88.1/24 is assigned to the LAN interface. All of these are shown in the image below.
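For readers who prefer the terminal, the addressing above can be sketched roughly as follows. This assumes RouterOS v6 syntax and that the LAN bridge interface is named "bridge" (the article does not give its name); the addresses are the ones quoted in the paragraph.

```routeros
# Sketch only: RouterOS v6 CLI, addresses taken from the article.
/ip address
add address=192.168.11.4/24 interface=ether1 comment="ISP1"
add address=10.0.0.19/16 interface=ether2 comment="ISP2"
# "bridge" is an assumed interface name for the LAN bridge.
add address=192.168.88.1/24 interface=bridge comment="LAN"
```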
Next, create a WAN interface list and add ether1 and ether2 to it. The bridge port, or whatever interface you are using for your LAN, should be added to the LAN interface list. After that, go to IP > Firewall > NAT and create a masquerade rule. Choose WAN (the list created earlier) as your out interface list. The reason I created the WAN interface list was to simplify the NAT configuration. Otherwise, two NAT rules would be needed, one for each out interface.
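A rough CLI equivalent of the interface list and NAT step might look like this. It assumes RouterOS v6 syntax and a LAN bridge named "bridge"; note that a router with the default configuration may already have WAN and LAN lists, in which case the first two `add` lines are unnecessary.

```routeros
# Sketch only: skip the list creation if WAN and LAN already exist.
/interface list
add name=WAN
add name=LAN

/interface list member
add list=WAN interface=ether1
add list=WAN interface=ether2
# "bridge" is an assumed name for the LAN interface.
add list=LAN interface=bridge

# One masquerade rule covers both ISP links through the WAN list.
/ip firewall nat
add chain=srcnat out-interface-list=WAN action=masquerade comment="NAT for both ISPs"
```

Using the list rather than a single out interface means the same rule keeps working whichever link is active.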
The next step is to configure static default routes for both links. The preferred link should have an administrative distance (AD) of 1 while the backup link should have a higher AD, e.g. 10. The lower the AD, the more preferred the route. At this point, connected users should have internet access via the primary link. The backup route, though in the routing table, is inactive and only becomes active when the primary link has its cable unplugged or its static default route is disabled or removed from the routing table. That is not efficient, because a failure further upstream in the ISP's network leaves the primary route active even though it no longer reaches the internet.
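The two default routes could be entered from the CLI along these lines. The gateway addresses 192.168.11.1 and 10.0.0.1 are assumptions chosen to fit the subnets above, not values from the article, and the comments "isp1-main" and "isp2-backup" are hypothetical labels added so the routes can be found by name later.

```routeros
# Sketch only: RouterOS v6 syntax; gateway addresses are assumed.
/ip route
# Primary default route: lowest distance, so it wins while enabled.
add dst-address=0.0.0.0/0 gateway=192.168.11.1 distance=1 comment="isp1-main"
# Backup default route: only becomes active when the primary is gone.
add dst-address=0.0.0.0/0 gateway=10.0.0.1 distance=10 comment="isp2-backup"
```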
To automate path selection efficiently, we need to do three things. First, mark the connection and packets for all traffic originating from the router and destined for a reliable internet IP address such as 22.214.171.124 or 126.96.36.199. Second, create a route that uses the routing mark from step one so that traffic to the chosen address always leaves through the primary ISP link. Third, have Netwatch monitor that address. What does this do? Simple! Because the probe can only travel over the primary link, its reachability tells us whether the primary link is healthy. As long as the address is reachable, Netwatch keeps the primary ISP’s static default route enabled in the routing table. Otherwise, it disables the primary route so that the secondary route becomes active.
In the images above, the connection mark was created first and then used to create the routing mark. With the routing mark, a static route is created so that all traffic from the router to 188.8.131.52 goes through the primary link at all times. See the image below for a guide on the static default route with a routing mark.
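In CLI form, the marking and the marked route might be sketched as below. This assumes RouterOS v6 syntax (v7 replaces `routing-mark` on routes with routing tables), the mark names "probe_conn" and "via_isp1" are hypothetical, the gateway 192.168.11.1 is an assumed ISP1 gateway, and the probe address is written as it appears in the paragraph above; substitute whichever reliable host you track.

```routeros
# Sketch only: RouterOS v6 syntax, names and gateway are assumptions.
/ip firewall mangle
# Step 1: mark connections the router itself opens to the probe host.
add chain=output dst-address=188.8.131.52 action=mark-connection \
    new-connection-mark=probe_conn passthrough=yes
# Step 2: give packets of those connections a routing mark.
add chain=output connection-mark=probe_conn action=mark-routing \
    new-routing-mark=via_isp1 passthrough=no

# Step 3: marked traffic always uses the ISP1 gateway, whatever the
# state of the main default routes.
/ip route
add dst-address=0.0.0.0/0 gateway=192.168.11.1 routing-mark=via_isp1 \
    comment="isp1-probe"
```

The marked route is kept separate from the main default route so that disabling the main route does not stop the probe from testing the primary link.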
Finally, netwatch is enabled to track reachability to 184.108.40.206. It enables the default route to the primary link when the IP is up and disables the same default route when the IP is down (unreachable).
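A Netwatch entry of this kind could look roughly like the following. It assumes the primary default route carries the hypothetical comment "isp1-main"; selecting the route with `find` by comment is a substitute for the numeric form `ip route enable 0` mentioned in the comments, chosen here because it does not depend on the order of routes. The host is written as it appears in the paragraph above, and the 5-second interval is the one the article uses.

```routeros
# Sketch only: assumes the primary default route has comment "isp1-main".
/tool netwatch
add host=184.108.40.206 interval=5s \
    up-script="/ip route enable [find comment=\"isp1-main\"]" \
    down-script="/ip route disable [find comment=\"isp1-main\"]" \
    comment="ISP1 reachability check"
```

Because only the route with that comment is matched, a separate route used for the probe traffic is left untouched when the primary default route is disabled.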
With the Mikrotik automatic failover setup above, netwatch will continue to test reachability to 220.127.116.11 every 5 seconds. As long as that IP is reachable, the primary link will be in use. Otherwise, the secondary link becomes active. This solution has been tested and found efficient.
10 thoughts on “Mikrotik automatic failover using netwatch”
Question, you said 10.0.0.10 is your primary, that’s ether2. If my primary link is ether1, then the gateway should be for ether1 in my case correct?
Or would it be for ether2?
I am confused as I had it set to ether2, but then when my net went out, it disabled the wrong link it seems as I had to manually add a static route for ether2 (backup) to get it working for 0.0.0.0/0.
Can you show a pic of the static routes (all)?
That would make it easier as your guide is great I just need to understand all the routes associated with this example.
Your gateway should be ether1. Create two default routes: one for ether1 with a lower AD and the other for ether2, your secondary link, with a higher AD. Then configure netwatch to enable and disable the primary default route based on reachability to your preferred DNS address.
the zero in “ip route enable 0” and “ip route dis 0” what is that for? Like why not 1 or any number?
It identifies the route to be enabled or disabled. The first static route is for ISP1 and is always labeled 0, while the second is labeled 1. When you enable 0, ISP1 becomes your active route since it has the lower AD, but when it is disabled (dis 0), ISP2 becomes the active route.
I have two untagged links from the same ISP. How do I configure failover for when one link goes down?
If one link goes down, can traffic automatically switch over to the other link? Is that possible?
Check out my post on auto failover using netwatch.
I have a problem where once I enable the netwatch rule, the primary route gets disabled every 5 seconds, then immediately re-enabled. I’ve followed your guide as exactly as possible, using RouterOS 6.49.6. Everything on the router is defaulted except for the changes in this guide.
lmao says that “these materials created by inexperienced network engineers” and uses “ip route enable 0”, not even a “[find comment]”
Also why abuse router resources to manage marked connections when you can just add static route with dns server to isp1 gateway
Not marking the connection is the inexperienced way of doing this, because it causes control to remain with the secondary link even when the primary link comes back up.
Thanks for the guide.
The route panel changed since, like so
Also, I have a cable (modem in bridge mode) and a cellular backup (LTE built-in) WANs, both DHCP. How would that change things?