Wednesday, September 14, 2011

Getting ethernet bonding to work under Debian


I have a small cluster of mixed hardware running Debian squeeze/wheezy, and several machines have been really simple to set up with bonding (aka trunking/link aggregation/port trunking/ ... the list goes on) using mode=4, or "IEEE 802.3ad Dynamic link aggregation". Not every machine cooperated, though: on some hardware the bond did not come up working at boot. My final clue to solving the problem was some dmesg output:

[   14.777381] e1000e 0000:06:00.1: eth0: changing MTU from 1500 to 9000
[   15.072476] e1000e 0000:06:00.1: irq 80 for MSI/MSI-X
[   15.128059] e1000e 0000:06:00.1: irq 80 for MSI/MSI-X
[   15.129468] ADDRCONF(NETDEV_UP): eth0: link is not ready
[   15.129473] 8021q: adding VLAN 0 to HW filter on device eth0
[   16.290994] e1000e 0000:06:00.0: eth1: changing MTU from 1500 to 9000
[   16.586584] e1000e 0000:06:00.0: irq 79 for MSI/MSI-X
[   16.640053] e1000e 0000:06:00.0: irq 79 for MSI/MSI-X
[   16.641411] ADDRCONF(NETDEV_UP): eth1: link is not ready
...
[   20.530343] bonding: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
[   20.530350] bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details.
[   20.710398] bonding: bond0: Adding slave eth0.
[   20.710415] e1000e 0000:06:00.1: eth0: changing MTU from 9000 to 1500
[   21.006430] e1000e 0000:06:00.1: irq 80 for MSI/MSI-X
[   21.060058] e1000e 0000:06:00.1: irq 80 for MSI/MSI-X
[   21.061374] 8021q: adding VLAN 0 to HW filter on device eth0
[   21.061445] bonding: bond0: Warning: failed to get speed and duplex from eth0, assumed to be 100Mb/sec and Full.
[   21.061462] bonding: bond0: enslaving eth0 as an active interface with an up link.
[   21.242433] bonding: bond0: Adding slave eth1.



If I brought bond0 down and up again later, I would get a working link. I tried adding "sleep" commands to the network init sequence to figure out whether the NIC driver was simply in some quiescent state during initialization. This didn't help, so I finally read the warning in the dmesg output about needing miimon or arp_interval/arp_ip_target for bonding to detect link failures. This was odd, because my /etc/network/interfaces file already sets bond-miimon:

iface bond0 inet manual
        bond-slaves eth0 eth1 eth2 eth3
        bond-mode 4
        bond-miimon 100
        bond-xmit-hash-policy layer3+4
        mtu 9000
        dns-search lpl.arizona.edu
        post-up ip link set $IFACE mtu 9000
        post-up sysctl -w net.ipv6.conf.all.autoconf=0
        post-up sysctl -w net.ipv6.conf.default.accept_ra=0


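As it turns out, miimon is not yet set at the moment the bonding driver is loaded; the bond-miimon line in /etc/network/interfaces has not taken effect by then, hence the warning. To solve this problem I created a new file, /etc/modprobe.d/bonding, with the following content: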

alias bond0 bonding
options bonding mode=4 miimon=100

This fixes the issue of bonding not working on boot, and the behavior probably deserves a Debian/Linux bug report.
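
To confirm that the module option took effect after a reboot, one option is to read the polling interval back from the bonding driver's status file. The following is only a sketch: the helper name bond_miimon is made up for illustration, and it assumes the status file is at /proc/net/bonding/bond0 and contains the usual "MII Polling Interval (ms):" line.

```shell
#!/bin/sh
# Hypothetical helper (not part of Debian or ifenslave): print the MII
# polling interval, in milliseconds, reported by the bonding driver.
# The status file is a parameter so another bond can be checked; the
# default matches the bond0 interface used in this post.
bond_miimon() {
    status_file="${1:-/proc/net/bonding/bond0}"
    # The driver reports a line such as "MII Polling Interval (ms): 100";
    # strip the label and print only the number.
    sed -n 's/^MII Polling Interval (ms): *//p' "$status_file"
}
```

Running bond_miimon with no argument on the bonded machine should print 100 if the options line above was picked up; a value of 0 would mean MII monitoring is still disabled.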
