On 06/02/07, Jacqui Caren <jacqui.caren@???> wrote:
> The added advantage of bonded interfaces is that it is not just failover
> but both NICs are in operation, so you have (almost) double bandwidth.
What you are describing is link aggregation. This is not what we are
doing when using redundant switches configured in the way I described;
at any one time the MAC address is associated with one and only one
switch port. Link aggregation is used when the host is connected to
two ports on the same switch (often they must also be sequential ports
and you must configure them as a trunked pair on the switch).
Aggregation is possible with switches which are not trunked
together, but this requires dedicated switches and restricts you to HA
links between two hosts.
http://linux-net.osdl.org/index.php/Bonding#Configuring_Bonding_for_High_Availability
So, in short, with redundant switches you don't increase the bandwidth
available to the host. That's rarely a problem if you're using Gig
ports, and if it is, a more manageable solution these days is to
upgrade to 10 Gig.
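To make the distinction concrete, here is a minimal sketch in Python of checking which mode a bond is in and which NIC is actually carrying traffic. It assumes the Linux bonding driver's status file (normally /proc/net/bonding/bond0) has roughly the shape of the sample text below. The sample, the interface names and both function names are made up for illustration, and the real file has more fields and varies between kernel versions.

```python
# Illustrative text in the general shape of /proc/net/bonding/bond0.
# This is an assumption for the example, not captured from a real host.
SAMPLE_STATUS = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up

Slave Interface: eth0
MII Status: up

Slave Interface: eth1
MII Status: up
"""


def parse_bond_status(text):
    """Return the bonding mode, the active slave and per-slave link state."""
    info = {"mode": None, "active_slave": None, "slaves": {}}
    current = None  # name of the slave section we are currently inside
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key: value"
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            info["mode"] = value
        elif key == "Currently Active Slave":
            info["active_slave"] = value
        elif key == "Slave Interface":
            current = value
            info["slaves"][current] = None
        elif key == "MII Status" and current is not None:
            # Only record link state once we are inside a slave section;
            # the first "MII Status" line describes the bond as a whole.
            info["slaves"][current] = (value == "up")
    return info


def carries_traffic(info):
    """List the NICs carrying traffic (hypothetical helper).

    In active-backup mode that is only the active slave. For any other
    mode this sketch simply assumes every slave with link up is in use.
    """
    if info["mode"] and "active-backup" in info["mode"]:
        return [info["active_slave"]] if info["active_slave"] else []
    return [name for name, up in info["slaves"].items() if up]
```

On a real host you would pass in the contents of the status file instead of SAMPLE_STATUS. With the sample above, both NICs have link but only eth0 is reported as carrying traffic, which is the failover-only behaviour described here.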
> Note: For failover having two identical NICs is just plain insane,
> as if they are from the same batch and have the same fault (seen it
> happen too many times with bulk box purchases) then they could both die
> in rapid succession.
Unfortunately, it's not always practical to use different NICs, as many
servers these days come with 2-4 onboard NICs. In some cases (blades)
it's not
possible to add more. Even in those hosts which are able to take more,
I'd choose two identical NICs for simplicity and treat two NIC
failures as a machine failure. At this point it's time to introduce
your standby host.
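As a rough sketch of that policy in Python: one NIC down means the host is degraded but still in service, all NICs down means the host is treated as failed and the standby takes over. The function name, the action strings and the interface names are all hypothetical. How link state is detected and how the standby is actually brought in depend on your own tooling.

```python
def host_action(link_up):
    """Map per-NIC link state to an action for this host.

    link_up: dict of interface name -> True if that NIC has link.
    Returns "ok", "degraded" or "failover" (labels invented for this sketch).
    """
    if not link_up:
        raise ValueError("no interfaces given")
    up = [name for name, ok in link_up.items() if ok]
    if len(up) == len(link_up):
        return "ok"        # every NIC has link
    if up:
        return "degraded"  # still reachable on a surviving NIC; plan a repair
    return "failover"      # all NICs down: treat it as a machine failure


# Example: losing one NIC keeps the host in service, losing both does not.
print(host_action({"eth0": False, "eth1": True}))
print(host_action({"eth0": False, "eth1": False}))
```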
> Finally as pointed out above you need to manage failover which means
> detection and recovery. Replacing a dead NIC is not like a hot swap
> drive - you have to take the box down which I would assume is a no-no.
> Therefore you need three different NICs, one of which is unconfigured
> - this takes over from the dead NIC when (not if, when) it dies.
In an HA situation like this you need to be able to deal with entire
host failures caused by a non-redundant component, such as a
motherboard. To do HA properly, you'd need a backup host anyway.
Otherwise you still have a single point of failure.
G