Re: [DNA] Model: how to treat "link down" events



Thomas Narten wrote:

> Right. I just started thinking about it as I was trying to understand
> what information is associated with an interface when one gets a 'link
> up' (i.e., there are some unstated assumptions being made in the DNA
> drafts I think, and I'm trying to flesh them out). It wasn't
> immediately obvious that (by default) any information would be
> associated with an interface.

OK. Since you've read it with a fresh pair of eyes you can help us make 
those assumptions explicit.

> Ah.. but a router going away (with an apparently still working link)
> strikes me as different than a link that goes down -- and for which we
> get an explicit indication from the LL that the link has gone
> down. Note: maybe ten years ago when we were doing the spec twisted
> pair ethernet was a lot less ubiquitous and the notion that one got
> 'link down' indications from the interface was a lot less
> common. Wireless was still a dream... :-)

I think we worried more about routers going away (perhaps they were 
flakier back then).

But with wireless, losing the host's attachment to the link (which isn't
really "the link" going down) is a lot more common when you're at the
edge of the coverage area or there is too much interference.

> I don't mean to assert that. I do think that we (the community as a
> whole) didn't think about this part in a lot of detail.

I agree that the community didn't think that much about it, even if we 
had some reason that we picked 7/30 days for the default preferred and 
valid lifetimes.

> What I'm trying to get at is: what assumption can be made about the
> configuration information that is already associated with an interface
> (i.e., from 2461/2462) at the time a 'link up' event takes place?
> 
> I guess the obvious model is that whatever IP configuration
> information was associated with the interface before the link up
> (subject to any lifetime considerations) is assumed to still be
> associated with the interface.

Yes.

> But think about this just from a 2461/2462 perspective (i.e., before
> we started thinking about DNA). What would happen in practice in a
> pre-DNA world?  Suppose an RA is received on an interface, but it
> indicates a new prefix that one has never heard of, and it doesn't
> include any prefixes currently associated with the
> interface. Shouldn't the old information be thrown away (since we
> might be at a new link). It needs to be discarded at some point, not
> because it has expired, but because it doesn't apply to the current
> link anymore. Some of that information will stay around for a very long
> time if one only looked at Lifetimes. [I don't recall any of this
> being discussed in 2461/2462.]

But here is where the flexibility of 2461/62 comes to bite us.
The RFCs don't require that the routers on the link advertise the same 
set of prefixes, and they don't even require that a single router put 
all its prefixes in a single RA. (The latter part of the flexibility is 
necessary should a router have more prefixes to advertise than fit in a
single 1280-byte RA.)

Thus a RA might appear with a previously unknown prefix even though we 
are still attached to the same link.
It might not be very common in the IPv6 deployment we have today, 
because I suspect there are relatively few prefixes for each link and 
that all the routers advertise the same prefixes. (But in the context of
Nemo, folks have discussed having different routers advertise different
prefixes, which actually doesn't make much sense to me.)
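
To make the ambiguity concrete, here is a minimal sketch of the naive
"does this RA share a prefix with what I already have?" check. The
function name and the return labels are made up for illustration; this
is not what CPL or any of the DNA drafts specify. The point is only that
a disjoint prefix set cannot, by itself, be read as "moved".

```python
import ipaddress


def classify_ra(known_prefixes, ra_prefixes):
    """Naive prefix-overlap check (hypothetical helper, for illustration).

    known_prefixes: prefixes already associated with the interface.
    ra_prefixes:    prefixes carried in the RA just received.
    """
    known = {ipaddress.ip_network(p) for p in known_prefixes}
    seen = {ipaddress.ip_network(p) for p in ra_prefixes}
    if known & seen:
        # At least one prefix we already had: consistent with the same link.
        return "same-link"
    # No overlap. This could be a new link, or the same link with a router
    # that advertises a different (or partial) set of prefixes.
    return "ambiguous"


print(classify_ra(["2001:db8:1::/64"], ["2001:db8:1::/64", "2001:db8:2::/64"]))
print(classify_ra(["2001:db8:1::/64"], ["2001:db8:2::/64"]))
```

The second call is the troublesome case: nothing in 2461/62 lets the
host decide between "new link" and "incomplete RA" from that one message.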

If we require a 'link up' event (delivered in order with respect to
received packets, as Mohan pointed out), then in what case could this
generality bite us when the host hasn't moved to a new link?
1. Host receives link-up event. Sends a RS.
2. The first RA it receives is one that has a prefix that it hasn't seen 
in a RA on the link before.

The probability of this RA being an unsolicited, periodic RA is quite
small, since the solicited responses will all arrive within at worst 3.5
seconds (0.5 second random delay, plus, if there are lots of hosts
sending RSes, the rate limit of at most one multicast RA every 3
seconds will kick in). Given that the default periodic RA timer is 30
minutes, it's unlikely that it will fire within that 3.5 second period.
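
As a rough back-of-the-envelope check of that estimate: the sketch below
assumes the periodic timer behaves like a fixed 30-minute period with a
uniformly random phase, and that routers fire independently. Both are
simplifications of mine (real routers randomize the interval), and the
function name is just for illustration.

```python
def prob_unsolicited_ra_in_window(window_s=3.5, period_s=30 * 60, routers=1):
    """Chance that at least one periodic RA fires inside the window.

    Assumes a fixed period with a uniformly random phase per router, and
    independent routers. These are simplifying assumptions.
    """
    per_router = min(1.0, window_s / period_s)
    return 1.0 - (1.0 - per_router) ** routers


print(f"one router:   {prob_unsolicited_ra_in_window():.2%}")
print(f"five routers: {prob_unsolicited_ra_in_window(routers=5):.2%}")
```

Under those assumptions that is roughly 0.2% for a single router and
still just under 1% with five routers on the link.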

So most likely the host will receive a solicited RA.
And most likely that RA will include prefixes that it has heard before.

But the Internet is a complex and, as a result, flaky place. The
wireless physics makes things less than perfect. Do we really want to
add more flakiness by assuming that the first RA is always a complete one?

We do need a change to the router behavior in any case to avoid waiting 
for .5 to 3.5 seconds for a RA. Can't we make CPL not introduce new 
sources of flakiness and have the DNA solution (like the ones we have on 
the table) also make sure the host can always tell for sure from a 
single RA whether it has moved or not?

>> We can't rely on timely 'link down' events for DNA, hence it makes
>> sense to only rely on 'link up' events, which is what CPL and the
>> DNA solution do.
> 
> Agreed that we can't _rely_ on them. But there are presumably cases
> where they will occur. How should they be handled? Just note them,
> but more-or-less silently ignore them?

If we require reliable and ordered (with respect to received packets)
'link up' event notifications, then DNA doesn't need to look at 'link
down' since they don't carry any useful information that helps us tell
whether a future 'link up' will imply moving to a different link or
staying on the same link.

But protocols like MOBIKE and SHIM6 want to pay attention to a 'link 
down' event.
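
For the DNA side, the model I have in mind can be sketched roughly as
follows. It assumes the link layer hands us 'link up', 'link down' and
received RAs as a single ordered stream; the event names and the helper
are hypothetical, not taken from any draft.

```python
def first_ra_after_each_link_up(events):
    """Pick the RA that the move/no-move decision would be based on.

    events: iterable of (kind, payload) tuples in delivery order, where
    kind is "link-up", "link-down" or "ra" (illustrative names only).
    """
    awaiting_ra = False
    decisions = []
    for kind, payload in events:
        if kind == "link-up":
            # Ordering guarantee: every RA after this point was received
            # on the (possibly new) link.
            awaiting_ra = True
        elif kind == "ra" and awaiting_ra:
            decisions.append(payload)
            awaiting_ra = False
        # "link-down" is deliberately ignored: it says nothing about
        # where the next 'link up' will leave us.
    return decisions


events = [("ra", "old"), ("link-down", None), ("link-up", None),
          ("ra", "A"), ("ra", "B")]
print(first_ra_after_each_link_up(events))
```

Only the RA labelled "A" is selected here; the one seen before the
'link up' and the 'link down' itself play no part in the decision.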

    Erik