Re: [DNA] High-level overview of DNA, take 2
Hi Thomas,
I like your description.
I'll go down to the questions section and see if there's
something I can add.
Thomas Narten wrote:
> Here's a much longer attempt at describing the DNA landscape at a
> high-level. My goals in writing this down include to be sure I
> understand the current thinking in the WG, and to have it in text form
> so that others can (hopefully) come up to speed more quickly. It may
> also be useful to have some or all of this text appear in the
> documents being worked on, to make them easier to understand. Finally,
> I hope this allows us to break the problem space down better in a way
> that allows us to talk about components individually and more
> constructively.
>
> At the end, I have a number of questions.
>
>
> RFC 2461/2462 defines the IPv6 behavior for configuring a link with
> IPv6 information such as addresses, MTU, on-link prefixes, etc. When a
> link comes up for the first time, a host relies on RAs to provide the
> needed configuration information. At the time 2461/2462 were written,
> little thought was given to the details of how to process a "link
> down" event (when one is available), or how to handle a "link up"
> event for an interface that was previously up, perhaps only a few
> seconds ago. With the increased proliferation of wireless networks, it
> is now common to receive "link up" indications that may (or may not)
> indicate that a configuration change has occurred at the IP layer
> (i.e., one has moved from one link to another). DNA is the process of
> (quickly) determining whether one is attached to a previously visited
> link, in order to minimize service disruptions upon attaching to a
> link.
>
> DNA requires that a host keep track of the links to which it has
> recently attached. For each known link, information about that link,
> such as default router addresses, prefix information, MTU, Lifetimes,
> etc. are kept in what are called Link Instances (aka Candidate Link
> Objects in dna-cpl). Link Instances are identified by the set of
> on-link global prefixes that have been advertised on that link, as no
> prefix may be assigned to different links simultaneously. At any given
> time, one LI is considered the Current LI, and the parameters
> associated with the CLI are what are actually configured on the link
> as defined in 2461/2462.
>
> When a "link up" event is detected, a host will need to determine what
> IP configuration parameters should be used. A simple (but naive)
> approach would be to discard all associated information and initiate
> the normal startup procedure as described in 2461/2462. However, this
> has the downside of causing upper-layer protocols using the (just
> discarded) addresses to possibly receive "hard errors" and terminate
> connections using those addresses. In the common case where a link
> goes away momentarily and then comes back up connected to the same
> link (with the same associated config information) this is obviously
> undesirable.
>
> DNA takes the approach that upon receipt of a "link up" event, the
> most-recently used configuration information should continue to be
> used, and that DNA should be initiated to (quickly) determine whether
> one is attached to a new link, or one to which one has been previously
> attached. In the simple case, DNA will quickly determine whether one
> has reattached to the current link or to another known link.
> Specifically, it will do so by sending a single RS and receiving a
> single RA, one containing a prefix that it has seen before, thereby
> identifying the link. In such cases, DNA can declare its work done,
> and any new prefixes in the received RA are added to the CLI.
>
> In some scenarios, however, things are more complex. Specifically, DNA
> may receive an RA that contains only prefixes it has never seen
> before. When this happens, it may not be sure whether the prefixes
> correspond to a new link (i.e., one not visited before), or a
> previously-visited link, but for which it has not yet seen all the
> advertised prefixes. While it might be simplest to assume that receipt
> of such an RA indicated attachment to a new link, that can lead to
> (incorrectly) invalidating other configuration information for that
> link. To handle this case, DNA takes extra steps. First, it
> distinguishes between "complete" and "incomplete" LIs. Complete LIs
> are those for which there is a high probability of having received all
> prefixes being advertised on that link. Incomplete LIs are those for
> which insufficient time has elapsed to conclude with reasonable
> certainty that it is likely that all advertised prefixes have been
> seen.
>
> When DNA receives an RA advertising (only) unknown prefixes, and the
> CLI is not "complete", DNA waits for additional RAs before deciding
> whether the link is new or not.
>
> The DNA procedure is tightly coupled with the use of 2461/2462, since
> DNA uses the same messages.
>
> There are two variants of DNA. In the first variant, hosts interact
> with routers that have implemented 2461/2462, but have not implemented
> any DNA extensions. One of the key limitations in this case is that
> RAs are not required to include all prefixes, so receipt of an RA from
> a router is not sufficient to learn all the prefixes that it (or other
> routers) might be advertising. The DNA approach for handling this case
> has been described above. In the second variant, both hosts and
> routers can be upgraded with DNA extensions. The proposed DNA
> extensions in this case include:
>
> a) routers are permitted to respond more quickly to received RSs, by
> having all routers on the link keep track of each other, and then
> stagger their responses so that if there are N routers present, one
> router responds during interval Time/N, the next during 2*Time/N,
> etc. [pentland-protocol3]
>
> b) allow hosts to include a prefix in the RS that the router echoes
> back in the RA, if the prefix is still valid on the link. That way
> the host can learn immediately if it is reconnecting to a
> previously visited link. [landmark option in pentland-protocol3]
>
> c) have RAs include a "complete" bit, that routers set to indicate
> that the RA includes all the known prefixes it is configured to
> advertise (plus the Learned Prefix Option described in d). This
> allows a host to quickly learn if any prefixes are being omitted
> from a particular RA.
>
> d) Definition of a Learned Prefix option that routers use to proxy
> prefixes that they do not themselves advertise, but that they have
> observed other routers advertising. Hosts do not use those prefixes
> to autoconfigure addresses; they do use them to learn (in a single
> RA) all prefixes believed to be in use on that link. With this
> information, the host can definitively determine whether it is
> connecting to a new link or revisiting an old one.
>
>
> Some key questions for DNA.
>
> 1) when reusing "old" config, do we need to do anything special? E.g.,
> change state of ND entries to STALE? Put addresses in Optimistic
> state? Take into account how much time has elapsed since we were
> last attached (or just assume this happens as part of normal ND
> timers?)
I don't think you should change the ND cache state unless the 2461
timers indicate you should (except if you're no longer on the link).
The exposure is about 30 seconds (+DELAY+PROBE).
With regard to addresses, even if you return to the same link, any
period of absence is one where you may have missed a DAD NS
(albeit with low probability).
If you arrive back within a second, you would probably see the
NA O=1 to all nodes from the node which started DAD while you were
away. Placing the addresses in Optimistic state makes sense here,
but it may not be necessary to send an NS (opinions welcome).
If you've been away for an entire second, the host may arrive back
after DAD has completed on a peer node. A full optimistic DAD would
be required.
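To make the timing argument concrete, here is a minimal Python sketch of
that decision. The one-second threshold is the rough figure used above,
not a value from any specification, and the names (DadAction,
dad_action_on_return, SHORT_ABSENCE_SECONDS) are made up for illustration.
Whether the short-absence case can really skip the NS is still an open
question.

```python
from enum import Enum


class DadAction(Enum):
    # Short absence: mark addresses Optimistic and listen for NAs, no NS sent.
    OPTIMISTIC_NO_PROBE = "optimistic-no-probe"
    # Longer absence: mark addresses Optimistic and run a complete DAD.
    FULL_OPTIMISTIC_DAD = "full-optimistic-dad"


# Assumption: the "about one second" figure from the discussion above.
SHORT_ABSENCE_SECONDS = 1.0


def dad_action_on_return(seconds_away: float) -> DadAction:
    """Pick a DAD strategy after reattaching to the same link."""
    if seconds_away < SHORT_ABSENCE_SECONDS:
        return DadAction.OPTIMISTIC_NO_PROBE
    return DadAction.FULL_OPTIMISTIC_DAD
```

For example, dad_action_on_return(0.2) selects the no-probe case, while
an absence of a full second or more selects the complete optimistic DAD.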
> What harm can occur from continuing to use the same information?
> how does this compare with the harm of delaying the configuration
> of an interface until one is more certain?
Real-time voice is out of the picture if you cannot configure the
interface quickly (within a few tens of ms).
If the time to configure the addresses is greater than approx 1/2 RTT,
a TCP timeout is likely to occur.
Depending on how the old information is reused, a return in under a
second is unlikely to harm anyone who isn't doing optimistic DAD
themselves (there are warnings in opti-dad about the potential for
failure). If the host doesn't watch for NAs, then it may be unaware of
competing nodes.
> 2) Upon "link up", does DNA need to delay sending an RS per the
> 2461/2462 rule? Are there ways to relax this that are consistent
> with the original concerns that led to the existing wording in
> 2461/2462?
There's new text in RFC2461bis saying that if the link-layer indicates
you may have moved, it is possible to send an RS immediately, provided
the link-layer is likely to serialize devices coming onto the medium.
draft-ietf-dna-hosts provides guidance on what to do with hints,
especially those from the link-layer, and indicates when it is not
possible to solicit immediately (Section 5.5).
> 3) Upon "link up", what messages can host send that would help it to
> quickly determine if on same link? Is it just an RS? what about
> unicast? what about an NS (e.g., perhaps including a Landmark
> option)?
>
> Are there other messages that can be sent that could help? Like NS
> queries unicast to known routers?
>
> Note: on some links, like wireless ethernet, multicast is
> (relatively) "expensive". Would it be better to just send a few
> immediate unicast NS (or RS) messages to the routers we know about?
> and send a multicast RS only if the unicast fails? Or do both in
> parallel?
Please note that the main problem in DNA is when the host has moved.
When the host has not moved, we don't really need to do much.
Therefore the additional packet to check "are we here?" to a known
destination doesn't help us achieve rapid change detection in the
important case.
When a host has moved, it is not possible to identify a unicast
destination for a querying packet a priori.
This is why we've used multicast. Perhaps there is a way for all
links to have the same (MAC) unicast destination for a serving router,
which would avoid the issue of the multicast RS leaking back onto a
wireless LAN. There are other ways of dealing with this issue
today though (e.g. group filtering).
Ideas like special destinations have been discussed previously off-list,
but have the issue that they require modification to routers.
RS/RA has the advantage that it is always there, and if people
wish to improve its performance in scheduling or efficiency,
it is possible to upgrade it (or potentially the network it is on??).
> 4) When one gets an RA containing prefixes that one has not seen
> before, is the algorithm in CPA sufficient or ideal? Are there
> other probes one could send? E.g., upon indication that a new prefix
> is available, send additional probes quickly to try and flush out
> things more quickly? Or, should the host keep additional info in
> LIs, e.g., keeping track of which router advertised what prefixes,
> so that it can distinguish "prefix from previously unknown router"
> vs. "new prefix from router we've received RAs from before"?
I think the distinction you make about added prefixes is
interesting, but without a PIO (perhaps with the Router Address (R)
flag set) in common it's really too difficult to identify if this
is really the same router because the addressing is link-local --
I mean you can guess, but it may be a different link.
As the router address flag isn't in wide use, this isn't so easy,
unless there was a common prefix between the RAs -- in which case
CPL works anyway.
> 5) When DNA determines that one is no longer connected to the same
> link as before, one swaps configs. Care must be taken that
> information learned from RAs received since the last "link up" gets
> applied to the correct LI.
Yes. This is relatively easily done (in the implementation) by
moving aside the LI when the link-indication is received, and
testing received packets until one of:
 - expiry of the New LI build (if the existing LI was not complete),
 - amalgamation of the LIs, due to a common prefix, or
 - replacement due to disjoint prefixes (if the existing LI was complete).
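A rough Python sketch of that test follows. It is only an illustration of
the three outcomes, not code from any draft: LinkInstance, classify and
the build_expired flag are hypothetical names, prefixes are plain strings,
and treating expiry of the New LI build as "new link" is my assumption
about what follows once the timer runs out with no common prefix.

```python
from dataclasses import dataclass, field


@dataclass
class LinkInstance:
    prefixes: set = field(default_factory=set)  # on-link prefixes seen so far
    complete: bool = False                      # all advertised prefixes seen?


def classify(old_li: LinkInstance, new_li: LinkInstance, build_expired: bool) -> str:
    """Return 'same-link', 'new-link' or 'wait' for the LI built since link up."""
    if not new_li.prefixes:
        return "wait"        # no RA processed yet, nothing to compare
    if old_li.prefixes & new_li.prefixes:
        return "same-link"   # common prefix: amalgamate the two LIs
    if old_li.complete:
        return "new-link"    # disjoint from a complete LI: replace it
    if build_expired:
        return "new-link"    # assumption: still disjoint when the build expires
    return "wait"            # old LI incomplete: keep collecting RAs
```

With an incomplete old LI and disjoint prefixes this returns "wait" until
either a common prefix shows up or the build timer expires.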
> 6) if DNA determines that one has reconnected to the same link, what
> else needs to be done? Rerun DAD? Do nothing? Does the answer
> depend on how long it has been since one was previously connected
> to that link?
As we discussed above, my guess is that immediate (<1s?) return has some
potential danger (low probability) due to duplicate addresses.
I don't think additional packets are warranted though.
Longer disconnections should have a complete DAD, whether original or
optimistic. Group joins and ND cache state should be governed by
existing timers, AFAICS.
> 7) Is there ever a time when one should discard still-valid prefix
> information on a link? That is, suppose a bogus router shows up and
> sends out an RA with prefix information with a long Lifetime. This
> information may stay associated with a link for weeks, because no
> router ever advertises the prefix with a Lifetime of 0. Pre-DNA,
> that info would presumably get discarded more quickly because one
> doesn't keep old LI information around. But with DNA, we may find
> this happens and causes problems. For robustness, it seems like it
> might be desirable to discard prefix information in the _absence_
> of RAs containing that prefix, in the case where other RAs continue
> to be received with other prefix information. I.e., if one knows
> that one has missed (say) N RAs, time out the info if one is
> receiving other info from other routers. (note: the situation is
> further aggravated by having routers cache RAs learned from other
> routers.)
Well, I don't think it's really that useful to keep cached information
for a long time after you have departed a link.
In draft-ietf-dna-hosts Section 4.1, a maximum cache time is
described which tackles out-of-date information such as this,
and may be used to limit the effective lifetime of stored information.
This doesn't actually reduce the valid lifetime of the prefix,
but it may age out the router state quickly.
Perhaps we should have a look at this further if you think
it's moving in the right direction.
In that case it may be necessary to attempt to tie the RA to
the PIO as in (Q4), although this would be to compare within
the same link and not between links for the purpose of
link-determination.
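For the maximum cache time mentioned above, the bookkeeping could be as
simple as the following Python sketch. The 600-second value is a
placeholder of mine, not the figure from draft-ietf-dna-hosts, and
prune_cached_links and the link labels are hypothetical. Note that this
only drops cached link state; it does not touch the valid lifetime of
any prefix.

```python
# Placeholder only: not the value from draft-ietf-dna-hosts.
MAX_CACHE_SECONDS = 600.0


def prune_cached_links(last_attached, now, max_age=MAX_CACHE_SECONDS):
    """Keep only cached links that were left at most max_age seconds ago.

    last_attached maps a (hypothetical) link label to the time, in
    seconds, at which the host was last attached to that link.
    """
    return {link: t for link, t in last_attached.items() if now - t <= max_age}
```

A host would run something like this on each "link up" before comparing
received prefixes against its stored LIs.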
I hope this helps.
Greg