RIP
RIP is a routing protocol based on XNS (Xerox Network System) adapted for IP networks. It is used by many routers (Proteon, cisco, UB…) and many BSD Unix systems BSD systems typically run a program called "routed" to exchange information with other systems running RIP. RIP works best for nets of small diameter where the links are of equal speed. The reason for this is that the metric used to determine which path is best is the hop-count. A hop is a traversal across a gateway. So, all machines on the same Ethernet are zero hops away. If a router connects connects two net- works directly, a machine on the other side of the router is one hop away…. As the routing information is passed through a gateway, the gateway adds one to the hop counts to keep them consistent across the net- work. The diameter of a network is defined as the largest hop-count possible within a network. Unfor- tunately, a hop count of 16 is defined as infinity in RIP meaning the link is down. Therefore, RIP will not allow hosts separated by more than 15 gateways in the RIP space to communicate.
The other problem with hop-count metrics is that if links have different speeds, that difference is not
-11-
reflected in the hop-count. So a one hop satellite link (with a .5 sec delay) at 56kb would be used instead of a two hop T1 connection. Congestion can be viewed as a decrease in the efficacy of a link. So, as a link gets more congested, RIP will still know it is the best hop-count route and congest it even more by throwing more packets on the queue for that link.
The protocol is not well documented. A group of people are working on producing an RFC to both define the current RIP and to do some extensions to it to allow it to better cope with larger networks. Currently, the best documentation for RIP appears to be the code to BSD "routed".
Routed
The ROUTED program, which does RIP for 4.2BSD systems, has many options. One of the most frequently used is: "routed -q" (quiet mode) which means listen to RIP infor- mation but never broadcast it. This would be used by a machine on a network with multiple RIP speaking gate- ways. It allows the host to determine which gateway is best (hopwise) to use to reach a distant network. (Of course you might want to have a default gateway to prevent having to pass all the addresses known to the Internet around with RIP).
There are two ways to insert static routes into "routed", the "/etc/gateways" file and the "route add" command. Static routes are useful if you know how to reach a distant network, but you are not receiving that route using RIP. For the most part the "route add" command is preferable to use. The reason for this is that the command adds the route to that machine's routing table but does not export it through RIP. The "/etc/gateways" file takes precedence over any routing information received through a RIP update. It is also broadcast as fact in RIP updates produced by the host without question, so if a mistake is made in the "/etc/gateways" file, that mistake will soon permeate the RIP space and may bring the network to its knees.
One of the problems with "routed" is that you have very little control over what gets broadcast and what doesn't. Many times in larger networks where various parts of the network are under different administrative controls, you would like to pass on through RIP only nets which you receive from RIP and you know are reasonable. This prevents people from adding IP addresses to the network which may be illegal and you being responsible for passing them on to the Internet. This
-12-
type of reasonability checks are not available with "routed" and leave it usable, but inadequate for large networks.
Hello (RFC-891)
Hello is a routing protocol which was designed and implemented in a experimental software router called a "Fuzzball" which runs on a PDP-11. It does not have wide usage, but is the routing protocol currently used on the NSFnet backbone. The data transferred between nodes is similar to RIP (a list of networks and their metrics). The metric, however, is milliseconds of delay. This allows Hello to be used over nets of various link speeds and performs better in congestive situations.
One of the most interesting side effects of Hello based networks is their great timekeeping ability. If you consider the problem of measuring delay on a link for the metric, you find that it is not an easy thing to do. You cannot measure round trip time since the return link may be more congested, of a different speed, or even not there. It is not really feasible for each node on the network to have a builtin WWV (nationwide radio time standard) receiver. So, you must design an algorithm to pass around time between nodes over the network links where the delay in transmission can only be approximated. Hello routers do this and in a nationwide network maintain synchronized time within milliseconds.
Exterior Gateway Protocol (EGP RFC-904)
EGP is not strictly a routing protocol, it is a reacha- bility protocol. It tells only if nets can be reached through a particular gateway, not how good the connec- tion is. It is the standard by which gateways to local nets inform the ARPAnet of the nets they can reach. There is a metric passed around by EGP but its usage is not standardized formally. Its typical value is value is 1 to 8 which are arbitrary goodness of link values understood by the internal DDN gateways. The smaller the value the better and a value of 8 being unreach- able. A quirk of the protocol prevents distinguishing between 1 and 2, 3 and 4…, so the usablity of this as a metric is as three values and unreachable. Within NSFnet the values used are 1, 3, and unreachable. Many routers talk EGP so they can be used for ARPAnet gateways.
-13-
Gated
So we have regional and campus networks talking RIP among themselves, the NSFnet backbone talking Hello, and the DDN speaking EGP. How do they interoperate? In the beginning there was static routing, assembled into the Fuzzball software configured for each site. The problem with doing static routing in the middle of the network is that it is broadcast to the Internet whether it is usable or not. Therefore, if a net becomes unreachable and you try to get there, dynamic routing will immediately issue a net unreachable to you. Under static routing the routers would think the net could be reached and would continue trying until the application gave up (in 2 or more minutes). Mark Fedor of Cornell (fedor@devvax.tn.cornell.edu) attempted to solve these problems with a replacement for "routed" called "gated".
"Gated" talks RIP to RIP speaking hosts, EGP to EGP speakers, and Hello to Hello'ers. These speakers frequently all live on one Ethernet, but luckily (or unluckily) cannot understand each others ruminations. In addition, under configuration file control it can filter the conversion. For example, one can produce a configuration saying announce RIP nets via Hello only if they are specified in a list and are reachable by way of a RIP broadcast as well. This means that if a rogue network appears in your local site's RIP space, it won't be passed through to the Hello side of the world. There are also configuration options to do static routing and name trusted gateways.
This may sound like the greatest thing since sliced bread, but there is a catch called metric conversion. You have RIP measuring in hops, Hello measuring in milliseconds, and EGP using arbitrary small numbers. The big questions is how many hops to a millisecond, how many milliseconds in the EGP number 3…. Also, remember that infinity (unreachability) is 16 to RIP, 30000 or so to Hello, and 8 to the DDN with EGP. Getting all these metrics to work well together is no small feat. If done incorrectly and you translate an RIP of 16 into an EGP of 6, everyone in the ARPAnet will still think your gateway can reach the unreachable and will send every packet in the world your way. For these reasons, Mark requests that you consult closely with him when configuring and using "gated".
-14-
"Names"
All routing across the network is done by means of the IP address associated with a packet. Since humans find it difficult to remember addresses like 128.174.5.50, a symbolic name register was set up at the NIC where people would say "I would like my host to be named 'uiucuxc'". Machines connected to the Internet across the nation would connect to the NIC in the middle of the night, check modification dates on the hosts file, and if modified move it to their local machine. With the advent of workstations and micros, changes to the host file would have to be made nightly. It would also be very labor intensive and consume a lot of network bandwidth. RFC-882 and a number of others describe domain name service, a distributed data base system for mapping names into addresses.
We must look a little more closely into what's in a name. First, note that an address specifies a particular connec- tion on a specific network. If the machine moves, the address changes. Second, a machine can have one or more names and one or more network addresses (connections) to different networks. Names point to a something which does useful work (i.e. the machine) and IP addresses point to an interface on that provider. A name is a purely symbolic representation of a list of addresses on the network. If a machine moves to a different network, the addresses will change but the name could remain the same.
Domain names are tree structured names with the root of the tree at the right. For example:
uxc.cso.uiuc.edu
is a machine called 'uxc' (purely arbitrary), within the subdomains method of allocation of the U of I) and 'uiuc' (the University of Illinois at Urbana), registered with 'edu' (the set of educational institutions).
A simplified model of how a name is resolved is that on the user's machine there is a resolver. The resolver knows how to contact across the network a root name server. Root servers are the base of the tree structured data retrieval system. They know who is responsible for handling first level domains (e.g. 'edu'). What root servers to use is an installation parameter. From the root server the resolver finds out who provides 'edu' service. It contacts the 'edu' name server which supplies it with a list of addresses of servers for the subdomains (like 'uiuc'). This action is repeated with the subdomain servers until the final sub- domain returns a list of addresses of interfaces on the host in question. The user's machine then has its choice of which of these addresses to use for communication.
-15-
A group may apply for its own domain name (like 'uiuc' above). This is done in a manner similar to the IP address allocation. The only requirements are that the requestor have two machines reachable from the Internet, which will act as name servers for that domain. Those servers could also act as servers for subdomains or other servers could be designated as such. Note that the servers need not be located in any particular place, as long as they are reach- able for name resolution. (U of I could ask Michigan State to act on its behalf and that would be fine). The biggest problem is that someone must do maintenance on the database. If the machine is not convenient, that might not be done in a timely fashion. The other thing to note is that once the domain is allocated to an administrative entity, that entity can freely allocate subdomains using what ever manner it sees fit.
The Berkeley Internet Name Domain (BIND) Server implements the Internet name server for UNIX systems. The name server is a distributed data base system that allows clients to name resources and to share that information with other net- work hosts. BIND is integrated with 4.3BSD and is used to lookup and store host names, addresses, mail agents, host information, and more. It replaces the "/etc/hosts" file for host name lookup. BIND is still an evolving program. To keep up with reports on operational problems, future design decisions, etc, join the BIND mailing list by sending a request to "bind-request@ucbarp.Berkeley.EDU". BIND can also be obtained via anonymous FTP from ucbarpa.berkley.edu.
There are several advantages in using BIND. One of the most important is that it frees a host from relying on "/etc/hosts" being up to date and complete. Within the .uiuc.edu domain, only a few hosts are included in the host table distributed by SRI. The remainder are listed locally within the BIND tables on uxc.cso.uiuc.edu (the server machine for most of the .uiuc.edu domain). All are equally reachable from any other Internet host running BIND.
BIND can also provide mail forwarding information for inte- rior hosts not directly reachable from the Internet. These hosts can either be on non-advertised networks, or not con- nected to a network at all, as in the case of UUCP-reachable hosts. More information on BIND is available in the "Name Server Operations Guide for BIND" in "UNIX System Manager's Manual", 4.3BSD release.
There are a few special domains on the network, like SRI- NIC.ARPA. The 'arpa' domain is historical, referring to hosts registered in the old hosts database at the NIC. There are others of the form NNSC.NSF.NET. These special domains are used sparingly and require ample justification. They refer to servers under the administrative control of
-16-
the network rather than any single organization. This allows for the actual server to be moved around the net while the user interface to that machine remains constant. That is, should BBN relinquish control of the NNSC, the new provider would be pointed to by that name.
In actuality, the domain system is a much more general and complex system than has been described. Resolvers and some servers cache information to allow steps in the resolution to be skipped. Information provided by the servers can be arbitrary, not merely IP addresses. This allows the system to be used both by non-IP networks and for mail, where it may be necessary to give information on intermediate mail bridges.
What's wrong with Berkeley Unix
University of California at Berkeley has been funded by DARPA to modify the Unix system in a number of ways. Included in these modifications is support for the Internet protocols. In earlier versions (e.g. BSD 4.2) there was good support for the basic Internet protocols (TCP, IP, SMTP, ARP) which allowed it to perform nicely on IP ether- nets and smaller Internets. There were deficiencies, how- ever, when it was connected to complicated networks. Most of these problems have been resolved under the newest release (BSD 4.3). Since it is the springboard from which many vendors have launched Unix implementations (either by porting the existing code or by using it as a model), many implementations (e.g. Ultrix) are still based on BSD 4.2. Therefore, many implementations still exist with the BSD 4.2 problems. As time goes on, when BSD 4.3 trickles through vendors as new release, many of the problems will be resolved. Following is a list of some problem scenarios and their handling under each of these releases.
ICMP redirects
Under the Internet model, all a system needs to know to get anywhere in the Internet is its own address, the address of where it wants to go, and how to reach a gateway which knows about the Internet. It doesn't have to be the best gateway. If the system is on a network with multiple gateways, and a host sends a packet for delivery to a gateway which feels another directly connected gateway is more appropriate, the gateway sends the sender a message. This message is an ICMP redirect, which politely says "I'll deliver this message for you, but you really ought to use that gate- way over there to reach this host". BSD 4.2 ignores these messages. This creates more stress on the gate- ways and the local network, since for every packet
-17-
sent, the gateway sends a packet to the originator. BSD 4.3 uses the redirect to update its routing tables, will use the route until it times out, then revert to the use of the route it thinks is should use. The whole process then repeats, but it is far better than one per packet.
Trailers
An application (like FTP) sends a string of octets to TCP which breaks it into chunks, and adds a TCP header. TCP then sends blocks of data to IP which adds its own headers and ships the packets over the network. All this prepending of the data with headers causes memory moves in both the sending and the receiving machines. Someone got the bright idea that if packets were long and they stuck the headers on the end (they became trailers), the receiving machine could put the packet on the beginning of a page boundary and if the trailer was OK merely delete it and transfer control of the page with no memory moves involved. The problem is that trailers were never standardized and most gateways don't know to look for the routing information at the end of the block. When trailers are used, the machine typically works fine on the local network (no gateways involved) and for short blocks through gateways (on which trailers aren't used). So TELNET and FTP's of very short files work just fine and FTP's of long files seem to hang. On BSD 4.2 trailers are a boot option and one should make sure they are off when using the Internet. BSD 4.3 negotiates trailers, so it uses them on its local net and doesn't use them when going across the network.
Retransmissions
TCP fires off blocks to its partner at the far end of the connection. If it doesn't receive an acknowledge- ment in a reasonable amount of time it retransmits the blocks. The determination of what is reasonable is done by TCP's retransmission algorithm. There is no correct algorithm but some are better than others, where better is measured by the number of retransmis- sions done unnecessarily. BSD 4.2 had a retransmission algorithm which retransmitted quickly and often. This is exactly what you would want if you had a bunch of machines on an ethernet (a low delay network of large bandwidth). If you have a network of relatively longer delay and scarce bandwidth (e.g. 56kb lines), it tends to retransmit too aggressively. Therefore, it makes the networks and gateways pass more traffic than is really necessary for a given conversation. Retransmis- sion algorithms do adapt to the delay of the network
-18-
after a few packets, but 4.2's adapts slowly in delay situations. BSD 4.3 does a lot better and tries to do the best for both worlds. It fires off a few retransmissions really quickly assuming it is on a low delay network, and then backs off very quickly. It also allows the delay to be about 4 minutes before it gives up and declares the connection broken.
-19-
Appendix A
References to Remedial Information
Quaterman and Hoskins, "Notable Computer Networks",
Communications of the ACM, Vol 29, #10, pp. 932-971
(October, 1986).
Tannenbaum, Andrew S., Computer Networks, Prentice
Hall, 1981.
Hedrick, Chuck, Introduction to the Internet Protocols,
Anonymous FTP from topaz.rutgers.edu, directory
pub/tcp-ip-docs, file tcp-ip-intro.doc.
-20-
Appendix B
List of Major RFCs
RFC-768 User Datagram Protocol (UDP)
RFC-791 Internet Protocol (IP)
RFC-792 Internet Control Message Protocol (ICMP)
RFC-793 Transmission Control Protocol (TCP)
RFC-821 Simple Mail Transfer Protocol (SMTP)
RFC-822 Standard for the Format of ARPA Internet Text Messages
RFC-854 Telnet Protocol
RFC-917 * Internet Subnets
RFC-919 * Broadcasting Internet Datagrams
RFC-922 * Broadcasting Internet Datagrams in the Presence of Subnets
RFC-940 * Toward an Internet Standard Scheme for Subnetting
RFC-947 * Multi-network Broadcasting within the Internet
RFC-950 * Internet Standard Subnetting Procedure
RFC-959 File Transfer Protocol (FTP)
RFC-966 * Host Groups: A Multicast Extension to the Internet Protocol
RFC-988 * Host Extensions for IP Multicasting
RFC-997 * Internet Numbers
RFC-1010 * Assigned Numbers
RFC-1011 * Official ARPA-Internet Protocols
RFC's marked with the asterisk (*) are not included in
the 1985 DDN Protocol Handbook.
Note: This list is a portion of a list of RFC's by
topic retrieved from the NIC under NETINFO:RFC-SETS.TXT
(anonymous FTP of course).
The following list is not necessary for connection to
the Internet, but is useful in understanding the domain
system, mail system, and gateways:
RFC-882 Domain Names - Concepts and Facilities
RFC-883 Domain Names - Implementation
RFC-973 Domain System Changes and Observations
RFC-974 Mail Routing and the Domain System
RFC-1009 Requirements for Internet Gateways
-21-
Appendix C
Contact Points for Network Information
Network Information Center (NIC)
DDN Network Information Center SRI International, Room EJ291 333 Ravenswood Avenue Menlo Park, CA 94025 (800) 235-3155 or (415) 859-3695 NIC@SRI-NIC.ARPA
NSF Network Service Center (NNSC)
NNSC
BBN Laboratories Inc.
10 Moulton St.
Cambridge, MA 02238
(617) 497-3400
NNSC@NNSC.NSF.NET
-22-
Glossary
core gateway
The innermost gateways of the ARPAnet. These gateways have a total picture of the reacha- bility to all networks known to the ARPAnet with EGP. They then redistribute reachabil- ity information to all those gateways speak- ing EGP. It is from them your EGP agent (there is one acting for you somewhere if you can reach the ARPAnet) finds out it can reach all the nets on the ARPAnet. Which is then passed to you via Hello, gated, RIP….
count to infinity
The symptom of a routing problem where routing information is passed in a circular manner through multiple gateways. Each gate- way increments the metric appropriately and passes it on. As the metric is passed around the loop, it increments to ever increasing values til it reaches the maximum for the routing protocol being used, which typically denotes a link outage.
hold down
When a router discovers a path in the network has gone down announcing that that path is down for a minimum amount of time (usually at least two minutes). This allows for the pro- pagation of the routing information across the network and prevents the formation of routing loops.
split horizon
When a router (or group of routers working in consort) accept routing information from mul- tiple external networks, but do not pass on information learned from one external network to any others. This is an attempt to prevent bogus routes to a network from being propagated because of gossip or counting to infinity.
-23-
End of Project Gutenberg's Hitchhiker's Guide to the Internet, by Ed Krol