The KA9Q Internet Software Package

Updated for 890421.1

Revision May 8, 1989

by Bdale Garbee, N3EUA

Copyright 1989 by Bdale Garbee. All Rights Reserved.

This Document may be reproduced in whole or in part for any non-commercial purpose, as long as credit is given the author.

Caveat Emptor

This document is a major rewrite of the document included with the 871225.0 release of the KA9Q package, but it is in the author's opinion very far from perfect. I believe that the bulk of the material here is factually correct, but the formatting is rough, and there are no doubt some juicy tidbits missing that should have been included. I would greatly appreciate reports of problems with the documentation, particularly if they are accompanied by suggested fixes! I promise to back up the document sources this time around, so that disk crashes won't force me to start over from scratch (which is what happened this time...).

If you have comments or suggestions about the documentation, contact me via email at bdale@col.hp.com on the Internet, ...!winfree!bdale via UUCP, or as 76430,3323 on Compuserve.

My profound thanks to the folks who contributed to this release, particularly Bob Hoffman N3CVL and Ron Henderson WA7TAS, without whom it would have been impossible.

Bdale Garbee, N3EUA

1. Introduction to TCP/IP and the KA9Q Software

This document describes the KA9Q Internet Package, which is an implementation of the network protocol family originally created as part of the Arpanet project. The protocol family is commonly referred to as "TCP/IP", acronyms for two of the many protocols included. The KA9Q package is the result of several years of development by Phil Karn, KA9Q, and his "merry band of implementors". The "TCP group" has grown to include hundreds of individuals worldwide, many of whom have contributed ideas and software to this release.
All of the sources to the software are included, and interested parties are encouraged to experiment and contribute changes resulting from their efforts back to the group. This is discussed further in the technical appendix.

1.1. An Overview of the TCP/IP Protocol Family

Following is an overview of the protocol family supported by the KA9Q package. While reading this section will give you a wonderful overview of what's going on, you can feel free to skip ahead and try out the program first, then come back and read this section to flesh out your understanding.

1.1.1. Acknowledgement

The material in this section was adapted from a document written by Charles L. Hedrick, of RUTGERS, The State University of New Jersey. The original document was Copyright 1987 by Charles L. Hedrick, and the material excerpted for this section is used by permission.

1.1.2. Introduction

This document is a brief introduction to TCP/IP, followed by advice on what to read for more information. This is not intended to be a complete description. It can give you a reasonable idea of the capabilities of the protocols. But if you need to know any details of the technology, you will want to read the standards yourself. Throughout the text, you will find references to the standards, in the form of "RFC" or "IEN" numbers. These are document numbers. The final section of this document tells you how to get copies of those standards.

1.1.3. What is TCP/IP?

TCP/IP is a set of protocols developed to allow cooperating computers to share resources across a network. It was developed by a community of researchers centered around the ARPAnet. Certainly the ARPAnet is the best-known TCP/IP network. However as of June 1987, at least 130 different vendors had products that support TCP/IP, and thousands of networks of all kinds use it. Over time, the original ARPAnet has been phased out, and is being replaced by a variety of networks running the same protocols loosely referred to as "The Internet".
First some basic definitions. The most accurate name for the set of protocols we are describing is the "Internet protocol suite". TCP and IP are two of the protocols in this suite. (They will be described below.) Because TCP and IP are the best known of the protocols, it has become common to use the term TCP/IP or IP/TCP to refer to the whole family. It is probably not worth fighting this habit. However this can lead to some oddities. For example, I find myself talking about NFS as being based on TCP/IP, even though it doesn't use TCP at all. (It does use IP. But it uses an alternative protocol, UDP, instead of TCP. All of this alphabet soup will be unscrambled in the following pages.)

The Internet is a collection of networks, including the Arpanet, NSFnet, regional networks such as NYsernet, local networks at a number of University and research institutions, and a number of military networks. The term "Internet" applies to this entire set of networks. The subset of them that is managed by the Department of Defense is referred to as the "DDN" (Defense Data Network). This includes some research-oriented networks, such as the Arpanet, as well as more strictly military ones. (Because much of the funding for Internet protocol developments is done via the DDN organization, the terms Internet and DDN can sometimes seem equivalent.) All of these networks are connected to each other. Users can send messages from any of them to any other, except where there are security or other policy restrictions on access.

Officially speaking, the Internet protocol documents are simply standards adopted by the Internet community for its own use. More recently, the Department of Defense issued a MILSPEC definition of TCP/IP. This was intended to be a more formal definition, appropriate for use in purchasing specifications. However most of the TCP/IP community continues to use the Internet standards. The MILSPEC version is intended to be consistent with it.
Whatever it is called, TCP/IP is a family of protocols. A few provide "low-level" functions needed for many applications. These include IP, TCP, and UDP. (These will be described in a bit more detail later.) Others are protocols for doing specific tasks, e.g. transferring files between computers, sending mail, or finding out who is logged in on another computer. Initially TCP/IP was used mostly between minicomputers or mainframes. These machines had their own disks, and generally were self-contained. Thus the most important "traditional" TCP/IP services are:

- file transfer. The file transfer protocol (FTP) allows a user on any computer to get files from another computer, or to send files to another computer. Security is handled by requiring the user to specify a user name and password for the other computer. Provisions are made for handling file transfer between machines with different character sets, end of line conventions, etc. This is not quite the same thing as more recent "network file system" or "netbios" protocols, which will be described below. Rather, FTP is a utility that you run any time you want to access a file on another system. You use it to copy the file to your own system. You then work with the local copy. (See RFC 959 for specifications for FTP.)

- remote login. The network terminal protocol (TELNET) allows a user to log in on any other computer on the network. You start a remote session by specifying a computer to connect to. From that time until you finish the session, anything you type is sent to the other computer. Note that you are really still talking to your own computer. But the telnet program effectively makes your computer invisible while it is running. Every character you type is sent directly to the other system. Generally, the connection to the remote computer behaves much like a dialup connection.
That is, the remote system will ask you to log in and give a password, in whatever manner it would normally ask a user who had just dialed it up. When you log off of the other computer, the telnet program exits, and you will find yourself talking to your own computer. Microcomputer implementations of telnet generally include a terminal emulator for some common type of terminal. (See RFC's 854 and 855 for specifications for telnet. By the way, the telnet protocol should not be confused with Telenet, a vendor of commercial network services.)

- computer mail. This allows you to send messages to users on other computers. Originally, people tended to use only one or two specific computers. They would maintain "mail files" on those machines. The computer mail system is simply a way for you to add a message to another user's mail file. There are some problems with this in an environment where microcomputers are used. The most serious is that a micro is not well suited to receive computer mail. When you send mail, the mail software expects to be able to open a connection to the addressee's computer, in order to send the mail. If this is a microcomputer, it may be turned off, or it may be running an application other than the mail system. For this reason, mail is normally handled by a larger system, where it is practical to have a mail server running all the time. Microcomputer mail software then becomes a user interface that retrieves mail from the mail server. (See RFC 821 and 822 for specifications for computer mail. See RFC 937 for a protocol designed for microcomputers to use in reading mail from a mail server.)

These services should be present in any implementation of TCP/IP, except that micro-oriented implementations may not support computer mail. These traditional applications still play a very important role in TCP/IP-based networks. However more recently, the way in which networks are used has been changing.
The older model of a number of large, self-sufficient computers is beginning to change. Now many installations have several kinds of computers, including microcomputers, workstations, minicomputers, and mainframes. These computers are likely to be configured to perform specialized tasks. Although people are still likely to work with one specific computer, that computer will call on other systems on the net for specialized services. This has led to the "server/client" model of network services. A server is a system that provides a specific service for the rest of the network. A client is another system that uses that service. (Note that the server and client need not be on different computers. They could be different programs running on the same computer.) Here are the kinds of servers typically present in a modern computer setup. Note that these computer services can all be provided within the framework of TCP/IP.

- network file systems. This allows a system to access files on another computer in a somewhat more closely integrated fashion than FTP. A network file system provides the illusion that disks or other devices from one system are directly connected to other systems. There is no need to use a special network utility to access a file on another system. Your computer simply thinks it has some extra disk drives. These extra "virtual" drives refer to the other system's disks. This capability is useful for several different purposes. It lets you put large disks on a few computers, but still give others access to the disk space. Aside from the obvious economic benefits, this allows people working on several computers to share common files. It makes system maintenance and backup easier, because you don't have to worry about updating and backing up copies on lots of different machines. A number of vendors now offer high-performance diskless computers. These computers have no disk drives at all.
They are entirely dependent upon disks attached to common "file servers". (See RFC's 1001 and 1002 for a description of PC-oriented NetBIOS over TCP. In the workstation and minicomputer area, Sun's Network File System is more likely to be used. Protocol specifications for it are available from Sun Microsystems.)

- remote printing. This allows you to access printers on other computers as if they were directly attached to yours. (The most commonly used protocol is the remote lineprinter protocol from Berkeley Unix. Unfortunately, there is no protocol document for this. However the C code is easily obtained from Berkeley, so implementations are common.)

- remote execution. This allows you to request that a particular program be run on a different computer. This is useful when you can do most of your work on a small computer, but a few tasks require the resources of a larger system. There are a number of different kinds of remote execution. Some operate on a command by command basis. That is, you request that a specific command or set of commands should run on some specific computer. (More sophisticated versions will choose a system that happens to be free.) However there are also "remote procedure call" systems that allow a program to call a subroutine that will run on another computer. (There are many protocols of this sort. Berkeley Unix contains two servers to execute commands remotely: rsh and rexec. The man pages describe the protocols that they use. The user-contributed software with Berkeley 4.3 contains a "distributed shell" that will distribute tasks among a set of systems, depending upon load. Remote procedure call mechanisms have been a topic for research for a number of years, so many organizations have implementations of such facilities. The most widespread commercially-supported remote procedure call protocols seem to be Xerox's Courier and Sun's RPC. Protocol documents are available from Xerox and Sun.
There is a public implementation of Courier over TCP as part of the user-contributed software with Berkeley 4.3. An implementation of RPC was posted to Usenet by Sun, and also appears as part of the user-contributed software with Berkeley 4.3.)

- name servers. In large installations, there are a number of different collections of names that have to be managed. This includes users and their passwords, names and network addresses for computers, and accounts. It becomes very tedious to keep this data up to date on all of the computers. Thus the databases are kept on a small number of systems. Other systems access the data over the network. (RFC 882 and 883 describe the name server protocol used to keep track of host names and Internet addresses on the Internet. This is now a required part of any TCP/IP implementation. IEN 116 describes an older name server protocol that is used by a few terminal servers and other products to look up host names. Sun's Yellow Pages system is designed as a general mechanism to handle user names, file sharing groups, and other databases commonly used by Unix systems. It is widely available commercially. Its protocol definition is available from Sun.)

- terminal servers. Many installations no longer connect terminals directly to computers. Instead they connect them to terminal servers. A terminal server is simply a small computer that only knows how to run telnet (or some other protocol to do remote login). If your terminal is connected to one of these, you simply type the name of a computer, and you are connected to it. Generally it is possible to have active connections to more than one computer at the same time. The terminal server will have provisions to switch between connections rapidly, and to notify you when output is waiting for another connection. (Terminal servers use the telnet protocol, already mentioned. However any real terminal server will also have to support name service and a number of other protocols.)
- network-oriented window systems. Until recently, high-performance graphics programs had to execute on a computer that had a bit-mapped graphics screen directly attached to it. Network window systems allow a program to use a display on a different computer. Full-scale network window systems provide an interface that lets you distribute jobs to the systems that are best suited to handle them, but still give you a single graphically-based user interface. (The most widely-implemented window system is X. A protocol description is available from MIT's Project Athena. A reference implementation is publicly available from MIT. A number of vendors are also supporting NeWS, a window system defined by Sun. Both of these systems are designed to use TCP/IP.)

Note that some of the protocols described above were designed by Berkeley, Sun, or other organizations. Thus they are not officially part of the Internet protocol suite. However they are implemented using TCP/IP, just as normal TCP/IP application protocols are. Since the protocol definitions are not considered proprietary, and since commercially-supported implementations are widely available, it is reasonable to think of these protocols as being effectively part of the Internet suite.

Note that the list above is simply a sample of the sort of services available through TCP/IP. However it does contain the majority of the "major" applications. The other commonly-used protocols tend to be specialized facilities for getting information of various kinds, such as who is logged in, the time of day, etc. However if you need a facility that is not listed here, we encourage you to look through the current edition of Internet Protocols (currently RFC 1011), which lists all of the available protocols, and also to look at some of the major TCP/IP implementations to see what various vendors have added.

1.1.4. General description of the TCP/IP protocols

TCP/IP is a layered set of protocols.
In order to understand what this means, it is useful to look at an example. A typical situation is sending mail. First, there is a protocol for mail. This defines a set of commands which one machine sends to another, e.g. commands to specify who the sender of the message is, who it is being sent to, and then the text of the message. However this protocol assumes that there is a way to communicate reliably between the two computers. Mail, like other application protocols, simply defines a set of commands and messages to be sent. It is designed to be used together with TCP and IP.

TCP is responsible for making sure that the commands get through to the other end. It keeps track of what is sent, and retransmits anything that did not get through. If any message is too large for one datagram, e.g. the text of the mail, TCP will split it up into several datagrams, and make sure that they all arrive correctly. Since these functions are needed for many applications, they are put together into a separate protocol, rather than being part of the specifications for sending mail. You can think of TCP as forming a library of routines that applications can use when they need reliable network communications with another computer.

Similarly, TCP calls on the services of IP. Although the services that TCP supplies are needed by many applications, there are still some kinds of applications that don't need them. However there are some services that every application needs. So these services are put together into IP. As with TCP, you can think of IP as a library of routines that TCP calls on, but which is also available to applications that don't use TCP. This strategy of building several levels of protocol is called "layering". We think of the applications programs such as mail, TCP, and IP, as being separate "layers", each of which calls on the services of the layer below it.
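One way to picture the layering described above is as nested wrapping: each layer treats whatever it receives from the layer above as opaque data and adds its own header in front. The sketch below is only an illustration. The function names and the bracketed header strings are hypothetical placeholders, not real TCP or IP header formats.

```python
# Illustration only: each layer wraps the data handed down from the layer above.
# The "[T]" and "[I]" markers are made-up stand-ins for real protocol headers.

def mail_layer(text):
    """Application layer: produce the message to be sent."""
    return text.encode("ascii")


def tcp_layer(payload):
    """Reliable-transport layer: prepend a placeholder TCP header."""
    return b"[T]" + payload


def ip_layer(payload):
    """Datagram-delivery layer: prepend a placeholder IP header."""
    return b"[I]" + payload


def send(text):
    # Each layer calls on the services of the layer below it.
    return ip_layer(tcp_layer(mail_layer(text)))


print(send("hello"))  # b'[I][T]hello'
```

The application never sees the lower layers' headers, and IP never looks inside what TCP hands it, which is the separation the text describes.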
Generally, TCP/IP applications use 4 layers:

- an application protocol such as mail

- a protocol such as TCP that provides services needed by many applications

- IP, which provides the basic service of getting datagrams to their destination

- the protocols needed to manage a specific physical medium, such as Ethernet or a point to point line.

TCP/IP is based on the "catenet model". (This is described in more detail in IEN 48.) This model assumes that there are a large number of independent networks connected together by gateways. The user should be able to access computers or other resources on any of these networks. Datagrams will often pass through a dozen different networks before getting to their final destination. The routing needed to accomplish this should be completely invisible to the user. As far as the user is concerned, all he needs to know in order to access another system is an "Internet address". This is an address that looks like 128.6.4.194. It is actually a 32-bit number. However it is normally written as 4 decimal numbers, each representing 8 bits of the address. (The term "octet" is used by Internet documentation for such 8-bit chunks. The term "byte" is not used, because TCP/IP is supported by some computers that have byte sizes other than 8 bits.) Generally the structure of the address gives you some information about how to get to the system. For example, 128.6 is a network number assigned by a central authority to Rutgers University. Rutgers uses the next octet to indicate which of the campus Ethernets is involved. 128.6.4 happens to be an Ethernet used by the Computer Science Department. The last octet allows for up to 254 systems on each Ethernet. (It is 254 because 0 and 255 are not allowed, for reasons that will be discussed later.) Note that 128.6.4.194 and 128.6.5.194 would be different systems. The structure of an Internet address is described in a bit more detail later.
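The relationship between the dotted form and the underlying 32-bit number can be made concrete with a short sketch. The function names here are made up for illustration, and the 16-bit network mask simply mirrors the Rutgers example in the text rather than any general rule.

```python
def dotted_to_int(address):
    """Pack four decimal octets into one 32-bit number."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError("not a valid Internet address: %r" % address)
    value = 0
    for octet in octets:
        value = (value << 8) | octet  # shift left 8 bits, append next octet
    return value


def int_to_dotted(value):
    """Unpack a 32-bit number into four decimal octets."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


addr = dotted_to_int("128.6.4.194")
print(hex(addr))                          # 0x800604c2
print(int_to_dotted(addr))                # 128.6.4.194
print(int_to_dotted(addr & 0xFFFF0000))   # 128.6.0.0, the network number
print((addr >> 8) & 0xFF)                 # 4, the campus Ethernet
print(addr & 0xFF)                        # 194, the system on that Ethernet
```

Because the third octet differs, 128.6.4.194 and 128.6.5.194 pack to different 32-bit values, which is why they name different systems.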
Of course we normally refer to systems by name, rather than by Internet address. When we specify a name, the network software looks it up in a database, and comes up with the corresponding Internet address. Most of the network software deals strictly in terms of the address. (RFC 882 describes the name server technology used to handle this lookup.)

TCP/IP is built on "connectionless" technology. Information is transferred as a sequence of "datagrams". A datagram is a collection of data that is sent as a single message. Each of these datagrams is sent through the network individually. There are provisions to open connections (i.e. to start a conversation that will continue for some time). However at some level, information from those connections is broken up into datagrams, and those datagrams are treated by the network as completely separate. For example, suppose you want to transfer a 15000 octet file. Most networks can't handle a 15000 octet datagram. So the protocols will break this up into something like 30 500-octet datagrams. Each of these datagrams will be sent to the other end. At that point, they will be put back together into the 15000-octet file. However while those datagrams are in transit, the network doesn't know that there is any connection between them. It is perfectly possible that datagram 14 will actually arrive before datagram 13. It is also possible that somewhere in the network, an error will occur, and some datagram won't get through at all. In that case, that datagram has to be sent again.

Note by the way that the terms "datagram" and "packet" often seem to be nearly interchangeable. Technically, datagram is the right word to use when describing TCP/IP. A datagram is a unit of data, which is what the protocols deal with. A packet is a physical thing, appearing on an Ethernet or some wire. In most cases a packet simply contains a datagram, so there is very little difference. However they can differ.
When TCP/IP is used on top of X.25, the X.25 interface breaks the datagrams up into 128-byte packets. This is invisible to IP, because the packets are put back together into a single datagram at the other end before being processed by TCP/IP. So in this case, one IP datagram would be carried by several packets. However with most media, there are efficiency advantages to sending one datagram per packet, and so the distinction tends to vanish.

Two separate protocols are involved in handling TCP/IP datagrams. TCP (the "transmission control protocol") is responsible for breaking up the message into datagrams, reassembling them at the other end, resending anything that gets lost, and putting things back in the right order. IP (the "internet protocol") is responsible for routing individual datagrams. It may seem like TCP is doing all the work. And in small networks that is true. However in the Internet, simply getting a datagram to its destination can be a complex job. A connection may require the datagram to go through several networks at Rutgers, a serial line to the John von Neumann Supercomputer Center, a couple of Ethernets there, a series of 56Kbaud phone lines to another NSFnet site, and more Ethernets on another campus. Keeping track of the routes to all of the destinations and handling incompatibilities among different transport media turns out to be a complex job. Note that the interface between TCP and IP is fairly simple. TCP simply hands IP a datagram with a destination. IP doesn't know how this datagram relates to any datagram before it or after it.

It may have occurred to you that something is missing here. We have talked about Internet addresses, but not about how you keep track of multiple connections to a given system. Clearly it isn't enough to get a datagram to the right destination. TCP has to know which connection this datagram is part of. This task is referred to as "demultiplexing."
In fact, there are several levels of demultiplexing going on in TCP/IP. The information needed to do this demultiplexing is contained in a series of "headers". A header is simply a few extra octets tacked onto the beginning of a datagram by some protocol in order to keep track of it. It's a lot like putting a letter into an envelope and putting an address on the outside of the envelope. Except with modern networks it happens several times. It's like you put the letter into a little envelope, your secretary puts that into a somewhat bigger envelope, the campus mail center puts that envelope into a still bigger one, etc. Here is an overview of the headers that get stuck on a message that passes through a typical TCP/IP network:

We start with a single data stream, say a file you are trying to send to some other computer:

    ......................................................

TCP breaks it up into manageable chunks. (In order to do this, TCP has to know how large a datagram your network can handle. Actually, the TCP's at each end say how big a datagram they can handle, and then they pick the smallest size.)

    ....  ....  ....  ....  ....  ....  ....  ....

TCP puts a header at the front of each datagram. This header actually contains at least 20 octets, but the most important ones are a source and destination "port number" and a "sequence number". The port numbers are used to keep track of different conversations. Suppose 3 different people are transferring files. Your TCP might allocate port numbers 1000, 1001, and 1002 to these transfers. When you are sending a datagram, this becomes the "source" port number, since you are the source of the datagram. Of course the TCP at the other end has assigned a port number of its own for the conversation. Your TCP has to know the port number used by the other end as well. (It finds out when the connection starts, as we will explain below.) It puts this in the "destination" port field.
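The chunking and port-number bookkeeping just described can be sketched in a few lines. This is a simplified illustration, not how any particular TCP is written: the names are invented, the remote port numbers 2000 and 2001 are made up, and only the port fields of the header are modelled.

```python
from collections import defaultdict
from typing import NamedTuple


class Segment(NamedTuple):
    """One chunk of a data stream plus the two port numbers from its header."""
    source_port: int
    dest_port: int
    data: bytes


def choose_chunk_size(local_max, remote_max):
    # Each end says how big a datagram it can handle; the smaller value wins.
    return min(local_max, remote_max)


def split_stream(stream, size, source_port, dest_port):
    # Break the stream into chunks of at most `size` octets and label each one.
    return [Segment(source_port, dest_port, stream[i:i + size])
            for i in range(0, len(stream), size)]


def demultiplex(segments):
    # Sort incoming chunks into conversations keyed by the port pair.
    # (A real TCP also uses the Internet addresses; this sketch ignores them
    # and assumes the chunks of each conversation arrive in order.)
    conversations = defaultdict(bytearray)
    for seg in segments:
        conversations[(seg.source_port, seg.dest_port)] += seg.data
    return {ports: bytes(buf) for ports, buf in conversations.items()}


size = choose_chunk_size(1500, 500)
file_a = split_stream(b"A" * 1200, size, 1000, 2000)  # 2000: made-up remote port
file_b = split_stream(b"B" * 700, size, 1001, 2001)   # 2001: made-up remote port

# Interleave the two transfers, as they might appear on the wire.
mixed = [file_a[0], file_b[0], file_a[1], file_b[1], file_a[2]]
received = demultiplex(mixed)
print(len(file_a), len(file_b))      # 3 2
print(len(received[(1000, 2000)]))   # 1200
```

Even though the two transfers are interleaved, the port pair on each chunk is enough to sort them back into separate conversations.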
Of course if the other end sends a datagram back to you, the source and destination port numbers will be reversed, since then it will be the source and you will be the destination. Each datagram has a sequence number. This is used so that the other end can make sure that it gets the datagrams in the right order, and that it hasn't missed any. (See the TCP specification for details.) TCP doesn't number the datagrams, but the octets. So if there are 500 octets of data in each datagram, the first datagram might be numbered 0, the second 500, the next 1000, the next 1500, etc. Finally, I will mention the Checksum. This is a number that is computed by adding up all the octets in the datagram (more or less - see the TCP spec). The result is put in the header. TCP at the other end computes the checksum again. If they disagree, then something bad happened to the datagram in transmission, and it is thrown away. So here's what the datagram looks like now.

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |          Source Port          |       Destination Port        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        Sequence Number                        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    Acknowledgment Number                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  Data |           |U|A|P|R|S|F|                               |
    | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
    |       |           |G|K|H|T|N|N|                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |           Checksum            |         Urgent Pointer        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |   your data ... next 500 octets                               |
    |   ......                                                      |

If we abbreviate the TCP header as "T", the whole file now looks like this:

    T....  T....  T....  T....  T....  T....  T....

You will note that there are items in the header that I have not described above. They are generally involved with managing the connection. In order to make sure the datagram has arrived at its destination, the recipient has to send back an "acknowledgement".
This is a datagram whose "Acknowledgement number" field is filled in. For example, sending a packet with an acknowledgement of 1500 indicates that you have received all the data up to octet number 1500. If the sender doesn't get an acknowledgement within a reasonable amount of time, it sends the data again.

The window is used to control how much data can be in transit at any one time. It is not practical to wait for each datagram to be acknowledged before sending the next one. That would slow things down too much. On the other hand, you can't just keep sending, or a fast computer might overrun the capacity of a slow one to absorb data. Thus each end indicates how much new data it is currently prepared to absorb by putting the number of octets in its "Window" field. As the computer receives data, the amount of space left in its window decreases. When it goes to zero, the sender has to stop. As the receiver processes the data, it increases its window, indicating that it is ready to accept more data. Often the same datagram can be used to acknowledge receipt of a set of data and to give permission for additional new data (by an updated window).

The "Urgent" field allows one end to tell the other to skip ahead in its processing to a particular octet. This is often useful for handling asynchronous events, for example when you type a control character or other command that interrupts output. The other fields are beyond the scope of this document.

1.1.5. The IP level

TCP sends each of these datagrams to IP. Of course it has to tell IP the Internet address of the computer at the other end. Note that this is all IP is concerned about. It doesn't care about what is in the datagram, or even in the TCP header. IP's job is simply to find a route for the datagram and get it to the other end. In order to allow gateways or other intermediate systems to forward the datagram, it adds its own header.
The main things in this header are the source and destination Internet address (32-bit addresses, like 128.6.4.194), the protocol number, and another checksum. The source Internet address is simply the address of your machine. (This is necessary so the other end knows where the datagram came from.) The destination Internet address is the address of the other machine. (This is necessary so any gateways in the middle know where you want the datagram to go.) The protocol number tells IP at the other end to send the datagram to TCP. Although most IP traffic uses TCP, there are other protocols that can use IP, so you have to tell IP which protocol to send the datagram to. Finally, the checksum allows IP at the other end to verify that the header wasn't damaged in transit. Note that TCP and IP have separate checksums. IP needs to be able to verify that the header didn't get damaged in transit, or it could send a message to the wrong place. For reasons not worth discussing here, it is both more efficient and safer to have TCP compute a separate checksum for the TCP header and data. Once IP has tacked on its header, here's what the message looks like:

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |Version|  IHL  |Type of Service|          Total Length         |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |         Identification        |Flags|      Fragment Offset    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  Time to Live |    Protocol   |         Header Checksum       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       Source Address                          |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    Destination Address                        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  TCP header, then your data ......                            |
    |                                                               |

If we represent the IP header by an "I", your file now looks like this:

    IT....  IT....  IT....  IT....  IT....  IT....  IT....

Again, the header contains some additional fields that have not been discussed.
Most of them are beyond the scope of this document. The flags and fragment offset are used to keep track of the pieces when a datagram has to be split up. This can happen when datagrams are forwarded through a network for which they are too big. (This will be discussed a bit more below.) The time to live is a number that is decremented whenever the datagram passes through a system. When it goes to zero, the datagram is discarded. This is done in case a loop develops in the system somehow. Of course this should be impossible, but well-designed networks are built to cope with "impossible" conditions.

At this point, it's possible that no more headers are needed. If your computer happens to have a direct phone line connecting it to the destination computer, or to a gateway, it may simply send the datagrams out on the line (though likely a synchronous protocol such as HDLC would be used, and it would add at least a few octets at the beginning and end).

1.1.6. The Ethernet level

However most of our networks these days use Ethernet. So now we have to describe Ethernet's headers. Unfortunately, Ethernet has its own addresses. The people who designed Ethernet wanted to make sure that no two machines would end up with the same Ethernet address. Furthermore, they didn't want the user to have to worry about assigning addresses. So each Ethernet controller comes with an address built in from the factory. In order to make sure that they would never have to reuse addresses, the Ethernet designers allocated 48 bits for the Ethernet address. People who make Ethernet equipment have to register with a central authority, to make sure that the numbers they assign don't overlap those of any other manufacturer.

Ethernet is a "broadcast medium". That is, it is in effect like an old party line telephone. When you send a packet out on the Ethernet, every machine on the network sees the packet. So something is needed to make sure that the right machine gets it.
As you might guess, this involves the Ethernet header. Every Ethernet packet has a 14-octet header that includes the source and destination Ethernet address, and a type code. Each machine is supposed to pay attention only to packets with its own Ethernet address in the destination field. (It's perfectly possible to cheat, which is one reason that Ethernet communications are not terribly secure.) Note that there is no connection between the Ethernet address and the Internet address. Each machine has to have a table of what Ethernet address corresponds to what Internet address. (We will describe how this table is constructed a bit later.)

In addition to the addresses, the header contains a type code. The type code is to allow for several different protocol families to be used on the same network. So you can use TCP/IP, DECnet, Xerox NS, etc. at the same time. Each of them will put a different value in the type field. Finally, there is a checksum. The Ethernet controller computes a checksum of the entire packet. When the other end receives the packet, it recomputes the checksum, and throws the packet away if the answer disagrees with the original. The checksum is put on the end of the packet, not in the header. The final result is that your message looks like this:

    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |       Ethernet destination address (first 32 bits)            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Ethernet dest (last 16 bits)  |Ethernet source (first 16 bits)|
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |       Ethernet source address (last 32 bits)                  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |        Type code              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  IP header, then TCP header, then your data                   |
    |                                                               |
       ...
    |                                                               |
    |   end of your data                                            |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       Ethernet Checksum                       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

If we represent the Ethernet header with "E", and the Ethernet checksum with "C", your file now looks like this:

    EIT....C EIT....C EIT....C EIT....C EIT....C

When these packets are received by the other end, of course all the headers are removed. The Ethernet interface removes the Ethernet header and the checksum. It looks at the type code. Since the type code is the one assigned to IP, the Ethernet device driver passes the datagram up to IP. IP removes the IP header. It looks at the IP protocol field. Since the protocol type is TCP, it passes the datagram up to TCP. TCP now looks at the sequence number. It uses the sequence numbers and other information to combine all the datagrams into the original file.

This ends our initial summary of TCP/IP. There are still some crucial concepts we haven't gotten to, so we'll now go back and add details in several areas. (For detailed descriptions of the items discussed here, see RFC 793 for TCP, RFC 791 for IP, and RFC's 894 and 826 for sending IP over Ethernet.)

1.1.7. Well-known sockets and the applications layer

So far, we have described how a stream of data is broken up into datagrams, sent to another computer, and put back together. However something more is needed in order to accomplish anything useful. There has to be a way for you to open a connection to a specified computer, log into it, tell it what file you want, and control the transmission of the file. (If you have a different application in mind, e.g. computer mail, some analogous protocol is needed.) This is done by "application protocols". The application protocols run "on top" of TCP/IP. That is, when they want to send a message, they give the message to TCP. TCP makes sure it gets delivered to the other end.
Because TCP and IP take care of all the networking details, the applications protocols can treat a network connection as if it were a simple byte stream, like a terminal or phone line. Before going into more details about applications programs, we have to describe how you find an application. Suppose you want to send a file to a computer whose Internet address is 128.6.4.7. To start the process, you need more than just the Internet address. You have to connect to the FTP server at the other end. In general, network programs are specialized for a specific set of tasks. Most systems have separate programs to handle file transfers, remote terminal logins, mail, etc. When you connect to 128.6.4.7, you have to specify that you want to talk to the FTP server. This is done by having "well-known sockets" for each server. Recall that TCP uses port numbers to keep track of individual conversations. User programs normally use more or less random port numbers. However specific port numbers are assigned to the programs that sit waiting for requests. For example, if you want to send a file, you will start a program called "ftp". It will open a connection using some random number, say 1234, for the port number on its end. However it will specify port number 21 for the other end. This is the official port number for the FTP server. Note that there are two different programs involved. You run ftp on your side. This is a program designed to accept commands from your terminal and pass them on to the other end. The program that you talk to on the other machine is the FTP server. It is designed to accept commands from the network connection, rather than an interactive terminal. There is no need for your program to use a well-known socket number for itself. Nobody is trying to find it. However the servers have to have well-known numbers, so that people can open connections to them and start sending them commands. 
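As a rough illustration of the difference between a server's fixed port and the more or less random port a user program ends up with, the following Python sketch opens one TCP connection over the loopback interface. It is only a sketch under stated assumptions: binding to port 21 normally needs special privileges, so the listening socket here asks the system for any free port and merely stands in for a well-known one, and the function name is made up for this example.

```python
import socket


def open_demo_connection():
    """Open one loopback TCP connection and report the ports involved."""
    # Stand-in for a server. A real FTP server would bind to port 21;
    # port 0 asks the operating system for any free port (an assumption
    # made so the sketch runs without special privileges).
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    server_port = server.getsockname()[1]

    # The user program does not pick its own port. The system assigns
    # one when the connection is opened.
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(("127.0.0.1", server_port))
    client_port = client.getsockname()[1]

    # The server sees the client's port as the far end of the connection.
    conn, (peer_host, peer_port) = server.accept()

    for s in (conn, client, server):
        s.close()
    return server_port, client_port, peer_port


if __name__ == "__main__":
    server_port, client_port, peer_port = open_demo_connection()
    print("server listened on port", server_port)
    print("client was assigned port", client_port)
    print("server saw the client as port", peer_port)
```

The client never has to announce its port in advance: the server learns it from the incoming connection, which is why only the server's number needs to be known ahead of time.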
The official port numbers for each program are given in "Assigned Numbers".

Note that a connection is actually described by a set of 4 numbers: the Internet address at each end, and the TCP port number at each end. Every datagram has all four of those numbers in it. (The Internet addresses are in the IP header, and the TCP port numbers are in the TCP header.) In order to keep things straight, no two connections can have the same set of numbers. However it is enough for any one number to be different. For example, it is perfectly possible for two different users on a machine to be sending files to the same other machine. This could result in connections with the following parameters:

                    Internet addresses          TCP ports
    connection 1    128.6.4.194, 128.6.4.7      1234, 21
    connection 2    128.6.4.194, 128.6.4.7      1235, 21

Since the same machines are involved, the Internet addresses are the same. Since they are both doing file transfers, one end of the connection involves the well-known port number for FTP. The only thing that differs is the port number for the program that the users are running. That's enough of a difference. Generally, at least one end of the connection asks the network software to assign it a port number that is guaranteed to be unique. Normally, it's the user's end, since the server has to use a well-known number.

Now that we know how to open connections, let's get back to the applications programs. As mentioned earlier, once TCP has opened a connection, we have something that might as well be a simple wire. All the hard parts are handled by TCP and IP. However we still need some agreement as to what we send over this connection. In effect this is simply an agreement on what set of commands the application will understand, and the format in which they are to be sent. Generally, what is sent is a combination of commands and data. They use context to differentiate.
For example, the mail protocol works like this: Your mail program opens a connection to the mail server at the other end. Your program gives it your machine's name, the sender of the message, and the recipients you want it sent to. It then sends a command saying that it is starting the message. At that point, the other end stops treating what it sees as commands, and starts accepting the message. Your end then starts sending the text of the message. At the end of the message, a special mark is sent (a dot in the first column). After that, both ends understand that your program is again sending commands. This is the simplest way to do things, and the one that most applications use.

File transfer is somewhat more complex. The file transfer protocol involves two different connections. It starts out just like mail. The user's program sends commands like "log me in as this user", "here is my password", "send me the file with this name". However once the command to send data is sent, a second connection is opened for the data itself. It would certainly be possible to send the data on the same connection, as mail does. However file transfers often take a long time. The designers of the file transfer protocol wanted to allow the user to continue issuing commands while the transfer is going on. For example, the user might make an inquiry, or he might abort the transfer. Thus the designers felt it was best to use a separate connection for the data and leave the original command connection for commands. (It is also possible to open command connections to two different computers, and tell them to send a file from one to the other. In that case, the data couldn't go over the command connection.)

Remote terminal connections use another mechanism still. For remote logins, there is just one connection. It normally sends data. When it is necessary to send a command (e.g.
to set the terminal type or to change some mode), a special character is used to indicate that the next character is a command. If the user happens to type that special character as data, two of them are sent. We are not going to describe the application protocols in detail in this document. It's better to read the RFC's yourself. However there are a couple of common conventions used by applications that will be described here. First, the common network representation: TCP/IP is intended to be usable on any computer. Unfortunately, not all computers agree on how data is represented. There are differences in character codes (ASCII vs. EBCDIC), in end of line conventions (carriage return, line feed, or a representation using counts), and in whether terminals expect characters to be sent individually or a line at a time. In order to allow computers of different kinds to communicate, each applications protocol defines a standard representation. Note that TCP and IP do not care about the representation. TCP simply sends octets. However the programs at both ends have to agree on how the octets are to be interpreted. The RFC for each application specifies the standard representation for that application. Normally it is "net ASCII". This uses ASCII characters, with end of line denoted by a carriage return followed by a line feed. For remote login, there is also a definition of a "standard terminal", which turns out to be a half-duplex terminal with echoing happening on the local machine. Most applications also make provisions for the two computers to agree on other representations that they may find more convenient. For example, PDP-10's have 36-bit words. There is a way that two PDP-10's can agree to send a 36-bit binary file. Similarly, two systems that prefer full-duplex terminal conversations can agree on that. However each application has a standard representation, which every machine must support. 1.1.8. 
An example application: SMTP

In order to give a bit better idea what is involved in the application protocols, I'm going to show an example of SMTP, which is the mail protocol. (SMTP is "simple mail transfer protocol".) We assume that a computer called TOPAZ.RUTGERS.EDU wants to send the following message.

    Date: Sat, 27 Jun 87 13:26:31 EDT
    From: hedrick@topaz.rutgers.edu
    To: levy@red.rutgers.edu
    Subject: meeting

    Let's get together Monday at 1pm.

First, note that the format of the message itself is described by an Internet standard (RFC 822). The standard specifies the fact that the message must be transmitted as net ASCII (i.e. it must be ASCII, with carriage return/linefeed to delimit lines). It also describes the general structure, as a group of header lines, then a blank line, and then the body of the message. Finally, it describes the syntax of the header lines in detail. Generally they consist of a keyword and then a value.

Note that the addressee is indicated as LEVY@RED.RUTGERS.EDU. Initially, addresses were simply "person at machine". However recent standards have made things more flexible. There are now provisions for systems to handle other systems' mail. This can allow automatic forwarding on behalf of computers not connected to the Internet. It can be used to direct mail for a number of systems to one central mail server. Indeed there is no requirement that an actual computer by the name of RED.RUTGERS.EDU even exist. The name servers could be set up so that you mail to department names, and each department's mail is routed automatically to an appropriate computer. It is also possible that the part before the @ is something other than a user name. It is possible for programs to be set up to process mail. There are also provisions to handle mailing lists, and generic names such as "postmaster" or "operator".

The way the message is to be sent to another system is described by RFC's 821 and 974.
The program that is going to be doing the sending sends the name server several queries to determine where to route the message. The first query is to find out which machines handle mail for the name RED.RUTGERS.EDU. In this case, the server replies that RED.RUTGERS.EDU handles its own mail. The program then asks for the address of RED.RUTGERS.EDU, which is 128.6.4.2. Then the mail program opens a TCP connection to port 25 on 128.6.4.2. Port 25 is the well-known socket used for receiving mail. Once this connection is established, the mail program starts sending commands. Here is a typical conversation. Each line is labelled as to whether it is from TOPAZ or RED. Note that TOPAZ initiated the connection:

    RED    220 RED.RUTGERS.EDU SMTP Service at 29 Jun 87 05:17:18 EDT
    TOPAZ  HELO topaz.rutgers.edu
    RED    250 RED.RUTGERS.EDU - Hello, TOPAZ.RUTGERS.EDU
    TOPAZ  MAIL From:<hedrick@topaz.rutgers.edu>
    RED    250 MAIL accepted
    TOPAZ  RCPT To:<levy@red.rutgers.edu>
    RED    250 Recipient accepted
    TOPAZ  DATA
    RED    354 Start mail input; end with <CRLF>.<CRLF>
    TOPAZ  Date: Sat, 27 Jun 87 13:26:31 EDT
    TOPAZ  From: hedrick@topaz.rutgers.edu
    TOPAZ  To: levy@red.rutgers.edu
    TOPAZ  Subject: meeting
    TOPAZ
    TOPAZ  Let's get together Monday at 1pm.
    TOPAZ  .
    RED    250 OK
    TOPAZ  QUIT
    RED    221 RED.RUTGERS.EDU Service closing transmission channel

First, note that commands all use normal text. This is typical of the Internet standards. Many of the protocols use standard ASCII commands. This makes it easy to watch what is going on and to diagnose problems. For example, the mail program keeps a log of each conversation. If something goes wrong, the log file can simply be mailed to the postmaster. Since it is normal text, he can see what was going on. It also allows a human to interact directly with the mail server, for testing. (Some newer protocols are complex enough that this is not practical. The commands would have to have a syntax that would require a significant parser. Thus there is a tendency for newer protocols to use binary formats.
Generally they are structured like C or Pascal record structures.) Second, note that the responses all begin with numbers. This is also typical of Internet protocols. The allowable responses are defined in the protocol. The numbers allow the user program to respond unambiguously. The rest of the response is text, which is normally for use by any human who may be watching or looking at a log. It has no effect on the operation of the programs. (However there is one point at which the protocol uses part of the text of the response.) The commands themselves simply allow the mail program on one end to tell the mail server the information it needs to know in order to deliver the message. In this case, the mail server could get the information by looking at the message itself. But for more complex cases, that would not be safe. Every session must begin with a HELO, which gives the name of the system that initiated the connection. Then the sender and recipients are specified. (There can be more than one RCPT command, if there are several recipients.) Finally the data itself is sent. Note that the text of the message is terminated by a line containing just a period. (If such a line appears in the message, the period is doubled.) After the message is accepted, the sender can send another message, or terminate the session as in the example above. Generally, there is a pattern to the response numbers. The protocol defines the specific set of responses that can be sent as answers to any given command. However programs that don't want to analyze them in detail can just look at the first digit. In general, responses that begin with a 2 indicate success. Those that begin with 3 indicate that some further action is needed, as shown above. 4 and 5 indicate errors. 4 is a "temporary" error, such as a disk filling. The message should be saved, and tried again later. 5 is a permanent error, such as a non-existent recipient. 
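The first-digit rule just described can be sketched in a few lines of Python. This is an illustration only, not part of any mail program: the function names and the wording of the returned strings are invented for this example, and the two error replies at the bottom use made-up text after the number.

```python
def classify_reply(reply_line):
    """Classify an SMTP reply by the first digit of its number."""
    first_digit = reply_line.strip()[:1]
    meanings = {
        "2": "success",
        "3": "further action needed",
        "4": "temporary error",
        "5": "permanent error",
    }
    # Anything else is outside the pattern described in the text.
    return meanings.get(first_digit, "unknown")


def next_step(reply_line):
    """Suggest what a sending program might do after a reply."""
    kind = classify_reply(reply_line)
    if kind == "temporary error":
        return "save the message and try again later"
    if kind == "permanent error":
        return "return the message to the sender"
    if kind == "unknown":
        return "give up on this session"
    return "carry on"


if __name__ == "__main__":
    for reply in ("250 OK",
                  "354 Start mail input",
                  "452 disk full",        # made-up reply text
                  "550 no such user"):    # made-up reply text
        print(reply, "->", classify_reply(reply), "->", next_step(reply))
```

Only the leading digit is examined here, which is what lets a simple program ignore both the remaining digits and the human-readable text.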
In the permanent case, the message should be returned to the sender with an error message.

(For more details about the protocols mentioned in this section, see RFC's 821/822 for mail, RFC 959 for file transfer, and RFC's 854/855 for remote logins. For the well-known port numbers, see the current edition of Assigned Numbers, and possibly RFC 814.)

1.2. Protocols other than TCP: UDP and ICMP

So far, we have described only connections that use TCP. Recall that TCP is responsible for breaking up messages into datagrams, and reassembling them properly. However in many applications, we have messages that will always fit in a single datagram. An example is name lookup. When a user attempts to make a connection to another system, he will generally specify the system by name, rather than Internet address. His system has to translate that name to an address before it can do anything. Generally, only a few systems have the database used to translate names to addresses. So the user's system will want to send a query to one of the systems that has the database. This query is going to be very short. It will certainly fit in one datagram. So will the answer. Thus it seems silly to use TCP. Of course TCP does more than just break things up into datagrams. It also makes sure that the data arrives, resending datagrams where necessary. But for a question that fits in a single datagram, we don't need all the complexity of TCP to do this. If we don't get an answer after a few seconds, we can just ask again. For applications like this, there are alternatives to TCP.

The most common alternative is UDP ("user datagram protocol"). UDP is designed for applications where you don't need to put sequences of datagrams together. It fits into the system much like TCP. There is a UDP header. The network software puts the UDP header on the front of your data, just as it would put a TCP header on the front of your data.
Then UDP sends the data to IP, which adds the IP header, putting UDP's protocol number in the protocol field instead of TCP's protocol number. However UDP doesn't do as much as TCP does. It doesn't split data into multiple datagrams. It doesn't keep track of what it has sent so it can resend if necessary. About all that UDP provides is port numbers, so that several programs can use UDP at once. UDP port numbers are used just like TCP port numbers. There are well-known port numbers for servers that use UDP. Note that the UDP header is shorter than a TCP header. It still has source and destination port numbers, and a checksum, but that's about it. There is no sequence number, since it is not needed. UDP is used by the protocols that handle name lookups (see IEN 116, RFC 882, and RFC 883), and a number of similar protocols.

Another alternative protocol is ICMP ("Internet control message protocol"). ICMP is used for error messages, and other messages intended for the TCP/IP software itself, rather than any particular user program. For example, if you attempt to connect to a host, your system may get back an ICMP message saying "host unreachable". ICMP can also be used to find out some information about the network. See RFC 792 for details of ICMP. ICMP is similar to UDP, in that it handles messages that fit in one datagram. However it is even simpler than UDP. It doesn't even have port numbers in its header. Since all ICMP messages are interpreted by the network software itself, no port numbers are needed to say where an ICMP message is supposed to go.

1.3. Keeping track of names and information: the domain system

As we indicated earlier, the network software generally needs a 32-bit Internet address in order to open a connection or send a datagram. However users prefer to deal with computer names rather than numbers. Thus there is a database that allows the software to look up a name and find the corresponding number. When the Internet was small, this was easy.
Each system would have a file that listed all of the other systems, giving both their name and number. There are now too many computers for this approach to be practical. Thus these files have been replaced by a set of name servers that keep track of host names and the corresponding Internet addresses. (In fact these servers are somewhat more general than that. This is just one kind of information stored in the domain system.) Note that a set of interlocking servers is used, rather than a single central one. There are now so many different institutions connected to the Internet that it would be impractical for them to notify a central authority whenever they installed or moved a computer. Thus naming authority is delegated to individual institutions. The name servers form a tree, corresponding to institutional structure. The names themselves follow a similar structure.

A typical example is the name BORAX.LCS.MIT.EDU. This is a computer at the Laboratory for Computer Science (LCS) at MIT. In order to find its Internet address, you might potentially have to consult 4 different servers. First, you would ask a central server (called the root) where the EDU server is. EDU is a server that keeps track of educational institutions. The root server would give you the names and Internet addresses of several servers for EDU. (There are several servers at each level, to allow for the possibility that one might be down.) You would then ask EDU where the server for MIT is. Again, it would give you names and Internet addresses of several servers for MIT. Generally, not all of those servers would be at MIT, to allow for the possibility of a general power failure at MIT. Then you would ask MIT where the server for LCS is, and finally you would ask one of the LCS servers about BORAX. The final result would be the Internet address for BORAX.LCS.MIT.EDU. Each of these levels is referred to as a "domain". The entire name, BORAX.LCS.MIT.EDU, is called a "domain name".
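The way a name breaks down into domains, and the order in which the servers would be consulted in the worst case, can be sketched in Python as follows. This only manipulates the text of the name and sends no queries; the function names and the "root" label are choices made for this illustration.

```python
def enclosing_domains(name):
    """List a name followed by each higher-level domain that contains it."""
    labels = name.split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]


def servers_to_consult(name):
    """Worst-case order of servers to ask, starting at the root."""
    # Every domain above the host itself has servers that may be asked,
    # from the most general (EDU) down to the most specific (LCS.MIT.EDU).
    higher_levels = enclosing_domains(name)[1:]
    return ["root"] + list(reversed(higher_levels))


if __name__ == "__main__":
    print(enclosing_domains("BORAX.LCS.MIT.EDU"))
    print(servers_to_consult("BORAX.LCS.MIT.EDU"))
```

For BORAX.LCS.MIT.EDU the second list has four entries, matching the four servers counted in the walk-through above.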
(The names of the higher-level domains, such as LCS.MIT.EDU, MIT.EDU, and EDU, are domain names as well.)

Fortunately, you don't really have to go through all of this most of the time. First of all, the root name servers also happen to be the name servers for the top-level domains such as EDU. Thus a single query to a root server will get you to MIT. Second, software generally remembers answers that it got before. So once we look up a name at LCS.MIT.EDU, our software remembers where to find servers for LCS.MIT.EDU, MIT.EDU, and EDU. It also remembers the translation of BORAX.LCS.MIT.EDU. Each of these pieces of information has a "time to live" associated with it. Typically this is a few days. After that, the information expires and has to be looked up again. This allows institutions to change things.

The domain system is not limited to finding out Internet addresses. Each domain name is a node in a database. The node can have records that define a number of different properties. Examples are Internet address, computer type, and a list of services provided by a computer. A program can ask for a specific piece of information, or all information about a given name. It is possible for a node in the database to be marked as an "alias" (or nickname) for another node. It is also possible to use the domain system to store information about users, mailing lists, or other objects.

There is an Internet standard defining the operation of these databases, as well as the protocols used to make queries of them. Every network utility has to be able to make such queries, since this is now the official way to evaluate host names. Generally utilities will talk to a server on their own system. This server will take care of contacting the other servers for them. This keeps down the amount of code that has to be in each application program.

The domain system is particularly important for handling computer mail.
There are entry types to define what computer handles mail for a given name, to specify where an individual is to receive mail, and to define mailing lists. (See RFC's 882, 883, and 973 for specifications of the domain system. RFC 974 defines the use of the domain system in sending mail.) 1.4. Routing The description above indicated that the IP implementation is responsible for getting datagrams to the destination indicated by the destination address, but little was said about how this would be done. The task of finding how to get a datagram to its destination is referred to as "routing". In fact many of the details depend upon the particular implementation. However some general things can be said. First, it is necessary to understand the model on which IP is based. IP assumes that a system is attached to some local network. We assume that the system can send datagrams to any other system on its own network. (In the case of Ethernet, it simply finds the Ethernet address of the destination system, and puts the datagram out on the Ethernet.) The problem comes when a system is asked to send a datagram to a system on a different network. This problem is handled by gateways. A gateway is a system that connects a network with one or more other networks. Gateways are often normal computers that happen to have more than one network interface. For example, we have a Unix machine that has two different Ethernet interfaces. Thus it is connected to networks 128.6.4 and 128.6.3. This machine can act as a gateway between those two networks. The software on that machine must be set up so that it will forward datagrams from one network to the other. That is, if a machine on network 128.6.4 sends a datagram to the gateway, and the datagram is addressed to a machine on network 128.6.3, the gateway will forward the datagram to the destination. Major communications centers often have gateways that connect a number of different networks. 
(In many cases, special-purpose gateway systems provide better performance or reliability than general-purpose systems acting as gateways. A number of vendors sell such systems.)

Routing in IP is based entirely upon the network number of the destination address. Each computer has a table of network numbers. For each network number, a gateway is listed. This is the gateway to be used to get to that network. Note that the gateway doesn't have to connect directly to the network. It just has to be the best place to go to get there. For example at Rutgers, our interface to NSFnet is at the John von Neumann Supercomputer Center (JvNC). Our connection to JvNC is via a high-speed serial line connected to a gateway whose address is 128.6.3.12. Systems on net 128.6.3 will list 128.6.3.12 as the gateway for many off-campus networks. However systems on net 128.6.4 will list 128.6.4.1 as the gateway to those same off-campus networks. 128.6.4.1 is the gateway between networks 128.6.4 and 128.6.3, so it is the first step in getting to JvNC.

When a computer wants to send a datagram, it first checks to see if the destination address is on the system's own local network. If so, the datagram can be sent directly. Otherwise, the system expects to find an entry for the network that the destination address is on. The datagram is sent to the gateway listed in that entry. This table can get quite big. For example, the Internet now includes several hundred individual networks. Thus various strategies have been developed to reduce the size of the routing table. One strategy is to depend upon "default routes". Often, there is only one gateway out of a network. This gateway might connect a local Ethernet to a campus-wide backbone network. In that case, we don't need to have a separate entry for every network in the world. We simply define that gateway as a "default". When no specific route is found for a datagram, the datagram is sent to the default gateway.
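The decision just described (send directly, use a listed gateway, or fall back on the default) can be sketched in Python. This is a simplified illustration and not the way any particular implementation stores its table: networks are written as dotted prefixes such as "128.6.4" and matched octet by octet, the table contents are assumed for a host on network 128.6.4, and the names used are invented for this example.

```python
# Assumed configuration for a host on network 128.6.4.
LOCAL_NETWORK = "128.6.4"
ROUTES = {"128.6.3": "128.6.4.1"}   # network -> gateway to use
DEFAULT_GATEWAY = "128.6.4.1"


def on_network(address, network):
    """True if the leading octets of the address equal the network number."""
    net_octets = network.split(".")
    return address.split(".")[:len(net_octets)] == net_octets


def next_hop(destination, local_network=LOCAL_NETWORK,
             routes=ROUTES, default_gateway=DEFAULT_GATEWAY):
    """Return (kind, address) saying where to send a datagram first."""
    if on_network(destination, local_network):
        return ("direct", destination)
    for network, gateway in routes.items():
        if on_network(destination, network):
            return ("gateway", gateway)
    if default_gateway is not None:
        return ("default", default_gateway)
    raise LookupError("no route to " + destination)


if __name__ == "__main__":
    # 192.12.5.9 is an arbitrary off-campus address chosen for the example.
    for dest in ("128.6.4.7", "128.6.3.12", "192.12.5.9"):
        print(dest, "->", next_hop(dest))
```

With a default in place, the table needs entries only for the networks that should not go through the default gateway.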
A default gateway can even be used when there are several gateways on a network. There are provisions for gateways to send a message saying "I'm not the best gateway -- use this one instead." (The message is sent via ICMP. See RFC 792.) Most network software is designed to use these messages to add entries to its routing tables. Suppose network 128.6.4 has two gateways, 128.6.4.59 and 128.6.4.1. 128.6.4.59 leads to several other internal Rutgers networks. 128.6.4.1 leads indirectly to the NSFnet. Suppose we set 128.6.4.59 as a default gateway, and have no other routing table entries. Now what happens when we need to send a datagram to MIT? MIT is network 18. Since we have no entry for network 18, the datagram will be sent to the default, 128.6.4.59. As it happens, this gateway is the wrong one. So it will forward the datagram to 128.6.4.1. But it will also send back an error saying in effect: "to get to network 18, use 128.6.4.1". Our software will then add an entry to the routing table. Any future datagrams to MIT will then go directly to 128.6.4.1. (The error message is sent using the ICMP protocol. The message type is called "ICMP redirect.")

Most IP experts recommend that individual computers should not try to keep track of the entire network. Instead, they should start with default gateways, and let the gateways tell them the routes, as just described. However this doesn't say how the gateways should find out about the routes. The gateways can't depend upon this strategy. They have to have fairly complete routing tables. For this, some sort of routing protocol is needed. A routing protocol is simply a technique for the gateways to find each other, and keep up to date about the best way to get to every network. RFC 1009 contains a review of gateway design and routing. However rip.doc is probably a better introduction to the subject. It contains some tutorial material, and a detailed description of the most commonly-used routing protocol.

1.5.
Details about Internet addresses: subnets and broadcasting As indicated earlier, Internet addresses are 32-bit numbers, normally written as 4 octets (in decimal), e.g. 128.6.4.7. There are actually 3 different types of address. The problem is that the address has to indicate both the network and the host within the network. It was felt that eventually there would be lots of networks. Many of them would be small, but probably 24 bits would be needed to represent all the IP networks. It was also felt that some very big networks might need 24 bits to represent all of their hosts. This would seem to lead to 48 bit addresses. But the designers really wanted to use 32 bit addresses. So they adopted a kludge. The assumption is that most of the networks will be small. So they set up three different ranges of address. Addresses beginning with 1 to 126 use only the first octet for the network number. The other three octets are available for the host number. Thus 24 bits are available for hosts. These numbers are used for large networks. But there can only be 126 of these very big networks. The Arpanet is one, and there are a few large commercial networks. But few normal organizations get one of these "class A" addresses. For normal large organizations, "class B" addresses are used. Class B addresses use the first two octets for the network number. Thus network numbers are 128.1 through 191.254. (We avoid 0 and 255, for reasons that we see below. We also avoid addresses beginning with 127, because that is used by some systems for special purposes.) The last two octets are available for host addesses, giving 16 bits of host address. This allows for 64516 computers, which should be enough for most organizations. (It is possible to get more than one class B address, if you run out.) Finally, class C addresses use three octets, in the range 192.1.1 to 223.254.254. These allow only 254 hosts on each network, but there can be lots of these networks. 
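As a rough illustration of the three address ranges, the following Python sketch classifies a dotted address by its first octet and splits it into network and host parts. The function names are made up for this example, and it ignores subnetting entirely. The last line checks the host count quoted for a class B network: two host octets, each avoiding 0 and 255, give 254 x 254 possibilities.

```python
# Number of leading octets that form the network number in each class.
NETWORK_OCTETS = {"A": 1, "B": 2, "C": 3}


def address_class(addr):
    """Return "A", "B" or "C" for a dotted address, or None otherwise."""
    first = int(addr.split(".")[0])
    if 1 <= first <= 126:
        return "A"
    if 128 <= first <= 191:
        return "B"
    if 192 <= first <= 223:
        return "C"
    return None          # 0, 127, and anything above 223


def split_address(addr):
    """Split a dotted address into (network number, host number)."""
    cls = address_class(addr)
    if cls is None:
        raise ValueError("not a class A, B or C address: " + addr)
    octets = addr.split(".")
    n = NETWORK_OCTETS[cls]
    return ".".join(octets[:n]), ".".join(octets[n:])


print(address_class("128.6.4.7"), split_address("128.6.4.7"))
print(254 * 254)         # usable host numbers on a class B network
```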
Addresses above 223 are reserved for future use, as class D and E (which are currently not defined).

Many large organizations find it convenient to divide their network number into "subnets". For example, Rutgers has been assigned a class B address, 128.6. We find it convenient to use the third octet of the address to indicate which Ethernet a host is on. This division has no significance outside of Rutgers. A computer at another institution would treat all datagrams addressed to 128.6 the same way. They would not look at the third octet of the address. Thus computers outside Rutgers would not have different routes for 128.6.4 or 128.6.5. But inside Rutgers, we treat 128.6.4 and 128.6.5 as separate networks. In effect, gateways inside Rutgers have separate entries for each Rutgers subnet, whereas gateways outside Rutgers just have one entry for 128.6. Note that we could do exactly the same thing by using a separate class C address for each Ethernet. As far as Rutgers is concerned, it would be just as convenient for us to have a number of class C addresses. However, using class C addresses would make things inconvenient for the rest of the world. Every institution that wanted to talk to us would have to have a separate entry for each one of our networks. If every institution did this, there would be far too many networks for any reasonable gateway to keep track of. By subdividing a class B network, we hide our internal structure from everyone else, and save them trouble. This subnet strategy requires special provisions in the network software. It is described in RFC 950.

0 and 255 have special meanings. 0 is reserved for machines that don't know their address. In certain circumstances it is possible for a machine not to know the number of the network it is on, or even its own host address. For example, 0.0.0.23 would be a machine that knew it was host number 23, but didn't know on what network. 255 is used for "broadcast".
A broadcast is a message that you want every system on the network to see. Broadcasts are used in some situations where you don't know who to talk to. For example, suppose you need to look up a host name and get its Internet address. Sometimes you don't know the address of the nearest name server. In that case, you might send the request as a broadcast. There are also cases where a number of systems are interested in information. It is then less expensive to send a single broadcast than to send datagrams individually to each host that is interested in the information. In order to send a broadcast, you use an address that is made by using your network address, with all ones in the part of the address where the host number goes. For example, if you are on network 128.6.4, you would use 128.6.4.255 for broadcasts. How this is actually implemented depends upon the medium. It is not possible to send broadcasts on the Arpanet, or on point to point lines. However it is possible on an Ethernet. If you use an Ethernet address with all its bits on (all ones), every machine on the Ethernet is supposed to look at that datagram. Although the official broadcast address for network 128.6.4 is now 128.6.4.255, there are some other addresses that may be treated as broadcasts by certain implementations. For convenience, the standard also allows 255.255.255.255 to be used. This refers to all hosts on the local network. It is often simpler to use 255.255.255.255 instead of finding out the network number for the local network and forming a broadcast address such as 128.6.4.255. In addition, certain older implementations may use 0 instead of 255 to form the broadcast address. Such implementations would use 128.6.4.0 instead of 128.6.4.255 as the broadcast address on network 128.6.4. Finally, certain older implementations may not understand about subnets. Thus they consider the network number to be 128.6. In that case, they will assume a broadcast address of 128.6.255.255 or 128.6.0.0. 
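The broadcast forms listed above can be generated mechanically from a network prefix. The Python sketch below is illustrative only: the function names are invented, prefixes are written as dotted strings, and the list of "candidate" broadcast addresses simply collects the variants mentioned in the text (all ones, the older all-zeros form, and the forms used by software that does not understand subnets).

```python
def broadcast_address(network, fill=255):
    """Fill the host part of a dotted network prefix with `fill`."""
    octets = network.split(".")
    return ".".join(octets + [str(fill)] * (4 - len(octets)))


def looks_like_broadcast(addr, subnet, network):
    """True if `addr` is any broadcast form a host on `subnet` might see."""
    candidates = {
        "255.255.255.255",                 # all hosts on the local network
        broadcast_address(subnet),         # official form, e.g. 128.6.4.255
        broadcast_address(subnet, 0),      # older implementations use 0
        broadcast_address(network),        # software unaware of subnets
        broadcast_address(network, 0),
    }
    return addr in candidates


print(broadcast_address("128.6.4"))
print(broadcast_address("128.6", 0))
print(looks_like_broadcast("128.6.255.255", "128.6.4", "128.6"))
```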
Until support for broadcasts is implemented properly, it can be a somewhat dangerous feature to use.

Because 0 and 255 are used for unknown and broadcast addresses, normal hosts should never be given addresses containing 0 or 255. Addresses should never begin with 0, 127, or any number above 223. Addresses violating these rules are sometimes referred to as "Martians", because of rumors that the Central University of Mars is using network 225.

1.6. Datagram fragmentation and reassembly

TCP/IP is designed for use with many different kinds of network. Unfortunately, network designers do not agree about how big packets can be. Ethernet packets can be 1500 octets long. Arpanet packets have a maximum of around 1000 octets. Some very fast networks have much larger packet sizes. At first, you might think that IP should simply settle on the smallest possible size. Unfortunately, this would cause serious performance problems. When transferring large files, big packets are far more efficient than small ones. So we want to be able to use the largest packet size possible. But we also want to be able to handle networks with small limits. There are two provisions for this. First, TCP has the ability to "negotiate" about datagram size. When a TCP connection first opens, both ends can send the maximum datagram size they can handle. The smaller of these numbers is used for the rest of the connection. This allows two implementations that can handle big datagrams to use them, but also lets them talk to implementations that can't handle them. However, this doesn't completely solve the problem. The most serious problem is that the two ends don't necessarily know about all of the steps in between. For example, when sending data between Rutgers and Berkeley, it is likely that both computers will be on Ethernets. Thus they will both be prepared to handle 1500-octet datagrams. However, the connection will at some point end up going over the Arpanet.
It can't handle packets of that size. For this reason, there are provisions to split datagrams up into pieces. (This is referred to as "fragmentation".) The IP header contains fields indicating that a datagram has been split, and enough information to let the pieces be put back together. If a gateway connects an Ethernet to the Arpanet, it must be prepared to take 1500-octet Ethernet packets and split them into pieces that will fit on the Arpanet. Furthermore, every host implementation of TCP/IP must be prepared to accept pieces and put them back together. This is referred to as "reassembly".

TCP/IP implementations differ in the approach they take to deciding on datagram size. It is fairly common for implementations to use 576-byte datagrams whenever they can't verify that the entire path is able to handle larger packets. This rather conservative strategy is used because of the number of implementations with bugs in the code to reassemble fragments. Implementors often try to avoid ever having fragmentation occur. Different implementors take different approaches to deciding when it is safe to use large datagrams. Some use them only for the local network. Others will use them for any network on the same campus. 576 bytes is a "safe" size, which every implementation must support.

1.7. Ethernet encapsulation: ARP

There was a brief discussion earlier about what IP datagrams look like on an Ethernet. The discussion showed the Ethernet header and checksum. However, it left one hole: it didn't say how to figure out what Ethernet address to use when you want to talk to a given Internet address. In fact, there is a separate protocol for this, called ARP ("address resolution protocol"). (Note by the way that ARP is not an IP protocol. That is, the ARP datagrams do not have IP headers.) Suppose you are on system 128.6.4.194 and you want to connect to system 128.6.4.7.
Your system will first verify that 128.6.4.7 is on the same network, so it can talk directly via Ethernet. Then it will look up 128.6.4.7 in its ARP table, to see if it already knows the Ethernet address. If so, it will stick on an Ethernet header, and send the packet. But suppose this system is not in the ARP table. There is no way to send the packet, because you need the Ethernet address. So it uses the ARP protocol to send an ARP request. Essentially an ARP request says "I need the Ethernet address for 128.6.4.7". Every system listens to ARP requests. When a system sees an ARP request for itself, it is required to respond. So 128.6.4.7 will see the request, and will respond with an ARP reply saying in effect "128.6.4.7 is 8:0:20:1:56:34". (Recall that Ethernet addresses are 48 bits. This is 6 octets. Ethernet addresses are conventionally shown in hex, using the punctuation shown.) Your system will save this information in its ARP table, so future packets will go directly. Most systems treat the ARP table as a cache, and clear entries in it if they have not been used in a certain period of time.

Note by the way that ARP requests must be sent as "broadcasts". There is no way that an ARP request can be sent directly to the right system. After all, the whole reason for sending an ARP request is that you don't know the Ethernet address. So an Ethernet address of all ones is used, i.e. ff:ff:ff:ff:ff:ff. By convention, every machine on the Ethernet is required to pay attention to packets with this as an address. So every system sees every ARP request. They all look to see whether the request is for their own address. If so, they respond. If not, they could just ignore it. (Some hosts will use ARP requests to update their knowledge about other hosts on the network, even if the request isn't for them.) Note that packets whose IP address indicates broadcast (e.g. 255.255.255.255 or 128.6.4.255) are also sent with an Ethernet address that is all ones.

1.8. Getting more information

This directory contains documents describing the major protocols. There are literally hundreds of documents, so we have chosen the ones that seem most important. Internet standards are called RFC's. RFC stands for Request for Comments. A proposed standard is initially issued as a proposal, and given an RFC number. When it is finally accepted, it is added to Official Internet Protocols, but it is still referred to by the RFC number. We have also included two IEN's. (IEN's used to be a separate classification for more informal documents. This classification no longer exists -- RFC's are now used for all official Internet documents, and a mailing list is used for more informal reports.) The convention is that whenever an RFC is revised, the revised version gets a new number. This is fine for most purposes, but it causes problems with two documents: Assigned Numbers and Official Internet Protocols. These documents are being revised all the time, so the RFC number keeps changing. You will have to look in rfc-index.txt to find the number of the latest edition.

Anyone who is seriously interested in TCP/IP should read the RFC describing IP (791). RFC 1009 is also useful. It is a specification for gateways to be used by NSFnet. As such, it contains an overview of a lot of the TCP/IP technology. You should probably also read the description of at least one of the application protocols, just to get a feel for the way things work. Mail is probably a good one (821/822). TCP (793) is of course a very basic specification. However, the spec is fairly complex, so you should only read this when you have the time and patience to think about it carefully. Fortunately, the author of the major RFC's (Jon Postel) is a very good writer. The TCP RFC is far easier to read than you would expect, given the complexity of what it is describing. You can look at the other RFC's as you become curious about their subject matter.
Here is a list of the documents you are more likely to want:

    rfc-index   list of all RFC's
    rfc1012     somewhat fuller list of all RFC's
    rfc1011     Official Protocols. It's useful to scan this to see what
                tasks protocols have been built for. This defines which
                RFC's are actual standards, as opposed to requests for
                comments.
    rfc1010     Assigned Numbers. If you are working with TCP/IP, you
                will probably want a hardcopy of this as a reference.
                It's not very exciting to read. It lists all the
                officially defined well-known ports and lots of other
                things.
    rfc1009     NSFnet gateway specifications. A good overview of IP
                routing and gateway technology.
    rfc1001/2   netBIOS: networking for PC's
    rfc973      update on domains
    rfc959      FTP (file transfer)
    rfc950      subnets
    rfc937      POP2: protocol for reading mail on PC's
    rfc894      how IP is to be put on Ethernet, see also rfc825
    rfc882/3    domains (the database used to go from host names to
                Internet address and back -- also used to handle UUCP
                these days). See also rfc973
    rfc854/5    telnet - protocol for remote logins
    rfc826      ARP - protocol for finding out Ethernet addresses
    rfc821/2    mail
    rfc814      names and ports - general concepts behind well-known
                ports
    rfc793      TCP
    rfc792      ICMP
    rfc791      IP
    rfc768      UDP
    rip.doc     details of the most commonly-used routing protocol
    ien-116     old name server (still needed by several kinds of
                system)
    ien-48      the Catenet model, general description of the
                philosophy behind TCP/IP

The following documents are somewhat more specialized.

    rfc813      window and acknowledgement strategies in TCP
    rfc815      datagram reassembly techniques
    rfc816      fault isolation and resolution techniques
    rfc817      modularity and efficiency in implementation
    rfc879      the maximum segment size option in TCP
    rfc896      congestion control
    rfc827,888,904,975,985
                EGP and related issues

To those of you who may be reading this document remotely instead of at Rutgers: The most important RFC's have been collected into a three-volume set, the DDN Protocol Handbook.
It is available from the DDN Network Information Center, SRI International, 333 Ravenswood Avenue, Menlo Park, California 94025 (telephone: 800-235-3155). You should be able to get them via anonymous FTP from sri-nic.arpa. rip.doc is available by anonymous FTP from topaz.rutgers.edu, as /pub/tcp-ip-docs/rip.doc.

IBM PC 360k floppies with ARC'ed versions of the RFC's and IEN's are also available from the TAPR office, thanks to Andy Freeborn, N0CCZ.

1.9. Overview of the KA9Q Internet Package

The software associated with this document represents the culmination of what might be described as a first phase of implementation. The emphasis to date has been on building a robust platform on which to build real networks. To this end, the core protocols have been extensively tested and verified. In addition, great emphasis has been placed on increasing the portability of the software, supporting more and more hardware interfaces, and making it possible to use as many networking technologies (asynch or RS-232 lines, Ethernet, various packet radio interfaces, digipeaters, NET/ROM, etc) as possible.

The down side is that the user interface can be described at best as "terse". The good news is that many individuals are working on improving the interface, and great strides have been made in the Macintosh implementation. In the meantime, we ask only that you realize what our priorities have been, and understand that even the implementors aren't always proud of "how it looks".

This release provides support for the IP, ICMP, TCP, UDP, FTP, SMTP, and Telnet protocols from the basic Arpanet set. In addition, the ARP protocol is available for address resolution on AX.25 and Ethernet interfaces, and support is provided for NET/ROM used as a transport.
It is unfortunately necessary, as a result of the ongoing NET/ROM vs TheNet debate, to mention that the NET/ROM implementation included here is the original work of Dan Frank, W9NK, working solely from documents published by Software 2000.

This release includes sources that are known to compile and run well on PC clones using MS-DOS, several flavors of System V Unix, including HP-UX and Microport on AT clones, the HP Portable Plus, the Atari ST, and the NEC PC-9801. Binaries are available on floppy for the PC and clones as part of this release. Floppies are available for the Macintosh version, which is maintained separately but in parallel with the mainstream release.

Other machines for which code is provided that may or may not (probably not) work include the Amiga and BSD Unix.

2. Installation

2.1. What an IP Address Is, and How to Get One

IP Addresses are 32 bit numbers that uniquely identify a given machine (or "host") running the TCP/IP protocol suite. All of the possible 32 bit numbers are coordinated by an entity known as the Network Information Center, or NIC. Amateur Radio operators are fortunate in that a "Class A Subnet" consisting of 24 bits of address, in the range 44.X.X.X, has been reserved for our use. By general consensus, Brian Kantor, WB6CYT, of San Diego, CA, now serves as the top level administrator of the 44.X.X.X address space, and assigns blocks of addresses to regional coordinators from around the world.

You need to have a unique address before you can link in with the rest of the networked world. The best way to get one is to ask around the local packet community and find out who your local address coordinator is. Your local coordinator will then assign you an address from the block for your area. Brian Kantor can be reached as brian@ucsd.edu on the Internet if you need help locating your local address coordinator.

2.2. Configuring a TNC for TCP/IP Operation

This section describes the procedure for configuring various packet radio Terminal Node Controller units (TNC's) for operation with the KA9Q package. Readers who will be using the package with only SLIP or Ethernet (wired) connections can feel free to skip ahead to section 2.3.

There are now several choices for TNCs to be used with the TCP/IP network code. Versions of the Keep It Simple Stupid TNC interface software (KISS) are available for the TNC-1, the TNC-2, the VADCG board and clones (Ashby), the Kantronics family of TNCs, and the AEA TNCs. Following are the different setup/configuration modes for the different TNCs.

2.2.1. TAPR TNC-1 and Clones

The firmware for the TNC-1 is available in either a downloadable version or a stand alone version. I will describe only the stand alone version here. Locate the ROM labeled E000 and remove it. Insert the KISS PROM in its place, making sure that you orient the PROM in the same direction (failure to do so will result in a smoked PROM). Connect your TNC-1 to your computer using an RS-232 cable. A cable that passes the signals from pins 2, 3, and 7 is sufficient.

Since the TNC-1 has no switches for setting the baud rate to the computer, the firmware has been "hard wired" to 4800 baud. See the documentation that comes with the TNC-1 version of KISS for instructions on how to patch the .HEX file for other baud rates.

There is also a newer version of the TNC-1 KISS firmware that is documented in the TNC_TNC1.ARC file in the distribution. TAPR can provide programmed TNC-1 KISS EPROMs.

2.2.2. TAPR TNC-2 and Clones

The standard firmware for the TNC-2 now supports a 'KISS' command to turn on KISS support. If you wish to use the KISS command included in 1.1.6 firmware, read your TNC documentation for more info.
If you want to run KISS only, or have an older TNC-2 without the KISS command, dig out the TNC_TNC2.ARC package and read the docs included on how to program an EPROM with the firmware (or buy a ROM from TAPR), and then proceed.

Open up your TNC and locate the ROM. It is in the socket labeled "U23." Using a small nail file or screwdriver, gently pry up the existing EPROM. Carefully press the new EPROM into place, being sure that the orientation is the same. If you are installing the 2764 type of EPROM you will need to make a small modification to the TNC. There is a location on the board just above the first RAM socket labeled JMP-6. Turn the board over and cut the trace joining the two pads. Solder a two-pin jumper header in place. The jumper at JMP-6 should be removed when a type 2764 EPROM is being used, and installed when a type 27256 EPROM is being used. That should complete the hardware part of the installation. As an alternative you may choose to burn the KISS code into a 27256 and not bother with jumpers.

Attach your TNC to your PC using an RS-232C cable. You can use the same cable that you were using to connect your PC to your TNC. If you are doing this for the first time and are not sure about your cabling, a cable with just pins 2, 3, and 7 passed through is sufficient. Some PCs like to see the signals Clear To Send (CTS, pin 5), Data Set Ready (DSR, pin 6), and Data Carrier Detect (DCD, pin 8) asserted. You can set this up by jumpering pin 4 to 5, and pin 20 to pins 6 and 8 at the female DB-25 connector that goes to the PC.

Now verify that the TNC still works. Apply power to the TNC and turn it on. The STA, CON, and PWR LEDs should come on and the STA and CON lights should go out again about 1 second later. If you have the type 2764 EPROM with the KISS code in it, one or both of the STA and CON LEDs will begin to flash. If the CON LED flashes you have 8 Kb of RAM in the TNC. If the STA LED flashes you have 16 Kb of RAM. If both LEDs flash you have 32 Kb of RAM.
The flashing of the LEDs verifies the proper operation of the TNC.

2.2.3. AEA PK-232

If you have one of these boxes, congratulations! You do not have to change PROMs! KISS is already installed as a standard feature if you have a recent release of the firmware: 4-MAR-87 or later for the PK-232, or 21-JAN-87 or later for the PK-87. To make it work, first ensure that your computer can communicate with the TNC in standard packet mode. This will ensure that the computer, TNC, cabling, and radio are all operating properly.

[Please note that one of the commands, "PACKET", is not valid on the PK-87 and will only elicit a "Huh?" response. Please note that comments have been added to the commands. Do not type the information following the double dash or type the double dash itself.]

Here is the sequence of commands that will turn on the KISS mode for the AEA products:

    AWLEN 8        -- ensure it can speak 8 bit data
    PARITY 0       -- no parity
    RESTART        -- warm reset; make commands take effect
    PACKET         -- PK-232 or Heath only
    TONE 3         -- PK-87 only
    START 0        -- disable software flow control
    STOP 0
    XON 0
    XOFF 0
    XFLOW off
    CONMODE trans  -- pass through all characters
    HPOLL off      -- disable host polling
    KISS on        -- enable KISS version of host mode
    RAWHDLC on     -- turn off AX25L2 (now handled by the PC)
    PPERSIST on    -- turn off DWAIT and enable p-persistence
    HOST on        -- start KISS running

The PK-87 or the PK-232 will remain in the KISS mode until you send a break (~200ms of spacing) or until you send the command character three times (^C ^C ^C) in quick succession. Beware! Some terminal emulators (like YAPP) will send a break signal when you exit from them. That will undo your work and cause all manner of confusion. The terminal program PROCOMM seems to work just fine.
The TNC may also be switched back to ordinary AX.25 mode by issuing the following command from within NET.EXE:

    param ax0 255

AEA is rumored to be reworking their software so that entering just the "KISS" command will do all of the above. Check your documentation to see how your version works.

2.2.4. Kantronics TNC's

Kantronics includes KISS support in their products. It is the simplest of the commercial implementations of KISS to configure and use. First set up and operate your KAM, KPC-II, or KPC-4 for standard packet operation. This will ensure that the computer, TNC, cabling, and radio are operating properly. Use your terminal program to send the following commands:

    ABAUD 4800   -- set the baud rate to what you will be using when
                    net is running (set by the attach command)
    DWAIT 0      -- disable DWAIT
    PERSIST 50   -- enable persistence and set it to about 20%
    SLOTTIME 16  -- set the slot time to 160 ms
    KISS ON      -- enable KISS mode at the next reset
    PERM         -- make above command permanent so that KISS will be
                    entered whenever TNC is powered up
    RESET        -- start KISS

If you wish to have the TNC revert to ordinary AX.25 mode of operation, you should omit the PERM command from the above sequence. That way the TNC will revert to ordinary AX.25 mode when the power is removed and restored to the TNC.

The TNC may be switched back to ordinary AX.25 mode by issuing the command:

    param ax0 255

This command will work even if the PERM command has been used to make KISS the default mode of operation.

2.2.5. Paccomm PC-100 Card

There have been problems in the development of the driver for this card, and though support is included in this release, it is unclear whether the driver provided works at all, or what the proper way to configure the PC-100 is. An individual is working on improving the driver, and we hope to include his results soon.

2.2.6. DRSI

DRSI provides a copy of the KA9Q package configured for their card directly.
Contact DRSI about the current level of support they provide. At some point, their driver will hopefully be integrated back into the mainstream release.

2.3. IBM PC and Clones

2.3.1. Installing the Plug'N'Play Disk

For users of IBM PC's and Clones, we try to make life as simple as possible. There is a 360k floppy disk available from TAPR (see the appendices for contact information) that is almost ready to go. You "Plug" the disk in, edit a couple of files with your favorite text editor, and then you're ready to "Play". Instructions on editing the files are embedded as comments in the files, with a readme file on the disk to get you started. For completeness, information about the files is included here as well.

2.3.1.1. The AUTOEXEC.NET File

The AUTOEXEC.NET file (called STARTUP.NET under Unix, and other things elsewhere) works a lot like the AUTOEXEC.BAT file in DOS, hence the name. When NET first starts up, it reads AUTOEXEC.NET and executes all of the commands as if they had been typed in to the program from the keyboard. This provides an easy mechanism for setting up the initial system configuration, including setting the hostname, AX.25 parameters, interfaces used, servers to start, and protocol variables.

The suggested procedure is to start with the AUTOEXEC.NET file included on the plug and play disk, and go from there. If you don't have the plug and play disk, review the command summary elsewhere in this document, and wing it...

2.3.1.2. The FTPUSERS File

Since MS-DOS was designed as a single-user operating system, it provides no access control; all files can be read, written or deleted by the local user. It is usually undesirable to give such open access to a system to remote network users. The FTP server therefore provides its own access control mechanism.

The file "/ftpusers" is used to control remote FTP access. The default is NO access; if this file does not exist, the FTP server will be unusable.
A remote user must first "log in" to the system, giving a valid name and password listed in /ftpusers, before he or she can transfer files.

Each entry in /ftpusers consists of a single line of the form

    username password path1 permissions1 path2 permissions2 ...

There must be exactly one space between each field. Comment lines are begun with "#" in column one.

"username" is the user's login name.

"password" is the required password. Note that this is in plaintext; therefore it is not a good idea to give general read permission to the root directory. A password of "*" (a single asterisk) means that any password is acceptable.

"/pathN" is an allowable prefix on accessible files. Before any file or directory operation, the current directory and the user specified file name are joined to form an absolute path name in "canonical" form (i.e., a full path name starting at the root, with "./" and "../" references, as well as redundant /'s, recognized and removed). The result MUST begin with an allowable path prefix; if not, the operation is denied. NB! Under MS-DOS, this field must use backslashes ("\"), NOT forward slashes ("/"). This field must always begin with a "/", i.e., at the root directory.

"permissionsN" is a decimal number granting permission for read, create and write operations. If the low order bit (0x1) is set, the user is allowed to read a file subject to the path name prefix restriction. If the next bit (0x2) is set, the user is allowed to create a new file if it does not overwrite an existing file. If the third bit (0x4) is set, the user is allowed to write a file even if it overwrites an existing file, and in addition he may delete files. Again, all operations are allowed subject to the path name prefix restrictions. Permissions may be combined by adding bits, for example, 0x3 (= 0x2 + 0x1) means that the user is given read and create permission, but not overwrite/delete permission.
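To make the entry format and the permission bits concrete, here is a small Python sketch of how such a file could be interpreted. It is not the code used by NET: the function and constant names are invented for illustration, only forward-slash paths are handled, and the prefix test is a plain "begins with" comparison as described above.

```python
import posixpath

# Permission bits as described in the text.
READ, CREATE, WRITE = 0x1, 0x2, 0x4


def parse_entry(line):
    """Parse one /ftpusers line; return None for comments and blank lines."""
    line = line.rstrip("\r\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split(" ")             # exactly one space between fields
    name, password, rest = fields[0], fields[1], fields[2:]
    # Remaining fields come in (path prefix, decimal permissions) pairs.
    grants = [(rest[i], int(rest[i + 1])) for i in range(0, len(rest) - 1, 2)]
    return name, password, grants


def password_ok(required, given):
    """A required password of "*" accepts anything."""
    return required == "*" or required == given


def canonical(cwd, name):
    """Join and clean up a path: resolves "./", "../" and doubled slashes."""
    return posixpath.normpath(posixpath.join(cwd, name))


def permitted(grants, cwd, name, needed):
    """True if some allowed prefix covers the file with the needed bit(s)."""
    path = canonical(cwd, name)
    return any(path.startswith(prefix) and (perms & needed) == needed
               for prefix, perms in grants)


name, password, grants = parse_entry("guest bletch /g/bogus 3")
print(permitted(grants, "/g/bogus", "notes.txt", READ))      # read allowed?
print(permitted(grants, "/g/bogus", "notes.txt", WRITE))     # overwrite?
print(permitted(grants, "/g/bogus", "../secret.txt", READ))  # escapes prefix?
```

Canonicalizing before the prefix test is what stops a "../" reference from escaping the permitted directory.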
For example, suppose /ftpusers on machine "pc.ka9q.ampr" contains the line

    friendly test /testdir 7

A session using this account would look like this:

    net> ftp pc.ka9q.ampr
    SYN Sent
    Established
    250 pc.ka9q.ampr FTP version 871225.5 ready at Wed Jan 20 16:27:18 1988
    user friendly
    331 Enter PASS command
    pass test
    230 Logged in

The user now has read, write, overwrite and delete privileges for any file under /testdir; he may not access any other files.

Here are some more sample entries in /ftpusers:

    karn foobar / 7          # User "karn" password "foobar" may read,
                             # write, overwrite and delete any file on
                             # system.
    guest bletch /g/bogus 3  # User "guest" password "bletch" may read
                             # any file under /g/bogus and its subdirs,
                             # and may create new files that do not
                             # overwrite existing files. He may NOT
                             # delete any files.
    anonymous * /public 1    # User "anonymous" (any password) may read
                             # files under /public and subdir; he may
                             # not create, overwrite or delete any
                             # files.

The last entry is a standard convention for keeping a repository of downloadable files; in particular, the username "anonymous" is an established ARPAnet convention. Every system providing an FTP server is encouraged to provide restricted access to an 'anonymous' user.

2.3.1.3. The HOSTS.NET File

The file HOSTS.NET provides a mapping between internet addresses and symbolic hostnames. It is used by NET to look up a hostname to figure out the correct IP address to use. This version of NET does not include nameserver support (see the discussion earlier in this document), and so uses this static file for name lookups. Tabs are recommended between the host number and host name. Here is an example of some HOSTS.NET entries:

    44.96.0.2	wb2sef xt.wb2sef
    44.96.0.16	n8fjb
    44.96.0.17	ka3lyq

Note that the domain name .AMPR.ORG has been assigned for amateur radio.
By default, we assume that the hostname is the user's callsign in the case where a user has one system online, and so .AMPR.ORG is the implied official hostname. If you have more than one machine on the air, distinguish them by prefixing a familiar name followed by a period, as in "winfree.n3eua" or "at.n0ccz".

Note that the use of a callsign as a host name has nothing to do with the "mycall" parameter. It is convenient to use the callsign as a hostname, and required to use the callsign for "mycall" to properly identify a station according to FCC rules.

2.3.2. Installing on a Hard Disk

To install the software on a hard disk, just clone the directory structure and file layout from the floppy disk. All paths are relative to the root directory of the current drive.

2.4. Unix

To run NET under Unix, you'll need to compile the program from sources. To do so, unpack the source archive into a directory, edit the beginning of makefile.unx to pick your Unix variant, edit config.h to enable the appropriate interface hardware (slip and kiss are probably all that will work), then run 'make -f makefile.unx'. There's nothing wrong with copying the makefile.unx file to makefile and doing the editing there... personal preference.

The basic requirements are that the serial ports to be used by net must have their permissions set so that they are read-write for the userid that will run net. For example, you can define a user named 'net' and make that user own tty000 and tty001. The protection for the ttys should be crw------- (600). Logins must be turned off for those ports, i.e. there must not be any other process, such as a getty or init, trying to access them.

The attach line is virtually the same as for the PC, except that the I/O address argument is ignored and the I/O vector argument is now the tty name.
For example:

        attach asy 0 /dev/tty000 ax25 ax0 2048 256 4800
        attach asy 0 /dev/tty001 ax25 ax1 2048 256 4800

The Unix version of Net uses two environment variables, NETHOME and NETSPOOL. A possible configuration might be

        NETHOME=/usr/net
        NETSPOOL=/usr/spool

The directories needed are /usr/net, /usr/net/finger, /usr/spool/mail/, and /usr/spool/mqueue.

See also the documentation on the W2XO BBS (sources and documentation are included in the NET source distribution), as there are some important interactions if you intend to run the PBBS code with NET under Unix.

The Unix version of NET currently supports only serial ports, with the KISS and SLIP protocols.

2.5. Macintosh

This release does not include Macintosh code. A separate group is working on the Macintosh, using the same system independent protocol modules, but with a user interface that is much more closely related to the expected Macintosh environment. Installation documentation for the Mac is included with the Mac version of the software, available from .

2.6. Atari ST

Installation for the Atari version of NET is not yet available. Your best bet is to stare at the sources, in config.h and files.h, as well as st.c and st.h. We hope to include documentation in the next revision of this manual.

2.7. NEC PC-9801

Installation on the NEC PC-98 family is the same as for the IBM PC and clones, except that you need to have the version of NET.EXE that includes handling for the serial port(s) in the NEC machine.

2.8. Hewlett-Packard Portable Plus

Installation on the Portable Plus is the same as for the IBM PC and clones, except that you need to have the version of NET.EXE that is designed for the Portable Plus.

3. Taking NET for a Test Drive

For the quick introduction to NET provided in this section, we assume that you are using an IBM PC or clone with the Plug'n'Play disk. We also assume that you've already configured the disk per the installation instructions.
Finally, we assume a TNC has been set up as interface 'ax0'.

3.1. Trying out the AX.25 Support

Start by typing 'NET' to get the program up and running. You should be presented with a banner including revision information and a copyright statement, followed by a prompt of 'net>'. If you don't get this, something is horribly wrong. Find a friend and ask for help.

Once you have the program going at all, the first thing you'll probably want to do is to figure out if the TNC is hooked up correctly, and whether you're getting out at all. To get connected, you do basically the same thing you'd do with a raw TNC. Type 'connect ax0 <callsign>', where <callsign> is the callsign of someone who is known to be on the air. You can also specify a digipeater string. For example, you could type one of:

        connect ax0 n3eua                (connect using the ax0 TNC to N3EUA)
        connect ax0 n3eua n1fed n0ccz    (conn to N3EUA via N1FED and N0CCZ)

If all is well, you should get "Conn Pending" and then "Connected" messages. At this point, you're connected just like using a plain old TNC. Kind of boring, huh? It'll get more exciting soon! When you're ready to disconnect, use the key to escape from the session back to the 'net>' prompt, and then type 'disconnect'.

You will discover that all commands can be abbreviated.

If things don't work, watch the lights on the TNC to see if you're transmitting at all, then go read up on the "trace" command so you can see what the program thinks it's doing. Even easier, if there's someone else using TCP in your area, ask for help!

3.2. The Telnet Command

If there's someone else on the air in your area already using TCP/IP, then the next easiest thing to do is to try a keyboard connection using the Telnet protocol. The end result will be the same as doing an AX.25 connect in most cases, but you'll be taking advantage of a couple of neat attributes of having more protocol horsepower to help you out.
First, you need to either know the numeric IP address of your friend's system, or you need to have updated HOSTS.NET to include the system name and the numeric address. Then all you have to do is type:

        telnet n3eua          (talk to N3EUA, address in HOSTS.NET)
        telnet [44.32.0.4]    (use the numeric address directly)

Now you can type back and forth just as if you were connected with a normal TNC. When you're done, use the key to escape back to command mode, and then type 'close' to close the connection gracefully, or 'reset' if you're really in a hurry.

3.3. The FTP Command

So far, all we've done is to use more software and work harder to do the same things we can do with a plain old TNC. The FTP command isn't like that! If you want to get a file from your friend's machine, you can type the command:

        ftp n3eua

to start a file transfer session to the N3EUA machine. When the connection is opened, you'll get a banner from the remote machine, followed by a prompt for your user name. If you've negotiated with your friend to have a special username and password set up for you in his FTPUSERS file, use that. If not, many machines allow arbitrary users to get limited access to the files available with a special username 'anonymous'. If you want to use the 'anonymous' login, when you're prompted for a password enter your callsign or something else recognizable, as many folks keep a log of FTP's so they know what files people care about, and being able to associate your activities with you sometimes helps.

3.4. The Mail System

The mail system is a subject unto itself. It is also one of the truly nifty things about running TCP/IP. Look elsewhere in the documentation for a complete rundown on how to install and operate the BM mailer, and the portions of NET related to it.

3.5. Tracing and Status Commands

The tracing and status commands provide a great deal of information about what is going on in the system. All we'll attempt to do here is raise your interest level.
If you want to find out what sessions are active to and from your machine, you can type 'sessions' and you'll get a list. If you want to get information about all of the TCP connections open to and from your machine, including mail transfers and other things that don't directly interact with your keyboard and screen, you can type "tcp status" and you'll get a list of connections.

If you're not sure what's happening on an interface, or you'd like to "read the mail" (watch what other folks are doing on the channel), then use the "trace" command. The form is described in the command reference elsewhere in this document. For example:

        trace ax0 111    (activity on ax0, including ASCII dump)
        trace ax0 211    (activity on ax0, including hex dump)
        trace ax0 11     (activity on ax0, printing only the headers)

Note that you also have control over whether tracing can bother you in a session; see the trace command summary for more details.

4. The Mail System

As is typical with networking software, handling electronic mail is often as big a job as coping with all other applications combined. In order to make full use of the mail system in the KA9Q package, you will need to spend a little time getting things configured for your system.

4.1. Installing and Using BM

The BM.EXE mail user interface program was created by Bdale Garbee, N3EUA, and despite popular belief, 'BM' really stands for "Bdale's Mailer". Gerard van der Grinten PA0GRI extended the mailer with a number of new features that resulted in version 2. More recently, Dave Trulli NN2Z has extended the mailer creating revision 3. All comments or suggestions about BM should be directed to Dave.

BM provides a full set of mail services to the user which allow sending and receiving electronic mail, as well as a variety of local mail manipulation commands.

4.1.1. Installation

To install BM requires the modification of the supplied configuration files and the creation of the proper directory structure.
The following sections describe the file and directory structure used by BM and SMTP.

4.1.1.1. Directory Structure

/spool/mqueue
This directory holds the outbound mail jobs for SMTP. Each job consists of 2 files, xxxx.txt and xxxx.wrk, where xxxx is a unique numerical prefix. The format of the files is described in a later section.

/spool/rqueue
This directory is used by SMTP for jobs that have been received and will be processed by a user defined mail routing program. This directory is not used directly by BM.

/spool/mail
This directory holds the individual mailboxes for each user name on your system. The extension .txt is added to the user name to form the mailbox name. Mail received by the SMTP server is appended to the mailbox file.

4.1.1.2. Configuration File

The /bm.rc file provides BM with the configuration needed for the operation of the mailer. The format for the /bm.rc file is:

        variable value

The following variables are valid in the bm.rc file:

4.1.1.2.1. smtp

Defines the path to the directory containing the mailbox files. The default directory is /spool/mail on the current drive.

4.1.1.2.2. host

Is used to set the local hostname for use in the RFC822 mail headers. This is a required field. This should match the hostname definition in autoexec.net.

4.1.1.2.3. user

Defines the user name of the person who is sending mail. This is also used as the default mailbox for reading mail. On the AMPRNET this is usually set to your call. There is a DOS limit of 8 characters for the user name.

4.1.1.2.4. edit

Defines the name of your favorite editor which can be used to construct and edit the text of outgoing messages. The use of edit is optional.

4.1.1.2.5. fullname

Is used to provide your full name to the mailer for use in the comment portion of "From:" header line. The use of fullname is optional.

4.1.1.2.6. reply

Defines the address where you wish to receive replies to messages sent.
This option is useful if you are operating your pc on a local area network and would like your mail replies sent to a more "well known host". The address specified by reply is used to generate a "Reply-To:" header in outbound mail. The "Reply-To:" header overrides the "From:" header which is the address normally used to reply to mail. This field is optional.

4.1.1.2.7. maxlet

Defines the maximum number of messages that can be processed by BM in one mailbox file. The default value of maxlet is 100.

4.1.1.2.8. mbox

Specifies the default file to be used for the "save" command. This file is in the same format as a mailbox and may later be viewed using the -f option of BM. If this option is not used then the default is set to mbox.

4.1.1.2.9. record

If defined a copy of each message sent will be saved in .

4.1.1.2.10. folder

If defined folder contains the path used by the save command.

4.1.1.2.11. screen [bios|direct]

In the Turbo C compiled version of BM, screen sets the display output mode to use either direct writes to screen memory or the ROM BIOS. The default is direct which provides the fastest output mode. If you are using a windowing system such as Desqview you should set the mode to bios.

4.1.1.2.12. Example BM.RC File

        host nn2z.ampr
        user dave
        fullname Dave Trulli
        # send my replies to the Sun
        reply nn2z@ka9q.bellcore.com
        screen direct
        edit /bin/vi
        mbox c:/folder/mbox
        record c:/folder/outmail
        folder c:/folder
        maxlet 200

4.1.1.3. The alias File

The alias file provides an easy way to maintain mailing lists. An alias can be any string of characters not containing the "@" symbol. The format for the alias file is:

        alias recip1 recip2 recip3 recip4

Note that a long list of aliases can be continued on an additional line by placing a tab or space on the continuation line.
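A reader for this format might look like the following Python sketch. It is a guess at reasonable behavior based only on the description above, not BM's actual code: the function names and the sample addresses are hypothetical, lines starting with "#" are assumed to be comments, and expansion is a single lookup with no nesting.

```python
def parse_aliases(text):
    """Return {alias: [recipients]} from the text of an alias file."""
    aliases = {}
    current = None
    for raw in text.splitlines():
        if not raw.strip() or raw.startswith("#"):
            continue  # blank line or (assumed) comment line
        words = raw.split()
        if raw[0] in " \t":
            # Leading tab or space: continuation of the previous alias
            if current is None:
                raise ValueError("continuation line before any alias")
            aliases[current].extend(words)
        else:
            current = words[0]
            if "@" in current:
                raise ValueError("alias names may not contain '@'")
            aliases[current] = words[1:]
    return aliases


def expand(recipient, aliases):
    """Expand one recipient; names containing '@' are never treated as aliases."""
    if "@" in recipient:
        return [recipient]
    return aliases.get(recipient, [recipient])
```

A recipient that is neither an address nor a known alias is returned unchanged here; what BM does with such a name is not covered by this sketch.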
Some example aliases are:

        dave nn2z@nn2z.ampr
        phil karn@ka9q.bellcore.com
        # mail to local nnj users
        nnj wb2cop@wb2cop.ampr karn@ka9q.bellcore.com
            wb0mpq@home.wb0mpq.ampr w2kb@w2kb.ampr

In the above example, when specifying nnj as the recipient, BM will expand the alias into the list of recipients from the alias file. At this time an alias may not contain any other aliases.

4.1.1.4. /spool/mqueue/sequence.seq

The sequence file maintains a message counter which is used by BM and SMTP to generate message ids and unique filenames. BM creates this file if it is not already present.

4.1.2. Environment

The timezone used in mail headers is obtained from the DOS environment variable TZ. An example TZ setting is:

        set TZ=EDT4

It is set in your AUTOEXEC.BAT file. The first 3 characters are the timezone and the fourth character is the number of hours from GMT. If TZ is not set, GMT is assumed.

4.2. Commands

All BM commands are single letters followed by optional arguments. The command list has been designed to make those familiar with Berkeley mailers comfortable with BM.

4.2.1. Main Menu Commands

4.2.1.1. m [userlist]

The mail command is used to send a message to one or more recipients. All local recipient names (those which don't contain an '@') are checked for possible aliases. If no arguments are supplied you will be prompted for a recipient list. While entering a message into the text buffer several commands are available such as: invoking an editor, and reading in text from other messages or files. See the section below for a description of these commands. To end a message enter a line containing a single period.

It is important to remember that the input line buffer has a 128 character limit. You should format your text by entering a carriage return at the end of each line. Typing excessively long lines may cause data loss due to truncation when passing the message through other hosts. Keeping lines less than 80 characters is always a good idea.
4.2.1.2. d [msglist]

Mark messages for deletion. Messages marked for deletion are removed when exiting BM via the 'q' command or when changing to an alternate mailbox with the 'n' command.

4.2.1.3. h

Display message headers. The message headers contain the message number, the status indicating whether it has been read or deleted, the sender, size, date, and subject.

4.2.1.4. u [msglist]

Undelete a message that is marked for deletion. The status of a message can be determined by looking at the status field of the message using the 'h' command.

4.2.1.5. n [mailbox]

Display or change mailbox. The 'n' command with no arguments will display a list of mailboxes containing mail. If an argument is supplied, then the current mailbox is closed and a new mailbox is opened.

4.2.1.6. ! cmd

Run a DOS command from inside BM. An error message will result if there is not enough memory available to load the command.

4.2.1.7. ?

Print a help menu for BM commands.

4.2.1.8. s [msglist] [file]

The 's' command is used to save messages in a file. If no filename is given the default from the mbox variable in /bm.rc is used. If no message number is supplied then the current message is saved. The message is stored in the same format as a mailbox file with all mail headers left intact.

4.2.1.9. p [msglist]

The 'p' command is used to send messages to the printer. This command uses the DOS device PRN for output. This command is equivalent to:

        s [ msglist ] PRN

4.2.1.10. w [msglist] file

The 'w' command is used to save messages in a file. Only the message body is saved. All mail headers are removed. If no message number is supplied then the current message is saved.

4.2.1.11. f [msg]

The 'f' command is used to forward a mail message to another recipient. If no message number is supplied the current message is used. The user is prompted for the recipients and a subject. The RFC822 header is added to the message text while retaining the complete original message in the body.
Also see the ~m command.

4.2.1.12. b [msg]

Bounce a message. Bounce is similar to forwarding but instead of your user information, the original sender information is maintained. If no message number is supplied the current message is used.

4.2.1.13. r [msg]

Reply to a message. Reply reads the header information in order to construct a reply to the sender. The destination information is taken from the "From:" or the "Reply-To:" header, if included. If no message number is supplied the current message is used.

4.2.1.14. msg#

Entering a message number from the header listing will cause the message text to be displayed.

4.2.1.15. l

List outbound messages. The job number, the sender, and the destination for each message is displayed. A status of "L" will appear if the SMTP sender has the file locked.

4.2.1.16. k [msglist]

Remove an outbound message from the mqueue. A message can be removed from the send queue by specifying the job number obtained by the l command. If the message is locked you will be warned that you may be removing a file that is currently being sent by SMTP. You will be asked if this job should still be killed.

4.2.1.17. $

Update the mailbox. This command updates the mailbox, deleting messages marked for deletion and reading in any new mail that may have arrived since entering BM.

4.2.1.18. x

Exit to DOS without changing the data in the mailbox.

4.2.1.19. q

Quit to DOS updating the mailbox.

4.2.2. Text Input Commands

The following commands are available while entering message text into the message buffer.

        ~r    read into the message buffer.
        ~m    read into the message buffer.
        ~p    display the text in the message buffer.
        ~e    invoke the editor defined in /bm.rc with a temporary file containing the text in the message buffer.
        ~q    Abort the current message. No data is sent.
        ~~    Insert a single tilde character into the message.
        ~?    Display help menu of tilde escape commands.

4.2.3. Command Line Options

BM may be invoked as follows:

To send mail:

        bm [ -s subject ] recip1 .. .. recipN

To read mail:

        bm [ -u mailbox | -f file ]

-s subject
This option sets the subject to the text on the command line.

-u mailbox
Specify which mailbox to read. This overrides the default from the bm.rc.

-f file
Read message from "file" instead of a mailbox.

4.3. Technical Information

4.3.1. Outbound Mail Queue File Formats

Outgoing mail messages consist of two files each in the /spool/mqueue directory. The names of the two files will be of the form <integer>.WRK and <integer>.TXT, where integer is the sequence number of the message relative to this machine. The file sequence.seq in the mqueue directory contains the current sequence number for reference by the mail user interface. The .TXT file contains the data portion of the SMTP transaction, in full RFC822 format. The .WRK file consists of 3 or more lines, as follows:

        the hostname of the destination system
        the full sender address, in user@host format.
        some number of full destination addresses, in user@host format.

4.3.2. Standards Documents

The SMTP specification is RFC821. The Format for text messages (including the headers) is in RFC822. RFC819 discusses hostname naming conventions, particularly domain naming.

4.4. Bug Reports

Please send any comments, suggestions or bug reports about BM to:

        Dave Trulli
        Usenet:  nn2z@ka9q.bellcore.com
        packet:  nn2z@nn2z
        AMPRNET: nn2z@nn2z.ampr [44.64.0.10]

5. NET/ROM Support

5.1. Introduction

The NET/ROM support for the KA9Q package serves three purposes:

        1) Existing NET/ROM networks may be used to send IP traffic.
        2) NET may be used as a NET/ROM packet switch.
        3) NET may be used to communicate with NET/ROM nodes, and its mailbox facility can accept connects over the NET/ROM network.

5.2. Setting up the NET/ROM Interface

No physical interface is completely dedicated to net/rom, which is as it should be. You attach all your AX.25 interfaces, of whatever sort.
Then you attach the net/rom pseudo-interface ("attach netrom"). Then you identify to the net/rom software those interfaces you want to allow it to use, with the "netrom interface" command. The format of this command is:

        netrom interface ax0 #ipnode 192

The first argument is the name of the previously attached interface you want to use. The second argument is the alias of your node, to be used in your routing broadcasts. The alias is never used for anything else (as you will see!). The last number is the net/rom quality figure. This is used in computing the route qualities; it represents the contribution of this interface to the overall computation. For a 1200 baud half-duplex connection, 192 is the right number.

You need a netrom interface command for every interface you're going to use with net/rom.

5.3. Tracing on the NET/ROM Interface

If you want to trace your NET/ROM datagrams, don't try turning on trace mode for the "netrom" interface. Nothing will break, but nothing will happen. You should trace the individual AX.25 interfaces instead.

5.4. Routing Broadcasts

Once you have set up your interfaces, you need to set some timers. There are two: the nodes broadcast interval timer, and the obsolescence timer. These are set in seconds, like the smtp timer. You should usually set them to an hour. You can set them to something different, if you want. If your local net/rom nodes broadcast every hour, but you want to do so every ten minutes, you can say:

        netrom nodetimer 600
        netrom obsotimer 3600

Every time the obsotimer kicks, the obsolescence counts for all non-permanent entries are decremented by one. When the count for an entry falls below five, it is no longer broadcast. When it falls to 0, it is removed. The count is initialized at 6. These will eventually be settable parameters; you can adjust them now by changing the initializers for the variables in the source file.
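The aging rule is easy to misread, so here is a small Python model of it. This is only a sketch of the rule as stated above, not the NET source: the class and method names are hypothetical, and it assumes that hearing a node's broadcast sets its count back to the initial value and that permanent entries simply keep the initial count.

```python
OBSO_INIT = 6        # count given to an entry, per the text
OBSO_MIN_BCAST = 5   # entries whose count is below this are not broadcast


class NodeTable:
    def __init__(self):
        self.entries = {}  # callsign -> [count, permanent]

    def heard(self, callsign):
        """Assumption: hearing a broadcast (re)sets the count to OBSO_INIT."""
        entry = self.entries.get(callsign)
        if entry is None or not entry[1]:
            self.entries[callsign] = [OBSO_INIT, False]

    def add_permanent(self, callsign):
        """Permanent entries are never aged."""
        self.entries[callsign] = [OBSO_INIT, True]

    def obsotimer_tick(self):
        """Age every non-permanent entry by one; drop entries that reach 0."""
        for callsign in list(self.entries):
            entry = self.entries[callsign]
            if entry[1]:
                continue
            entry[0] -= 1
            if entry[0] <= 0:
                del self.entries[callsign]

    def broadcastable(self):
        """Callsigns that would still be included in a nodes broadcast."""
        return sorted(c for c, (count, _) in self.entries.items()
                      if count >= OBSO_MIN_BCAST)
```

Under these assumptions, with an obsotimer of 3600 a node that is heard once and then goes silent drops out of your broadcasts after roughly two hours and out of the table after roughly six.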
When you first come on the air, you can send out nodes broadcasts to tell the local nodes that you are available. Use the command:

        netrom bcnodes ax0

where ax0 is the interface on which you want to send the broadcast. Do this for every interface on which you want to do this.

By default, the NET/ROM code does not broadcast the contents of your routing table. This is as it should be, since usually we just want to be the endpoints of communications rather than relaying NET/ROM traffic. If you want to be a switch station, include the command:

        netrom verbose yes

in your autoexec.

Sometimes you can hear broadcasts from nodes that can't hear you. If your routing table gets filled with these unusable routes, your node will grind to a halt. The solution to this is node broadcast filtering, via the netrom nodefilter command. There is a filter list, which contains a list of callsigns and interfaces. Then there is a filter mode, which indicates what to do with the list.

If the filter mode is "none", no filtering is done. If it is "accept", then only broadcasts from the indicated stations on the indicated interfaces are accepted. If it is "reject", then all broadcasts except those from the listed stations on the listed interfaces are accepted. Because the net/rom code cannot at this time recognize unusable routes and try alternates, I strongly recommend use of the filter command to restrict broadcast acceptance to those nodes which you know you can reach.

5.5. The NET/ROM Routing Table

The next net/rom commands are those used for maintaining the routing table. They fall under the "netrom route" subcommand. "netrom route add" adds a permanent entry to the routing table. Its format is:

        netrom route add #foo w9foo ax0 192 w9rly

This command adds an entry for w9foo, whose alias is #foo, route quality 192, via w9rly on interface ax0. Let's talk about what this means. w9foo is the *destination* node, the one to whom you want the packets routed by the net/rom network.
w9rly is your *neighbor*, the net/rom node to which you pass the packet to be forwarded. Since w9rly may appear on more than one interface (the callsign may be used by more than one net/rom node on different bands), we specify that we are to use ax0 to send the packet.

With net/rom, like IP, we don't know exactly what route a packet will take to its destination. We only know the name of a neighbor which has indicated a willingness to forward that packet (of course, the neighbor may be the destination itself, but that's unlikely in our application). Net/rom sends the packet to the neighbor, with a network header specifying our callsign and that of the ultimate destination (in this case w9foo).

We can use the netrom route add command to establish a digipeater path to the neighbor. For example:

        netrom route add #foo w9foo ax0 192 w9rly wd9igi

This will cause us to use wd9igi as a digipeater in establishing our connection to the net/rom node w9rly.

To drop the route to w9foo, you would type

        netrom route drop w9foo w9rly ax0

To see the contents of your routing table, you may type

        netrom route

and to see the routing entries for an individual station you can type

        netrom route info <callsign>

You may not use an alias as an argument to the netrom route info command.

I cannot stress enough that "route add" and "netrom route add" are two different commands, with different purposes. In general, you only need a "netrom route add" if you need to add a route to a net/rom node via a digipeater path. If you find yourself using this command, ask yourself, "Why am I doing this?" Many people do not understand that net/rom does automatic routing (well, sort of :-)).

5.6. The Importance of the Routing Table

The NET/ROM routing table is analogous to the IP routing table: if there is nothing in it, your NET/ROM traffic will not go out.
You must either manually enter a list of routes (perhaps via your autoexec.net) or wait to receive routing broadcasts from your neighbors before your NET/ROM traffic will leave your station.

If you go to send packets via NET/ROM and nothing happens, even if you have trace mode on, make sure that the destination node is in your NET/ROM routing table. If sending IP traffic, double check the ARP table for an appropriate NET/ROM ARP entry for the destination node (see below for more information on the use of the ARP table). The ARP table is not used for NET/ROM transport routing.

5.7. Interfacing with NET/ROMs Using Your Serial Port

What if you have a net/rom node or nodes, and you'd like to attach them to your computer via their serial interfaces, and use net as a packet switch? It's very easy: you have to attach those interfaces, using the "attach asy" command, but specifying type "nrs" instead of "slip" or "kiss". "nrs" is the net/rom serial framing protocol, which is like KISS, but uses different framing characters and has an 8-bit checksum.

When you attach an nrs interface, it can be used for passing IP datagrams or AX.25 frames over serial lines or modems. To use it for net/rom, you have to identify it to the netrom code just like any other interface, with the "netrom interface" command.

5.8. The Time to Live Initializer

The "netrom ttl" command allows setting of the time-to-live initializer for NET/ROM datagrams. I recommend a value of 16 for most networks. Use more if you expect to go more than 16 hops. The default is 64. The purpose of the ttl initializer is to prevent a packet from getting caught forever in routing loops. Every router who handles the packet decrements the ttl field of the network datagram before sending it on, and when it reaches 0 it is discarded.

5.9. Using NET/ROM Support for IP

Now you know all the commands, but how do we actually use net/rom for IP communications? This takes two steps:

Step one: update the routing table.
In all likelihood, you will use net/rom to gateway two IP subnets. So, you'll probably want to identify a station on each end as a gateway. Let's say we're on the Milwaukee subnet, and we want to talk to someone in Madison. If we're not the gateway, we just have a routing table entry like this:

        route add [44.92.0.0]/24 ax0 wg9ate-pc.ampr

This specifies that wg9ate should get all packets for the 44.92.0.x subnet via interface ax0. Wg9ate has this routing table entry:

        route add [44.92.0.0]/24 netrom w9mad-pc.ampr

(presuming that w9mad is the Madison gateway). Now, when the IP layer at wg9ate gets datagrams for Madison, it knows that they have to go via net/rom to w9mad. Notice that we don't specify a "real" interface, like ax1 or nr0, in the route entry. The net/rom network layer will pick the right interface based on its net/rom routing tables.

We're not done yet, though. w9mad-pc.ampr is not an ax.25 callsign. The net/rom send routine called by the IP layer needs to map from the IP address to an ax.25 address. It does this via a manually added arp entry:

        arp add w9mad-pc.ampr netrom w9mad

[We kind of fudged by using the arp table for this purpose, since there is no way to do automatic address resolution for net/rom, and arp messages are never sent or received for net/rom nodes. However, the arp table does contain precisely what we have here: mappings from IP addresses to callsigns, and it saved a lot of code to do it this way.]

Notice also that no digipeaters are ever specified in the arp entry for a net/rom node. Also, the callsign to which we are mapping is the final destination of the packet, not the non-destination neighbor. That neighbor will be picked based on the net/rom routing tables.

So, as a summary, let's look at what happens to a packet that reaches the IP layer on wg9ate, destined for Madison. The IP routing code looks the destination IP address up in the table, and discovers that it should go via net/rom to w9mad-pc.ampr.
So, it passes the packet to the net/rom send routine. That routine uses the arp table to translate w9mad-pc's IP address to the callsign "w9mad". Then it passes the packet to the net/rom routing code. That code checks to see if the destination callsign (w9mad) is the same as that of any of its assigned net/rom interfaces. Since it isn't, it puts a network layer header (a.k.a. net/rom level 3 header) on it, and looks for w9mad in its routing tables. Presumably, it finds an appropriate neighbor for the packet, and sends it out via ax.25. The net/rom network does the job of actually getting the packet to its destination.

At w9mad, the packet's protocol ID causes it to be sent to the same net/rom routing code that handled the outgoing packet from wg9ate (running on a different computer, of course). Now the destination callsign matches, so the net/rom network layer header is stripped off, and the packet is passed up to the IP layer. (Net/rom network headers don't have a protocol ID byte, so we just hope for the best. If a net/rom node addresses a net/rom transport layer packet to us, it is likely to be dropped by IP for any of a number of reasons.)

5.10. The NET/ROM Transport Layer

NET/ROM transport is the protocol used by NET/ROM nodes to communicate end-to-end. When a user attaches to a NET/ROM via AX.25, and asks for a connect to a node in the NODES list, his local NET/ROM tries to open a transport connection to the destination node over the NET/ROM network. NET/ROM transport packets are carried in NET/ROM network datagrams, just like IP datagrams.

You shouldn't use NET/ROM transport when connecting to other TCP/IP stations. TCP is a much better protocol than NET/ROM transport, and makes better use of available bandwidth. Also, BM and SMTP are more convenient to use than a TCP/IP station's mailbox facility.
However, for communicating with AX.25 users via the NET/ROM network, the transport facilities in NET will work better (and more easily) than the traditional method of connecting to your local node via AX.25.

5.11. Connecting via NET/ROM Transport

To connect to the node whose alias is FOO and whose callsign is W9FOO, you can issue either of the following two commands:

    netrom connect foo
    netrom connect w9foo

If foo:w9foo is in your NET/ROM routing table, your station will transmit a connect request to the appropriate neighbor used to reach w9foo.

NET/ROM transport sessions are very much like those for AX.25. You can use the disconnect, reset, kick, upload, and record commands, and the session command to switch sessions.

5.12. Displaying the Status of NET/ROM Connections

The command

    netrom status

is used to display the status of all NET/ROM connections, which will include those used in keyboard sessions as well as ones attached to the mailbox. For more detailed information on a session, you can use the address of the NET/ROM control block:

    netrom status <&nrcb>

where <&nrcb> is the hex address given in the short form of the command or in the "session" display.

5.13. NET/ROM Transport Parameters

The NET/ROM transport parameters may be set with the various NET/ROM subcommands. Their meanings are listed below:

    acktime:    This is the ack delay timer, similar to ax25 t2. The
                default is 3000 ms.

    choketime:  The time to wait before breaking a send choke
                condition. Choke is the term for NET/ROM flow control.

    irtt:       The initial round trip time guess, used for timer
                setting.

    qlimit:     The maximum length of the receive queue for chat
                sessions. This is similar to ax25 window.

    retries:    Maximum retries on connect, disconnect, and data
                frames.

    window:     Maximum sliding window size, negotiated down at
                connect time.

5.14. The Mailbox

The AX.25 mailbox also accepts NET/ROM connections.
The "mbox on" and "mbox off" commands control whether the mailbox is turned on for NET/ROM as well as AX.25, and the "mbox" command displays current mailbox connects of both types.

Many people have observed that the AX.25 mailbox requires the user to enter a carriage return to bring up the banner and prompt. This is because of certain defects of that protocol when it is used as the link layer for several different higher level protocols, and is unavoidable. (So stop asking, OK? :-)) The NET/ROM mailbox does not require the carriage return, and will be activated as soon as the incoming connection is completed.

5.15. Where to go for More Information

The paper "Transmission of IP Datagrams over NET/ROM Networks" appeared in the Seventh ARRL Networking Conference papers, available from the ARRL. In it, I describe the more technical details of how the NET/ROM network support works.

If you want to learn about NET/ROM, talk your local NET/ROM or TheNET operator out of his or her manual. If you want to learn more, read the source code. That's about it for sources, since the NET/ROM protocols originated in a commercial product.

5.16. About the Code

There has been a great deal of controversy about TheNET, a no-charge NET/ROM clone for TNCs. This is not the place to discuss the truth of the charges leveled by Software 2000 against its authors, but that situation requires me to make the following statement:

The NET/ROM transport support in NET.EXE was not taken in any way, shape or form from NET/ROM (whose source I have never seen) or from TheNET. The protocol code is based on protocol 6 from Tanenbaum's excellent book, Computer Networks, as a moderately careful reading of both should show. The source code is freely distributed, so the curious reader should have the opportunity to check this assertion if he or she so desires.
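For readers who don't have Tanenbaum handy, protocol 6 is a sliding window protocol with selective retransmission: the receiver holds frames that arrive out of order and passes them up once the gap is filled. The following is a minimal Python sketch of the receiving half of that idea only. It is not taken from NET.EXE; the class name, the window size and the sequence number modulus are all assumptions chosen for illustration.

```python
class SelectiveRepeatReceiver:
    """Receiving half of a selective-repeat sliding window (illustrative only)."""

    def __init__(self, window=4, modulus=256):
        # window and modulus are assumed values, not NET/ROM constants.
        if window > modulus // 2:
            raise ValueError("window must not exceed half the sequence space")
        self.window = window
        self.modulus = modulus
        self.expected = 0   # next sequence number to deliver in order
        self.buffer = {}    # out-of-order frames held for later delivery

    def in_window(self, seq):
        # True if seq lies in [expected, expected + window), modulo modulus.
        return (seq - self.expected) % self.modulus < self.window

    def receive(self, seq, data):
        """Accept one frame; return the payloads now deliverable in order."""
        delivered = []
        if not self.in_window(seq):
            return delivered            # old duplicate or too far ahead: ignore
        self.buffer.setdefault(seq, data)
        # Deliver every consecutive frame starting at the expected number.
        while self.expected in self.buffer:
            delivered.append(self.buffer.pop(self.expected))
            self.expected = (self.expected + 1) % self.modulus
        return delivered


if __name__ == "__main__":
    rx = SelectiveRepeatReceiver(window=4)
    print(rx.receive(1, "second"))   # held back until frame 0 arrives
    print(rx.receive(0, "first"))    # frames 0 and 1 are released together
```

The window is limited to half the sequence space so that a retransmitted old frame can never be mistaken for a new one after the numbers wrap around.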
The smoothed round trip time calculation, which is not done in "real" NET/ROMs (and should be, by the way -- they'd work a whole lot better) is adapted from that used by KA9Q in the TCP protocol in NET. The dicey business of adapting it to a sliding window protocol with selective retransmission was done by me, all alone, after my cries for help on the tcp-group mailing list went unanswered :-).

I have taken the precaution of copyrighting the NET/ROM code in NET. It may be freely distributed for non-commercial purposes, in whole or in part, and may be used in other software packages such as BBS systems if so desired, so long as the copyright notice is not removed from the source files, and the program in which it is used displays "NET/ROM code copyright 1989 by Dan Frank, W9NK" when it starts up.

Any person who wishes to distribute the code, or anything based on the code, for commercial purposes will find me very reasonable, but rather insistent about being compensated for the hours I've spent working on it.

6. Advanced Topics

6.1. The Finger Server

< there will be finger docs here someday >

6.2. The GRAPES Multi-Port Digipeating Code

The multiport digipeating code from GRAPES will allow you to route frames in and out of LANs semi-automagically based on a table lookup maintained by the switch.

To enable multi-port digipeating, there are two tables that you must build and place in the root directory. They are named DIGILIST and EXLIST. DIGILIST contains the digis that are directly reachable from your switch. The file is a simple ASCII text file containing the callsign of the digi and the interface name of the port needed to reach it. The port name is the same name you used in the attach statement for that port. Additionally there is a special callsign "lan" that tells mulport which port feeds your LAN or default port.
The file would look something like this:

    kd4nc-1   ax1   # kd4nc-1 is a neighbor switch on the high speed link
                    # attached to ax1
    wb4gqx-1  ax3   # wb4gqx-1 is a neighbor digi on 145.01 (ax3)
    k4hal-1   ax2   # k4hal-1 is a neighbor digi on 440 (ax2)
    lan       ax0   # lan is a special name for the low speed lan
                    # attached to the switch and is the default port
                    # used when mycall is the last call in the digi
                    # string.

The file EXLIST holds DESTINATION callsigns that do not obey the rules, for example, a user station on the high speed link. It is formatted just like DIGILIST. To understand why this file may be necessary we review the rules mulport obeys.

First, mulport examines the digi string of incoming frames. If it finds its call in the string and it is not already marked as repeated, it looks at the next call in the digi string. If a match is found between the call following MYCALL in the digi string and a call in DIGILIST, then the frame is repeated out the port associated with that call in DIGILIST. If no match is found then the frame is repeated out the port it came in on. If MYCALL is the LAST call in the digi string then it repeats the frame out the port associated with "lan" in the DIGILIST.

So you see that if MYCALL is the last or the only call in the digi string the frame will be repeated out the lan port. This can cause a problem if the station you wish to connect is only one digi hop away and is not on the lan frequency. The EXLIST handles this case. Mulport will look at the DESTINATION call if MYCALL is the last or only call in the digi string. If a match is found with a call in EXLIST then the port associated with that DESTINATION call is used to repeat the frame. EXLIST is only for stations that would normally be expected to be on the lan side but are operating off some other port instead. An example might be a PBBS operating on the trunk to serve more than one lan.

6.3.
Multiple Serial Ports on One Interrupt

Thanks to effort from Dan Frank, W9NK, this version of net supports the idea of multiple serial ports all sharing a common hardware interrupt line. The original motivation for this was to support the IBM PS/2 family, but it turns out to be very helpful with a variety of PC/AT interface cards as well.

There are no new commands, and existing autoexecs don't need to be changed. All you have to do to share interrupts is simply use the same interrupt in more than one attach line. This applies *only* to asy devices. An interrupt may not be shared between, e.g., an ethernet card and a serial port.

The code has been tested on an IBM PS/2 Model 70 with the dual async adaptor. Any card that logical-ORs the interrupt lines from the various UARTs should work. Interrupt sharing at the bus level does not work on the AT bus, but does work on the Micro Channel. The PS/2 series uses interrupt 4 for the motherboard async port, then interrupt 3 for all bus-attached serial ports.

The code is believed to work with both level-sensitive and edge-triggered interrupts, but hasn't been fully tested.

As an example, the Quadram Quadport AT with the add-on daughtercard can handle up to five serial ports sharing the same interrupt, and up to two cards may be supported in a PC, making a total of more serial ports than a poor little PC should be asked to handle...

7. NET Command Reference

7.1. Startup

When NET.EXE is executed without arguments, it attempts to open the file "AUTOEXEC.NET" in the root directory of the current drive. If it exists, it is read and executed as though its contents were typed on the console as commands. This feature is useful for setting the local IP address and host name, initializing the IP routing table, and starting the various Internet services. If NET.EXE is invoked with an argument, it is taken to be the name of an alternate startup file; it is read instead of AUTOEXEC.NET.

7.2.
Console Mode

The console may be in one of two modes: command mode and converse mode. In command mode, the prompt "net>" is displayed and any of the commands described in the next section may be entered. In converse mode, keyboard input is processed according to the "current session", which may be a Telnet, FTP, or AX.25 connection. In a Telnet or AX.25 session, keyboard input is sent to the remote system and any output from the remote system is displayed on the console. In an FTP session, keyboard input is first examined to see if it is a known local command; if so it is executed locally. If not, it is "passed through" to the remote FTP server. (See the section titled "FTP Subcommands".)

The keyboard also has "cooked" and "raw" states. In cooked state, input is line-at-a-time; the user may use the line editing characters ^U, ^R and backspace to erase the line, redisplay the line and erase the last character, respectively. Hitting either return or line feed passes the complete line up to the application. In raw state, each character is immediately passed to the application as it is typed.

The keyboard is always in cooked state in command mode. It is also cooked in converse mode on an AX.25 or FTP session. In a Telnet session it depends on whether the remote end has issued (and the local end has accepted) the Telnet "WILL ECHO" option. (See the "echo" command.)

On the IBM-PC, the user may escape back to command mode by hitting the F10 key. On the HP Portable and Portable Plus, which have only 8 function keys, F8 is used instead. On other systems, the user must enter the "escape" character, which is by default control-] (hex 1d, ASCII GS). (Note that this is distinct from the ASCII character of the same name.) The escape character can be changed (see the "escape" command).

7.3. Commands

This section describes each of the commands recognized while in command mode.
Note that certain FTP subcommands, e.g., put, get, dir, etc, are recognized only in converse mode with the appropriate FTP session; they are not recognized while in command mode.

The notation "" denotes a host or gateway, which may be specified in one of two ways: as a symbolic name listed in the file "/hosts.net", or as a numeric IP address in dotted decimal notation enclosed by brackets, e.g., [44.0.0.1]. When domain server support is added, ARPA-style domain names (e.g., ka9q.ampr) will also be accepted if a domain server is available on the network to resolve them into IP addresses.

7.3.1.

Entering a carriage return (empty line) while in command mode puts you in converse mode with the current session. If there is no current session, net remains in command mode.

7.3.2. !

An alias for the "shell" command.

7.3.3. #

Commands starting with the hash mark (#) are ignored. This is mainly useful for comments in the AUTOEXEC.NET file.

7.3.4. arp

With no arguments, displays the Address Resolution Protocol table that maps IP addresses to their subnet (link) addresses on subnetworks capable of broadcasting. For each IP address entry the subnet type (e.g., Ethernet, AX.25), subnet address and time to expiration is shown. If the link address is currently unknown, the number of IP datagrams awaiting resolution is also shown.

7.3.4.1. arp add ether|ax25|netrom

The add subcommand allows manual addition of address resolution entries into the table. This is useful for "hard-wiring" digipeater paths, and other paths that are not directly resolvable.

7.3.4.2. arp drop ether|ax25|netrom

The drop subcommand allows removal of entries from the table.

7.3.4.3. arp publish ether|ax25|netrom

The publish subcommand allows you to respond to arp queries for some other host. This is commonly referred to as "proxy arp", and is considered a fairly dangerous tool.
The basic idea is that if you have two machines, one of which is on the air with a TNC, and the second of which is connected to the first with a SLIP link, you might want the first machine to publish its own AX.25 address as the right answer for arp queries addressing the second machine. This way, the rest of the world doesn't know the second machine isn't really on the air. Use arp publish with caution.

7.3.5. attach attach