外文翻譯：以太網(wǎng)和 IEEE 802

TCP/IP Illustrated, Volume 1: The Protocols

Chapter 1. Introduction

1.3 TCP/IP Layering

There are more protocols in the TCP/IP protocol suite. Figure 1.4 shows some of the additional protocols that we talk about in this text. TCP and UDP are the two predominant transport layer protocols. Both use IP as the network layer. TCP provides a reliable transport layer, even though the service it uses (IP) is unreliable. Chapters 17 through 22 provide a detailed look at the operation of TCP. We then look at some TCP applications: Telnet and Rlogin in Chapter 26, FTP in Chapter 27, and SMTP in Chapter 28. These applications are normally user processes.

UDP sends and receives datagrams for applications. A datagram is a unit of information (i.e., a certain number of bytes of information that is specified by the sender) that travels from the sender to the receiver. Unlike TCP, however, UDP is unreliable. There is no guarantee that the datagram ever gets to its final destination. Chapter 11 looks at UDP, and then Chapter 14 (the Domain Name System), Chapter 15 (the Trivial File Transfer Protocol), and Chapter 16 (the Bootstrap Protocol) look at some applications that use UDP. SNMP (the Simple Network Management Protocol) also uses UDP, but since it deals with many of the other protocols, we save a discussion of it until Chapter 25.

IP is the main protocol at the network layer. It is used by both TCP and UDP. Every piece of TCP and UDP data that gets transferred around an internet goes through the IP layer at both end systems and at every intermediate router. In Figure 1.4 we also show an application accessing IP directly. This is rare, but possible. (Some older routing protocols were implemented this way. Also, it is possible to experiment with new transport layer protocols using this feature.) Chapter 3 looks at IP, but we save some of the details for later chapters where their discussion makes more sense. Chapters 9 and 10 look at how IP performs routing.

ICMP is an adjunct to IP. It is used by the IP layer to exchange error messages and other vital information with the IP layer in another host or router. Chapter 6 looks at ICMP in more detail. Although ICMP is used primarily by IP, it is possible for an application to also access it. Indeed, we'll see that two popular diagnostic tools, Ping and Traceroute (Chapters 7 and 8), both use ICMP.

IGMP is the Internet Group Management Protocol. It is used with multicasting: sending a UDP datagram to multiple hosts. We describe the general properties of broadcasting (sending a UDP datagram to every host on a specified network) and multicasting in Chapter 12, and then describe IGMP itself in Chapter 13.

ARP (Address Resolution Protocol) and RARP (Reverse Address Resolution Protocol) are specialized protocols used only with certain types of network interfaces (such as Ethernet and token ring) to convert between the addresses used by the IP layer and the addresses used by the network interface. We examine these protocols in Chapters 4 and 5, respectively.

1.8 Client-Server Model

Most networking applications are written assuming one side is the client and the other the server. The purpose of the application is for the server to provide some defined service for clients. We can categorize servers into two classes: iterative or concurrent. An iterative server iterates through the following steps; a minimal code sketch follows the list.

I1. Wait for a client request to arrive.
I2. Process the client request.
I3. Send the response back to the client that sent the request.
I4. Go back to step I1.
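For concreteness, an iterative server built on UDP might look like the sketch below. The port number, buffer size, and request processing are placeholders rather than anything defined in the text.

```c
/* Sketch of an iterative server: one loop, one client at a time. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);     /* UDP socket */
    struct sockaddr_in serv, cli;
    socklen_t clilen;
    char buf[512];
    ssize_t n;

    memset(&serv, 0, sizeof(serv));
    serv.sin_family = AF_INET;
    serv.sin_addr.s_addr = htonl(INADDR_ANY);    /* any local interface */
    serv.sin_port = htons(9000);                 /* placeholder port */
    if (bind(fd, (struct sockaddr *) &serv, sizeof(serv)) < 0) {
        perror("bind");
        exit(1);
    }

    for (;;) {
        clilen = sizeof(cli);
        n = recvfrom(fd, buf, sizeof(buf), 0,    /* I1: wait for a request */
                     (struct sockaddr *) &cli, &clilen);
        if (n < 0)
            continue;
        /* I2: process the request (application specific) */
        sendto(fd, buf, n, 0,                    /* I3: send the response back */
               (struct sockaddr *) &cli, clilen);
    }                                            /* I4: go back to step I1 */
}
```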
The problem with an iterative server is when step I2 takes a while. During this time no other clients are serviced. A concurrent server, on the other hand, performs the following steps.

C1. Wait for a client request to arrive.
C2. Start a new server to handle this client's request. This may involve creating a new process, task, or thread, depending on what the underlying operating system supports. This new server handles this client's entire request. When complete, this new server terminates.
C3. Go back to step C1.

The advantage of a concurrent server is that the server just spawns other servers to handle the client requests. Each client has, in essence, its own server. Assuming the operating system allows multiprogramming, multiple clients are serviced concurrently.

The reason we categorize servers, and not clients, is because a client normally can't tell whether it's talking to an iterative server or a concurrent server. As a general rule, TCP servers are concurrent and UDP servers are iterative, but there are a few exceptions. We'll look in detail at the impact of UDP on its servers in Section 11.12, and the impact of TCP on its servers in Section 18.11.

Chapter 2. Link Layer

2.1 Introduction

From Figure 1.4 we see that the purpose of the link layer in the TCP/IP protocol suite is to send and receive (1) IP datagrams for the IP module, (2) ARP requests and replies for the ARP module, and (3) RARP requests and replies for the RARP module. TCP/IP supports many different link layers, depending on the type of networking hardware being used: Ethernet, token ring, FDDI (Fiber Distributed Data Interface), RS-232 serial lines, and the like. In this chapter we'll look at some of the details involved in the Ethernet link layer, two specialized link layers for serial interfaces (SLIP and PPP), and the loopback driver that's part of most implementations. Ethernet and SLIP are the link layers used for most of the examples in the book. We also talk about the MTU (Maximum Transmission Unit), a characteristic of the link layer that we encounter numerous times in the remaining chapters, and we show some calculations of how to choose the MTU for a serial line.

2.2 Ethernet and IEEE 802 Encapsulation

The term Ethernet generally refers to a standard published in 1982 by Digital Equipment Corp., Intel Corp., and Xerox Corp. It is the predominant form of local area network technology used with TCP/IP today. It uses an access method called CSMA/CD, which stands for Carrier Sense Multiple Access with Collision Detection. It operates at 10 Mbits/sec and uses 48-bit addresses.

A few years later the IEEE 802 Committee published a slightly different set of standards. 802.3 covers an entire set of CSMA/CD networks, 802.4 covers token bus networks, and 802.5 covers token ring networks. Common to all three of these is the 802.2 standard that defines the logical link control (LLC) common to many of the 802 networks. Unfortunately, 802.2 and 802.3 define a frame format that differs from the Ethernet frame format. [Stallings 1987] covers all the details of these IEEE 802 standards.

In the TCP/IP world, the encapsulation of IP datagrams is defined in RFC 894 for Ethernets and in RFC 1042 for IEEE 802 networks. The Host Requirements RFC requires that every Internet host connected to a 10 Mbits/sec Ethernet cable:

1. Must be able to send and receive packets using RFC 894 (Ethernet) encapsulation.
2. Should be able to receive RFC 1042 (IEEE 802) packets intermixed with RFC 894 packets.
3. May be able to send packets using RFC 1042 encapsulation. If the host can send both types of packets, the packet sent must be configurable and the configuration option must default to RFC 894 packets.

RFC 894 encapsulation is most commonly used.
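A receiver that accepts both encapsulations, as items 1 and 2 require, can tell them apart by looking at the 2 bytes that follow the source address (described with Figure 2.1 below): every assigned Ethernet type value is 0x0600 (1536) or greater, while an 802.3 length can be at most 1500. A small sketch of that check (the function name is ours, not from the text):

```c
/* Sketch: classify the 2-byte field that follows the source address. */
#include <stdint.h>

enum frame_kind { FRAME_ETHERNET, FRAME_IEEE802, FRAME_INVALID };

enum frame_kind classify_frame(uint16_t type_or_length)
{
    if (type_or_length >= 0x0600)   /* RFC 894: an Ethernet type, e.g. 0x0800 = IP */
        return FRAME_ETHERNET;
    if (type_or_length <= 1500)     /* RFC 1042: an 802.3 length field */
        return FRAME_IEEE802;
    return FRAME_INVALID;           /* 1501 through 1535: neither form */
}
```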
Figure 2.1 shows the two different forms of encapsulation. The number below each box in the figure is the size of that box in bytes.

Both frame formats use 48-bit destination and source addresses. (802.3 allows 16-bit addresses to be used, but 48-bit addresses are normal.) These are what we call hardware addresses throughout the text. The ARP and RARP protocols (Chapters 4 and 5) map between the 32-bit IP addresses and the 48-bit hardware addresses.

The next 2 bytes are different in the two frame formats. The 802 length field says how many bytes follow, up to but not including the CRC at the end. The Ethernet type field identifies the type of data that follows. In the 802 frame the same type field appears later, in the SNAP header. Fortunately none of the valid 802 length values is the same as the Ethernet type values, making the two frame formats distinguishable.

In the Ethernet frame the data immediately follows the type field, while in the 802 frame format 3 bytes of 802.2 LLC and 5 bytes of 802.2 SNAP follow. The DSAP and SSAP are both set to 0xaa, the Ctrl field is set to 3, and the next 3 bytes (the org code) are all 0. Following this is the same 2-byte type field that we had with the Ethernet frame format.

The CRC field is a cyclic redundancy check that detects errors in the rest of the frame.

There is a minimum size for 802.3 and Ethernet frames. This minimum requires that the data portion be at least 38 bytes for 802.3 or 46 bytes for Ethernet. To handle this, pad bytes are inserted to assure that the frame is long enough. We'll encounter this minimum when we start watching packets on the wire. In this text we'll display the Ethernet encapsulation when we need to, because this is the most commonly used form of encapsulation.

Chapter 3. IP: Internet Protocol

3.1 Introduction

IP is the workhorse protocol of the TCP/IP protocol suite. All TCP, UDP, ICMP, and IGMP data gets transmitted as IP datagrams. A fact that amazes many newcomers to TCP/IP, especially those from an X.25 or SNA background, is that IP provides an unreliable, connectionless datagram delivery service.

By unreliable we mean there are no guarantees that an IP datagram successfully gets to its destination. IP provides a best effort service. When something goes wrong, such as a router temporarily running out of buffers, IP has a simple error handling algorithm: throw away the datagram and try to send an ICMP message back to the source. Any required reliability must be provided by the upper layers (e.g., TCP).

The term connectionless means that IP does not maintain any state information about successive datagrams. Each datagram is handled independently from all other datagrams. This also means that IP datagrams can get delivered out of order. If a source sends two consecutive datagrams (first A, then B) to the same destination, each is routed independently and can take different routes, with B arriving before A.

In this chapter we take a brief look at the fields in the IP header, describe IP routing, and cover subnetting. We also look at two useful commands: ifconfig and netstat. We leave a detailed discussion of some of the fields in the IP header for later, when we can see exactly how the fields are used.

Chapter 18. TCP Connection Establishment and Termination

18.3 Timeout of Connection Establishment

There are several instances when the connection cannot be established. In one example the server host is down. To simulate this scenario we issue our telnet command after disconnecting the Ethernet cable from the server's host. Figure 18.6 shows the tcpdump output.

The interesting point in this output is how frequently the client's TCP sends a SYN to try to establish the connection.
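The experiment is easy to reproduce from an application: a blocking connect to a host that never answers the SYN returns only after the kernel has given up, so timing the call shows the overall limit discussed below. A sketch (192.0.2.1 is a placeholder test address):

```c
/* Sketch: measure how long a blocking connect() keeps retrying the SYN
 * before failing (typically with ETIMEDOUT). */
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in serv;
    time_t start = time(NULL);

    memset(&serv, 0, sizeof(serv));
    serv.sin_family = AF_INET;
    serv.sin_port = htons(23);                        /* Telnet, as in the example */
    inet_pton(AF_INET, "192.0.2.1", &serv.sin_addr);  /* unreachable placeholder */

    if (connect(fd, (struct sockaddr *) &serv, sizeof(serv)) < 0)
        perror("connect");
    printf("gave up after %ld seconds\n", (long) (time(NULL) - start));
    return 0;
}
```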
The second segment is sent 5.8 seconds after the first, and the third is sent 24 seconds after the second. What Figure 18.6 does not show is how long the client's TCP keeps retransmitting before giving up; to see that we have to time the telnet command itself. The time difference is 76 seconds. Most Berkeley-derived systems set a time limit of 75 seconds on the establishment of a new connection. We'll see in Section 21.4 that the third packet sent by the client would have timed out around 16:25:29, 48 seconds after it was sent, had the client not given up after 75 seconds.

First Timeout Period

One puzzling item in Figure 18.6 is that the first timeout period, 5.8 seconds, is close to 6 seconds, but not exact, while the second period is almost exactly 24 seconds. Ten more of these tests were run and the first timeout period took on various values between 5.59 seconds and 5.93 seconds. The second timeout period, however, was always 24.00.

What's happening here is that BSD implementations of TCP run a timer that goes off every 500 ms. This 500-ms timer is used for various TCP timeouts, all of which we cover in later chapters. When we type in the telnet command, an initial 6-second timer is established, but it may expire anywhere between 5.5 and 6 seconds in the future. Figure 18.7 shows what's happening. Although the timer is initialized to 12 ticks, the first decrement of the timer can occur between 0 and 500 ms after it is set. From that point on the timer is decremented about every 500 ms, but the first period can be variable. When that 6-second timer expires at the tick labeled 0 in Figure 18.7, the timer is reset for 24 seconds in the future. This next timer will be close to 24 seconds, since it was set at a time when the TCP's 500-ms timer handler was called by the kernel.

Type-of-Service Field

In Figure 18.6 the notation [tos 0x10] appears. This is the type-of-service field in the IP datagram. The BSD/386 Telnet client sets the field for minimum delay.

18.4 Maximum Segment Size

The maximum segment size (MSS) is the largest "chunk" of data that TCP will send to the other end. When a connection is established, each end can announce its MSS. The values we've seen have all been 1024. The resulting IP datagram is normally 40 bytes larger: 20 bytes for the TCP header and 20 bytes for the IP header.

Some texts refer to this as a "negotiated" option. It is not negotiated in any way. When a connection is established, each end has the option of announcing the MSS it expects to receive. If one end does not receive an MSS option from the other end, a default of 536 bytes is assumed.

In general, the larger the MSS the better, until fragmentation occurs. A larger segment size allows more data to be sent in each segment, amortizing the cost of the IP and TCP headers. When TCP sends a SYN segment, either because a local application wants to establish a connection or because a connection request has arrived from another host, it can send an MSS value up to the outgoing interface's MTU, minus the size of the fixed TCP and IP headers. For an Ethernet this implies an MSS of up to 1460 bytes. Using IEEE 802.3 encapsulation, the MSS can go up to 1452 bytes.

The values of 1024 that we've seen in this chapter, for connections involving BSD/386 and SVR4, are because many Berkeley-derived implementations require the MSS to be a multiple of 512. Other systems, such as SunOS 4.1.3, Solaris 2.2, and AIX 3.2.2, announce an MSS of 1460 when both ends are on a local Ethernet. Measurements in [Mogul 1993] show how an MSS of 1460 provides better performance on an Ethernet than an MSS of 1024.

If the destination IP address is "nonlocal," the MSS normally defaults to 536. While it's easy to say that a destination whose IP address has the same network ID and the same subnet ID as ours is local, and that a destination whose IP address has a completely different network ID from ours is nonlocal, a destination with the same network ID but a different subnet ID could be either local or nonlocal.
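The arithmetic behind these values is simply the usable MTU minus the fixed 20-byte IP header and 20-byte TCP header; IEEE 802.3 encapsulation loses 8 more bytes to the LLC/SNAP headers, and the 536 default corresponds to the 576-byte minimum IP datagram size. A quick sketch of the calculation:

```c
/* Sketch: announced MSS = MTU available to IP minus the fixed headers. */
#include <stdio.h>

static int mss_from_mtu(int mtu)
{
    return mtu - 20 - 20;            /* 20-byte IP header + 20-byte TCP header */
}

int main(void)
{
    printf("Ethernet, RFC 894, MTU 1500:           %d\n", mss_from_mtu(1500));
    printf("IEEE 802.3, RFC 1042, 1500 - 8 (SNAP): %d\n", mss_from_mtu(1492));
    printf("576-byte IP datagram (default):        %d\n", mss_from_mtu(576));
    return 0;                        /* prints 1460, 1452, and 536 */
}
```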
Most implementations provide a configuration option that lets the system administrator specify whether different subnets are local or nonlocal. The setting of this option determines whether the announced MSS is as large as possible (up to the outgoing interface's MTU) or the default of 536.

The MSS lets a host limit the size of datagrams that the other end sends it. When combined with the fact that a host can also limit the size of the datagrams that it sends, this lets a host avoid fragmentation when the host is connected to a network with a small MTU.

Consider our host slip, which has a SLIP link with an MTU of 296 to the router bsdi. Figure 18.8 shows these systems and the host sun. The important fact here is that sun cannot send a segment with more than 256 bytes of data, since it received an MSS option of 256. Furthermore, since slip knows that the MTU of its outgoing interface is 296, even though sun announced an MSS of 1460, it never sends more than 256 bytes of data, to avoid fragmentation. It's OK for a system to send less than the MSS announced by the other end.

This avoidance of fragmentation works only if either host is directly connected to a network with an MTU of less than 576. If both hosts are connected to Ethernets, and both announce an MSS of 536, but an intermediate network has an MTU of 296, fragmentation will occur. The only way around this is to use the path MTU discovery mechanism.

18.11 TCP Server Design

We said in Section 1.8 that most TCP servers are concurrent. When a new connection request arrives at a server, the server accepts the connection and invokes a new process to handle the new client. Depending on the operating system, various techniques are used to invoke the new server. Under Unix the common technique is to create a new process using the fork function. Lightweight processes (threads) can also be used, if supported.

What we're interested in is the interaction of TCP with concurrent servers. We need to answer the following questions: how are the port numbers handled when a server accepts a new connection request from a client, and what happens if multiple connection requests arrive at about the same time?

18.11.1 TCP Server Port Numbers

We can see how TCP handles the port numbers by watching any TCP server. We'll watch the Telnet server using the netstat command. The following output is on a system with no active Telnet connections.

The -a flag reports on all network end points, not just those that are in the ESTABLISHED state. The -n flag prints IP addresses as dotted-decimal numbers, instead of trying to use the DNS to convert the address to a name, and prints numeric port numbers instead of service names. The -f inet option reports only TCP and UDP end points.

The local address is output as *.23, where the asterisk is normally called the wildcard character. This means that an incoming connection request (i.e., a SYN) will be accepted on any local interface. If the host were multihomed, we could specify a single IP address for the local IP address, and only connection requests received on that interface would be accepted. The local port is 23, the well-known port number for Telnet. The foreign address is output as *.*, which means the foreign IP address and foreign port number are not known yet, because the end point is in the LISTEN state, waiting for a connection to arrive.

We now start a Telnet client on the host slip that connects to this server and look at the relevant lines of the netstat output again. The first line for port 23 is the ESTABLISHED connection. All four elements of the local and foreign address are filled in for this connection: the local IP address and port number, and the foreign IP address and port number. The local IP address corresponds to the interface on which the connection request arrived (the Ethernet interface). The end point in the LISTEN state is left alone. This is the end point that the concurrent server uses to accept future connection requests. It is the TCP module in the kernel that creates the new end point in the ESTABLISHED state, when the incoming connection request arrives and is accepted. Also notice that the port number for the ESTABLISHED connection doesn't change: it's 23, the same as the LISTEN end point.
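This behavior, where the listening end point is untouched and each accepted connection gets its own ESTABLISHED end point handled by a new process, is what the accept-and-fork loop of a typical Unix concurrent server produces. A minimal sketch (error handling is omitted, and binding port 23 normally requires superuser privileges):

```c
/* Sketch of a concurrent TCP server: the parent keeps the listening
 * socket (LISTEN state); each child serves one connected socket
 * (ESTABLISHED state) and exits. */
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in serv;

    memset(&serv, 0, sizeof(serv));
    serv.sin_family = AF_INET;
    serv.sin_addr.s_addr = htonl(INADDR_ANY);       /* wildcard local address: *.23 */
    serv.sin_port = htons(23);                      /* well-known Telnet port */
    bind(listenfd, (struct sockaddr *) &serv, sizeof(serv));
    listen(listenfd, SOMAXCONN);

    signal(SIGCHLD, SIG_IGN);                       /* let the kernel reap children */
    for (;;) {
        int connfd = accept(listenfd, NULL, NULL);  /* new ESTABLISHED end point */
        if (connfd < 0)
            continue;
        if (fork() == 0) {                          /* child: handle this client */
            close(listenfd);
            /* read the request and write the reply here */
            close(connfd);
            exit(0);
        }
        close(connfd);                              /* parent: back to accept() */
    }
}
```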
We now initiate another Telnet client from the same host (slip) to this server. The netstat output now shows two ESTABLISHED connections from the same host to the same server. Both have a local port number of 23. This is not a problem for TCP, since the foreign port numbers are different. They must be different, because each of the Telnet clients uses an ephemeral port, and the definition of an ephemeral port is one that is not currently in use on that host.

This example reiterates that TCP demultiplexes incoming segments using all four values that comprise the local and foreign addresses: destination IP address, destination port number, source IP address, and source port number. TCP cannot determine which process gets an incoming segment by looking at the destination port number alone. Also, the only one of the three end points at port 23 that will receive incoming connection requests is the one in the LISTEN state. The end points in the ESTABLISHED state cannot receive SYN segments, and the end point in the LISTEN state cannot receive data segments.

Next we initiate a third Telnet client, from the host solaris that is across the SLIP link from sun, and not on its Ethernet. The local IP address of the first ESTABLISHED connection now corresponds to the interface address of the SLIP link on the multihomed host sun.

Chapter 21. TCP Timeout and Retransmission

21.1 Introduction

TCP provides a reliable transport layer. One of the ways it provides reliability is for each end to acknowledge the data it receives from the other end. But data segments and acknowledgments can get lost. TCP handles this by setting a timeout when it sends data; if the data isn't acknowledged when the timeout expires, it retransmits the data. A critical element of any implementation is the timeout and retransmission strategy. How is the timeout interval determined, and how frequently does a retransmission occur?

We've already seen two examples of timeout and retransmission. (1) In the ICMP port unreachable example in Section 6.5 we saw the TFTP client using UDP employ a simple timeout and retransmission strategy: it assumed 5 seconds was an adequate timeout period and retransmitted every 5 seconds. (2) In the ARP example to a nonexistent host (Section 4.5), we saw that when TCP tried to establish the connection it retransmitted its SYN using a longer delay between each retransmission.

TCP manages four different timers for each connection.

1. A retransmission timer is used when expecting an acknowledgment from the other end. This chapter looks at this timer in detail, along with related issues such as congestion avoidance.
2. A persist timer keeps window size information flowing even if the other end closes its receive window. Chapter 22 describes this timer.
3. A keepalive timer detects when the other end of an otherwise idle connection crashes or reboots. Chapter 23 describes this timer.
4. A 2MSL timer measures the time a connection has been in the TIME_WAIT state. We described this state in Section 18.6.

In this chapter we start with a simple example of TCP's timeout and retransmission and then move to a larger example that lets us look at all the details involved in TCP's timer management.
We look at how typical implementations measure the round-trip time of TCP segments and how TCP uses these measurements to estimate the retransmission timeout of the next segment it transmits. We then look at TCP's congestion avoidance (what TCP does when packets are lost) and follow through an actual example where packets are lost. We also look at the newer fast retransmit and fast recovery algorithms, and see how they let TCP detect lost packets faster than waiting for a timer to expire.

21.2 Simple Timeout and Retransmission Example

Let's first look at the retransmission strategy used by TCP. We'll establish a connection, send some data to verify that everything is OK, disconnect the cable, send some more data, and watch what TCP does. Figure 21.1 shows the tcpdump output.

Lines 1, 2, and 3 correspond to the normal TCP connection establishment. Line 4 is the transmission of "hello, world" and line 5 is its acknowledgment. We then disconnect the Ethernet cable from svr4. Line 6 shows "and hi" being sent. Lines 7-18 are 12 retransmissions of that segment, and line 19 is when the sending TCP finally gives up and sends a reset.

Examine the time difference between successive retransmissions: with rounding they occur 1, 3, 6, 12, 24, 48, and then 64 seconds apart. We'll see later in this chapter that the first timeout is actually set for 1.5 seconds after the first transmission. After this the timeout value is doubled for each retransmission, with an upper limit of 64 seconds. This doubling is called an exponential backoff. Compare this to the TFTP example in Section 6.5, where every retransmission occurred 5 seconds after the previous one.

The time difference between the first transmission of the packet (line 6, at time 24.480) and the reset (line 19, at time 566.488) is about 9 minutes. Modern TCPs are persistent when trying to send data!

21.3 Round-Trip Time Measurement

Fundamental to TCP's timeout and retransmission is the measurement of the round-trip time (RTT) experienced on a given connection. We expect this can change over time, as routes might change and as network traffic changes, and TCP should track these changes and modify its timeout accordingly.

First TCP must measure the RTT between sending a byte with a particular sequence number and receiving an acknowledgment that covers that sequence number. Recall from the previous chapter that normally there is not a one-to-one correspondence between data segments and ACKs. In Figure 20.1 this means that one RTT that can be measured by the sender is the time between the transmission of segment 4 and the reception of segment 7, even though this ACK is for an additional 1024 bytes. We'll use M to denote the measured RTT.

The original TCP specification had TCP update a smoothed RTT estimator (called R) using the low-pass filter

    R ← αR + (1 − α)M

where α is a smoothing factor with a recommended value of 0.9. This smoothed RTT is updated every time a new measurement is made. Ninety percent of each new estimate is from the previous estimate and 10% is from the new measurement. Given this smoothed estimator, which changes as the RTT changes, RFC 793 recommended the retransmission timeout value (RTO) be set to

    RTO = βR

where β is a delay variance factor with a recommended value of 2.

[Jacobson 1988] details the problems with this approach, basically that it can't keep up with wide fluctuations in the RTT, causing unnecessary retransmissions. As Jacobson notes, unnecessary retransmissions add to the network load when the network is already loaded. It is the network equivalent of pouring gasoline on a fire.
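For comparison with the variance-based calculation that follows, the classic estimator just described might be coded as shown below (seeding the estimator from the first measurement is a simplification).

```c
/* Sketch of the RFC 793 estimator: smoothed RTT R, and RTO = beta * R. */
static double R = 0.0;                    /* smoothed RTT estimator, in seconds */

double rfc793_rto(double M)               /* M: the measured RTT */
{
    const double alpha = 0.9;             /* smoothing factor */
    const double beta  = 2.0;             /* delay variance factor */

    if (R == 0.0)
        R = M;                            /* first measurement seeds the estimator */
    else
        R = alpha * R + (1.0 - alpha) * M;
    return beta * R;                      /* retransmission timeout */
}
```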
What's needed is to keep track of the variance in the RTT measurements, in addition to the smoothed RTT estimator. Calculating the RTO based on both the mean and the variance provides much better response to wide fluctuations in the round-trip times than just calculating the RTO as a constant multiple of the mean. As described by Jacobson, the mean deviation is a good approximation to the standard deviation, but it is easier to compute. This leads to the following equations that are applied to each RTT measurement M.

    Err = M − A
    A ← A + g Err
    D ← D + h(|Err| − D)
    RTO = A + 4D

where A is the smoothed RTT (an estimator of the average) and D is the smoothed mean deviation. Err is the difference between the measured value just obtained and the current RTT estimator. Both A and D are used to calculate the next retransmission timeout (RTO). The gain g is for the average and is set to 1/8. The gain for the deviation is h, and it is set to 0.25. The larger gain for the deviation makes the RTO go up faster when the RTT changes.

TCP/IP Illustrated, Volume 3: TCP for Transactions, HTTP, NNTP, and the UNIX Domain Protocols

Chapter 1. T/TCP Introduction

1.2 UDP Client-Server

We begin with a simple UDP client-server example, showing the client source code in Figure 1.1. The client sends a request to the server; the server processes the request and sends back a reply.

Create a UDP socket

The socket function creates a UDP socket, returning a nonnegative descriptor to the process. The error-handling function err_sys is shown in Appendix B.2 of [Stevens 1992]. It accepts any number of arguments, formats them using vsprintf, prints the Unix error message corresponding to the errno value from the system call, and then terminates the process.

Fill in server's address

An Internet socket address structure is first zeroed out using memset and then filled in with the IP address and port number of the server. For simplicity we require the user to enter the IP address as a dotted-decimal number on the command line when the program is run (argv[1]). We #define the server's port number (UDP_SERV_PORT) in the cliserv.h header, which is included at the beginning of all the programs in this chapter. This is done for simplicity and to avoid complicating the code with calls to gethostbyname and getservbyname.

Form request and send it to server

The client forms a request (which we show only as a comment) and sends it to the server using sendto. This causes a single UDP datagram to be sent to the server. Once again, for simplicity, we assume a fixed-size request (REQUEST) and a fixed-size reply (REPLY). A real application would allocate room for its maximum-sized request and reply, but the actual request and reply would vary and would normally be smaller.

Read and process reply from server

The call to recvfrom blocks the process (i.e., puts it to sleep) until a datagram arrives for the client. The client then processes the reply (which we show as a comment) and terminates.

Create UDP socket and bind local address

The call to socket creates a UDP socket, and an Internet socket address structure is filled in with the server's local address. The local IP address is set to the wildcard (in case the server's host is multihomed, that is, has more than one network interface). The port number is set to the server's well-known port (UDP_SERV_PORT), which we said earlier is defined in the cliserv.h header. This local IP address and well-known port are bound to the socket by bind.

Process client requests

The server then enters an infinite loop, waiting for a client request to arrive (recvfrom), processing that request (which we show only as a comment), and sending back a reply (sendto).
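Figure 1.1 itself is not reproduced here, but from the walkthrough above the client might look like the sketch below. UDP_SERV_PORT, REQUEST, REPLY, err_sys, and cliserv.h are the names the text uses; their definitions are assumed, and this is a reconstruction rather than the book's listing.

```c
/* UDP client sketched from the description of Figure 1.1. */
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include "cliserv.h"            /* UDP_SERV_PORT, REQUEST, REPLY, err_sys() */

int main(int argc, char *argv[])
{
    int sockfd;
    struct sockaddr_in serv;
    char request[REQUEST], reply[REPLY];

    if ((sockfd = socket(AF_INET, SOCK_DGRAM, 0)) < 0)   /* create a UDP socket */
        err_sys("socket error");

    memset(&serv, 0, sizeof(serv));                      /* fill in server's address */
    serv.sin_family = AF_INET;
    serv.sin_addr.s_addr = inet_addr(argv[1]);           /* dotted-decimal argument */
    serv.sin_port = htons(UDP_SERV_PORT);

    /* form the request ... */
    if (sendto(sockfd, request, REQUEST, 0,              /* send it to the server */
               (struct sockaddr *) &serv, sizeof(serv)) != REQUEST)
        err_sys("sendto error");

    if (recvfrom(sockfd, reply, REPLY, 0, NULL, NULL) < 0)  /* wait for the reply */
        err_sys("recvfrom error");
    /* process the reply ... */
    return 0;
}
```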
1.3 TCP Client-Server

Our next example of a client-server transaction application uses TCP. Figure 1.5 shows the client program.

Create TCP socket and connect to server

A TCP socket is created by socket, and then an Internet socket address structure is filled in with the IP address and port number of the server. The call to connect causes TCP's three-way handshake to occur, establishing a connection between the client and server. Chapter 18 of Volume 1 provides additional details on the packet exchanges that take place when TCP connections are established and terminated.

Send request and half-close the connection

The client's request is sent to the server by write. The client then closes one-half of the connection, the direction of data flow from the client to the server, by calling shutdown with a second argument of 1. This tells the server that the client is done sending data: it passes an end-of-file notification from the client to the server. A TCP segment containing the FIN flag is sent to the server. The client can still read from the connection; only one direction of data flow is closed. This is called TCP's half-close. Section 18.5 of Volume 1 provides additional details.

Read reply

The reply is read by our function read_stream, shown in Figure 1.6. Since TCP is a byte-stream protocol, without any form of record markers, the reply from the server's TCP can be returned in one or more TCP segments, and these can be returned to the client process in one or more reads. Furthermore, we know that when the server has sent the complete reply, the server process closes the connection, causing its TCP to send a FIN segment to the client, which is returned to the client process by read returning an end-of-file (a return value of 0). To handle these details, the function read_stream calls read as many times as necessary, until either the input buffer is full or an end-of-file is returned by read. The return value of the function is the number of bytes read.

Create listening TCP socket

A TCP socket is created and the server's well-known port is bound to the socket. As with the UDP server, the TCP server binds the wildcard as its local IP address. The call to listen makes the socket a listening socket on which incoming connections will be accepted, and the second argument of SOMAXCONN specifies the maximum number of pending connections the kernel will queue for the socket.

Accept a connection and process request

The server blocks in the call to accept until a connection is established by the client's connect. The new socket descriptor returned by accept, sockfd, refers to the connection to the client. The client's request is read by read_stream and the reply is returned by write.

TCP's TIME_WAIT State

TCP requires that the endpoint that sends the first FIN, which in our example is the client, must remain in the TIME_WAIT state for twice the maximum segment lifetime (MSL) once the connection is completely closed by both ends. The recommended value for the MSL is 120 seconds, implying a TIME_WAIT delay of 4 minutes. While the connection is in the TIME_WAIT state, that same connection cannot be opened again.

Reducing the Number of Segments with TCP

TCP can reduce the number of segments in the transaction shown in Figure 1.8 by combining data with the control segments, as we show in Figure 1.9. Notice that the first segment now contains the SYN, data, and FIN, not just the SYN as we saw in Figure 1.8. Similarly the server's reply is combined with the server's FIN. Although this sequence of packets is legal under the rules of TCP, the author is not aware of a method for an application to cause TCP to generate this sequence of segments using the sockets API (hence the question mark next to the operation that generates the first segment from the client, and the question mark next to the operation that generates the final segment from the server), and knows of no implementations that actually generate this sequence of segments.
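The read_stream function of Section 1.3 can be reconstructed directly from its description: keep calling read until the buffer is full or the peer's FIN causes read to return 0, and return the total byte count. A sketch (again ours, not the book's Figure 1.6):

```c
#include <sys/types.h>
#include <unistd.h>

/* Read from a TCP socket until the buffer is full or end-of-file. */
ssize_t read_stream(int fd, char *ptr, size_t maxbytes)
{
    size_t  nleft = maxbytes;
    ssize_t nread;

    while (nleft > 0) {
        if ((nread = read(fd, ptr, nleft)) < 0)
            return nread;              /* error */
        else if (nread == 0)
            break;                     /* EOF: the server closed the connection */
        nleft -= nread;
        ptr   += nread;
    }
    return maxbytes - nleft;           /* number of bytes actually read */
}
```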
卷一:協(xié)議

第1章 概述

1.3 TCP/IP的分層

在TCP/IP協(xié)議族中,有很多種協(xié)議。圖1-4給出了本書將要討論的其他協(xié)議。TCP和UDP是兩種最為著名的運輸層協(xié)議,二者都使用IP作為網(wǎng)絡層協(xié)議。雖然TCP使用不可靠的IP服務,但它卻提供一種可靠的運輸層服務。本書第17～22章將詳細討論TCP的內部操作細節(jié)。然后,我們將介紹一些TCP的應用,如第26章中的Telnet和Rlogin、第27章中的FTP以及第28章中的SMTP等。這些應用通常都是用戶進程。

UDP為應用程序發(fā)送和接收數(shù)據(jù)報。一個數(shù)據(jù)報是指從發(fā)送方傳輸?shù)浇邮辗降囊粋€信息單元(例如,發(fā)送方指定的一定字節(jié)數(shù)的信息)。但是與TCP不同的是,UDP是不可靠的,它不能保證數(shù)據(jù)報能安全無誤地到達最終目的。本書第11章將討論UDP,然后在第14章(DNS:域名系統(tǒng)),第15章(TFTP:簡單文件傳送協(xié)議),以及第16章(BOOTP:引導程序協(xié)議)介紹使用UDP的應用程序。SNMP也使用了UDP協(xié)議,但是由于它還要處理許多其他的協(xié)議,因此本書把它留到第25章再進行討論。

IP是網(wǎng)絡層上的主要協(xié)議,同時被TCP和UDP使用。TCP和UDP的每組數(shù)據(jù)都通過端系統(tǒng)和每個中間路由器中的IP層在互聯(lián)網(wǎng)中進行傳輸。在圖1-4中,我們給出了一個直接訪問IP的應用程序。這是很少見的,但也是可能的(一些較老的選路協(xié)議就是以這種方式來實現(xiàn)的。當然新的運輸層協(xié)議也有可能使用這種方式)。第3章主要討論IP協(xié)議,但是為了使內容更加有針對性,一些細節(jié)將留在后面的章節(jié)中進行討論。第9章和第10章討論IP如何進行選路。

ICMP是IP協(xié)議的附屬協(xié)議。IP層用它來與其他主機或路由器交換錯誤報文和其他重要信息。第6章對ICMP的有關細節(jié)進行討論。盡管ICMP主要被IP使用,但應用程序也有可能訪問它。我們將分析兩個流行的診斷工具,Ping和Traceroute(第7章和第8章),它們都使用了ICMP。

IGMP是Internet組管理協(xié)議。它用來把一個UDP數(shù)據(jù)報多播到多個主機。我們在第12章中描述廣播(把一個UDP數(shù)據(jù)報發(fā)送到某個指定網(wǎng)絡上的所有主機)和多播的一般特性,然后在第13章中對IGMP協(xié)議本身進行描述。

ARP(地址解析協(xié)議)和RARP(逆地址解析協(xié)議)是某些網(wǎng)絡接口(如以太網(wǎng)和令牌環(huán)網(wǎng))使用的特殊協(xié)議,用來轉換IP層和網(wǎng)絡接口層使用的地址。我們分別在第4章和第5章對這兩種協(xié)議進行分析和介紹。

1.8 客戶-服務器模型

大部分網(wǎng)絡應用程序在編寫時都假設一端是客戶,另一端是服務器,其目的是為了讓服務器為客戶提供一些特定的服務??梢詫⑦@種服務分為兩種類型:重復型或并發(fā)型。重復型服務器通過以下步驟進行交互:

I1. 等待一個客戶請求的到來。
I2. 處理客戶請求。
I3. 發(fā)送響應給發(fā)送請求的客戶。
I4. 返回I1步。

重復型服務器主要的問題發(fā)生在I2狀態(tài)。在這個時候,它不能為其他客戶機提供服務。相應地,并發(fā)型服務器采用以下步驟:

C1. 等待一個客戶請求的到來。
C2. 啟動一個新的服務器來處理這個客戶的請求。在這期間可能生成一個新的進程、任務或線程,并依賴底層操作系統(tǒng)的支持。這個步驟如何進行取決于操作系統(tǒng)。生成的新服務器對客戶的全部請求進行處理。處理結束后,終止這個新服務器。
C3. 返回C1步。

并發(fā)服務器的優(yōu)點在于它是利用生成其他服務器的方法來處理客戶的請求。也就是說,每個客戶都有它自己對應的服務器。如果操作系統(tǒng)允許多任務,那么就可以同時為多個客戶服務。對服務器,而不是對客戶進行分類的原因是因為對于一個客戶來說,它通常并不能夠辨別自己是與一個重復型服務器或并發(fā)型服務器進行對話。一般來說,TCP服務器是并發(fā)的,而UDP服務器是重復的,但也存在一些例外。我們將在11.12節(jié)對UDP對其服務器產(chǎn)生的影響進行詳細討論,并在18.
11節(jié)對 T C P對其服務器的影 響進行討論。 第 2 章 鏈路層 2.1 引言 從圖 1 - 4中可以看出,在 T C P / I P協(xié)議族中,鏈路層主要有三個目的:( 1)為 I P模塊發(fā)送和接收 I P數(shù)據(jù)報;( 2)為 A R P模塊發(fā)送 A R P請求和接收 A R P應答;( 3)為 R A R P發(fā)送 R A R P請求和接收 R A R P應答。 T C P / I P支持多種不同的鏈路層協(xié)議,這取決于網(wǎng)絡所使用的硬件,如以太網(wǎng)、令牌環(huán)網(wǎng)、 F D D I(光纖分布式數(shù)據(jù)接口)及 R S-2 3 2串行線路等。在本章中,我們將詳細討論以太網(wǎng)鏈路層協(xié)議,兩個串 行接口鏈路層協(xié)議( S L I P和 P P P),以及大多數(shù)實現(xiàn)都包含的環(huán)回( l o o p b a c k)驅動程序。以太網(wǎng)和 S L I P是本書中大多數(shù)例子使用的鏈路層。對 M T U(最大傳輸單元)進行了介紹,這個概念在本書的后面章節(jié)中將多次遇到。我們還討論了如何為串行線路選擇 M T U。 2.2 以太網(wǎng)和 IEEE 802 封裝 以太網(wǎng)這個術語一般是指數(shù)字設備公司( Digital Equipment Corp.)、英特爾公司( I n t e lC o r p .)和 X e r o x公司在 1 9 8 2年聯(lián)合公布的一個標準。它是當今 T C P / I P采用的主要的局域網(wǎng)技術。它采用一種稱作 C S M A / C D的媒體接入方法,其意思是帶沖突檢測的載波偵聽多路接入( Carrier Sense, Multiple Access with Collision Detection)。它的速率為 10 Mb/s,地址為 48 bit。 幾年后, I E E E(電子電氣工程師協(xié)會) 8 0 2委員會公布了一個稍有不同的標準集,其中 8 0 2 . 3針對整個 C S M A / C D網(wǎng)絡, 8 0 2 . 4針對 令牌總線網(wǎng)絡, 8 0 2 . 5針對令牌環(huán)網(wǎng)絡。這三者的共同特性由 8 0 2 . 2標準來定義,那就是 8 0 2網(wǎng)絡共有的邏輯鏈路控制( L L C)。不幸的是, 8 0 2 . 2和 8 0 2 . 3定義了一個與以太網(wǎng)不同的幀格式。文獻 Stallings 1987對所有的 IEEE 802標準進行了詳細的介紹。 在 T C P / I P世界中,以太網(wǎng) I P數(shù)據(jù)報的封裝是在 RFC 894Hornig 1984中定義的, IEEE 802網(wǎng)絡的 I P數(shù)據(jù)報封裝是在 RFC 1042Postel and Reynolds 1988中定義的。主機需求 R F C要求每臺 I n t e r n e t主機都與一個 10 Mb/s的以太網(wǎng)電纜相連接: 1) 必須能發(fā)送和接收采用 RFC 894(以太網(wǎng))封裝格式的分組。 2) 應該能接收與 RFC 894混合的 RFC 1042( IEEE 802)封裝格式的分組。 3) 也許能夠發(fā)送采用 RFC 1042格式封裝的分組。 如果主機能同時發(fā)送兩種類型的分組數(shù)據(jù),那么發(fā)送的分組必須是可以設置的,而且默認條件下必須是 RFC 894分組。最常使用的封裝格式是 RFC 894定義的格式。 圖 2 - 1顯示了兩種不同形式的封裝格式。圖中每個方框下面的數(shù)字是它們的字節(jié)長度。兩種幀格式都采用 48 bit( 6字節(jié))的目的地址和源地址( 8 0 2 . 3允許使用 16 bit的地址,但一般是 48 bit地址)。這就是我們在本書中所稱的硬件地址。 A R P和 R A R P協(xié)議(第 4章和第 5章)對 32 bit的 I P地址和 48 bit的硬件地址進行映射。接下來的 2個字節(jié)在兩種幀格式中互不相同。在 8 0 2標準定義的幀格式中,長度字段是指它后續(xù)數(shù)據(jù)的字節(jié)長度,但不包括 C R C檢驗碼。以太網(wǎng)的類型字段定義了 后續(xù)數(shù)據(jù)的類型。在 8 0 2標準定義的幀格式中,類型字段則由后續(xù)的子網(wǎng)接入?yún)f(xié)議( Sub-network AccessP r o t o c o l, S N A P)的首部給出。幸運的是, 8 0 2定義的有效長度值與以太網(wǎng)的有效類型值無一相同,這樣,就可以對兩種幀格式進行區(qū)分。 在以太網(wǎng)幀格式中,類型字段之后就是數(shù)據(jù);而在 8 0 2幀格式中,跟隨在后面的是 3字節(jié)的 802.2 LLC和 5字節(jié)的 802.2 SNAP。目的服務訪問點( Destination Service Access Point,D S A P)和源服務訪問點( Source Service Access Point, SSAP)的值都設為 0 x a a。 Ct r l字段的值設為 3。隨后的 3個字節(jié) o rg code都置為 0。再接下來的 2個字節(jié)類型字段和以太網(wǎng)幀格式一樣(其他類型字段值可以參見 RFC 1340 Reynolds and Postel 1992)。 C R C字段用于幀內后續(xù)字節(jié)差錯的循環(huán)冗余碼檢驗(檢驗和)(它也被稱為 F C S或幀檢驗序列)。 8 0 2 . 3標準定義的幀和以太網(wǎng)的幀都有最小長度要求。 8 0 2 . 3規(guī)定數(shù) 據(jù)部分必須至少為 3 8字節(jié),而對于以太網(wǎng),則要求最少要有 4 6字節(jié)。為了保證這一點,必須在不足的空間插入填充( p a d)字節(jié)。在開始觀察線路上的分組時將遇到這種最小長度的情況。在本書中,我們在需要的時候將給出以太網(wǎng)的封裝格式,因為這是最為常見的封裝格式。 第 3 章 IP:網(wǎng)際協(xié)議 3.1 引言 I P是 T C P / I P協(xié)議族中最為核心的協(xié)議。所有的 T C P、 U D P、 I C M P及 I G M P數(shù)據(jù)都以 I P數(shù)據(jù)報格式傳輸(見圖 1 - 4)。許多剛開始接觸 T C P / I P的人對 I P提供不可靠、無連接的數(shù)據(jù)報傳送服務感到很奇怪,特別是那些具有 X . 
2 5或 S N A背景知識的人。 不可靠( u n r e l i a b l e)的意思是它不能保證 I P數(shù)據(jù)報能成功地到達目的地。 I P僅提供最好的傳輸服務。如果發(fā)生某種錯誤時,如某個路由器暫時用完了緩沖區(qū), I P有一個簡單的錯誤處理算法:丟棄該數(shù)據(jù)報,然后發(fā)送I C M P消息報給信源端。任何要求的可靠性必須由上層來提供(如 T C P)。 無連接( c o n n e c t i o n l e s s)這個術語的意思是 I P并不維護任何關于后續(xù)數(shù)據(jù)報的狀態(tài)信息。每個數(shù)據(jù)報的處理是相互獨立的。這也說明, I P數(shù)據(jù)報可以不按發(fā)送順序接收。如果一信源向相同的信宿發(fā)送兩個連續(xù)的數(shù)據(jù)報(先是 A,然后是 B),每個數(shù)據(jù)報都是獨立地進行路由選擇,可能選擇不同的路線,因此 B可能在 A到達之前先到達。 在本章,我們將簡要介紹 I P首部中的各個字段,討論 I P路由選擇和子網(wǎng)的有關內容。還要介紹兩個有用的命令: i f c o n f i g和 n e t s t a t。關于 I P首部中一些字段的細節(jié),將留在以后使用這些字段的時候再進行討論。 RFC 791Postel 1981a是 I P的正式規(guī)范文件。 第 18章 TCP 的連接和終止 18.3 連接建立的超時 有很多情況導致無法建立連接。一種情況是服務器主機沒有處于正常狀態(tài)。為了模擬這種情況,我們斷開服務器主機的電纜線,然后向它發(fā)出 telnet命令。圖 18-6顯示了 tcpdump的輸出。 在這個輸出中有趣的一點是客戶間隔多長時間發(fā)送一個 SYN,試圖建立連接。第2個 SYN與第 1個的間隔是 5.8秒,而第 3個與第 2個的間隔是 24秒。作為一個附注,這個例子運行 38分鐘后客戶重新啟動。這對應初始序號為 291 008 001 (約為 38 60 64000 2)。我們曾經(jīng)介紹過使用典型的伯克利實現(xiàn)版的系統(tǒng)將初始序號初始化為 1,然后每隔 0.5秒就增加 64000。另外,因為這是系統(tǒng)啟動后的第一個 TCP連接,因此客戶的端口號是 1024。 圖 18-6中沒有顯示客戶端在放棄建立連接嘗試前進行 SYN重傳的時間。為了了解它我們必須對 telnet命令進行計時: 時間差值是 76秒。大多數(shù)伯克利系統(tǒng)將建立一個新連接的最長時間限制為 75秒。我們將在 21.4節(jié)看到由客戶發(fā)出的第 3個分組大約在 16:25:29超時, 客戶在它第 3個分組發(fā)出后 48秒而不是 75秒后放棄連接。 18.3.1 第一次超時時間在圖 18-6中一個令人困惑的問題是第一次超時時間為 5.8秒,接近 6秒,但不準確,相比之下第二個超時時間幾乎準確地為 24秒。運行十多次測試,發(fā)現(xiàn)第一次超時時間在 5.59秒 5.93秒之間變化。然而,第二次超時時間則總是 24.00秒(精確到小數(shù)點后面兩位)。 這是因為 BSD版的 TCP軟件采用一種 500 ms的定時器。這種 500 ms的定時器用于確定本章中所有的各種各樣的 TCP超時。當我們鍵入 telnet命令,將建立一個 6秒的定時器( 12個時鐘滴答( tick),但它可能在之后的 5.5秒 6秒內的任意時刻超時。圖 18-7顯示了這一發(fā)生過程。 盡管定時器初始化為 12個時鐘滴答,但定時計數(shù)器會在設置后的第一個 0500 ms中的任意時秒刻減 1。從那以后,定時計數(shù)器大約每隔 500 ms減 1,但在第 1個 500 ms內是可變的(我們使用限定詞“大約”是因為在 TCP每隔 500 ms獲得系統(tǒng)控制的瞬間,系統(tǒng)內核可能會優(yōu)先處理其他中斷)。 當?shù)未鹩嫈?shù)器為 0時, 6秒的定時器便會超時(見圖 18-7),這個定時器會在以后的 24秒( 48個滴答)重新復位。之后的下一個定時器將更接近 24秒,因為當TCP的 500 ms定時器被內核調用時,它就會被修改一次。 在圖 18-6中,出現(xiàn)了符號 tos 0x10 。這是 IP數(shù)據(jù)報內的服務類型( TOS)字段(參見圖 3-2)。 BSD/386中的 Telnet客戶進程將這個字段設置為最小時延。 18.4 最大報文段長度最大報文段長度( MSS)表示 TCP傳往另一端的最大塊數(shù)據(jù)的長度。當一個連接建立時,連接的雙方都要通告各自的 MSS。我們已經(jīng)見過 MSS都是 1024。這導致 IP數(shù)據(jù)報通常是 40字節(jié) 長: 20字節(jié)的 TCP首部和 20字節(jié)的 IP首部。在有些書中,將它看作可“協(xié)商”選項。它并不是任何條件下都可協(xié)商。當建立一個連 接時,每一方都有用于通告它期望接收的 MSS選項( MSS選項只能出現(xiàn)在 SYN報文段中)。如果一方不接收來自另一方的 MSS值,則 MSS就定為默認值 536字節(jié)(這個默認值允許 20字節(jié)的 IP首部和 20字節(jié)的 TCP首部以適合 576字節(jié) IP數(shù)據(jù)報 ) 。 一般說來,如果沒有分段發(fā)生, MSS還是越大越好(這也并不總是正確,參見圖 24-3和圖 24-4中的例子)。報文段越大允許每個報文段傳送的數(shù)據(jù)就 越多,相對 IP和 TCP首部有更高的網(wǎng)絡利用率。當 TCP發(fā)送一個 SYN時,或者是因為一個本地應用進程想發(fā)起一個連接,或者是因為另一端的主機收到了一個連接請求,它能將 MSS值設置為外出接口上的 MTU長度減去固定的 IP首部和 TCP首部長度。對于一個以太網(wǎng), MSS值可達 1460字節(jié)。使用 IEEE 802.3的封裝(參見 2.2節(jié)),它的 MSS可達 1452字節(jié)。 在本章見到的涉及 BSD/386和 SVR4的 MSS為 1024,這是因為許多 BSD的實現(xiàn)版本需要 MSS為 512的倍數(shù)。其他的系統(tǒng),如 SunOS 4.1.3、 Solaris 2.2 和 AIX 3.2.2,當雙方都在一個本地以太網(wǎng)上時都規(guī)定 MSS為 1460。 Mogul 1993 的比較顯示了在以太網(wǎng)上 1460的 MSS在性能上比 1024的 MSS更好。如果目的 IP地址為“非本地的 (nonlocal)”, MSS通常的默認值為 536。而區(qū)分地址是本地還是非本地是簡單的,如果目的 IP地址的網(wǎng)絡號與子網(wǎng)號都和我們的相同,則是本地的;如果目的 IP地址的網(wǎng)絡號與我們的完全不同,則是非本地的;如果目的 IP地址的網(wǎng)絡號與我們的相同而子網(wǎng)號與我們的不同,則可能是本地的 ,也可能是非本地的。大多數(shù) TCP實現(xiàn)版都提供了一個配置選項(附錄 E和圖 E-1),讓系統(tǒng)管理員說明不同的子網(wǎng)是屬于本地還是非本地。這個選項的設置將確定 MSS可以選擇盡可能的大(達到外出接口的 MTU長度)或是默認值 536。 MSS讓主機限制另一端發(fā)送數(shù)據(jù)報的長度。加上主機也能控制它發(fā)送數(shù)據(jù)報的長度,這將使以較小 MTU連接到一個網(wǎng)絡上的主機避免分段??紤]我們的主機slip,通過 MTU為 296的 SLIP鏈路連接到路由器 bsdi上。圖 18-8顯示這些系統(tǒng)和主機 sun。 從 sun向 slip發(fā)起一個 TCP連接, 并使用 tcpdump來觀察報文段。圖 18-9顯示這個連接 的建立(省略了通告窗口大?。?在這個例子中, sun發(fā)送的報文段不能超過 256字節(jié)的數(shù)據(jù),因為它收到的 MSS選項值為 256(第 2行)。此外,由于 slip知道它外出接口的 MTU長度為 296,即使 
sun已經(jīng)通告它的 MSS為 1460,但為避免將數(shù)據(jù)分段,它不會發(fā)送超過 256字節(jié)數(shù)據(jù)的報文段。系統(tǒng)允許發(fā)送的數(shù)據(jù)長度小于另一端的 MSS值。 只有當一端的主機以小于 576字節(jié)的 MTU直接連接到一個網(wǎng)絡中,避免這種分段才會有效。如果兩端的主機都連接到以 太網(wǎng)上,都采用 536的 MSS,但中間網(wǎng)絡采用 296的 MTU,也將會出現(xiàn)分段。使用路徑上的 MTU發(fā)現(xiàn)機制(參見 24.2節(jié))是關于這個問題的唯一方法。 18.11 TCP 服務器的設計 我們在 1 . 8節(jié)說過大多數(shù)的 T C P服務器進程是并發(fā)的。當一個新的連接請求到達服務器時,服務器接受這個請求,并調用一個新進程來處理這個新的客戶請求。不同的操作系統(tǒng)使用不同的技術來調用新的服務器進程。在 U n i x系統(tǒng)下,常用的技術是使用 f o r k函數(shù)來創(chuàng)建新的進程。如果系統(tǒng)支持,也可使用輕型進程,即線程( t h r e a d)。 我們感興趣的是 T C P與若干并發(fā)服務器的交互作用。需要回答下面的問題:當一個服務器進程接受一來自客戶進程的服務請求時是如何處理端口的?如果多個連接請求幾乎同時到 達會發(fā)生什么情況? 18.11.1 TCP 服務器端口號 通過觀察任何一個 T C P服務器,我們能了解 T C P如何處理端口號。我們使用 n e t s t a t命令來觀察 Te l n e t服務器。下面是在沒有 Te l n e t連接時的顯示(只留下顯示 Te l n e t服務器的行) sun % netstat -a -n -f inet Active Internet connections (including servers) Proto Recv-Q Send-Q Local Address Foreign Address (state) tcp 0 0 *.23 *.* LISTEN a標志將顯示網(wǎng)絡中的所有主機端,而不僅僅是處于 E S TA B L I S H E D的主機端。 - n標志將以點分十進制的 形式顯示 I P地址,而不是通過 D N S將地址轉化為主機名,同時還要求顯示端口號(例如為 2 3)而不是服務名稱(如 Te l n e t)。 -f inet選項則僅要求顯示使用 T C P或 U D P的主機。顯示的本地地址為 * . 2 3,星號通常又稱為通配符。這表示傳入的連接請求(即 S Y N)將被任何一個本地接口所接收。如果該主機是多接口主機,我們將制定其中的一個 I P地址為本地 I P地址,并且只接收來自這個接口的連接(在本節(jié)后面我們將看到這樣的例子)。本地端口為 2 3,這是 Te l n e t的熟知端口號。 遠端地址顯示為 * . *,表示還不知道遠端 I P地址和端口號,因為該端還處于 L I S T E N狀態(tài),正等待連接請求的到達。現(xiàn)在我們在主機 s l i p( 1 4 0 . 2 5 2 . 1 3 . 6 5)啟動一個 Te l n e t客戶程序來連接這個 Te l n e t服務器。以下是 n e t s t a t程序的輸出行: Proto Recv-Q Send-Q Local Address Foreign Address (state) tcp 0 0 3.23 5.1029 ESTABLISHED tcp 0 0 *.23 *.* LISTEN 端口號為 23的第 1行表示處于 E S TABLISHED 狀態(tài)的連接。另外還顯示了這個連接的本地 I P地址、本地端口號、遠端 I P地址和遠端端口號。本地 I P地址為該連接請求到達的接口(以太網(wǎng)接口, 1 4 0 . 2 5 2 . 1 3 . 3 3)。處于 L I S T E N狀態(tài)的服務器進程仍然存在。這個服務器進程是當前 Te l n e t服務器用于接收其他的連接請求。當傳入的連接請求到達并被接收時,系統(tǒng)內核中的T C P模塊就創(chuàng)建一個處于 E S TA B L I S H E D狀態(tài)的進程。另外,注意處于 E S TA B L I S H E D狀態(tài)的連接的端口不會變化:也是 2 3,與處于 L I S T E N狀態(tài)的進程相同?,F(xiàn)在我們在主機 s l i p上啟動另一個 Te l n e t客戶進程,并仍與這個 Te l n e t服務器進行連接。以下是 n e t s t a t程序 的輸出行: Proto Recv-Q Send-Q Local Address Foreign Address (state) tcp 0 0 3.23 5.1030 ESTABLISHED tcp 0 0 3.23 5.1029 ESTABLISHED tcp 0 0 *.23 *.* LISTEN 現(xiàn)在我們有兩條從相同主機到相同服務器的處于 E S TA B L I S H E D 的連接。它們的本地端口號均為 2 3。由于它們的遠端端口號不同,這不會造成沖突。因為每個 Te l n e t 客戶進程要使用一個外設端口,并且這個外設端口會選擇為主機( s l i p)當前未曾使用的端口,因此它們的端口號肯定不同。這個例子再次重申 T C P使用由本地地址和遠端地址組成的 4元組:目的 I P 地址、目的端口號、源 I P地址和源端口號來處理傳入的多個連接 請求。 T C P 僅通過目的端口號無法確定那個進程接收了一個連接請求。另外,在三個使用端口 2 3的進程中,只有處于 L I S T E N 的進程能夠接收新的連接請求。處于 E S TA B L I S H E D的進程將不能接收 S Y N報文段,而處于 L I S T E N的進程將不能接收數(shù)據(jù)報文段。下面我們從主機 s o l a r i s 上啟動第 3 個 Te l n e t 客戶進程,這個主機通過 S L I P鏈路與主機 s u n 相連,而不是以太網(wǎng)接口。 第 21 章 TCP 的超時與重傳 21.1 引言 TCP提供可靠的運輸層。它使用的方法之一就是確認從另一端收到的數(shù)據(jù)。但數(shù)據(jù)和確認都有可能會丟失。 TCP通過在發(fā)送時設置一個定時器來解決這種問題。如果當定時器溢出時還沒有收到確認,它就重傳該數(shù)據(jù)。對任何實現(xiàn)而言,關鍵之處就在于超時和重傳的策略,即怎樣決定超時間隔和如何確定重傳的頻率。 我們已經(jīng)看到過兩個超時和重傳的例子:( 1)在 6.5節(jié)的 ICMP端口不能到達的例子中,看到 TFTP客戶使用 UDP實現(xiàn)了一個簡單的超時和重傳機制:假定 5秒是一個適當?shù)臅r間間隔,并每隔 5秒進行重傳;( 2)在向一個不存在的主機發(fā) 送 ARP的例子中(第 4.5節(jié)),我們看到當 TCP試圖建立連接的時候,在每個重傳之間使用一個較長的時延來重傳 SYN。 對每個連接, TCP管理 4個不同的定時器。 1) 重傳定時器使用于當希望收到另一端的確認。在本章我們將詳細討論這個定時器以及一些相關的問題,如擁塞避免。 2) 堅持 (persist)定時器使窗口大小信息保持不斷流動,即使另一端關閉了其接收窗口。第 22章將討論這個問題。 3) ?;?(keepalive)定時器可檢測到一個空閑連接的另一端何時崩潰或重啟。第 23章將描述這個定時器。 4) 2MSL定時器測量一個連接處于 TIME_WAIT狀態(tài)的時間。我們在 18.6節(jié)對該狀態(tài)進行了介紹。 本章以一個簡單的 TCP超時和重傳的例子開始,然后轉向一個更復雜的例子。該例子可以使我們觀察到 TCP時鐘管理的所有細節(jié)??梢钥吹?TCP的典型實現(xiàn)是怎樣測量 TCP報文段的往返時間以及 TCP如何使用這些測量結果來為下一個將要傳輸?shù)膱笪亩谓⒅貍鞒瑫r時間。接著我們將研究 TCP的擁塞避免 當分組丟失時 
TCP所采取的動作 并提供一個分組丟失的實際例子,我們還將介紹較新的快速重傳和快速恢復算法,并介紹該算法如何使 TCP檢測分組丟失比等待時鐘超時更快 21.2 超時與重傳的簡單例子 首先觀察 TCP所使用的重傳機制,我們將建立一個連接,發(fā)送一些分組來證明一切正常,然后拔掉電纜,發(fā)送更多的數(shù)據(jù),再觀察 TCP的行為。 圖 21-1表示的是 tcpdump的輸出結果(已經(jīng)去掉了 bsdi設置的服務類型信息)。 圖 21-1 TCP超時和重傳的簡單例子 第 1、 2和 3行表示正常的 TCP連接建立的過程,第 4行是“ hello, world”( 12個字符加上回車和換行)的傳輸過程,第 5行是其確認。接著我們從 svr4拔掉了以太網(wǎng)電纜, 第 6行表示 and hi”將被發(fā)送。第 718行是這個報文段的 12次重傳過程,而第 19行則是發(fā)送方的 TCP最終放棄并發(fā)送一個復位信號的過程。 現(xiàn)在檢查連續(xù)重傳之間不同的時間差,它們取整后分別為 1、 3、 6、 12、 24、48和多個 64秒。在本章的后面,我們將看到當?shù)谝淮伟l(fā)送后所設置的超時時間實際上為 1.5秒(它在首次發(fā)送后的 1.0136秒而不是精確的 1.5秒后,發(fā)生的原因我們已在圖 18-7中進行了解釋),此后該時間在每次重傳時增加 1倍并直至 64秒。 這個倍乘關系被稱為“指數(shù)退避 (exponential backoff)”??梢詫⒃摾优c 6.5節(jié)中的 TFTP例子比較,在那里每次重傳總是在前一次的 5秒后發(fā)生。 首次分組傳輸(第 6行, 24.480秒)與復位信號傳輸(第 19行, 566.488秒)之間的時間差約為 9分鐘,該時間在目前的 TCP實現(xiàn)中是不可變的。 對于大多數(shù)實現(xiàn)而言,這個總時間是不可調整的。 Solaris 2.2允許管理者改變這個時間( E.4節(jié)中的 tcp_ip_abort_interval變量),且其默認值為 2分鐘,而不是最常用的 9分鐘。 21.3 往返時間測量 TCP超時與重傳中最重要的部分就是 對一個給定連接的往返時間( RTT)的測量。由于路由器和網(wǎng)絡流量均會變化,因此我們認為這個時間可能經(jīng)常會發(fā)生變化, TCP應該跟蹤這些變化并相應地改變其超時時間。 首先 TCP必須測量在發(fā)送一個帶有特別序號的字節(jié)和接收到包含該字節(jié)的確認之間的 RTT。在上一章中,我們曾提到在數(shù)據(jù)報文段和 ACK之間通常并沒有一一對應的關系。在圖 20.1中,這意味著發(fā)送方可以測量到的一個 RTT,是在發(fā)送報文段 4(第 11024字節(jié))和接收報文段 7(對 11024字節(jié)的 ACK)之間的時間,用 M表示所測量到的 RTT。 最初的 TCP規(guī)范使 TCP使用低通過濾器來更新一個被平滑的 RTT估計器(記為O)。 R R+(1- )M 這里的 是一個推薦值為 0.9的平滑因子。每次進行新測量的時候,這個被平滑的 RTT將得到更新。每個新估計的 90來自
