Notes on “TCP/IP Sockets Programming”, Part 5
Chapter 5: Sending and Receiving Data
There is no magic: any programs that exchange information must agree on how that information will be encoded (represented as a sequence of bits), as well as which program sends what information when, and how the information received affects the behavior of the program. This agreement regarding the form and meaning of information exchanged over a communication channel is called a protocol.
Most application protocols are defined in terms of discrete messages made up of sequences of fields. Each field contains a specific piece of information encoded as a sequence of bits.
5.1 Encoding Integers
What “platform” means:
By “platform” in this book we mean the combination of compiler, operating system, and hardware architecture. The gcc compiler with the Linux operating system, running on Intel’s IA-32 architecture, is an example of a platform.
5.1.1 Sizes of Integers
Determining the sizes of the integer types on a platform.
Here are a couple of things to note about sizeof(). First, the language specifies that sizeof(char) is always 1. Thus in the C language a “byte” is the amount of space occupied by a variable of type char, and the units of sizeof() are actually sizeof(char). But exactly how big is a C-language “byte”? That’s the second thing: the predefined constant CHAR_BIT tells how many bits it takes to represent a value of type char: usually 8, but possibly 10 or even 32.
The C99 language standard specification offers a solution in the form of a set of optional types: int8_t, int16_t, int32_t, and int64_t (along with their unsigned counterparts uint8_t, etc.) all have the size (in bits) indicated by their names. On a platform where CHAR_BIT is eight, these are 1, 2, 4 and 8 byte integers, respectively. Although these types may not be implemented on every platform, each is required to be defined if any native primitive type has the corresponding size. (So if, say, the size of an int on the platform is 32 bits, the “optional” type int32_t is required to be defined.)
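To check these rules on a given platform, a small sketch like the following can help. The helper names BitsOf and PrintIntegerSizes are made up for illustration and do not come from the book.

```c
#include <assert.h>
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* exact-width types such as int32_t */
#include <stdio.h>

/* Hypothetical helper: convert a sizeof() result (in C "bytes") to bits. */
size_t BitsOf(size_t sizeInChars) {
  return sizeInChars * (size_t) CHAR_BIT;
}

/* Print the widths of a few native and exact-width types on this platform. */
void PrintIntegerSizes(void) {
  assert(sizeof(char) == 1);  /* guaranteed by the language */
  printf("CHAR_BIT    = %d\n", CHAR_BIT);
  printf("short       = %zu bits\n", BitsOf(sizeof(short)));
  printf("int         = %zu bits\n", BitsOf(sizeof(int)));
  printf("long        = %zu bits\n", BitsOf(sizeof(long)));
  printf("int16_t     = %zu bits\n", BitsOf(sizeof(int16_t)));
  printf("int32_t     = %zu bits\n", BitsOf(sizeof(int32_t)));
  printf("int64_t     = %zu bits\n", BitsOf(sizeof(int64_t)));
}
```

Calling PrintIntegerSizes() from main() shows why protocol code should prefer the exact-width types: the lines for short, int and long may differ between platforms, while the int16_t, int32_t and int64_t lines do not.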
5.1.2 Byte Ordering
There are two obvious choices: start at the “right” end of the number, with the least significant bits (so-called little-endian order), or at the left end, with the most significant bits (big-endian order). (Note that the ordering of bits within bytes is, fortunately, handled by the implementation in a standard way.)
Most protocols that send multibyte quantities in the Internet today use big-endian byte order; in fact, it is sometimes called network byte order. The byte order used by the hardware (whether it is big- or little-endian) is called the native byte order.
Addresses and ports that cross the Sockets API are always in network byte order.
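As a rough illustration of network byte order, the sketch below writes a 32-bit value into a buffer most significant byte first, once with explicit shifts and once with htonl(). The function names are hypothetical; only htonl() is a real library call.

```c
#include <arpa/inet.h>  /* htonl() */
#include <assert.h>
#include <stdint.h>
#include <string.h>     /* memcpy() */

/* Store value in buf in big-endian (network) order, one byte at a time.
 * This works the same way regardless of the native byte order. */
void PutU32BigEndian(uint8_t buf[4], uint32_t value) {
  assert(buf != NULL);
  buf[0] = (uint8_t) (value >> 24);  /* most significant byte first */
  buf[1] = (uint8_t) (value >> 16);
  buf[2] = (uint8_t) (value >> 8);
  buf[3] = (uint8_t) value;          /* least significant byte last */
}

/* Same result using htonl(): convert to network order, then copy raw bytes. */
void PutU32WithHtonl(uint8_t buf[4], uint32_t value) {
  assert(buf != NULL);
  uint32_t net = htonl(value);
  memcpy(buf, &net, sizeof(net));
}

/* Rebuild the native value from four big-endian bytes. */
uint32_t GetU32BigEndian(const uint8_t buf[4]) {
  assert(buf != NULL);
  return ((uint32_t) buf[0] << 24) | ((uint32_t) buf[1] << 16)
       | ((uint32_t) buf[2] << 8) | (uint32_t) buf[3];
}
```

For the value 0x12345678 both writers produce the bytes 0x12, 0x34, 0x56, 0x78 in that order, on little-endian and big-endian hosts alike.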
5.1.3 Signedness and Sign Extension
Given k bits, we can represent values in the range -2^(k-1) through 2^(k-1) - 1 using two’s-complement. Note that the most significant bit (msb) tells whether the value is nonnegative (msb=0) or negative (msb=1). On the other hand, a k-bit unsigned integer can encode values in the range 0 through 2^k - 1 directly.
The signedness of the integers being transmitted should be determined by the range of values that need to be encoded.
Some care is required when dealing with integers of different signedness because of sign extension.
1. When a signed value is copied to any wider type, the additional bits are copied from the sign (i.e., most significant) bit.
2. The value of an unsigned integer type is, reasonably enough, not sign-extended.
One final point to remember: when expressions are evaluated, values of variables are widened (if needed) to the “native” (int) size before any computation occurs. Thus, if you add the values of two char variables together, the type of the result will be int, not char.
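The widening rules above can be observed directly. This is a minimal sketch; the function names are invented for the example.

```c
#include <assert.h>  /* assert(), for the checks in the note below */
#include <stdint.h>

/* Widening a signed value copies the sign bit into the new high bits. */
int32_t WidenSigned(int8_t v) {
  return v;
}

/* Widening an unsigned value fills the new high bits with zeros. */
uint32_t WidenUnsigned(uint8_t v) {
  return v;
}

/* Both operands are promoted to int before the addition,
 * so the sum does not wrap around at 8 bits. */
int AddBytes(uint8_t a, uint8_t b) {
  return a + b;
}
```

The byte 0xFF therefore becomes -1 (bit pattern 0xFFFFFFFF) when treated as int8_t and widened, but 255 when treated as uint8_t; for example, assert(WidenSigned((int8_t) -1) == -1) and assert(WidenUnsigned(0xFF) == 255) both hold. This is why a received byte should be stored in an unsigned type unless the protocol says it is signed.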
5.1.5 Wrapping TCP Sockets in Streams
A way of encoding multibyte integers for transmission over a stream (TCP) socket is to use the built-in FILE-stream facilities.
FILE *fdopen(int socketdes, const char *mode)
The fdopen() function “wraps” the socket in a stream and returns the result. This allows buffered I/O to be performed on the socket via operations like fgets(), fputs(), fread() and fwrite().
int fclose(FILE *stream)
fclose() closes the stream along with the underlying socket.
int fflush(FILE *stream)
fflush() causes any buffered data to be sent over the underlying socket.
size_t fwrite(const void *ptr, size_t size, size_t nmemb, FILE *stream)
The fwrite() function writes the specified number of objects of the given size to the stream.
size_t fread(void * ptr, size_t size, size_t nmemb, FILE * stream)
The fread() method goes in the other direction, reading the given number of objects of the given size from the given stream and placing them sequentially in the location pointed to by ptr.
Note that the sizes are given in units of sizeof(char), while the return values of these methods are the number of objects read/written, not the number of bytes. In particular, fread() never reads part of an object from the stream, and similarly fwrite() never writes a partial object. If the underlying connection terminates, these methods will return a short item count.
if (fwrite(&val8, sizeof(val8), 1, outstream) != 1) ...
Among the advantages of using buffered FILE-streams with sockets is the ability to “put back” a byte after reading it from the stream (via ungetc()); this can sometimes be useful when parsing messages.
FILE-streams can only be used with TCP sockets.
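Putting these calls together, a sketch of sending and receiving one 16-bit integer over a wrapped socket might look like this. PutU16 and GetU16 are hypothetical helpers, not functions from the book; the streams are assumed to come from fdopen() on a connected stream socket.

```c
#define _POSIX_C_SOURCE 200809L  /* makes fdopen() visible under strict -std= modes */
#include <arpa/inet.h>   /* htons(), ntohs() */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/socket.h>  /* socketpair(), used only in the usage note */

/* Send one 16-bit integer in network byte order.
 * Returns 0 on success, -1 on failure. */
int PutU16(FILE *out, uint16_t value) {
  assert(out != NULL);
  uint16_t net = htons(value);
  if (fwrite(&net, sizeof(net), 1, out) != 1)  /* item count, not bytes */
    return -1;
  return (fflush(out) == 0) ? 0 : -1;          /* push buffered data to the socket */
}

/* Receive one 16-bit integer sent by PutU16().
 * Returns 0 on success, -1 on a short read or error. */
int GetU16(FILE *in, uint16_t *value) {
  assert(in != NULL && value != NULL);
  uint16_t net;
  if (fread(&net, sizeof(net), 1, in) != 1)
    return -1;
  *value = ntohs(net);
  return 0;
}
```

One way to try it without a network is to create a connected pair with socketpair(AF_UNIX, SOCK_STREAM, 0, sv), wrap sv[0] with fdopen(sv[0], "w") and sv[1] with fdopen(sv[1], "r"), and pass the two streams to PutU16() and GetU16().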
5.1.6 Structure Overlays: Alignment and Padding
The C language rules for laying out data structures include specific alignment requirements, including that the fields within a structure begin on certain boundaries based on their type. The main points of the requirements can be summarized as follows:
1. Data structures are maximally aligned. That is, the address of any instance of a structure (including one in an array) will be divisible by the size of its largest native integer field.
2. Fields whose type is a multibyte integer type are aligned to their size (in bytes). Thus, an int32_t integer field’s beginning address is always divisible by four, and a uint16_t integer field’s address is guaranteed to be divisible by two.
To enforce these constraints, the compiler may add padding between the fields of a structure.
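The effect of these rules can be inspected with sizeof and offsetof. The structure below is a made-up layout used only for illustration.

```c
#include <assert.h>  /* assert(), for the checks in the note below */
#include <stddef.h>  /* offsetof, size_t */
#include <stdint.h>

/* Hypothetical message layout; the fields add up to 7 bytes. */
struct Sample {
  uint8_t  tag;    /* 1 byte */
  uint32_t value;  /* the compiler may insert padding before this field */
  uint16_t port;   /* padding may also follow this field */
};

/* Offset of the 32-bit field from the start of the structure. */
size_t SampleValueOffset(void) {
  return offsetof(struct Sample, value);
}

/* Number of bytes in struct Sample that are padding rather than data. */
size_t SamplePaddingBytes(void) {
  return sizeof(struct Sample)
       - (sizeof(uint8_t) + sizeof(uint32_t) + sizeof(uint16_t));
}
```

On many common platforms SampleValueOffset() is 4 rather than 1 and the structure occupies 12 bytes, so copying the raw structure to a socket would also send padding bytes that are not part of any field. A check such as assert(sizeof(struct Sample) >= 7) always holds, but the exact size is platform dependent.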
5.1.7 Strings and Text
A mapping between a set of symbols and a set of integers is called a coded character set.
The C99 standard defines a type wchar_t (“wide character”) to store characters from charsets that may use more than one byte per symbol. In addition, various library functions are defined that support conversion between byte sequences and arrays of wchar_t, in both directions. (In fact, there is a wide character string version of virtually every library function that operates on character strings.) To convert back and forth between wide strings and encoded char (byte) sequences suitable for transmission over the network, we would use the wcstombs() (“wide character string to multibyte string”) and mbstowcs() functions.
#include <stdlib.h>
size_t wcstombs(char *restrict s, const wchar_t *restrict pwcs, size_t n);
size_t mbstowcs(wchar_t *restrict pwcs, const char *restrict s, size_t n);
The terminating null is an artifact of the language, and not part of the string itself. It therefore should not be transmitted with the string unless the protocol explicitly speci?es that method of marking the end of the string.
The bad news is that C99’s wide character facilities are not designed to give the programmer explicit control over the encoding scheme. Indeed, they assume a single, fixed charset defined according to the “locale” of the platform. Although the facilities support a variety of charsets, they do not even provide the programmer any way to learn which charset or encoding is in use. In fact, the C99 standard states in several situations that the effect of changing the locale’s charset at runtime is undefined. What this means is that if you want to implement a protocol using a particular charset, you’ll have to implement the encoding yourself.
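A minimal sketch of preparing a wide string for transmission with wcstombs() follows. EncodeWide is a hypothetical wrapper; the resulting bytes depend on the current locale, which is exactly the limitation described above.

```c
#include <assert.h>
#include <stdlib.h>  /* wcstombs() */
#include <wchar.h>   /* wchar_t */

/* Convert a wide string to the locale's multibyte encoding.
 * Returns the number of bytes to transmit (terminating null NOT counted),
 * or (size_t) -1 if a character cannot be converted or out is too small. */
size_t EncodeWide(const wchar_t *ws, char *out, size_t outSize) {
  assert(ws != NULL && out != NULL);
  size_t n = wcstombs(out, ws, outSize);
  if (n == (size_t) -1)
    return (size_t) -1;  /* character not representable in this locale */
  if (n == outSize)
    return (size_t) -1;  /* buffer filled completely: result may be truncated */
  return n;              /* bytes stored, excluding the terminating null */
}
```

The sender would then transmit exactly the returned number of bytes from out, leaving the terminating null behind unless the protocol uses it as the end-of-string marker.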
5.2 Constructing, Framing, and Parsing Messages
A clean design further decomposes the process into two parts:
The first is concerned with framing, or marking the boundaries of the message, so the receiver can find it in the stream.
The second is concerned with the actual encoding of the message, whether it is represented using text or binary data.
Notice that these two parts can be independent of each other, and in a well-designed protocol they should be separated.
struct VoteInfo {
  uint64_t count;   // invariant: !isResponse => count==0
  int candidate;    // invariant: 0 <= candidate <= MAX_CANDIDATE
  bool isInquiry;
  bool isResponse;
};
typedef struct VoteInfo VoteInfo;
enum {
  MAX_CANDIDATE = 1000,
  MAX_WIRE_SIZE = 500
};
int GetNextMsg(FILE *in, uint8_t *buf, size_t bufSize);
int PutMsg(uint8_t buf[], size_t msgSize, FILE *out);
bool Decode(uint8_t *inBuf, size_t mSize, VoteInfo *v);
size_t Encode(VoteInfo *v, uint8_t *outBuf, size_t bufSize);
5.2.1 Framing
If a receiver tries to receive more bytes from a socket than were in the message, one of two things can happen.
If no other message is in the channel, the receiver will block and will be prevented from processing the message; if the sender is also blocked waiting for a reply, the result will be deadlock: each side of the connection waiting for the other to send more information.
On the other hand, if another message is already in the channel, the receiver may read some or all of it as part of the first message, leading to other kinds of errors. Therefore framing is an important consideration when using TCP sockets.
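Once the framing tells the receiver how long a message is, it can ask for exactly that many bytes and no more. The loop below is one possible sketch; RecvExactly is a hypothetical helper built on the standard recv() call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>  /* recv() */
#include <sys/types.h>   /* ssize_t */

/* Receive exactly len bytes from a connected stream socket.
 * Returns len on success, a smaller count if the peer closed the
 * connection early, or -1 on error. Never reads beyond len bytes,
 * so bytes of a following message stay in the channel. */
ssize_t RecvExactly(int sock, void *buf, size_t len) {
  assert(buf != NULL);
  unsigned char *p = buf;
  size_t got = 0;
  while (got < len) {
    ssize_t n = recv(sock, p + got, len - got, 0);
    if (n == 0)
      break;              /* peer closed the connection */
    if (n < 0) {
      if (errno == EINTR)
        continue;         /* interrupted by a signal: retry */
      return -1;
    }
    got += (size_t) n;    /* TCP may deliver the message in pieces */
  }
  return (ssize_t) got;
}
```

Because the call blocks until len bytes arrive, passing a length larger than what the sender will transmit reproduces the blocking scenario described above.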
Two general techniques enable a receiver to unambiguously find the end of the message:
1. Delimiter-based: The end of the message is indicated by a unique marker, a particular, agreed-upon byte (or sequence of bytes) that the sender transmits immediately following the data.
The downside of such techniques is that both sender and receiver have to scan every byte of the message.
2. Explicit length: The variable-length field or message is preceded by a length field that tells how many bytes it contains. The length field is generally of a fixed size; this limits the maximum size message that can be framed.
The length-based approach is simpler but requires a known upper bound on the size of the message.
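Both techniques can be tried on plain byte buffers before involving a socket. This sketch assumes a 2-byte big-endian length field and a single-byte delimiter; the function names and LEN_FIELD_SIZE are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>  /* memcpy() */

enum { LEN_FIELD_SIZE = 2 };  /* assumed size of the length prefix */

/* Explicit length: write a 2-byte big-endian length, then the payload.
 * Returns the total frame size, or 0 if the payload is too large for the
 * length field or the frame does not fit in out. */
size_t FrameWithLength(const uint8_t *payload, size_t n,
                       uint8_t *out, size_t outSize) {
  assert(payload != NULL && out != NULL);
  if (n > UINT16_MAX || outSize < n + LEN_FIELD_SIZE)
    return 0;
  out[0] = (uint8_t) (n >> 8);
  out[1] = (uint8_t) (n & 0xff);
  memcpy(out + LEN_FIELD_SIZE, payload, n);
  return n + LEN_FIELD_SIZE;
}

/* Delimiter-based: scan for the delimiter byte by byte.
 * Returns the number of message bytes before the delimiter,
 * or -1 if the delimiter has not arrived yet. */
long FindDelimited(const uint8_t *buf, size_t len, uint8_t delim) {
  assert(buf != NULL);
  for (size_t i = 0; i < len; i++)
    if (buf[i] == delim)
      return (long) i;
  return -1;
}
```

The first function shows where the size limit comes from: a 2-byte field cannot describe more than 65535 bytes. The second shows the cost of delimiters: every byte has to be examined.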
//--------------------------------------------------------------------DelimFramer.c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include "Practical.h"

static const char DELIMITER = '\n';

/* Read up to bufSize bytes or until delimiter, copying into the given
 * buffer as we go.
 * Encountering EOF after some data but before delimiter results in failure.
 * (That is: EOF is not a valid delimiter.)
 * Returns the number of bytes placed in buf (delimiter NOT transferred).
 * If buffer fills without encountering delimiter, negative count is returned.
 * If stream ends before first byte, -1 is returned.
 * Precondition: buf has room for at least bufSize bytes.
 */
int GetNextMsg(FILE *in, uint8_t *buf, size_t bufSize) {
  int count = 0;
  int nextChar = EOF; // Initialized so the test after the loop is defined when bufSize == 0
  while ((size_t) count < bufSize) {
    nextChar = getc(in);
    if (nextChar == EOF) {
      if (count > 0)
        DieWithUserMessage("GetNextMsg()", "Stream ended prematurely");
      else
        return -1;
    }
    if (nextChar == DELIMITER)
      break;
    buf[count++] = nextChar;
  }
  if (nextChar != DELIMITER) { // Out of space: count==bufSize
    return -count;
  } else { // Found delimiter
    return count;
  }
}

/* Write the given message to the output stream, followed by
 * the delimiter. Return number of bytes written, or -1 on failure.
 */
int PutMsg(uint8_t buf[], size_t msgSize, FILE *out) {
  // Check for delimiter in message
  size_t i;
  for (i = 0; i < msgSize; i++)
    if (buf[i] == DELIMITER)
      return -1;
  if (fwrite(buf, 1, msgSize, out) != msgSize)
    return -1;
  if (fputc(DELIMITER, out) == EOF || fflush(out) == EOF)
    return -1;
  return msgSize;
}
//--------------------------------------------------------------------DelimFramer.c
//--------------------------------------------------------------------LengthFramer.c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <netinet/in.h>
#include "Practical.h"

/* Read the 2-byte length (sent in big-endian order) and convert it to host order.
 * Then read the indicated number of bytes.
 * If the input buffer is too small for the data, truncate to fit and
 * return the negation of the *indicated* length. Thus a negative return
 * other than -1 indicates that the message was truncated.
 * (Ambiguity is possible only if the caller passes an empty buffer.)
 * Input stream is always left empty.
 */
int GetNextMsg(FILE *in, uint8_t *buf, size_t bufSize) {
  uint16_t mSize = 0;
  uint16_t extra = 0;
  if (fread(&mSize, sizeof(uint16_t), 1, in) != 1)
    return -1;
  mSize = ntohs(mSize);
  if (mSize > bufSize) {
    extra = mSize - bufSize;
    mSize = bufSize; // Truncate
  }
  if (fread(buf, sizeof(uint8_t), mSize, in) != mSize) {
    fprintf(stderr, "Framing error: expected %d, read less\n", mSize);
    return -1;
  }
  if (extra > 0) { // Message was truncated
    int indicated = mSize + extra; // Length announced by the sender
    uint8_t waste[BUFSIZE];
    while (extra > 0) { // Try to flush the channel, at most sizeof(waste) bytes at a time
      size_t chunk = (extra < sizeof(waste)) ? extra : sizeof(waste);
      if (fread(waste, sizeof(uint8_t), chunk, in) != chunk)
        break;
      extra -= chunk;
    }
    return -indicated; // Negation of indicated size
  } else
    return mSize;
}

/* Write the given message to the output stream, preceded by
 * its 2-byte length. Precondition: buf[] is at least msgSize.
 * Returns -1 on any error.
 */
int PutMsg(uint8_t buf[], size_t msgSize, FILE *out) {
  if (msgSize > UINT16_MAX)
    return -1;
  uint16_t payloadSize = htons(msgSize);
  if ((fwrite(&payloadSize, sizeof(uint16_t), 1, out) != 1)
      || (fwrite(buf, sizeof(uint8_t), msgSize, out) != msgSize))
    return -1;
  fflush(out);
  return msgSize;
}
//--------------------------------------------------------------------LengthFramer.c
//--------------------------------------------------------------------VoteEncodingText.c
/* Routines for Text encoding of vote messages.
 * Wire Format:
 * "Voting <v|i> [R] <candidate ID> <count>"
 */
#include <string.h>
#include <stdint.h>
#include <stdbool.h>
#include <stdlib.h>
#include <stdio.h>
#include "Practical.h"
#include "VoteProtocol.h"

static const char *MAGIC = "Voting";
static const char *VOTESTR = "v";
static const char *INQSTR = "i";
static const char *RESPONSESTR = "R";
static const char *DELIMSTR = " ";
enum {
  BASE = 10
};

/* Encode voting message info as a text string.
 * WARNING: Message will be silently truncated if buffer is too small!
 * Invariants (e.g. 0 <= candidate <= 1000) not checked.
 */
size_t Encode(const VoteInfo *v, uint8_t *outBuf, const size_t bufSize) {
  uint8_t *bufPtr = outBuf;
  long size = (long) bufSize;
  int rv = snprintf((char *) bufPtr, size, "%s %c %s %d", MAGIC,
      (v->isInquiry ? 'i' : 'v'), (v->isResponse ? "R" : ""), v->candidate);
  if (rv < 0)
    return 0;
  if (rv >= size) // Truncated: only size-1 characters were stored
    return bufSize > 0 ? bufSize - 1 : 0;
  bufPtr += rv;
  size -= rv;
  if (v->isResponse) {
    rv = snprintf((char *) bufPtr, size, " %llu", (unsigned long long) v->count);
    if (rv < 0)
      return (size_t) (bufPtr - outBuf);
    bufPtr += (rv >= size) ? size - 1 : rv; // Count only what was stored
  }
  return (size_t) (bufPtr - outBuf);
}

/* Extract message information from given buffer.
 * Note: modifies input buffer, which must be null-terminated (strtok()).
 */
bool Decode(uint8_t *inBuf, const size_t mSize, VoteInfo *v) {
  char *token;
  token = strtok((char *) inBuf, DELIMSTR);
  // Check for magic
  if (token == NULL || strcmp(token, MAGIC) != 0)
    return false;
  // Get vote/inquiry indicator
  token = strtok(NULL, DELIMSTR);
  if (token == NULL)
    return false;
  if (strcmp(token, VOTESTR) == 0)
    v->isInquiry = false;
  else if (strcmp(token, INQSTR) == 0)
    v->isInquiry = true;
  else
    return false;
  // Next token is either Response flag or candidate ID
  token = strtok(NULL, DELIMSTR);
  if (token == NULL)
    return false; // Message too short
  if (strcmp(token, RESPONSESTR) == 0) { // Response flag present
    v->isResponse = true;
    token = strtok(NULL, DELIMSTR); // Get candidate ID
    if (token == NULL)
      return false;
  } else { // No response flag; token is candidate ID
    v->isResponse = false;
  }
  // Get candidate #
  v->candidate = atoi(token);
  if (v->isResponse) { // Response message should contain a count value
    token = strtok(NULL, DELIMSTR);
    if (token == NULL)
      return false;
    v->count = strtoull(token, NULL, BASE); // count is unsigned
  } else {
    v->count = 0L;
  }
  return true;
}
//--------------------------------------------------------------------VoteEncodingText.c
//--------------------------------------------------------------------VoteEncodingBin.c
/* Routines for binary encoding of vote messages
 * Wire Format:
 *                                1  1  1  1  1  1
 *  0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
 * +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
 * |      Magic      |Flags|          ZERO         |
 * +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
 * |                 Candidate ID                  |
 * +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
 * |                                               |
 * |         Vote Count (only in response)         |
 * |                                               |
 * |                                               |
 * +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
 *
 */
#include <string.h>
#include <stdbool.h>
#include <stdlib.h>
#include <stdint.h>
#include <netinet/in.h>
#include "Practical.h"
#include "VoteProtocol.h"

enum {
  REQUEST_SIZE = 4,
  RESPONSE_SIZE = 12,
  COUNT_SHIFT = 32,
  INQUIRE_FLAG = 0x0100,
  RESPONSE_FLAG = 0x0200,
  MAGIC = 0x5400,
  MAGIC_MASK = 0xfc00
};

typedef struct voteMsgBin voteMsgBin;

struct voteMsgBin {
  uint16_t header;
  uint16_t candidateID;
  uint32_t countHigh;
  uint32_t countLow;
};

size_t Encode(VoteInfo *v, uint8_t *outBuf, size_t bufSize) {
  size_t msgSize = v->isResponse ? RESPONSE_SIZE : REQUEST_SIZE;
  if (bufSize < msgSize)
    DieWithUserMessage("Output buffer too small", "");
  voteMsgBin *vm = (voteMsgBin *) outBuf;
  memset(outBuf, 0, msgSize); // Be sure; clear only the bytes this message uses
  vm->header = MAGIC;
  if (v->isInquiry)
    vm->header |= INQUIRE_FLAG;
  if (v->isResponse)
    vm->header |= RESPONSE_FLAG;
  vm->header = htons(vm->header); // Byte order
  vm->candidateID = htons(v->candidate); // Know it will fit, by invariants
  if (v->isResponse) {
    vm->countHigh = htonl(v->count >> COUNT_SHIFT);
    vm->countLow = htonl((uint32_t) v->count);
    return RESPONSE_SIZE;
  } else {
    return REQUEST_SIZE;
  }
}

/* Extract message info from given buffer.
 * Leave input unchanged.
 */
bool Decode(uint8_t *inBuf, size_t mSize, VoteInfo *v) {
  if (mSize < REQUEST_SIZE) // Check the size before touching the header
    return false;
  voteMsgBin *vm = (voteMsgBin *) inBuf;
  // Attend to byte order; leave input unchanged
  uint16_t header = ntohs(vm->header);
  if ((header & MAGIC_MASK) != MAGIC)
    return false;
  /* message is big enough and includes correct magic number */
  v->isResponse = ((header & RESPONSE_FLAG) != 0);
  v->isInquiry = ((header & INQUIRE_FLAG) != 0);
  v->candidate = ntohs(vm->candidateID);
  if (v->isResponse && mSize >= RESPONSE_SIZE) {
    v->count = ((uint64_t) ntohl(vm->countHigh) << COUNT_SHIFT)
        | (uint64_t) ntohl(vm->countLow);
  } else {
    v->count = 0; // No count on the wire
  }
  return true;
}
//--------------------------------------------------------------------VoteEncodingBin.c
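The overlay in VoteEncodingBin.c depends on voteMsgBin containing no padding and on the buffer being suitably aligned for the structure. As an alternative sketch, the same wire layout can be produced by writing each byte explicitly. The VOTE_-prefixed constants repeat the values from the listing, and EncodeVoteBytes is a hypothetical function, not part of the book's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
  VOTE_REQUEST_SIZE = 4,
  VOTE_RESPONSE_SIZE = 12,
  VOTE_INQUIRE_FLAG = 0x0100,
  VOTE_RESPONSE_FLAG = 0x0200,
  VOTE_MAGIC = 0x5400
};

/* Encode a vote message byte by byte in big-endian order.
 * Returns the number of bytes written, or 0 if out is too small. */
size_t EncodeVoteBytes(bool isInquiry, bool isResponse, uint16_t candidate,
                       uint64_t count, uint8_t *out, size_t outSize) {
  assert(out != NULL);
  size_t need = isResponse ? VOTE_RESPONSE_SIZE : VOTE_REQUEST_SIZE;
  if (outSize < need)
    return 0;
  uint16_t header = VOTE_MAGIC;
  if (isInquiry)
    header |= VOTE_INQUIRE_FLAG;
  if (isResponse)
    header |= VOTE_RESPONSE_FLAG;
  out[0] = (uint8_t) (header >> 8);       /* magic and flags */
  out[1] = (uint8_t) (header & 0xff);     /* reserved, always zero */
  out[2] = (uint8_t) (candidate >> 8);    /* candidate ID, high byte */
  out[3] = (uint8_t) (candidate & 0xff);  /* candidate ID, low byte */
  if (isResponse) {
    for (int i = 0; i < 8; i++)           /* 64-bit count, high byte first */
      out[4 + i] = (uint8_t) (count >> (8 * (7 - i)));
  }
  return need;
}
```

This trades a little more code for independence from the compiler's structure layout and from the host byte order, since no htons() or htonl() calls are needed.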
Reposted from: https://www.cnblogs.com/custa/archive/2010/09/11/1824083.html