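HTTP Programming Recipes for C# Bots
Chapter 1
The choice between GET and POST depends on how much data is sent to the server. GET carries its data in the URL, so it suits small amounts; POST sends its data in the request body and places almost no limit on the size.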
Note that only one physical file is transferred per HTTP request, so a page and each of its images are fetched by separate requests.
Call sequence:
• Step 1: Obtain an HttpWebRequest object.
• Step 2: Set any HTTP request headers.
• Step 3: POST data, if this is a POST request.
• Step 4: Obtain an HttpWebResponse object.
• Step 5: Read HTTP response headers.
• Step 6: Read HTTP response data.
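The call sequence above can be sketched end to end. The example below is only an analogue written in Python with the standard `http.client` module, not the C# HttpWebRequest/HttpWebResponse classes the book describes. The local echo server, the `/echo` path, the `example-bot/0.1` user agent, and the function names are all hypothetical, chosen so the sketch runs without touching the network.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical local test server: sends the POST body back unchanged."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet


def run_sequence(host, port, payload):
    # Obtain the request object (here, a connection to the server).
    conn = HTTPConnection(host, port, timeout=5)
    try:
        # Set the HTTP request headers.
        headers = {"User-Agent": "example-bot/0.1", "Content-Type": "text/plain"}
        # Send the POST data along with the request.
        conn.request("POST", "/echo", body=payload, headers=headers)
        # Obtain the response object.
        response = conn.getresponse()
        # Read the HTTP response headers.
        content_type = response.getheader("Content-Type")
        # Read the HTTP response data.
        data = response.read()
        return response.status, content_type, data
    finally:
        conn.close()


def demo():
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return run_sequence("127.0.0.1", server.server_port, b"name=value")
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` starts the throwaway server, performs one POST, and returns the status code, the Content-Type header, and the echoed body.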
Listing 1.3: Typical Request Headers
GET /1/1/typical.php HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, */*
Referer: http://www.httprecipes.com/1/1/
Accept-Language: en-us
UA-CPU: x86
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
Host: www.httprecipes.com
Connection: Keep-Alive
There are really two parts to the headers: the first line and then the rest of the header
lines. The first line, which begins with the request type, is the most important line in the
header block, and it has a slightly different format than the other header lines. The request
type can be GET, POST, HEAD, or one of the other less frequently used request types. Browsers
will always use GET or POST. Following the request type is the file that is being requested.
In the above request, the following URL is being requested:
http://www.httprecipes.com/1/1/typical.php
The request line does not contain this URL in full. The “Host” header line names the
web server that contains the file, and the request line shows only the remainder of the
URL, which in this case is /1/1/typical.php. Finally, the third thing that the first
line provides is the version of the HTTP protocol being used. As of the writing of this
book, only two versions are in widespread use:
• HTTP/1.1
• HTTP/1.0
This book deals only with HTTP 1.1. Because the book is about writing programs that connect
to web servers, it assumes HTTP 1.1 throughout, which is the version the C# HTTP classes use.
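To make the split between the request line and the “Host” header concrete, here is a minimal Python sketch (an illustration only, not the book's C# code). The function names are hypothetical, and the `http` scheme is an assumption, since the scheme does not appear in the request itself.

```python
def parse_request_line(line):
    """Split a request line into (request type, target, HTTP version)."""
    method, target, version = line.strip().split(" ", 2)
    return method, target, version


def rebuild_url(request_line, host, scheme="http"):
    """Join the Host header value and the request target into a full URL.

    The scheme is not part of the request, so it is passed in as an assumption.
    """
    _, target, _ = parse_request_line(request_line)
    return f"{scheme}://{host}{target}"


if __name__ == "__main__":
    first_line = "GET /1/1/typical.php HTTP/1.1"
    print(parse_request_line(first_line))
    print(rebuild_url(first_line, "www.httprecipes.com"))
```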
The lines after the first line make up the actual HTTP headers. Their format is colon
delimited. The header name is to the left of the colon and the header value is to the right. It
is valid to have two of the same header names in the same request. Two headers of the same
name are used when cookies are specified. Cookies will be covered in Chapter 8, “Handling
Sessions and Cookies.”
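The colon-delimited format, including repeated header names, can be illustrated with a short Python sketch. This is not the book's C# code; the function names and the cookie values are hypothetical. Pairs are kept in a list rather than a dictionary so that two headers with the same name both survive.

```python
def parse_header_lines(lines):
    """Return (name, value) pairs, preserving order and duplicate names."""
    pairs = []
    for line in lines:
        # Split at the first colon only; values such as URLs contain colons too.
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a header line: {line!r}")
        pairs.append((name.strip(), value.strip()))
    return pairs


def values_for(pairs, name):
    """Collect every value sent under a header name (names are case-insensitive)."""
    wanted = name.lower()
    return [value for key, value in pairs if key.lower() == wanted]


if __name__ == "__main__":
    sample = [
        "Host: www.httprecipes.com",
        "Referer: http://www.httprecipes.com/1/1/",
        "Cookie: a=1",  # hypothetical cookie values
        "Cookie: b=2",
    ]
    parsed = parse_header_lines(sample)
    print(values_for(parsed, "cookie"))
```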
The headers give a variety of information. Examining them reveals the type of browser
being used, the operating system, and other details. In the headers listed
above in Listing 1.3, the Internet Explorer 7 browser was being used on the Windows XP
platform.
The headers finally terminate with a blank line. If the request had been a POST, any
posted data would follow the blank line. Even when there is no posted data, as is the case
with a GET, the blank line is still required.
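The role of the blank line can be shown by assembling a raw request by hand. The Python sketch below is an illustration under stated assumptions, not the book's C# code: the `build_request` helper, the `/form` path, and the `example.com` host are hypothetical. Each line ends with a carriage return and line feed, so the blank line appears as two such pairs in a row.

```python
CRLF = "\r\n"


def build_request(method, path, host, body=b"", extra_headers=None):
    """Assemble a raw HTTP/1.1 request as bytes."""
    headers = [("Host", host)]
    headers.extend(extra_headers or [])
    if body:
        headers.append(("Content-Length", str(len(body))))
    head = f"{method} {path} HTTP/1.1{CRLF}"
    head += "".join(f"{name}: {value}{CRLF}" for name, value in headers)
    head += CRLF  # the blank line that terminates the headers, even for GET
    # Any posted data follows the blank line.
    return head.encode("ascii") + body


if __name__ == "__main__":
    print(build_request("GET", "/1/1/typical.php", "www.httprecipes.com"))
    print(build_request("POST", "/form", "example.com", b"name=value"))
```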
A web server should respond to every HTTP request from a web browser. The web
server’s response is discussed in the next section.
HTTP Response Headers
When the web server responds to an HTTP request, it sends HTTP response header
lines. The HTTP response headers look very similar to the HTTP request headers. Listing
1.4 shows the contents of typical HTTP response headers.
Listing 1.4: Typical Response Headers
HTTP/1.1 200 OK
Date: Sun, 02 Jul 2006 22:28:58 GMT
Server: Apache/2.0.40 (Red Hat Linux)
Last-Modified: Sat, 29 Jan 2005 04:13:19 GMT
ETag: "824319-509-c6d5c0"
Accept-Ranges: bytes
Content-Length: 1289
Connection: close
Content-Type: text/html
As can be seen from the above listing, at first glance, response headers look nearly the
same as request headers. However, look at the first line.
Although the first line is space delimited as in the request, the information is different.
The first line of HTTP response headers contains the HTTP version and status information
about the response. The HTTP version is reported as 1.1, and the status code, 200, means
“OK,” no error. This is also where the well-known error code 404 (page not found) comes
from.
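As a small illustration of the space-delimited first line, the Python sketch below splits it into its three parts. It is not the book's C# code, and `parse_status_line` is a hypothetical helper name. The split is limited to two spaces because the status text itself may contain spaces.

```python
def parse_status_line(line):
    """Split a response's first line into (HTTP version, status code, status text)."""
    parts = line.strip().split(" ", 2)
    if len(parts) < 2:
        raise ValueError(f"not a status line: {line!r}")
    version, code = parts[0], int(parts[1])
    # The status text may be absent; treat that as an empty string.
    reason = parts[2] if len(parts) > 2 else ""
    return version, code, reason


if __name__ == "__main__":
    print(parse_status_line("HTTP/1.1 200 OK"))
    print(parse_status_line("HTTP/1.1 404 Not Found"))
```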
Status codes can be grouped according to the digit in their hundreds position:
• 1xx: Informational - Request received, continuing process
• 2xx: Success - The action was successfully received, understood, and accepted
• 3xx: Redirection - Further action must be taken in order to complete the request
• 4xx: Client Error - The request contains bad syntax or cannot be fulfilled
• 5xx: Server Error - The server failed to fulfill an apparently valid request
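The grouping above amounts to an integer division by 100. A minimal Python sketch follows (an illustration only, not the book's C# code; the `status_class` name and the range check are assumptions made for the example):

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code):
    """Map a status code to its class using the hundreds digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"status code out of range: {code}")
    return STATUS_CLASSES[code // 100]


if __name__ == "__main__":
    for code in (200, 404):
        print(code, status_class(code))
```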
Immediately following the headers will be a blank line, just as was the case with HTTP
requests. Following the blank line delimiter will be the data that was requested. It will be
of the length specified in the Content-Length header. The Content-Length
header in Listing 1.4 indicates a length of 1289 bytes. For a list of HTTP codes, refer to Appendix
E, “HTTP Response Codes.”
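How the blank line and Content-Length work together can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the book's C# code: `split_response` is a hypothetical helper, the whole response is assumed to be in memory already, and headers are stored in a dictionary, which keeps only the last value when a name repeats.

```python
def split_response(raw):
    """Split raw response bytes into (status line, headers, body)."""
    # The first blank line separates the headers from the data.
    head, sep, rest = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no blank line found; headers are incomplete")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Take exactly Content-Length bytes; fall back to everything if it is absent.
    length = int(headers.get("content-length", len(rest)))
    return status_line, headers, rest[:length]


if __name__ == "__main__":
    sample = (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Type: text/html\r\n"
        b"Content-Length: 5\r\n"
        b"\r\n"
        b"helloEXTRA"
    )
    print(split_response(sample))
```

In the sample, only the first five bytes after the blank line count as the body, because that is what the Content-Length header declares.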