[Reposted from: 千年胡杨]
There are at least three ways to issue an HTTP request from .NET code and retrieve the content of a web page.
1. HttpWebRequest & HttpWebResponse
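The WebRequest class
The WebRequest class is the abstract base class of the .NET Framework's request/response model for accessing data from the Internet. An application that uses this model can request data from the Internet in a protocol-agnostic way: the application works with instances of WebRequest, while protocol-specific descendant classes carry out the details of the request. Requests are sent from the application to a particular URI, such as a web page on a server. The URI determines which descendant class to create, chosen from the list of WebRequest descendants registered for the application. Descendants are usually registered to handle a specific protocol, such as HTTP or FTP, but they can also be registered to handle requests to a specific server or to a path on a server.
The most commonly used member of WebRequest is the Create method, which initializes a new WebRequest instance for the specified URI scheme.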
Syntax:
public static WebRequest Create(string requestUriString)
Parameters:
requestUriString: the URI that identifies the Internet resource.
Return value: the WebRequest subclass for the specified URI scheme.
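Note: Create returns the descendant of WebRequest, determined at run time, that is the closest registered match for requestUriString. For example, when a URI beginning with http:// is passed in requestUriString, Create returns an HttpWebRequest. If a URI beginning with file:// is passed instead, Create returns a FileWebRequest instance. The .NET Framework includes built-in support for the http://, https://, ftp://, and file:// URI schemes.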
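The scheme-based dispatch described in the note can be checked with a minimal sketch like the one below. The class name CreateDemo, the helper RequestTypeFor, and both URIs are made up for illustration; no request is actually sent, because Create only builds the request object.

```csharp
using System;
using System.Net;

static class CreateDemo
{
    // Returns the runtime type name of the request object that Create builds for a URI.
    public static string RequestTypeFor(string uri)
    {
        WebRequest request = WebRequest.Create(uri);
        return request.GetType().Name;
    }

    static void Main()
    {
        // Hypothetical URIs; nothing is sent over the network here.
        Console.WriteLine(RequestTypeFor("http://example.com/"));
        Console.WriteLine(RequestTypeFor("file:///C:/temp/page.html"));
    }
}
```

Per the note above, the first call should report HttpWebRequest and the second FileWebRequest.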
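The WebResponse class
WebResponse is the abstract base class from which protocol-specific response classes are derived. Applications can take part in request and response transactions in a protocol-agnostic way by using instances of WebResponse, while the protocol-specific classes derived from WebResponse carry the details of the request.
A WebResponse is normally obtained by calling GetResponse. Note that GetResponse is a method of WebRequest, not of WebResponse; when overridden in a descendant class, it returns the response to an Internet request.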
Syntax:
public virtual WebResponse GetResponse()
Return value: a WebResponse containing the response to the Internet request.
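To illustrate the protocol-agnostic use of WebRequest and WebResponse, here is a small sketch that reads a resource through the base classes only. The class ResponseDemo and the helper ReadAll are hypothetical names, and a temporary local file with a file:// URI is used purely so that the example runs without network access; the same helper should work unchanged for an http:// URI.

```csharp
using System;
using System.IO;
using System.Net;

static class ResponseDemo
{
    // Reads the whole body of any URI whose scheme has a registered WebRequest subclass.
    public static string ReadAll(string uri)
    {
        WebRequest request = WebRequest.Create(uri);
        using (WebResponse response = request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Hypothetical temporary file standing in for a web page.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "<html>hello</html>");
        try
        {
            string fileUri = new Uri(path).AbsoluteUri; // a file:// URI for the temp file
            Console.WriteLine(ReadAll(fileUri));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

The calling code never names HttpWebRequest or FileWebRequest; the URI alone decides which subclass does the work.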
Source code
using System.IO;
using System.Net;
using System.Text;

string strResponse = "";
string initialUrl = "http://www.cnblog.com";
HttpWebRequest request_step1 = (HttpWebRequest)WebRequest.Create(initialUrl);
request_step1.KeepAlive = true;
request_step1.UserAgent = "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.204 Safari/534.16";
request_step1.Accept = "application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
// Let the framework send Accept-Encoding and decompress gzip/deflate responses itself.
// Wrapping the stream in GZipStream unconditionally fails when the server replies uncompressed.
request_step1.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
request_step1.Headers.Add("Accept-Language", "zh-CN,zh;q=0.8");
request_step1.Headers.Add("Accept-Charset", "GBK,utf-8;q=0.7,*;q=0.3");
CookieContainer cookieFromStep1 = new CookieContainer();
request_step1.CookieContainer = cookieFromStep1;
// The using blocks close the response and the reader even if reading throws.
using (HttpWebResponse response_1 = (HttpWebResponse)request_step1.GetResponse())
using (StreamReader readStream_1 = new StreamReader(response_1.GetResponseStream(), Encoding.UTF8))
{
    strResponse = readStream_1.ReadToEnd();
}
2. WebClient
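The WebClient class lives in the System.Net namespace. It provides common methods for sending data to, and receiving data from, any local, intranet, or Internet resource identified by a URI.
Source code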
string pageUrl = "http://www.cnblog.com"; // the page whose source we want to fetch
WebClient wc = new WebClient(); // sends data to and receives data from a resource identified by a URI
wc.Credentials = CredentialCache.DefaultCredentials; // network credentials used to authenticate the request
Encoding enc = Encoding.GetEncoding("GB2312"); // if the output is garbled, switch between utf-8 and GB2312
// Method 1: download the resource as a byte array, then decode it.
byte[] pageData = wc.DownloadData(pageUrl);
string str1 = enc.GetString(pageData); // the page's HTML
// Method 2 (an alternative to method 1): open the URL as a stream and read it with the chosen encoding.
using (Stream resStream = wc.OpenRead(pageUrl))
using (StreamReader sr = new StreamReader(resStream, enc))
{
    string str2 = sr.ReadToEnd(); // the page's HTML
}
3. WebBrowser
WebBrowser is defined in the System.Windows.Forms namespace and is normally used only in WinForms programs.
First, create a new WebBrowser and place it in a WinForms container.
System.Windows.Forms.WebBrowser webBrowser1 = new System.Windows.Forms.WebBrowser();
Source code
private void Navigate(String address)
{
    if (String.IsNullOrEmpty(address)) return;
    if (address.Equals("about:blank")) return;
    if (!address.StartsWith("http://") &&
        !address.StartsWith("https://"))
    {
        address = "http://" + address;
    }
    try
    {
        // Navigate is asynchronous: read webBrowser1.DocumentText in a
        // DocumentCompleted event handler, once the page has finished loading.
        webBrowser1.Navigate(new Uri(address));
    }
    catch (System.UriFormatException)
    {
        return;
    }
}
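4. A brief comparison of the three approaches:
1). HttpWebRequest and HttpWebResponse are the simplest and most direct.
2). WebClient wraps WebRequest, can upload and download files, and is convenient to use. However, it does not directly expose some properties of the underlying request, such as Timeout; to set them you have to derive from WebClient and override GetWebRequest.
3). WebBrowser is the most powerful but also uses the most resources. It embeds a JavaScript engine, depends on the operating system's Internet Explorer engine, and automatically runs the JavaScript in the returned page. In general, though, it can only be used in WinForms programs. If you need to use WebBrowser in a console program, see: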
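For point 2) above, one common way to set a request property through WebClient is to subclass it and override GetWebRequest. The sketch below is illustrative only: TimeoutWebClient and TimeoutDemo are hypothetical names, not part of .NET, and the 5000 ms value is an arbitrary choice.

```csharp
using System;
using System.Net;

// Hypothetical helper (not a framework class): a WebClient whose requests
// use a caller-supplied timeout.
class TimeoutWebClient : WebClient
{
    private readonly int timeoutMilliseconds;

    public TimeoutWebClient(int timeoutMilliseconds)
    {
        this.timeoutMilliseconds = timeoutMilliseconds;
    }

    // WebClient calls this to build each request it sends,
    // so it is the place to apply per-request settings.
    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        request.Timeout = timeoutMilliseconds;
        return request;
    }
}

static class TimeoutDemo
{
    static void Main()
    {
        using (TimeoutWebClient wc = new TimeoutWebClient(5000))
        {
            // Same URL as in the WebClient example earlier; this call needs network access.
            string html = wc.DownloadString("http://www.cnblog.com");
            Console.WriteLine(html.Length);
        }
    }
}
```

Other properties of the underlying request can be set in the same override after casting the result to HttpWebRequest.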
5. web blogs:
WebBrowser is actually in the System.Windows.Forms namespace and is a visual control that you can add to a form. It is primarily a wrapper around the Internet Explorer browser (MSHTML). It allows you to easily display and interact programmatically with a web page. You call the Navigate method passing a web URL, wait for it to complete downloading and display and then interact with the page using the object model it provides.
HttpWebRequest is a concrete class that allows you to request in code any sort of file over HTTP. You usually receive it as a stream of bytes. What you do with it after that is up to your application.
HttpWebResponse allows you to process the response from a web server that was previously requested using HttpWebRequest.
WebRequest and WebResponse are the abstract base classes that the HttpWebRequest and HttpWebResponse inherit from. You can't create these directly. Other classes that inherit from these include Ftp and File classes.
WebClient I have always seen as a nice helper class that provides simpler ways to, for example, download or upload a file from a web url (e.g. DownloadFile and DownloadString methods). I have heard that it actually uses HttpWebRequest / HttpWebResponse behind the scenes for certain methods.
If you need more fine grained control over web requests and responses, HttpWebRequest / HttpWebResponse are probably the way to go. Otherwise WebClient is generally simpler and will do the job.
1). http://www.pin5i.com/showtopic-24684.html
2). http://hi.baidu.com/javaecho/blog/item/079c6d2a0d4efd5d4fc226b1.html