• My First Web Crawler, C# Edition: a "Perks" Ride Reserved for Programmers


    I have recently been teaching myself Python, and I came across a Zhihu answer (https://www.zhihu.com/question/20799742) that scrapes videos from a "perks" site...

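    So I started following along. The answerer's example targets Python 2.x, though, while I am learning 3.x. Even after I changed the print statements to the 3.x form, many modules still could not be imported, and as a beginner I did not know how to fix that.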

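    So, in order to learn (read: get on board), I rewrote one piece of that code in C#. Even with some delay added between requests, it filled more than 3 GB of disk space in a short while... Take care of your health, everyone.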

    The code is below. I deliberately left a few bugs in it, to keep non-programmers from getting on board.

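    using System;
    using System.IO;
    using System.Net;
    using System.Text;
    using System.Text.RegularExpressions;
    using System.Threading;
    
    class Program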
        {
            static void Main(string[] args)
            {
                var baseString = "http://w*w.46ek.c*m/view/{0}.html";
                // Dots are escaped so that "." matches only a literal dot in the host name and extension.
                Regex regex = new Regex(@"http://m4\.26ts\.com/[.0-9a-zA-Z-]*\.mp4");
                WebClient wc = new WebClient();
    
    
                uint startIndex = ReadStartIndex();
                uint loop = ReadLoopLen();
    
                for (uint i = 0; i < loop; i++)
                {
                    var subUrl = string.Format(baseString, startIndex + i);
                    WebRequest wReq = System.Net.WebRequest.Create(subUrl);
    
                    try
                    {
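                        // GB18030 is built into .NET Framework; on .NET Core and later it
                        // likely needs Encoding.RegisterProvider(CodePagesEncodingProvider.Instance) first.
                        using (WebResponse wResp = wReq.GetResponse())
                        using (Stream respStream = wResp.GetResponseStream())
                        using (StreamReader reader = new StreamReader(respStream, Encoding.GetEncoding("GB18030")))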
                        {
                            var htmlString = reader.ReadToEnd();
    
                            Match m = regex.Match(htmlString);
                            if (m.Success)
                            {
                                DownloadFile(wc, m.Value, string.Format("{0}.mp4", startIndex + i));
                            }
                        }
                    }
                    catch (Exception exc)
                    {
                        Console.WriteLine("Error : {0}", exc.Message);
                    }
    
                    // Brief pause (5 ms) before requesting the next page.
                    Thread.Sleep(5);
                }
                
            }
    
            private static uint ReadStartIndex()
            {
                while (true)
                {
                    Console.Write("Set start index :");
    
                    string line = Console.ReadLine();
    
                    uint index = 0;
    
                    if (UInt32.TryParse(line, out index))
                    {
                        Console.WriteLine("Start index set : " + index);
                        return index;
                    }
    
                    Thread.Sleep(500);
                }
            }
    
            private static uint ReadLoopLen()
            {
                while (true)
                {
                    Console.Write("Set loop len :");
    
                    string line = Console.ReadLine();
    
                    uint index = 0;
    
                    if (UInt32.TryParse(line, out index))
                    {
                        Console.WriteLine("Loop len set : " + index);
                        return index;
                    }
    
                    Thread.Sleep(500);
                }
            }
    
            private static void DownloadFile(WebClient wc, string url, string localname)
            {
                Console.WriteLine("Downloading file {0} to {1}", url, localname);
    
                wc.DownloadFile(url, localname);
    
                Console.WriteLine("File {0} download completed!", localname);
            }
        }
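
The step that does the real work is pulling the video URL out of the page HTML, and that part can be tried without touching the network. What follows is only a minimal sketch under assumptions: the class `ExtractDemo`, the helper `FindVideoUrl`, and the sample HTML fragment are made up for illustration and are not part of the original program. The pattern escapes its literal dots so that `.` matches only an actual dot.

```csharp
using System;
using System.Text.RegularExpressions;

static class ExtractDemo
{
    // Same idea as the crawler's pattern, with the literal dots escaped.
    static readonly Regex Mp4Url =
        new Regex(@"http://m4\.26ts\.com/[.0-9a-zA-Z-]*\.mp4");

    // Hypothetical helper: returns the first matching video URL in the page,
    // or null when the page contains none.
    public static string FindVideoUrl(string html)
    {
        Match m = Mp4Url.Match(html);
        return m.Success ? m.Value : null;
    }

    static void Main()
    {
        // Made-up page fragment, for illustration only.
        string html = "<video src=\"http://m4.26ts.com/sample-01.mp4\"></video>";

        Console.WriteLine(FindVideoUrl(html) ?? "(no match)");
        Console.WriteLine(FindVideoUrl("<p>nothing here</p>") ?? "(no match)");
    }
}
```

Keeping the match in its own function makes it easy to check the pattern against saved pages before starting a long download loop.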
  • Original article: https://www.cnblogs.com/GuoRL/p/8328329.html