• Python screenshot tools of every kind: capture until your hands go numb


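    Preface:

       Recently a project required taking a screenshot of a web page from a given URL. The existing code already did this with a third-party Python library, but several bugs appeared after it went live, and fixing them fell to me. I had not worked in this area before, so I took the opportunity to learn the various ways of taking screenshots in Python. Below are the basics of some commonly used approaches. I hope they help, and feel free to point out anything this article gets wrong.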

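    PIL and Pillow

    PIL: the Python Imaging Library, the de facto standard image processing library for Python. PIL is very powerful yet has a simple, easy-to-use API, but it only supports Python versions up to 2.7.

    Pillow: a fork of PIL that has since grown into a more actively developed image processing library than PIL itself. At the time of writing, the latest version was 3.0.0.

    1. Installation

    On Debian/Ubuntu Linux, install directly with apt: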

    $ sudo apt-get install python-imaging
    

    On Mac and other Linux distributions, install with easy_install or pip; set up the build environment before installing:

    $ sudo easy_install PIL
    

    On Windows, download the exe installer from the official PIL website, or use pip:

    $ pip install pillow
    

    2. Taking a screen capture

    from PIL import ImageGrab
    im = ImageGrab.grab()
    im.save("1.png")    # Output path; the file extension determines the image format
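
The snippet above captures the whole screen. As a minimal sketch (written for Python 3 with Pillow installed, which is an assumption beyond the original Python 2 setup), the capture can be cropped to a centered region. The helper names `centered_bbox` and `grab_center` are hypothetical and chosen only for illustration.

```python
def centered_bbox(screen_w, screen_h, region_w, region_h):
    """Return (left, top, right, bottom) for a region centered on the screen."""
    # Clamp the region so it never exceeds the screen.
    region_w = min(region_w, screen_w)
    region_h = min(region_h, screen_h)
    left = (screen_w - region_w) // 2
    top = (screen_h - region_h) // 2
    return (left, top, left + region_w, top + region_h)


def grab_center(region_w, region_h, path):
    """Capture the screen, crop a centered region, and save it to path."""
    # Imported lazily so centered_bbox can be used without Pillow or a display.
    from PIL import ImageGrab

    full = ImageGrab.grab()              # full-screen capture, as in the snippet above
    screen_w, screen_h = full.size
    box = centered_bbox(screen_w, screen_h, region_w, region_h)
    full.crop(box).save(path)            # format is inferred from the file extension
    return box
```

Calling `grab_center(800, 600, "center.png")` needs a desktop session; `centered_bbox` alone is pure arithmetic and can be checked anywhere.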
    

    PyQt4

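    PyQt is a toolkit for building GUI applications. It is a successful blend of the Python programming language and the Qt library, one of the most powerful libraries available today. PyQt, the Python version of the Qt library, was developed by Phil Thompson. PyQt3 supports Qt1 through Qt3, and PyQt4 supports Qt4. It was first released in 1998 under the name PyKDE, because SIP and PyQt had not yet been separated. PyQt is written with SIP. PyQt is available in a GPL edition and a commercial edition.

    1. Installation

    On Windows:

    32-bit: http://sourceforge.net/projects/pyqt/files/PyQt4/PyQt-4.11.4/PyQt4-4.11.4-gpl-Py2.7-Qt4.8.7-x32.exe
    64-bit: http://sourceforge.net/projects/pyqt/files/PyQt4/PyQt-4.11.4/PyQt4-4.11.4-gpl-Py2.7-Qt4.8.7-x64.exe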

    On Linux:

    https://www.riverbankcomputing.com/software/pyqt/download

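    2. Capturing a web page

    On sites with many images or very long pages, the captured image may show pictures that have not finished loading, or only part of the page. I searched through a lot of material without finding a fix, so if anyone knows how to solve this, please let me know.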

    #!/usr/bin/env python
    # -*- coding:utf-8 -*-
    import sys,time
    import os.path
    from PyQt4 import QtGui, QtCore, QtWebKit
    
    class PageShotter(QtGui.QWidget):
        def __init__(self, url, parent=None):
            QtGui.QWidget.__init__(self, parent)
            self.url = url
    
        def shot(self):
            webView = QtWebKit.QWebView(self)
            webView.load(QtCore.QUrl(self.url))
            self.webPage = webView.page()
            self.connect(webView, QtCore.SIGNAL("loadFinished(bool)"), self.savePage)
    
        def savePage(self, finished):
            if finished:
                print "Starting capture!"
                size = self.webPage.mainFrame().contentsSize()
                print "Page width: %d, page height: %d" % (size.width(), size.height())
                self.webPage.setViewportSize(QtCore.QSize(size.width() + 16, size.height()))
                img = QtGui.QImage(size, QtGui.QImage.Format_ARGB32)
                painter = QtGui.QPainter(img)
                self.webPage.mainFrame().render(painter)
                painter.end()
                fileName = "shot.png"
                if img.save(fileName):
                    filePath = os.path.join(os.path.dirname(__file__), fileName)
                    print "Screenshot saved: %s" % filePath
                else:
                    print "Screenshot failed"
            else:
                print "Page failed to load!"
            self.close()
    
    
    if __name__ == "__main__":
        app = QtGui.QApplication(sys.argv)
        shotter = PageShotter("https://www.jd.com/")
        shotter.shot()
        sys.exit(app.exec_())
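
One possible direction for the incomplete-capture problem described above is to wait until the measured page size stops changing before rendering. The sketch below (Python 3, standard library only) shows just that polling logic. It is an untested idea, not a confirmed fix, and `wait_until_stable` and its parameters are hypothetical names. Inside a Qt program a blocking `time.sleep` would stall the event loop, so the same check would have to be driven by a timer such as `QtCore.QTimer` instead.

```python
import time


def wait_until_stable(read_value, interval=0.5, stable_reads=3, timeout=30.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll read_value() until it returns the same value stable_reads times in a row.

    Returns the stable value, or the last value seen if timeout expires first.
    """
    deadline = clock() + timeout
    last = read_value()
    streak = 1                       # consecutive reads that returned `last`
    while streak < stable_reads and clock() < deadline:
        sleep(interval)
        current = read_value()
        if current == last:
            streak += 1
        else:
            last, streak = current, 1   # value changed: restart the count
    return last
```

Here `read_value` would be any callable that reports the current content height, for example a wrapper around `contentsSize().height()`.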
    

    Enhanced version

    #!/usr/bin/python
    # coding:utf-8
    
    import sys
    import os.path
    import requests
    import urlparse
    import time
    
    sys.path.append('../')
    from PyQt4 import QtGui, QtCore, QtWebKit
    from PyQt4.QtNetwork import QNetworkRequest
    
    
    class WebStatus(object):
            def __init__(self, timeout, tries):
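                    '''
                    Checks a web page's status code; for example, 200 means the page is healthy and reachable.
                    Args: timeout, seconds to wait for each request; tries, number of attempts.
                    '''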
                    self.__timeout = timeout
                    self.__tries = tries
    
                    self.__headers = {
                            "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
                            "Accept-Language": "en-US,en;q=0.5",
                            "Accept-Encoding": "gzip, deflate",
                            "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0"
                    }
    
            def clear_url(self, url):
                    if not url:
                            return None
                    if ',' in url:
                            url = url.split(',')[0]
                    url = url.strip()
                    if url.startswith('http://') or url.startswith('https://'):
                            pass
                    else:
                            url = 'http://' + url
                    try:
                            parse = urlparse.urlparse(url)
                            url_new = parse.scheme + '://' + parse.netloc
                    except:
                            url_new = url
                    if url_new.endswith('.'):
                            url_new = url_new.rstrip('.')
                    return url_new
    
            def isAccessible(self, url):
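                    '''
                    Fetch the page's status code to decide whether the page is reachable; 200 means it is.
                    Args: url, the address to check.
                    Returns:
                                    The cleaned URL (truthy) when the status code is 200;
                                    False when the status code is not 200 or the request raised an exception.
                    '''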
                    url = self.clear_url(url)
                    tries = self.__tries
                    status = False
                    r = None
                    while tries > 0:
                            try:
                                    r = requests.get(url=url, headers=self.__headers, timeout=self.__timeout)
                            except:
                                    tries -= 1
                                    status = False
                            else:
                                    if r.status_code == 200:
                                            status = url
                                    break
                            finally:
                                    if r:
                                            r.close()
                                            r = None
                    return status
    
            def __del__(self):
                    pass
    
    
    class PageShotter(QtGui.QWidget):
            def __init__(self, url, parent=None,pic_path = './pic_path'):
                    path = os.path.exists(pic_path)
                    if not path:
                            os.mkdir(pic_path)
                    self.request = QNetworkRequest()
                    QtGui.QWidget.__init__(self, parent)
                    self.url = url
                    self.dir_path = os.path.join(pic_path,str(time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(time.time()))) + '_' + urlparse.urlparse(self.url).netloc + '.png')
    
            def shot(self):
                    webView = QtWebKit.QWebView(self)
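                    # Note: self.request is never passed to webView.load() below, so this header is not applied to the page load as written
                    self.request.setRawHeader("User-Agent", "Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0")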
                    webView.load(QtCore.QUrl(self.url))
                    self.webPage = webView.page()
                    self.connect(webView, QtCore.SIGNAL("loadFinished(bool)"), self.savePage)
                    return self.dir_path
    
            def savePage(self, finished):
                    if finished:
                            size = self.webPage.mainFrame().contentsSize()
                            self.webPage.setViewportSize(QtCore.QSize(size.width() + 16, size.height()))
                            img = QtGui.QImage(size, QtGui.QImage.Format_ARGB32)
                            painter = QtGui.QPainter(img)
                            self.webPage.mainFrame().render(painter)
                            painter.end()
                            if img.save(self.dir_path):
                                    filePath = os.path.join(os.path.dirname(__file__),self.dir_path)
                                    #print "Screenshot saved: %s" % filePath
                            else:
                                    print "Screenshot failed"
                    else:
                            print "Page failed to load!"
                    self.close()
    
    
    if __name__ == "__main__":
            obj1 = WebStatus(10, 2)
            status = obj1.isAccessible("https://www.jd.com")
            if status:
                    app = QtGui.QApplication(sys.argv)
                    shotter = PageShotter(status,pic_path = './dirpath_test')
                    obj = shotter.shot()
                    print obj
                    sys.exit(app.exec_())
            else:
                    print 'Invalid URL'
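
The enhanced version targets Python 2 (`urlparse`, `print` statements). As a hedged sketch, the URL cleanup in `clear_url` and the output path built in `PageShotter.__init__` could look like this on Python 3 with only the standard library. The function `screenshot_path` is a hypothetical helper, and swapping the colons in the timestamp for dashes is a change of mine, made because colons are not valid in Windows file names.

```python
import os
import time
from urllib.parse import urlparse


def clear_url(url):
    """Reduce a loosely formatted address to scheme://host, like WebStatus.clear_url."""
    if not url:
        return None
    url = url.split(',')[0].strip()          # keep only the first comma-separated entry
    if not url.startswith(('http://', 'https://')):
        url = 'http://' + url
    try:
        parts = urlparse(url)
        cleaned = parts.scheme + '://' + parts.netloc
    except ValueError:                       # e.g. a malformed IPv6 host
        cleaned = url
    return cleaned.rstrip('.')


def screenshot_path(pic_dir, url, now=None):
    """Build '<pic_dir>/<timestamp>_<host>.png' without characters Windows rejects."""
    stamp = time.strftime('%Y-%m-%d_%H-%M-%S', now or time.localtime())
    host = urlparse(url).netloc.replace(':', '_')   # a port separator is also a colon
    return os.path.join(pic_dir, stamp + '_' + host + '.png')
```

Like the original, `clear_url` drops the path and query string, so only the site's root address is checked and captured.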

    selenium + phantomjs

    For installation, see  http://www.cnblogs.com/luxiaojun/p/6144748.html

    from selenium import webdriver
    from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
     
    dcap = dict(DesiredCapabilities.PHANTOMJS)  # Copy the default capabilities so the userAgent can be set
    dcap["phantomjs.page.settings.userAgent"] = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0 ")
     
    obj = webdriver.PhantomJS(executable_path=r'C:\Python27\Scripts\phantomjs.exe', desired_capabilities=dcap)  # Start PhantomJS with these capabilities
    obj.get('http://wap.95533pc.com')  # Open the URL
    obj.save_screenshot("1.png")   # Save the screenshot
    obj.quit() 
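
If `get` or `save_screenshot` raises in the script above, `obj.quit()` is never reached and the PhantomJS process may be left running. A minimal sketch of a wrapper that always quits is shown below. The name `capture` and the default window size are assumptions for illustration; the function relies only on the `set_window_size`, `get`, `save_screenshot` and `quit` methods of a selenium driver.

```python
def capture(driver, url, path, width=1280, height=800):
    """Open url in an already-created WebDriver, save a screenshot, and always quit."""
    try:
        driver.set_window_size(width, height)   # fix the viewport before loading
        driver.get(url)
        return driver.save_screenshot(path)     # selenium reports success as a bool
    finally:
        driver.quit()                           # runs even if loading or saving fails
```

With the driver from the script above, this would be called as `capture(obj, 'http://wap.95533pc.com', '1.png')`.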
    

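    Another approach is to use the Popen method in subprocess to run a Linux command-line screenshot tool such as cutycapt. I plan to cover it in a later article; in the meantime, the following article is a useful reference: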

    http://www.111cn.net/sys/linux/81361.htm
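
As a rough sketch of the subprocess approach (Python 3, not taken from the referenced article), the command can be built as a list and run with `Popen`. The executable name `cutycapt` is an assumption, since some installations name the binary `CutyCapt`, and the helper names are hypothetical. The `--url`, `--out` and `--delay` options belong to CutyCapt, and `xvfb-run` supplies a virtual display on headless servers.

```python
import subprocess


def cutycapt_command(url, out_path, delay_ms=0, use_xvfb=False):
    """Build the argument list for a CutyCapt run without executing it."""
    cmd = ['cutycapt', '--url=' + url, '--out=' + out_path]
    if delay_ms > 0:
        cmd.append('--delay=%d' % delay_ms)      # wait after load before capturing
    if use_xvfb:
        cmd = ['xvfb-run', '--auto-servernum'] + cmd
    return cmd


def run_cutycapt(url, out_path, **options):
    """Run CutyCapt and report whether it exited with status 0."""
    proc = subprocess.Popen(cutycapt_command(url, out_path, **options))
    return proc.wait() == 0
```

Passing the arguments as a list avoids shell quoting problems with URLs that contain `&` or `?`.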

  • Original article: https://www.cnblogs.com/luxiaojun/p/6259491.html