• Downloading ZIP archives with Django


      Original article: http://www.codingsoho.com/zh/blog/djangoya-suo-bao-xia-zai/

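    Preface

    A system sometimes needs to let users download content to their local machine. That content may sit in one place on the server or be scattered across several, and the files vary in format and size. This article works through these problems step by step.

    Environment used in this article: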

    Python 2.7.10

    Django 1.11

    zip-file

    Installation

    zipfile ships with the Python standard library, so no installation is required; running pip install zipfile is unnecessary.

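    Packing an entire directory

    The code below packs the entire current directory into abcd.zip and stores the archive in the current directory.

    Storing it in the current directory has one problem: the archive ends up containing itself as a zero-byte entry. It is therefore better to write the archive to a different directory.

    File zips.py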

    import os
    import sys
    import zipfile

    def gen_zip_with_zipfile(path_to_zip, target_filename):
        # Returns the ZipFile still open; the caller must call close() on it.
        f = None
        try:
            f = zipfile.ZipFile(target_filename, 'w', zipfile.ZIP_DEFLATED)
            for root, dirs, files in os.walk(path_to_zip):
                for filename in files:
                    f.write(os.path.join(root, filename))
                if len(files) == 0:
                    # A name ending in a separator records the directory itself
                    zif = zipfile.ZipInfo(root + os.sep)
                    f.writestr(zif, "")
        except (IOError, OSError, zipfile.BadZipfile) as message:
            print message
            sys.exit(1)
        finally:
            # f.close()
            pass

        # The central directory is only written on close(), so this check
        # may report a failure while f is still open.
        if zipfile.is_zipfile(f.filename):
            print "Successfully packing to: " + os.path.join(os.getcwd(), target_filename)
        else:
            print "Packing failed"

        return f
    

      

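    The compression logic is wrapped in the function above. I commented out the call that closes the file, because otherwise some later operations cannot be performed; close the file once those operations are finished.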

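    import zipfile
    import os

    from zips import gen_zip_with_zipfile

    # Raw strings keep the backslashes literal ('\a' would otherwise be a BEL character)
    tmpPath = r".\document"
    f = gen_zip_with_zipfile(tmpPath, r'..\abcd.zip')
    # print f.namelist()
    # print f.fp
    f.close()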
    

      

     

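    The above is an example call: it packs the document folder under the current directory, including its subfolders, into abcd.zip and stores the archive in the parent directory.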
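
The self-inclusion problem and the need to close the archive can both be handled inside the packing function. The following is a minimal sketch, not the article's zips.py: the helper name zip_directory and the scratch directory layout are made up for illustration, and it is written for Python 3 while the article targets Python 2.7. It stores paths relative to the source directory, skips the archive if it happens to live inside that directory, and closes the archive through a with block.

```python
import os
import tempfile
import zipfile


def zip_directory(src_dir, zip_path):
    """Pack src_dir into zip_path, storing paths relative to src_dir."""
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                if os.path.abspath(full) == zip_abs:
                    continue  # never add the archive to itself
                zf.write(full, os.path.relpath(full, src_dir))
            if not files and not dirs:
                # Keep empty directories as explicit entries
                rel = os.path.relpath(root, src_dir)
                if rel != ".":
                    entry = zipfile.ZipInfo(rel.replace(os.sep, "/") + "/")
                    zf.writestr(entry, "")
    return zip_path


# Usage: build a small tree in a scratch directory and pack it
work = tempfile.mkdtemp()
os.makedirs(os.path.join(work, "document", "empty"))
with open(os.path.join(work, "document", "a.txt"), "w") as fh:
    fh.write("hello")
archive = zip_directory(os.path.join(work, "document"),
                        os.path.join(work, "abcd.zip"))
with zipfile.ZipFile(archive) as zf:
    names = sorted(zf.namelist())
```

Because the with block closes the archive before the function returns, the caller receives a finished file and can reopen it for reading whenever needed.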

    Adding the ZIP file to an HttpResponse

     

    def post(self, *args, **kwargs):
    
        import zipfile
        import os 
    
        from zips import gen_zip_with_zipfile
    
        path_to = r".\document"
        f = gen_zip_with_zipfile(path_to, r'..\abcd.zip')
        f.close()
    
        fread = open(f.filename,"rb")
        # response = HttpResponse(f.fp, content_type='application/zip')
        response = HttpResponse(fread, content_type='application/zip')
        response['Content-Disposition'] = 'attachment;filename="{0}"'.format("download.zip")
        fread.close()
        return response
    

        

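    I am not sure whether the content can be read directly from f (the ZipFile object) to avoid opening the file a second time. I tried f.fp, but that is a write-only handle.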
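
One possible answer, offered as a sketch rather than a tested Django recipe: open the target file yourself in read/write mode and pass the file object to ZipFile. Closing the ZipFile then finishes the archive but leaves the handle open, so the same handle can be rewound and read. The file name and member name below are placeholders.

```python
import io
import os
import tempfile
import zipfile

# Open one handle for both writing and reading ("w+b"), hand it to ZipFile,
# then rewind and read the finished archive back from the same handle.
path = os.path.join(tempfile.mkdtemp(), "abcd.zip")
with open(path, "w+b") as fh:
    zf = zipfile.ZipFile(fh, "w", zipfile.ZIP_DEFLATED)
    zf.writestr("hello.txt", "hello")
    zf.close()           # writes the central directory; fh itself stays open
    fh.seek(0)
    payload = fh.read()  # bytes that could be passed to HttpResponse

# Check that the bytes read back form a valid archive
members = zipfile.ZipFile(io.BytesIO(payload)).namelist()
```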

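    Using a temporary file instead of a fixed file

    The approach above always leaves a file on the server. That data is not worth keeping and is pure redundancy. Is there a way around this? In this section we try a temporary file.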

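    As of Django 1.9, FileWrapper can no longer be imported from django.core.servers.basehttp; import it from wsgiref.util instead.

    When I first ran this code the downloaded file came out empty. The likely cause is the order of operations: HttpResponse consumes an iterator as soon as it is constructed, and at that moment the file position was still at the end of the archive. The version below records the size and rewinds the file before building the response.

    def post(self, *args, **kwargs):
        # from django.core.servers.basehttp import FileWrapper
        from wsgiref.util import FileWrapper
        import tempfile
        temp = tempfile.TemporaryFile()
        archive = zipfile.ZipFile(temp, 'w', zipfile.ZIP_DEFLATED)
        archive.write(r".\document\ckeditor.md")
        archive.close()

        size = temp.tell()   # archive size: the position is at the end after close()
        temp.seek(0)         # rewind before the response reads the file

        wrapper = FileWrapper(temp)
        response = HttpResponse(wrapper, content_type='application/zip')
        response['Content-Disposition'] = 'attachment; filename=test.zip'
        response['Content-Length'] = size
        return response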
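
The role of the file position can be seen without Django. This is a small illustrative sketch (Python 3, with a made-up member name): after the archive is closed, the temporary file's position sits at the end, so a read returns nothing until the file is rewound.

```python
import tempfile
import zipfile

temp = tempfile.TemporaryFile()
with zipfile.ZipFile(temp, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("ckeditor.md", "# notes")

size = temp.tell()     # position is at the end, so this equals the archive size
at_end = temp.read()   # reading from the end yields nothing
temp.seek(0)           # rewind before handing the file to a reader
rewound = temp.read()  # now the whole archive is returned
temp.close()
```

Any consumer that reads the file as soon as it receives it will see the same difference, which is why the rewind has to happen before the file is passed on.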
    

      

    Using memory instead of a file

    According to the ZipFile definition in the Python documentation, https://docs.python.org/3/library/zipfile.html#zipfile-objects

    class zipfile.ZipFile(file, mode='r', compression=ZIP_STORED, allowZip64=True)

        Open a ZIP file, where file can be a path to a file (a string), a file-like object or a path-like object.

    so we can use BytesIO or StringIO in place of a fixed file.

     

    def post(self, *args, **kwargs):
    
        import StringIO
        # Files (local path) to put in the .zip
        # FIXME: Change this (get paths from DB etc)
        filenames = [r".\document\ckeditor.md",]
    
        # Folder name in ZIP archive which contains the above files
        # E.g [thearchive.zip]/somefiles/file2.txt
        # FIXME: Set this to something better
        zip_subdir = "zipfolder"
        zip_filename = "%s.zip" % zip_subdir
    
        # Open StringIO to grab in-memory ZIP contents
        s = StringIO.StringIO()
    
        # The zip compressor
        zf = zipfile.ZipFile(s, "w")
    
        for fpath in filenames:
            # Calculate path for file in zip
            fdir, fname = os.path.split(fpath)
            zip_path = os.path.join(zip_subdir, fname)
    
            # Add file, at correct path
            zf.write(fpath, zip_path)
    
        # Must close zip for all contents to be written
        zf.close()
    
        # Grab ZIP file from in-memory, make response with correct MIME-type
        resp = HttpResponse(s.getvalue(), content_type='application/zip') 
        # ..and correct content-disposition
        resp['Content-Disposition'] = 'attachment; filename=%s' % zip_filename
    
        return resp
    

         

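    Notes on the code above:

    filenames lists what will be compressed. This is only a simple example; in a real project you can compress a whole folder or pick individual files scattered across the disk, as needed.

    Specifying zip_subdir is optional. If it is given, files are placed under that directory inside the archive; otherwise they keep the layout of the source. In the example above, ckeditor.md lives under the document folder, and by default the archive mirrors that structure. With zipfolder specified as in the example, the archive layout becomes zipfolder/ckeditor.md.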
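
For readers on Python 3, where the StringIO module no longer exists and ZIP data must go into a bytes buffer, the same idea can be sketched with io.BytesIO. The helper name zip_in_memory and the sample content are illustrative only; the function takes in-memory content instead of disk paths so that the effect of zip_subdir on the member names is easy to check.

```python
import io
import zipfile


def zip_in_memory(files, zip_subdir=None):
    """Return ZIP bytes for a {name: content} mapping, optionally under zip_subdir."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, content in files.items():
            # ZIP member names always use "/" regardless of the OS
            arcname = "{}/{}".format(zip_subdir, name) if zip_subdir else name
            zf.writestr(arcname, content)
    # The archive is complete once the with block has closed it
    return buf.getvalue()


data = zip_in_memory({"ckeditor.md": "# notes"}, zip_subdir="zipfolder")
names = zipfile.ZipFile(io.BytesIO(data)).namelist()
flat = zipfile.ZipFile(io.BytesIO(zip_in_memory({"ckeditor.md": "# notes"}))).namelist()
```

The returned bytes play the role of s.getvalue() in the view above.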

    Some examples elsewhere pass mimetype="application/x-zip-compressed" to HttpResponse. The mimetype argument was deprecated in Django 1.5 in favour of content_type and removed in Django 1.7.

     https://stackoverflow.com/questions/2463770/python-in-memory-zip-library

    https://stackoverflow.com/questions/12881294/django-create-a-zip-of-multiple-files-and-make-it-downloadable

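    In the next snippet the files are read from disk according to a queryset, and each file's path inside the archive is its path relative to MEDIA_ROOT.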

      

    def post(self, *args, **kwargs):
    
        import StringIO
        s = StringIO.StringIO()
        zf = zipfile.ZipFile(s, "w")   
        zip_subdir = "media"
        qs = self.get_queryset()
        f = self.filter_class(self.request.GET, queryset=qs)
        for obj in f.qs:
            path = obj.image_before.path
            fdir, fname = os.path.split(path)
            zip_path = os.path.join(zip_subdir, path[len(settings.MEDIA_ROOT)+1:])            
            zf.write(path, zip_path)
    
        zf.close()
        resp = HttpResponse(s.getvalue(), content_type='application/zip')  
        resp['Content-Disposition'] = 'attachment; filename=%s' % "daily_inspection_export.zip"
        return resp
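
The slice path[len(settings.MEDIA_ROOT)+1:] assumes that MEDIA_ROOT has no trailing separator; if it has one, the first character of the relative path is cut off. A hedged alternative is to let os.path.relpath do the work. The helper name archive_name and the /srv/media layout are assumptions for illustration, not part of the original project.

```python
import os
import posixpath


def archive_name(path, media_root, zip_subdir="media"):
    """Map a file path under media_root to its member name inside the archive."""
    rel = os.path.relpath(path, media_root)
    if rel == os.pardir or rel.startswith(os.pardir + os.sep):
        raise ValueError("%s is not inside %s" % (path, media_root))
    # Member names use "/" on every platform
    return posixpath.join(zip_subdir, *rel.split(os.sep))


root = os.path.join(os.sep, "srv", "media")
name = archive_name(os.path.join(root, "inspection", "a.jpg"), root)
# A trailing separator on the root gives the same kind of result
name_slash = archive_name(os.path.join(root, "x.png"), root + os.sep)
```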
    

         

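    Adding a file generated on the fly

    The previous examples all write files that already exist on disk into the archive. What if some of the files are generated on the fly?

    The code below builds on the earlier ZIP generation and adds a daily_inspection_export.csv file to the archive. This file does not exist on disk; it is generated from the data at request time.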

     

    def post(self, *args, **kwargs):
    
        import StringIO
        s = StringIO.StringIO()
        zf = zipfile.ZipFile(s, "w")   
        zip_subdir = "media"
        qs = self.get_queryset()
        f = self.filter_class(self.request.GET, queryset=qs)
        for obj in f.qs:
            path = obj.image_before.path
            fdir, fname = os.path.split(path)
            zip_path = os.path.join(zip_subdir, path[len(settings.MEDIA_ROOT)+1:])            
            zf.write(path, zip_path)
    
        from inspection.utils import gen_csv_file
        import tempfile
        temp = tempfile.NamedTemporaryFile()
        temp.close()
    
        fields_display = [ "category", "rectification_status", "location" ]
        fields_fk = ["inspector",  ]
        fields_datetime = ["due_date","created", "updated","completed_time"]
        excludes = [field.name for field in self.model._meta.get_fields() if isinstance(field, models.ManyToOneRel) or field.name.lower()=="id"]
        fields_multiple = ["impact",]    
        gen_csv_file(self.model, f.qs, temp.name, fields_display, fields_fk, fields_datetime, excludes, fields_multiple)
    
        zf.write(temp.name, "daily_inspection_export.csv")
        os.remove(temp.name)
    
        zf.close()
        resp = HttpResponse(s.getvalue(), content_type='application/zip')  
        resp['Content-Disposition'] = 'attachment; filename=%s' % "daily_inspection_export.zip"
        return resp
    

        

    File generation function

     

    def gen_csv_file(model, qs, filename, fields_display, fields_fk, fields_datetime, excludes, fields_multiple=None):
    
        import codecs
        import csv
        with open(filename, 'wb') as csvfile:
            writer = csv.writer(csvfile, dialect='excel')
    
            csvfile.write(codecs.BOM_UTF8)   
            
            row = []
            for field in model._meta.get_fields():
                if field.name in excludes:
                    continue
                row.append(field.verbose_name)
            writer.writerow(row)
    

       

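    First create a temporary file on disk, using either tempfile.NamedTemporaryFile() or tempfile.TemporaryFile(); the difference is that the former has a file name and the latter does not. Both worked for the example in this article (tested on Windows), but since the code operates on file.name, NamedTemporaryFile is the better choice. Either way the file is deleted automatically when it is closed. Another option is tempfile.mkdtemp(), which creates a temporary directory that you must delete yourself, for example with os.removedirs. For details see https://docs.python.org/2/library/tempfile.html

    If the program reopens the temporary file right after creating it, an error is raised: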

     [Errno 13] Permission denied: 'c:\users\admini~1\appdata\local\temp\tmph_mdma'

    The official documentation defines it as follows:

    tempfile.NamedTemporaryFile([mode='w+b'[, bufsize=-1[, suffix=''[, prefix='tmp'[, dir=None[, delete=True]]]]]])

    This function operates exactly as TemporaryFile() does, except that the file is guaranteed to have a visible name in the file system (on Unix, the directory entry is not unlinked). That name can be retrieved from the name attribute of the returned file-like object. Whether the name can be used to open the file a second time, while the named temporary file is still open, varies across platforms (it can be so used on Unix; it cannot on Windows NT or later). If delete is true (the default), the file is deleted as soon as it is closed.

    The returned object is always a file-like object whose file attribute is the underlying true file object. This file-like object can be used in a with statement, just like a normal file.

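    Note the description: on Windows, a file that is still open cannot be opened a second time, so we must close it before reopening it. With the default delete=True the temporary file is removed as soon as it is closed; gen_csv_file then creates an ordinary file under the same name, which is why os.remove() is called afterwards to delete it explicitly.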
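
A variation that avoids relying on the name of an already deleted file is to pass delete=False, which is a documented parameter of NamedTemporaryFile. The sketch below is illustrative only (the .csv suffix and the sample row are made up): the file survives close(), can be reopened by name on any platform, and is removed explicitly at the end.

```python
import os
import tempfile

# delete=False keeps the file on disk after close(), so it can be reopened
# by name on any platform; cleanup is then our responsibility.
temp = tempfile.NamedTemporaryFile(suffix=".csv", delete=False)
temp.close()
try:
    with open(temp.name, "w") as fh:   # second open is safe: temp is closed
        fh.write("category,location\n")
    with open(temp.name) as fh:
        content = fh.read()
finally:
    os.remove(temp.name)
still_there = os.path.exists(temp.name)
```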

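    CSV format

    I ran into two problems when working with the CSV file.

    1. Excel reported that the file format did not match and that it had detected a SYLK file

    The cause is that my CSV content started with ID. This is a known Microsoft bug; for details see https://www.alunr.com/excel-csv-import-returns-an-sylk-file-format-error/

    2. After changing the leading ID the file opened normally, but all the Chinese text was garbled

    Solution: call csvfile.write(codecs.BOM_UTF8) at the very start of the file. I have not looked into the exact reason, but this works both for a local file and for an HttpResponse.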
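
The usual explanation is that Excel treats a leading byte order mark as the signal that a CSV file is UTF-8 and otherwise falls back to the system code page. On Python 3 the same effect can be had with the utf-8-sig codec, which prepends the bytes of codecs.BOM_UTF8 when encoding. The sketch below is an illustration with made-up data, not the project's gen_csv_file; the header deliberately does not begin with ID.

```python
import codecs
import csv
import io


def csv_bytes_for_excel(header, rows):
    """Encode rows as UTF-8 CSV with a leading BOM."""
    text = io.StringIO(newline="")
    writer = csv.writer(text, dialect="excel")
    writer.writerow(header)
    writer.writerows(rows)
    # "utf-8-sig" prepends the same bytes as codecs.BOM_UTF8
    return text.getvalue().encode("utf-8-sig")


data = csv_bytes_for_excel(["编号", "位置"], [["1", "东门"]])
has_bom = data.startswith(codecs.BOM_UTF8)
first_line = data.decode("utf-8-sig").splitlines()[0]
```

The resulting bytes can be written to a file opened in binary mode or passed as the body of a response.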

    HttpResponse approach

     

    response = HttpResponse(content_type='text/csv')        
    response['Content-Disposition'] = 'attachment; filename={0}'.format(filename)
    response.write(codecs.BOM_UTF8) # add bom header
    writer = csv.writer(response)
    

      

    Disk file approach

    import codecs
    import csv
    with open(filename, 'wb') as csvfile:
        csvfile.write(codecs.BOM_UTF8)   
    

       

    zip-stream

    There are many versions of zip-stream; the one used here is https://github.com/allanlei/python-zipstream

    Installation

    Install it with pip

    pip install zipstream

    File download

    import zipstream
    z = zipstream.ZipFile()
    z.write(r'static\css\inspection.css')
    
    with open('zipfile.zip', 'wb') as f:
        for data in z:
            f.write(data)
    

    Web Response

    The basic approach is as follows

    import zipfile
    import zipstream
    from django.http import StreamingHttpResponse

    def zipball(request):
        z = zipstream.ZipFile(mode='w', compression=zipfile.ZIP_DEFLATED)
        z.write('/path/to/file')

        response = StreamingHttpResponse(z, content_type='application/zip')
        response['Content-Disposition'] = 'attachment; filename={}'.format('files.zip')
        return response
    

      

    Re-implementing the earlier example with this approach simplifies both building and returning the archive. Note that neither the zip object nor the temporary file may be closed or deleted before the response is returned, otherwise writing the archive fails.

    def post(self, *args, **kwargs):

        from django.http import StreamingHttpResponse
        import zipstream

        zf = zipstream.ZipFile(mode='w', compression=zipfile.ZIP_DEFLATED)
        zip_subdir = "media"
        # Filtered queryset, built as in the earlier examples
        qs = self.get_queryset()
        f = self.filter_class(self.request.GET, queryset=qs)
        for obj in f.qs:
            path = obj.image_before.path
            fdir, fname = os.path.split(path)
            zip_path = os.path.join(zip_subdir, path[len(settings.MEDIA_ROOT)+1:])
            zf.write(path, zip_path)

            if obj.image_after:
                path = obj.image_after.path
                fdir, fname = os.path.split(path)
                zip_path = os.path.join(zip_subdir, path[len(settings.MEDIA_ROOT)+1:])
                zf.write(path, zip_path)


        from inspection.utils import gen_csv_file
        import tempfile
        temp = tempfile.NamedTemporaryFile()
        temp.close()

        fields_display = [ "category", "rectification_status", "location" ]
        fields_fk = ["inspector",  ]
        fields_datetime = ["due_date","created", "updated","completed_time"]
        excludes = [field.name for field in self.model._meta.get_fields() if isinstance(field, models.ManyToOneRel)]
        fields_multiple = ["impact",]
        gen_csv_file(self.model, f.qs, temp.name, fields_display, fields_fk, fields_datetime, excludes, fields_multiple)

        zf.write(temp.name, "daily_inspection_export.csv")

        response = StreamingHttpResponse(zf, content_type='application/zip')
        response['Content-Disposition'] = 'attachment; filename={}'.format('daily_inspection_export.zip')
        # zf.close()
        # os.remove(temp.name)
        return response
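
Why closing or deleting early breaks the download can be illustrated with a plain generator. This is a standard-library stand-in, not zipstream itself, and it rests on the assumption that python-zipstream, like this generator, only reads its source files while the response body is being iterated. The function name stream_file is made up for the example.

```python
import os
import tempfile


def stream_file(path, chunk_size=8192):
    """Yield the file's bytes lazily; nothing is opened until iteration starts."""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            yield chunk


fd, path = tempfile.mkstemp()
os.write(fd, b"report")
os.close(fd)

body = stream_file(path)   # like building the response: no I/O has happened yet
os.remove(path)            # deleting "too early"
try:
    b"".join(body)         # the file is only opened here, and it is gone
    outcome = "ok"
except (IOError, OSError):
    outcome = "file vanished before it was streamed"
```

In a streaming view the iteration happens after the view function has returned, so any cleanup of the temporary file has to wait until the response has been fully sent.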
    

      

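    Large file downloads

    To avoid holding too much content on disk or in memory, zipstream works with generators and offers an iterative way to add content; the official demo is shown below. Iterative and non-iterative writes can be mixed. I will not go into more detail here.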

     

    def iterable():
        for _ in xrange(10):
            yield b'this is a byte string\x01\n'
    
    z = zipstream.ZipFile()
    z.write_iter('my_archive_iter', iterable())
    z.write('path/to/files', 'my_archive_files')
    
    with open('zipfile.zip', 'wb') as f:
        for data in z:
            f.write(data)
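
For a large source, the iterable passed to write_iter in the demo would typically read in bounded chunks. A minimal sketch of such a generator follows, using only the standard library; the name read_in_chunks and the 64 KiB chunk size are arbitrary choices for illustration, and the in-memory buffer stands in for a large file or network stream.

```python
import io


def read_in_chunks(fileobj, chunk_size=64 * 1024):
    """Yield successive chunks so at most chunk_size bytes are held at once."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            return
        yield chunk


source = io.BytesIO(b"x" * 150000)
sizes = [len(chunk) for chunk in read_in_chunks(source)]
```

Following the demo above, a generator like this could be handed over as z.write_iter('big.bin', read_in_chunks(fileobj)), where the member name is again a placeholder.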
    

      


     

    References

    API

    https://docs.python.org/2/library/tempfile.html

    https://docs.python.org/2/library/zipfile.html

    github

    https://pypi.org/project/django-zipfile/0.3.0/

    https://github.com/allanlei/python-zipstream

    https://github.com/SpiderOak/ZipStream (knock-off)

    Other

    https://blog.csdn.net/klzs1/article/details/9339391

    https://stackoverflow.com/questions/2463770/python-in-memory-zip-library

    io.BytesIO

    https://stackoverflow.com/questions/908258/generating-file-to-download-with-django

    https://stackoverflow.com/questions/12881294/django-create-a-zip-of-multiple-files-and-make-it-downloadable

    python笔记之ZipFile模块

    django 下载文件的几种方法

    Django 大文件下载

    https://docs.djangoproject.com/en/dev/ref/request-response/#telling-the-browser-to-treat-the-response-as-a-file-attachment

    django zip

    使用Python压缩文件/文件夹

    使用Python在内存中生成zip文件

    Django 利用zipstream压缩下载多文件夹

    python-django文件下载

    Python模块学习——tempfile

    tempfile.NamedTemporaryFile创建临时文件在windows没有权限打开

    Excel CSV import returns an SYLK file format error

  • Source: https://www.cnblogs.com/2dogslife/p/8972293.html