Finding high-bitrate video files with Python and FFmpeg

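This article uses Python 2.7. The job breaks down into two steps:

Walk the video files under a directory
Use ffprobe to read each video file's bitrate information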

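Getting video information as JSON with ffprobe

ffprobe.exe is the tool bundled with FFmpeg for inspecting video information. The command for getting that information in JSON format looks like this: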

ffprobe -v quiet -print_format json -show_format -show_streams -i filename

This command prints a JSON structure that contains a streams entry and a format entry.
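
To make the shape of that output concrete, here is a minimal sketch that parses a trimmed, hand-written sample. The field values and the file name are made up for illustration and are not real ffprobe output. Note that ffprobe reports fields such as bit_rate as strings, which is why the listings in this article call int() on them.

```python
import json

# Hand-written sample in the shape ffprobe prints; the values are made up.
SAMPLE_OUTPUT = '''
{
  "streams": [
    {"index": 0, "codec_name": "h264", "codec_type": "video",
     "width": 1280, "height": 720, "bit_rate": "2500000"},
    {"index": 1, "codec_name": "aac", "codec_type": "audio",
     "bit_rate": "128000"}
  ],
  "format": {"filename": "sample.mp4", "bit_rate": "2650000"}
}
'''

info = json.loads(SAMPLE_OUTPUT)
# "streams" is a list with one entry per stream; "format" describes the container.
codec_types = [stream["codec_type"] for stream in info["streams"]]
container_bit_rate = int(info["format"]["bit_rate"])
```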

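Reading the JSON in Python

Use os.popen(strCmd).read() to capture the command's output
Use json.loads to parse the JSON. This call must be wrapped in try, otherwise garbled (non-UTF-8) output from some files will crash the script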
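
The two steps above can also be sketched with the subprocess module instead of os.popen. This is an alternative under stated assumptions, not the article's method: the helper names run_ffprobe and parse_probe_output are hypothetical, and ffprobe is assumed to be reachable by the shell. The sketch avoids Python 3 only syntax so it should also run on Python 2.7.

```python
import json
import subprocess

def run_ffprobe(file_name):
    """Return ffprobe's raw output as bytes (hypothetical helper name)."""
    cmd = ["ffprobe", "-v", "quiet", "-print_format", "json",
           "-show_format", "-show_streams", "-i", file_name]
    # Passing a list avoids quoting problems with spaces in file names.
    return subprocess.check_output(cmd)

def parse_probe_output(raw_bytes):
    """Decode and parse the output; return None instead of raising."""
    try:
        return json.loads(raw_bytes.decode("utf-8"))
    except ValueError:
        # ValueError covers both UnicodeDecodeError and JSON syntax errors.
        return None
```

Keep in mind that subprocess.check_output raises CalledProcessError when ffprobe exits with a non-zero status, so a caller looping over many files would likely want to catch that as well.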

import os, re, json

# ffprobe must be placed in system32, not on the user's PATH
# Call ffprobe and return its output as a JSON string
def getJsonString(strFileName):
    strCmd = 'ffprobe -v quiet -print_format json -show_format -show_streams -i "' + strFileName + '"'
    mystring = os.popen(strCmd).read()
    return mystring

# Parse the ffprobe output; returns None when parsing fails, for example with
# UnicodeDecodeError: 'utf8' codec can't decode byte 0xc0 in position 57: invalid start byte
def loadVideoInfo(strFileName):
    filecontent = getJsonString(strFileName)

    try:
        js = json.loads(filecontent)
    except Exception as e:
        print('%s: %s' % (e, strFileName))
        return None

    return js

Extracting the video information

Sometimes the video stream has no bit_rate entry. In that case the bitrate has to be taken from the format entry instead.
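
A minimal sketch of that fallback, assuming info is the dict parsed from ffprobe's JSON output. The function name pick_bit_rate is hypothetical. Keep in mind that the format-level value is the bitrate of the whole container, audio included, so it only approximates the video bitrate.

```python
def pick_bit_rate(info):
    """Return the video stream's bit_rate, else the container's, else 0."""
    for stream in info.get("streams", []):
        if stream.get("codec_type") == "video":
            if "bit_rate" in stream:
                return int(stream["bit_rate"])
            break  # only the first video stream is considered
    # Fall back to the overall bitrate, which also includes audio.
    return int(info.get("format", {}).get("bit_rate", 0))
```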

iVideoWidth = 0
iVideoHeight = 0
iVideoBitRate = 0
iAllBitRate = 0
strCodecName = ''

# js is the dict parsed from the ffprobe output in the previous step
arrStreams = js['streams']

for stream in arrStreams:
    if stream['codec_type'] == 'video':

        strCodecName = stream['codec_name']
        iVideoWidth = int(stream['width'])
        iVideoHeight = int(stream['height'])

        # h264 streams may not have this entry
        if 'bit_rate' in stream:
            iVideoBitRate = int(stream['bit_rate'])

        break

# Overall bitrate of the container, for use when the video stream has none
iAllBitRate = int(js['format']['bit_rate'])

print('CodecName (%s), width(%d), height(%d), video bit_rate(%d), all bit_rate (%d)' % (strCodecName, iVideoWidth, iVideoHeight, iVideoBitRate, iAllBitRate))

Getting all file names in a directory

There are plenty of implementations of this online; I picked a simple recursive one.

g_fileList = []

def getFiles(path):
    if os.path.exists(path):   
        files = os.listdir(path)
        for f in files :
            subpath=os.path.join(path,f)
            if os.path.isfile(subpath):
                g_fileList.append(subpath)
            else:
                getFiles(subpath)
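
For comparison, the same traversal can be sketched with the standard library's os.walk, which returns the result instead of filling a global list. This is an alternative for illustration rather than the version the article uses, and the name list_files is hypothetical.

```python
import os

def list_files(root):
    """Return the paths of all regular files under root, recursively."""
    found = []
    for dir_path, _dir_names, file_names in os.walk(root):
        for name in file_names:
            found.append(os.path.join(dir_path, name))
    return found
```

Calling list_files('.') would then play the role of getFiles('.') followed by reading g_fileList.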

Filtering video files

# Filter by file extension
def filterExname (fileList, arrExtnames):
    filterList = []
    for strFile in fileList:
        strLowFileName = strFile.lower() # lower-case first

        for strExtName in arrExtnames :
            if strLowFileName.endswith(strExtName) :
                filterList.append(strFile)
                break # one match is enough; avoid adding the file twice

    return filterList

g_fileList = []

# If this is a network path, map it to a local drive first; Python may not support network paths
getFiles('.')

print('g_fileList len = %d' % len(g_fileList))
arrExtName = ['.mkv', '.rmvb', '.rm', '.wmv', '.avi', '.mp4', '.mov', '.mpg', '.xvid', '.asf', '.mpeg', '.vob', '.3gp', '.flv', '.ts']
arrVideoFiles = filterExname (g_fileList, arrExtName)
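
As a more compact variant of the extension filter, the sketch below relies on str.endswith accepting a tuple of suffixes. The function name filter_by_extension is hypothetical, and the behavior is intended to match filterExname for the extension list above.

```python
def filter_by_extension(file_list, extensions):
    """Keep the paths whose lower-cased name ends with one of extensions."""
    suffixes = tuple(ext.lower() for ext in extensions)
    # str.endswith accepts a tuple, so each file is tested once and kept once.
    return [path for path in file_list if path.lower().endswith(suffixes)]
```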

Filtering high-bitrate files

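# Threshold for bitrate per pixel; reasonable values are 2.5 - 4.0
PIEXL_RATE_MAX = 3.9

def isLargeBps(iWidth, iHeight, iBitrate):
    # Baseline: bits per second per pixel of a frame
    if iWidth * iHeight == 0:
        return False # no video stream was found, nothing to compare

    fCurrentBitRatePixel = float(iBitrate) / (iWidth * iHeight)

    print('isNeedConvert input = %d %d %d %f' % (iWidth, iHeight, iBitrate, fCurrentBitRatePixel))
    return (fCurrentBitRatePixel > PIEXL_RATE_MAX)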
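
A short worked example may help calibrate the threshold. The resolutions and bitrates below are illustrative assumptions, and THRESHOLD simply mirrors the 3.9 value used above. A 1920x1080 frame has 2,073,600 pixels, so 8 Mbit/s comes out just under the threshold while 10 Mbit/s is over it.

```python
THRESHOLD = 3.9  # same value as the article's threshold constant

def bits_per_pixel(width, height, bit_rate):
    """Bitrate in bits per second divided by the number of pixels per frame."""
    return float(bit_rate) / (width * height)

# 1920x1080 has 2,073,600 pixels per frame.
ratio_8m = bits_per_pixel(1920, 1080, 8000000)    # about 3.86, under the threshold
ratio_10m = bits_per_pixel(1920, 1080, 10000000)  # about 4.82, over the threshold
is_large_8m = ratio_8m > THRESHOLD
is_large_10m = ratio_10m > THRESHOLD
```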

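Summary

That is roughly all there is to it. Emitting batch command lines or writing the results to a CSV file is straightforward and not covered here.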

Original article: https://www.cnblogs.com/blfshiye/p/5244893.html