• [speech] Computing the total duration of audio files


    There are two ways to do this: call the sox tool, or use Python's wave library.

    sox

    On the command line, run sox wavfile -n stat:

    -bash-4.2$ sox arctic_a0001.wav -n stat  
    Samples read:             53680
    Length (seconds):      3.355000
    Scaled by:         2147483647.0
    Maximum amplitude:     0.628510
    Minimum amplitude:    -0.649933
    Midline amplitude:    -0.010712
    Mean    norm:          0.069802
    Mean    amplitude:    -0.000027
    RMS     amplitude:     0.114387
    Maximum delta:         0.332764
    Minimum delta:         0.000000
    Mean    delta:         0.019452
    RMS     delta:         0.033908
    Rough   frequency:          754
    Volume adjustment:        1.539
    

    The Length line gives the duration, in seconds. Note that sox prints this report on stderr, not stdout.

    Code

    import csv
    import math
    import os

    wavdir = 'wav'
    wavlst = os.listdir(wavdir)
    total = 0

    def get_wav_duration(wav_id):
        # sox prints the stat report on stderr, so redirect it into stdout
        cmd = "sox {} -n stat 2>&1".format(os.path.join(wavdir, wav_id))
        with os.popen(cmd) as tmp:
            # second line of the report: "Length (seconds):  3.355000"
            dur_line = tmp.readlines()[1].split()
        # truncate (not round) to one decimal place
        dur = math.floor(float(dur_line[2]) * 10) / 10
        global total
        total = total + dur
        return str(dur)

    # Python 3: open csv files in text mode with newline='' (the original 'wb' only works on Python 2)
    with open('text.csv', 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        for wav_id in wavlst:
            utt_id = wav_id.split('.')[0]
            duration = get_wav_duration(wav_id)
            writer.writerow([utt_id, duration])
    print(total)
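
    The script above picks the duration by position (second line, third field). A minimal sketch of a variant that looks for the Length line by its label and calls sox through subprocess with an argument list, so file names with spaces need no quoting. The names parse_sox_length and sox_duration are hypothetical helpers, not part of sox or the original script.

```python
import re
import subprocess

# Matches the "Length (seconds):      3.355000" line of the sox stat report.
_LENGTH_RE = re.compile(r"^\s*Length \(seconds\):\s*([0-9.]+)", re.MULTILINE)


def parse_sox_length(report):
    """Return the duration in seconds found in a sox stat report."""
    match = _LENGTH_RE.search(report)
    if match is None:
        raise ValueError("no 'Length (seconds)' line in sox output")
    return float(match.group(1))


def sox_duration(path):
    """Run `sox path -n stat` and return the reported duration (hypothetical helper)."""
    # The stat report arrives on stderr, so capture that stream.
    result = subprocess.run(
        ["sox", path, "-n", "stat"],
        stderr=subprocess.PIPE,
        text=True,
        check=True,
    )
    return parse_sox_length(result.stderr)


# Parsing the first lines of the report shown earlier; no sox call is needed here.
sample = """Samples read:             53680
Length (seconds):      3.355000
Scaled by:         2147483647.0
"""
print(parse_sox_length(sample))  # 3.355
```

    Only parse_sox_length is exercised in the example; sox_duration assumes sox is installed and on the PATH.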
    

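    Notes on os.popen

    To run an external command from Python, the two basic calls are os.system(cmd) and os.popen(cmd). The difference is that os.system returns the command's exit status, while os.popen gives access to whatever the command printed while it ran. The Python documentation recommends the subprocess module (for example subprocess.Popen) over os.popen; it is slightly more involved to use.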
    os.system

    #!/bin/bash
    echo "hello world!"
    exit 3
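    os.system(cmd) returns a 16-bit value on Unix: the low byte holds the number of the signal that killed the command (0 if none), and the high byte holds the command's exit status, so the high byte is the useful part. With the script above saved as test.sh:
    >>> n = os.system('bash test.sh')
    hello world!
    >>> n >> 8
    3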
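
    The bit layout described above can be sketched as a small helper. split_wait_status is a hypothetical name; the layout is the Unix one. On Python 3.9 and later the standard library also offers os.waitstatus_to_exitcode for the same purpose.

```python
import os


def split_wait_status(status):
    """Split a Unix wait status (as returned by os.system) into its two parts."""
    exit_code = (status >> 8) & 0xFF  # high byte: exit status of the command
    signal_no = status & 0x7F         # low 7 bits: signal that killed it, 0 if none
    return exit_code, signal_no


# A script that ends with `exit 3` yields the status 3 << 8 == 768.
print(split_wait_status(3 << 8))  # (3, 0)

# Python 3.9+ ships an equivalent helper in the standard library.
if hasattr(os, "waitstatus_to_exitcode"):
    print(os.waitstatus_to_exitcode(3 << 8))  # 3
```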
    

    os.popen
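    This call works through a pipe: the function returns a file-like object whose content is the command's standard output (roughly, whatever echo prints). To capture anything else, such as stderr, append the redirection 2>&1 to the command. [Mind the redirection]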

    import os
    cmd = 'echo haha'
    tmp = os.popen(cmd).readlines()
    print(tmp)
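
    For comparison, a minimal sketch of the same capture with the subprocess module, where stderr=subprocess.STDOUT plays the role of 2>&1. run_and_capture is a hypothetical helper name, and the child process is the Python interpreter itself only so that the example runs without a shell.

```python
import subprocess
import sys


def run_and_capture(args):
    """Run a command and return its stdout and stderr merged into one string."""
    result = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # same effect as appending 2>&1 in a shell
        text=True,
    )
    return result.stdout


out = run_and_capture([sys.executable, "-c", "print('haha')"])
print(out.splitlines())  # ['haha']
```

    Passing the command as a list skips the shell entirely, so no quoting or redirection syntax is needed.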
    

    wave lib

    This computes the duration of the speech in The World English Bible: 12 GB of files in a two-level directory tree. The total comes to 263965.387755 s (about 73 h).

    import os
    import wave
    import contextlib

    def get_wav_duration(fname, print_flag=0):
        # duration in seconds = number of frames / frames per second
        with contextlib.closing(wave.open(fname,'r')) as f:
            frames = f.getnframes()
            rate = f.getframerate()
            wav_duration = frames / float(rate)
            if str(print_flag) != '0':
                print('wav time: {}'.format(wav_duration))
            return wav_duration

    # layout: WEB/<subdirectory>/<file>; collect every file two levels down
    wavdirdir = 'WEB'
    wavdirlst = os.listdir(wavdirdir)
    wavlst = []
    for lst in wavdirlst:
        wavdir = os.path.join(wavdirdir,lst)
        wavpath = os.listdir(wavdir)
        for wav in wavpath:
            wavlst.append(os.path.join(wavdir,wav))
    # sum the durations of all collected files
    total = 0
    for lst in wavlst:
        total += get_wav_duration(lst)
    print(total)
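
    The loops above assume exactly two directory levels and only wav files. A minimal sketch of a variant built on os.walk, which handles any depth and skips files without a .wav extension. The names wav_duration, total_wav_duration and format_hours are hypothetical.

```python
import contextlib
import os
import wave


def wav_duration(fname):
    """Duration of one wav file in seconds: frames / frame rate."""
    with contextlib.closing(wave.open(fname, 'rb')) as f:
        return f.getnframes() / float(f.getframerate())


def total_wav_duration(root):
    """Sum the durations of all .wav files under root, at any depth."""
    total = 0.0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith('.wav'):
                total += wav_duration(os.path.join(dirpath, name))
    return total


def format_hours(seconds):
    """Render a duration in seconds as hours with one decimal place."""
    return '{:.1f} h'.format(seconds / 3600.0)


# The total reported above, converted to hours.
print(format_hours(263965.387755))  # 73.3 h
```

    A call such as total_wav_duration('WEB') would replace the two nested listdir loops.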
    

    Reference

    http://blog.csdn.net/yogurt0928/article/details/46625731
    https://taizilongxu.gitbooks.io/stackoverflow-about-python/content/6/README.html
    http://blog.csdn.net/windone0109/article/details/8895875
    https://www.cnblogs.com/bluescorpio/archive/2010/05/04/1727020.html
    http://blog.csdn.net/y_xianjun/article/details/73245482

  • Original article: https://www.cnblogs.com/zhanxiage1994/p/7873304.html