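I have spent the last couple of days on audio processing and took quite a few detours. Fortunately the problems are solved now, so I am writing them down here.
I ran into a number of audio problems, including:
0. A particular codec cannot be opened
1. No sound
2. Noise, but the wanted audio is still audible
3. Nothing but noise
4. Cannot transcode to a particular format, such as AAC or MP3
First, a few important parameters.
1. sample_fmt. The sample format, which determines how the raw audio data is interpreted.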
enum AVSampleFormat {
AV_SAMPLE_FMT_NONE = -1,
AV_SAMPLE_FMT_U8, ///< unsigned 8 bits
AV_SAMPLE_FMT_S16, ///< signed 16 bits
AV_SAMPLE_FMT_S32, ///< signed 32 bits
AV_SAMPLE_FMT_FLT, ///< float
AV_SAMPLE_FMT_DBL, ///< double
AV_SAMPLE_FMT_U8P, ///< unsigned 8 bits, planar
AV_SAMPLE_FMT_S16P, ///< signed 16 bits, planar
AV_SAMPLE_FMT_S32P, ///< signed 32 bits, planar
AV_SAMPLE_FMT_FLTP, ///< float, planar
AV_SAMPLE_FMT_DBLP, ///< double, planar
AV_SAMPLE_FMT_NB ///< Number of sample formats. DO NOT USE if linking dynamically
};
/**
* Audio Sample Formats
*
* @par
* The data described by the sample format is always in native-endian order.
* Sample values can be expressed by native C types, hence the lack of a signed
* 24-bit sample format even though it is a common raw audio data format.
*
* @par
* The floating-point formats are based on full volume being in the range
* [-1.0, 1.0]. Any values outside this range are beyond full volume level.
*
* @par
* The data layout as used in av_samples_fill_arrays() and elsewhere in FFmpeg
* (such as AVFrame in libavcodec) is as follows:
*
* For planar sample formats, each audio channel is in a separate data plane,
* and linesize is the buffer size, in bytes, for a single plane. All data
* planes must be the same size. For packed sample formats, only the first data
* plane is used, and samples for each channel are interleaved. In this case,
* linesize is the buffer size, in bytes, for the 1 plane.
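*/
I am quoting this comment as-is because a paraphrase would shift the meaning, and the original already explains things clearly.
Key terms: planar, channel, data plane, interleaved, linesize.
A trailing P means the data is organized in planar form; the other formats are interleaved (packed). As far as I can tell these are the only two layouts.
The enum values run from 0 to 9. The ones I used most are 1 (AV_SAMPLE_FMT_S16) for AAC and 8 (AV_SAMPLE_FMT_FLTP) for MP3; which formats are accepted depends on the specific encoder.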
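To make the planar versus packed distinction concrete, here is a minimal sketch in plain C. It does not use FFmpeg at all; the function name packed_to_planar_s16 is made up for illustration, and it assumes signed 16-bit samples.

```c
#include <stddef.h>
#include <stdint.h>

/* Split packed (interleaved) 16-bit samples into one plane per channel.
 * packed holds nb_samples * channels values, e.g. L0 R0 L1 R1 ... for stereo.
 * planes[c] must have room for nb_samples values.
 * Hypothetical helper, not an FFmpeg function. */
void packed_to_planar_s16(const int16_t *packed, int16_t **planes,
                          size_t nb_samples, size_t channels)
{
    for (size_t i = 0; i < nb_samples; i++)
        for (size_t c = 0; c < channels; c++)
            planes[c][i] = packed[i * channels + c];
}
```

In the packed layout everything lives in one buffer; in the planar layout each channel gets its own buffer, which is what the data planes in the comment above refer to.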
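2. sample_rate. Most audio seems to use 44100 Hz. Lowering it distorts the sound, while raising it makes little audible difference. The reason is the range of human hearing: 44100 Hz already covers frequencies up to about 22 kHz.
3. bit_rate. This is set by the user and rarely causes problems.
4. channels, the number of channels, and channel_layout.
Solutions:
0. Cannot open a codec
Solution: if it is not a basic mistake such as forgetting to register the codecs, then the parameters are set incorrectly.
Each codec accepts only specific sample formats, for example s16 or s16p. If sample_fmt does not match, the codec will not open either.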
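One way to catch a mismatch before opening the codec is to scan the list of formats it accepts. FFmpeg's AVCodec has exposed this as sample_fmts, an array terminated by AV_SAMPLE_FMT_NONE (-1). The sketch below mirrors that pattern with plain ints so it compiles without FFmpeg; FMT_NONE and sample_fmt_supported are stand-in names, not FFmpeg identifiers.

```c
/* Stand-in for AV_SAMPLE_FMT_NONE (-1), which terminates the list. */
#define FMT_NONE (-1)

/* Return 1 if wanted appears in the FMT_NONE-terminated list, else 0.
 * Hypothetical helper; with FFmpeg the list would come from the codec. */
int sample_fmt_supported(const int *supported, int wanted)
{
    if (!supported)
        return 0;
    for (const int *p = supported; *p != FMT_NONE; p++)
        if (*p == wanted)
            return 1;
    return 0;
}
```

If the check fails, either pick a format from the list or convert the audio to one, as described in the next item.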
1. When the source and target audio differ in sample format or sample rate, the audio has to be converted
swr_ctx = swr_alloc();
if (!swr_ctx) {
    printf("Could not allocate resampler context\n");
    return -1;
}

/* set options */
av_opt_set_int(swr_ctx, "in_channel_layout", iaCodecCtx->channel_layout, 0);
av_opt_set_int(swr_ctx, "in_sample_rate", iaCodecCtx->sample_rate, 0);
av_opt_set_sample_fmt(swr_ctx, "in_sample_fmt", iaCodecCtx->sample_fmt, 0);
av_opt_set_int(swr_ctx, "out_channel_layout", oaCodecCtx->channel_layout, 0);
av_opt_set_int(swr_ctx, "out_sample_rate", oaCodecCtx->sample_rate, 0);
av_opt_set_sample_fmt(swr_ctx, "out_sample_fmt", oaCodecCtx->sample_fmt, 0);

/* initialize the resampling context */
if ((ret = swr_init(swr_ctx)) < 0) {
    printf("Failed to initialize the resampling context\n");
    return -1;
}
ret = swr_convert(swr_ctx, pFrame->data, pFrame->nb_samples, (const uint8_t **)aFrame->data, aFrame->nb_samples);
The code above takes care of converting to the target format, changing the sample rate, and copying the data.
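When the two sample rates differ, the number of output samples differs from the number of input samples, so the output buffer has to be sized accordingly. FFmpeg's own resampling example computes this with av_rescale_rnd together with swr_get_delay. The sketch below only shows the basic rounding-up arithmetic and ignores the resampler's internal delay; resampled_count is a made-up name.

```c
#include <stdint.h>

/* Rough upper bound on the number of output samples per channel when
 * in_samples at in_rate are resampled to out_rate:
 *   ceil(in_samples * out_rate / in_rate)
 * Hypothetical helper; it does not account for samples buffered
 * inside the resampler. */
int64_t resampled_count(int64_t in_samples, int in_rate, int out_rate)
{
    if (in_samples <= 0 || in_rate <= 0 || out_rate <= 0)
        return 0;
    return (in_samples * out_rate + in_rate - 1) / in_rate;
}
```

For example, 1024 samples at 48000 Hz become at most 941 samples at 44100 Hz, while equal rates leave the count unchanged.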
2. Nothing but noise
Cause: the data was written incorrectly. packet.data was completely empty, yet it was still written to the file.
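A simple guard avoids writing an empty payload. The sketch below is plain C and assumes that "empty" means a NULL pointer or a non-positive size; write_payload is a hypothetical helper, and with FFmpeg the arguments would be the packet's data and size fields.

```c
#include <stdint.h>
#include <stdio.h>

/* Write an encoded payload only when it actually holds data.
 * Returns the number of bytes written, 0 if the payload was skipped,
 * or -1 on a short write. Hypothetical helper. */
long write_payload(FILE *out, const uint8_t *data, int size)
{
    if (!out || !data || size <= 0)
        return 0; /* nothing valid to write, so skip it */
    size_t n = fwrite(data, 1, (size_t)size, out);
    return n == (size_t)size ? (long)n : -1;
}
```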
3. The target audio is very faint
Cause:
The data size was calculated incorrectly, so unnecessary data was included.
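For reference, the amount of real audio in a frame follows directly from the parameters above. The sketch below shows the arithmetic in plain C; audio_bytes is a made-up name. FFmpeg provides av_get_bytes_per_sample and av_samples_get_buffer_size for the same purpose. One possible source of extra data, stated here as an assumption rather than what happened in my case, is using the allocated buffer size, which may include alignment padding, instead of the sample-based size.

```c
#include <stddef.h>

/* Bytes of actual audio for nb_samples per channel.
 * For a packed format this is the size of the single buffer;
 * for a planar format, divide by channels to get the size of one plane.
 * Hypothetical helper. */
size_t audio_bytes(size_t nb_samples, size_t channels, size_t bytes_per_sample)
{
    return nb_samples * channels * bytes_per_sample;
}
```

For example, 1024 stereo samples in signed 16-bit format occupy 1024 * 2 * 2 = 4096 bytes.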