Search for "gjzkeyframe" on WeChat and follow the account "关键帧Keyframe" to get the latest audio/video technology articles.
This account shares audio/video technology in roadmap fashion: audio/video fundamentals (done) → audio/video tools (done) → audio/video engineering demos (in progress) → audio/video industrial practice (in preparation).
For iOS/Android client developers who want to start learning audio/video development, the smoothest path is to first gain a basic understanding of audio/video concepts, then use the platform's audio/video capabilities to practice the capture → encode → mux → demux → decode → render pipeline hands-on, using audio/video tools to analyze and understand the corresponding data along the way.
In the first 6 AVDemo articles of the engineering-demos series, we broke down the audio capture → encode → mux → demux → decode → render pipeline and implemented demos on top of the iOS system APIs.
After following this account, you can send it the message "AVDemo" to get the full source code of the demos.
If you have worked through those demos, you should already have some feel for audio/video development on iOS. Building on that, this article summarizes the iOS audio processing frameworks and the main APIs and data structures we used in the demos.
When looking into the iOS audio processing stack, the two official architecture diagrams below are the easiest to find. They come from the documents Audio Unit Hosting Guide for iOS[1] and Core Audio Overview[2], respectively.
(Figure: iOS Audio Frameworks)
(Figure: Core Audio API Layers)
However, both documents are quite dated and diverge noticeably from the frameworks in the latest iOS 15, and the newer official documentation does not provide a clear audio architecture diagram. So here we pick out and introduce the relevant frameworks based on the system APIs used in our demos:
Audio Unit Framework[3]: the lowest-level audio processing API. It is powerful, drives the underlying hardware directly, and provides fast, modular audio processing. Choose the Audio Unit APIs when you need low-latency audio processing (e.g. VoIP), responsive playback of synthesized sound (e.g. music games, synthesized instruments), specific audio capabilities (e.g. echo cancellation, mixing, equalization), or a processing chain that lets you assemble audio processing units flexibly.
Note that in the latest iOS system library layout, the implementation of the Audio Unit Framework has been migrated into the Audio Toolbox Framework.
The main modules of the Audio Unit framework are:
1) Audio Component Services[4]: defines the interfaces for finding, opening, and closing audio units and audio codecs. Commonly used data types: AudioComponent[5], AudioComponentDescription[6], AudioComponentInstance[7]. The typical flow is to use the find function (AudioComponentFindNext) to locate an audio component matching a description, then use the create function (AudioComponentInstanceNew) to instantiate it. Commonly used interfaces: AudioComponentFindNext(...)[8], AudioComponentGetDescription(...)[9], AudioComponentInstanceNew(...)[10], AudioComponentInstanceDispose(...)[11].
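The find-then-create flow described above can be sketched as follows. This is a minimal Objective-C sketch, using the Remote I/O unit as an example; error handling is omitted for brevity.

```objectivec
#import <AudioToolbox/AudioToolbox.h>

static AudioUnit CreateRemoteIOUnit(void) {
    // Describe the component we are looking for: the Remote I/O unit.
    AudioComponentDescription desc = {
        .componentType         = kAudioUnitType_Output,
        .componentSubType      = kAudioUnitSubType_RemoteIO,
        .componentManufacturer = kAudioUnitManufacturer_Apple,
        .componentFlags        = 0,
        .componentFlagsMask    = 0,
    };
    // Find the first registered component matching the description.
    AudioComponent component = AudioComponentFindNext(NULL, &desc);
    // Instantiate it; the instance is what all later calls operate on.
    AudioUnit unit = NULL;
    AudioComponentInstanceNew(component, &unit);
    return unit;
}
```

Passing NULL as the first argument to AudioComponentFindNext starts the search from the beginning; passing a previous result instead continues the search, which lets you enumerate all matching components.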
2) Audio Unit Component Services[12]: provides the C interfaces for working with audio units. An audio unit is a plug-in unit that processes or generates audio data; to find, open, and close audio units, use Audio Component Services. Commonly used data types: AudioUnit[13], AudioUnitParameter[14], AudioUnitProperty[15]. Note the definition `typedef AudioComponentInstance AudioUnit;` — an AudioUnit is simply a kind of AudioComponentInstance. Commonly used interfaces: AudioUnitInitialize(...)[16], AudioUnitUninitialize(...)[17], AudioUnitRender(...)[18], AudioUnitGetProperty(...)[19], AudioUnitSetProperty(...)[20], AudioUnitGetParameter(...)[21], AudioUnitSetParameter(...)[22]. Commonly used callbacks: AURenderCallback[23], AudioUnitPropertyListenerProc[24].
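A hedged sketch of how these interfaces fit together: registering a render callback on an I/O audio unit and then initializing it. `ioUnit` is assumed to be an AudioUnit instance created elsewhere (e.g. via AudioComponentInstanceNew), and the callback here just renders silence.

```objectivec
#import <AudioToolbox/AudioToolbox.h>
#import <string.h>

// An AURenderCallback: the system calls this each render cycle to pull audio.
static OSStatus MyRenderCallback(void *inRefCon,
                                 AudioUnitRenderActionFlags *ioActionFlags,
                                 const AudioTimeStamp *inTimeStamp,
                                 UInt32 inBusNumber,
                                 UInt32 inNumberFrames,
                                 AudioBufferList *ioData) {
    // Fill ioData with inNumberFrames frames of PCM; this sketch outputs silence.
    for (UInt32 i = 0; i < ioData->mNumberBuffers; i++) {
        memset(ioData->mBuffers[i].mData, 0, ioData->mBuffers[i].mDataByteSize);
    }
    return noErr;
}

static void SetupRenderCallback(AudioUnit ioUnit) {
    AURenderCallbackStruct callback = {
        .inputProc       = MyRenderCallback,
        .inputProcRefCon = NULL,
    };
    // Attach the callback to the input scope of output bus 0, then initialize.
    AudioUnitSetProperty(ioUnit,
                         kAudioUnitProperty_SetRenderCallback,
                         kAudioUnitScope_Input,
                         0 /* output bus */,
                         &callback,
                         sizeof(callback));
    AudioUnitInitialize(ioUnit);
}
```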
3) Output Audio Unit Services[25]: provides the C interfaces for starting and stopping I/O audio units (typically output units). Commonly used interfaces:
AudioOutputUnitStart(...)[26]: starts an I/O AudioUnit, which also starts the audio unit processing graph connected to it.
AudioOutputUnitStop(...)[27]: stops an I/O AudioUnit, which also stops the audio unit processing graph connected to it.
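The full I/O unit lifecycle, combining the calls from the three modules above, looks roughly like this (a sketch; `ioUnit` is assumed to have been created and initialized as shown earlier):

```objectivec
AudioOutputUnitStart(ioUnit);          // begins the I/O cycle; the render callback starts firing
// ... audio is running ...
AudioOutputUnitStop(ioUnit);           // stops the I/O cycle
AudioUnitUninitialize(ioUnit);         // releases the unit's internal rendering state
AudioComponentInstanceDispose(ioUnit); // frees the component instance
```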
Core Media Framework[28]: defines and wraps the media processing pipeline (including timing information) needed by higher-level media frameworks such as AVFoundation, along with the interfaces and data types it uses. With the Core Media interfaces and data types, you can efficiently process media sample data and manage sample queues. The main modules of the Core Media framework are:
1) Sample Processing[29]: media sample processing. Commonly used data types: CMSampleBuffer[30], CMBlockBuffer[37], CMFormatDescription[40], CMAudioFormatDescription[42], CMAttachment[45]. Commonly used interfaces: CMSampleBufferCreateReady(...)[31], CMSampleBufferCreate(...)[32], CMSampleBufferSetDataBufferFromAudioBufferList(...)[33], CMSampleBufferGetFormatDescription(...)[34], CMSampleBufferGetDataBuffer(...)[35], CMSampleBufferGetPresentationTimeStamp(...)[36], CMBlockBufferCreateWithMemoryBlock(...)[38], CMBlockBufferGetDataPointer(...)[39], CMFormatDescriptionCreate(...)[41], CMAudioFormatDescriptionCreate(...)[43], CMAudioFormatDescriptionGetStreamBasicDescription(...)[44]. Note the definition `typedef CMFormatDescriptionRef CMAudioFormatDescriptionRef;` — a CMAudioFormatDescription is simply a kind of CMFormatDescription. We should also mention a few data types from the CoreAudioTypes Framework here: AudioStreamBasicDescription[46], AudioBuffer[47], AudioBufferList[48], AudioTimeStamp[49].
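How these types connect in practice can be sketched as follows: an AudioStreamBasicDescription describes an LPCM layout, and CMAudioFormatDescriptionCreate wraps it for use with the CMSampleBuffer APIs. The 44.1 kHz stereo parameters here are illustrative.

```objectivec
#import <CoreMedia/CoreMedia.h>

static CMAudioFormatDescriptionRef CreatePCMFormatDescription(void) {
    // Describe 44.1 kHz, 16-bit, stereo, interleaved, signed-integer LPCM.
    AudioStreamBasicDescription asbd = {0};
    asbd.mSampleRate       = 44100;
    asbd.mFormatID         = kAudioFormatLinearPCM;
    asbd.mFormatFlags      = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
    asbd.mChannelsPerFrame = 2;
    asbd.mBitsPerChannel   = 16;
    asbd.mBytesPerFrame    = asbd.mChannelsPerFrame * asbd.mBitsPerChannel / 8;
    asbd.mFramesPerPacket  = 1;  // always 1 for LPCM
    asbd.mBytesPerPacket   = asbd.mBytesPerFrame * asbd.mFramesPerPacket;

    // Wrap the ASBD in a CMAudioFormatDescription for CMSampleBuffer use.
    CMAudioFormatDescriptionRef desc = NULL;
    CMAudioFormatDescriptionCreate(kCFAllocatorDefault, &asbd,
                                   0, NULL,  // no channel layout
                                   0, NULL,  // no magic cookie
                                   NULL,     // no extensions
                                   &desc);
    return desc;
}
```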
2) Time Representation[50]: representing timing information. Commonly used data types: CMTime[51], CMTimeRange[52], CMSampleTimingInfo[53].
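CMTime stores time as a rational value/timescale pair, which avoids floating-point drift when accumulating sample timestamps. A small sketch (the 1024-sample duration is illustrative, matching one AAC packet):

```objectivec
#import <CoreMedia/CoreMedia.h>

CMTime pts  = CMTimeMake(1024, 44100);  // 1024 samples at 44.1 kHz
CMTime next = CMTimeAdd(pts, pts);      // exact arithmetic: 2048/44100 s

CMSampleTimingInfo timing = {
    .duration              = CMTimeMake(1024, 44100),
    .presentationTimeStamp = pts,
    .decodeTimeStamp       = kCMTimeInvalid,  // audio has no decode/presentation reordering
};
```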
3) Queues[54]: data containers. Commonly used data types: CMSimpleQueue[55], CMBufferQueue[56], CMMemoryPool[57]. A CMSimpleQueue stores (void *) elements; an element cannot be NULL or 0, and if an element is a pointer to allocated memory, that memory's lifetime must be managed externally. It can serve as a queue of media samples (CMSampleBufferRef), but you must do your own locking.
Audio Toolbox Framework[58]: provides interfaces for audio recording, playback, stream parsing, encoding-format conversion, Audio Session management, and more. The main modules of the Audio Toolbox framework are:
1) Audio Units[59]: audio units, covering Audio Unit v3 Plug-Ins[60], Audio Components[61], the Audio Unit v2 (C) API[62], Audio Unit Properties[63], and Audio Unit Voice I/O[64]. For Audio Unit itself, see the Audio Unit Framework discussed above.
2) Playback and Recording[65]: audio playback and recording, including Audio Queue Services[66], Audio Services[67], and Music Player[68].
3) Audio Files and Formats[69]: audio files and formats, including Audio Format Services[70], Audio File Services[71], Extended Audio File Services[72], Audio File Stream Services[73], Audio File Components[74], and the Core Audio File Format[75].
4) Utilities[76]: other audio support, most notably Audio Converter Services[77]. Commonly used interfaces:
AudioConverterNew(...)[78]: creates a converter (codec) instance for the given input and output audio formats.
AudioConverterNewSpecific(...)[79]: creates a new audio converter (codec) instance using a specific codec.
AudioConverterReset(...)[80]: resets an audio converter (codec) instance and clears its buffers.
AudioConverterDispose(...)[81]: releases an audio converter (codec) instance.
AudioConverterGetProperty(...)[82]: gets a property of an audio converter (codec).
AudioConverterSetProperty(...)[83]: sets a property of an audio converter (codec).
AudioConverterConvertBuffer(...)[84]: only for the special case of converting audio from one LPCM format to another at the same sample rate. It does not work for most compressed formats.
AudioConverterFillComplexBuffer(...)[85]: converts (encodes) audio data supplied by a callback, supporting non-interleaved and packetized formats. This is the recommended interface in most cases, unless you are converting between LPCM formats.
AudioConverterComplexInputDataProc[86]: the callback that supplies input data to AudioConverterFillComplexBuffer(...).
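A hedged sketch of creating a PCM → AAC converter that prefers the software codec, with the input callback shape FillComplexBuffer expects. The names (`InputDataProc`, `CreateAACEncoder`) and the buffer-handoff details are illustrative, not taken from the original demos.

```objectivec
#import <AudioToolbox/AudioToolbox.h>

// AudioConverterComplexInputDataProc: called by the converter whenever it
// needs more input. A real implementation tracks how much PCM is available.
static OSStatus InputDataProc(AudioConverterRef converter,
                              UInt32 *ioNumberDataPackets,
                              AudioBufferList *ioData,
                              AudioStreamPacketDescription **outPacketDesc,
                              void *inUserData) {
    AudioBufferList *pcm = (AudioBufferList *)inUserData;
    ioData->mBuffers[0] = pcm->mBuffers[0];  // hand the pending PCM to the converter
    // Report how many input packets (frames, for LPCM) we supplied.
    // Here we assume the caller staged exactly *ioNumberDataPackets frames.
    return noErr;
}

static AudioConverterRef CreateAACEncoder(const AudioStreamBasicDescription *inFormat) {
    AudioStreamBasicDescription outFormat = {0};
    outFormat.mSampleRate       = inFormat->mSampleRate;
    outFormat.mFormatID         = kAudioFormatMPEG4AAC;
    outFormat.mChannelsPerFrame = inFormat->mChannelsPerFrame;
    outFormat.mFramesPerPacket  = 1024;  // AAC packs 1024 samples per packet

    // Explicitly request the Apple software AAC encoder.
    AudioClassDescription codec = {
        .mType         = kAudioEncoderComponentType,
        .mSubType      = kAudioFormatMPEG4AAC,
        .mManufacturer = kAppleSoftwareAudioCodecManufacturer,
    };
    AudioConverterRef converter = NULL;
    AudioConverterNewSpecific(inFormat, &outFormat, 1, &codec, &converter);
    return converter;
}
```

Encoding then proceeds by calling AudioConverterFillComplexBuffer in a loop, passing InputDataProc and a pointer to the staged PCM as the user-data argument.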
Audio Codec[87]: provides APIs for converting audio between encoding formats. Which encoding formats are supported depends on which codecs the system provides.
AVFoundation Framework[88] is a higher-level, object-oriented audio/video processing framework. It provides audio/video asset management, camera device management, audio/video processing, and system-level audio session management, and is very powerful. Its functionality can be divided into several modules.
Our earlier demos use some AVFoundation capabilities when implementing the muxer and demuxer and when configuring the AudioSession, so we introduce those here.
AVAssetWriter[89]: writes media data to a container file, playing the muxer role. Commonly used interfaces: canAddInput:[90], addInput:[91], startWriting[92], startSessionAtSourceTime:[93] (starts a writing session, after which you can append media samples through the corresponding AVAssetWriterInput; call it after startWriting and before appending any sample data), endSessionAtSourceTime:[94], finishWritingWithCompletionHandler:[95] (finishes writing; the completion handler is invoked automatically when writing ends), cancelWriting[96].
AVAssetWriterInput[97]: appends media samples to an AVAssetWriter. Commonly used interfaces: expectsMediaDataInRealTime[98], readyForMoreMediaData[99] (the state indicating whether the input can accept more media data), requestMediaDataWhenReadyOnQueue:usingBlock:[100], appendSampleBuffer:[101], markAsFinished[102].
AVAssetReader[103]: reads media samples from an asset, playing the demuxer role. Commonly used interfaces: canAddOutput:[104], addOutput:[105], startReading[106], cancelReading[107].
AVAssetReaderOutput[108]: an abstract class; use concrete subclasses such as AVAssetReaderTrackOutput[109] or AVAssetReaderVideoCompositionOutput. Commonly used interfaces: alwaysCopiesSampleData[110], copyNextSampleBuffer[111].
AVAudioSession[112]: manages system-level audio behavior. Commonly used interfaces: setCategory:withOptions:error:[113], setMode:error:[114], setActive:withOptions:error:[115].
The frameworks and APIs above largely cover the capabilities used in our earlier demos.
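The writer flow described above can be condensed into a short Objective-C sketch that muxes already-encoded audio samples into an M4A file. `fileURL` and `formatHint` (a CMAudioFormatDescriptionRef for the encoded stream) are assumed to exist; error handling is omitted.

```objectivec
#import <AVFoundation/AVFoundation.h>

AVAssetWriter *writer = [[AVAssetWriter alloc] initWithURL:fileURL
                                                  fileType:AVFileTypeAppleM4A
                                                     error:nil];
AVAssetWriterInput *input =
    [[AVAssetWriterInput alloc] initWithMediaType:AVMediaTypeAudio
                                   outputSettings:nil  // nil = pass-through; samples are already encoded
                                 sourceFormatHint:formatHint];
input.expectsMediaDataInRealTime = YES;

if ([writer canAddInput:input]) {
    [writer addInput:input];
}
[writer startWriting];
[writer startSessionAtSourceTime:kCMTimeZero];

// For each encoded CMSampleBufferRef `sample`, once input.readyForMoreMediaData is YES:
//     [input appendSampleBuffer:sample];

[input markAsFinished];
[writer finishWritingWithCompletionHandler:^{
    // Writing is complete here; check writer.status for success or failure.
}];
```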
[1] Audio Unit Hosting Guide for iOS: https://developer.apple.com/library/archive/documentation/MusicAudio/Conceptual/AudioUnitHostingGuide_iOS/AudioUnitHostingFundamentals/AudioUnitHostingFundamentals.html
[2] Core Audio Overview: https://developer.apple.com/library/archive/documentation/MusicAudio/Conceptual/CoreAudioOverview/CoreAudioEssentials/CoreAudioEssentials.html
[3] Audio Unit: https://developer.apple.com/documentation/audiounit?language=objc
[4] Audio Component Services: https://developer.apple.com/documentation/audiounit/audio_component_services?language=objc
[5] AudioComponent: https://developer.apple.com/documentation/audiotoolbox/audiocomponent?language=objc
[6] AudioComponentDescription: https://developer.apple.com/documentation/audiotoolbox/audiocomponentdescription?language=objc
[7] AudioComponentInstance: https://developer.apple.com/documentation/audiotoolbox/audiocomponentinstance?language=objc
[8] AudioComponentFindNext(...): https://developer.apple.com/documentation/audiotoolbox/1410445-audiocomponentfindnext?language=objc
[9] AudioComponentGetDescription(...): https://developer.apple.com/documentation/audiotoolbox/1410523-audiocomponentgetdescription?language=objc
[10] AudioComponentInstanceNew(...): https://developer.apple.com/documentation/audiotoolbox/1410465-audiocomponentinstancenew?language=objc
[11] AudioComponentInstanceDispose(...): https://developer.apple.com/documentation/audiotoolbox/1410508-audiocomponentinstancedispose
[12] Audio Unit Component Services: https://developer.apple.com/documentation/audiounit/audio_component_services?language=objc
[13] AudioUnit: https://developer.apple.com/documentation/audiotoolbox/audiounit?language=objc
[14] AudioUnitParameter: https://developer.apple.com/documentation/audiotoolbox/audiounitparameter?language=objc
[15] AudioUnitProperty: https://developer.apple.com/documentation/audiotoolbox/audiounitproperty?language=objc
[16] AudioUnitInitialize(...): https://developer.apple.com/documentation/audiotoolbox/1439851-audiounitinitialize?language=objc
[17] AudioUnitUninitialize(...): https://developer.apple.com/documentation/audiotoolbox/1438415-audiounituninitialize?language=objc
[18] AudioUnitRender(...): https://developer.apple.com/documentation/audiotoolbox/1438430-audiounitrender?language=objc
[19] AudioUnitGetProperty(...): https://developer.apple.com/documentation/audiotoolbox/1439840-audiounitgetproperty?language=objc
[20] AudioUnitSetProperty(...): https://developer.apple.com/documentation/audiotoolbox/1440371-audiounitsetproperty?language=objc
[21] AudioUnitGetParameter(...): https://developer.apple.com/documentation/audiotoolbox/1440055-audiounitgetparameter?language=objc
[22] AudioUnitSetParameter(...): https://developer.apple.com/documentation/audiotoolbox/1438454-audiounitsetparameter?language=objc
[23] AURenderCallback: https://developer.apple.com/documentation/audiotoolbox/aurendercallback?language=objc
[24] AudioUnitPropertyListenerProc: https://developer.apple.com/documentation/audiotoolbox/audiounitpropertylistenerproc?language=objc
[25] Output Audio Unit Services: https://developer.apple.com/documentation/audiounit/output_audio_unit_services?language=objc
[26] AudioOutputUnitStart(...): https://developer.apple.com/documentation/audiotoolbox/1439763-audiooutputunitstart?language=objc
[27] AudioOutputUnitStop(...): https://developer.apple.com/documentation/audiotoolbox/1440513-audiooutputunitstop?language=objc
[28] Core Media: https://developer.apple.com/documentation/coremedia?language=objc
[29] Sample Processing: https://developer.apple.com/documentation/coremedia?language=objc
[30] CMSampleBuffer: https://developer.apple.com/documentation/coremedia/cmsamplebuffer-u71?language=objc
[31] CMSampleBufferCreateReady(...): https://developer.apple.com/documentation/coremedia/1489513-cmsamplebuffercreateready?language=objc
[32] CMSampleBufferCreate(...): https://developer.apple.com/documentation/coremedia/1489723-cmsamplebuffercreate?language=objc
[33] CMSampleBufferSetDataBufferFromAudioBufferList(...): https://developer.apple.com/documentation/coremedia/1489725-cmsamplebuffersetdatabufferfroma?language=objc
[34] CMSampleBufferGetFormatDescription(...): https://developer.apple.com/documentation/coremedia/1489185-cmsamplebuffergetformatdescripti?language=objc
[35] CMSampleBufferGetDataBuffer(...): https://developer.apple.com/documentation/coremedia/1489629-cmsamplebuffergetdatabuffer?language=objc
[36] CMSampleBufferGetPresentationTimeStamp(...): https://developer.apple.com/documentation/coremedia/1489252-cmsamplebuffergetpresentationtim?language=objc
[37] CMBlockBuffer: https://developer.apple.com/documentation/coremedia/cmblockbuffer-u9i?language=objc
[38] CMBlockBufferCreateWithMemoryBlock(...): https://developer.apple.com/documentation/coremedia/1489501-cmblockbuffercreatewithmemoryblo?language=objc
[39] CMBlockBufferGetDataPointer(...): https://developer.apple.com/documentation/coremedia/1489264-cmblockbuffergetdatapointer?language=objc
[40] CMFormatDescription: https://developer.apple.com/documentation/coremedia/cmformatdescription-u8g?language=objc
[41] CMFormatDescriptionCreate(...): https://developer.apple.com/documentation/coremedia/1489182-cmformatdescriptioncreate?language=objc
[42] CMAudioFormatDescription: https://developer.apple.com/documentation/coremedia/cmaudioformatdescription?language=objc
[43] CMAudioFormatDescriptionCreate(...): https://developer.apple.com/documentation/coremedia/1489522-cmaudioformatdescriptioncreate?language=objc
[44] CMAudioFormatDescriptionGetStreamBasicDescription(...): https://developer.apple.com/documentation/coremedia/1489226-cmaudioformatdescriptiongetstrea?language=objc
[45] CMAttachment: https://developer.apple.com/documentation/coremedia/cmattachment?language=objc
[46] AudioStreamBasicDescription: https://developer.apple.com/documentation/coreaudiotypes/audiostreambasicdescription?language=objc
[47] AudioBuffer: https://developer.apple.com/documentation/coreaudiotypes/audiobuffer?language=objc
[48] AudioBufferList: https://developer.apple.com/documentation/coreaudiotypes/audiobufferlist?language=objc
[49] AudioTimeStamp: https://developer.apple.com/documentation/coreaudiotypes/audiotimestamp?language=objc
[50] Time Representation: https://developer.apple.com/documentation/coremedia?language=objc
[51] CMTime: https://developer.apple.com/documentation/coremedia/cmtime-u58?language=objc
[52] CMTimeRange: https://developer.apple.com/documentation/coremedia/cmtimerange-qts?language=objc
[53] CMSampleTimingInfo: https://developer.apple.com/documentation/coremedia/cmsampletiminginfo?language=objc
[54] Queues: https://developer.apple.com/documentation/coremedia?language=objc
[55] CMSimpleQueue: https://developer.apple.com/documentation/coremedia/cmsimplequeue?language=objc
[56] CMBufferQueue: https://developer.apple.com/documentation/coremedia/cmbufferqueue?language=objc
[57] CMMemoryPool: https://developer.apple.com/documentation/coremedia/cmmemorypool-u89?language=objc
[58] Audio Toolbox: https://developer.apple.com/documentation/audiotoolbox?language=objc
[59] Audio Units: https://developer.apple.com/documentation/audiotoolbox?language=objc
[60] Audio Unit v3 Plug-Ins: https://developer.apple.com/documentation/audiotoolbox/audio_unit_v3_plug-ins?language=objc
[61] Audio Components: https://developer.apple.com/documentation/audiotoolbox/audio_components?language=objc
[62] Audio Unit v2 (C) API: https://developer.apple.com/documentation/audiotoolbox/audio_unit_v2_c_api?language=objc
[63] Audio Unit Properties: https://developer.apple.com/documentation/audiotoolbox/audio_unit_properties?language=objc
[64] Audio Unit Voice I/O: https://developer.apple.com/documentation/audiotoolbox/audio_unit_voice_i_o?language=objc
[65] Playback and Recording: https://developer.apple.com/documentation/audiotoolbox?language=objc
[66] Audio Queue Services: https://developer.apple.com/documentation/audiotoolbox/audio_queue_services?language=objc
[67] Audio Services: https://developer.apple.com/documentation/audiotoolbox/audio_services?language=objc
[68] Music Player: https://developer.apple.com/documentation/audiotoolbox/music_player?language=objc
[69] Audio Files and Formats: https://developer.apple.com/documentation/audiotoolbox?language=objc
[70] Audio Format Services: https://developer.apple.com/documentation/audiotoolbox/audio_format_services?language=objc
[71] Audio File Services: https://developer.apple.com/documentation/audiotoolbox/audio_file_services?language=objc
[72] Extended Audio File Services: https://developer.apple.com/documentation/audiotoolbox/extended_audio_file_services?language=objc
[73] Audio File Stream Services: https://developer.apple.com/documentation/audiotoolbox/audio_file_stream_services?language=objc
[74] Audio File Components: https://developer.apple.com/documentation/audiotoolbox/audio_file_components?language=objc
[75] Core Audio File Format: https://developer.apple.com/documentation/audiotoolbox/core_audio_file_format?language=objc
[76] Utilities: https://developer.apple.com/documentation/audiotoolbox?language=objc
[77] Audio Converter Services: https://developer.apple.com/documentation/audiotoolbox/audio_converter_services?language=objc
[78] AudioConverterNew(...): https://developer.apple.com/documentation/audiotoolbox/1502936-audioconverternew?language=objc
[79] AudioConverterNewSpecific(...): https://developer.apple.com/documentation/audiotoolbox/1503356-audioconverternewspecific?language=objc
[80] AudioConverterReset(...): https://developer.apple.com/documentation/audiotoolbox/1503102-audioconverterreset?language=objc
[81] AudioConverterDispose(...): https://developer.apple.com/documentation/audiotoolbox/1502671-audioconverterdispose?language=objc
[82] AudioConverterGetProperty(...): https://developer.apple.com/documentation/audiotoolbox/1502731-audioconvertergetproperty?language=objc
[83] AudioConverterSetProperty(...): https://developer.apple.com/documentation/audiotoolbox/1501675-audioconvertersetproperty?language=objc
[84] AudioConverterConvertBuffer(...): https://developer.apple.com/documentation/audiotoolbox/1503345-audioconverterconvertbuffer?language=objc
[85] AudioConverterFillComplexBuffer(...): https://developer.apple.com/documentation/audiotoolbox/1503098-audioconverterfillcomplexbuffer?language=objc
[86] AudioConverterComplexInputDataProc: https://developer.apple.com/documentation/audiotoolbox/audioconvertercomplexinputdataproc?language=objc
[87] Audio Codec: https://developer.apple.com/documentation/audiotoolbox/audio_codec?language=objc
[88] AVFoundation Framework: https://developer.apple.com/documentation/avfoundation?language=objc
[89] AVAssetWriter: https://developer.apple.com/documentation/avfoundation/avassetwriter?language=objc
[90] canAddInput:: https://developer.apple.com/documentation/avfoundation/avassetwriter/1387863-canaddinput?language=objc
[91] addInput:: https://developer.apple.com/documentation/avfoundation/avassetwriter/1390389-addinput?language=objc
[92] startWriting: https://developer.apple.com/documentation/avfoundation/avassetwriter/1386724-startwriting?language=objc
[93] startSession(atSourceTime:): https://developer.apple.com/documentation/avfoundation/avassetwriter/1389908-startsessionatsourcetime?language=objc
[94] endSessionAtSourceTime:: https://developer.apple.com/documentation/avfoundation/avassetwriter/1389921-endsessionatsourcetime?language=objc
[95] finishWritingWithCompletionHandler:: https://developer.apple.com/documentation/avfoundation/avassetwriter/1390432-finishwriting?language=objc
[96] cancelWriting: https://developer.apple.com/documentation/avfoundation/avassetwriter/1387234-cancelwriting?language=objc
[97] AVAssetWriterInput: https://developer.apple.com/documentation/avfoundation/avassetwriterinput?language=objc
[98] expectsMediaDataInRealTime: https://developer.apple.com/documentation/avfoundation/avassetwriterinput/1387827-expectsmediadatainrealtime?language=objc
[99] readyForMoreMediaData: https://developer.apple.com/documentation/avfoundation/avassetwriterinput/1389084-readyformoremediadata?language=objc
[100] requestMediaDataWhenReadyOnQueue:usingBlock:: https://developer.apple.com/documentation/avfoundation/avassetwriterinput?language=objc
[101] appendSampleBuffer:: https://developer.apple.com/documentation/avfoundation/avassetwriterinput/1389566-appendsamplebuffer?language=objc
[102] markAsFinished: https://developer.apple.com/documentation/avfoundation/avassetwriterinput/1390122-markasfinished?language=objc
[103] AVAssetReader: https://developer.apple.com/documentation/avfoundation/avassetreader?language=objc
[104] canAddOutput:: https://developer.apple.com/documentation/avfoundation/avassetreader/1387485-canaddoutput?language=objc
[105] addOutput:: https://developer.apple.com/documentation/avfoundation/avassetreader/1390110-addoutput?language=objc
[106] startReading: https://developer.apple.com/documentation/avfoundation/avassetreader/1390286-startreading?language=objc
[107] cancelReading: https://developer.apple.com/documentation/avfoundation/avassetreader/1390258-cancelreading?language=objc
[108] AVAssetReaderOutput: https://developer.apple.com/documentation/avfoundation/avassetreaderoutput?language=objc
[109] AVAssetReaderTrackOutput: https://developer.apple.com/documentation/avfoundation/avassetreadertrackoutput?language=objc
[110] alwaysCopiesSampleData: https://developer.apple.com/documentation/avfoundation/avassetreaderoutput/1389189-alwayscopiessampledata?language=objc
[111] copyNextSampleBuffer: https://developer.apple.com/documentation/avfoundation/avassetreaderoutput/1385732-copynextsamplebuffer?language=objc
[112] AVAudioSession: https://developer.apple.com/documentation/avfaudio/avaudiosession?language=objc
[113] setCategory:withOptions:error:: https://developer.apple.com/documentation/avfaudio/avaudiosession/1616442-setcategory?language=objc
[114] setMode:error:: https://developer.apple.com/documentation/avfaudio/avaudiosession/1616614-setmode?language=objc
[115] setActive:withOptions:error:: https://developer.apple.com/documentation/avfaudio/avaudiosession?language=objc