Audio manipulation in iOS can be daunting. The documentation for low-level Core Audio doesn't always seem to be up to par. There are, of course, frameworks out there that can make dealing with low-level Core Audio a little less painful, but they may not fit your needs, or they may be excessive for what you're doing. So let's look at something simpler that is still quite powerful: AVAudioEngine.

AVAudioEngine is not a new API, but it's one that doesn't seem to have gained a whole lot of exposure. Apple introduced it at WWDC 2014 as part of AVFoundation, the framework used for playing, recording, and editing media. AVAudioEngine simplifies low-latency, real-time audio. It can't do everything Core Audio can, but for playing, recording, mixing, effects, and even working with MIDI and samplers, it's quite powerful.

The AVAudioEngine class manages a graph of audio nodes. The engine connects the nodes into active chains; the graph can be reconfigured, and nodes created and attached, as needed. There are three types of nodes: source, processing, and destination. These are pretty self-explanatory, but for the sake of clarity: source nodes are things like the player and the microphone, processing nodes are mixers and effects, and destination nodes are the speaker and headphones. The engine creates three nodes implicitly: an input node, an output node, and the main mixer node. Originally each node could only have a single output, but thanks to

AVAudioConnectionPoint(node: AVAudioNode, bus: AVAudioNodeBus),

multiple outputs are now possible. Connections between nodes are made via busses, each of which has an audio format. One of the gotchas with AVAudioEngine is these formats: if the formats of two connected nodes are not compatible, you will most likely end up with a broken graph. This can be quite frustrating, as the error messages are not very helpful in some cases. If you need to change formats, make sure you use a mixer node to avoid any problems.
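As a minimal sketch of that mixer trick (the node names here are hypothetical, and the sample rates are just illustrative), a mixer node can sit between two connections with different formats and perform the conversion:

```swift
import AVFoundation

// Hypothetical sketch: bridging a 44.1 kHz source into the hardware's
// output format. Connecting mismatched formats directly can break the
// graph, so a mixer node sits in between and converts.
let engine = AVAudioEngine()
let player = AVAudioPlayerNode()
let converterMixer = AVAudioMixerNode()

engine.attach(player)
engine.attach(converterMixer)

let fileFormat = AVAudioFormat(standardFormatWithSampleRate: 44100, channels: 2)
let outputFormat = engine.outputNode.outputFormat(forBus: 0)

// Player -> mixer in the source format; mixer -> main mixer in the
// hardware format. The mixer handles the sample rate conversion.
engine.connect(player, to: converterMixer, format: fileFormat)
engine.connect(converterMixer, to: engine.mainMixerNode, format: outputFormat)
```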

So let's see what it looks like in action. Your basic steps are:

1) Set up an audio session
 do {
     // Here recordingSession is just a shared instance of AVAudioSession

     // There are several options here - choose what best suits your needs
     try recordingSession.setCategory(AVAudioSessionCategoryPlayAndRecord, with: [.mixWithOthers, .defaultToSpeaker])
     try recordingSession.setActive(true)

     // I suggest adding notifications here for route and configuration changes
 } catch {
     // Handle the error
 }

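The notification suggestion above can be sketched like this. `handleRouteChange(_:)` and `handleConfigChange(_:)` are hypothetical selector names for your own handler methods; the notification names match the same era of the AVAudioSession API used in the session setup:

```swift
// Sketch: observing route and engine configuration changes.
// The selector methods are placeholders - implement them to pause,
// rewire, or restart the engine as needed.
NotificationCenter.default.addObserver(self,
                                       selector: #selector(handleRouteChange(_:)),
                                       name: .AVAudioSessionRouteChange,
                                       object: recordingSession)

NotificationCenter.default.addObserver(self,
                                       selector: #selector(handleConfigChange(_:)),
                                       name: .AVAudioEngineConfigurationChange,
                                       object: audioEngine)
```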
2) Create the engine
  let audioEngine = AVAudioEngine()

3) Create the nodes
 let audioMixer = AVAudioMixerNode()
 let micMixer = AVAudioMixerNode()
 let reverb = AVAudioUnitReverb()
 let echo = AVAudioUnitDelay()
 let audioPlayerNode = AVAudioPlayerNode()
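The effect units won't do anything audible until they're configured. A sketch, with illustrative (not prescriptive) preset and mix values:

```swift
// Sketch: configuring the reverb and delay units created above.
reverb.loadFactoryPreset(.largeHall)
reverb.wetDryMix = 50        // 0 = fully dry, 100 = fully wet

echo.delayTime = 0.3         // delay in seconds
echo.feedback = 40           // percent of output fed back into the delay
echo.wetDryMix = 35
```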

4) Attach the nodes
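Each node created in step 3 has to be attached to the engine before it can be connected. A sketch using the node names from step 3:

```swift
// Nodes must be attached to the engine before making connections.
// (The engine's input, output, and main mixer nodes come pre-attached.)
audioEngine.attach(audioPlayerNode)
audioEngine.attach(audioMixer)
audioEngine.attach(micMixer)
audioEngine.attach(reverb)
audioEngine.attach(echo)
```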

5) Connect the nodes
 // Sound effect connections

 audioEngine.connect(audioMixer, to: audioEngine.mainMixerNode, format: audioFormat)
 audioEngine.connect(echo, to: audioMixer, fromBus: 0, toBus: 0, format: audioFormat)
 audioEngine.connect(reverb, to: echo, fromBus: 0, toBus: 0, format: audioFormat)
 audioEngine.connect(micMixer, to: reverb, format: audioFormat)

 // Here we're making multiple output connections from the player node 1) to the main mixer and 2) to another mixer node we're using for adding effects.

    let playerConnectionPoints = [
        AVAudioConnectionPoint(node: audioEngine.mainMixerNode, bus: 0),
        AVAudioConnectionPoint(node: audioMixer, bus: 1)
    ]

    audioEngine.connect(audioPlayerNode, to: playerConnectionPoints, fromBus: 0, format: audioFormat)

   // Finally making the connection for the mic input

   guard let micInput = audioEngine.inputNode else { return }

   let micFormat = micInput.inputFormat(forBus: 0)
   audioEngine.connect(micInput, to: micMixer, format: micFormat)

6) Prepare the files for playing/recording
 do {
     // Here trackURL is our audio track
     if let trackURL = trackURL {
         audioPlayerFile = try AVAudioFile(forReading: trackURL)
     }
 } catch {
     // Handle the error
 }

 // Schedule track audio immediately if read is successful

 guard let audioPlayerFile = audioPlayerFile else { return }

 audioPlayerNode.scheduleFile(audioPlayerFile, at: nil, completionHandler: nil)

 audioURL = URL(fileURLWithPath: <YOUR_OUTPUT_URL>)

 if let audioURL = audioURL {
     do {
         self.recordedOutputFile = try AVAudioFile(forWriting: audioURL, settings: audioMixer.outputFormat(forBus: 0).settings)
     } catch {
         // Handle the error
     }
 }

7) Start the engine
 // Prepare and start audioEngine
 audioEngine.prepare()
 do {
     try audioEngine.start()
 } catch {
     // Handle the error
 }

8) Add a tap to record the output
 guard let recordedOutputFile = recordedOutputFile,
       let audioPlayerFile = audioPlayerFile else { return }

 let tapFormat = audioMixer.outputFormat(forBus: 0)

 // The data is collected from the buffer using a block
 audioMixer.installTap(onBus: 0, bufferSize: AVAudioFrameCount(recordedOutputFile.length), format: tapFormat) { buffer, _ in

     do {
         if recordedOutputFile.length < audioPlayerFile.length {
             try self.recordedOutputFile?.write(from: buffer)
         } else {
             self.audioMixer.removeTap(onBus: 0)
             // Handle your UI changes here
         }
     } catch {
         // Handle the error
     }
 }
That's it - you're playing, processing, and recording! There is a whole lot more that can be done with AVAudioEngine, and I suggest diving in. The documentation isn't always the greatest, and trial and error is sometimes necessary, but it's a lot friendlier to use than low-level Core Audio!


Topics: Apple / iOS

Written by Christina McIntyre

Christina McIntyre, or Chris to those who really know her, joined Metova in 2015. She enjoys spending time with her husband and 3 kids. In her down time, she enjoys hunting (especially archery) and being involved in her church (particularly the bus ministry.)
