If you have come across the term ‘3D sound’, you may be wondering what it actually is.
Intuitively, we may understand it by analogy to 3D space, which as we know is made up of three dimensions: width, depth, and height. In real life, any position in space is characterized by these dimensions.
Now imagine that you are outside, for example in a park full of various sounds. You can hear children laughing on the playground behind you, a dog barking on your left, and a couple of people talking on a bench in front of you; you hear their voices more and more clearly as you move towards them. We may say that you are in a three-dimensional sound space. When you move, the sound you hear changes as well (its intensity, direction, height and even timbre), according to the position of your ears.
Reproducing this natural way of hearing in a recording isn’t easy. However, we already have technology that allows us to do it: binaural and Ambisonics sound.
The most common type of recording, which you hear while listening to music on your computer or watching TV, is stereo. You may also come across mono recordings, but these have largely been superseded since the introduction of stereo mixing. Stereo sound basically only lets you hear whether a sound comes from the left or the right, which is its main advantage over mono audio.
Below, you will find some great recordings that will allow you to hear the difference between these formats. We have also summarized their characteristics in bullet points to present the information as clearly as possible.
by Pedro Firmino
This tutorial is based on the solution developed by professor Angelo Farina for preparing a 360 video with 3rd Order audio (source http://www.angelofarina.it/Ambix+HL.htm).
In this adaptation, we will show you how to create a 360 video with 3rd Order Ambisonics audio using:
This tutorial consists of 2 parts:
A: Preparing the 360 content with 16 channels
B: Injecting metadata using the version of Spatial Media Metadata Injector modified by Angelo Farina.
At the moment, the HOAST library ( https://hoast.iem.at/ ) is the only platform that allows online video playback of 3rd Order Ambisonics, so the content created in this tutorial is meant to be watched locally using VLC player.
For this tutorial, basic Python knowledge is advised.
For preparing a 360 video with 1st order Ambisonics, visit the link:
1. As usual, start by recording your 360 video with the ZYLIA ZM-1 microphone and remember to have the front of the ZM-1 aligned with the front of the 360 camera.
2. After recording, import the 360 video and the 19-channel audio file into Reaper.
Synchronize the audio and video.
3. On the ZM-1 audio track, insert ZYLIA Ambisonics Converter and select 3rd Order Ambisonics. This will convert your 19-channel track into 16 channels (3rd Order Ambisonics).
4. On the Master track, click on the Route button and, under track channels, select 16. The Master track now receives all 16 channels from the audio track.
5. Once the video is ready for exporting, click File – Render.
As for the settings:
Sample rate: 48000
Channels: 16 (click on the space and manually type 16)
Output format: Video (ffmpeg/libav encoder)
Size: 3840 x 1920 (or Get width/height/framerate from current video item)
Video Codec: H.264
Audio Codec: 24 bit PCM
Render the video.
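As a quick sanity check on the numbers in the render settings, the sketch below computes how many channels a full-sphere Ambisonics signal of a given order needs, using the standard (order + 1)² rule, and the resulting uncompressed PCM data rate. The function names are illustrative only and are not part of Reaper or any ZYLIA tool.

```python
def ambisonic_channels(order: int) -> int:
    """Number of channels in a full-sphere Ambisonics signal of the given order."""
    if order < 0:
        raise ValueError("Ambisonics order must be non-negative")
    return (order + 1) ** 2


def pcm_bytes_per_second(channels: int, sample_rate: int, bit_depth: int) -> int:
    """Uncompressed PCM data rate of the audio stream, in bytes per second."""
    return channels * sample_rate * bit_depth // 8


if __name__ == "__main__":
    for order in (1, 2, 3):
        print(f"Order {order}: {ambisonic_channels(order)} channels")
    # Render settings from this tutorial: 16 channels, 48000 Hz, 24-bit PCM.
    print(pcm_bytes_per_second(ambisonic_channels(3), 48000, 24), "bytes/s")
```

This is why the Channels field is set to 16 for 3rd Order Ambisonics, while a 1st order file only needs 4 channels.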
After having the 360 video with 16 channels, it is necessary to inject metadata for Spatial Audio.
To do this, Python is required. Python is preinstalled in macOS, but you can also download Python 2.7 here: https://www.python.org/download/releases/2.7/
Afterward, download Angelo Farina’s modified version of Spatial Media Metadata Injector, located at:
The next part:
1. With the downloaded file located on your Desktop, run the macOS Terminal application.
2. Using the “cd” command, go to the folder where you have Spatial Media Injector (e.g. “cd ~/Desktop/spatial-media-2/”).
3. Run the Python script with “sudo python setup.py install” and type your password.
4. After the build is complete, type the command “cd build/lib/spatialmedia”.
5. Enter “python gui.py” and the application should run.
With the Spatial Media Metadata Injector opened, simply open the created 360 video file, and check the boxes for the 360 format and spatial audio. Inject metadata and your video will be ready for playback using 3rd Order Ambisonics audio.
In this tutorial we describe the process of converting 360 video and 3rd order Ambisonics to 2D video with binaural audio with linked rotation parameters.
This allows us to prepare a standard 2D video while keeping the focus on the action from the video and audio perspective.
It also allows us to control the video and audio rotation in real time using a single controller.
Reaper DAW was used to create automated rotation of 360 audio and video.
Audio recorded with ZYLIA ZM-1 microphone array.
Below you will find our video and text tutorial which demonstrate the setup process.
Thank you Red Bull Media House for providing us with the Ambisonics audio and 360 video for this project.
Ambisonics audio and 360 video is Copyrighted by Red Bull Media House Chief Innovation Office and Projekt Spielberg, contact: cino (@) redbull.com
Created by Zylia Inc. / sp. z o.o. https://www.zylia.co
Requirements for this tutorial:
We will use Reaper as a DAW and video editor, as it supports video and multichannel audio from the ZM-1 microphone.
Before recording the 360 video with the ZM-1 microphone, make sure that the front of the camera points in the same direction as the front of the ZM-1 (the red dot on the equator marks the front of the microphone). This prevents problems later and tells you in which direction to rotate the audio and video.
Step 1 - Add your 360 video to a Reaper session.
The video file format may be .mov .mp4 .avi or other.
From our experience, we recommend working on a compressed version of the video and replacing this media file later for rendering (step 14).
To open the Video window click on View – VIDEO or press Control + Shift + V to show the video.
Step 2 - Add the multichannel track recorded with the ZM-1 and sync the Video with the ZM-1 Audio track.
Import the 19 channel file from your ZM-1 and sync it with the video file.
Step 3 – Disable or lower the volume of the Audio track from the video file.
Since we will not use the audio from the video track, we need to remove it or set its volume to the minimum value.
To do so, right click on the Video track – Item properties – move the volume slider to the minimum.
Step 4 – Merge video and audio on the same track.
Select both the video and audio track and right click – Take – implode items across tracks into takes
This will merge video and audio to the same track but as different takes.
Step 5 – Show both takes.
To show both takes, click on Options – Show all takes in lanes (when room) or press Ctrl + L
Step 6 – Change the number of channels to 20.
Click on the Route button and change the number of track channels from 2 to 20. This is required to use all 19 channels of the ZM-1.
Step 7 - Play both takes simultaneously.
If we press play right now, only the selected take will play. To play both takes simultaneously:
Right click on the track – Item settings – Play all takes.
Step 8 – Change 360 video to standard video.
Next, we need to convert the 360 video to a standard, flattened view so that we can visualize and control the rotation of the camera.
To do so, open the FX window on our main track and search for Video processor.
In the preset selection, choose Equirectangular/spherical 360 panner. This will flatten your 360 video and let you control camera parameters such as field of view, yaw, pitch and roll.
Step 9 – Add the ZYLIA Ambisonics Converter plugin and the IEM Binaural Decoder as FX.
On the FX window add as well:
You should now have the binaural audio which you can test by changing the rotation and elevation parameters in ZYLIA Ambisonics Converter plugin.
Step 10 – Link the rotation of both audio and video.
The next steps will be dedicated to linking the Rotation of the ZYLIA Ambisonics Converter and the YAW parameter from the Video Processor.
On the main track, click on the Track Envelopes/Automation button and enable the UI for the YAW (in Equirectangular/spherical 360 panner) and Rotation (in ZYLIA Ambisonics Converter plugin).
Step 11 – Control Video yaw with the ZYLIA Ambisonics Converter plugin.
On the same window, on the YAW parameters click on Mod… (Parameter Modulation/Link for YAW) and check the box Link from MIDI or FX parameter.
Select ZYLIA Ambisonics plugin: Rotation
Step 12 – Align the position of the audio and video using the Offset control.
On the Parameter Modulation window you are able to fine-tune the rotation of the audio with the video.
Here we changed the ZYLIA Ambisonics plugin Rotation Offset to -50 % so that the front of the video matches the front of the ZM-1 microphone.
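To illustrate what linking the two parameters achieves, here is a minimal sketch of a yaw rotation applied to a single first-order Ambisonics sample, together with a video yaw derived from the same angle plus a fixed offset. It is a simplification under stated assumptions: it uses first order only (the plugin itself works up to 3rd order), it assumes the AmbiX channel order W, Y, Z, X with positive angles turning the sound field to the left, and both function names are hypothetical. It does not reproduce how the ZYLIA plugin or Reaper implement rotation internally.

```python
import math


def rotate_foa_yaw(w, y, z, x, yaw_deg):
    """Rotate one first-order sample (assumed AmbiX order W, Y, Z, X) about the vertical axis."""
    theta = math.radians(yaw_deg)
    x_rot = x * math.cos(theta) - y * math.sin(theta)
    y_rot = x * math.sin(theta) + y * math.cos(theta)
    # W (omnidirectional) and Z (height) are unaffected by a pure yaw rotation.
    return w, y_rot, z, x_rot


def linked_video_yaw(audio_rotation_deg, offset_deg=0.0):
    """Hypothetical helper: video yaw that follows the audio rotation plus a fixed
    alignment offset, wrapped to the range [-180, 180)."""
    return (audio_rotation_deg + offset_deg + 180.0) % 360.0 - 180.0


if __name__ == "__main__":
    # A source straight ahead (W=1, Y=0, Z=0, X=1) rotated by 90 degrees ends up on the Y axis.
    print(rotate_foa_yaw(1.0, 0.0, 0.0, 1.0, 90.0))
    print(linked_video_yaw(170.0, 30.0))
```

The offset plays the same role as the Offset control in the Parameter Modulation window: it shifts one rotation relative to the other without breaking the link between them.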
Step 13 – Change the Envelope mode to Write.
To record the automation of this rotation effect, right-click on the Rotation parameter and select Envelope to make the envelope visible.
Then right-click on the Rotation envelope Arm button (green button) and change the mode to Write.
By pressing play you will record the automation of video and audio rotation in real time.
Step 14 – Prepare for Rendering
After writing the automation, change the envelope to Read mode instead of Write mode.
Disable the parameter modulation from the YAW control:
Right click on Yaw and uncheck “Link from MIDI or FX parameter”
OPTIONAL: Replace your video file with the uncompressed version.
If you have been working with a compressed video file, this is the time to replace it with the original media file. To do this, right click on the video track and select item properties.
Scroll to the next page and click Choose new file.
Then select your original uncompressed video file.
Step 15 – Render!
You should now have your project ready for Rendering.
Click on File – Render and set Channels to Stereo.
On the Output format choose your preferred Video format.
We exported our clip as a .mov file with H.264 as the video codec and 24-bit PCM as the audio codec.
Thank you for reading and don’t hesitate to contact us with any feedback, questions or your results from following this guide.
To support our customers and their workflow we have prepared several presets of ZYLIA Studio PRO for Dolby Atmos.
Simply download the zip package, extract files and import the appropriate surround preset into your Reaper session.
What is Dolby Atmos?
Wikipedia - "Dolby Atmos is a surround sound technology developed by Dolby Laboratories. It expands on existing surround sound systems by adding height channels, allowing sounds to be interpreted as three-dimensional objects."
by Eduardo Patricio
In general, VR-related workflows can be complex, and everyone seems to be looking for standard solutions. Here, we will show you, step by step, how to prepare a 360 video with spatial audio in possibly the shortest way!
After following steps A, B and C, you’ll have a video file with 1st order Ambisonics spatial audio that can be played on your computer with compatible video players (e.g. VLC) or uploaded to YouTube.
OK, let’s have a close look at each step.
Having said that, a small horizontal offset is not the end of the world.
With the gear in place, start recording both audio and video and clap in between the mic and the camera. The clap sound spike can be used to sync the footage later.
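The clap-based sync can also be estimated numerically. The sketch below is one simple approach, not the method used in the video: it takes the loudest sample in each recording as the clap and returns the time difference between the two. It assumes the clap is the loudest event in both recordings and that the samples are already loaded as plain lists of numbers; the function names are illustrative.

```python
def clap_index(samples):
    """Index of the sample with the largest absolute amplitude (assumed to be the clap)."""
    return max(range(len(samples)), key=lambda i: abs(samples[i]))


def sync_offset_seconds(mic_samples, mic_rate, cam_samples, cam_rate):
    """Seconds by which the microphone audio must be delayed so that its clap
    lines up with the clap in the camera audio (negative means move it earlier)."""
    mic_time = clap_index(mic_samples) / mic_rate
    cam_time = clap_index(cam_samples) / cam_rate
    return cam_time - mic_time


if __name__ == "__main__":
    mic = [0.0, 0.0, 0.9, 0.1]            # toy data: clap at sample 2
    cam = [0.0, 0.0, 0.0, 0.0, 0.0, -1.0]  # toy data: clap at sample 5
    print(sync_offset_seconds(mic, 10, cam, 10))
```

In practice you would still confirm the alignment by eye and ear in the editor, since a loud noise other than the clap would mislead this estimate.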
Here’s a video showing all the sub-steps in Reaper:
If you need to check how the recording sounds, add a binaural decoder plugin (e.g. IEM Binaural decoder) to the audio track, after ZYLIA Ambisonics Converter.
Now you can enjoy the spatial audio
*Software tools used
We are happy to announce the new release of ZYLIA Ambisonics Converter plugin v1.4.0.
New features and improvements:
What would happen if, on a rainy and cloudy day, during a walk along a forest path, you could move to a completely different place thousands of kilometers away? Putting the goggles on would take you into a virtual reality world. You would find yourself on a sunny island in the Pacific Ocean, on the beach, admiring the scenery and walking among the palm trees, listening to the sound of waves and colorful parrots screeching over your head.
It sounds unrealistic, but such goals are set by the latest trends in the development of Augmented / Virtual Reality technology (AR / VR). Technology and content for full VR, or 6DoF (6 Degrees-of-Freedom), rendered in real time will give the user the opportunity to interact with and navigate through virtual worlds. To experience the feeling of "full immersion" in the virtual world, a high-quality image must be matched by realistic sound. The sound can reliably reflect both the environment and the way the user interacts with it only if each individual sound source present in the virtual audio landscape is provided to the user as a single object signal.
What are Six Degrees of Freedom (6DoF)?
"Six degrees of freedom" is a specific parameter count for the number of degrees of freedom an object has in three-dimensional space, such as the real world. It means that there are six parameters or ways that the object can move.
There are many possible uses for 6DoF VR technology. You can imagine exploring a movie set at your own pace. You could stroll between the actors, look at the action from different sides, listen to any conversation and pay attention only to what interests you. Such technology would provide truly unique experiences.
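The six parameters behind the term are commonly taken to be three translations and three rotations. A minimal sketch of such a listener pose is shown below; the class name, field names and units are illustrative assumptions and do not come from any particular VR engine.

```python
from dataclasses import dataclass, astuple


@dataclass
class Pose6DoF:
    """Listener pose: three translations (meters) and three rotations (degrees)."""
    x: float = 0.0      # left/right
    y: float = 0.0      # up/down
    z: float = 0.0      # forward/backward
    yaw: float = 0.0    # turning the head left/right
    pitch: float = 0.0  # nodding up/down
    roll: float = 0.0   # tilting the head sideways

    def moved(self, dx=0.0, dy=0.0, dz=0.0):
        """Return a new pose translated by the given amounts, keeping the orientation."""
        return Pose6DoF(self.x + dx, self.y + dy, self.z + dz,
                        self.yaw, self.pitch, self.roll)


if __name__ == "__main__":
    listener = Pose6DoF(yaw=45.0).moved(dx=1.0, dz=2.0)
    print(astuple(listener))
```

A 360 video with head tracking only uses the three rotations; 6DoF content also has to respond to the three translations.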
A wide spectrum of virtual reality applications drives the development of technology in the audio-visual industry. Until now, image-related technologies have been developing much faster, leaving the sound far behind. We have made the first attempts to show that 6DoF for sound is also achievable.
How to record audio in 6DoF?
It's extremely challenging to record high-quality sound from many sources present in the sound scene at the same time. We managed to do this using nine ZYLIA ZM-1 multi-track microphone arrays evenly spaced in the room.
In our experiment, the sound field was captured using two different spatial arrangements of ZYLIA ZM-1 microphones placed within and around the recorded sound scenes. In the first arrangement, nine ZYLIA ZM-1 microphones were placed on a rectangular grid. The second configuration consisted of seven microphones placed on a grid composed of equilateral triangles.
Fig. Setup of 9 and 7 ZYLIA ZM-1 microphone arrays
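For readers who want to plan a similar layout, the sketch below generates 2D positions for both kinds of arrangement. The exact geometry is an assumption: it treats the nine-microphone grid as 3 x 3 and the seven-microphone grid as one center point surrounded by a regular hexagon (which forms six equilateral triangles), and the 1.5 m spacing is a placeholder, since the original spacing is not stated here.

```python
import math


def rectangular_grid(rows, cols, spacing):
    """Positions (x, y) in meters of microphones on a rows x cols rectangular grid."""
    return [(c * spacing, r * spacing) for r in range(rows) for c in range(cols)]


def hexagonal_seven(spacing):
    """Seven positions: a center point plus six points on a regular hexagon,
    so that every neighboring triple forms an equilateral triangle."""
    points = [(0.0, 0.0)]
    for k in range(6):
        angle = math.radians(60 * k)
        points.append((spacing * math.cos(angle), spacing * math.sin(angle)))
    return points


if __name__ == "__main__":
    spacing = 1.5  # placeholder value in meters, not taken from the experiment
    for name, layout in (("9 mics", rectangular_grid(3, 3, spacing)),
                         ("7 mics", hexagonal_seven(spacing))):
        print(name, [(round(x, 2), round(y, 2)) for x, y in layout])
```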
Microphone signals were captured using a personal computer running the GNU/Linux operating system. Signals originating from individual ZM-1 arrays were recorded with specially designed software.
We recorded a few takes of a musical performance with instruments such as an Irish bouzouki (a stringed instrument similar to the mandolin), a tabla (Indian drums), acoustic guitars and a cajon.
Unity and 3D audio
To present the interesting possibilities of audio recorded with multiple microphone arrays, we have created a Unity project with 7 Ambisonics sources. In this simulated environment, you will find three sound sources (our musicians) represented by bonfires, among which you can move around. The fluent, immersive audio feels so natural that you can actually feel as if you were inside the scene.
MPEG Standardization Committee
For scenario A, we used the regular stitched video from the Gear 360 and a 1st order Ambisonics audio file.
Scenario A - Basic steps taken:
Here are the detailed steps taken for the conversion to Ambisonics:
* Standard currently (August 2018) used on YouTube.
Since the source material is the same as the one from scenario A, we’ll list here only the steps that differ.
Scenario B steps:
- Process stereoscopic video from Gear 360 on Insta360 Studio to have the ‘tiny planet’ effect;
- Convert the raw 19-channel file from ZYLIA Studio to binaural, using ZYLIA Studio PRO running in REAPER.
- Edit 360-degree video and Ambisonics audio on Adobe Premiere.
We had the great pleasure of meeting Yao Wang during our visit to Berklee College of Music. A few days ago Yao published her project 'Unraveled', a phenomenal immersive 360 audio and visual experience. As a listener, you find yourself at the center of all elements, surrounded by choir, strings, synths, and imagery. You can experience being in the middle of the music scene.
Get to know more about this project and read an interview with Yao Wang.
Art work by @cdelcastillo.art
Yao: Last spring, I was 9 months away from graduating from Berklee College of Music, and the panic of post-graduation uncertainty was becoming unbearable. I was struggling to plan my career and I wanted to do something different. I spent a whole summer researching the ins and outs of spatial audio and decided to do my Senior Portfolio Project around my research. What I have found is that spatial audio is often found in VR games and films - recreating a 3D environment. It is rarely used as a tool for music composition and production. I saw my opportunity.
With the help and hard work of my team (around 60 students involved), we succeeded in creating ‘Unraveled’, an immersive 360 audio and visual experience, where the audience would find themselves at the center of all elements, being surrounded by choir, strings, synths and imagery. My role was the project leader, composer, and executive producer. I found a most talented team of friends to work on this together: Gareth Wong and Deniz Turan as co-producers, Carlos Del Castillo as visual designer, Ben Knorr as music contractor, Paden Osburn as music contractor and conductor, Jeffrey Millonig as lead engineer and Sherry Li as lead vocalist and lyricist. Not to mention the wonderful musicians and choir members. I am truly grateful for their hard work, dedication and focus.
‘Unraveled’ also officially kickstarts my company ICTUS, a company that provides music and sound design content specializing in spatial audio solutions. For immersive experiences such as VR, AR and MR, we are your one-stop audio shop for a soundscape that completes the reality. We provide music composition, sound design, 360 recording, mixing, mastering, post-production, spatialization and visualizing services tailored to your unique project.
We are incredibly humbled that 'Unraveled' has been officially selected for the upcoming 44th Seattle International Film Festival, which runs May 17 to June 10, and to have been accepted for the Art and Technology Exhibition at the Boston Cyberarts Gallery, from Saturday May 26 to Sunday July 1.
‘Unraveled’ has been officially selected for the upcoming 44th Seattle International Film Festival, which runs May 17 - June 10, with more than 400 films from 80 countries, running 25 days, and with over 155,000 attendees!
Yao: I worked very closely with Paden Osburn, the conductor and music contractor, to schedule, revise, coordinate and plan the session. Paden is a dear to work with, basically allowing me to focus on the music while she coordinated with the rest of the amazing choir members. We had developed a great workflow.
I also had many meetings with the team of engineers as well as many professors to figure out the simplest, most efficient way to record. It was indeed very challenging and stressful to pull off, but it was also one of the most magical nights of my life.
Yao: On October 27, 2017, we had a recording session of the choir parts with 40 students from Berklee College of Music. The recording was done using three ambisonic microphones (Zylia, Ambeo, TetraMic). We tried forging a 320-piece choir by asking the 40 students to shift their positions around the microphones for every overdub. We also recorded 12 close-miked singers to have some freedom spatializing individual mono sources.
The spatialization was achieved through Facebook360 Spatial Workstation in REAPER. Many sound design elements were created in Ableton and REAPER. The visuals were done in Unity. We basically created a VR game and recorded a 360 video of the performance. Carlos Del Castillo did an outstanding job creating an abstract world that had many moments syncing with the musical cues.
Zylia: What do you think about using ZYLIA ZM-1 mic for recording an immersive 360 audio?
Yao: I clearly remember meeting Tomasz Zernicki on the Sunday prior to our choir session. The Zylia team came to Berklee and demonstrated the capabilities of their awesome microphone, and I thought I had nothing to lose, so I asked for a potential (and super last minute!) collaboration that has proven to be fruitful. This has also brought me great friendship with Edward C. Wersocki who operated the microphone at our session. Unfortunately, he couldn't stay for the whole session, so only partial lines were recorded with the ZYLIA. He also guided me with the A to B format conversion which was extremely easy and user-friendly. I loved the collaboration and will only keep pursuing and exploring more possibilities with spatial audio. Hopefully, this will be the first of many collaborations.
Behind the scenes, ZYLIA ZM-1, photo by @jamiexu0528.
Yao: My long-term goal would be to establish my company ICTUS as one of the leading experts in the field of spatial audio. We are currently working on an interactive VR music experience called ‘Flow’ with an ethnic ensemble, GAIA, and the visuals are influenced by Chinese water paintings. The organic nature of this project will be a nice contrast to ‘Unraveled’s futuristic space vibe.
Another segment of the company is focused on creating high quality, cinematic spatial audio for VR films and games. We are producing a 3D audio series featuring short horror/thriller stories with music, descriptive narration, dialogues, SFX and soundscapes. Empathy is truly at the heart of this project. Some of our stories will have a humanitarian purpose and we will be associated with many organizations that are fighting to end domestic abuse, human trafficking, rape, abuse and other violent crimes. We hope to bring more awareness and traffic to these causes with our art. Spatial audio is incredibly powerful; it really allows you to be in the shoes of the victims, and without the visuals, I swear your imagination will go crazy!
Yao is a composer, sound designer, producer and artist. She recently graduated from Berklee College of Music with a Bachelor of Music Degree in Electronic Production & Design and Film Scoring. Passionate about immersive worlds and storytelling, Yao has made it her mission to pursue a career combining her love for music, sound and technology. With this mission in mind, she is now the CEO and founder of ICTUS, a company that provides spatial audio solutions for multimedia.
There are also quality improvements and bug fixes for 2nd and 3rd order HOA. This update significantly increases the perceptual effect of rotation in HOA domain as well as corrects spatial resolution for 2nd and 3rd order. It is recommended to update to this new version.