Australia's leading broadcast technology magazine & website
Monitoring the audio signal

 
In the past, time and budget restrictions often meant that audio signals received less attention than video from broadcasters and content creators alike. Nevertheless, discerning members of an audience do hear the difference, and that reflects poorly on the TV station or producer. In an industry where reputation and ratings are everything, poor audio quality is a competitive liability. Now, with technologies such as surround sound being used to sell standard definition (SD) and high definition (HD) television sets, processing and distributing a clean audio signal throughout a video facility is more important than ever before.

With the advent of multi-channel broadcast and post-production facilities, audio monitoring has become an integral part of quality assurance and broadcast monitoring strategies. A key component of such strategies is the audio monitor, a tool designed to take the guesswork out of what is often an arduous and confusing task.

Getting the most out of an audio monitoring system requires an understanding of several attributes of audio. Familiarity with the terms and definitions, and with how an audio monitor works, enables users to recognize problems immediately and resolve them quickly. The latest audio monitors even offer automated monitoring, requiring minimal human intervention.



Understanding Audio Signals

Most production technologies recognize audio as either a balanced or an unbalanced signal. Unbalanced systems use a signal conductor and a ground, sometimes with a shielded cable as well. Connecting unbalanced signals is simple and uses relatively inexpensive cables and connectors, such as the RCA jack. This lower cost and complexity makes unbalanced signals most common on consumer products. Unfortunately, it also means that most consumer products don’t enjoy the noise and common-mode rejection properties of balanced audio signals, so they are more susceptible to interfering signals and are limited to short cable runs.

A balanced signal consists of two components, equal in amplitude but opposite in polarity, carried on conductors with matched impedance characteristics. The cables used to distribute balanced signals usually have three conductors: two arranged as a twisted pair, while the third serves as a shield to minimize interference. Balanced audio systems, using XLR connectors, are popular in professional applications where noise reduction and high signal amplitude outweigh the interconnect complexity and higher cost.

When setting up a professional audio environment, it’s important to make sure the cables used within the facility are correctly wired. A good audio monitoring system can detect wiring faults, saving engineers significant time, money and frustration.



Basic Audio Monitoring

The simplest form of audio monitoring uses a level meter that displays the audio signal’s amplitude. Two types of meter are common: the Volume Unit (VU) meter and the Peak Program Meter (PPM). The VU meter displays the average volume level of the audio signal; it has symmetrical rise and fall times and a relatively long integration time (typically 300 ms). A PPM displays the peak volume level of the audio signal, with a fast rise time (10 ms), a slow fall time (2.85 s) and a 10 ms integration time.

Because of these inherent differences, it’s rare for a VU meter and PPM to have identical responses to the same audio content.
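The difference in response can be illustrated with a very simplified model. The sketch below is not a standards-conformant meter: it assumes the VU reading can be approximated by a 300 ms moving average of the rectified signal, and the PPM by an envelope follower with a 10 ms attack and 2.85 s release time constant. The function name and the 8 kHz test burst are hypothetical choices made for the example.

```python
import math

def simulate_meters(samples, sample_rate):
    """Return the highest (vu, ppm) linear readings for a list of samples.

    Simplified assumption: VU = 300 ms moving average of |x|,
    PPM = envelope follower with fast attack and slow release.
    """
    vu_window = int(0.300 * sample_rate)
    attack = 1.0 - math.exp(-1.0 / (0.010 * sample_rate))
    release = math.exp(-1.0 / (2.85 * sample_rate))
    running_sum = 0.0
    ppm = 0.0
    vu_max = 0.0
    ppm_max = 0.0
    for i, sample in enumerate(samples):
        level = abs(sample)
        # Moving average: add the new sample, drop the one leaving the window.
        running_sum += level
        if i >= vu_window:
            running_sum -= abs(samples[i - vu_window])
        vu = running_sum / vu_window
        # Envelope follower: rise quickly toward peaks, fall back slowly.
        if level > ppm:
            ppm += (level - ppm) * attack
        else:
            ppm *= release
        vu_max = max(vu_max, vu)
        ppm_max = max(ppm_max, ppm)
    return vu_max, ppm_max

# A 20 ms full-scale 1 kHz burst surrounded by silence.
rate = 8000
burst = [math.sin(2 * math.pi * 1000 * n / rate) for n in range(int(0.020 * rate))]
signal = [0.0] * rate + burst + [0.0] * rate
vu_max, ppm_max = simulate_meters(signal, rate)
print(f"VU-style reading: {vu_max:.3f}  PPM-style reading: {ppm_max:.3f}")
```

On a short burst like this, the averaging meter barely moves while the peak-style meter registers most of the transient, which is why the two rarely agree on real program material.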

When the two meters are aligned with a test tone, the PPM must be set to read lower than the VU meter so that their readings on program material are comparable. Broadcasters have found that 8 dB is a good average figure for the difference between the peak-to-reading ratios of the two meters. A test tone that reads 0 on the VU meter should therefore read –8 dB on the PPM. With this alignment, the PPM provides more reliable control of program peak levels. Audio program material should be adjusted to have a peak amplitude of 0 dB on a PPM.



The Lissajous pattern

The Lissajous display plots the left channel against the right channel, providing instant feedback about the overall energy distribution during an audio mix. A good monitor can identify errors, such as clipping, which appear on the Lissajous display as “squared off” pattern edges. The pattern orientation indicates at a glance whether the present mix is balanced or concentrated to one side or the other.



System phase errors

Phase errors can introduce any number of undesirable effects in an audio signal. A quick check with an audio monitor can help identify and quantify any significant amount of system phase error.

Select the auto gain control to make the edges of the ellipse touch the phase tangent lines. A straight line coincident with the L=R axis means that the left and right channels of the equipment under test are exactly matched in phase and gain. A straight line slanted away from that axis means the channels match in phase but do not have the same amplitude. A straight line perpendicular to the L=R axis indicates reversed phase between channels. Finally, an ellipse whose major axis falls on the L=R line indicates equal amplitude but a phase mismatch.
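The same four cases can be approximated numerically. The sketch below is an illustration only, not how any particular monitor works: it assumes the channels are available as equal-length lists of samples, and it uses a correlation coefficient and an RMS gain ratio in place of the visual pattern. The function name and the 5% tolerance are hypothetical.

```python
import math

def classify_stereo(left, right, tol=0.05):
    """Classify the phase and gain relationship between two channels."""
    ll = sum(l * l for l in left)
    rr = sum(r * r for r in right)
    lr = sum(l * r for l, r in zip(left, right))
    if ll == 0 or rr == 0:
        return "one channel silent"
    corr = lr / math.sqrt(ll * rr)    # +1: line on the L=R axis, -1: perpendicular
    gain_ratio = math.sqrt(ll / rr)   # 1.0 when both channels have equal level
    equal_gain = abs(gain_ratio - 1.0) <= tol
    if corr < -(1.0 - tol):
        return "reversed phase"
    if corr > 1.0 - tol:
        return "matched in phase and gain" if equal_gain else "in phase, unequal amplitude"
    return "equal amplitude, phase mismatch" if equal_gain else "phase and amplitude mismatch"

# Ten full cycles of a 1 kHz tone at 48 kHz, plus a copy shifted by 30 degrees.
tone = [math.sin(2 * math.pi * 1000 * n / 48000) for n in range(480)]
shifted = [math.sin(2 * math.pi * 1000 * n / 48000 + math.pi / 6) for n in range(480)]

print(classify_stereo(tone, tone))
print(classify_stereo(tone, [0.5 * s for s in tone]))
print(classify_stereo(tone, [-s for s in tone]))
print(classify_stereo(tone, shifted))
```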



Polarity reversal

Recordings that use multiple microphones in a complex studio environment present dozens of opportunities to introduce polarity reversals that cause audible problems. Any time a polarity reversal occurs, an audio monitor can be used to trace the problem back to its source quickly. On a display drawn with the L=R axis vertical, a correctly phased signal will produce a straight vertical line. If the signal is phase reversed, the Lissajous display will appear as a horizontal line.



Digital Audio

The transition to digital audio has taken place over many years, with the AES 3 standard dominating the video industry. The standard supports sampling rates of 32 kHz, 44.1 kHz (CD) and 48 kHz (professional), the last of which is frequently used within video facilities. The analogue audio signal is sampled at the sample clock rate, and 16, 20 or 24 bits can be used to represent the amplitude of each sample. Audio requires a greater number of bits than video because its larger dynamic range increases the number of bits needed to produce an adequate signal-to-noise ratio (SNR).

The serial data stream carries two audio channels, Channel 1 and Channel 2, which are multiplexed together. These channels may be separate monophonic channels, a stereo pair containing Left and Right sources, a single audio channel with the second channel identical to the first, or they could contain no data at all.
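The link between word length and SNR can be made concrete with the standard textbook approximation for an ideal quantizer driven by a full-scale sine wave, SNR ≈ 6.02 × N + 1.76 dB. The sketch below simply evaluates that formula; real converters fall short of these figures, and the 10-bit entry is included only as an assumed example of a shorter word length for comparison.

```python
def ideal_snr_db(bits):
    """Theoretical SNR of an ideal N-bit quantizer with a full-scale sine input."""
    return 6.02 * bits + 1.76

for bits in (10, 16, 20, 24):
    print(f"{bits:2d} bits: about {ideal_snr_db(bits):.1f} dB")
```

Each additional bit adds roughly 6 dB, so moving from 16 to 24 bits raises the theoretical ceiling by about 48 dB.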



Auxiliary Data Bits

When a 20-bit audio sample is used, the four least significant bits (LSBs) may be reserved for auxiliary data. One application for these auxiliary data bits is for talkback channels in a production environment. Otherwise, these bits can be used to carry the four LSBs of a 24-bit audio sample.



Audio Sample Data Bits

Audio sample data is placed in bits 4 to 27, with the most significant bit (MSB) located at bit 27, supporting a maximum sample size of 24 bits. If not all 24 bits are used for an audio data sample, the unused LSBs are set to “0”. Broadcast facilities typically use a 20-bit audio sample, which leaves the four LSBs, bits 4 to 7, available for auxiliary data channels.

The 20-bit audio sample is used for most applications within a broadcast environment. However, a 24-bit audio sample is supported in AES/EBU by using the sample bits from 4 to 27 and no auxiliary data bits. Each subframe also carries additional data bits, which provide useful information.
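The bit positions described above can be sketched as a small unpacking routine. The representation is an assumption made for the example: each subframe is held as a Python integer in which bit n of the integer corresponds to bit n of the subframe, and samples are treated as two's complement values. The function names are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret an unsigned field of the given width as two's complement."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign

def unpack_audio(subframe, sample_bits=20):
    """Return (sample, aux) from a 32-bit subframe word.

    20-bit mode: audio in bits 8..27, auxiliary data in bits 4..7.
    24-bit mode: audio in bits 4..27, no auxiliary data (aux is None).
    """
    if sample_bits == 24:
        return sign_extend((subframe >> 4) & 0xFFFFFF, 24), None
    if sample_bits == 20:
        aux = (subframe >> 4) & 0xF
        return sign_extend((subframe >> 8) & 0xFFFFF, 20), aux
    raise ValueError("this sketch only handles 20- or 24-bit samples")

# A 20-bit sample of -2 with auxiliary bits 1010, and a 24-bit sample of 0x123.
word_20 = (0xFFFFE << 8) | (0b1010 << 4)
word_24 = 0x000123 << 4
print(unpack_audio(word_20, 20))
print(unpack_audio(word_24, 24))
```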



Understanding the Terms

Validity Bit (V)

When the Validity bit is set to “0”, the subframe audio data can be decoded to analogue audio; when the Validity bit is set to “1”, the audio sample data cannot be decoded. Audio test equipment can be set up to ignore the V bit and continue to use the data for measurement purposes.



User Data bit (U)

The user data bits can be used to carry additional information about the audio signal. Each U bit from the 192 subframes can be assembled together to produce a total of 192 bits per block. The operator can use this information for such purposes as the addition of copyright information.



Channel Status Bit (C)

The Channel Status bit provides information on various parameters associated with the audio signal. For each audio channel, these parameters are assembled from the C bits of the 192 subframes that make up a block.
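Collecting one bit from each of 192 subframes yields a 192-bit block, or 24 bytes, per channel. The sketch below shows one way to pack such a block. The bit order within each byte (first received bit stored as the least significant bit) and the function name are assumptions made for the example; the same routine would apply to the U bits.

```python
def assemble_block(bits):
    """Pack 192 single-bit values (one per subframe) into 24 bytes."""
    if len(bits) != 192:
        raise ValueError("a block needs exactly 192 bits")
    block = bytearray(24)
    for index, bit in enumerate(bits):
        if bit:
            # Assumed order: the first bit received lands in bit 0 of byte 0.
            block[index // 8] |= 1 << (index % 8)
    return bytes(block)

# Example: only the C bits of subframes 0 and 9 are set.
c_bits = [0] * 192
c_bits[0] = 1
c_bits[9] = 1
status_block = assemble_block(c_bits)
print(status_block.hex())
```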



Channel Status Use

There are three levels of implementation for channel status data: minimum, standard, and enhanced. The standard implementation is recommended for use in professional television applications, so the channel status data will contain information about signal emphasis, sampling frequency, channel mode (stereo, mono, etc.), use of auxiliary bits (extend audio data to 24 bits or another use), and a cyclic redundancy code (CRC) for monitoring the total channel status block.



Parity Bit (P)

The parity bit is set so that bits 4 to 31 of the subframe contain an even number of ones (even parity). An audio monitor can use it to check for errors within a subframe.
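An even-parity check over bits 4 to 31 is straightforward to sketch. As an assumption for the example, the subframe is held as a Python integer in which bit n of the integer is bit n of the subframe, with the parity bit at bit 31; the function names are hypothetical.

```python
def parity_ok(subframe):
    """True when bits 4..31 of the subframe contain an even number of ones."""
    payload = (subframe >> 4) & 0x0FFFFFFF
    return bin(payload).count("1") % 2 == 0

def set_parity(subframe):
    """Return the subframe with bit 31 chosen to give even parity."""
    subframe &= ~(1 << 31)          # start with the parity bit cleared
    if not parity_ok(subframe):
        subframe |= 1 << 31         # an odd count so far needs one more 1
    return subframe

word = set_parity(0x00000010)       # a single 1 in bit 4
corrupted = word ^ (1 << 12)        # flip one bit to simulate an error
print(hex(word), parity_ok(word), parity_ok(corrupted))
```

A single parity bit detects any odd number of bit errors in the subframe but cannot detect an even number of them.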



Part II: Connecting to the Future

It’s generally known that proper wiring and the correct type of digital audio connections installed throughout a facility ensure the highest quality signals at the receiving end. In many cases when problems in the audio stream are identified, it’s a bad connector that’s to blame.

There are two basic types of connectors that can carry the AES/EBU serial digital data. A standard XLR can be used to distribute the digital signal over a twisted pair cable, where it can travel through 100 meters of cable without the need for an equalizer. The other connection most commonly used is a standard 75 Ω coaxial cable and BNC connector as defined by the AES3-ID standard. This is an unbalanced interface that permits broadcast facilities to transmit AES/EBU digital audio on standard coax using the existing infrastructure of the plant.

With a BNC connector and coaxial cable, signals can be distributed through more than 1,000 meters of cable. There are a variety of circuits that allow interconnection between XLR and BNC interfaces, either using simple resistor networks or using a transformer and attenuator circuit.

Beyond that, an audio monitoring system is essential to maintaining error-free audio operation. There are many advantages to such equipment, and using it correctly will help users get the most out of it.



Setting Levels

There are several things to understand about setting levels and interpreting digital audio signals on an audio monitoring device. The maximum digital audio value, the largest sample value that the word length can represent, is referred to as 0 dBFS (decibels relative to Full Scale). Clipping and other distortions in the audio signal may occur if the original analogue audio source exceeds this value.
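Levels below full scale are expressed as negative dBFS values. The sketch below converts a sample value to dBFS, assuming signed two's complement samples so that full scale is 2^(N-1) − 1 for an N-bit word; the function name is hypothetical.

```python
import math

def to_dbfs(sample, bits=24):
    """Level of a single sample value relative to digital full scale."""
    full_scale = (1 << (bits - 1)) - 1     # largest positive sample value
    if sample == 0:
        return float("-inf")               # silence has no finite dB value
    return 20 * math.log10(abs(sample) / full_scale)

full_scale_24 = (1 << 23) - 1
print(to_dbfs(full_scale_24))              # full scale
print(round(to_dbfs(full_scale_24 // 2), 2))   # roughly half of full scale
```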

Additionally, a digital audio signal may produce a higher amplitude when converted back to the analogue domain than its sample values suggest. This is because the low-pass filter in the analogue output stage of the converter reconstructs the waveform between samples, where the signal can peak above the largest sample value. Therefore, most audio monitors offer an interpolated view of the audio signal to show these peak values and to indicate when clipping occurs.

Besides the VU and PPM level data that an analogue audio monitor provides, digital monitors also offer a True Peak Meter. This type of metering instantly displays actual signal peaks, regardless of their duration. For any of the meters, users can choose the reference level and peak level to meet their specific requirements. A practical choice would be a test level of –18 dBFS and a peak program level of –8 dBFS.
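The effect of peaks falling between samples can be demonstrated with a small interpolation sketch. It is an illustration under stated assumptions, not the algorithm of any particular monitor: it uses 4x oversampling with a Hann-windowed sinc interpolator, and the function names and parameters are hypothetical. The test signal is a sine at one quarter of the sample rate, phased so that every sample misses the crest of the waveform.

```python
import math

def windowed_sinc(u, half_width):
    """Hann-windowed sinc kernel, zero outside +/- half_width samples."""
    if abs(u) >= half_width:
        return 0.0
    sinc = 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)
    return sinc * 0.5 * (1.0 + math.cos(math.pi * u / half_width))

def true_peak(samples, oversample=4, half_width=16):
    """Estimate the largest reconstructed amplitude between samples."""
    peak = 0.0
    # Skip the edges so the kernel always has a full set of neighbours.
    for n in range(half_width, len(samples) - half_width):
        for k in range(oversample):
            t = n + k / oversample
            value = sum(samples[m] * windowed_sinc(t - m, half_width)
                        for m in range(n - half_width + 1, n + half_width + 1))
            peak = max(peak, abs(value))
    return peak

# Sine at fs/4 with a 45 degree offset: every sample is about +/-0.707.
signal = [math.sin(math.pi * n / 2 + math.pi / 4) for n in range(64)]
sample_peak = max(abs(s) for s in signal)
interpolated_peak = true_peak(signal)
print(f"sample peak {sample_peak:.3f}, interpolated peak {interpolated_peak:.3f}")
```

In this example the sample values alone understate the waveform's crest by about 3 dB, which is the kind of difference an interpolated view is meant to reveal.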



Embedded Audio SD/HD

AES/EBU audio data can be embedded into the ancillary data space of a serial digital video signal. This is particularly useful in large facilities where separate routing of digital audio becomes costly.

In smaller systems, such as a post-production suite, it’s generally more economical to keep the audio separate from the video, thus eliminating the need for numerous embedder and de-embedder modules.



Ancillary Data

Looking at the ancillary data space available in component digital video, all of the horizontal and vertical blanking intervals are available, except for a small amount required for end-of-active video (EAV) and start-of-active video (SAV) sequences. The ancillary data space has been divided into two types: horizontal ancillary data (HANC) and vertical ancillary data (VANC). In standard definition (SD) infrastructures, the audio data is divided up into the 20 bit audio samples and the extended auxiliary 4 bits of data, whereas in high definition (HD) environments, the 24 bits of audio data are carried as one packet.

This audio data is located in the HANC area for both SD and HD formats. Additional extended data is also carried within HANC for SD systems. Up to 16 channels of embedded audio are specified for HANC, assembled into four groups of four audio data channels each.
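The grouping of 16 channels into four groups of four can be expressed as simple arithmetic. The sketch below assumes channels are numbered 1 to 16 and assigned to groups in ascending order; the function name is hypothetical.

```python
def locate_channel(channel):
    """Return (group, position within group) for an embedded audio channel."""
    if not 1 <= channel <= 16:
        raise ValueError("embedded audio channels are numbered 1 to 16 here")
    group = (channel - 1) // 4 + 1
    position = (channel - 1) % 4 + 1
    return group, position

for channel in (1, 6, 16):
    print(channel, locate_channel(channel))
```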



Ancillary Data Formatting

Ancillary data is formatted into packets prior to multiplexing into the video data stream. Each data block may contain up to 255 user data words and multiple data packets may be placed in individual ancillary data spaces—thus providing a rather flexible data communications channel. At the beginning of each data packet is a header using word values that are excluded for digital video data and reserved for synchronizing purposes. With embedded audio, the DBN (Data Block Number) may be used to detect the occurrence of a vertical interval switch. This allows the receiver to process the audio data to remove the transient “click” or “pop” that is likely to occur. Just prior to the data is the Data Count (DC) word, indicating the amount of data the packet contains. Finally, following the data is a one-word “check-sum” that is used to detect errors in the packet.
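The check-sum step can be sketched as follows. The details here go beyond what the text states and should be checked against the ancillary data standard before use: the sketch assumes 10-bit words, a checksum formed from the sum of the nine least significant bits of every word after the header (including the Data Block Number and Data Count words), and a tenth bit that is the inverse of bit 8. The function names and example values are hypothetical.

```python
def anc_checksum(did, dbn, data_count, user_words):
    """Assumed rule: 9-bit sum of the words after the header, bit 9 = NOT bit 8."""
    total = sum(w & 0x1FF for w in (did, dbn, data_count, *user_words)) & 0x1FF
    bit8 = (total >> 8) & 1
    return total | ((bit8 ^ 1) << 9)

def packet_ok(words):
    """Check a packet given as [DID, DBN, DC, user data..., checksum]."""
    *body, checksum = words
    did, dbn, data_count, *user_words = body
    count_matches = len(user_words) == (data_count & 0xFF)
    return count_matches and anc_checksum(did, dbn, data_count, user_words) == checksum

# Hypothetical packet with two user data words.
packet = [0x0FF, 0x001, 0x002, 0x010, 0x020]
packet.append(anc_checksum(*packet[:3], packet[3:]))
damaged = packet.copy()
damaged[3] ^= 0x004                 # corrupt one user data word
print(packet_ok(packet), packet_ok(damaged))
```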



Basic SD embedded audio

Embedded audio defined in the SMPTE 272M spec contains up to 16 channels of 20-bit audio data sampled at 48 kHz with the sample clock locked to the television signal. Although specified in the composite digital part of the standard, the same method is also used for component digital video. This basic embedded audio corresponds to Level A in the embedded audio standard. Other levels of operation provide more channels, other sampling frequencies, and additional information about the audio data. The Audio Data Packet contains one or more audio samples from up to four audio channels. Channels of embedded audio are essentially independent (although they are always transmitted in pairs).

There are several restrictions regarding distribution of the audio data packets, although there is a “grandfather clause” in the standard to account for older equipment. Audio data packets are not transmitted in the horizontal ancillary data space following the normal vertical interval switch and they’re also not transmitted in the ancillary data space designated for error detection checkwords.



Basic HD Embedded Audio

There are both similarities and differences in the implementation of AES/EBU within an HD environment. The formatting of the ancillary data packets is the same between SD and HD, but the information contained within the user data is different. This is because the full 24 bits of audio data are sent as a group and not split up into 20 bits of audio data and an extended packet containing the four auxiliary bits. Therefore, the total number of bits used in HD is 29 (compared with 23 bits in SD), where the 24 bits of audio data are placed in four ancillary data words along with a C, V, U and Z-bit flag. Since the full 24 bits of audio data are carried within the user data, there is no extended data packet used within HD.



AES/EBU Serial Digital Audio

With the transition to digital, serial digital video and audio are becoming commonplace in production and post-production facilities as well as television stations. In many cases, the video and audio are combined and handled as a single serial digital data stream. This allows a user to keep signals in the digital domain and switch them together with a serial digital video routing switcher to maintain the highest quality. If the user wants to separate some of the audio sources from the video, the digital audio can be de-multiplexed and switched separately via a digital audio routing switcher. At the receiving end, after the multiplexed audio has passed through a routing switcher, it may be necessary to extract the audio from the video so that editing, audio sweetening, or other processing can be accomplished. This requires a de-multiplexer that strips the AES/EBU audio from the serial digital video. The output of a typical de-multiplexer has a serial digital video BNC as well as connectors for the two stereo-pair AES/EBU digital audio signals.

As we move into the digital future, audio quality will continue to gain importance to producers, broadcasters and viewers. Considering the intense competition of today, content quality could literally make or break some media businesses. With this in mind, digital facilities will have to pay more attention to the creation and processing of audio signals to remain competitive.

After years of neglect, audio is finally being given its due. With the appropriate audio monitoring devices, error detection can be immediate and a clean, high quality signal ensured.

5 May 2004



Copyright © Reed Business Information. All material on this site is subject to copyright. All rights reserved. No part of this material may be reproduced, translated, transmitted, framed or stored in a retrieval system for public or private use without the written permission of the publisher.