3.4 AUDIO RECORDING SYSTEMS

MICHAEL STARLING
NATIONAL PUBLIC RADIO, WASHINGTON, DC


Page 1: audio recording systems

8/12/2019 audio recording systems

http://slidepdf.com/reader/full/audio-recording-systems 1/20

INTRODUCTION

Until just over 100 years ago all sound was live sound. It was only with the harnessing and commercial use of electricity that amplification was developed, which subsequently led to the widespread ability to store and reproduce sound waves. The faithful reproduction of sound waves is the goal of audio recording systems.

For most of the 20th century, analog magnetic tape has been the broadcaster’s principal medium for audio recording. Magnetic recording is an interdisciplinary field of physics, chemical and material sciences, electronics and mechanical engineering. Current practice and research are focused on the implementation and standardization of digital recording techniques using magnetic, optical and even static storage media. The massive storage requirements for digital audio have led to significant advances in perceptual compression techniques and high-density research and development.

History

Audio recording preceded and helped fuel the introduction of broadcasting. The earliest recorded audio was Edison’s 1877 cylinder phonograph, employing a constant velocity vertical recording groove. The phonograph’s cylindrical media mandated that each recording be a master and stymied mass production. The flat disc recordings that dominated recorded audio for the next half century permitted impression duplication in mass quantities.

Early broadcast use of recorded media exploded in the late 1920s with the introduction of Berliner’s mass-produced, laterally recorded flat disc. This development coincided with the rapid proliferation of AM broadcasting. Among the first actions of the Federal Radio Commission in 1928 was the deletion of several stations due to their heavy reliance on airing commercial records, which the FRC cited as “provision of a service which the public can readily enjoy without the service.” The new FRC favored original programming, and this too stimulated the use of recording lathes at the burgeoning population of radio stations. While many, if not most, early broadcast facilities acquired recording lathes for production of recorded audio, it was not until after World War II that the more forgiving and affordable magnetic tape recording made its widespread debut.

Danish telephone engineer Valdemar Poulsen demonstrated a magnetic wire recorder as early as 1898, as shown in Figure 3.4-1. Not until thirty years later did German researchers pioneer magnetically coated, paper-based tape for good quality recorder/reproducers.

Figure 3.4-1. Valdemar Poulsen’s Telegraphone won a Grand Prix Award at the 1900 Paris World’s Fair. The device used pole pieces that attached to each side of the wire. Later improvements included a disk surface of 5.25 inches with longitudinal recording on both sides of the disk. (From Jorgensen, The Complete Handbook of Magnetic Recording, 4th edition, 1995, McGraw-Hill. Reproduced with the permission of the McGraw-Hill Companies.)

By 1936, German scientists had advanced magnetic recording using cellulose-base tape and achieved remarkably good sound quality. After the war, the AEG Magnetophone (see Figure 3.4-2) was copied and commercially exploited worldwide. Paper tape backing was immediately abandoned. A host of benefits including portability, immediacy of playback, ease of storage, wide dynamic range, low distortion and freedom from ticks and pops propelled magnetic recording to the forefront in broadcasting. Of paramount significance was the influence of a technology that no longer depended on laboratory processing. This was, in modern terminology, a democratizing development, much as is the current proliferation of digital audio workstations.

For nearly a half-century magnetic tape recording has dominated broadcast production. Even today audio archivists continue to insist that important original recordings come to them on analog reel-to-reel, rather than any of the multitude of recent digital media formats. Despite the inexorable and accelerating trend for modern media to embrace digital techniques, professional archivists continue to prefer and insist on analog recordings due to known fault tolerances and stable format standards that will undoubtedly be readily available for the foreseeable future, even if they are fast losing favor in daily production.


Domain theory is the basis for explaining ferromagnetic phenomena. In 1907, Pierre Weiss first posited the existence of atomic magnets, although Ampère had earlier theorized that magnetism in lodestones might be due to some molecular circulating currents inside the material.

At room temperature there are three materials in which unpaired electron spins are present: four electrons in iron, three in cobalt, and two in nickel, precisely the materials in use in magnetic recording technology. These unpaired electrons undergo exchange between adjacent atoms, resulting in quantum mechanical forces of exchange. When the exchange value is positive, neighboring electron spins align and produce strong magnetization levels inside the ferromagnetic material. The spins align themselves inside small volumes, called domains, that can be as small as 1 µm or as large as 2 cm. At temperatures approaching absolute zero (0 K) spin alignment becomes seemingly uniform and magnetization is highest. At higher temperatures thermal agitation causes excursions in the angle of mutual orientation as the exchange forces lose control. The temperature at which the material loses ferromagnetism and becomes paramagnetic is called the Curie temperature. Upon cooling the material becomes ferromagnetic again, but has no memory of its prior magnetization. Typically, between 10¹² and 10¹⁵ atomic moments comprise each magnetic domain.

The direction of magnetization varies from domain to domain, with net overall magnetization being zero in perfectly virgin ferromagnetic material, such as deeply erased audiotape. Domain behavior has been observed by applying a colloidal iron powder on the polished surface of a sample and viewing the resultant Bitter patterns.

Magnetic fields are expressed as H, with flux lines shown by the symbol Φ. The intensity of the magnetic field is expressed in terms of flux density B and varies with the number of field lines. Flux density is measured in Wb/m².

When a magnetic field is strong enough to realign the magnetic domains so that they stay aligned even after the field is removed, the amount of magnetism remaining is called retentivity or remanence. The field required to oppose and reduce the retentivity to zero, or in other words to erase the magnetization, is called coercivity and is expressed as Hc.

Coercivity is measured in oersteds (Oe), with typical analog tape stock requiring about 370 Oe for full erasure. Digital open reel tapes require over 700 Oe and metal particle tapes between 1,200 and 1,500 Oe for full erasure.
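Because many data sheets quote coercivity in SI units rather than oersteds, a short conversion sketch may help relate the figures above. It relies only on the standard relation 1 Oe = 1000/(4π) A/m; the function name and the labels are illustrative and not taken from the handbook.

```python
import math

# Standard CGS-to-SI relation: 1 Oe = 1000 / (4 * pi) A/m (about 79.58 A/m).
A_PER_M_PER_OERSTED = 1000.0 / (4.0 * math.pi)


def oersted_to_ka_per_m(h_oe):
    """Convert a field strength in oersteds to kiloamperes per meter."""
    return h_oe * A_PER_M_PER_OERSTED / 1000.0


if __name__ == "__main__":
    # Coercivity figures quoted in the text (labels are illustrative).
    for label, h_oe in [("analog tape", 370),
                        ("digital open reel", 700),
                        ("metal particle, low", 1200),
                        ("metal particle, high", 1500)]:
        print(f"{label}: {h_oe} Oe = {oersted_to_ka_per_m(h_oe):.1f} kA/m")
```

Under this conversion, 370 Oe corresponds to roughly 29 kA/m.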

Hysteresis is the lagging of the magnetizing effect behind the magnetizing force, as illustrated in Figure 3.4-3. Flux density (B) increases with the field intensity (H), but in a non-simultaneous fashion. As field intensity increases, flux density increases reasonably uniformly until approaching saturation, beyond which no further flux density is produced. Reducing the field intensity reduces the flux density, but note that when the field intensity is reduced to zero the flux density remains above zero. This is due to the magnetic material’s retentivity. Only by applying an opposite polarity field of sufficient intensity will the flux density again reach zero. Hysteresis plots are unique for specific media samples due to vagaries in impurities, crystal structure, slip planes and stresses.
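The shape of a hysteresis loop can be illustrated with a toy model. The sketch below is not the handbook's method: it assumes a simple tanh-shaped branch shifted by the coercivity, with an arbitrary normalized saturation level and an arbitrary transition width, purely to show that flux density stays positive when H returns to zero and only reaches zero at H = -Hc.

```python
import math

# Illustrative parameters only: 370 Oe is the analog-tape coercivity quoted
# in the text; the saturation level and transition width are assumptions.
HC_OE = 370.0
B_SAT = 1.0        # normalized saturation flux density
WIDTH_OE = 150.0   # controls how sharply the branch switches


def descending_branch(h_oe):
    """Flux density while H is reduced from positive saturation."""
    return B_SAT * math.tanh((h_oe + HC_OE) / WIDTH_OE)


def ascending_branch(h_oe):
    """Flux density while H is increased from negative saturation."""
    return B_SAT * math.tanh((h_oe - HC_OE) / WIDTH_OE)


if __name__ == "__main__":
    # Remanence: flux density left over when the field returns to zero.
    print("B at H = 0 (descending):", round(descending_branch(0.0), 3))
    # Coercivity: the reverse field that brings the flux density to zero.
    print("B at H = -Hc (descending):", descending_branch(-HC_OE))
    for h in (-800, -370, 0, 370, 800):
        print(h, round(descending_branch(h), 3), round(ascending_branch(h), 3))
```

Real loops depend on the specific medium, as the text notes, so the numbers printed here have no significance beyond showing the lag between B and H.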


Figure 3.4-2. Early German Magnetophone brought to the USA after capture by officers in the U.S. Army Signal Corps. (From Jorgensen, The Complete Handbook of Magnetic Recording, 4th edition, 1995, McGraw-Hill. Reproduced with the permission of the McGraw-Hill Companies.)

Figure 3.4-3. Depiction of a typical hysteresis loop. 1, 2, 3 represent the original magnetization curve, with 3, 4, 5, 6, 7, 8, 3 representing the hysteresis loop. (From Jorgensen, The Complete Handbook of Magnetic Recording, 4th edition, 1995, McGraw-Hill. Reproduced with the permission of the McGraw-Hill Companies.)



Considerations Unique to Analog Recording Systems

Decades of incremental refinements have been made in magnetic recording techniques. Bias, frequency response, signal-to-noise, print-through, oxide composition, backing media, lubrication and adhesive composition interact with one another and require trade-offs depending on the intended purpose of the recording. A wide range of tape formulations is available, with features specific to the application. Additionally, a wide range of speed, equalization, track width, noise reduction and level standards has evolved on analog recorders for specific purposes.

Noise, Frequency Response and Bias

Because areas of magnetization increase as the square of the current value, the magnetization and recording process is nonlinear (see Figure 3.4-4). This nonlinearity presents innumerable challenges in reproducing high fidelity audio recordings. Early on it was discovered that when high frequency bias signals are added to the record head, the linearity of the magnetic recording process is improved. This is because the resulting heterodyned signal now moves into the more linear regions of the magnetization curve.

Figure 3.4-4. Zero crossing distortion is shown in a remanent magnetization coating. (From Jorgensen, The Complete Handbook of Magnetic Recording, 4th edition, 1995, McGraw-Hill. Reproduced with the permission of the McGraw-Hill Companies.)

The mechanism of ac bias is familiar to anyone who has manually experimented with producing deep erasures on analog tape stock. The more smoothly the ac field is removed, the deeper the erasure. In practice this principle is applied as the ac bias field decays while the tape stock is moved past the trailing edge of the record gap at a constant velocity. Thus, the high frequency bias field causes the magnetization intensity to ramp down along a predictable hysteresis plot, leaving the tape in a final state that is proportional to the applied audio signal which is heterodyned with the bias signal (see Figure 3.4-5).

Figure 3.4-5. The principle of AC bias on a moving tape area, showing the decreasing magnetization swing of the high frequency bias signal, leaving the recorded intensity proportional to the superimposed signal field. (From Sharrock, NAB Engineering Handbook, 8th edition, 1992.)

The remanence imprint is cleaner with higher frequency bias signals due to limited intermodulation. Thus, the bias frequency is generally set to at least five times the highest audio frequency to be recorded.

The bias oscillator is also fed to the erase head, and for multitrack recorders a single oscillator is buffered and fed to each gap to prevent intermodulation. Bias current is typically 8–10 times the signal current. Normal bias is defined as the bias field which saturates the tape to its exact coating thickness. Overbias results in short wavelength losses but is frequently employed in professional audio recorders where equalization and higher tape speed maintain high frequency response; thus, even lower distortion is achieved.

Print-through

Print-through affects most magnetic tapes and is of major concern to archivists. As one would expect, print-through is more of an issue in thin base tapes, but coating techniques also affect print-through characteristics. Fe₂O₃ particle sizes can be problematic if too small or if oriented to be susceptible to magnetization levels in the adjacent contact layer. In some early oxide formulations 4% of the particles were small enough to be susceptible to magnetization by adjacent layers. Longer storage times and higher temperatures increase print-through. Typically, both a pre-print and a post-print can be observed, one on each side of a strong signal, in the quiet pauses around that signal. Pre- and post-prints are typically of different magnitudes because the magnetization effect will be additive for one layer and subtractive for the other. Print-through is greatest when the wavelength of the recorded signal is about 2π times the total tape thickness. In practice, greatest print-through occurs at a frequency of about 1200 Hz on 2 mil tape at 15 ips. Because this falls at the point of greatest aural sensitivity, methods to combat the effect are important. Slower speeds and thicker tape move the frequency of maximum print-through lower. Because perpendicular susceptibility in modern audio tape formulations is approximately one-quarter of the longitudinal susceptibility, one means to combat the more annoying pre-print effect is to wind tapes oxide out, as is common in European studios as well as standard in cassette tapes.
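The 1200 Hz figure quoted for 2 mil tape at 15 ips can be checked with a few lines of arithmetic. The sketch below assumes the commonly cited rule that print-through peaks when the recorded wavelength is about 2π times the total tape thickness; the function name and the extra speed and thickness combinations are illustrative only.

```python
import math

MILS_PER_INCH = 1000.0


def max_print_through_hz(speed_ips, thickness_mil):
    """Frequency whose recorded wavelength is 2*pi times the tape thickness.

    Assumes the 2*pi rule described in the lead-in; wavelength = speed / frequency.
    """
    wavelength_in = 2.0 * math.pi * thickness_mil / MILS_PER_INCH
    return speed_ips / wavelength_in


if __name__ == "__main__":
    for speed, thickness in [(15.0, 2.0), (7.5, 2.0), (15.0, 1.5)]:
        f = max_print_through_hz(speed, thickness)
        print(f"{speed} ips, {thickness} mil tape: about {f:.0f} Hz")
```

For 15 ips and 2 mil tape this gives a value just under 1200 Hz, consistent with the figure in the text.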

Velour Effect

As is the case for varying pre- and post-print effects, particle orientation has also been shown to exhibit different high frequency responses in forward versus reverse motion. When particles lie at nearly a right angle to the perpendicular recording field they may not be influenced strongly enough by the recording field. In general the very outermost layer of the coating surface exhibits less remanence. This is referred to as the dead layer. The dead layer is actually exploited in some digital recording systems, with a longitudinal recording being multiplexed with a vertically recorded signal for greater information storage density. Stronger recordings result when the particle field is closer to agreement with the recorded field, as shown in Figure 3.4-6.

Maximum Operating Level (MOL)

MOL₃₃₃ is the maximum output level when the record level has been adjusted to produce 3% harmonic distortion. Some manufacturers use a 5% figure. MOL₁₀ is the output between the normal reference level and the tape level at saturation recorded at 10 kHz. Operating levels are expressed in nanowebers per meter (nWb/m) and are typically between 200 and 1000 nWb/m for audio tape. In broadcast environments a typical 15 ips setting may employ 320 nWb/m, with 7.5 ips levels set for 250 nWb/m.
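Operating levels quoted in nWb/m are often compared in decibels. The sketch below assumes the usual 20·log10 ratio for flux levels (reproduce voltage being proportional to flux); the function name is illustrative.

```python
import math


def level_difference_db(flux_nwb_per_m, reference_nwb_per_m):
    """Level of one fluxivity relative to another, in dB (20*log10 ratio)."""
    return 20.0 * math.log10(flux_nwb_per_m / reference_nwb_per_m)


if __name__ == "__main__":
    # The two broadcast settings mentioned in the text.
    diff = level_difference_db(320.0, 250.0)
    print(f"320 nWb/m is {diff:.2f} dB above 250 nWb/m")
```

With the two broadcast settings from the text, the 15 ips level sits a little over 2 dB above the 7.5 ips level.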

Recorded levels depend on the amount of bias current. Each frequency will have an optimum bias current. High bias currents produce a clean, in-depth recording with low harmonic distortion, while a small bias current results in higher high frequency response and high distortion. A universal relationship, regardless of tape formulation, holds for tape stock with bias adjusted for maximum output at a 5 mil wavelength: the 1% distortion level is about 10 dB below tape saturation and the 5% distortion level is roughly 5 dB below saturation.

Coating thickness losses account for a 6 dB/octave increase in the voltage versus frequency response curve at low frequencies. Reproduce amplifiers must employ equalization circuits with a crossover frequency appropriate to the speed and coating thickness (see Table 3.4-1). Because relatively little high frequency energy is present in most music and speech, record current is boosted to achieve a uniform record level at high frequencies. The amount of boost is standardized in the United States according to NAB published criteria, and in Europe according to DIN, CCIR and IEC standards.
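Equalization standards are usually stated as time constants, while the text speaks of a crossover frequency. The two are related by f = 1/(2πτ), which the sketch below applies to time constants that appear in Table 3.4-1; the function name and the printed labels are illustrative, and the pairing of each value with a speed is taken from the table as extracted.

```python
import math


def corner_frequency_hz(time_constant_us):
    """Corner (crossover) frequency for an equalization time constant in microseconds."""
    return 1.0 / (2.0 * math.pi * time_constant_us * 1e-6)


if __name__ == "__main__":
    # Time constants listed in Table 3.4-1 (microseconds).
    for label, tau_us in [("IEC/DIN, 15 ips", 35.0),
                          ("NAB, 15 ips", 50.0),
                          ("NAB, 3 3/4 ips", 90.0),
                          ("IEC/DIN, cassette", 120.0)]:
        print(f"{label}: {tau_us:.0f} us -> {corner_frequency_hz(tau_us):.0f} Hz")
```

A 50 µs time constant, for example, corresponds to a corner frequency of about 3.2 kHz.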

Mechanical Considerations

High quality spooled recording media require precise speed stability and constant tension. In professional machines three motors are typically employed: one for the capstan, one for the supply reel and one for the takeup reel. Most modern professional reel-to-reel analog tape recorders (ATRs) employ constant tension technology to minimize the possibility of stretching the magnetic tape stock. Constant tension is similarly important in achieving good tape pack to minimize cinching, debris pickup, and print-through. Most professional facilities store their tapes tails out, not only to ensure best pack but also to require a rewind prior to subsequent playback to shed any deposits or adhesion effects that can impair playback integrity.

The speed of the capstan must be precise and stable. The capstan must be perfectly concentric. Since perfection is hard to achieve in practice, mechanical variations that result in speed variations at rates of up to 10 Hz are commonly referred to as wow, with speed variations above 10 Hz referred to as flutter. Wow is most typically caused by capstan shaft irregularities such as a shape eccentricity or debris pickup such as adhesives or dirt. Motor cogging and layer-to-layer adhesion in a tape pack can also contribute to wow.
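The 10 Hz boundary between wow and flutter, together with the just-perceptible peak amounts listed in this section for speech, music and classical music, can be captured in a small helper. This is only a sketch: the function names, the dictionary keys and the choice to treat exactly 10 Hz as wow are assumptions made for illustration.

```python
# Boundary between wow and flutter, as described in the text.
WOW_FLUTTER_BOUNDARY_HZ = 10.0

# Just-perceptible peak speed variation, in percent, from this section.
JUST_PERCEPTIBLE_PEAK_PERCENT = {
    "speech": 0.6,
    "music": 0.3,
    "classical music": 0.15,
}


def classify_speed_variation(rate_hz):
    """Return 'wow' for variation rates up to 10 Hz, otherwise 'flutter'."""
    return "wow" if rate_hz <= WOW_FLUTTER_BOUNDARY_HZ else "flutter"


def is_perceptible(peak_percent, program):
    """True when the peak variation reaches the just-perceptible amount."""
    return peak_percent >= JUST_PERCEPTIBLE_PEAK_PERCENT[program]


if __name__ == "__main__":
    print(classify_speed_variation(0.5))   # slow variation, e.g. capstan eccentricity
    print(classify_speed_variation(60.0))  # fast variation, e.g. tape resonance
    print(is_perceptible(0.2, "speech"), is_perceptible(0.2, "classical music"))
```

The same 0.2% peak variation falls below the speech figure but above the classical music figure, which is why program material matters when setting maintenance limits.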

Flutter is more typically generated by anomalies in the tape itself. Since magnetic tape has elastic properties, movement over tape guides and heads can cause minute longitudinal oscillations. Resonances set up in the tape medium by tension settings can also cause mechanical vibrations that will be exhibited as flutter. A variety of techniques have been devised to compensate for these effects, including simple masking techniques such as adding idler rollers along the tape path to shift the flutter resonances to higher frequencies where they are less noticeable. Clock synchronization techniques are commonly employed in digital audio equipment to correct for mechanically induced effects inherent in the recorder’s drive mechanism.

The amounts of wow and flutter that are generally accepted to be just perceptible are:

Speech            0.6% peak
Music             0.3% peak
Classical Music   0.15% peak

Table 3.4-1. Tape speed, coating thicknesses and time constants for audiotapes.

Application    Speed (ips)   Coating (µm)   f (Hz)   1/(2πf) (µs)   IEC/DIN (µs)   NAB (µs)
Studio         30            10             11,395   13             35             10
Professional   15            5              11,395   13             35             50
Home A         7 1/2         5              5,970    27             50–70          50
Home B         3 3/4         5              2,985    54             90             90
Cassette       1 7/8         2.5            2,985    54             120            90

(From Jorgensen, The Complete Handbook of Magnetic Recording, 4th edition, 1995, McGraw-Hill. Reproduced with the permission of the McGraw-Hill Companies.)

Figure 3.4-6. When particles are perpendicular to the recording field, a weak surface recording results (a); a stronger surface recording results when particles are not aligned perpendicularly to the field lines (b). (From Jorgensen, The Complete Handbook of Magnetic Recording, 4th edition, 1995, McGraw-Hill. Reproduced with the permission of the McGraw-Hill Companies.)

Tape Composition

Modern magnetic audio recording tape is of two basic types: particulate and thin film. Particulate media are characterized by magnetic domains coated onto a flexible (polymer film) or rigid substrate (aluminum or glass/ceramic disks). As the name implies, thin film media consist of continuous films of magnetic materials deposited onto a substrate by vacuum techniques. Because thin film techniques yield a smoother and thinner medium than the particulate media, greater recording densities with lower error rates are achieved. Thin film techniques are almost universally used for hard disk storage and are beginning to be used for higher density tape stocks as well.

Particulate flexible media have a magnetic coating on both sides and the substrate is about 76 µm thick. The base film is typically polyethylene terephthalate (PET) film, although polyethylene naphthalate (PEN) polymers are becoming common for better dimensional stability in thinner tapes. (See Figure 3.4-7.)

Figure 3.4-7. Sectional views of (a) a particulate and a metal evaporated magnetic tape and (b) coated PET substrate for metal evaporated (ME) tapes. ME is also referred to as thin film flexible media. (From Mee & Daniel, Magnetic Recording Technology, 2nd edition, McGraw-Hill, 1995. Reproduced with the permission of the McGraw-Hill Companies.)

Maintenance, Care & Storage of Audio Recordings

Although many analog recordings have held up in good condition for decades, they are quite sensitive to permanent physical damage from improper handling, machine malfunction and environmental hazards. Winding tapes tails out immediately after complete playback is the most important safeguard in preventing edge damage to audiotapes. Cleanliness and controlled temperature and humidity are the most important factors in preventing environmental damage.

Tape wind or pack must be even to prevent protrusion scatter between layers that will crease and permanently damage tape edges during subsequent playback. Scatter wound tape is susceptible to edge damage from the pressure exerted on flanges during careless handling. For this reason, reels should be handled by their hubs rather than by the flanges. Similarly, cinching of layers with actual foldover is possible during rapid acceleration/deceleration from jerky transport operation.

Many professional recorders have a library wind mode that operates at a higher than normal operating speed but with constant tension to assure a smooth pack. Tape libraries invariably have professional tape winding equipment that is optimized for gentle handling during higher speed precision winding. At professional libraries preventive maintenance includes periodic rewinding to minimize print-through and depletion of lubrication and to interrupt stiction buildup from adhesive action. Recommended periods between rewindings vary greatly with storage conditions. Tapes stored at 20°C should be rewound every 3,000 hours. Tapes stored at 30°C should be rewound roughly every 300 hours.

Minute particles can cause serious system degradation (see Figure 3.4-8). Static buildup, scraping, scratching of the tape surface and separation in pack and head contact can cause dropouts and permanent damage to tape and equipment. Thus, frequent cleaning of heads, guides, capstan and pinch roller, typically after each recording or playback session, is imperative. Freon TF and ethyl alcohol are considered the most useful cleaners, exhibiting very little health hazard, no perceptible solubility on magnetic tapes and virtually no effects on rubber pinch rollers or belts. Unlike ethyl alcohol, Freon TF is not flammable. Careful demagnetization of heads is also required for best performance, typically after each 8 hours of operation. Oils and salts from fingerprints will attract foreign particles and can themselves interfere with reliable head to tape interface.

Figure 3.4-8. Relative size of various particulate debris. (From Ritter, NAB Engineering Handbook, 7th edition.)

Hydrolysis is a chemical reaction with water that affects polyester based recording tapes. High temperature and high humidity will accelerate hydrolysis reactions in any polyester based tape stock. However, from roughly 1977 to 1983 an industry-wide polyester binder phenomenon, referred to as sticky-shed syndrome, exacerbated the rate of hydrolysis reactivity.

Tapes from the sticky-shed era typically exhibit stick-slip phenomena as carboxylic acid and alcohol are sloughed from the binder as debris products. Tapes of this vintage are frequently unusable due to residue buildup that causes transports to squeal and bind. Fortunately, this phenomenon has been extensively documented and can be reversed temporarily with no apparent damage to the tape recording. The reversal process consists of warming (or baking) the tapes in a convection oven at 120°F for 24 hours. Upon cooling, the tapes will then be usable for several weeks before hydrolysis again sheds sufficient amounts of debris to interfere with transport functioning. Recommended humidity and temperature conditions are shown in Table 3.4-2.

Considerations Unique to Digital Recording Systems

Tribology (from the Greek word tribos, meaning rubbing) is critical to high density magnetic recording applications such as digital audio. Thus magnetic tape tribology includes techniques to observe and classify surfaces, friction, lubrication, tape and head wear, contour effects, head-to-disk interface, disk surfaces, sliders, air bearings and contaminants.

Lubricants in magnetic media tend to result in drag forces proportional to media speed and head contact pressure. At higher velocities the coefficient of friction drops, approaching zero at speeds near 80 ips due to the formation of a hydrodynamic air bearing. A friction rise below 1 Hz is due primarily to classic stick-slip motions.

Sliding friction can generate significant heat. The temperature rise at the head/tape interface in helical recorders is typically only 5–7°C, but can jump to a flash temperature of 1000°C when an asperity, such as a protruding particle, travels past the head.

A great advantage of digital recordings is that system performance is no longer limited by performance of the storage medium. Since transitions are the fundamental language of digital recording systems, neither ac bias nor particularly high S/N is required. In fact distorted waveforms are the norm. However, since a massive amount of transition density must be stored for high fidelity audio, higher bandwidth and more precise magnetic emulsions are needed. Linear density, or kilobits per inch, is the name of the game. Several techniques are employed to maximize density capabilities, as well as to minimize density requirements.

Error Control and Correction

The need for higher storage densities for digital audio has accelerated research and development in tape composition and magnetic head design. At higher recording densities error vulnerability requires ever smoother recording media and revolutionary designs of recording and playback heads.

Thus, in order to minimize damage and errors due to head to media contact, a load-carrying air film is formed at the interface between record head and magnetic media. Physical contact should only occur as the media starts and stops its motion. The air film must be thick enough to conceal any near-contact surface irregularities and thin enough to provide a reliable record and playback signal. Head to medium separation ranges from about 50 nm to 0.3 µm, and the roughness of the head and medium surfaces ranges from 1.5 to 10 nm rms.¹

Acicular magnetic particles are cigar-shaped particles employed in most magnetic digital recording. Because transitions are the basis of recording, saturation recording is employed and is typically of the traditional longitudinal format. However, for greater storage density the acicular particles can be oriented perpendicularly to the direction of the recording medium’s travel. A balance must be struck between too low a density, which requires excessive tape consumption, and too high a density, which requires additional error correction to combat drop-outs and intersymbol interference.

Thin film heads are of a substantially different design from analog heads. These heads are manufactured using photolithography to achieve a minute, precise shape. Multi-turn thin-film inductive record heads (IRH) are used for recording but do not have good playback characteristics at slow speeds. However, magneto-resistive (MR) heads are useful due to the output being independent of tape speed. With MR [...]


Table 3.4-2. Recommended storage conditions.

Storage Temperature   Maximum Relative Humidity
50°C                  39%
40°C                  47%
30°C                  60%
20°C                  79%
10°C                  100%
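The storage table lends itself to a simple lookup. The sketch below stores the five rows of Table 3.4-2 and, as an assumption not made in the handbook, interpolates linearly between adjacent rows for temperatures that fall between them; the names are illustrative.

```python
# (storage temperature in deg C, maximum relative humidity in %), from Table 3.4-2.
STORAGE_LIMITS = [(10, 100), (20, 79), (30, 60), (40, 47), (50, 39)]


def max_relative_humidity(temp_c):
    """Maximum relative humidity (%) for a storage temperature in deg C.

    Linear interpolation between table rows is an assumption made for
    illustration; the table itself only gives the five listed points.
    """
    if not STORAGE_LIMITS[0][0] <= temp_c <= STORAGE_LIMITS[-1][0]:
        raise ValueError("temperature outside the range covered by the table")
    for (t0, rh0), (t1, rh1) in zip(STORAGE_LIMITS, STORAGE_LIMITS[1:]):
        if t0 <= temp_c <= t1:
            fraction = (temp_c - t0) / (t1 - t0)
            return rh0 + fraction * (rh1 - rh0)


if __name__ == "__main__":
    for temp in (10, 20, 25, 50):
        print(f"{temp} deg C: keep relative humidity at or below {max_relative_humidity(temp):.1f}%")
```

Temperatures outside the 10–50°C range raise an error rather than extrapolating, since the table gives no guidance there.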

¹ See Bhushan, Bharat, “Tribology of the Head-Medium Interface,” Magnetic Recording Technology, Chapter 7, McGraw-Hill, 1996.



quency of half the sampling frequency, there will be two samples per period. A low-pass filter is placed at the output of every audio digitization system to remove all frequencies above the half-sampling frequency. This is required because sampling, through modulation, generates new frequencies above the audio band. The output filter removes all spectra above the half-sampling frequency. This is summarized in Figure 3.4-9.

By definition, audio samples contain all the information needed to provide complete reconstruction. However, bandlimiting criteria must be strictly observed; a frequency that is too high would not be properly encoded, and would create a kind of distortion called aliasing. An input frequency higher than the half-sampling frequency would cause the digitization system to alias. If S is the sampling frequency and F is a frequency higher than half the sampling rate, then new frequencies are also created at S ± F, 2S ± F, 3S ± F, etc. An input low-pass filter will prevent aliasing if its cutoff frequency equals the half-sampling frequency. To achieve a maximum audio bandwidth for a given sampling rate, filters with a very sharp cutoff characteristic, called brickwall filters, are employed in either the analog or digital domain.
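The folding behavior described above can be illustrated with a short sketch. It assumes an ideal sampler with no input filter; the function name alias_frequency is hypothetical and chosen only for this illustration. It returns the baseband frequency at which an input tone would appear.

```python
def alias_frequency(f_hz, fs_hz):
    """Return the frequency (0..fs/2) at which a tone f_hz appears after sampling at fs_hz.

    Assumes an ideal sampler with no anti-aliasing filter in front of it.
    """
    folded = f_hz % fs_hz                # spectral images repeat every fs_hz
    return min(folded, fs_hz - folded)   # fold into the band 0..fs/2


# A 26 kHz tone sampled at 44.1 kHz lands at S - F = 18.1 kHz.
print(alias_frequency(26000, 44100))
# A 10 kHz tone is below the half-sampling frequency and is unchanged.
print(alias_frequency(10000, 44100))
```

A tone below the half-sampling frequency passes through unchanged, which is why an input low-pass filter with that cutoff is sufficient to prevent aliasing.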

Amplitude Quantizing
The amplitude of each sample yields a number that represents the analog value at that instant. By definition, an analog waveform has an infinite number of amplitude values; however, quantization selects from a finite number of digital values. Thus, after sampling, the analog staircase signal is rounded to the numerical value closest to the analog value. The difference between the original values of the signal and the values after quantization appears as error.

The number of quantization steps available is determined by the length of the data word in bits; the number of bits in a quantizer determines resolution. Sixteen bits yields 2^16 = 65,536 increments. Every added bit doubles the number of increments, hence the magnitude of the error is smaller. The accuracy of a quantizing system provides an important performance specification. In the worst case, there will be an error of one half the least significant bit of the quantization word. The ratio of maximum expressible amplitude to error determines the signal-to-error (S/E) ratio of the digitization system. The S/E relationship can be expressed in terms of word length as S/E (dB) = 6.02n + 1.76, where “n” is the number of bits.
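As a quick check of the word-length relationship, the following sketch evaluates the number of quantization increments and the theoretical S/E ratio. The function names are hypothetical, introduced only for this example.

```python
def quantization_increments(n_bits):
    """Number of quantization increments available with an n-bit word."""
    return 2 ** n_bits


def signal_to_error_db(n_bits):
    """Theoretical S/E ratio in dB from S/E = 6.02n + 1.76."""
    return 6.02 * n_bits + 1.76


for bits in (8, 16, 18):
    print(bits, quantization_increments(bits), round(signal_to_error_db(bits), 2))
```

For a 16 bit word this evaluates to about 98 dB, the theoretical figure usually quoted for the compact disc.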

Although a 16 bit system would yield a theoretical S/E ratio of 98 dB, the relative error increases as the signal amplitude decreases. Consider the example of a signal with amplitude on the order of one quantization step. The signal value crosses back and forth across the threshold, resulting in a square wave signal from the quantizer. Dither suppresses such quantization error. Dither is a low amplitude analog noise added to the input analog signal (similarly, digital dither must be employed in the context of digital computation when rounding occurs).

heads the head never touches the tape and thus both head and media life is prolonged. Both crosstalk and signal-to-noise characteristics are excellent in such systems.

Isotropic recording utilizes longitudinal and vertical modes simultaneously. In isotropic recordings the vertical field erases the longitudinal fields near the tape’s surface. Thus, the tape is recorded to saturation with longitudinal fields and is multiplexed with vertical fields near the surface. The longitudinal field is structured for dominance at low frequencies and the vertical field carries the higher frequencies. Because the head gaps in isotropic recordings are so minute, there is essentially no intersymbol interference: only a small area at the trailing edge of the gap is recorded.

Additionally, because print-through effects are operationally nonexistent, much thinner base thicknesses and oxide layers are commonly employed. Coercivity is much higher on digital magnetic media and typically ranges from 800–1500 Oe versus the more typical 300–400 Oe in analog recordings. Thus, digital recordings are deep and robust.


Using the principles of discrete time sampling and quantization, a sampled signal can be processed, transmitted or stored and, through conversion, can be reconstructed into an accurate representation of the original analog signal.

Discrete Time Sampling
An analog waveform such as an acoustic pressure function in air exists continuously in time over a continuously variable amplitude range. Such an analog function may be discrete time sampled; moreover, the sample points can be used to reconstruct the original analog waveform. This digitization of audio forms the basis for the encoding and decoding of the audio signals in any digital audio format.

The Nyquist theorem states that given correct, bandlimited conditions, sampling can be a lossless process. However, the relationship between sampling frequency and audio frequencies must be observed. The Nyquist theorem defines the relationship: if the sampling frequency is at least twice the highest audio frequency, complete waveform reconstruction can be accomplished.
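The relationship can be put into numbers with a minimal sketch. The helper names are hypothetical, and the 20 kHz audio bandwidth used in the example is an assumption chosen for illustration, not a figure from the text.

```python
def min_sampling_rate(highest_audio_hz):
    """Lowest sampling frequency that satisfies the Nyquist criterion."""
    return 2 * highest_audio_hz


def samples_per_period(audio_hz, fs_hz):
    """Number of samples taken during one period of a tone at audio_hz."""
    return fs_hz / audio_hz


def can_reconstruct(highest_audio_hz, fs_hz):
    """True if the sampling frequency is at least twice the highest audio frequency."""
    return fs_hz >= min_sampling_rate(highest_audio_hz)


print(min_sampling_rate(20000))          # assumed 20 kHz audio bandwidth
print(samples_per_period(1000, 44100))   # many samples per period at 1 kHz
print(samples_per_period(22050, 44100))  # critical sampling: two per period
```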

The choice of sampling frequency determines the frequency response of the digitization system; S samples per second are needed to represent a waveform with a bandwidth of S/2 Hz. As the sampled frequencies become higher, given a constant sampling rate, there will be fewer samples per period. At the theoretical limiting case of critical sampling, at an audio fre-


2 This section contributed by Ken C. Pohlmann. NAB Engineering Handbook, 8th edition, National Association of Broadcasters, pp. 863–875, 1992.



When dither is added to a signal with amplitude on the order of a quantization step, the result is duty-cycle modulation that preserves the information of the original signal. The average value of the quantized signal can move continuously between two steps, thus the incremental effect of quantization has been alleviated. Audibly, the result is the original waveform, with added noise. That is more desirable than the clipped quantization waveform. With dither, the resolution of a digitization system extends below the least significant bit.
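The averaging effect can be demonstrated with a small simulation. This is only a sketch under stated assumptions: the quantizer rounds to the nearest integer step, the dither is uniformly distributed over one quantization step (one possible choice; the text does not specify the noise distribution), and the function names are hypothetical.

```python
import math
import random


def quantize(x):
    """Round to the nearest integer quantization step."""
    return math.floor(x + 0.5)


def mean_quantized(level, n, dither, seed=0):
    """Average of n quantized readings of a constant input level.

    With dither=True, uniform noise spanning one step is added first
    (an assumed dither distribution for this sketch).
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n):
        noise = rng.uniform(-0.5, 0.5) if dither else 0.0
        total += quantize(level + noise)
    return total / n


# A level of 0.3 step is lost entirely without dither...
print(mean_quantized(0.3, 20000, dither=False))
# ...but is recovered, on average, when dither is added.
print(mean_quantized(0.3, 20000, dither=True))
```

Without dither every reading rounds to zero; with dither the output toggles between 0 and 1 with a duty cycle that tracks the input, so the average sits close to 0.3 at the cost of added noise.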

The recording section of a pulse code modulation (PCM) system, shown in Figure 3.4-10(a), consists of input amplifiers, a dither generator, input (anti-aliasing) low-pass filters, sample-and-hold circuits, analog-to-digital converters, a multiplexer, digital processing circuits for error correction and modulation, and a storage medium such as digital tape. The reproduction section, shown in Figure 3.4-10(b), contains processing circuits for demodulation and error correction, a demultiplexer, digital-to-analog converters, output sample-and-hold circuits, output (anti-imaging) low-pass filters and output amplifiers. In most contemporary designs, digital filters are used in both the input and output stages. The output section forms the basis for a compact disc player.


Figure 3.4-9. Summary of discrete-time sampling, shown in the time and frequency domains. (From Pohlmann, Principles of Digital Audio .)


Figure 3.4-10. Block diagram of the recording (a) and reproduction (b) sections of a linear PCM system. (From Pohlmann, Principles of Digital Audio .)


parent polycarbonate substrate. Data are represented by pits that are impressed on the top of the substrate. The pit surface is covered with a thin metal (typically aluminum) layer 50–100 nm thick, and a plastic layer 10–30 µm thick. A label 5 µm thick is printed on top. Disc physical characteristics are shown in Figure 3.4-11.

Pits are configured in a continuous spiral from the inner circumference to the outer. The pit construction of the disc is diffraction-limited; the dimensions are as small as permitted by the wave nature of light at the wavelength of the readout laser. A pit is about 0.5 µm wide. The track pitch is 1.6 µm. There are a maximum of 20,188 revolutions across the disc’s data area.

Compact Disc Format
The Compact Disc (CD) format was developed to store up to 74 minutes of stereo digital audio program material of 16 bit PCM data sampled at 44.1 kHz. Total user capacity is over 650 MB. In addition, for successful storage, error correction, synchronization, modulation and subcoding are required.

Compact Disc Physical Design
The diameter of a compact disc is 120 mm, its center hole diameter is 15 mm and its thickness is 1.2 mm. Data are recorded in an area 35.5 mm wide. It is bounded by a lead-in area and a lead-out area, which contain non-audio subcode data used to control the player’s operation. The disc is constructed with a trans-




The disc rotates with a constant linear velocity (CLV) in which a uniform relative velocity is maintained between the disc and the pickup. To accomplish this, the rotation speed of the disc varies depending on the radial position of the pickup. The disc rotates at a speed of about 8 rev/s when the pickup is reading the inner circumference, and as the pickup moves outward, the rotational speed gradually decreases to about 3.5 rev/s. The player reads frame synchronization words from the data and adjusts the speed to maintain a constant data rate.
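The rotation rates quoted above follow from dividing the linear velocity by the track circumference. The sketch below assumes a linear velocity of 1.3 m/s and program-area radii of 25 mm (inner) and 58 mm (outer); these three values are illustrative assumptions, not figures taken from the text, and the function name is hypothetical.

```python
import math


def rotation_rate_rev_per_s(linear_velocity_m_s, radius_m):
    """Revolutions per second needed to hold a given linear velocity at a radius."""
    return linear_velocity_m_s / (2 * math.pi * radius_m)


VELOCITY = 1.3         # m/s, assumed
INNER_RADIUS = 0.025   # m, assumed inner edge of the program area
OUTER_RADIUS = 0.058   # m, assumed outer edge of the program area

print(round(rotation_rate_rev_per_s(VELOCITY, INNER_RADIUS), 2))
print(round(rotation_rate_rev_per_s(VELOCITY, OUTER_RADIUS), 2))
```

Under these assumptions the results come out near 8 rev/s at the inner radius and near 3.5 rev/s at the outer radius, consistent with the figures in the text.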

The CD standard permits a maximum of 74 minutes, 33 seconds of audio playing time on a disc. However, by modifying encoding specifications such as track pitch and linear velocity, it is possible to manufacture discs with over 80 minutes of music. Although the linear velocity of the pit track on a given disc is constant, it can vary from 1.2–1.4 m/s, depending on disc playing time. All audio compact discs and players must be manufactured according to the Red Book, the CD standards document authored by Philips and Sony.

Compact Disc Encoding
CD encoding is the process of placing audio and other data in a frame format suitable for storage on the disc. The information contained in a CD frame prior to modulation consists of a 27 bit sync word, 8 bit subcode, 192 data bits and 64 parity bits. The input audio bit rate is 1.41 × 10^6 bps. Following encoding, the channel bit rate is 4.3218 × 10^6 bps. Premastered digital audio data are typically stored on a 3/4 in. U-matic video transport via a digital audio processor with a 44.1 kHz sampling rate and 16 bit linear quantization.

A frame is encoded with six 32 bit PCM audio sampling periods, alternating left and right channel 16 bit samples. Each 32 bit sampling period is divided to yield four 8 bit audio symbols. The CD system employs two error correction techniques: interleaving to distribute errors and parity to correct them. The standardized error correction algorithm used is the Cross Interleave Reed-Solomon Code (CIRC), developed specifically for the compact disc system. It uses two correction codes and three interleaving stages. With error correction, over 200 errors per second can be completely corrected.
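The division of a frame into audio symbols can be sketched as follows. This is an illustration only: it treats each sample as a raw 16 bit word, assumes the high-order byte is emitted first, and omits CIRC interleaving and parity generation entirely. The function name is hypothetical.

```python
def frame_audio_symbols(stereo_samples):
    """Split six (left, right) 16-bit sample pairs into 24 eight-bit symbols.

    Assumes raw 16-bit words with the high-order byte first; the
    interleaving and parity stages of CIRC are not modeled here.
    """
    if len(stereo_samples) != 6:
        raise ValueError("a frame holds exactly six sampling periods")
    symbols = []
    for left, right in stereo_samples:
        for sample in (left, right):
            symbols.append((sample >> 8) & 0xFF)  # high-order byte
            symbols.append(sample & 0xFF)         # low-order byte
    return symbols


frame = frame_audio_symbols([(0x1234, 0xABCD)] * 6)
print(len(frame), len(frame) * 8)  # symbols per frame, audio bits per frame
```

Six sampling periods of 32 bits each give 24 symbols, or 192 audio data bits per frame, matching the frame contents listed earlier.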

Subcode Data
Following CIRC encoding, an 8 bit subcode symbol is added to each frame. The 8 subcode bits (designated as P, Q, R, S, T, U, V, and W) are used as 8 independent channels. Only the P or Q bits are required in the audio format; the other 6 bits are available for video or other information as defined by the CDG/M (Graphics/MIDI) format. The CD player collects subcode symbols from 98 consecutive frames to form a subcode block with eight 98 bit words; blocks are output at a 75 Hz rate. A subcode block contains its own synchronization word, instruction and data, commands and parity. An example of P and Q data is shown in Figure 3.4-12.
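The assembly of a subcode block can be sketched as a simple bit transposition. The sketch assumes that P occupies the most significant bit of each subcode symbol and W the least significant; that bit ordering is an assumption made for illustration, and the function name is hypothetical.

```python
CHANNELS = "PQRSTUVW"
FRAMES_PER_BLOCK = 98


def split_subcode_block(symbols):
    """Turn 98 eight-bit subcode symbols into eight 98-bit channel words.

    Assumes P is the most significant bit of each symbol (illustrative only).
    """
    if len(symbols) != FRAMES_PER_BLOCK:
        raise ValueError("a subcode block spans 98 frames")
    words = {name: [] for name in CHANNELS}
    for symbol in symbols:
        for position, name in enumerate(CHANNELS):
            words[name].append((symbol >> (7 - position)) & 1)
    return words


block = split_subcode_block([0x80] * 98)  # only the P bit set in every frame
print(sum(block["P"]), sum(block["Q"]))

# Six sampling periods per frame at 44.1 kHz give 7,350 frames per second,
# so blocks of 98 frames occur 75 times per second.
print(44100 // 6 // FRAMES_PER_BLOCK)
```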

The P channel contains a flag bit that can be used to identify disc data areas. Most players use information in the more comprehensive Q channel. The Q channel contains four types of information: control, address, data and cyclic redundancy check code (CRCC) for subcode error detection. The control bits specify several playback conditions: the number of audio channels (two/four); preemphasis (on/off); and digital copy prohibited (yes/no). The address information consists of 4 bits designating three modes for the Q data bits. Mode 1 data are contained in the table of contents (TOC), which is read during disc initialization. The TOC stores data indicating the number of music selections as a track number, and the starting points of the tracks in disc running time. In the program and lead-out areas, Mode 1 contains track numbers, indices within a track, track time, and disc time. The optional Mode 2 contains the catalog number of the disc. The optional Mode 3 contains a country code, owner code, year of the recording and serial number.

EFM Encoding and Frame Assembly
The audio, parity and subcode data are modulated using eight-to-fourteen modulation (EFM) in which symbols of 8 data bits are assigned an arbitrary word of 14 channel bits. By choosing 14 bit words with a low number and known rate of transitions, greater data density can be achieved. Each 14 bit word is linked by three merging bits. The 8 bit input symbols require 256 different 14 bit code patterns. To achieve pits of controlled length, only those patterns are used in which at least two and no more than ten “0s” appear continuously between “1s.” Two other patterns are used for subcode synchronization words. The selection of EFM bit patterns defines the physical relationship of the pit dimensions. The channel stream comprises pits of nine different lengths and lands of nine different lengths, ranging from 3–11T, where T is one period. A 3T pit ranges in length from 0.833–0.972 µm and an 11T pit ranges in length from 3.054–3.560 µm, depending on pit track linear velocity. Each pit edge, whether leading or trailing, is a “1” and all increments in between, whether inside or outside a pit, are “0s,” as shown in Figure 3.4-13.
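Two of the points above lend themselves to a short numerical sketch: the pit lengths follow from the channel bit rate and the linear velocity, and the run-length rule can be tested on a candidate channel word. The function names are hypothetical, and the check only looks at the zeros between successive ones inside a word; runs at the word boundaries, which the merging bits handle, are not modeled.

```python
CHANNEL_BIT_RATE = 4.3218e6  # channel bits per second, from the text


def run_length_um(n_periods, velocity_m_s):
    """Physical length in micrometers of a pit or land lasting n channel periods."""
    return n_periods * velocity_m_s / CHANNEL_BIT_RATE * 1e6


def satisfies_run_length_rule(word):
    """True if successive 1s in the bit string are separated by 2 to 10 zeros."""
    ones = [index for index, bit in enumerate(word) if bit == "1"]
    gaps = [later - earlier - 1 for earlier, later in zip(ones, ones[1:])]
    return all(2 <= gap <= 10 for gap in gaps)


for periods in (3, 11):
    print(periods, round(run_length_um(periods, 1.2), 3),
          round(run_length_um(periods, 1.4), 3))

print(satisfies_run_length_rule("01001000100000"))  # gaps of 2 and 3 zeros
print(satisfies_run_length_rule("11000000000000"))  # adjacent 1s are rejected
```

The computed 3T and 11T lengths at 1.2 and 1.4 m/s agree with the ranges quoted in the text to within rounding.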

Figure 3.4-11. Compact disc physical specifications. (From Pohlmann, The Compact Disc.)

Figure 3.4-12. Typical subcode contents of the P and Q channels. (From Pohlmann, The Compact Disc.)

The start of a frame is marked with a 24 bit synchronization pattern, plus three merging bits. The total number of channel bits per frame after encoding is 588, comprising 24 synchronization bits, 336 (12 × 2 × 14) data bits, 112 (4 × 2 × 14) error correction bits, 14 subcode bits, and 102 (34 × 3) merging bits.
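The frame arithmetic can be verified directly. The sketch below uses only figures given in the text (six sampling periods per frame at 44.1 kHz, 192 audio data bits per frame, and the channel bit breakdown); the constant names are hypothetical.

```python
SYNC_BITS = 24
DATA_BITS = 12 * 2 * 14      # 336
PARITY_BITS = 4 * 2 * 14     # 112
SUBCODE_BITS = 14
MERGING_BITS = 34 * 3        # 102

CHANNEL_BITS_PER_FRAME = (SYNC_BITS + DATA_BITS + PARITY_BITS
                          + SUBCODE_BITS + MERGING_BITS)

SAMPLING_RATE = 44100
SAMPLING_PERIODS_PER_FRAME = 6
FRAMES_PER_SECOND = SAMPLING_RATE // SAMPLING_PERIODS_PER_FRAME

AUDIO_BIT_RATE = FRAMES_PER_SECOND * 192                       # bits per second
CHANNEL_BIT_RATE = FRAMES_PER_SECOND * CHANNEL_BITS_PER_FRAME  # bits per second

print(CHANNEL_BITS_PER_FRAME, FRAMES_PER_SECOND)
print(AUDIO_BIT_RATE, CHANNEL_BIT_RATE)
```

The totals reproduce the 588 channel bits per frame, the 1.41 × 10^6 bps audio rate and the 4.3218 × 10^6 bps channel rate quoted in the text.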

Data Readout
CD pickups use an aluminum gallium arsenide (AlGaAs) semiconductor laser generating laser light with a 780 nm wavelength. The beam passes through the substrate, is focused on the metalized pit surface and is reflected back. Because the disc data surface is physically separated from the reading side of the substrate, dust and surface damage on the substrate do not lie in the focal plane of the reading laser beam and hence their effect is minimized. The polycarbonate substrate has a refractive index of 1.55; because of the bending of the beam from the change in refractive index, thickness of the substrate, and the numerical aperture (0.45) of the laser pickup’s lens, the size of the laser spot is reduced from approximately 0.8 mm on the disc surface to approximately 1.7 µm at the pit surface. The laser spot on the data surface is an Airy function with a bright central spot and successively darker rings, and spot dimensions are quoted as half-power levels.

Figure 3.4-13. Channel bits as represented by the pit structure. (From Pohlmann, Principles of Digital Audio.)

When viewed from the laser’s perspective, the pits appear as bumps with height between 0.11–0.13 µm. The laser beam’s wavelength in polycarbonate is about 500 nm (780 nm divided by the refractive index of 1.55). The height of the bumps is thus approximately 1/4 of the laser’s wavelength in the substrate. The reflective flat surface of a CD is called land. Light striking land travels a distance one-half wavelength longer than light striking a bump, as shown in Figure 3.4-14. This creates an out-of-phase condition between the part of the beam reflected from the bump, and the part reflected from the surrounding land. The beam thus undergoes destructive interference, resulting in cancellation. Optically, if the CD pit surface is considered as a two-dimensional reflective grating, the focused laser beam diffracts into higher orders, resulting in interference. The disc surface data thus modulates the intensity of the reflected light beam. In this way the data physically encoded on the disc are recovered by the laser.
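The quarter-wavelength relationship can be checked with a few lines, using the 780 nm wavelength and the refractive index of 1.55 given in the text. The function name is hypothetical.

```python
LASER_WAVELENGTH_NM = 780.0   # wavelength in air, from the text
REFRACTIVE_INDEX = 1.55       # polycarbonate substrate, from the text


def wavelength_in_substrate_nm(wavelength_nm, refractive_index):
    """Wavelength of the light inside a medium of the given refractive index."""
    return wavelength_nm / refractive_index


in_substrate = wavelength_in_substrate_nm(LASER_WAVELENGTH_NM, REFRACTIVE_INDEX)
quarter_wave = in_substrate / 4

print(round(in_substrate, 1))      # about 500 nm
print(round(quarter_wave, 1))      # bump height giving a half-wave path difference
print(round(2 * quarter_wave, 1))  # extra round-trip path for light striking land
```

A bump one quarter wavelength high produces a round-trip path difference of one half wavelength, which is the condition for destructive interference.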

Data Decoding
A CD player’s data path, shown in Figure 3.4-15, directs the modulated light from the pickup through a series of processing circuits, ultimately yielding a stereo analog signal. Data decoding follows a procedure which essentially duplicates, in reverse order, the encoding process. The pickup’s photodiode array and its processing circuits output EFM data as a high frequency signal. The first data to be extracted from the signal are synchronization words. This information is used to synchronize the 33 symbols of channel information in each frame, and a synchronization pulse is generated to aid in locating the zero crossing of the EFM signal.

The EFM signal is demodulated so that 17 bit EFM words (14 channel bits plus three merging bits) again become 8 bits. A memory is used to buffer the effect of disc rotational wow and flutter. Following EFM demodulation, data are sent to a CIRC decoder for de-interleaving, and error detection and correction. The CIRC decoder accepts one frame of thirty-two 8 bit symbols: 24 audio symbols and 8 parity symbols. One frame of twenty-four 8 bit symbols is output. Parity from two Reed-Solomon decoders is utilized. The first error correction decoder corrects random errors and detects burst errors and flags them. The second decoder primarily corrects burst errors, as well as random errors that the first decoder was unable to correct. Error concealment algorithms employing interpolation and muting circuits follow CIRC decoding.

In most cases, the digital audio data are converted to a stereo analog signal. This reconstruction process requires one or two D/A converters, and low-pass filters to suppress high frequency image components. Rather than use an analog brickwall filter after the signal has been converted to analog form, the digitized signal is processed before D/A conversion using an oversampling digital filter. An oversampling filter uses samples from the disc as input, then computes interpolation samples, digitally implementing the response of an analog filter.

A finite impulse response (FIR) transversal filter is used in most CD players. Resampling is used to increase the sample rate; for example, in a four-times oversampling filter, three zero values are inserted for every data value output from the disc. This increases the data rate from 44.1 kHz to 176.4 kHz. Interpolation is used to generate the values of intermediate sample points, for example, three intermediate samples for each original sample. These samples are computed using coefficients derived from a low-pass filter response.

The spectrum of the oversampled output waveform contains image spectra placed at multiples of the oversampling rate; for example, in a four-times oversampled signal, the first image is centered at 176.4 kHz. Because the audio baseband and sidebands are separated, a low order analog filter can be used to remove the images, without causing phase shift or other artifacts common to high order analog brickwall filters.
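The zero insertion and interpolation steps can be sketched in a few lines. This is a deliberately simplified illustration: the seven-tap triangular kernel used here performs plain linear interpolation and stands in for the much longer low-pass coefficient sets a real player would use. All function names are hypothetical.

```python
def zero_stuff(samples, factor):
    """Insert factor - 1 zero values after every input sample."""
    out = []
    for sample in samples:
        out.append(sample)
        out.extend([0.0] * (factor - 1))
    return out


def fir_filter_centered(signal, coeffs):
    """Apply a symmetric FIR (transversal) filter, centered on each output sample."""
    half = len(coeffs) // 2
    out = []
    for m in range(len(signal)):
        acc = 0.0
        for j, coeff in enumerate(coeffs):
            k = m - j + half
            if 0 <= k < len(signal):
                acc += coeff * signal[k]
        out.append(acc)
    return out


def oversample(samples, factor=4):
    """Oversample by zero insertion followed by a triangular interpolation kernel.

    The triangular kernel (linear interpolation) is an assumed stand-in
    for coefficients derived from a true low-pass response.
    """
    coeffs = [1 - abs(k) / factor for k in range(-(factor - 1), factor)]
    return fir_filter_centered(zero_stuff(samples, factor), coeffs)


print(zero_stuff([0.0, 4.0, 8.0], 4))
print(oversample([0.0, 4.0, 8.0], 4)[:9])
print(44100 * 4)  # output sample rate in Hz for four-times oversampling
```

The original samples pass through unchanged and three intermediate values are computed between each pair, which is the behavior the text describes for a four-times oversampling filter.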

Traditionally, D/A conversion is performed with a multibit PCM converter. In theory, a 16 bit converter could perfectly process the 16 bit signal from the disc. However, because of inaccuracies in converters, 18 bit D/A converters are often used because they can more accurately represent the signal. Alternatively, low bit (sometimes called 1 bit) D/A converters can be used. They minimize many problems inherent in multibit converters such as low level nonlinearity and zero-cross distortion. Low bit systems employ very high oversampling rates, noise shaping, and low bit conversion.

Figure 3.4-14. A pit causes cancellation through destructive interference.

Figure 3.4-15. Block diagram of a CD player with digital filtering. (From Pohlmann, The Compact Disc.)

Also present in the audio output stage of every CD player is an audio deemphasis circuit. Some CDs are encoded with an audio preemphasis characteristic with time constants of 15 and 50 µsec. Upon playback, deemphasis is automatically carried out, resulting in an improvement in S/N.
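The two time constants correspond to corner frequencies through the standard first-order relationship f = 1/(2πτ). The sketch below simply evaluates that formula; the function name is hypothetical.

```python
import math


def corner_frequency_hz(time_constant_s):
    """Corner frequency of a first-order network with the given time constant."""
    return 1.0 / (2 * math.pi * time_constant_s)


print(round(corner_frequency_hz(50e-6), 1))  # lower corner of the emphasis curve
print(round(corner_frequency_hz(15e-6), 1))  # upper corner of the emphasis curve
```

The emphasis therefore acts between roughly 3.2 kHz and 10.6 kHz, boosting high frequencies on recording so that deemphasis on playback reduces noise in the same region.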

Recordable CD-R Format
With a CD-R (or CD-WO) write-once optical disc recorder, the user may record data until the disc capacity is filled. Recorded CD-R discs are playable on conventional CD players. A block diagram of a CD-R recorder is shown in Figure 3.4-16. An encoder circuit accepts an input PCM signal and performs CIRC error correction encoding, EFM modulating, and other coding and directs the data stream to the recorder. The recorder accepts audio data and records up to 74 minutes in real time. In addition to audio data, a complete subcode table is written in the disc TOC, and appropriate flags are placed across the playing surface.

Write-once media are manufactured similarly to conventional playback-only discs. As with regular CDs, they employ a substrate, reflective layer and protective top layer. Sandwiched between the substrate and reflective layer, however, is a recording layer comprised of an organic dye. Together with the reflective layer it provides a typical in-groove reflectivity of 70% or more. Unlike playback-only CDs, a pregrooved spiral track is used to guide the recording laser along the spiral track; this greatly simplifies recorder hardware design and ensures disc compatibility. Shelf life of the media is said to be 10 years or more at 25° C and 65% relative humidity. However the dye used in


Figure 3.4-16. Block diagram of a CD-R recorder. (From Pohlmann, The Compact Disc .)



flected from the reverse oriented regions differs from that reflected from unreversed regions, as shown in Figure 3.4-17(b). To read the disc, a low powered laser is focused on the data surface, and the angle of rotation of reflected light is monitored, thus recovering data from the laser light. To erase data, a magnetic field is applied to the disc, along with the laser heating spot. Tests indicate that MO media could be erased/recorded over 10 million times and would retain data for over 10 years.

Digital Audio Tape Format
The rotary-head digital audio tape (R-DAT or DAT) format was originally designed as a consumer medium to replace the analog cassette. However, the format has found wider application as a low cost professional digital recording system.

Format Specifications
The DAT format supports four record/playback modes and two playback-only modes. The standard record/playback mode, and both playback-only modes, wide and normal, are implemented on every DAT recorder. The standard mode offers 16 bit linear quantization and a 48 kHz sampling rate. Both playback-only modes use a 44.1 kHz sampling rate, for user- and prerecorded tapes. Three other record/playback modes, called Options 1, 2, and 3, all use 32 kHz sampling rates. Option 1 provides 2 hours of recording time with 16 bit linear quantization. Option 2 provides 4 hours of recording time with 12 bit nonlinear quantization. Option 3 provides 4 channel recording and playback, also using 12 bit nonlinear quantization. These specifications are summarized in Figure 3.4-18.

The user can write and erase nonaudio information into the subcode area: start ID indicating the beginning of a selection, skip ID to skip over a selection, and program number indicating selection order. These subcode data permit rapid search and other functions. Although subcode data are recorded onto the tape in

these discs is vulnerable to sunlight; thus, discs should not be exposed to bright sun over a long period.

The CD-R format is defined in the Orange Book Standard authored by Philips and Sony. In CD recorders adhering to the Orange Book I Standard, a disc must be recorded in one pass; start-stop recording is not permitted. In recorders adhering to the Orange Book II Standard, recording may be stopped and started. In many players, tracks may be recorded at different times and replayed, but because the disc lacks the final TOC, it can be played only on a CD-R recorder. When the entire disc is recorded, the interim TOC data are transferred to a final TOC, and the disc may be played in any CD audio player. The program memory area (PMA) located at the inner portion of the disc contains the interim TOC record of the recorded tracks. In addition, discs contain a power calibration area (PCA); this allows recorders to automatically make test recordings to determine optimum laser power for recording. Some recorders exceed the Orange Book II Standard; they generate an interim TOC that allows partially recorded discs to be played on playback-only CD players.

CD-R recorders are useful because they eliminate the need to create an edited master tape prior to CD recording. If a passage is not wanted, it can be marked prior to writing the final TOC so that the recorder will not play it back. For example, dead air during a live performance can be marked so it is deleted whenever the disc is played back. The data physically continue to exist on the disc, however.

Erasable CD-E Format
CD systems that provide for both recording and erasing are known as CD-E systems. Erasable optical systems permit data to be written, read, erased and written again. Several recordable/erasable optical media have been introduced, most notably, magnetooptical (MO) media. Magnetooptical recording technology combines magnetic recording and laser optics, utilizing the record/erase benefits of magnetic materials with the high density and contactless pickup of optical materials.

With magnetooptics, a magnetic field is used to record data, but the applied magnetic field is much weaker than conventional recording fields. It is not strong enough to orient the magnetic particles. However, the coercivity of the particles sharply decreases as they are heated to their Curie temperature. A laser beam focused through an objective lens heats a spot of magnetic material and only the particles in that spot are affected by the magnetic field from the recording coil, as shown in Figure 3.4-17(a). After the laser pulse is withdrawn, the temperature decreases and the orientation of the magnetic layer records the data. In this way, the laser beam creates a small recorded spot, thus increasing recording density.

The Kerr effect may be used to read data; it describes the slight rotation of the plane of polarization of polarized light as it reflects from a magnetized material. The rotation of the plane of polarization of light re-


Figure 3.4-17. Magnetooptical recording (a) and playback (b).




the helical scan track along with the audio signal, they are treated independently and can be rewritten without altering the audio program, and entered either during recording or playback. With the ID codes entered into the subcode area, desired points on the tape such as the beginning of selections can be searched for at high speed by detecting each ID code. During playback, if the skip ID is marked, playback is skipped to the point at which the next start ID is marked, and playback begins again.

In the DAT format, the recorded area is distinguished from a blank section of tape with no recorded signal, even if the recorded area does not contain an audio signal. Unlike blank areas, the track format is always encoded on the tape even if no signal is present. If these sections are mixed on a tape, search operations may be slowed. Hence, blank sections should be avoided. A consumer DAT deck with an interface meeting the specifications of the Sony Philips digital interface format (SPDIF) will identify when data have been recorded with a copy inhibit Serial Copy Management System (SCMS) flag in the subcode (ID6 in the main ID in the main data area) and will not digitally copy that recording. In other words, SCMS permits first generation digital copying, but not second generation copying. Analog copying is not inhibited.

DAT Recorder Design
From a hardware point of view, a DAT recorder utilizes many of the same elements as a CD-R recorder: A/D and D/A converters, modulators and demodulators, error correction encoding and decoding. Audio input is received in digital form, or is converted to digital by an A/D converter. Error correction code is added and interleaving is performed. As with any helical scan system, time compression must be used to separate the continuous input analog signal into segments prior to recording, then rejoin them upon playback with time expansion to form a continuous audio output signal. Subcode information is added to the bit stream, and it undergoes eight-to-ten (8/10) modulation. This signal is recorded via a recording amplifier and rotary transformer.

In the playback process the rotary head generates the record waveform. Track finding signals are derived from the tape and used to automatically adjust tracking. Eight-to-ten demodulation takes place and subcode data are separated and used for operator and servo control. A memory permits de-interleaving as well as time expansion and elimination of wow and flutter. Error correction is accomplished in the context of de-interleaving. Finally, the audio signal is output as a digital signal, or through D/A converters as an analog signal.

The DAT rotary head permits slow linear tape speed while achieving high bandwidth. Each track is discontinuously recorded as the tape runs past the tilted head drum spinning rapidly in the same direction as tape travel. The result is diagonal tracks at an angle of slightly more than 6° from the tape edge, as shown in Figure 3.4-19. Despite the slow linear tape speed of 8.15 mm/sec (about 1/3 in. per second), a high relative tape-to-head speed of about 3 m/sec (120 in. per second) is obtained. A DAT rotating drum (typically 30 mm in diameter) rotates at 2,000 rpm, typically has two heads placed 180° apart, and a tape wrap of only 90°. Four-head designs provide direct read after write, so the recorded signal can be monitored.
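The relative head-to-tape speed follows from the drum circumference and rotation rate. The sketch below uses the 30 mm drum diameter, 2,000 rpm rotation and 8.15 mm/sec tape speed given in the text, and ignores the small track angle; the function name is hypothetical.

```python
import math


def head_surface_speed_m_s(drum_diameter_m, rpm):
    """Speed of a head on the drum surface, in meters per second."""
    return math.pi * drum_diameter_m * rpm / 60.0


DRUM_DIAMETER_M = 0.030
DRUM_RPM = 2000
TAPE_SPEED_M_S = 0.00815

head_speed = head_surface_speed_m_s(DRUM_DIAMETER_M, DRUM_RPM)
# Drum and tape move in the same direction, so the speeds subtract
# (the slight tilt of the drum is neglected in this estimate).
relative_speed = head_speed - TAPE_SPEED_M_S

print(round(head_speed, 3), round(relative_speed, 3))
print(round(relative_speed / 0.0254, 1))  # inches per second
```

The estimate comes out slightly above 3 m/sec, or a little over 120 in. per second, in line with the approximate figures quoted in the text.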

Azimuth recording (or guard-bandless recording) is used, in which the drum’s two heads are angled differently with respect to the tape; this creates two

Figure 3.4-18. DAT standard specifications. (From Pohlmann,  Principles of Digital Audio .)



sync byte, ID code byte, block address code byte, parity byte and 32 data bytes. In total, there are 288 bits per data block; following 8/10 modulation, this is increased to 360 channel bits. Four 8 bit bytes are used for sync and addressing. The ID code contains information on pre-emphasis, sampling frequency, quantization level, tape speed, copy-inhibit flag, channel number, etc. Subcode data are used primarily for program timing and selection numbering. The subcode capacity is 273.1 kbps. The parity byte is the exclusive-OR sum of the ID and block address bytes, and is used to error correct them.
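The block layout and parity rule can be sketched as follows. The byte values in the example are arbitrary placeholders, the sync byte in particular is a hypothetical stand-in (the actual sync pattern is not given in the text), and the function names are introduced only for this illustration.

```python
DATA_BYTES_PER_BLOCK = 32


def build_block(sync, id_code, block_address, data):
    """Assemble one data block: sync, ID, address, parity and 32 data bytes.

    The parity byte is the exclusive-OR of the ID and block address bytes.
    The sync value passed in is a placeholder, not the real sync pattern.
    """
    if len(data) != DATA_BYTES_PER_BLOCK:
        raise ValueError("a block carries exactly 32 data bytes")
    parity = id_code ^ block_address
    return bytes([sync, id_code, block_address, parity]) + bytes(data)


def header_is_consistent(block):
    """True if the ID, address and parity bytes XOR to zero."""
    return (block[1] ^ block[2] ^ block[3]) == 0


block = build_block(0x00, 0x5A, 0x13, [0] * 32)
bits = len(block) * 8
print(bits, bits * 10 // 8)  # bits per block before and after 8/10 modulation
print(header_is_consistent(block))
```

Thirty-six bytes give the 288 bits per block stated in the text, and mapping every 8 bits to 10 channel bits gives 360.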

Since the tape is always in contact with the rotating heads during record, playback and search modes, tape wear necessitates sophisticated error correction. DAT is thus designed to correct random and burst errors. Random errors are caused by crosstalk from an adjacent track, traces of an imperfectly erased signal or mechanical instability. Burst errors occur from dropouts caused by dust, scratches on the tape or by head clogging with dirt.

To facilitate error correction, each data track is split into halves, between left and right channels. In addition, data for each channel is interleaved into even

track types, sometimes referred to as A and B, with differing azimuth angles between successively recorded tracks. This ±20° azimuth angle means that the A head will read an adjacent B track at an attenuated level due to phase cancellation. This reduces crosstalk between adjacent tracks, eliminates the need for a guard band between tracks and promotes high density recording. Erasure is accomplished by overwriting new data to tape such that successive tracks partially write over previous tracks. Thus the head gaps (20.4 microns) are approximately 50% wider than the tracks (13.59 microns) recorded to tape.

The length of each track is 23.501 millimeters. Each bit of data occupies 0.67 microns, with an overall recording data density of 114 Mb per square inch. With a sampling rate of 48 kHz and 16-bit quantization, the audio data rate for two channels is 1.536 Mbps. However, error correction encoding adds extra information amounting to about 60% of the original, increasing the data rate to about 2.46 Mbps. Subcode raises the overall data rate to 2.77 Mbps.
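The first two data-rate figures can be reproduced directly. The sketch below assumes the 48 kHz, 16-bit, two-channel case from the text and treats the error correction overhead as exactly 60%, which the text gives only as an approximation; it stops short of the final 2.77 Mbps figure because the text does not itemize the remaining overhead. Function names are illustrative.

```python
def pcm_rate_bps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Raw PCM audio data rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

def with_overhead_bps(rate_bps: int, overhead_percent: int) -> int:
    """Data rate after adding redundancy expressed as a percentage of the original."""
    return rate_bps * (100 + overhead_percent) // 100

audio_rate = pcm_rate_bps(48_000, 16, 2)
protected_rate = with_overhead_bps(audio_rate, 60)  # 60% is an assumed round figure

print(f"audio: {audio_rate / 1e6:.3f} Mbps")
print(f"with error correction: {protected_rate / 1e6:.2f} Mbps")
```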

The primary types of data recorded on each track are PCM audio, subcode and automatic track finding (ATF) patterns. Each data (or sync) block contains a


Figure 3.4-19. DAT track configuration. (From Pohlmann, Principles of Digital Audio.)

and odd data blocks, one for each head; half of each channel’s samples are recorded by each head. All of the data are encoded with a doubly-encoded Reed-Solomon error correction code. The error correction system can correct any dropout error up to 2.6 mm in diameter, or a stripe 0.3 mm high. Dropouts up to 8.8 mm long and 1.0 mm high can be concealed with interpolation.
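Separating even and odd samples is what makes concealment by interpolation possible: if one group is lost, every missing sample still has surviving neighbors. The sketch below is a deliberately simplified illustration, not the actual DAT interleave pattern, and the function names are hypothetical.

```python
def split_even_odd(samples):
    """Separate a sample stream into even-indexed and odd-indexed groups."""
    return samples[0::2], samples[1::2]

def conceal_missing_odd(even):
    """Rebuild a stream whose odd-indexed samples were lost.

    Each missing sample is estimated as the mean of its two even-indexed
    neighbors; the final missing sample repeats the last surviving value.
    """
    rebuilt = []
    for i, current in enumerate(even):
        rebuilt.append(current)
        following = even[i + 1] if i + 1 < len(even) else current
        rebuilt.append((current + following) / 2)
    return rebuilt

original = [0, 10, 20, 30, 40, 50]
even, odd = split_even_odd(original)
print(conceal_missing_odd(even))
```

For a slowly changing signal such as this ramp, the interpolated values match the lost samples closely; only the final sample, which has no later neighbor, is held rather than interpolated.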

Serial Interfacing

Most professional digital audio devices employ an output protocol using the joint Audio Engineering Society (AES) and European Broadcasting Union (EBU) serial transmission format for digital audio data. It is known as the AES/EBU or AES3 format and is specified in the ANSI S4.40-1992 Standard. Manufacturers of consumer electronics have adopted a derivative transmission format that has been standardized by the International Electrotechnical Commission (IEC) and is commonly referred to as SPDIF. It is specified in the IEC Report TC 84/WG11 and IEC Publication 958.

The AES/EBU digital audio format transmits and receives left and right channel data using one digital cable. The transmission rate corresponds exactly to the source sampling frequency. One frame consists of two subframes, labeled A (left channel) and B (right channel), each with 32 bits. Each subframe contains data for one audio channel. The first 4 bits are used for synchronization and identifying preambles. The next 24 bits carry audio data, with the MSB transmitted last; 16-bit audio data leaves 4 bits at zero; the first 4 bits in the field are set aside for auxiliary audio or other data, as shown in Figure 3.4-20.

The last 4 bits form a control field with the V, U, C, and P bits. The validity (V) bit indicates if the previous audio sample is error free. The user (U) bit can be used to form a block of user data associated with the audio channel. The channel (C) status bit is used to form a data block; for each channel, one block is formed from the channel status bit contained in 192 successive frames. The parity (P) bit is used to provide even parity for each subframe. The AES/EBU Standard specifies that data are transmitted over twisted pair conductors, with 3-pin XLR connectors. There is a supplemental document, called AES-3id–1995, that specifies the method for transmission of AES formatted data by unbalanced coaxial cable.
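The subframe layout described above can be sketched as a list of 32 bits in transmission order. This is an illustration only: the real preamble is a special line-code pattern rather than four ordinary data bits, so a placeholder of zeros is used, and the parity bit is assumed to cover every slot after the preamble. The function name and defaults are hypothetical.

```python
def build_subframe(sample: int, bits: int = 16, validity: int = 0,
                   user: int = 0, channel_status: int = 0) -> list:
    """Return one subframe as 32 bits in transmission order.

    `sample` is the audio word as an unsigned bit pattern of width `bits`.
    """
    if not 1 <= bits <= 20:
        raise ValueError("this sketch handles words of 1 to 20 bits")
    if not 0 <= sample < (1 << bits):
        raise ValueError("sample does not fit in the stated word length")

    preamble = [0, 0, 0, 0]              # placeholder, not the real sync pattern
    auxiliary = [0, 0, 0, 0]             # first 4 bits of the 24-bit audio field
    word20 = sample << (20 - bits)       # shorter words leave the low bits at zero
    audio = [(word20 >> i) & 1 for i in range(20)]   # LSB first, MSB last
    payload = auxiliary + audio + [validity, user, channel_status]
    parity = sum(payload) % 2            # makes the count of ones even
    return preamble + payload + [parity]

subframe = build_subframe(0x8001)
print(len(subframe), subframe[8:12], subframe[27])
```

With a 16-bit word, slots 8 through 11 stay at zero and the most significant bit lands in slot 27, immediately before the V, U, C and P bits.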

In the IEC or SPDIF serial format used in consumer digital audio equipment the first bit of the channel status byte is set to “0” to signify a consumer interface, and a different channel status specification is used. In addition, when interfacing CD data, provision is made to transmit the subcode data in the user bit channel. The consumer format also contains provision for SCMS in the channel status bits. In particular, bit 2 is used to flag copyrighted material, and bit 15 distinguishes between original and copied material. In the IEC or SPDIF format, video coaxial cable with phono plugs or fiber optic cables can be used to convey data.
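A receiver gathers one channel status bit from each of 192 successive frames and then inspects individual positions in the resulting block. The sketch below shows only that bookkeeping. Because the text does not state which polarity of bits 2 and 15 means what, their raw values are reported without interpretation; all names are hypothetical.

```python
BLOCK_LENGTH = 192  # one channel status bit from each of 192 successive frames

def collect_channel_status(c_bits) -> list:
    """Assemble a channel status block from the C bits of successive frames."""
    block = list(c_bits)
    if len(block) != BLOCK_LENGTH:
        raise ValueError("a channel status block needs exactly 192 bits")
    return block

def describe_block(block: list) -> dict:
    """Report the bit positions named in the text."""
    return {
        "consumer_interface": block[0] == 0,   # first bit 0 signifies consumer use
        "copyright_flag_bit": block[2],        # raw value, polarity not interpreted
        "generation_flag_bit": block[15],      # raw value, polarity not interpreted
    }

example_bits = [0] * BLOCK_LENGTH
example_bits[2] = 1
print(describe_block(collect_channel_status(example_bits)))
```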

Digital interconnection formats such as AES3 and SPDIF allow audio data to be transferred from one device to another without any generation loss whatsoever. For example, audio data from a DAT recording could be conveyed to a hard disk editing system, processed there, and returned to DAT without transmission error. However, it is important to connect AES/EBU outputs only to AES/EBU inputs, and IEC/SPDIF outputs only to IEC/SPDIF inputs. AES/EBU and IEC/SPDIF interfaces should not be interconnected. Transmitted data could be invalid, or lead to improper machine operation.

Digital Signal Processing

Digital signal processing (DSP) has improved the performance of many existing audio functions such as equalization and dynamic range compression, and permits new functions such as ambience processing, dynamic noise cancellation and time alignment. DSP is a technology used to analyze, manipulate or generate signals in the digital domain. It uses the same principles as any digitization system; however, instead of a storage medium such as CD or DAT, it is a processing method.

DSP Applications and Design

DSP employs technology similar to that used in computers and microprocessor systems; however, there is an important distinction. A regular computer processes data, whereas a DSP system processes signals. It is accurate to say that an audio DSP system is in reality a computer dedicated to the processing of audio signals.

Some audio functions DSP can perform include: error correction, multiplexing, sample rate conversion, speech and music synthesis, data compression, filtering, adaptive equalization, dynamic range compression


Figure 3.4-20. AES/EBU subframe format. (From Pohlmann, Principles of Digital Audio.)

As noted, DSP can be used in lieu of most conventional analog processing circuits. The advantages of DSP are particularly apparent when various applications such as recording, mixing, equalization and editing are combined in a workstation. For example, a personal computer, combined with a DSP hardware card, hard disk drive, appropriate software and a DAT or CD recorder forms a complete post production system. Such a system allows comprehensive signal manipulation including ability to cut, paste, copy, replace, reverse, trim, invert, fade in, fade out, smooth, loop, mix, change gain and pitch, crossfade and equalize. The integrated nature of such a workstation, its low cost, and high processing fidelity make it clearly superior to analog techniques.


A continuing limitation of all common analog or digital audio systems is the requirement for moving media. The sophistication of high fidelity rotation and pickup devices inherently necessitates certain minimum sizes and production costs. Although some small solid-state storage media exist, the volume of data and high price of memory prohibit today’s technology from being applied to professional audio recording systems.

Radical static storage approaches are being researched and at least one, photorefractive volume holographic storage (PVHS), shows impressive promise.

and expansion, crossovers, reverberation, ambience processing, time alignment, acoustic noise cancellation, mixing and editing and acoustic analysis. Some DSP functions are embedded within other applications; for example, the error correction systems and oversampling filters found in CD players are examples of DSP. In other applications the user has control over the DSP functions.

Digital processing is more precise and repeatable, and can perform operations that are impossible with analog techniques. Noise and distortion can be much lower with DSP; thus audio fidelity is much higher. In addition, whereas analog circuits age, lose calibration and are susceptible to damage in harsh environments, DSP circuits do not age, cannot lose calibration and are much more robust. However, DSP technology is expensive to develop. Hardware engineers must design the circuit or employ a DSP chip, and software engineers must write appropriate programs. Special concerns must be addressed when writing the code needed to process the signal. For example, if a number is simply truncated without regard to its value, a significant error could occur, and the error would be compounded as many calculations take place, each using truncated results. The resulting numerical error would be manifested as distortion in the output signal. Thus all computations on the audio signal must be highly accurate. This requires long wordlengths; DSP chips employ digital words that are 32 bits in length, or longer.
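The compounding of truncation error is easy to demonstrate with integer arithmetic. The sketch below assumes a fixed-point format with 15 fractional bits and a very quiet signal of 3 least-significant-bit units scaled by 0.5; it compares truncating every product with keeping a long accumulator and shortening the word only once at the end. The format, the values and the function name are assumptions chosen for illustration.

```python
FRAC_BITS = 15  # assumed fixed-point format: 15 fractional bits

def scaled_sum(samples, coefficient, truncate_each_product: bool) -> int:
    """Sum of sample * coefficient, returned with FRAC_BITS fractional bits."""
    accumulator = 0
    for sample in samples:
        product = sample * coefficient        # carries 2 * FRAC_BITS fractional bits
        if truncate_each_product:
            # Discard the low-order bits immediately, as a short wordlength would.
            product = (product >> FRAC_BITS) << FRAC_BITS
        accumulator += product
    return accumulator >> FRAC_BITS           # shorten the word once, at the end

samples = [3] * 1000                 # 3 LSB units, repeated 1000 times
half = 1 << (FRAC_BITS - 1)          # the coefficient 0.5 in fixed point

truncated = scaled_sum(samples, half, truncate_each_product=True)
long_word = scaled_sum(samples, half, truncate_each_product=False)
print(truncated, long_word)
```

Each product is 1.5 units, so truncating every one of them discards half a unit a thousand times over, while the long accumulator preserves the exact total of 1500. This is the reason long internal wordlengths are used.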

In addition, even simple DSP operations may require several intermediate calculations and complex operations may require hundreds of operations. To accomplish this, the hardware must execute the steps very quickly. Because all computation must be accomplished in real time, that is, within the span of one sample period, the processing speed of the system is crucial. A DSP chip must often process 50–100 million instructions per second. This allows it to run complete software programs on every audio sample as it passes through the chip.
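The real-time constraint translates into a fixed instruction budget for each sample. A minimal sketch, assuming a 48 kHz sampling rate (the text does not tie the instruction figures to a particular rate) and an illustrative function name:

```python
def instructions_per_sample(mips: int, sample_rate_hz: int) -> int:
    """Whole instructions available within one sample period."""
    return (mips * 1_000_000) // sample_rate_hz

for mips in (50, 100):
    budget = instructions_per_sample(mips, 48_000)
    print(f"{mips} MIPS at 48 kHz: {budget} instructions per sample")
```

At these speeds the processor has roughly one to two thousand instructions to spend on every sample, which must be shared among all channels and all processing functions running at once.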

DSP products are more complicated than similar analog circuits, but DSP possesses an inherent advantage over analog technology: it is programmable. Using software, many complicated functions can be performed entirely with coded instructions. Figure 3.4-21(a) shows a band-pass filter using conventional analog components. Figure 3.4-21(b) shows the same filter, represented as a DSP circuit. It employs the three basic DSP operators of delay, addition and multiplication. However, this DSP circuit may be realized in software terms.

Figure 3.4-21(c) shows an example of the computer code (Motorola DSP56001) needed to perform band-pass filtering with a DSP chip. There are many advantages to this software implementation. Whereas hardware circuits would require new hardware components and new circuit design to change their processing tasks, the software implementation could be changed by altering parameters in the code. Moreover, the program could be written so different parameters could be employed based on user control.
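The same idea can be sketched in a general-purpose language. The code below is not the Motorola code of the figure; it is a generic second-order band-pass section built from delays, additions and multiplications, using a widely published coefficient formula that is assumed here. The center frequency, Q and all names are illustrative.

```python
import math

def bandpass_coefficients(center_hz: float, q: float, sample_rate_hz: float):
    """Coefficients for a second-order band-pass with unity gain at the center."""
    w0 = 2.0 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    feedforward = (alpha / a0, 0.0, -alpha / a0)
    feedback = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return feedforward, feedback

def bandpass(samples, feedforward, feedback):
    """Filter a sequence using only delay, addition and multiplication."""
    x1 = x2 = y1 = y2 = 0.0                  # delayed input and output samples
    output = []
    for x0 in samples:
        y0 = (feedforward[0] * x0 + feedforward[1] * x1 + feedforward[2] * x2
              - feedback[0] * y1 - feedback[1] * y2)
        output.append(y0)
        x2, x1 = x1, x0                      # advance the delay lines
        y2, y1 = y1, y0
    return output

def steady_state_peak(freq_hz, feedforward, feedback, sample_rate_hz=48_000, count=4800):
    """Peak output level for a unit sine wave after the start-up transient."""
    tone = [math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz) for n in range(count)]
    return max(abs(v) for v in bandpass(tone, feedforward, feedback)[-1000:])

ff, fb = bandpass_coefficients(center_hz=1000.0, q=5.0, sample_rate_hz=48_000)
print(steady_state_peak(1000.0, ff, fb), steady_state_peak(100.0, ff, fb))
```

Changing the filter's center frequency or bandwidth requires only new arguments to `bandpass_coefficients`, which is the programmability advantage described above.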


Figure 3.4-21. A band-pass filter represented by an analog circuit (a), digital signal processing circuit (b), and digital signal processing instructions (c). (Courtesy Motorola.)

This random access memory device has a capacity measured in hundreds of gigabytes. The data is stored as three-dimensional optical holograms, with data being written and retrieved as two-dimensional patterns of laser light. A light-sensitive crystal (Holostore) serves as the medium for PVHS.

This form of crystal memory is non-volatile, removable, and achieves data transfer rates of 1 TB (terabyte) per second. Today’s fastest magnetic disk would take five hours to resolve the same amount of data. It has been reported that a removable PVHS medium of some 10 × 10 × 0.5 cm could store 100 GB, about a month’s worth of continuous stereo digital audio at a data-reduced bit rate (uncompressed CD-rate audio would fill it in about a week). Over the coming decade the emergence of static optical storage may reduce moving magnetic media applications to antiquity.
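How long 100 GB lasts depends strongly on the bit rate assumed. The sketch below takes 1 GB as 10^9 bytes and compares uncompressed 16-bit, 44.1 kHz stereo with a hypothetical data-reduced rate of 256 kbps; both rates and the function name are assumptions for illustration.

```python
SECONDS_PER_DAY = 86_400

def playing_time_days(capacity_bytes: float, bit_rate_bps: float) -> float:
    """Continuous playing time, in days, for a given capacity and bit rate."""
    return capacity_bytes * 8 / bit_rate_bps / SECONDS_PER_DAY

capacity = 100e9                     # 100 GB, taking 1 GB as 10**9 bytes
cd_rate = 44_100 * 16 * 2            # uncompressed CD-rate stereo
reduced_rate = 256_000               # hypothetical data-reduced stereo rate

print(f"uncompressed: {playing_time_days(capacity, cd_rate):.1f} days")
print(f"data-reduced: {playing_time_days(capacity, reduced_rate):.1f} days")
```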


Alten, Stanley R., Audio in Media, Fourth Edition, Wadsworth Publishing Company, Belmont, California, 1994.

Audio Engineering Society, “AES Recommended Practice for Digital Audio Engineering—Serial Transmission Format for Linearly Represented Digital Audio Data,” Journal of the Audio Engineering Society, vol. 33, no. 12, December, 1985.

Carasso, M.G., Peck, J.B.H., Sinjou, J.P., “The Compact Disc Digital Audio System,” Philips Technical Review, vol. 40, no. 6, 1982.

EBU (European Broadcasting Union). “Specification of the Digital Audio Interface.” EBU Doc. Tech., 3250.

IEC (International Electrotechnical Commission). “Draft Standard for a Digital Audio Interface.” IEC Report TC 84/WG11, November, 1986.

Jorgensen, Finn, The Complete Handbook of Magnetic Recording, 4th Edition, McGraw-Hill, 1996.

Mee, Dennis C. & Daniel, Eric D., editors, Magnetic Recording Technology, Second Edition, McGraw-Hill, 1996.

Pohlmann, Ken C., Principles of Digital Audio, Third Edition, McGraw-Hill, 1995.

Pohlmann, Ken C., The Compact Disc, Second Edition, AR Editions, Madison, WI, 1992.

Pohlmann, Ken C., Principles of Digital Audio, Second Edition, Howard W. Sams and Co., Carmel, IN, 1989.

Pohlmann, Ken C., editor, Advanced Digital Audio, Howard W. Sams and Co., Carmel, IN, 1991.

Watkinson, John, RDAT, Focal Press, Oxford, 1991.

Additional Sources of Information

Tremaine, Howard M., Audio Cyclopedia, Howard W. Sams, Inc., now out of print but still available via specialty bookstores.