5 Speed setting
5.1 Introduction

Few things give more trouble than setting the speed of an anomalous recording correctly. There are many factors in the equation, and often they are contradictory. This writer therefore feels it is important, not only to take corrective action, but to document the reasons why a decision has been made. Without such documentation, users of the transferred recording will be tempted to take further corrective action themselves, which may or may not be justified - no-one knows everything!

I must (with respect) point out that “psychoacoustics” can often play a dominant role in speed-setting. Personally, I can’t do the following trick myself, but many musicians consistently and repeatedly get a sensation that something is “right” when they hear music at the correct pitch. They usually can’t articulate how they know, and since I don’t know the sensation myself, I can’t comment; but it’s my duty to point out the potential traps of this situation.

It’s a craft that musicians will have learnt. I am not saying that such musicians are necessarily “wrong”. I am, however, saying that musical pitch has changed over the years, that actual performances will have become modified for perfectly good scientific reasons, and yet hardly anybody has researched these matters. Ideally therefore, analogue sound restoration operators should make themselves aware of all the issues, and be prepared to make judgements when trying to reach “the original sound” or “the intended original sound.”

When we come to make an objective copy, there are two types of analogue media which need somewhat different philosophies. One occurs when the medium gives no indication of where a particular sound is located, the main examples being full-track magnetic tape and magnetic wire. In these cases it is impossible even to add such information retrospectively without sacrificing some of the power-bandwidth product, because there are no sprockets, no pulses, no timecode, nor any “spare space” to add them. But other cases have a “location-mechanism by default.” For example, we could refer to a particular feature being at “the 234th turn of the disc record”. It is very possible that future digital processes may use information like this; and ideally we should not sacrifice such information as we convert the sound to digital. During this chapter we shall see that it is often impossible to set a playing-speed with greater accuracy than one percent. In such cases, it may be advantageous to invoke a “digital gearbox” to lock the rotational speed of the disc with the sampling-frequency of the digital transfer, so the rotations of the disc do not evaporate from the digital copy.

Pure sound operators are sometimes unaware that a very close lock is vital in some circumstances, so I shall define that word “lock.” It means that the speed of the original medium and the speed of the transferred sound must match to a very tight tolerance (typically one part in a million). This is much tighter than most ordinary sound media can do; so we may need to create our own “digital gearbox,” especially for digital signal-processes downstream of us. And this means we may have to do some creative thinking to establish a suitable gear-ratio.

On the other hand, it is impossible to “lock” analogue media which gradually change in speed with a fixed “gearbox.” But obviously some form of gearbox is essential for a sound medium intended to accompany moving pictures, since it is always implied that “Pictures are King,” and sound must follow in synchronism, even if it’s actually the wrong speed! As an illustration of the point I am trying to make, to provide consistently reproducible sound for a film running at 24 frames per second, we could multiply the frame-rate by 2000 (making 48000), and clock our analogue-to-digital converter from that.
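The arithmetic of such a “digital gearbox” can be sketched as follows. This is only an illustration under stated assumptions: the function name samples_per_event is hypothetical, 48000Hz is simply the sampling-frequency from the film example, and the 33 1/3rpm disc is an assumed second case. The ratios hold exactly only if the converter clock really is derived from the medium's own speed reference.

```python
from fractions import Fraction

def samples_per_event(sample_rate_hz, events_per_second):
    """Exact number of samples per frame (or per disc revolution).

    Hypothetical helper: exact fractions show whether the
    "gear-ratio" is a whole number of samples.
    """
    return Fraction(sample_rate_hz) / Fraction(events_per_second)

# Film at 24 frames per second, converter clocked at 24 x 2000 = 48000Hz.
film_ratio = samples_per_event(48000, 24)
print(film_ratio)  # 2000

# Assumed example: a disc at 33 1/3 rpm, i.e. (100/3)/60 revolutions per second.
disc_ratio = samples_per_event(48000, Fraction(100, 3) / 60)
print(disc_ratio)  # 86400
```

When the ratio is a whole number, every frame or revolution begins on a sample boundary, so a downstream process can find “the 234th turn” simply by counting samples.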

5.2 History of speed control

I shall start with a brief history of speed-control in analogue sound recording, and ask you to bear it in mind as different situations come up.

The very earliest cylinder and disc machines were hand-cranked, but this was soon found unsatisfactory, except for demonstrating how pitch varied with speed! Motor-drive was essential for anything better. Early recorders of the 1880s and 1890s were powered by unregulated DC electric motors from primitive electrochemical cells. A slow speed drift of several percent is the normal result.

Clockwork motors, both spring and weight powered, quickly replaced electric motors because of the attention which such early batteries demanded. But mainsprings were less reliable than falling weights, so they tended to be used only where portability was essential (location recording), and for amateur use. The centrifugal governor was adopted at the same time to regulate such motors; the one in the surviving acoustic lathe at the EMI Archive, which is weight-powered, is made to exactly the same pattern as in spring gramophones. Oddly enough, a spring would have given better results for edge-start disc-records. According to Hooke’s law, the tension in a spring is proportional to its displacement, so there was more torque at the start of the recording, precisely where it was most needed. Yet professional disc recordists actually preferred the weight system, justifying the choice with the words “Nothing is more consistent than gravity.”

The governor could be adjusted within quite wide limits (of the order of plus or minus twenty percent). Most commercial disc records were between 70 and 90rpm, with this range narrowing as time progressed. Likewise, although location or amateur cylinders might well differ widely from the contemporary standards (section 5.4), they were often quite constant within the recording itself.

In the late 1920s alternating-current electric mains became common in British urban areas, and from the early 1930s AC electric motors began to be used for both disc and film recording. These motors were affected by the frequency of the supply. During cold winters and most of the second World War, frequencies could vary.

BBC Radio got into such trouble with its programmes not running to time that it adopted a procedure for combating it, which I shall mention in detail because it provides one of the few objective ways of setting speeds that I know. It applies to “Recorded Programmes” (i.e. tape and disc recordings with a “R. P. Number”, as opposed to BBC Archive or Transcription recordings) made within a mile or two of Broadcasting House in London. (I shall be mentioning these further in section 6.52).

The various studios took line-up tone from a centrally-placed, very stable, 1kHz tone-generator (which was also used for the “six pips” of the Greenwich Time Signal). When a recording was started, a passage of this line-up tone was recorded, not only to establish the programme volume (its main purpose), but as a reference in case the frequency of the supply was wrong. When the disc or tape was played back, it was compared with the tone at the time, and the speed could be adjusted by ear with great accuracy. We can use this technique today.

If you are playing a “BBC Recorded Programme” and you have an accurate 1kHz tone-source or a frequency-counter, you can make the recording run at precisely the correct speed. This applies to recordings made at Broadcasting House, 200 Oxford Street, and Bush House; you can recognise these because the last two letters of the R. P. Reference Number prefixes are LO, OX and BU respectively. But do not use the system on other BBC recordings, made for example in the regions. The master-oscillators at these places were deliberately made different, so that when engineers were establishing the landlines for an inter-regional session, they could tell who was who from the pitch of the line-up tone. But there was an internal procedure which stated that either accurate 1kHz tone was used, or tone had to be at least five percent different. So if you find a line-up tone outside the range 950Hz - 1050Hz, ignore it for speed-correction purposes.
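The line-up tone procedure reduces to a simple ratio, sketched below. The function name and its parameters are hypothetical; the 1kHz reference and the 950Hz to 1050Hz window are taken from the text above, and the measured frequency is assumed to come from your own frequency-counter.

```python
def speed_correction_from_tone(measured_hz, reference_hz=1000.0,
                               low_hz=950.0, high_hz=1050.0):
    """Factor by which to multiply the playback speed, or None.

    None means the tone lies outside the 950Hz - 1050Hz window, so it
    was probably a deliberately offset regional tone and must not be
    used for speed-correction.
    """
    if not (low_hz <= measured_hz <= high_hz):
        return None
    return reference_hz / measured_hz

# A tone that replays at 980Hz means the machine is running slow.
factor = speed_correction_from_tone(980.0)
print(factor)  # a little over 1.02: speed up by about two percent

# A 900Hz tone is outside the window, so it is ignored.
print(speed_correction_from_tone(900.0))  # None
```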

To continue our history of speed-stability. Transportable electrical recording machinery became available from the late 1930s which could be used away from a mains supply. It falls into three types. First we have the old DC electric motor system, whose speed was usually adjusted by a rheostat. (For example, the BBC Type C disc equipment, which a specialist can recognise from the appearance of the discs it cut. In this case an electronic circuit provided a stroboscopic indicator, although the actual speed-control was done manually by the engineer). Next we have mains equipment running from a “transverter” or “chopper”, a device which converted DC from accumulators into mains-voltage A.C. (For example, the machine used by wildlife recordist Ludwig Koch. These devices offered greater stability, but only as long as the voltage held up).

Finally we have low-voltage DC motors controlled by rapidly-acting contacts from a governor. (For example, the EMI “L2” portable tape recorder). All these systems had one thing in common. When they worked, they worked well; but when they failed, the result was catastrophic. The usual cause was a drop in the battery voltage, making the machine run at a crawl. Often no-one would notice this at the time. So you should be prepared to do drastic, and unfortunately empirical, speed correction in these cases.

It wasn’t until the “transistor age” that electronic ways of controlling the speed of a motor without consuming too much power became available, and in 1960 the first “Nagra” portable recorder used the technology. From the late 1960s electronic speed control became reliable on domestic portable equipment. Similar technology was then applied to mains equipment, and from about 1972 onwards the majority of studio motors in Britain began to be independent of the mains frequency.

But do not assume your archive’s equipment is correct without an independent check. I insist: an independent check. Do not rely on the equipment’s own tachometers or internal crystals or any other such gizmology. I regret I have had too much experience of top-of-the-range hardware running at the wrong speed, even though the hardware itself actually insists it is correct! You should always check it with something else, even if it’s only a stroboscopic indicator illuminated by the local mains supply, or a measured length of tape and a stopwatch. As an example of this problem, I shall mention the otherwise excellent Technics SL.1200 Turntable, used by broadcasters and professional disc-jockeys. This is driven from an internal crystal; but the same crystal generates the light by which the stroboscope is viewed. The arithmetic of making this work forces the stroboscope around the turntable to have 183 bars, rather than the 180 normally needed for 50Hz lighting in Europe. So the actual speed may be in error, depending how you interpret the lighting conditions!
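The stroboscope arithmetic can be checked with a few lines. This is a sketch only: the function name is hypothetical, and it assumes the lamp flashes twice per mains cycle, as ordinary mains lighting does. It shows what would happen if a 183-bar pattern were judged under 50Hz room lighting instead of the turntable's own lamp.

```python
def strobe_stationary_rpm(bars, mains_hz):
    """Speed at which a stroboscope with `bars` bars appears stationary.

    Assumes two flashes per mains cycle; the pattern stands still when
    the disc advances by exactly one bar between flashes.
    """
    flashes_per_minute = 2 * mains_hz * 60
    return flashes_per_minute / bars

# 180 bars under 50Hz lighting: stationary at 33 1/3 rpm.
print(strobe_stationary_rpm(180, 50))

# 183 bars under 50Hz lighting: stationary at a slightly lower speed.
rpm_183 = strobe_stationary_rpm(183, 50)
error_percent = (rpm_183 / (100 / 3) - 1) * 100
print(rpm_183, error_percent)  # roughly 32.8 rpm, about 1.6 percent slow
```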

5.3 History of speed-control in visual media

I must also give you some information about the methods of speed-control for film and video. Pure sound archivists may run into difficulties here, because if you don’t understand the source of the material, you may not realise you have something at the wrong speed. And if your archive collects picture media as well, you need a general idea of the history of synchronisation techniques, as these may also affect the speed of the soundtrack. But if your archive doesn’t use any sound which has accompanied moving pictures in the past, I suggest you jump to Section 5.4.

As this isn’t a history of film and video, I am obliged to dispense with the incunabula of the subject, and start with the media which the European archivist is most likely to encounter.

In silent-film days, cameras were generally hand-cranked, and the intended speed of projection was between twelve and twenty frames per second. For the projector, the shutter had two blades, or sometimes three; this chopped up the beam and raised the apparent frequency of flicker so that it was above the persistence of vision. Moving scenes did, of course, jerk past slower than that, but this was acceptable because the brightness of cinema screens was lower in those days, and the persistence of vision of human eyes increases in dim light.

But there was a sudden and quite definite switch to a higher frame-rate when sound came along. This happened even when the sound was being recorded on disc rather than film, so it seems that the traditional story of its being necessary to allow high audio frequencies onto the film is wrong. I suspect that increasing viewing-standards in cinemas meant the deficiencies of slower speeds were very obvious to all. So, with motorised cameras now being essential if the accompanying soundtrack were to be reproduced with steady pitch, the opportunity was taken for a radical change. Anyway, nearly all sound films were intended for projection at 24 frames per second, and all studio and location film crews achieved this speed by appropriate gearing in conjunction with A.C. motors fed from a suitable A.C. supply.

There were two basic methods which were used for both recording and projection; they ran in parallel for a couple of years, and films made by one process were sometimes copied to the other. They were optical sound (which always ran at the same speed as the picture, whether on a separate piece of celluloid or not), and coarsegroove disc (always exactly 33 1/3rpm when the picture was exactly 24 frames per second). Most film crews had separate cameras and sound recorders running off the same supply, and the clapper-board system was used so the editor could match the two recordings at the editing-bench.

Because location-filming often required generous supplies of artificial light, location crews took a “power truck” with them to generate the power; but this does not mean the A.C. supply was more vulnerable to change, because of a little-known oddity. The speed of 24 frames per second had the property of giving steady exposure whether the camera looked at 50Hz or 60Hz lighting. If however the lights and the camera were running off separate supplies, there was likely to be a cyclic change in the film exposure, varying from perhaps once every few seconds to several times per second. Location electricians therefore spent a significant amount of time checking that all the lights plus the picture and sound cameras were all working off the same frequency of supply, wherever the power actually came from.

Since the invention of the video camera, this fact has been “rediscovered”, because 50Hz cameras give “strobing” under 60Hz lights and vice-versa. So, since the earliest days of talkies, location power supplies have never been allowed to vary, or strobing would occur. Thus we can assume fairly confidently that feature-films for projection in the cinema are all intended to run at 24 frames per second. And whether the sound medium is sepmag, sepopt, commag, comopt, or disc, it can be assumed that 24 fps working was the norm - until television came along, anyway.

But when television did come along in the late 1940s, this perfection was ruined. Television was 25 fps in countries with 50Hz mains, and 30 fps in countries with 60Hz mains, to prevent “rolling hum-bars” appearing on the picture. This remains generally true to this day. (For the pedantic, modern colour NTSC pictures - as used in America and Japan - are now 29.97 frames per second). In Europe we are so used to 50Hz lighting and comparatively dim television screens that we do not notice the flicker; but visiting Americans often complain at our television pictures and fluorescent lighting, as they are not used to such low frequencies of flicker at home.

Before the successful invention of videotape (in America in 1956), the only way of recording television pictures was “telerecording” (US: “kinescoping”) - essentially filming a television screen by means of a film camera. Telerecording is still carried out by some specialists; the technique isn’t quite dead. All current television systems use “interlacing,” in which the scene is scanned in two passes called “fields” during one frame, to cut down the effect of flicker. To record both halves of the frame equally, it is necessary for the film camera to be exactly “locked” to the television screen, so that there are precisely 25 exposures per second in 50Hz countries, and 30 (or later 29.97) exposures per second in 60Hz countries.

So whether the sound medium is comopt, commag or sepmag, the speed of a telerecording soundtrack is always either 25, 30 or 29.97 frames per second. Thus, before you can handle an actual telerecording, you must know that it is a telerecording and not a conventional film, and run it on the appropriate projector. A cinema-type projector will always give the wrong speed.

The real trouble occurs when film and video techniques are mixed, for example when a cinema film is shown on television. We must not only know whether we are talking about a film or a telerecording, but we must also know the country of transmission.

In Europe, feature films have always been broadcast at 25 frames per second. Audio and video transfers from broadcast equipment are therefore a little over four percent fast, which raises the pitch by roughly seven-tenths of a semitone. Thus music is always at the wrong pitch, and all voices are appreciably squeaky. There is quite a lot of this material around, and unless you know the provenance, you may mistakenly run it at the wrong speed. Keen collectors of film music sometimes had their tape-recorders modified to run four percent faster on record, or four percent slower on playback; so once again you have to know the provenance to be certain.
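The size of this error is easily worked out. The following sketch uses a hypothetical function name and assumes equal-tempered semitones (twelve to the octave) as the measure of pitch.

```python
import math

def speed_error(actual_rate, intended_rate):
    """Return (percent speed error, pitch shift in equal-tempered semitones)."""
    ratio = actual_rate / intended_rate
    percent = (ratio - 1.0) * 100.0
    semitones = 12.0 * math.log2(ratio)
    return percent, semitones

# A 24 fps film run at 25 fps for European television.
percent, semitones = speed_error(25, 24)
print(percent, semitones)  # about 4.17 percent fast, about 0.71 semitone sharp
```

The same function serves for any pair of speeds, for example a disc recorded at one speed and replayed at another.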

Meanwhile, cinema films broadcast in 60Hz countries are replayed at the right speed, using a technique known as “three-two pulldown.” The first 24fps frame is scanned three times at 60Hz, taking one-twentieth of a second; the next frame is scanned twice, taking one-thirtieth of a second. Thus two frames take one-twelfth of a second, which is correct. The pictures have a strange jerky motion which is very conspicuous to a European, but Americans apparently don’t notice it because they’ve always had it.
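The timing of “three-two pulldown” can be verified with exact fractions. This is a minimal sketch with hypothetical names, assuming a nominal 60 fields per second (not the 29.97 frames per second of colour NTSC).

```python
from fractions import Fraction
from itertools import cycle

def three_two_pulldown(frame_count, field_rate_hz=60):
    """Return (frame index, time on screen in seconds) for each film frame.

    Film frames are scanned alternately three times and twice.
    """
    field_time = Fraction(1, field_rate_hz)
    pattern = cycle([3, 2])
    return [(i, next(pattern) * field_time) for i in range(frame_count)]

schedule = three_two_pulldown(24)
print(schedule[0][1], schedule[1][1])  # 1/20 1/30

# Twenty-four film frames occupy sixty fields, i.e. exactly one second.
total = sum(duration for _, duration in schedule)
print(total)  # 1
```

The average rate is therefore correct, but alternate frames stay on screen for unequal times, which is the source of the jerky motion.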

Optical films shot specifically for television purposes usually differed from the telerecording norm in America. They were generally 24 frames per second like feature films. This was so such films could be exported without the complications of picture standards-conversion. But in Europe, cameramen working for TV have generally had their cameras altered so they shoot at 25 frames per second, like telerecordings. Thus stuff shot on film for television is at the right speed in its country of origin; but when television films cross the Atlantic in either direction they end up being screened with a four percent error. Only within the last decade or so have some American television films been shot at 30 frames per second for internal purposes.

Up to this point I have been describing the conventional scenario. To fill in the picture, I’m afraid I must also mention a few ways in which this scenario is wrong, so you will be able to recognise the problems when they occur.

Although feature-films are made to world-wide standards, there was a problem when commercial videos became big business from about 1982 onwards. Some videos involving music have been made from American films (for example Elvis Presley movies), and these were sometimes transferred at 24 fps to get the pitch right. This was done by what is laughingly called “picture interpolation.” To show twenty-four frames in the time taken for twenty-five, portions of the optical frame were duplicated at various intervals; this can be seen by slow-motion analysis of the picture. The sound therefore came out right, although the pictures were scrambled. In cases of doubt, still-frame analysis of a PAL or SECAM video can be used as evidence to prove the audio is running correctly!

More often, it is considered preferable not to distort the picture. Here I cannot give you a foolproof recipe. My present experience (1999) suggests that most of the time the sound is four percent fast; but I understand some production-houses have taken to using a “Lexicon” or “harmonizer” or other device which changes pitch independently of speed (Ref. 1). Thus if the musical or vocal pitch is right and there are no video artefacts, it may mean that the tempo of the performance is wrong.

But there have now been two more twists in the story. Sometimes American television material is shot on film at 24 fps, transferred to 30 fps videotape for editing and dubbing into foreign languages, and then subjected to electronic standards-conversion before being sent to Europe. This gives the right speed of sound, with movement-anomalies on the pictures; again, you can regard the presence of movement anomalies as evidence that the sound is right.

The second twist came with Snell and Wilcox’s “DEFT” electronic standards-converter, which has sufficient solid-state memory to recognise when “three-two pulldown” has taken place. It is then possible to reverse-engineer the effect to “two-two pulldown,” and copy steady images to a video recorder running at 48 fields per second, ready for transmission on a conventional video machine at 50Hz. Again, the steady pictures warn you something is wrong with the sound.

5.4 Setting the speed of old commercial sound records

In setting the speed of professional audio recordings, my opinion is that the first consideration (which must predominate in the absence of any other evidence) is the manufacturer’s recommended speed. For the majority of moulded commercial cylinder records, this was usually 160 revolutions per minute; for most coarsegroove records, it was about 80rpm until the late 1920s and then 78rpm until microgroove came along; for magnetic tapes, it was usually submultiples of 60 inches per second. (Early “Magnetophon” tapes ran a little faster than 30 inches per second, and this is thought to apply to EMI’s earliest master-tapes made before about 1951. Ref. 2).

Unfortunately it isn’t always easy for the modern archivist to discover what the recommended speed actually was. It does not always appear on the record itself, and if it is mentioned at all it will be in sales literature, or instructions for playback equipment made by the same company.

The recommended speeds of Edison commercial pre-recorded cylinders have been researched by John C. Fesler (Ref. 3). The results may be summarised as follows:

1888-1892: 100rpm
Mid-1892 to at least 1st November 1899: 125rpm
June 1900 to the beginning of moulded cylinders: 144rpm
All moulded cylinders (late 1902 onwards): 160rpm.

It is also known that moulded cylinders by Columbia were intended to revolve at 160rpm, and this forms the “baseline” for all moulded cylinders; so do not depart from 160rpm unless there is good reason to do so.

The following is a list of so-called 78rpm discs which weren’t anywhere near 78, all taken from “official” sources, contemporary catalogues and the like.

Berliner Gramophone Company. Instructions for the first hand-cranked gramophones recommended a playing-speed of about 100rpm for the five-inch records dating from 1890-1894, and 70rpm for the seven-inch ones dating from about 1894-1900. But these are only “ballpark” figures.
Brunswick-Cliftophone (UK) records prior to 1927 were all marked 80rpm. Since they were all re-pressings from American stampers, this would appear to fix the American Brunswicks of this time as well.
Columbia (including Phoenix, Regal, and Rena): according to official company memoranda, 80rpm for all recordings made prior to 1st September 1927, from both sides of the Atlantic; 78rpm thereafter. But I should like to expand on this. The company stated in subsequent catalogues that Columbia records should be played “at the speed recommended on the label.” This is not quite true, because sometimes old recordings were reissued from the original matrixes, and the new versions were commonly labelled “Speed 78” by the printing department in blissful ignorance that they were old recordings. The best approach for British recordings is to use the matrix numbers. The first 78rpm ones were WA6100 (ten-inch) and WAX3036 (twelve-inch). At this point I should like to remind you that I am still talking about official speeds, which may be overridden by other evidence, as we shall see in sections 5.6 onwards. Note too that Parlophone records, many of which were pressed by Columbia, were officially 78.
Edison “Diamond discs” (hill-and-dale): 80rpm.
Grammavox: 77rpm. (The Grammavox catalogue was the pre-war foundation for the better-known UK “Imperial” label; Imperial records numbered below about 900 are in fact Grammavox recordings).
Vocalion: All products of the (British) Vocalion company, including “Broadcast”, “Aco”, and “Coliseum”, and discs made by the company for Linguaphone and the National Gramophonic Society, were officially 80rpm.

Finally, there are a few anomalous discs with a specific speed printed on the label. This evidence should be adopted in the absence of any other considerations.

There also has to be a collection of “unofficial” speeds; that is to say, the results of experience which have shown when not to play 78s at 78.

It is known that for some years the US Victor company recorded its master-discs at 76rpm, so they would sound “more brilliant” when reproduced at the intended speed of 78rpm. (This seems to be a manifestation of the syndrome whereby musicians tune their instruments sharp for extra brilliance of tone). At the 1986 Conference of the Association of Recorded Sound Collections, George Brock-Nannestad presented a paper which confirmed this. He revealed the plan was mentioned in a letter from Victor to the European Gramophone Company dated 13th July 1910, when there had been an attempt to get agreement between the two companies; but the Gramophone Company evidently considered this search for artificial brilliance was wrong, and preferred to use the same speeds for recording and playback. George Brock-Nannestad said he had confirmed Victor’s practice upon several occasions prior to the mid-1920s.
Edison-Bell (UK) discs (including “Velvet Face” and “Winner”) tended to be recorded on the high side, particularly before 1927 or so; the average is about 84rpm.
Pathé recordings before 1925 were made on master-cylinders and transferred to disc or cylinder formats depending upon the demand. The speed depends on the date of the matrix or mould, not the original recording. The earliest commercial cylinders ranged from about 180rpm to as much as 200rpm, and then they slowed to 160 just as the company switched to discs in 1906. The first discs to be made from master-cylinders were about 90rpm, but this is not quite consistent; two disc copies of Caruso’s famous master-cylinders acquired by Pathé, one pressed in France and one in Belgium, have slightly different speeds. And some Pathé disc sleeves state “from 90 to 100 revolutions per minute.” But a general rule is that Pathé discs without a paper “label” (introduced about 1916) will have to run at about 90rpm, and those with a paper label at about 80. The latter include “Actuelle,” British “Grafton,” and some “Homochord.”
In 1951 His Master’s Voice issued their “Archive Series” of historic records (VA and VB prefixes). The company received vituperation from collectors and reviewers for printing “SPEED 78” in clear gold letters upon every label, despite the same records having been originally catalogued with the words “above 78” and “below 78.”

Quite often official recommended speeds varied from one record to the next. I will therefore give you some information for such inconsistent commercial records.

Odeon, pre-1914. The English branch of the Odeon Record company, whose popular label was “Jumbo”, were first to publicise the playing speeds for their disc records. They attempted to correct the previous misdemeanours of their recording-engineers in the trade magazine Talking Machine News (Vol.VI No.80, September 1908), in which the speeds of the then-current issues were tabulated.
Subsequently, Jumbo records often carried the speed on the label, in a slightly cryptic manner (e.g. “79R” meant 79 revolutions per minute), and this system spread to the parent-company’s Odeon records before the first World War. We don’t know nowadays precisely how these speeds were estimated. And, although I haven’t conducted a formal survey, my impression is that when an Odeon record didn’t carry a speed, it was often because it was horribly wrong, and the company didn’t want to admit it.
Gramophone, pre-1912. The leading record company in Europe was the Gramophone Company, makers of HMV and Zonophone records. In about 1912 they decided to set a standard of 78rpm, this being the average of their contemporary catalogue, and they also conducted listening experiments on their back-catalogue. The resulting speed-estimates were published in catalogues and brochures for some years afterwards; for modern readers, many can be found in the David and Charles 1975 facsimile reprint “Gramophone Records of the First World War.”
Experienced record-collectors soon became very suspicious of some of these recommendations. But if we ignore one or two obvious mistakes, and the slight errors which result from making voices recognisable rather than doing precise adjustments of pitch, the present writer has a theory which accounts for most of the remaining results. Gramophones of 1912 were equipped with speed-regulators with a central “78” setting and unlabelled marks on either side. It seems there was a false assumption that one mark meant one revolution-per-minute. But the marks provided by the factory were arbitrary, and the assumption gave an error of about 60 percent; that is to say, one mark was about two-and-a-half rpm. So when the judges said “Speed 76” (differing from 78 by two), they should have said “Speed 73” (differing from 78 by five). If you’re confused, imagine how the catalogue editors felt when the truth began to dawn. It’s not surprising they decided to make the best of a bad job, and from 1928 onwards the rogue records were listed simply as “above 78” or “below 78”. Nevertheless, it is a clear indication to us today that we must do something!
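If the theory about the regulator marks is accepted, the catalogue figures can be re-interpreted as below. This is only a sketch of that theory: the function name is hypothetical, the figure of two-and-a-half rpm per mark is the approximate value quoted above, and the result is a starting-point for listening tests, not a measured speed.

```python
def corrected_catalogue_speed(catalogue_rpm, centre_rpm=78.0, rpm_per_mark=2.5):
    """Re-interpret a published speed as a count of regulator marks from 78.

    Assumes each unit of difference from 78 was really one regulator
    mark, worth about two-and-a-half rpm.
    """
    marks = catalogue_rpm - centre_rpm
    return centre_rpm + marks * rpm_per_mark

print(corrected_catalogue_speed(76))  # 73.0
print(corrected_catalogue_speed(79))  # 80.5
```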

Speeds were usually consistent within a recording-session. So you should not make random speed-changes between records with consecutive matrix numbers unless there is good reason to do so. But there are some exceptions. Sometimes one may find alternative takes of the same song done on the same day with piano accompaniment and orchestral accompaniment; these may appear to be at different speeds. This could be because the piano was at a different pitch from the orchestra, or because a different recording-machine was used. When a long side was being attempted, engineers would sometimes tweak the governor of the recording-machine to make the wax run slower. I would recommend you to be suspicious of any disc records made before 1925 which are recorded right up to the label edge. These might have been cut slower to help fit the song onto the disc.

I must report that so-called 78rpm disc records were hardly ever recorded at exactly 78rpm anyway. The reason lies in the different mains frequencies on either side of the Atlantic, which means that speed-testing stroboscopes gave slightly different results when illuminated from the local mains supply, because the arithmetic resulted in decimals. In America a 92-bar stroboscope suggests a speed of 78.26087rpm; in Europe a 77-bar stroboscope suggests a speed of 77.922078rpm. The vast majority of disc recording lathes then ran at these speeds, which were eventually fixed in national (but not international) standards. From now on, you should assume 78.26 for American recordings and 77.922 for European recordings whenever I use the phrase “78rpm disc.” A similar problem occurs with 45rpm discs, but not 33 1/3s; stroboscopes for this speed can be made to give exact results on either side of the Atlantic.
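The decimals arise because a stroboscope must have a whole number of bars. The sketch below uses hypothetical function names and assumes two flashes of light per mains cycle; it reproduces the two “78rpm” figures and shows which nominal speeds can be matched exactly.

```python
from fractions import Fraction

def bars_needed(rpm, mains_hz):
    """Exact (possibly fractional) number of bars for a given speed."""
    return Fraction(2 * mains_hz * 60) / Fraction(rpm)

def nearest_strobe_speed(rpm, mains_hz):
    """Nearest whole number of bars, and the exact speed it indicates."""
    bars = round(bars_needed(rpm, mains_hz))
    return bars, Fraction(2 * mains_hz * 60, bars)

for mains_hz in (60, 50):
    bars, speed = nearest_strobe_speed(78, mains_hz)
    print(mains_hz, bars, float(speed))  # 92 bars at 60Hz, 77 bars at 50Hz

# 33 1/3 rpm needs a whole number of bars on both mains frequencies.
print(bars_needed(Fraction(100, 3), 50), bars_needed(Fraction(100, 3), 60))  # 180 216

# 45 rpm is exact at 60Hz but not at 50Hz.
print(bars_needed(45, 60), bars_needed(45, 50))  # 160 400/3
```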

5.5 Musical considerations

My policy as a sound engineer is to start with the “official” speed, taking into account the known exceptions given earlier. I change this only when there’s good reason to do so. The first reason is usually that the pitch of the music is wrong.

It’s my routine always to check the pitch if I can, even on a modern recording. (Originals are often replayed on the same machine, so a speed error will cancel out; thus an independent check can reveal an engineering problem). I only omit it when I am dealing with music for films or other situations when synchronisation is more important than pitch. Copies of the two “Dictionaries of Musical Themes” may be essential (unfortunately, there isn’t an equivalent for popular music). Even so, there are a number of traps to setting the speed from the pitch of the music, which can only be skirted with specialist knowledge.

The first is that music may be transposed into other keys. Here we must think our way into the minds of the people making the recording. It isn’t easy to transpose; in fact it can only be done by planning beforehand with orchestral or band accompaniments, and it’s usually impossible with more than two or three singers. So transposition can usually be ruled out, except for established or VIP soloists accompanied by an orchestra; for example, Vera Lynn, whose deep voice was always forcing Decca’s arranger Mantovani to re-score the music a fourth lower.

Piano transposition was more frequent, though in my experience only for vocal records. Even so, it happened less often than may be supposed. Accompanist Gerald Moore related how he would rehearse a difficult song transposed down a semitone, but play in the right key on the actual take, forcing his singer to do it correctly. So transposition isn’t common, and it’s usually possible to detect when the voice-quality is compared with other records of the same artist. For the modern engineer the problem is to get sufficient examples to be able to sort the wheat from the chaff. A more insidious trap is the specialist producer who’s been listening to the recordings of a long-dead artist all his life, and who’s got it wrong from Day 1!

A useful document is L. Heavingham Root’s article “Speeds and Keys” published in the Record Collector Volume 14 (1961; pp. 30-47 and 78-93). This gives recommended playing-speeds for vocal Gramophone Company records during the “golden years” of 1901 to 1908, but unfortunately a promised second article covering other makes never appeared. Mr. Root gave full musical reasons for his choices. Although a scientific mind would challenge them because he said he tuned his piano to C = 440Hz (when presumably he meant A = 440Hz), this author has found his recommendations reliable.

Other articles in discographies may give estimated playing-speeds accurate to four significant figures. This is caused by the use of stroboscopes for measuring the musically-correct speed, which can be converted to revolutions-per-minute to a high degree of accuracy; but musical judgement can never be more accurate than two significant figures (about one percent), for reasons which will become apparent in the next few paragraphs. Tighter accuracy is only necessary for matching the pitches of two different recordings which will be edited together or played in quick succession.

5.6 Strengths and weaknesses of “standard pitch”

The next difficulty lies in ascertaining the pitch at which musical instruments were tuned. There has always been a tendency for musicians to tune instruments sharp for “extra brilliance”, and there is some evidence that standard pitches have risen slowly but consistently over the centuries in consequence. There were many attempts to standardise pitch so different instruments could play together; but the definitive international agreement did not come until 1939, after over four decades of recording using other “standards.” You will find it fascinating to read a survey done just prior to the International Agreement.

Live broadcasts of classical music from four European countries were monitored and measured with great accuracy. (Ref. 4). There were relatively small differences between the countries, the averages varying only from A = 438.5 (England) to A = 441.2 (Germany). Of the individual concerts, the three worst examples were all solo organ recitals in winter, when the temperature made them go flat. When we discount those, they were overshadowed by pitch variations which were essentially part of the language of music. They were almost exactly one percent peak-to-peak. Then, as now, musicians hardly ever play exactly on key; objective measurement is meaningless on expressive instruments. Thus, a musically trained person may be needed to estimate what the nominal pitch actually is.

Other instruments, such as piano-accordions and vibraphones, have tuning which cannot be altered. When a band includes such instruments, everyone else has to tune to them, so pitch variations tend to be reduced. So ensembles featuring these instruments may be more accurate.

Instead of judging by ear, some operators may prefer the tuning devices which allow pop musicians to tune their instruments silently on-stage. Korg make a very wide range, some of which can also deal with obsolete musical pitches. They can indicate the actual pitch of a guitar very accurately, so it could be possible to use one to measure a recording. But they give anomalous results when there is strong vibrato or harmonics, so this facility must be used with care, and only on recordings of instruments with “fixed tuning” (by which I mean instruments such as guitars with frets, which restrict “bending” the pitch as a means of musical expression). In other cases (particularly ensembles), I consider a musical ear is more trustworthy.

5.7 Non-“standard” pitches

Before the 1939 Agreement, British concert pitch (called “New Philharmonic Pitch” or “Flat Pitch”) was A = 435 at 60 degrees Fahrenheit (in practice, about 439 at concert-hall temperatures). The International Agreement rounded this up to A = 440 at 68 degrees (20 Celsius), and that was supposed to be the end of the matter. Nevertheless, it is known that the Berlin Philharmonic and the Philadelphia orchestras use A = 444Hz today (a Decca record-producer recalled that the Vienna Philharmonic was at this pitch in 1957, with the pitch rising even further in the heat of the moment). The pianos used for concertos with such orchestras are tuned even higher. This may partly be because the pitch goes up in many wind instruments as the temperature rises, while piano strings tend to go flatter. Concert-halls in Europe and America are usually warmer than 68 Fahrenheit; it seems that only us poor Brits try using 440Hz nowadays!

The next complication is that acoustic recording-studios were deliberately kept very warm, so the wax would be soft and easy to cut. Temperatures of 90F were not uncommon, and this would make some A=440 instruments go up to at least 450. On the other hand, different instruments are affected by different amounts, those warmed by human breath much less than others. (This is why an orchestra tunes to the oboe). The human voice is hardly affected at all; so when adjusting the speed of wind-accompanied vocal records, accurate voice-quality results from not adjusting to correct tuning pitch. Thus we must make a cultural judgement: what are we aiming for, correct orchestral pitch or correct vocal quality? For some types of music, the choice isn’t easy.

There are sufficient exceptions to A = 440 to fill many pages. The most important is that British military bands and some other groups officially tuned to “Old Philharmonic Pitch” or “Sharp Pitch” before 1929, with A at a frequency which has been given variously as 452.5 and 454. Since many acoustic recordings have military band accompaniments, this shows that A = 440 can be quite irrelevant. Much the same situation occurred in the United States, but I haven’t been able to find out if it applied elsewhere. Nor do I know the difference between a “military” band and a “non-military” one; while it seems British ones switched to A = 440 in about 1964. Before that, it was apparently normal for most British wind players to possess two instruments, a “low pitch” one and a “high pitch” one, and it was clearly understood which would be needed well in advance of a session or concert.
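To show how much is at stake, here is a rough Python sketch of the speed error which results from assuming A = 440 for a band which actually tuned to Old Philharmonic Pitch. The function names are hypothetical; the two frequencies are the ones quoted above.

```python
import math

def semitones_between(pitch_hz, reference_hz=440.0):
    """Interval in equal-tempered semitones from reference_hz up to pitch_hz."""
    return 12 * math.log2(pitch_hz / reference_hz)

def speed_ratio_if_assumed(true_a_hz, assumed_a_hz=440.0):
    """Ratio of the speed you would set (making A sound at assumed_a_hz)
    to the correct speed (at which A sounds at true_a_hz)."""
    return assumed_a_hz / true_a_hz

for a in (452.5, 454.0):
    ratio = speed_ratio_if_assumed(a)
    print(f"A = {a}: {semitones_between(a):.2f} semitone sharp of 440; "
          f"wrong assumption plays {100 * (1 - ratio):.1f} percent slow")
```

On these figures the discrepancy is roughly half a semitone, or nearly three percent in speed, which is well outside the one percent achievable by musical judgement.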

Of course, most ethnic music has always been performed at pitches which are essentially random to a European, and the recent revival of “original instrumental technique” means that much music will differ from standard pitch anyway.

Because of their location in draughty places, pipe organs tended to be much more stable than orchestras. Many were tuned to Old Philharmonic Pitch when there was any likelihood of a military band playing with them (the Albert Hall organ was a notorious example). Because an organ is quite impossible to tune “on the night”, the speed of such location recordings can actually be set more precisely than studio ones; but for perfect results you must obviously know the location, the history of the organ, the date of the record, and the temperature!

With ethnic music, it is sometimes possible to get hold of an example of a fixed-pitch instrument and use it to set the speed of the reproduction. Many collectors of ethnic music at the beginning of the century took a pitch-pipe with them to calibrate the cylinder recordings they made. (We saw all sorts of difficulties with professionally-made cylinders at the start of section 5.4, and location-recordings varied even more widely). Unfortunately, this writer has found further difficulties here - the pitch of the pitch-pipe was never documented, the cylinder was often still getting up to speed when it was sounded, and I even worked on one collection where the pitch-pipe was evidently lost, some cylinders were made without it, and then another (different) one was found! In these circumstances, you can sometimes make cylinders play at the correct relative pitches (in my case, “nailed down” more closely by judging human voices), but you cannot go further. Sometimes, too, if you know what you’re doing, you can make use of collections of musical instruments, such as those at the Horniman or Victoria & Albert museums.

5.8 The use of vocal quality

Ultimately, however, the test must be “does it sound right?” And with some material, such as solo speech, there may be no other method. Professional tape editors like myself were very used to the noises which come from tape running at strange speeds, and we also became used to the effect of “everything clicking into place”. One of the little games I used to play during a boring editing-session was to pull the scrap tape off the machine past the head, and try my luck at aiming for fifteen inches per second by ear, then going straight into “play” to see if I’d got it right. Much of the time, I had. All the theories go for nothing if the end-result fails to “gel” in this manner. Only when one is forced to do it (because of lack of any other evidence) should one use the technique; but, as I say, this is precisely when we must document why a solution has been adopted.

The operator must, however, be aware of two potential difficulties. The first is that the average human being has become significantly bigger during the past century. It is medically known that linear dimensions have increased by about five percent. (Weights, of course, have increased by the cube of this). Thus the pitch of formants will be affected, and there would be an even greater effect with the voices of (say) African pygmies.

John R. T. Davies also uses the “vocal quality” technique. He once tried to demonstrate it to me using an American female singer. He asserted that it was quite easy to judge, because at 78rpm the voice was “pinched”, as if delivered through lips clenched against the teeth. I could hear the difference all right, but could not decide which was “right”. It wasn’t until I was driving home that I worked out why the demonstration didn’t work for me. I am not acquainted with any native American speakers, and I am not a regular cinemagoer. So my knowledge of American speech has come from television, and we saw earlier that American films are transmitted four percent faster in Europe. So I had assumed that American vocal delivery was always like that. The point I’m making is that personal judgements can be useful and decisive; but it’s vital for every individual to work out for himself precisely where the limits of his experience lie, and never to go beyond them.

Nevertheless I consider we should always take advantage of specialist experience when we can. When John R. T. Davies isn’t certain what key the jazz band is playing in, he gets out his trumpet and plays along with the improvisation to see what key gives the easiest fingering. I sometimes ask a colleague who is an expert violinist to help me. She can recognise the sound of an “open string”, and from this set the speed accurately by ear. And, although the best pianists can get their fingers round anything, I am told it is often possible to tell when a piano accompaniment has been transposed, because of the fingering. This type of “evidence”, although indisputable, clearly depends upon specialist knowledge.

5.9 Variable-speed recordings

So far, we’ve assumed that a record’s speed is constant. This is not always the case. On 78rpm disc records, the commonest problem occurs because the drag of the cutter at the outside edge of the wax was greater than at the inside edge, so the master-record tended to speed up as the groove-diameter decreased. I have found this particularly troublesome with pre-EMI Columbias, though it can crop up anywhere. It is, of course, annoying when you try to join up the sides of a multi-side work. But even if it’s only one side, you should get into the habit of skipping the pickup to the middle and seeing if the music is at the same pitch. On the other hand, some types of performances (particularly unaccompanied singers and amateur string players) tend to go flatter as time goes by; so be careful. A technique for solving this difficulty was mentioned in paragraph 6 of section 1.6; that is, to collect evidence of the performance of the disc-cutter from a number of sessions around the same date.

Until about 1940 most commercial recording lathes were weight-powered, regulated by a centrifugal governor like that on a clockwork gramophone. A well-designed machine would not have any excess power capability, because there was a limit to how much power could be delivered by a falling weight. The gearing was arranged to give just enough power to cut the grooves at the outside edge of the disc, while the governor absorbed little power itself. The friction-pad of a practical governor came on gently, because a sudden “on/off” action would cause speed oscillations; so, as it was called upon to absorb more power, the speed would rise slightly. By the time the cutter had moved in an inch or two, the governor would be absorbing several times as much power as it did at the start, and the proportion would remain in the governor’s favour. So you will find that when this type of speed-variation occurs, things are usually back to normal quite soon after the start.

The correction of fluctuating speeds is a subject which has been largely untouched so far. Most speed variation is caused by defects in mechanical equipment, resulting in the well-known “wow” and “flutter.” The former is slow speed variation, commonly less than twenty times per second, and the latter comprises faster variations.

Undoubtedly, the best time to work on these problems is at the time of transferring the sound off the original medium. Much “wow” is essentially due to the reproduction process, e.g. eccentricity or warpage of a disc. One’s first move must be to cure the source of the problem by re-centering or flattening the disc.

Unfortunately it is practically impossible to cure eccentric or warped cylinders. The only light at the end of the tunnel is to drive the cylinder by a belt wrapped round the cylinder rather than the mandrel, arranged so it is always leaving the cylinder tangentially close to the stylus. (A piece of quarter-inch tape makes by far the best belt!) If the mandrel has very little momentum of its own, and the pickup is pivoted in the same plane, the linear speed of the groove under the pickup will be almost constant. But this will not cure wow if differential shrinkage has taken place.

Another problem concerns cylinders with an eccentric bore. With moulded cylinders the only “cure” is to pack pieces of paper between mandrel and cylinder to bring it on-centre. But for direct-cut wax cylinders, the original condition should be recreated, driving the mandrel rather than the surface (Ref. 5).

However, it is possible to use one source of wow to cancel another. For example, if a disc has audible once-per-revolution recorded wow, you may be able to create an equal-but-opposite wow by deliberately orienting the disc off-centre. This way, the phases of the original wow and the artificial wow are locked together. This relationship will be lost from any copy unless you invoke special synchronisation techniques.

It is often asked, “What are the prospects for correcting wow and flutter on a digitised copy?” I am afraid I must reply “Not very good.” A great deal has been said about using computers for this purpose. Allow me to deal with the difficulties, not because I wish to be destructive, but because you have a right to know what will always be impossible.

The first difficulty is that we must make a conscious choice between the advantages and disadvantages. We saw in chapter 1 that the overall strategy should include getting the speed right before analogue-to-digital conversion, to avoid the generation of digital artefacts. Nevertheless it is possible to reduce the latter to any desired degree, either by having a high sampling-frequency or a high bit-resolution. So we can at least minimise the side-effects.

Correction of “wow” in the digital domain means we need some way of telling the processor what is happening. One way to do this objectively is to gain access to a constant frequency signal recorded at the same time, a concept we shall explore for steady-state purposes in the next section. But when the speed variations are essentially random the possibilities are limited, mainly because any constant frequency signal is comparatively weak when it occurs. To extract it requires sharp filtering, and we also need to ignore it when it is drowned by wanted sound. Unfortunately, information theory tells us we cannot detect rapid frequency changes with sharp filters. To make things worse, if the constant frequency is weak, it will be corrupted by background noise or by variations in sensitivity. Although slow wow may sometimes be correctable, I am quite sure we shall never be able to deal with flutter this way. I am sorry to be pessimistic, but this is a principle of nature; I cannot see how we shall ever be able to correct random flutter from any constant frequency which happens to be recorded under the sound.

But if the wow or flutter has a consistent element, for example due to an eccentric capstan rotating twenty-five times per second in a tape recorder, then there is more hope. In principle we could tell the computer to re-compute the sampling-frequency twenty-five times per second and leave it to get on with it. The difficulty is “slippage.” Once the recording is a fiftieth of a second out-of-sync, the wow or flutter will be doubled instead of cancelled. This would either require human intervention (implying subjectivism), or software which could distinguish the effect from natural pitch variations in the wanted sound. The latter is not inconceivable, but it has not yet been done.
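The “slippage” problem can be illustrated numerically. The following Python sketch assumes an idealised case: the speed variation is a pure sine wave, and the correction is a sine wave of equal depth which has slipped in time. Real wow is rarely so tidy, and the function name is hypothetical.

```python
import math

def residual_wow(rate_hz, slip_seconds):
    """Peak residual speed variation, relative to the original peak, after
    subtracting an equal-depth correction which is slip_seconds out of step.

    Uses the identity: sin(x) - sin(x - p) has peak amplitude 2*|sin(p/2)|,
    where p is the slip expressed as a phase angle in radians.
    """
    phase = 2 * math.pi * rate_hz * slip_seconds
    return 2 * abs(math.sin(phase / 2))

print(residual_wow(25, 0.0))      # 0.0: perfect cancellation
print(residual_wow(25, 1 / 150))  # about 1: no better than doing nothing
print(residual_wow(25, 1 / 50))   # about 2: the fault is doubled
```

On this model the correction ceases to be of any benefit long before the slip reaches a fiftieth of a second, which shows how tightly the two must be kept in step.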

The computer may be assisted if the digital copy bears a fixed relationship to the rotational speed of the original. Reverting to discs, we might record a once-per-revolution pulse on a second channel. A better method is some form of rigid lock - so that one revolution of a 77.92rpm disc always takes precisely 33957 samples at a sampling-frequency of 44.1kHz, for example. (The US equivalent would be a 78.26rpm disc taking 33810 samples). This would make it easier for future programmers to detect cyclic speed variations in the presence of natural pitch-changes, by accumulating and averaging statistics over many cycles. So here is another area of development for the future.
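The number of samples per revolution follows directly from the stroboscope speeds. The Python sketch below assumes a sampling-frequency of 44.1kHz, which is the rate implied by the European figure of 33957 samples; the function name is illustrative only.

```python
from fractions import Fraction

def samples_per_revolution(mains_hz, bars, sample_rate=44100):
    """Samples occupied by one revolution at the stroboscope-defined speed.

    The speed is taken as (2 * mains_hz * 60) / bars rpm, i.e. two lamp
    flashes per mains cycle. sample_rate is an assumption.
    """
    rpm = Fraction(2 * mains_hz * 60, bars)
    return Fraction(sample_rate * 60) / rpm

print(samples_per_revolution(50, 77))  # European "78": 33957
print(samples_per_revolution(60, 92))  # American "78": 33810
```

Because both results are whole numbers, a rigid lock of this kind needs no fractional samples at 44.1kHz. Other sampling-frequencies would need checking in the same way.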

Another is to match one medium to another. Some early LPs had wow because of lower flywheel effect at the slower speed. But 78rpm versions are often better for wow, while being worse for noise. Perhaps a matching process might combine the advantages of both.

In 1990 CEDAR demonstrated a new algorithm for reducing wow on digitised recordings, which took its information from the pitch of the music being played. Only slow wow could be corrected, otherwise the process would “correct” musical vibrato! Presumably this algorithm is impotent on speech, and personally I found I could hear the side-effects of digital re-sampling. But here is hope when it’s impossible to correct the fault at source. Unfortunately, CEDAR did not market the algorithm.

I hope this discussion will help you decide what to do when the problem occurs. For the rest of this chapter, we revert to steady-state situations and situations where human beings can react fast enough.

5.10 Engineering evidence

Sometimes we can make use of technical faults to guide us about speed-setting. Alternating-current mains can sometimes get recorded - the familiar “background hum.” In Europe the mains alternates at a nominal frequency of 50Hz, and in America the frequency is 60Hz. If it is recorded, we can use it to compensate for the mechanical deficiencies of the machinery.

Before we can use the evidence intelligently, we must study the likelihood of errors in the mains frequency. Nowadays British electricity boards are supposed to give advance notice of errors likely to exceed 0.1 percent. Britain was fortunate enough to have a “National Grid” before the second World War, giving nationwide frequency stability (except in those areas not on 50Hz mains). Heavy demand would slow the generators, so they had to be speeded up under light demand if errors were not to build up in synchronous electric clocks. So the frequency might be “high” as well as “low.” Occasional bombing raids during the second World War meant that isolated pockets of Britain would be running independently of the Grid, but the overall stability is illustrated by Reference 6, which shows that over one week in 1943 the peak error-rate was only 15 seconds in three hours, or less than 0.15 percent. (There might be worse errors for very short periods of time, but these would be distinctly uncommon). After the war, the Central Electricity Generating Board was statutorily obliged to keep its 50Hz supplies within 2Hz in 1951 and 0.5Hz in 1971. My impression is that these tolerances were extremely rarely approached, let alone exceeded. However there is ample evidence of incompetent engineers blaming “the mains” for speed errors on their equipment.

An anecdote to illustrate that things were never quite as bad as that. In the years 1967-1971 I worked on a weekly BBC Radio programme lasting half-an-hour, which was recorded using A.C. mains motors on a Saturday afternoon (normally a “light current load” time), and reproduced during Monday morning (normally a “heavy load time,” because it was the traditional English wash-day). The programmes consistently overran when transmitted, but only by ten or fifteen seconds, an error of less than one percent even when the cumulative errors of recording and reproduction were taken into account. In over twenty-five years of broadcasting, I never came across another example like that. However, I have no experience of mains-supplies in other countries; so I must urge you to find the tolerances in other areas for yourself.

We do not find many cases of hum on professional recordings, but it is endemic on amateur ones, the very recordings most liable to speed errors. So the presence of hum is a useful tool to help us set the speed of an anomalous disc or tape; it can be used to get us “into the right ballpark”, if nothing else. This writer has also found a lot of recorded hum on magnetic wire recordings. This is doubly useful; apart from setting the “ballpark” speed, its frequency can be used to distinguish between wires recorded with capstan-drive and wires recorded with constant-speed takeup reels. But here is another can-of-worms; the magnetic wire itself forms a “low-reluctance” path for picking up any mains hum and carrying it to the playback head. It can be extremely difficult to hear one kind of hum in the presence of the other.
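As a minimal sketch of the “ballpark” calculation, suppose the hum on a trial playback has been measured with a tuner or spectrum analyser. The Python below scales the trial speed so the hum lands on the nominal mains frequency, and attaches a tolerance for the mains itself. The function name and the example readings are hypothetical; the default tolerance of 0.15 percent is borrowed from the 1943 figure above and may not suit other countries or periods.

```python
def speed_from_hum(trial_rpm, measured_hum_hz, nominal_mains_hz,
                   mains_tolerance=0.0015):
    """Return (low, best, high) estimates of the correct playing speed.

    measured_hum_hz is the hum frequency heard when playing at trial_rpm.
    mains_tolerance is the assumed fractional error of the original supply.
    """
    best = trial_rpm * nominal_mains_hz / measured_hum_hz
    return best * (1 - mains_tolerance), best, best * (1 + mains_tolerance)

# Hypothetical example: a British disc played at 78rpm shows hum at 52Hz.
low, best, high = speed_from_hum(78, 52, 50)
print(f"play at about {best:.2f}rpm (between {low:.2f} and {high:.2f})")
```

This presumes the hum is the fundamental and not a harmonic, and that it was recorded with the programme and not picked up on playback; both points need checking by ear.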

Portable analogue quarter-inch tape-recorders were used for recording film sound on location from the early 1960s. Synchronisation relied upon a reference-tone being recorded alongside the audio, usually at 50Hz for 50Hz countries and 60Hz for 60Hz countries. Back at base, this pilot-tone could be compared with the local mains frequency used for powering the film recording machines, so speed variations in the portable unit were neutralised. In principle it might be possible to extract accidental hum from any recording and use it to control a playback tape-recorder in the same way. This is another argument in favour of making an “archive copy” with warts and all; the hum could be useful in the future.

We can sometimes make use of a similar fault for setting the speed of a television soundtrack recording. The “line-scan” frequency of the picture sometimes gets recorded amongst the audio. This was 10125Hz for British BBC-1 and ITV 405-line pictures until 1987; 15625Hz for 625-line pictures throughout the world (commencing in 1964 in Britain); 15750Hz for monochrome 525-line 30Hz pictures, and 15734.25Hz for colour NTSC pictures. These varied only a slight amount.

For example, before frame-stores became common in 1985, a studio centre might slew its master picture-generator to synchronise with an outside broadcast unit. This could take up to 45 seconds in the worst possible case (to achieve almost a full frame of slewing). Even so, this amounts to less than one part in a thousand; so speed-setting from the linescan frequency can be very accurate. In my experience such high frequencies are recorded rather inefficiently, and only the human ear can extract them reliably enough to be useful; so setting the speed has to be done by ear at present.
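The worst-case figure can be verified with a few lines of Python. The sketch assumes a 25 frames-per-second system and exactly one full frame of slewing spread evenly over 45 seconds; the function name is hypothetical.

```python
def slew_error(frame_rate_hz=25, slew_seconds=45, frames_slewed=1.0):
    """Fractional frequency error while a picture-generator is slewed
    by frames_slewed frames, spread evenly over slew_seconds."""
    time_gained = frames_slewed / frame_rate_hz
    return time_gained / slew_seconds

error = slew_error()
print(f"fractional error: {error:.6f}")
print(f"shift of a 15625Hz linescan: {15625 * error:.1f}Hz")
```

The result is a little under nine parts in ten thousand, which is consistent with the limit of one part in a thousand given above.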

Although my next remark refers to the digitisation procedures in Chapter 2, it is worth noting that the embedding of audio in a digital video bitstream means that there must be exactly 1920 digital samples per frame in 625-line television, and exactly 8008 samples per five frames in NTSC/525-line systems.
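These sample counts can be reproduced as follows. The Python sketch assumes a sampling-frequency of 48kHz and an NTSC frame-rate of 30000/1001 frames per second; neither figure is stated above, but both are consistent with the numbers quoted.

```python
from fractions import Fraction

SAMPLE_RATE = 48000  # assumed audio sampling-frequency in Hz

def samples_in_frames(frame_rate, frames=1, sample_rate=SAMPLE_RATE):
    """Exact number of audio samples spanning the given number of frames."""
    return Fraction(sample_rate) * frames / Fraction(frame_rate)

pal = samples_in_frames(25)
ntsc_one = samples_in_frames(Fraction(30000, 1001))
ntsc_five = samples_in_frames(Fraction(30000, 1001), frames=5)

print(pal)              # 1920
print(float(ntsc_one))  # 1601.6, not a whole number
print(ntsc_five)        # 8008
```

A single NTSC frame does not hold a whole number of samples, which is why the relationship has to be expressed over five frames.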

The 19kHz of the stereo pilot-tone of an FM radio broadcast (1st June 1961 onwards in the USA, 1962 onwards in Britain) can also get recorded. This does not vary, and can be assumed to be perfectly accurate - provided you can reproduce it.

John Allen has even suggested that the ultrasonic bias of magnetic tape recording (see section 9.3) is sometimes retained on tape well enough to be useful. (Ref. 7). We usually have no idea what its absolute frequency might be; but it has been suggested that variations caused by tape speed-errors might be extracted and used to cancel wow and flutter. I have already expressed my reasons why I doubt this, but it has correctly been pointed out that, provided it’s above the level of the hiss (it usually isn’t), this information should not be thrown away, e.g. by the anti-aliasing circuit of a digital encoder. Although it may be necessary to change the frequency down and store it on a parallel track of a multi-channel digital machine, we should do so. Again, it’s a topic for the future; but it seems just possible that a few short-term tape speed variations might be compensated objectively one day.

There is one caveat I must conclude with. The recording of ultrasonic signals is beset with problems, because the various signals may interfere with each other and result in different frequencies from what you might expect. For example, the fifth harmonic of a television linescan at 78125Hz might beat with a bias frequency of 58935Hz, resulting in a spurious signal at 19190Hz. If you did not know it was a television soundtrack, this might be mistaken for a 19kHz radio pilot-tone, and you’d end up with a one percent speed error when you thought you’d got it exactly right. So please note the difficulties, which can only be circumvented with experience and a clear understanding of the mechanics.
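The trap can be demonstrated with a short Python sketch which lists the difference-tones between linescan harmonics and a bias frequency. The function name, the limit of six harmonics and the 20kHz audio limit are arbitrary choices for illustration; the frequencies are those of the example above.

```python
def difference_tones(line_hz, bias_hz, max_harmonic=6, audio_limit_hz=20000):
    """List (harmonic, beat_hz) pairs where a linescan harmonic beats with
    the bias to give a difference-tone below audio_limit_hz."""
    tones = []
    for n in range(1, max_harmonic + 1):
        beat = abs(n * line_hz - bias_hz)
        if beat < audio_limit_hz:
            tones.append((n, beat))
    return tones

for n, beat in difference_tones(15625, 58935):
    print(f"harmonic {n}: difference-tone at {beat}Hz")

# Speed error if the 19190Hz tone is mistaken for a 19kHz pilot-tone
print(f"apparent speed error: {100 * (19190 / 19000 - 1):.1f} percent")
```

With these particular frequencies the third and fourth harmonics also give difference-tones in the audio range, so more than one spurious tone may be present at once.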

5.11 Timings

There’s a final way of confirming an overall speed, by timing the recording. This is useful when the accompanying documentation includes the supposed duration. Actually, the process is unreliable for short recordings, because if the producer was working with a stopwatch, you would have to allow for reaction-time, the varying perception of decaying reverberation, and any “rounding errors” which might be practised. So short-term timings would not be reliable enough. But for longer recordings, exceeding three or four minutes, the documentation can be a very useful guide to setting a playing-speed. The only trouble is that it may take a lot of trial-and-error to achieve the correct timing.
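In principle the amount of trial-and-error can be reduced, because a single timed playback at a known speed is enough to calculate the speed which would give the documented duration. The Python sketch below shows the proportion; the function names and example figures are hypothetical, and the one-second stopwatch allowance is merely an assumption.

```python
def speed_from_timing(trial_rpm, measured_seconds, documented_seconds):
    """Speed which would make the recording last documented_seconds,
    given that it lasted measured_seconds when played at trial_rpm."""
    return trial_rpm * measured_seconds / documented_seconds

def timing_uncertainty(documented_seconds, stopwatch_error_seconds=1.0):
    """Fractional uncertainty caused by an assumed error in the
    documented timing."""
    return stopwatch_error_seconds / documented_seconds

# Hypothetical example: documented as 3'15", but lasts 3'20" at 78rpm.
print(speed_from_timing(78, 200, 195))           # 80.0
print(f"{100 * timing_uncertainty(195):.2f} percent uncertainty")
```

The uncertainty figure shows why short recordings are unreliable: one second in three minutes is already about half a percent.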

I hope this chapter will help you to assess the likelihood, quantity, and sign of a speed error on a particular recording. But I conclude with my plea once again. It seems to me that the basis of the estimate should also be documented. The very act of logging the details forces one to think the matter through and helps against omitting a vital step. And it’s only right and proper that others should be able to challenge the estimate, and to do so without going through all the work a second time.

REFERENCES

1: anon., “Lexiconning” (article), Sight and Sound Vol. 57 No. 1 (Winter 1987/8), pp. 47-8. It should be noted this article’s main complaint was a film speeded by ten percent. A Lexicon would be essential to stop a sound like “Chipmunks” at this rate, although the same technology could of course be used for a four-percent change.
2: Friedrich Engel (BASF), Letter to the Editor, Studio Sound, Vol. 28 No. 7 (July 1986), p. 147.
3: John C. Fesler, London: Hillandale News, No. 125 (April 1982), p. 21.
4: Balth van der Pol and C. C. J. Addink, “Orchestral Pitch: A Cathode-Ray Method of Measurement during a Concert” (article), Wireless World, 11th May 1939, pp. 441-2.
5: Hans Meulengracht-Madsen, “On the Transcription of Old Phonograph Wax Records” (paper), J.A.E.S., Jan/Feb 1976.
6: H. Morgan, “Time Signals” (Letter to the Editor), Wireless World, Vol. L No. 1 (January 1944), p. 26.
7: John S. Allen, “Some new possibilities in audio restoration,” (article), ARSC Journal, Volume 21 No. 1 (Spring 1990), page 44.
