Interview with Speaker Design Guru David Smith

TNT-Audio caught up with noted speaker designer David Smith, ex-JBL, McIntosh, KEF, Snell, and PSB. Join TNT and Speaker Dave on a journey into the heart of speaker design, its history and current trends.

Rahul Athalye (RA): Tell us about how you got started in the audio business.

David Smith (DS): It was a hobby interest when I went to college. Rather than study, I would spend hours in the engineering library poring over Audio magazine and old AES journals. My interests probably came from my Dad. He was part of the first Hi Fi hobby boom after World War II. I grew up in a house with homemade Hi Fi: a JBL D130 in a Hartley Boffle (a multi damping layer infinite baffle) with a University tweeter, a Presto turntable with a Japanese copy of the Gray tonearm, and a homemade amplifier from the GE transistor manual.

After college I got a job with Essex Cletron in Cleveland, an OEM supplier of loudspeaker drivers. When they moved their engineering location from Cleveland to the Martinsville plant (later to become Harman Motive) I cast around and got a job at JBL in California. There I designed a range of home products and some important studio monitors, including the 4430 and 4435. After that came KEF in the UK, then KEF/Meridian in D.C., then McIntosh in Binghamton, followed by a/d/s/ and Snell in Boston, and most recently PSB in Toronto. In the last few years so much has moved to China that there are very few firms with real engineering or product development in North America. I now work in the digital cinema field for a firm in Toronto.

RA: You have worked for both professional audio (JBL) and hi-fi (KEF) companies. How do the design goals for the pro speakers differ from those for hi-fi speakers?

DS: Pro speakers tend to be more specialized. If a customer needs to fill a 500 seat auditorium with sound, then that’s what they are going to buy; nothing short of that performance will do. Conversely, a domestic audio customer will have more latitude in what he can use and can be swayed a lot by fashion. It becomes more about appealing to their emotions (“I want it” rather than “I need it”). Both pose an interesting challenge. Pro audio generally requires achieving more acoustic output and placing reliability over everything. This may make it feel at times like you are designing Mack Trucks when you would rather be designing a Ferrari. Designing studio monitors was very enjoyable to me as they are nicely in the middle of the spectrum. High output is important but accuracy is also a key ingredient.

In later years, at Snell and even, to a degree, at McIntosh, custom installation was driving the market. For high end home theater, people wanted high performance product that could be buried in cabinets or put in-wall. I searched for ways to give that installation flexibility without compromising performance. If you understand boundary conditions and aspects of achieving directivity (aimable waveguides, for instance) you can achieve high performance even when, at first glance, it looks like the installation aspects will force compromise.

RA: Can we borrow from pro speaker design to achieve better performance in the home? What should be the goals for a hi-fi speaker?

DS: There is a great deal of inherent overlap anyhow. If you look at it in pure engineering terms then it can be distilled down to: Let’s design a speaker. It needs to play X dB loud at X meters (defining power handling, distortion). The audience width and depth will be X (defining dispersion). Etc. Viewed in those terms the product isn’t pro or domestic, it just needs to meet a particular performance spec.
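
As an illustration of how such a spec turns into numbers, here is a minimal sketch that is not from the interview. It assumes an idealized free-field point source, where level rises by 10·log10 of the power ratio and falls by 20·log10 of the distance ratio. The function names and the example figures (88 dB sensitivity, a 105 dB target at 4 m) are hypothetical.

```python
import math

def spl_at_distance(sensitivity_db, power_w, distance_m):
    """Free-field SPL estimate for a point source.

    sensitivity_db: SPL at 1 m with 1 W input (hypothetical spec figure).
    Assumes inverse-square spreading, no power compression, no room gain.
    """
    return sensitivity_db + 10 * math.log10(power_w) - 20 * math.log10(distance_m)

def power_needed(sensitivity_db, target_db, distance_m):
    """Invert the estimate above: watts required to reach target_db at distance_m."""
    gain_db = target_db - sensitivity_db + 20 * math.log10(distance_m)
    return 10 ** (gain_db / 10)

if __name__ == "__main__":
    # Hypothetical spec: 105 dB at 4 m from a speaker rated 88 dB at 1 W / 1 m.
    watts = power_needed(sensitivity_db=88.0, target_db=105.0, distance_m=4.0)
    print(f"Power needed: {watts:.0f} W")
```

A real room adds reflected energy and a real driver compresses at high power, so this kind of estimate is only a first pass at the power handling part of the spec.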

I see a lot of enthusiast interest in using pro oriented systems at home, especially vintage cinema gear, i.e., large horn systems. I certainly see the appeal but I don’t believe it gives them the best sound possible. There are compromises forced when you need a certain number of acoustic watts to fill up a large space. I’ve designed products with horns and compression drivers (most notably the JBL biradial monitors). I’m proud of the engineering of those products but for myself I would rather have non horn systems at home. I don’t need the extra 15 dB of output and would rather have cone and dome units that are inherently smoother in response.

RA: There has been a lot of research with regard to psycho-acoustics and Toole's book is a great summary of this research. You have not only worked with some of the leading researchers but also contributed to the research itself. What is your take on the findings and how they apply to the home environment?

DS: Toole’s book is a groundbreaking work, worth reading over and over. It summarizes his decades of research on the subject of loudspeakers, their measurements and the applicable psychoacoustics. I think it is the new bible on sound system priorities, covering the audibility of the many measurable aberrations found in loudspeakers, including the effects of the room. If we place a speaker in the typical lively room and do a high resolution measurement, the result is such a messy resonant picture that we can’t possibly discern whether it should sound good or not. Over the years we’ve developed a lot of approaches to smoothing out the view to simplify the picture, but without having any real justification for these approaches.

Toole’s classic paper, published in 2 parts in 1982 (part 1 and part 2), had a carefully run listening panel rank order 20 loudspeakers for preference and quality while a large variety of measurements were made on them. It showed which measurements correlated with listener preference and which didn’t. This is great stuff for a speaker designer because you can ask: “What should I concentrate on when designing speakers?” Everything costs money and it is a competitive market. Should I spend money to get the phase response flat? Should I add extra components to flatten the system impedance curve? How low does distortion need to be? In other words, what measurable parameters correlate with our subjective impression?

This is a fundamental difference between commercially built speakers and DIY efforts. The enthusiast constructor can pick any design aspect and beat it to death. But if you are trying to survive in the marketplace you need to think about “bang for the buck”. Do your cost choices give the end user an audible benefit? Or are you creating straw men to knock down, say, reducing distortion to levels way lower than audible, out of a misguided belief that that aspect overrides others?

Well, what I think Toole’s early research shows is that axial frequency response is the number one criterion. Power response, or overall directivity, seems to correlate poorly with subjective impression. This is important because it shows how room curves (strongly influenced by later arriving off axis response) can be misleading. The larger the room, the more power response determines the room curve and the more misleading the curve will be. Still, I think he has drifted slightly from some of his earlier findings. In spite of what Toole’s early works clearly showed, he is now placing importance on room curves and extending that from small rooms to large rooms as well.

I’m currently on one of the SMPTE committees that is looking to replace the X Curve approach to Cinema equalization. For years we’ve known that going into a large space, plunking down a microphone, feeding pink noise into the speakers and adjusting to flat response gives bad results: it is always too bright.

Now what does that mean? Something that measures flat sounds too bright? This isn’t an issue with amplifiers or record/playback systems, where flat sounds flat. I think there are a lot of clues to the reason for this: it is tied to human hearing and our ability to focus on earlier arriving sounds while ignoring later arriving sounds. But this is still a new concept to many in the industry.

With domestic listening rooms, the difference between the anechoic performance and the steady state, in-room performance is fairly minor for upper frequencies. But as the room gets to auditorium or cinema size, the differences enlarge and the steady state curve becomes very misleading.

There are a number of studies, by Kates, Salmi, Lipshitz and Vanderkooy, Bech and others, that suggest that we judge frequency balance largely with a time windowed approach. Late arriving sound is ignored. Also, this time window is long for low frequencies and short for high frequencies. In effect, what we judge is the steady state or room response for low frequencies and typically just the direct (anechoic) response for high frequencies. At mid frequencies it might contain the first floor or back wall bounce, but later reflections are generally under the level required for audibility.
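
The idea of a window that is long at low frequencies and short at high frequencies can be sketched in a few lines. This is only an illustration, not a model taken from the studies named above. It assumes the window lasts a fixed number of cycles of the frequency in question; the value `CYCLES_IN_WINDOW = 5` and the reflection delays are hypothetical.

```python
CYCLES_IN_WINDOW = 5.0  # hypothetical; not a figure from the cited studies

def window_ms(frequency_hz, cycles=CYCLES_IN_WINDOW):
    """Window length in milliseconds: a fixed number of periods of the tone."""
    return 1000.0 * cycles / frequency_hz

def reflections_in_window(frequency_hz, delays_ms, cycles=CYCLES_IN_WINDOW):
    """Return the reflection delays (ms after the direct sound) inside the window."""
    limit = window_ms(frequency_hz, cycles)
    return [d for d in delays_ms if d <= limit]

if __name__ == "__main__":
    # Hypothetical arrival delays: floor bounce, back wall, later room reflections.
    delays = [2.0, 8.0, 25.0, 60.0]
    for f in (50, 500, 5000):
        kept = reflections_in_window(f, delays)
        print(f"{f} Hz: window {window_ms(f)} ms, reflections counted: {kept}")
```

With these made-up numbers the 50 Hz window takes in every reflection (close to the steady state response), the 500 Hz window keeps only the two earliest bounces, and the 5 kHz window keeps none (the direct sound alone).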

Viewing perception this way answers a lot of questions, including why we need to roll off the response of a system in a large room: the early response is inherently brighter than the later response, due to rising speaker directivity, rising room absorption, even the absorption of the air. Flat steady state response would give very bright early sound, and so we reject it.

This has always fascinated me as a speaker designer because it holds out the promise of our being able to design speakers that are perfectly balanced, if only we can figure out how our hearing works. Remember, if we accept as our goal that the speaker shouldn’t add anything, that it should be a neutral lens for the recorded sound to come through, we still have to figure out what neutral or flat means. Is it flat anechoic response, flat in-room response, flat power response, flat time windowed response? In smaller domestic listening rooms the direct response from the speaker and the room response (including all reflections and reverberation) aren’t that far apart, perhaps 2-3 dB of shelving in the room response when the direct component is flat. As rooms get bigger it becomes a major issue and steady state curves need to roll off about ten dB (the Cinema X Curve).
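
To put a rough number on that roll-off, the sketch below evaluates the cinema X Curve as it is commonly summarized: flat to 2 kHz, then falling 3 dB per octave to 10 kHz and 6 dB per octave above that. Those breakpoints and slopes are stated here as an assumption, not taken from the interview, and should be checked against the SMPTE ST 202 / ISO 2969 text. The low-frequency roll-off is left out, and the function name is hypothetical.

```python
import math

def x_curve_db(frequency_hz):
    """Approximate X Curve target level in dB relative to the flat midband.

    Assumed shape: 0 dB up to 2 kHz, -3 dB/octave from 2 kHz to 10 kHz,
    -6 dB/octave above 10 kHz. The low-frequency roll-off is ignored here.
    """
    if frequency_hz <= 2000.0:
        return 0.0
    if frequency_hz <= 10000.0:
        return -3.0 * math.log2(frequency_hz / 2000.0)
    level_at_10k = -3.0 * math.log2(10000.0 / 2000.0)
    return level_at_10k - 6.0 * math.log2(frequency_hz / 10000.0)

if __name__ == "__main__":
    for f in (1000, 2000, 4000, 8000, 16000):
        print(f"{f:>6} Hz: {x_curve_db(f):6.1f} dB")
```

Under these assumptions the target is about 11 dB down at 16 kHz, in line with the "about ten dB" figure above.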

My current approach is to design for flat anechoic response for midrange and up, then see how the low end interacts with the room, and then do a lot of listening and fine tuning until it seems right across a broad spectrum of software. Still, there is always a feeling that a little more tweaking could give a better result, and a suspicion that I am tuning to give a pleasant result with music I like, rather than achieving verifiable accuracy. Wouldn’t it be nice to have a measuring system that perfectly mimicked human hearing and the way we perceive frequency response? It could take all the subjectivity out of it.

[Continued on the next page]

© Copyright 2013 Rahul Athalye