The Inventor of Auto-Tune

By: Zachary Crockett

Auto-Tune — one of modern history’s most reviled inventions — was an act of mathematical genius.

The pitch correction software, which automatically calibrates out-of-tune singing to perfection, has been used on nearly every chart-topping album for the past 20 years. Along the way, it has been pilloried as the poster child of modern music’s mechanization. When Time Magazine declared it “one of the 50 worst inventions of the 20th century”, few came to its defense.

But often lost in this narrative is the story of the invention itself, and the soft-spoken savant who pioneered it. For inventor Andy Hildebrand, Auto-Tune was an incredibly complex product — the result of years of rigorous study, statistical computation, and the creation of algorithms previously deemed to be impossible.

Hildebrand’s invention has taken him on a crazy journey: He’s given up a lucrative career in oil. He’s changed the economics of the recording industry. He’s been sued by hip-hop artist T-Pain. And in the course of it all, he’s raised pertinent questions about what constitutes “real” music.

The Oil Engineer

Andy Hildebrand was, in his own words, “not a normal kid.”

A self-proclaimed bookworm, he was constantly derailed by life’s grand mysteries, and had trouble sitting still for prolonged periods of time. School was never an interest: when teachers grew weary of slapping him on the wrist with a ruler, they’d stick him in the back of the class, where he wouldn’t bother anybody. “That way,” he says, “I could just stare out of the window.”

After he failed the first grade, Hildebrand’s academic performance slowly began to improve. Toward the end of grade school, the young delinquent started pulling C’s; in junior high, he made his first B; as a high school senior, he was scraping together occasional A’s. Driven by a newfound passion for science, Hildebrand “decided to start working [his] ass off,” an endeavor that culminated in an electrical engineering PhD from the University of Illinois in 1976.

In the course of his graduate studies, Hildebrand excelled in his applications of linear estimation theory and signal processing. Upon graduating, he was plucked up by oil conglomerate Exxon, and tasked with using seismic data to pinpoint drill locations. He clarifies what this entailed:

“I was working in an area of geophysics where you emit sounds on the surface of the Earth (or in the ocean), listen to reverberations that come up, and, from that information, try to figure out what the shape of the subsurface is. It’s kind of like listening to a lightning bolt and trying to figure out what the shape of the clouds are. It’s a complex problem.”

Three years into Hildebrand’s work, Exxon ran into a major dilemma: the company was nearing the end of its seven-year construction timeline on an Alaskan pipeline; if it failed to get oil into the line in time, it would lose its half-billion-dollar tax write-off. Hildebrand was enlisted to fix the holdup (faulty seismic monitoring instrumentation), a task that required “a lot of high-end mathematics.” He succeeded.

“I realized that if I could save Exxon $500 million,” he recalls, “I could probably do something for myself and do pretty well.”

A subsurface map of one geologic stratum, color coded by elevation, created on the Landmark Graphics workstation (the white lines represent oil fields); courtesy of Andy Hildebrand

So, in 1979, Hildebrand left Exxon, secured financing from a few prominent venture capitalists (DLJ Financial; Sevin Rosen), and, with a small team of partners, founded Landmark Graphics.

At the time, the geophysical industry had limited data to work off of. The techniques engineers used to map the Earth’s subsurface resulted in two-dimensional maps that typically provided only one seismic line. With Hildebrand as its CTO, Landmark pioneered a workstation — an integrated software/hardware system — that could process and interpret thousands of lines of data, and create 3D seismic maps.

Landmark was a huge success. Before retiring in 1989, Hildebrand took the company through an IPO and a listing on NASDAQ; six years later, it was bought out by Halliburton for a reported $525 million.

“I retired wealthy forever (not really, my ex-wife later took care of that),” jokes Hildebrand. “And I decided to get back into music.”

From Oil to Music Software

An engineer by trade, Hildebrand had always been a musician at heart.

As a child, he was something of a classical flute virtuoso and, by 16, he was a “card-carrying studio musician” who played professionally. His undergraduate engineering degree had been funded by music scholarships and teaching flute lessons. Naturally, after leaving Landmark and the oil industry, Hildebrand decided to return to school to study composition more intensively.

While pursuing his studies at Rice University’s Shepherd School of Music, Hildebrand began composing with sampling synthesizers (machines that allow a musician to record notes from an instrument, then make them into digital samples that could be transposed on a keyboard). But he encountered a problem: when he attempted to make his own flute samples, he found the quality of the sounds to be ugly and unnatural.

“The sampling synthesizers sounded like shit: if you sustained a note, it would just repeat forever,” he says. “And the problem was that the machines didn’t hold much data.”

Hildebrand, who’d “retired” just a few months earlier, decided to take matters into his own hands. First, he created a processing algorithm that greatly condensed the audio data, allowing for a smoother, more natural-sounding sustain and timbre. Then, he packaged this algorithm into a piece of software (called Infinity), and handed it out to composers.

A glimpse at Infinity’s interface from an old handbook; courtesy of Andy Hildebrand

Infinity improved digitized orchestral sounds so dramatically that it uprooted Hollywood’s music production landscape: using the software, lone composers were able to accurately recreate film scores, and directors no longer had a need to hire entire orchestras.

“I bankrupted the Los Angeles Philharmonic,” Hildebrand chuckles. “They were out of the [sample recording] business for eight years.” (We were unable to verify this, but The Los Angeles Times did report that the Philharmonic entered a “financially bleak” period in the early 1990s.)

Unfortunately, Hildebrand’s software was inherently self-defeating: companies sprouted up that processed sounds through Infinity, then sold them as pre-packaged soundbanks. “I sold 5 more copies, and that was it,” he says. “The market totally collapsed.”

But the inventor’s bug had taken hold of Hildebrand once more. In 1990, he formed his final company, Antares Audio Technologies, with the goal of inventing the music industry’s next big piece of software. And that’s exactly what happened.

The Birth of Auto-Tune

At a National Association of Music Merchants (NAMM) conference in 1995, Hildebrand sat down for lunch with a few friends and their wives. Randomly, he posed a rhetorical question — “What needs to be invented?” — and one of the women half-jokingly offered a response:

“Why don’t you make a box that will let me sing in tune?”

“I looked around the table and everyone was just kind of looking down at their lunch plates,” recalls Hildebrand, “so I thought, ‘Geez, that must be a lousy idea’, and we changed the topic.”

Hildebrand completely forgot he’d even had this conversation, and for the next six months, he worked on various other projects, none of which really took off. Then, one day, while mulling over ideas, the woman’s suggestion came back to him. “It just kind of clicked in my head,” he says, “and I realized her idea might not be too bad.”

What “clicked” for Hildebrand was that he could utilize some of the very same processing methods he’d used in the oil industry to build a pitch correction tool. Years later, he’d attempt to explain this on PBS’s NOVA:

“Seismic data processing involves the manipulation of acoustic data in relation to a linear time varying, unknown system (the Earth model) for the purpose of determining and clarifying the influences involved to enhance geologic interpretation. Coincident (similar) technologies include correlation (statics determination), linear predictive coding (deconvolution), synthesis (forward modeling), formant analysis (spectral enhancement), and processing integrity to minimize artifacts. All of these technologies are shared amongst music and geophysical applications.”

At the time, no other pitch correction software existed. To inventors, it was considered the “holy grail”: many had tried, and none had succeeded.

The major roadblock was that analyzing and correcting pitch in real time required processing a very large amount of sound wave data. Others who’d made an attempt at creating software had used a technique called feature extraction, where they’d identify a few key “variables” in the sound waves, then correlate them with the pitch. But this method was overly simplistic, and didn’t consider the finer minutiae of the human voice. For instance, it didn’t recognize diphthongs (when the human voice transitions from one vowel to another in a continuous glide), and, as a result, created false artifacts in the sound.

Hildebrand had a different idea.

As an oil engineer, when dealing with massive datasets, he’d employed autocorrelation (a signal processing technique that compares a signal with delayed copies of itself) to examine not just key variables, but all of the data, to get much more reliable estimates. He realized that it could also be applied to music:

“When you’re processing pitch, you add wave cycles to go sharp, and subtract them when you go flat. With autocorrelation, you have a clearly identifiable event that tells you what the period of repetition for repeated peak values is. It’s never fooled by the changing waveform. It’s very elegant.”
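
For readers who want to see the idea in miniature, here is a rough Python sketch of pitch detection by autocorrelation. It is not Hildebrand’s code: the function name `estimate_pitch`, the 70 to 1,000 Hz search range, and the synthetic test tone are all assumptions made for illustration. The signal is multiplied against delayed copies of itself, and the delay (the “lag”) that produces the largest sum is treated as the period of repetition.

```python
import math


def estimate_pitch(samples, sample_rate, f_min=70.0, f_max=1000.0):
    """Estimate the fundamental frequency (Hz) of `samples`.

    Hypothetical helper: tries every lag between the shortest and longest
    period of interest and keeps the one with the largest autocorrelation.
    """
    min_lag = int(sample_rate / f_max)  # shortest period, in samples
    max_lag = int(sample_rate / f_min)  # longest period, in samples
    n = len(samples)
    if n <= max_lag:
        raise ValueError("need more samples than the longest lag")

    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # One autocorrelation point: a long run of multiply-adds.
        score = sum(samples[i] * samples[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# A steady 200 Hz test tone sampled at 8,000 Hz (assumed values).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * i / sample_rate) for i in range(2048)]
print(round(estimate_pitch(tone, sample_rate), 1))
```

For this clean test tone the strongest lag should be 40 samples, one full cycle of a 200 Hz wave at 8,000 samples per second. A real voice is far messier, and a production tool would need safeguards (against picking a lag of two or three periods, for example) that this sketch leaves out.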

While elegant, Hildebrand’s solution required an incredibly complex, almost savant application of signal processing and statistics. When we asked him to provide a simple explanation of what happens, computationally, when a voice signal enters his software, he opened his desk and pulled out thick stacks of folders, each stuffed with hundreds of pages of mathematical equations.

“In my mind it’s not very complex,” he says, sheepishly, “but I haven’t yet found anyone I can explain it to who understands it. I usually just say, ‘It’s magic.’”

The equations that do autocorrelation are computationally intensive: for every one point of autocorrelation (each line on the chart above, right), it might’ve been necessary for Hildebrand to do something like 500 summations of multiply-adds. Previously, other engineers in the music industry had thought it was impossible to use this method for pitch correction: “You needed as many points in autocorrelation as the range in pitch you were processing,” one early-1990s programmer told us. “If you wanted to go from a low E (70 hertz) all the way up to a soprano’s high C (1,000 hertz), you would’ve needed a supercomputer to do that.”

A supercomputer, or, as it turns out, Andy Hildebrand’s math skills.

Hildebrand realized he was limited by the technology, and instead of giving up, he found a way to work within it using math. “I realized that most of the arithmetic was redundant, and could be simplified,” he says. “My simplification changed a million multiply adds into just four. It was a trick — a mathematical trick.”
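
Hildebrand does not spell out his trick here, so the following Python sketch is only an illustration of the general principle that much of the arithmetic in a running autocorrelation is redundant. It uses a standard sliding-window identity, not necessarily his method, and the names `direct_autocorr` and `sliding_autocorr` are hypothetical. When the analysis window moves forward by one sample, only one product enters the sum and one product leaves it, so the sum can be updated rather than recomputed.

```python
def direct_autocorr(x, t, lag, window):
    """Recompute one autocorrelation point from scratch.

    Sums `window` products x[i] * x[i - lag] for the samples ending at index t.
    """
    return sum(x[i] * x[i - lag] for i in range(t - window + 1, t + 1))


def sliding_autocorr(x, lag, window):
    """Yield (t, value) pairs, updating the previous sum at each step.

    Each step costs two multiply-adds no matter how long the window is.
    """
    start = window - 1 + lag  # first index where every needed sample exists
    value = direct_autocorr(x, start, lag, window)
    yield start, value
    for t in range(start + 1, len(x)):
        value += x[t] * x[t - lag]                    # product entering the window
        value -= x[t - window] * x[t - window - lag]  # product leaving the window
        yield t, value


# Integer test data so the two methods can be compared exactly.
signal = [(i * 7) % 11 - 5 for i in range(200)]
for t, fast in sliding_autocorr(signal, lag=13, window=50):
    assert fast == direct_autocorr(signal, t, lag=13, window=50)
```

With a 50-sample window, the direct version does 50 multiply-adds per point while the sliding version does two, and the gap widens as the window grows.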

With that, Auto-Tune was born.

Auto-Tune’s Underground Beginnings

Hildebrand built the Auto-Tune program over the course of a few months in early 1996, on a specially equipped Macintosh computer. He took the software to the National Association of Music Merchants conference, the same place where his friend’s wife had suggested the idea a year earlier. This time, it was received a bit differently.

“People were literally grabbing it out of my hands,” recalls Hildebrand. “It was instantly a massive hit.”

At the time, recording pitch-perfect vocal tracks was incredibly time-consuming for both music producers and artists. The standard practice was to do dozens, if not hundreds, of takes in a studio, then spend a few days splicing together the best bits from each take to create a uniformly in-tune track. When Auto-Tune was released, says Hildebrand, the product practically sold itself.

With the help of a small sales team, Hildebrand sold Auto-Tune (which also came in hardware form, as a rack effect) to every major studio in Los Angeles. The studios that adopted Auto-Tune thrived: they were able to get work done more quickly (doing just one vocal take, through the program, as opposed to dozens) — and as a result, took in more clients and lowered costs. Soon, studios had to integrate Auto-Tune just to compete and survive.

Once again, Hildebrand dethroned the traditional industry.

“One of my producer friends had been paid $60,000 to manually pitch-correct Cher’s songs,” he says. “He took her vocals, one phrase at a time, transferred them onto a synth as samples, then played it back to get her pitch right. I put him out of business overnight.”

For the first three years of its existence, Auto-Tune remained an “underground secret” of the recording industry. It was used subtly and unobtrusively to correct notes that were just slightly off-key, and producers were wary of revealing its use to the public. Hildebrand explains why:

“Studios weren’t going out and advertising, ‘Hey we got Auto-Tune!’ Back then, the public was wary of the idea of ‘fake’ or ‘affected’ music. They were critical of artists like Milli Vanilli [a pop group whose 1990 Grammy Award was rescinded after it was found out they’d lip-synced over someone else’s songs]. What they don’t understand is that the method used before, doing hundreds of takes and splicing them together, was its own form of artificial pitch correction.”

This secrecy, however, was short-lived: Auto-Tune was about to have its coming-out party.

The “Coming Out” of Auto-Tune

When Cher’s “Believe” hit shelves on October 22, 1998, music changed forever.

The album’s titular track — a pulsating, Euro-disco ballad with a soaring chorus — featured a curiously roboticized vocal line, where it seemed as if Cher’s voice were shifting pitch instantaneously. Critics and listeners weren’t sure exactly what they were hearing. Unbeknownst to them, this was the start of something much bigger: for the first time, Auto-Tune had crept from the shadows.

In the process of designing Auto-Tune, Hildebrand had included a “dial” that controlled the speed at which pitch corrected itself. He explains:

“When a song is slower, like a ballad, the notes are long, and the pitch needs to shift slowly. For faster songs, the notes are short, the pitch needs to be changed quickly. I built in a dial where you could adjust the speed from 1 (fastest) to 10 (slowest). Just for kicks, I put a ‘zero’ setting, which changed the pitch the exact moment it received the signal. And what that created was the ‘Auto-Tune’ effect.”
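
The dial Hildebrand describes can be imitated with a toy model in Python. Everything specific in it is an assumption made for illustration: the function names, the 10-millisecond analysis frame, the mapping from dial setting to glide time, and the choice to snap to the nearest equal-tempered semitone. It is a sketch of the behavior, not of the product’s internals.

```python
import math

A4_HZ = 440.0  # reference pitch for the equal-tempered scale


def nearest_semitone(freq_hz):
    """Return the frequency of the equal-tempered note closest to freq_hz."""
    steps = round(12 * math.log2(freq_hz / A4_HZ))
    return A4_HZ * 2 ** (steps / 12)


def retune(detected_hz, speed, frame_ms=10.0):
    """Return a corrected pitch track, one value per analysis frame.

    speed=0 jumps to the target note immediately; 1 through 10 glide toward
    it ever more slowly. The glide time (20 ms per dial step) is invented.
    """
    if not detected_hz:
        return []
    corrected = []
    current = detected_hz[0]  # begin at the singer's own pitch
    for freq in detected_hz:
        target = nearest_semitone(freq)
        if speed == 0:
            current = target  # the instant, "robotic" setting
        else:
            time_constant_ms = 20.0 * speed
            blend = 1.0 - math.exp(-frame_ms / time_constant_ms)
            current += blend * (target - current)
        corrected.append(current)
    return corrected


# A singer slightly sharp of A3, then slightly flat of middle C (assumed data).
sung = [225.0] * 5 + [258.0] * 5
print([round(f, 1) for f in retune(sung, speed=0)])
print([round(f, 1) for f in retune(sung, speed=10)])
```

At the zero setting the output steps straight from one note to the next with no glide in between, which is the abrupt jump listeners came to recognize. At the slow end of the dial, the pitch eases toward each note, much as an untreated voice would.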

Before Cher, artists had used Auto-Tune only supplementally, to make minor corrections; the natural qualities of their voice were retained. But on the song “Believe”, Cher’s producers, Mark Taylor and Brian Rawling, made a decision to use Auto-Tune on the “zero” setting, intentionally modifying the singer’s voice to sound robotic.

Cher’s single sold 11 million copies worldwide, earned her a Grammy Award, and topped the charts in 23 countries. In the wake of this success, Hildebrand and his company, Antares Audio Technologies, marketed Auto-Tune as the “Cher Effect”. Many people in the music industry attributed the artist’s success to her use of Auto-Tune; soon everyone wanted to replicate it.

“Other singers and producers started looking at it, and saying ‘Hmm, we can do something like that and make some money too!’” says Hildebrand. “People were using it in all genres: pop, country, western, reggae, Bollywood. It was even used in an Islamic call to prayer.”

The secret of Auto-Tune was out — and its saga had just begun.

The T-Pain Debacle

In 2004, an unknown rapper with dreads and a penchant for top hats arrived on the Florida hip-hop scene. His name was Faheem Rashad Najm; he preferred “T-Pain.”

After recording a few “hot flows,” T-Pain was picked out of relative obscurity and signed to Akon’s record label, Konvict Muzik. Once discovered, he decided he’d rather sing than rap. He had a great singing voice, but in order to stand out, he needed a gimmick — and somewhat fortuitously, he found just that. In a 2014 interview, he explains:

“I used to watch TV a lot [and] there was always this commercial on the channel I would watch. It was one of those collaborative CDs, like a ‘Various Artists’ CD, and there was this Jennifer Lopez song, ‘If You Had My Love.’ That was the first time I heard Auto-Tune. Ever since I heard that song — and I kept hearing and kept hearing it — on this commercial, I was like, ‘Man, I gotta find this thing.’”

T-Pain, who is capable of singing very well naturally, decided to use Auto-Tune to differentiate himself from other artists. “If I was going to sing, I didn’t want to sound like everybody else,” he later told The Seattle Times. “I wanted something to make me different [and] Auto-Tune was the one.” He contacted some “hacker” friends, found a copy of Auto-Tune floating around on the Internet, and downloaded it for free. Then, he says, “I just got right into it.”