Harmony Engine
Vocal harmonies are more than just frosting on the cake; they can really make or break a song. Would “Bohemian Rhapsody” be a hit without those huge stacked harmonies? Would the oft-imitated Pixies be the band they were without the sweet harmonies of Kim Deal? Would Paul Simon’s “Loves Me Like a Rock” have been a hit without the Dixie Hummingbirds backing him up? The answers aren’t at the back; they’re no, no, and no.
For me, software that could create realistic-sounding vocal harmonies has been the holy grail. The first piece of software I ever bought besides Logic was a program called Harmony, by a company that has long since gone out of business. It was awful. It sounded ridiculous, was impossible to use, and constantly crashed. Little did I know at the time (almost 10 years ago now) that this was the cutting edge of what software could do. Even the “big boys” didn’t have anything that could do harmonies at any price; it just wasn’t possible. If you listen to Doug’s fascinating interview with the founder of Antares, you hear him tell how he was practically mugged by big-name producers when they heard a beta of Auto-Tune that he carried around with him on a floppy. Vocal reproduction just completely baffled software developers.
Fast forward a few years and you have the introduction of Melodyne, to my ears the first piece of software that could transpose a sung note more than a whole step and still sound realistic. With Melodyne you can produce realistic-sounding four-part harmonies that stand up to scrutiny.
What’s the downside? It’s no picnic. It takes some serious time and elbow grease to get good results. This is no knock on Melodyne, as I love that software like no other, but creating harmonies takes time. And more than time, it also takes some decent knowledge of voice leading (see sidebar). Every note takes careful arranging and listening and adjusting and listening again. You can sometimes get away with parallel harmonies for a couple of notes, and in some genres (mostly traditional country and folk) for longer, but usually you will get something that just sounds silly and amateurish. We’ve been listening to vocal harmony for a few hundred years now (one could argue that harmony is the major contribution of Western music), so the bar is set pretty high.
So enough preamble; here is where Antares comes in with their Harmony Engine. I am sure they knew that, as the name associated with vocal processing, their product could not just be a “me too” product. It needed to do something different, something better. So what did they do? They made creating vocal harmonies, good ones, easier. And not just easier, but more musical.
As they say in the manual, in the past there have been pretty much two ways harmonies were made for you: either by specifying fixed widths (by interval or by scale degrees) or by manually specifying the notes via MIDI. So your two choices were crappy or hard (and often hard would turn out crappy). Antares offers you two more modes: Chord Degree mode and Chord Name mode. They also give you ways to make the crappy less crappy, and the hard, easy.
Before I talk about the different modes though, let me talk a little bit about the things that are common to all modes.

So as you can see here, you have four voices to play with and no more. If you want more, of course, you can just open another instance of HE, but that’s going to get tricky, so I’d save that for later. If it’s just thickness you desire, better to do as they suggest and run the voices through AVOX Choir (or Waves Doubler, or any other plug-in that gives you the illusion of thickness without adding new notes).
First there are the global controls at the top. The first set is under the heading “Humanize”, and if you have been around music software for a while you know that when programs say Humanize they usually mean Randomize. That’s what the Pitch and Timing variation controls do: they add a little unevenness to either the pitch or the timing. So you can choose to make your virtual backup singers as tone deaf or as sloppy as you want, to avoid the dreaded “perfect” (aka unnatural) sounding vocal. The irony is not lost on me, of course, that I and many others have spent countless hours “tightening” or “de-humanizing” background vocals made by humans for the modern sound found in many Pop and R&B recordings.
Naturalize sets how much of the “non-pitch” content will be sent to the harmony voices. By non-pitch content I mean vibrato and what Antares calls “pitch gestures”, which would be things like a singer “scooping” up to a note, or doing the Dylan-esque drop-off after each phrase. These are stylistic things that the harmony vocals should not emulate (unless you want them to; that’s what the control is for).
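To make the "Humanize really means Randomize" point concrete, here is a minimal sketch in Python. It is not Antares' algorithm; the function name `humanize`, the note representation, and the default amounts are all assumptions made up for illustration. Each note gets a small random offset in pitch (in cents) and in onset time (in milliseconds).

```python
import random

def humanize(notes, pitch_cents=10.0, timing_ms=15.0, seed=None):
    """Return a copy of `notes` with small random pitch and timing offsets.

    notes: list of (onset_ms, midi_pitch) tuples (hypothetical representation).
    pitch_cents / timing_ms: maximum random deviation in either direction.
    """
    rng = random.Random(seed)
    out = []
    for onset_ms, pitch in notes:
        # 100 cents = 1 semitone, so convert the detune to fractional semitones.
        detune = rng.uniform(-pitch_cents, pitch_cents) / 100.0
        shift = rng.uniform(-timing_ms, timing_ms)
        # Clamp so a note never starts before time zero.
        out.append((max(0.0, onset_ms + shift), pitch + detune))
    return out

backing = humanize([(0, 64), (500, 67), (1000, 72)], seed=1)
```

Turning both amounts down to zero gives back the "perfect" part unchanged, which is the setting you would be trying to avoid.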
| Voice Leading - What's That? |
| Voice leading is just a term for creating the smoothest and "best" sounding transition from chord to chord. Here "voice" does not necessarily refer to a human voice, but just a single musical line that, combined with the others, forms a chord. Four voices is basically the standard, since this covers the four vocal ranges: bass, tenor, alto, and soprano. To quote from Walter Piston's Harmony: "The smooth connection of chords is primarily a melodic process, in which the structure of chords as simultaneously-sounding horizontal parts must continually be taken into account. Composers in the common practice period were always attentive to such linear considerations, even in music that is primarily chordal." Now a lot has changed since Bach and I have composed music, but the voice leading rules still largely apply, since their main objective is to avoid things that sound corny, and if it was corny in the 17th century, you can bet it's corny today. If you wish to learn more about voice leading I suggest picking up Walter Piston's book mentioned above, or if you are a guitar player, John Thomas' "Voice Leading for Guitar: Moving through the changes" may be a better choice, because I think voice leading on guitar is more challenging than on a keyboard. |
Generally the lead singer can get away with more of the gestures, since that’s what makes for an emotive performance, but backup vocals aren’t supposed to emote, just support. As much as everybody loves the backup vocals in Ms. Franklin’s version of “Respect”, we all know it’s Aretha who wants her man to give her what she deserves, not her and her backup singers. (In a long parenthetical digression, I do miss the old R&B style of the backup singers actually having a separate identity, or at least their own pronoun. For example, the lead singer would sing “Oh I’m so blue” and the backup singers would sing in response “oh HE’s so blue”. I think this comes from live performance, where it’s much harder to overlook the fact that the lead and backup singers are, in fact, not the same person. On recordings, it all gets blurrier.)
So now that we are all humanized, it’s time for a little computer trickery in the Formant or Formant and Pitch Freeze functions. Formant freeze essentially just freezes the mouths of the backup singers but not their vocal cords. So if the lead singer was singing “I’m smooth as a baby just dropped in this world”, you could freeze the formant of the backup singers so they would sing “I’m smoooooooooooooo” but still follow the pitch of the phrase as it moves. If you freeze both Formant and Pitch, the backup singers sustain the note sung on the “oo” while the lead singer separates and moves along (which is very common).
So now we move into some different territory. You may be confused by the last parameter, because it looks like it’s just an on/off button. Would you need to get into a lot of automation? No. Now we come to a key part of understanding how the Harmony Engine works: presets. I personally think “presets” is a bad name for these, because they are not, well, pre-set; you have to set them. I think “snapshot” would have been a better name, at least to get your head around how you use them. Rather than get into a lot of tricky automation, you can use the “presets” to switch all of the settings that affect what kind of harmony is generated, with separate presets for the sound of each voice. Between these presets you can program all these changes at once without a lot of crazy automation hooha, since it doesn’t matter whether it’s “I’m smoooooo” or “I’ve gone awaaaaaaaaa”; it’s the same Freeze Formant settings. Sweet, huh?
Now I am going to blaspheme the review bible and skip a whole bunch of settings for each voice, or at least just take a very broad swipe at them. There are a lot of settings that basically need to be used to get the voice to sound natural (or unnatural if you are into that sort of thing). You can do all the things that you can usually do with any formant-based processing: change the sound of the voice from male to female, or female to male, or female to pixie, or male to Satan. You get the idea. These controls are there both to adjust for the pitch change you are going to make the voices do, and to give each voice a slightly different character, to make them sound like four distinct people.
Now people in this vocal arena are always quick to say that besides emulating real human singing you can use it for “experimental” or “effects” uses. That’s basically because weird is easy and real is hard. To get these voices to sound real takes some tweaking. Such tweaking that one of the controls goes from “Trial” to “Error”, so nobody lets you off easy; you’re going to have to use your ears. I actually found it relatively easy to get the harmony voices to sound real on my voice, a baritone, which typically tends to be much harder than a soprano. The only issue I had was adjusting things so that I didn’t have any digital artifacts. I’ll touch on this again at the end of the review. I learned one trick quickly, since I sometimes have some “pitch issues”, which was to take a copy of my vocal and Auto-Tune it hard, way harder than I would on a vocal meant to be heard. I can then add separate vibrato for each individual voice, along with automatically generated levels of pitch and timing randomness. In other words, I straighten it out so I can bend it later (sort of like taking something out of the oven and blowing on it. Hey, wasn’t I just trying to make that hot?)
Now all of that is well and good, with emphasis on good. But here is where it starts to get really interesting.

You can see from this screenshot that we are using Chord Name mode, and it’s easy to miss how much we’re not doing here. We’re not telling it what note each voice is; we just say “how about a major 7th suspended 4th chord please” and HE (that's Harmony Engine, not the Lord) does the rest. With the Spread and Register controls you can tell it how far up and how far apart you want the voices, and you are all done. If you’re working from a lead sheet or fake book you can just type in the chord names right from that, creating a “preset” (think snapshot) for each chord that you can switch back and forth between as the song moves through the chords.
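As a rough illustration of what "name the chord and let the software pick the notes" involves, here is a minimal Python sketch. It is only a guess at the general idea, not how Harmony Engine actually voices chords: the chord table, the `voice_chord` function, and the way `register` and `step` stand in for the Register and Spread controls are all assumptions made for this example.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitone offsets from the root for a few chord qualities (illustrative subset).
CHORD_INTERVALS = {
    "maj": (0, 4, 7),
    "min": (0, 3, 7),
    "maj7": (0, 4, 7, 11),
    "maj7sus4": (0, 5, 7, 11),
}

def voice_chord(root, quality, register=60, voices=4, step=1):
    """Pick MIDI notes for `voices` harmony voices.

    register: lowest allowed MIDI note (stand-in for a Register control).
    step: 1 = close voicing, 2 = every other chord tone (stand-in for Spread).
    """
    root_pc = NOTE_NAMES.index(root)
    pitch_classes = {(root_pc + i) % 12 for i in CHORD_INTERVALS[quality]}
    # First occurrence of the root at or above the register.
    start = register + (root_pc - register) % 12
    # Every chord tone from there to the top of the MIDI range, ascending.
    ladder = [p for p in range(start, 128) if p % 12 in pitch_classes]
    return ladder[::step][:voices]

print(voice_chord("C", "maj7sus4"))          # close voicing from middle C
print(voice_chord("C", "maj7sus4", step=2))  # wider voicing, same chord
```

The caller only supplies a chord name plus two broad "where and how wide" settings, which is the appeal of the mode described above.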
Antares gives you fifteen slots for the different chords and if you have more than fifteen chords, well firstly you might want to just rethink that, but if you really do, you can save all the presets as a preset in your host application and switch through “banks” of chords that way. And if you have more than fifteen you’re probably used to dealing with that level of complexity (and seriously it’s really not that outlandish with five chords for the verse, five for the chorus, and five for the break, not counting any sort of intro or outro). Note that HE takes the four notes it generates and tries to spread them consistently over the four voices so you don’t have one voice that is very high and then suddenly very low. This allows you to put processing on the voice without worrying if your pixie will suddenly turn into Satan unexpectedly.
The next type of Harmony Source is really the same thing in a different style: chord degrees. See how it says Key/Root there? In Chord Name mode the C indicates the root of the chord, and it is stored with the preset along with the inversion type. In Chord Degree mode this is the key, since chord degrees describe chords relative to the key. So if you were doing the old I-V-IV, you would choose Tonic, Dominant and Subdominant and the type of inversion of each. Don’t know what they are? Then don’t worry about it. You don’t need to use this mode; it is only there to accommodate people who have been taught harmony this way, or people for whom the key may change often. If that’s not you, then Chord Name gives you the same results.
Note that when you move into these modes HE is not getting its pitch information from the lead vocal anymore, just the timing and phrasing. As far as it’s concerned, the lead vocal could just be a monotone. Really quite a feat.
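The reason chord degrees suit songs that change key can be shown with a small Python sketch. This is plain music theory, not anything taken from the plug-in; the function `degree_to_root` and its tables are hypothetical names for illustration, and it assumes a major key and spells every note with sharps for simplicity.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitone offsets of the seven major-scale degrees from the key note.
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)

DEGREES = {
    "Tonic": 0, "Supertonic": 1, "Mediant": 2, "Subdominant": 3,
    "Dominant": 4, "Submediant": 5, "Leading tone": 6,
}

def degree_to_root(key, degree):
    """Return the chord root for a named scale degree in a major key."""
    key_pc = NOTE_NAMES.index(key)
    return NOTE_NAMES[(key_pc + MAJOR_SCALE[DEGREES[degree]]) % 12]

progression = ("Tonic", "Dominant", "Subdominant")  # the old I-V-IV
print([degree_to_root("C", d) for d in progression])
print([degree_to_root("D", d) for d in progression])
```

The same three degree names produce the right roots in either key, so only the key setting has to change when the song modulates.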
So I mentioned how HE makes the crappy less crappy and the hard easy? Well let’s start with less crappy.
You can use the simplest mode, the fixed interval, where you would say that Voice A is going to be +4 above the lead vocal, Voice B +7 and so on. So you have a major third and a fifth above the root. But using the presets you can switch those so that when the lead is on D, Voice A is +5 (G) and Voice B is +9 (B), giving you the dominant in second inversion (wait, that gives me parallel thirds, hmmmm). You can assign that to a different preset so you can switch back and forth, or add more variations to avoid that “blocky” sound created by strict parallel harmonies. That’s either more work or more control, depending on your take on it. Certainly if you want to create harmonies that don’t form any traditional “chords” (for example, if you are doing Bulgarian harmonies, which often use the major or minor second, a strict no-no in traditional harmony) this will be the mode you will need to use.
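The arithmetic of fixed-interval harmony, and why switching presets helps, can be sketched in a few lines of Python. This is an illustration of the idea only; the function name `fixed_interval_harmony` and the preset labels are made up for this example and do not come from the plug-in.

```python
def fixed_interval_harmony(lead, presets, choice):
    """Add harmony voices at fixed semitone offsets above each lead note.

    lead: list of MIDI note numbers for the lead vocal.
    presets: dict mapping a preset label to a tuple of semitone offsets.
    choice: list of preset labels, one per lead note.
    """
    return [
        [note + offset for offset in presets[label]]
        for note, label in zip(lead, choice)
    ]

presets = {
    "A": (4, 7),  # major third and fifth above the lead
    "B": (3, 8),  # minor third and minor sixth above the lead
}
lead = [60, 64]   # C, then E

# One preset throughout: every voice moves in parallel with the lead.
print(fixed_interval_harmony(lead, presets, ["A", "A"]))
# Switching presets: the lead moves C to E but the chord stays C major.
print(fixed_interval_harmony(lead, presets, ["A", "B"]))
```

With a single preset the harmony voices shadow the lead exactly, which is the "blocky" parallel sound. Switching to a second set of offsets on the E keeps the voices on G and C, so the chord holds while the lead moves.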
Now for making the hard easy. Often programs like Melodyne will let you define the pitch of a voice using MIDI. That’s great but again you are doing the work of having to voice each chord manually. In Chord via MIDI, HE gets its pitch information all from a MIDI channel (a keyboard part for example) but then realigns them based upon your Register and Spread settings. Again, a lot of the drudgery is taken out of doing this. If your keyboard part is played by a real keyboard player they have probably thrown in the inversions as well, as the proper voice leading chords tend to be the ones that are easiest to switch with your hands.
Need more control than that? Do you want to have each individual voice move while others hold the same note? Then move to MIDI Channels mode, where you use four MIDI parts, each on a separate channel, to define each of the four voices. Knock yourself out. Do a barbershop quartet or a Bach chorale.
You can hear this yourself right this minute by listening to the demo “Lift Us Away” on the Antares site. The male vocal part is created using Chord Name while the female vocal part is created using MIDI Channels mode. You can hear one of the female “singers” shift notes during a held vowel. Nice.
This leads me to something that is not really a feature of the software but a great learning tool. Those demos you heard that got you all excited about the product? Well you can download versions of those projects for most DAW recording applications. I found this invaluable in determining what the limits of the software were and when was the best time to use the different modes.
Now all these great features don’t mean much if the voices sound robotic. So can you get real results? Well, I am ambivalent about telling you that, lest I ruin it for you. My advice is this: listen to the demos. They are a good representation of the results you can get yourself without much tweaking or any special tricks. I was able to get well-voiced harmonies within a relatively short time after reading the well-written and easy-to-read manual. The demos also represent well the different ways HE might be used: a huge choir, an intimate close harmony, a “second singer” harmony where there is just a second voice adding harmony accents, and as an effect. Pick the one you think you would be most likely to use and listen closely. If you find those convincing, then you will be pleased with HE. I listened to them backwards. I listened to the individual voices first and then listened to the finished product, so that has forever skewed my thinking. I think one of the demos has very audible artifacts in it. You can’t tell which one? Great, then enjoy. You and HE will make beautiful harmonies together. Stop reading now. If you do hear artifacts (not counting the intentional ones) then you will probably still find HE very useful, but know that it has limitations.