The YouTube voice. You can hear it in your head, can’t you? In fact, you’re probably using it to read this paragraph. Your inner voice suddenly sounds uncannily like Hank Green.
As YouTube has evolved from an artistic repository of classics like “Charlie Bit My Finger” and “Harry Potter Puppet Pals” into a platform where influencers and explainers can monetize their every opinion, something else has evolved as well. That’s the YouTube voice.
Listen to a few of the most popular YouTubers out there. Hank Green is a good example, but so are Tyler Oakley, Hannah Hart, and even Ryan ToysReview, the eight-year-old who earned $22 million from his YouTube videos last year. They’re all using “the voice.” It’s not particularly easy to describe: it’s kind of lilting and very casual, but it also maybe sounds a little bit like a 1940s radio star. Whatever it is, it’s catchy and it’s everywhere.
We were curious. What is this voice, exactly, and where did it come from? So we started researching, and it turns out we weren’t the first to ask the question. There’s not a lot of academic research out there about the “YouTube Voice” or the “Vlog Voice,” but plenty of linguists have weighed in, nonetheless. Here are some of our most interesting discoveries:
Origins of the Voice
So where did it start? Toronto-based speech pathologist Erin Hall says in an article for Vice that these slightly overwrought vocal patterns are probably just a byproduct of early YouTubers’ efforts to make themselves understood on video.
They’re recording themselves, and they know they need their audience to catch their every word, and so they innately slip into those patterns. “Just the reflex of taking a video of yourself, you’d probably automatically do some of those things,” says Hall.
In an interview for The Atlantic, American University Linguistics Professor Naomi Baron likens them to carnival barkers: “I guess the purest form of this style is the carnival barker. It’s less intense on YouTube, of course, where the audience chooses to click on a video—the speaker isn’t trying to grab the attention of people who just happen to be walking by, chewing on their cotton candy.” But, she says, the speaker is working hard to keep viewers’ attention even though there’s not really any action on the screen.
They’re putting on a voice—not unlike a news anchor would, Hall says—that feels most appropriate for this kind of communication.
Characteristics of the Voice
Okay, so we can call it bouncy or old-school or catchy, but what makes the voice all those things? Three characteristics came up over and over again in our research:
1. Overly Enunciated
Baron explains that a key part of the voice (the part that makes YouTubers sound a little bit like old-school radio announcers) is the speakers’ dedication to overenunciating.
You hear it in the vowels that they pronounce much more precisely than we would in everyday communication, the vowels they stretch juuuust a little bit for emphasis, and the vowels they suh-neak into words where none really exist.
And it’s not just the vowels, either. Vloggers hit their consonants just a little harder than we would in everyday conversation, popping their Ps and Ts and giving those K sounds a little extra crunch. (You can really hear this in action when John Green gets on a roll.)
2. Super Casual
While these speakers’ enunciation may be extra formal, their overall tone tends to be much more casual than you might expect from a professional video presentation.
We can chalk that up to the YouTubers’ efforts to build “real” connections with their audiences by creating the illusion that they’re talking to fans, rather than at them. In creating this illusion, YouTubers are building what psychologists call “parasocial relationships,” in which audiences invest so much emotional energy into a media figure that they begin to feel as if they know that person.
The casual tone (some variation of “Hey, y’all, what’s up?” begins countless videos), the use of “we” rather than “I” to make viewers feel they’re experiencing things together as opposed to vicariously, and, of course, the personal, subjective nature of many of these videos all contribute to that “parasocial relationship” dynamic. But it’s the casual tone that really keeps it afloat.
3. Wildly Enthusiastic
Finally, in case you haven’t noticed, YouTubers are so psyched about everything they have to say. Sometimes this manifests as being highly bubbly; sometimes as basically shouting at the camera. Often, it means talking so fast that all that overenunciation really does become necessary.
To share a cynical point of view, YouTuber Jonbehere isn’t a fan: “I think that’s a way of disguising their videos are shit,” he says in Vice. “When people yell it’s overwhelming and masks the lack of thought that goes into the video.”
Hall puts it a little more delicately: “Even if what they’re saying is standard, adding a different kind of intonation makes it more engaging to listen to.” And while unbridled enthusiasm could certainly be a tool to mask a lackluster message, let’s imagine for a minute that what these YouTubers have to say is truly necessary, captivating, and important.
In this case, they’re playing into emotional contagion, a phenomenon whereby audiences pick up on and “catch” a speaker’s emotions, as though they were contagious. Are they overdoing it? Sure, maybe sometimes, but this goes back to the notion that, when we’re talking to our audience through a camera rather than in real life, we have to work much harder to hold their attention. If they can’t help but “catch” our enthusiasm for our message, then we’ve got ‘em.
So there you have it: the secrets behind the Vlog Voice. But here’s one final note: in all the performance of making a great YouTube video, remember to remain authentic. You can tailor your delivery to your viewers, of course, but if shouting at the camera isn’t you, don’t shout at the camera. Today’s audiences can spot a fake a mile away, and you’re much more likely to build true, loyal fans when you’re communicating as your authentic self.