First Sound of the Future?

Author: David Hoehl - TNT USA
Published: July, 2018

Those of us devoted to the various television series in the Star Trek universe will recall that in Star Trek: Voyager, which originally aired from 1995 to 2001, the ship's doctor is a hologram (played by human actor Robert Picardo)--a somewhat dyspeptic, balding, middle-aged male figure called the "Emergency Medical Hologram," or "EMH"--with a penchant for singing opera. In this way, as often happens, the show has proved to be a mirror reflecting events to come--but, as also often happens, a distorted mirror, one in which the broad picture is fairly accurate but the details are wrong. And so it is that now, 17 years after the last episode of the series aired, we have concerts featuring a holographic singer, but, far from opera, the music is crowd-sourced pop.

The name "Hatsune Miku" undoubtedly is familiar to younger TNT readers who follow up-to-the-minute Internet trends, but this old fogey became aware of it only upon reading about "her" in that most quaint of sources, of all things, a printed newspaper. For those as behind the times as I am, Hatsune Miku is a holographic character described in my favorite online resource, Wikipedia, as follows: "a Vocaloid software voicebank developed by Crypton Future Media and its official moe[1] anthropomorphism, a 16-year-old girl with long, turquoise twintails. She uses Yamaha Corporation's Vocaloid 2, Vocaloid 3, and Vocaloid 4 singing synthesizing technologies. She also uses Crypton Future Media's Piapro Studio, a singing synthesizer VSTi Plugin. She was the second Vocaloid sold using the Vocaloid 2 engine and the first Japanese Vocaloid to use the Japanese version of the Vocaloid 2 engine. Her voice is modeled from Japanese voice actress Saki Fujita. Hatsune Miku's personification has been marketed as a virtual idol and has performed at concerts onstage as an animated projection (rear cast projection on a specially coated glass screen)." And what, you may ask--well, at least, I asked--is a "Vocaloid"? Following the convenient link in the article, I find that it is software that "enables users to synthesize 'singing' by typing in lyrics and melody. It uses synthesizing technology with specially recorded vocals of voice actors or singers. To create a song, the user must input the melody and lyrics. A piano roll type interface is used to input the melody and the lyrics can be entered on each note. The software can change the stress of the pronunciations, add effects such as vibrato, or change the dynamics and tone of the voice."

In other words, as I understand things, what we have in Hatsune Miku is a holographic video generator mated to a voice synthesizing system that turns music and lyrics input by a user into a virtual performance appearing to originate with an animated figure heavily endowed with the Japanese anime "cute" factor (and hair colored Yamaha's signature turquoise blue). I say "virtual" because the performer is a machine-created figure, not "live," and "performance" rather than "recording" because the music is created afresh in real time, not played back from some storage medium where it is kept in complete form. The performance may be identical to every other performance that has preceded it, but depending on how the underlying computers are programmed, it can also vary at any point.

More important than the technical legerdemain is the way in which this character has been presented to the public: under a Creative Commons open license, meaning that, like Wikipedia, the encyclopedia anyone can edit, Hatsune Miku is a singer who will sing songs anyone composes. As quoted in the Washington Post, Cien Miller, who has become a celebrated composer for the system under the Internet pseudonym Crusher-P, described the advantages of the open license this way: "You can use her voice for your music without fear of breaking any rules. It was the perfect formula, because people loved Miku, and if you used Miku, people would be interested in what you were making." Miller should know; at 23, she is a full-time composer in large part because of her success writing Miku songs.

The character's name is a fabrication constructed from three Japanese words to mean, roughly, "The First Sound of the Future," and so calling it may well not be an overstatement. Not limited to computer screens or 3-D movie theaters, the Hatsune Miku character has been taken on several successful concert tours in Japan, elsewhere in Asia, and now in the United States. A European tour is planned for later this year. In each, a small band of human musicians backs the hologram, which, acting as lead singer/dancer, is projected on a glass screen spanning the stage. Performance material is drawn from YouTube contributions by the general public worldwide, selected in keeping with computer-tracked popularity figures. Promoters are careful to ensure at least some songs are in the local language of each performance venue. The promoters of the current tour even held a songwriting contest, with Hatsune performing the winning song each night.

What might this model mean for the future of the music industry? Hatsune Miku may be the first rumbling of yet another disruptive technological storm on its way to sweeping over the entertainment business; then again, it may be just another geeky flash in the pan, like Dolby FM or 3-D TV. (Bear in mind, however, that even if it fades out, sometimes technologies appear too early, lie fallow for a while, and then take off, like multichannel recording, which was a flop in the '70s as "quadraphonic sound" but years later reappeared, married to video, as "surround sound" and from there hasn't looked back.[2] That video element is important when we consider the potential of Hatsune Miku.) I'm no Star Trek prognosticator, but please humor the "vintage guy" as I stray from my usual climes and muse on some possible implications if Hatsune Miku really is the "first sound of the future."

Traditionally, ever since recordings supplanted amateur home performance as the primary source of domestic music, the conventional music industry model has been a closed loop in which professionals created the product for fans to purchase. Consumers were locked into their role; beyond the decision to buy or not to buy, they had little or no input into the process of developing the content they purchased, and recording companies stood as gatekeepers determining who would and who would not get the nod for public exposure. That model has persisted even as, in recent years, the lines have begun to blur, as advances in music creation software and Internet platforms like YouTube have enabled would-be producers outside the professional enclave to reach a widespread audience.

Hatsune Miku accelerates the latter process. Now those pioneers who were setting up little YouTube channels have escaped the confines of phone, tablet, or PC screens and burst onto the stage in the concert hall, where large crowds are willing to gather and pay $55 to $155 per ticket to hear fan creations "performed" by a "singer." Yes, at this point the exercise still requires a certain suspension of disbelief, and the technology has its weaknesses: Hatsune Miku looks like a cartoon, appears to have a relatively limited repertory of movements, and sings in a voice that clearly betrays its synthesized origins. What's more, concert attendees who wish to wave brightsticks are instructed to restrict themselves to those sold at the show, the given reason being that others are too bright and would wash out or obliterate the singer! No matter. If the history of the computer age has taught us anything, it's that popular technology usually starts out looking pretty crude but steadily grows more sophisticated and, in the case of modeling reality, more convincing. For example, just compare the original Donkey Kong or Space Invaders (I'll not betray my age by referring to "Pong" or "Brickout") to any of the current generation virtual worlds. In that way, we can expect developers to hone and refine Hatsune Miku or something like it into an ever more convincing simulation of an actual human performer, not to mention the possibilities of creating virtual non-humans--sticking with the Star Trek theme with which I opened, I wouldn't be surprised if someone were to create an entire Klingon opera. We can also expect enhancements from the artificial intelligence developers, enabling virtual performers to leave behind the requirement that they remain "on script" and, again, making them more convincing evocations of living beings.

Incidentally, I don't see the effects of this technology as being limited to popular music. The days of holograms performing in opera may be as far off as the Starship Enterprise, but I can think of any number of situations in which a holographic performer or even ensemble might readily take the place of living musicians. For instance, local vendors of stage equipment might start carrying hologram generators, in which case couples wanting a string quartet or little jazz combo for their wedding reception, but on too tight a budget to afford one or living in a place where no such ensemble is available, might choose to rent a holographic substitute instead. (Or here's one for those who like to ponder philosophical, ethical, and theological questions: could the couple be legitimately married by an appropriately consecrated holographic clergyman?)

How about implications for recordings? I can foresee a couple of conflicting influences. On the one hand, if Hatsune Miku or something like it really catches on, I would expect life to become much harder for those who make a living cranking out mediocre pop songs for records purely for commercial reasons. Currently, the industry's voracious appetite for material ensures a place for the time-honored "hack," but in a crowdsourced model such writers will be competing with a world of amateurs, some very likely first-rate talents, who formerly would not have passed the corporate gatekeepers. On the other hand, as computers monitor the popularity of these newcomers' creations and flag the most popular for performance by virtual performers, getting attention for songs outside the most mainstream taste may become much harder--the world may become one gigantic hack composer.

Another possible disruption will come if the electronics industry develops home holographic systems. Today, consumers buy gigantic TV screens and set up "home theater" systems, or they listen to pre-recorded music on earbuds or over speakers. Imagine if they could have a life-sized holographic likeness of a human actually "performing" in their homes instead! Even more, imagine if the hologram had sufficient artificial intelligence to vary performances as a live performer would, meaning no two auditions would be exactly alike. Suddenly, the records we all know and love might come to seem very much to be "canned music."

Well, as noted, I'm no gifted technological seer, and all the foregoing speculations are likely so much pie in the sky. What is not open to question, however, is that here and now holographic projections are presenting music to paying audiences in partnership with live musicians. Like it or not, adopt it as a new standard or not, the world is changing, and how this new technology plays out will be, in the words of Star Trek's Mr. Spock, "fascinating."

[1] - "Moe" is a term from the world of Japanese anime; its rough English equivalent is "cuteness."

[2] - Obligatory TNT-Audio disclaimer: We Support REAL Stereo!

Copyright 2018 David Hoehl - drh@tnt-audio.com - www.tnt-audio.com