Rant on Hatsune Miku’s English Voice Bank Development: What Are These Producers Doing?

Filed under: English Miku,Other — Written by jrharbort on Sunday, March 10th, 2013 @ 11:47 pm

I know I’m not the only one who has been following the news of English Miku’s development since nearly day 1. I was just one out of 39,389 other people who also cast their vote for the English Miku campaign by liking Hatsune Miku’s Facebook page after the announcement at the 2010 New York Comic Con.

So I know I can’t be the only one who has noticed that something very wrong has been going on since the release of English Miku’s first public beta demo at NYCC2011. Since that first demo, the vocals we’ve been hearing in newer songs using English Miku’s beta voice bank seem to be getting progressively worse.

Or are they?


In case you haven’t heard yet, the demo for the English version of livetune’s Tell Your World from the album Re:Dial is now available for preview on iTunes. This is just one more track to add to the growing library of songs that use English Miku’s beta voice bank. Among some of these songs are NICE AGE, DAY TRIPPER, and livetune’s other recent English release, a remix of ZEDD’s Spectrum featuring Hatsune Miku.

While different, all the songs share one very common trait: They all sound much less comprehensible and more “nasal-y” compared to English Miku’s very first NYCC2011 demo. So what exactly is going on here? Have Crypton’s later developments of the voice bank really altered Miku’s voice to the point where it’s no longer recognizable? I don’t think so.

The problem lies in the decision of the producers to tune or alter Miku’s vocals to “suit” the needs of the song. This becomes very apparent in the English version of Tell Your World, which has a completely different vocal type compared to the original song. The original is very widely known and considered by some to be “the best song that represents Miku and the community behind her”. There is no doubt that many fans will take notice of the drastic vocal change.

The vocal tuning also has one other negative side effect. The original demo at NYCC2011 was still based on the older Vocaloid2 engine, not long before the project was dropped to start fresh with the Vocaloid3 engine. Even so, the demo kept Miku’s vocals clear and virtually untouched. This was the first (and only) time we were able to hear English Miku in her “pure” form. All demos afterwards have been on the Vocaloid3 engine, but heavily tuned. Because of this, the fans have absolutely no idea what English Miku really sounds like at this point, and are also unable to tell what kind of progress Crypton has made in the quality of the voice bank since the first public demo.

This does not bode well for marketing, nor does it bode well for fans who have been following the development over the past couple years.

Is the tuning necessary because the voice bank is in English? Absolutely not. As the NYCC2011 demo and even Kaito’s recent V3 release have shown, it is entirely possible to have the Vocaloid’s voice sound the same as the original, even when having them sing in a completely different language.

In a recent video interview with Hirasawa Eiji, the composer of Miku’s very first Japanese demo song Hoshi No Kakera, he made this comment regarding Miku’s vocals:

“My first impression of Miku was that she could sing very naturally, so I wanted to make Miku sing delicate above all else. Rather than making Miku sing with an up-tempo tune, I made her sing slowly so that her vocal sounded more beautiful”.

His statements held true for Miku’s first demo song, which was quite clear and beautiful sounding. The same also held true for Miku’s first public English demo. So my final question comes down to this: When will the producers finally have the idea to create a simple, clear song that has English Miku singing with no extensive alterations made to the vocals?

Until that time finally comes, I will continue to look back at the outdated demo and try to imagine what kind of progress has been made over the past couple years. It’s the closest thing we have at this point to hearing Miku’s English voice bank in its original unaltered state.

7 Comments
  • Comment by Akita Neru | March 11, 2013 @ 10:33 am

    Crypton always bites off more than it can chew. Trying to make a dozen voicebanks simultaneously will come out wrong no matter how hard you try.
    You remember Act1 twins? English Miku will likely be the same. And Kaito will likely have a “native” voicebank, ‘cuz the 4 he offers now are fail compared to original (which was fail itself, commercial, though, because it was actually good), same story. Luka’s English bank is not finished, since it doesn’t have loads of necessary phonemes. Miku Append was first said to have 8 voices, the same were Kagamines Append, and V3 Kaito was said it’d have 6 voices not including English. Not mentioning that all the voicebanks they currently develop (and Kaito’s update) were said to be released much before this point.

  • Comment by sepul | March 13, 2013 @ 8:40 am

    I think the producers are at fault here. They tune the voicebank so much that it sounds too much like a robot.

    Personally KAITO V3 sounded very natural compared to the V1 version, even the English voice bank judging from the demo songs on the website. So hoping the English Miku would sound similarly natural.

    I think ryo and supercell would do a nice song with this one. They have made many songs that keep Miku’s voice as natural sounding as possible without adding sound effects.

  • Comment by Osher | March 19, 2013 @ 3:21 pm

    *Edited for not reading the article properly*

  • Comment by jrharbort | March 19, 2013 @ 3:39 pm

    @Osher: You seem to have misread and misunderstood the purpose of this entire article. I never claimed I desired Miku to become more popular, nor did I claim that English Miku is what made her popular. I’ve been around in this community for a very long time (since December 2007, to be exact). So I know what it’s like to be without it, and I’m perfectly fine with it NOT existing. However, since they ARE making it, it would be nice to see it done correctly, no matter how long it takes.

    Most of your own argument is invalidated by the fact that the vocal work for TYW and Spectrum was actually done by CircusP. But the entire process was screwed up by Crypton, because they never allowed CircusP or Livetune to have direct access to the voicebank. CircusP had to create the data using Luka’s English voicebank (which is incomplete, and a completely different vocal type), then pass this data over to Crypton. Then they plugged this data into English Miku’s voicebank, then passed the end result to Livetune. This will NOT come out with a good result, no matter how you look at it.

  • Comment by VYV2 | April 30, 2013 @ 12:44 pm

    One thing that people need to put into view is the fact that they are attempting to put a English voice for a Japanese Vocaloid when the producers and voice-makers can’t even speak English themselves correctly. Now I don’t mean by grammar, (and no offence to them) but I know that most people that speak a native language in Asia usually don’t have an easy time getting that accent out of their tone when they speak in English. I mean a synthesizer might be a bit (italics on that)easier, agree that they are not really thinking the American view of some things. The effort, I might start with a 3.5 out of 5 in the beginning, but I do believe that they are putting the effort in a different place now. They finally noticed that trying to make different more tones for Miku’s voice, or any Vocaloid’s voice, won’t work for English.(As most of you Japanese people know, you only have a fairly simple “voicebank” when it comes to talking in the language) Some of you guys might not notice, but English has some harsh accents that, even for Chinese and Japanese, doesn’t have. Now I don’t mean that everything is harder in English, I use the original Hatsune Miku CV-01 and I know that making English words on that is hard enough. But… they are making a different approach now, I can’t think of one right now, but the Crypton guys noticed that if they modify “this” tone, it matches the tone for an English tone. I find this really strange because they try to imitate a voice that they don’t even know how to say right. Well yea, they can just listen to an American vocal and then do something with that, but it just turns out “wrong.”

    So what then?

    I don’t have a perfect answer, maybe Crypton is searching for a solution right now, but at this time there is no decent way to get this problem solved without a group of vocal specialists that can merge Japanese Hiragana/Katakana/Kanji tones with English tones in a fast, productive way for the people that anticipate the newest Vocaloid product to come out.

    I don’t know, I might be BS-ing this whole time, but for sure that their Append for Miku is a bit less than I had expected for her CV01-dark-ProtoTYPEβ and CV01-vivid-ProtoTYPEβ. I still give it a half-thumbs up, but I feel that they should consider normal, literally, normal average American listeners and ask for their opinions for how to make their English/append products better.

    According to Pixiv and some other sources I trust, Miku Append is a huge step in Japanese/English relations, but they are cutting short on the American input, sure they are listening, but they need more of not the “professionals” view, and just simply ask the natives (citizens) that live in the US. I have done this for my own CV-01 Hatsune Miku songs and I have made Miku’s English much more understandable. Sure, there would be something there that is still not right, but it is better than some the professional songs made by notable song-makers like Kz or Supercell.

    Well, here’s my five bucks on the topic I mean it’s longer than I had thought I would write, but I felt giving a bit more this time. [BTW I pretty sure I BS-ed some of the parts:)]

  • Comment by *(Cookie)* | July 30, 2013 @ 4:42 am

    I agree with all the comments. :)
    But about what ‘VYV2’ said: why does Crypton have to ‘just simply ask the natives (citizens) that live in the US’? Why in the US? Why not listen to people in England? I mean, they speak as perfect English as Americans do. I love England, America and a lot of other countries for that matter. But I would think that it would be unfair if they didn’t include the thoughts of people in England as well.

    What I’m trying to say is that I agree with your point, ‘VYV2’, that it’s important that they listen to people’s thoughts, views and how people speak English. But I’m afraid that Miku will turn out sounding too American (that’s my thought), and it would be the same if they only listened to people in England: she might sound too British. So I think that basically they should take into account that Americans and British people do sound different, but both speak practically the same English.

  • Comment by digited | August 16, 2013 @ 5:18 am

    @ VY1V2
    “One thing that people need to put into view is the fact that they are attempting to put a English voice for a Japanese Vocaloid when the producers and voice-makers can’t even speak English themselves correctly.”

    ^ this, for starters. It wasn’t much better with those who knew English, as they worked with voicebank and software indirectly, apparently.

    All EMiku demos are really bad, only worthy of a facepalm.

    I’m waiting for 31 Aug for sane english-speaking ppl to grab Piapro Studio and finally make something properly tuned. Crypton completely failed with presenting voicebank so far.


© Mikufan.com