 |
WaKan Project Website Forums about WaKan and Japanese & Chinese language
ncc1701
Joined: 16 Jan 2004 Posts: 77
|
Posted: Wed Mar 17, 2004 11:14 am Post subject: |
|
|
Hello.
About sound files, there may be a very interesting additional feature.
Microsoft freely distributes (they can be downloaded from its website) what it calls Agent Characters. These agents combine a text-to-speech engine with an animated character in the style of the Office paperclip.
These agents can be installed in the system and accessed using ActiveX calls in the software. I don't know exactly how it is done. I have seen this in some HTML pages, with ActiveX calls inside the code.
If just the text-to-speech part is invoked, it can read the text, including all the kanji-kana conversion.
I must say that the quality is not very good, but it is free. Another TTS product, of VERY high quality, is RealSpeak from ScanSoft, but it is expensive.
An example of the RealSpeak engine can be found in Voice of Japan (http://www.kanji.de/VoiceOfJapan.htm), where you can download a demo.
I know that MS Agent is available in Japanese; I don't know about Chinese.
RealSpeak is available in Japanese, Cantonese and Mandarin. I don't know about prices, but it is surely too much for freeware.
Maybe the free MS Agent text-to-speech is worth a look, and the next version may be a talking one!
(Oops, this post should be in Feature Request, but the sound files options gave me the idea.)
Best regards. |
|
|
 |
ndt44 Co-Admin
Joined: 23 Feb 2004 Posts: 373 Location: Tokyo, Japan
|
Posted: Wed Mar 17, 2004 2:23 pm Post subject: |
|
|
Filip,
Firstly, I must once again say what a fantastic job you’ve done. The speed with which you’ve implemented these great new features is just amazing. Well done! I hardly had a chance to have a look at 1.47 before 1.48 came out, and 1.48’s example database function is superb.
I just had a look at the abovementioned 'Voice of Japan' application, and I must say the 22kHz version is pretty impressive. It sounds a lot like the Japanese I hear in public service announcements and in department stores over the P.A. in Tokyo. Not quite natural, but good for listening practice, and remarkable for the technical feat alone. At $30.00 US, I'd say it's worth it. Having said that, I can imagine a far more effective implementation of the reading function if it were integrated into a well-designed application like Wakan, though I have no idea how much the authoring tools would cost... The other offering from the same group, KanjiQuick, was quite confusing and poorly designed, and I can't say I'd recommend it. Wakan is substantially better built in my opinion, and getting better every day.
I had a chance to play around with 1.48 this afternoon, so I've posted some observations / bug reports in their respective forums.
Thanks again,
-Nigel |
|
|
 |
wakan Site Admin
Joined: 18 Oct 2003 Posts: 918 Location: Prague, Czech Republic
|
Posted: Wed Mar 17, 2004 9:23 pm Post subject: |
|
|
Hello everybody!
I'm delighted by the attention 1.48 has received from you. I must thank Jim Breen and all the people who prepared the example file. For me it was only a few-hour speed-programming exercise where the main bulk of work was in the conversion and indexing of the original file. But preparing such a _Huge_ database must have taken a lot of time from a lot of people. It seems to me that the file was at least partially manually edited. I think I'm obliged to send Mr. Breen a letter of praise for it.
Sound files: I must admit that at first I didn't think it would be of much use to have a TTS engine inside wakan. The Japanese language has an incredibly simple phonetic system, so it is no problem to understand the phonetic structure of the spoken language (at least in comparison to English: when I started learning English I was not able to understand a _word_ in real English movies, but I don't have any problems identifying the syllables in real Japanese movies). So I thought that any _synthetic_ Japanese TTS system would be of little value. But Voice of Japan is probably not synthetic (guessing by the size of the download), so it actually could be useful. For me it would still be of little educational value, though: I think when you want to hear _real_ Japanese, you just watch some movies or speak with real people. No TTS system can replace this. On the other hand, $30 is not much money, so it is probably worth it. But first of all I'm going to look at the abovementioned MSAgent. I know that a TTS engine may be VERY useful for Chinese learners. Some time ago I tried listening to spoken Chinese and then writing down what I had heard (phonetically) and ... well ... I think you need a trained ear for Chinese, and that's why even synthesized TTS may help a lot.
Filip |
|
|
 |
wakan Site Admin
Joined: 18 Oct 2003 Posts: 918 Location: Prague, Czech Republic
|
Posted: Wed Mar 17, 2004 9:30 pm Post subject: |
|
|
One more thing...
It could be very useful to display accents for words (at least for the common ones). Does anybody know of an accent database that could be included in wakan? I wonder, how does MSAgent handle accents?
Filip |
|
|
 |
tony Co-Admin
Joined: 27 Nov 2003 Posts: 747
|
Posted: Wed Mar 17, 2004 11:58 pm Post subject: Accents, TTS |
|
|
Filip--
I couldn't agree more about accents, but I have looked far and wide and not found an accessible database which includes them. I am willing to work on a text file giving accents for as many words as I can find in elementary text books (unless you know of a dictionary which indicates accents) if you provide a way to get that information from the text file into your processed EDICT file (or into a separate database which WaKan might access). Just let me know how to indicate the accents-- whether to indicate both the rising pitch and the falling pitch, and what symbols to use (presumably in the hiragana strings).
As for TTS, it would be extremely useful for learning Mandarin, but only if the system were clever enough to observe the variations in tones which occur when hanzi are pronounced in the context of words, and words in the context of sentences. I will experiment with the freeware Mandarin TTS tool which another WaKan user mentioned and see if it is good enough to be worth considering for incorporation into WaKan, if you're willing to put that much effort into something which is only of use to those of us who are learning Mandarin.
--Tony |
|
|
 |
wakan Site Admin
Joined: 18 Oct 2003 Posts: 918 Location: Prague, Czech Republic
|
Posted: Thu Mar 18, 2004 7:21 am Post subject: |
|
|
Tony,
I think that the incorporation of a TTS system should not be difficult. I expect that any such system has an API like SayThisUnicodeString, SetVoiceModelAndCharacteristics, InvokeSpeechConfigurationDialog, etc. It would be no problem to implement it in the dictionary and the editor.
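To make the idea concrete, here is a rough sketch in Python of what such a thin wrapper might look like. Everything in it is an assumption for illustration: the names `SpeechEngine`, `RecordingEngine` and `speak_entry` are hypothetical and do not come from MSAgent, RealSpeak or any other real engine, and the stand-in backend only records requests instead of producing sound.

```python
from abc import ABC, abstractmethod


class SpeechEngine(ABC):
    """Hypothetical minimal interface the dictionary or editor could call."""

    @abstractmethod
    def set_voice(self, language, rate=1.0):
        """Choose the voice model (e.g. 'ja' or 'zh') and the speaking rate."""

    @abstractmethod
    def say(self, text):
        """Speak a Unicode string."""


class RecordingEngine(SpeechEngine):
    """Stand-in backend: records what it was asked to say, makes no sound."""

    def __init__(self):
        self.language = "ja"
        self.rate = 1.0
        self.spoken = []  # list of (language, text) pairs

    def set_voice(self, language, rate=1.0):
        if rate <= 0:
            raise ValueError("rate must be positive")
        self.language = language
        self.rate = rate

    def say(self, text):
        if text:  # ignore empty strings
            self.spoken.append((self.language, text))


def speak_entry(engine, reading, language="ja"):
    """What a 'read this entry aloud' button might do with any engine."""
    engine.set_voice(language)
    engine.say(reading)


if __name__ == "__main__":
    engine = RecordingEngine()
    speak_entry(engine, "こんにちは")
    speak_entry(engine, "你好", language="zh")
    print(engine.spoken)
```

The point of the abstract class is only that the dictionary and editor would talk to one small interface, so a real engine could be swapped in behind it later.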
Filip |
|
|
 |
wakan Site Admin
Joined: 18 Oct 2003 Posts: 918 Location: Prague, Czech Republic
|
Posted: Thu Mar 18, 2004 7:24 am Post subject: |
|
|
Accents: I'm going to look for a free accent database on the net too. I think that the accent (for Japanese) can be identified for a given word by one number: the ordinal number of the syllable (kana) where the accent falls.
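A minimal sketch of that one-number idea in Python, under stated assumptions: it follows the usual Tokyo-style description in which the number marks the last high mora before the pitch drops and 0 means the pitch never drops, and it counts small ゃ/ゅ/ょ-type kana together with the kana before them. The function names `split_morae` and `pitch_pattern` are made up for this example and are not part of wakan.

```python
# Small kana that merge with the preceding kana into a single mora.
# Note: っ, ん and ー are NOT in this set; each counts as its own mora.
SMALL_KANA = set("ゃゅょぁぃぅぇぉャュョァィゥェォ")


def split_morae(kana):
    """Split a kana string into morae (きょう -> ['きょ', 'う'])."""
    morae = []
    for ch in kana:
        if ch in SMALL_KANA and morae:
            morae[-1] += ch
        else:
            morae.append(ch)
    return morae


def pitch_pattern(kana, accent):
    """Return one 'H' or 'L' per mora for a word with the given accent number.

    accent == 0: no drop (low first mora, then high).
    accent == 1: high first mora, low afterwards.
    accent == k: low first mora, high up to mora k, low afterwards.
    """
    count = len(split_morae(kana))
    if not 0 <= accent <= count:
        raise ValueError("accent must be between 0 and the number of morae")
    if accent == 0:
        return "L" + "H" * (count - 1)
    if accent == 1:
        return "H" + "L" * (count - 1)
    return "L" + "H" * (accent - 1) + "L" * (count - accent)


if __name__ == "__main__":
    print(pitch_pattern("はし", 1))    # 箸 'chopsticks' -> HL
    print(pitch_pattern("はし", 2))    # 橋 'bridge'     -> LH
    print(pitch_pattern("はし", 0))    # 端 'edge'       -> LH
    print(pitch_pattern("きょう", 1))  # 今日 'today'    -> HL
```

Note that accent 2 and accent 0 give the same pattern for はし on its own; in this scheme the two differ only in the pitch of a following particle (low after 橋, high after 端), so a display would need to show the number itself or the drop mark rather than just the H/L string.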
Filip |
|
|
 |
ncc1701
Joined: 16 Jan 2004 Posts: 77
|
Posted: Thu Mar 18, 2004 8:05 am Post subject: |
|
|
I don't know if it is so easy... Some time ago I read a web page with some examples. It seems that words can have more than one accent, and that the accent may be repeated within a word. Take "arimasen": it sounds more like "arImaSEN".
Well, I hope that some database is freely available!
Spanish is much easier with accents :)
Best regards. |
|
|
 |
Tom Hodgers Co-Admin
Joined: 26 Jan 2004 Posts: 224 Location: Valencia, Venezuela via Liverpool and Manchester, England
|
Posted: Thu Mar 18, 2004 10:36 pm Post subject: |
|
|
Accents: IMPORTANT! Please have a look at the following article
http://ist-socrates.berkeley.edu/~hasegawa/Accent/accent.html
The following is an excerpt from part of this webpage.
Reply from the Editor: We would not deny that "accent is an issue," but we think imitating native speakers, whether in real life or on the tapes that go with your textbook, is more likely to produce natural-sounding results than attempting to fabricate the sound on your own from a notation or explanation given in writing. This is true for all matters involving pronunciation, which is exactly why our pronunciation guide has the disclaimer you mention ...
It's interesting to note that native Japanese speakers outside Tokyo speak otherwise standard Japanese (hyoojungo) with different "pitch accents" (this is what we are speaking of here, not dialect accents) and never have trouble being understood. For the student of Japanese, a flat, even intonation will always be understood, and for Americans (and some Europeans) who tend to give their words very marked pitch accents, this may be a good way to eliminate some un-Japanese sounding speech habits.
When two or three words sound exactly alike except for pitch accent, context is going to resolve the ambiguity virtually 100 percent of the time. In practical terms, accent is probably the least important aspect of Japanese pronunciation no matter what your level of language skill.
On the whole, we think most people are best off following Jack Seward's advice ... "the degree of variance in pitch is so small that the beginner is advised to voice all Japanese words ... with a steady evenness of pitch ... Sooner or later, depending on the sharpness of your ear, you will come to be able to distinguish among and mimic the existing minor variations in pitch."
and here in the conclusion:
People without hearing impairments can mimic the melody of language, but they can hardly interpret visual accent markers into the oral/aural domain without special training because visual and auditory stimuli are processed very differently in the human brain. In all likelihood, the author of the above-mentioned letter simply feels more comfortable visually with accent markers. But using such markers to speak Japanese creates pronunciations that are worse than a crude synthesizer.
I seriously think that coding or marking for pitch accent in Japanese text would undoubtedly confuse any but the most advanced learners.
Tom |
|
|
 |
Tom Hodgers Co-Admin
Joined: 26 Jan 2004 Posts: 224 Location: Valencia, Venezuela via Liverpool and Manchester, England
|
Posted: Thu Mar 18, 2004 11:23 pm Post subject: |
|
|
Accents Again: Anyway, for those of you really interested in pitch-accented Japanese text, have a look at the following very interesting page:
http://www.epochrypha.com/japanese/
and download the Kanjilab 0.5.0.a program, the Example Sentence File and the Accumulated Vocabulary File which have a great many examples of accented text.
Filip, the WWWJDic Example Sentences Database file lower down on this page is probably the same example file that Wakan uses, but I cannot remember whether it is accented or not. I lost the copy I downloaded a year or so ago.
The other material available for download is also interesting.
Tom |
|
|
 |
tony Co-Admin
Joined: 27 Nov 2003 Posts: 747
|
Posted: Thu Mar 18, 2004 11:44 pm Post subject: Thanks for the link! |
|
|
Tom,
Thanks for the web site address-- that's one I haven't found my way to before, and a lot of what Collin McCulley has there looks very interesting and possibly very useful.
--Tony |
|
|
 |
ndt44 Co-Admin
Joined: 23 Feb 2004 Posts: 373 Location: Tokyo, Japan
|
Posted: Fri Mar 19, 2004 5:36 am Post subject: |
|
|
From my experience of learning and listening to Japanese spoken here in Tokyo, I can tell you that Tokyo 'standard' Japanese is indeed very flat, with not a great deal of pitch variation or noticeable difference in pronunciation for similar words of different meaning. As the above extract mentions, it's context that determines the meaning almost 100% of the time. For me, mispronunciation by way of stretching a 'syllable' (or more accurately a 'mora') is still a much bigger problem, as the word I'm saying literally becomes a different word in this case. Take ビル (biru) and ビール (biiru) for example, meaning 'building' and 'beer' respectively. The local guys here think this one's pretty funny, and often have a private chuckle about it.
When I'm teaching in the classroom, I can usually pick a student from Osaka in about five seconds, as their English pronunciation is much livelier and more natural than my local Tokyo students'. Whenever I've had a chance to chat in Japanese to my Osakan students, they all sound quite bright and friendly compared to the locals, with their intonation swinging a great deal more than the Tokyoites'. Some of my Tokyo friends say they have to suppress a grin when they speak to someone from Osaka as they always sound so excited, even on a dull subject. My ear isn't keen enough to pick up any particular point of difference beyond a general 'liveliness' and a few Osaka-isms in the slang, but I'll try to focus on it next time.
I guess the point I'm trying to make here is that I'm not sure an accent indication feature would be terribly useful in practice, but it would be interesting from an academic point of view. If it can be implemented easily, it's probably worth it, but I would certainly recommend listening to a native speaker over using the accent guides.
Cheers,
-Nigel |
|
|
 |
wakan Site Admin
Joined: 18 Oct 2003 Posts: 918 Location: Prague, Czech Republic
|
Posted: Sat Mar 20, 2004 11:21 am Post subject: |
|
|
I've read Tom's and Nigel's opinions and I think they're both right. My impression that displaying accents is highly useful was somewhat biased by my textbook, where accents are given for every vocabulary word. But frankly, I don't learn them when I learn new words, and in real Japanese speech I can hardly discern them. Furthermore, because my mother tongue (Czech) always has the stress accent on the first syllable, I'm quite ignorant of accents (even when I speak English).
So my question is: which of you would welcome an accent display for (some) words in the dictionary? Would you use it for your learning needs? It is probably quite a small task, but would anyone use it?
Filip |
|
|
 |
tony Co-Admin
Joined: 27 Nov 2003 Posts: 747
|
Posted: Sat Mar 20, 2004 2:23 pm Post subject: (sheepishly) |
|
|
I confess that I would use it for my learning needs, despite the groundswell of opinion against it. I am sometimes too lazy to use sound recordings (my primary medium for this is cassette tapes, which are awkward to use), and I rarely have a native speaker to talk with. I do find that when I learn a new vocabulary item with the accent marks, I am less likely to make obvious pitch mistakes in sentences containing the word. Some words matter more than others in this regard.
So yes, I would like to have the option of displaying accents, but it's obviously not worth putting time into unless others would as well.
--Tony |
|
|
 |
kenkyuushin
Joined: 19 Jan 2004 Posts: 69 Location: Edo, Cipango, Tokugawa Jidai
|
Posted: Sat Mar 20, 2004 2:34 pm Post subject: |
|
|
Filip,
I would use it. A text-to-speech feature would also be very, very nice.
Best regards, kenkyuushin  |
|
|
 |