mersenneforum.org  

Old 2012-07-01, 11:32   #12
only_human
 
"Gang aft agley"
Sep 2002

2×1,877 Posts

There seem to be no simple answers on speech rate. There are syllable rates, word rates and information rates. This paper is recent: A Cross-Language Perspective on Speech Information Rate - 2011 (table 1)
Code:
LANGUAGE     INFORMATION DENSITY   SYLLABIC RATE (syl/s)   INFORMATION RATE
English      0.91 (±0.04)          6.19 (±0.16)            1.08 (±0.08)
French       0.74 (±0.04)          7.18 (±0.12)            0.99 (±0.09)
German       0.79 (±0.03)          5.97 (±0.19)            0.90 (±0.07)
Italian      0.72 (±0.04)          6.99 (±0.23)            0.96 (±0.10)
Japanese     0.49 (±0.02)          7.84 (±0.09)            0.74 (±0.06)
Mandarin     0.94 (±0.04)          5.18 (±0.15)            0.94 (±0.08)
Spanish      0.63 (±0.02)          7.82 (±0.16)            0.98 (±0.07)
Vietnamese   1 (reference)         5.22 (±0.08)            1 (reference)
The table is fairly early in the paper and out of context but is interesting all the same. The paper has a lot more to say but abstracting more from it eludes me. The conclusion:
Quote:
As a conclusion, we would like to point out that cross-language studies may be very fruitful for revealing whether memory span is a matter of syllables, words, quantity of information, or simply duration. More generally, such cross-language studies are crucial both for linguistic typology and for language cognition (see also Evans & Levinson 2009).

Last fiddled with by only_human on 2012-07-01 at 11:43
Old 2012-07-01, 19:27   #13
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

1110000110101₂ Posts

So Japanese and Spanish are fastest, followed by French. Interesting that Vietnamese is a dense (simple?) language. Most interesting of course is that English tops the information rate. What made them choose Vietnamese for the reference?
Old 2012-07-01, 23:03   #14
only_human
 
"Gang aft agley"
Sep 2002

2×1,877 Posts

Quote:
Originally Posted by Dubslow
So Japanese and Spanish are fastest, followed by French. Interesting that Vietnamese is a dense (simple?) language. Most interesting of course is that English tops the information rate. What made them choose Vietnamese for the reference?
It says that they picked Vietnamese as a reference because it was different ("a mostly isolating language") from the seven languages of collected information.
Quote:
Since the texts were not explicitly designed for detailed cross-language comparison, they exhibit a rather large variation in length. For instance, the lengths of the twenty English texts range from sixty-two to 104 syllables. To deal with this variation, each text was matched with its translation in an eighth language, Vietnamese (VI), different from the seven languages of the corpus. This external point of reference was used to normalize the parameters for each text in each language and consequently to facilitate the interpretation by comparison with a mostly isolating language.
Information density by language (ID_L) here is a measure of information per syllable.
Quote:
The fact that Mandarin exhibits the value closest to that of Vietnamese (ID_MA = 0.94±0.04) is compatible with their proximity in terms of lexicon, morphology, and syntax. Furthermore, Vietnamese and Mandarin, which are the two tone languages in this sample, have the highest ID_L values overall. Japanese density, by contrast, is one-half that of the Vietnamese reference (ID_JA = 0.49±0.02), according to our definition of density. Consequently, even in this small sample of languages, ID_L exhibits a considerable range of variation, reflecting different grammars.

These grammars reflect language specific strategies for encoding linguistic information, but they ignore the temporal facet of communication. For example, if the syllabic speech rate (i.e. the average number of syllables uttered per second) is twice as fast in Japanese as in Vietnamese, the linguistic information would be transmitted at the same RATE in the two languages, since their respective information densities per syllable, ID_JA and ID_VI, are inversely related. In this perspective, linguistic encoding is only one part of the equation, and we propose in the next section to take the temporal dimension into account.
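The trade-off described in that passage can be checked roughly against the Table 1 numbers posted earlier in the thread. The sketch below assumes that information rate is approximately density times syllabic rate, normalized by the Vietnamese syllabic rate. That is a back-of-envelope reading of the quote, not the paper's own procedure, and the function name `approx_information_rate` is made up for illustration, so small mismatches with the reported column are to be expected.

```python
# Table 1 values as posted in this thread:
# (information density, syllabic rate in syllables/s, reported information rate)
TABLE = {
    "English":    (0.91, 6.19, 1.08),
    "French":     (0.74, 7.18, 0.99),
    "German":     (0.79, 5.97, 0.90),
    "Italian":    (0.72, 6.99, 0.96),
    "Japanese":   (0.49, 7.84, 0.74),
    "Mandarin":   (0.94, 5.18, 0.94),
    "Spanish":    (0.63, 7.82, 0.98),
    "Vietnamese": (1.00, 5.22, 1.00),  # reference language
}


def approx_information_rate(density, syllabic_rate, reference_rate):
    """Hypothetical helper: density x speed, relative to the reference speed.

    Assumption: multiplying the table averages approximates the reported
    information rate. The paper may well average per text instead.
    """
    return density * syllabic_rate / reference_rate


VI_RATE = TABLE["Vietnamese"][1]

for language, (density, rate, reported) in TABLE.items():
    approx = approx_information_rate(density, rate, VI_RATE)
    print(f"{language:<11} approx={approx:.2f}  reported={reported:.2f}")
```

Under this assumption Japanese lands near three quarters of the Vietnamese rate: its faster syllables only partly make up for the lower density per syllable.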

Last fiddled with by only_human on 2012-07-01 at 23:27
Old 2012-07-02, 01:51   #15
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3×29×83 Posts

I guess English having the highest IR stems from two facts:

1) Its ID is greater than that of the other Indo-European languages (why?)
2) It's spoken faster than the other high-density languages

It's neither the densest nor the fastest, but for whatever reason those two facts mean that English takes the title for IR (in this small comparison of course).
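Those two points can be verified by sorting the posted table on each column. This is only a sketch over the figures quoted in this thread; the `INDO_EUROPEAN` grouping and the helper name `ranked` are my own assumptions for illustration and are not taken from the paper.

```python
# (information density, syllabic rate, information rate), as posted from Table 1
TABLE = {
    "English":    (0.91, 6.19, 1.08),
    "French":     (0.74, 7.18, 0.99),
    "German":     (0.79, 5.97, 0.90),
    "Italian":    (0.72, 6.99, 0.96),
    "Japanese":   (0.49, 7.84, 0.74),
    "Mandarin":   (0.94, 5.18, 0.94),
    "Spanish":    (0.63, 7.82, 0.98),
    "Vietnamese": (1.00, 5.22, 1.00),
}

# Assumed grouping for illustration; the table itself does not label families.
INDO_EUROPEAN = {"English", "French", "German", "Italian", "Spanish"}


def ranked(column):
    """Return the languages sorted from highest to lowest on one column."""
    return sorted(TABLE, key=lambda lang: TABLE[lang][column], reverse=True)


by_density = ranked(0)
by_speed = ranked(1)
by_info_rate = ranked(2)

print("densest first:      ", by_density)
print("fastest first:      ", by_speed)
print("highest IR first:   ", by_info_rate)

# Point 1: English is denser than every other Indo-European language listed.
english_density = TABLE["English"][0]
print(all(english_density > TABLE[l][0] for l in INDO_EUROPEAN - {"English"}))

# Point 2: English is spoken faster than the two languages that are denser.
english_speed = TABLE["English"][1]
denser = [l for l in TABLE if TABLE[l][0] > english_density]
print(denser, all(english_speed > TABLE[l][1] for l in denser))
```

English comes third on density and fifth on speed, yet first on information rate, which is the combination described above.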


PS Perhaps English has a higher ID because we use fewer articles and our grammar is more about endings? I'm curious, how would Russian compare in that table?

Last fiddled with by Dubslow on 2012-07-02 at 01:52
Old 2012-07-02, 02:48   #16
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

20372₈ Posts

Quote:
Originally Posted by only_human
I last studied Spanish back in high school in 1980 or so but started to review it again after the public opening of a cool new language learning and translation site: www.duolingo.com
(blog.duolingo.com)
The site asked for my e-mail, then later on asked for my name and hometown. I am not going to give that site that info.
Old 2012-07-02, 03:26   #17
LaurV
Romulan Interpreter
 
Jun 2011
Thailand

8636₁₀ Posts

I think the article is highly biased. Just my opinion, don't throw stones. First of all, there is no way to compare tonal languages with "normal" languages. One of the reasons to have a tonal language is that you can have very short words (syllabic talking): you achieve different meanings not by replacing the word with a longer one, but by varying the tone. The information per phoneme is much higher for a tonal language. It may be that a tonal language is spoken more slowly, to allow better differentiation between tones, but its phonemes ARE shorter and denser. There are mainly 3 tonal languages in the world: Chinese (Mandarin), Thai and Vietnamese. I can speak some Thai and some Mandarin (my daughter can speak both of them, Mandarin fairly well, and Thai as fluently as a native speaker). Vietnamese lost part of its tonal "advantages" when it switched to the Latin alphabet. The first book about the Thai language I ever bought (10 years ago) started with "Thai is a tonal language, one of the main 3 tonal languages...." and we have this joke in our family about moving to Vietnam (we were in China at the end of the last century, and for the past 12 years we have been in Thailand).

So, you cannot really compare; it is like the favorite saying of some people on this forum: "comparing apples with oranges". "The beautiful elephant jumped over the astonished monkey" (I like red foxes and lazy dogs, but I deliberately selected longer words) would be in Thai something like "chang ti sway pim kun kwa ling pralat" (which takes 40% of the time to say) and in Chinese even shorter ("meili de xiang yue jingya de houzi").
The same short syllable said in different tones means different things, as in "the horse and the dog both come to Mary", which would be much shorter in Thai: "ma ma ma mali" (no joke! There are 4 different "ma" in 4 different tones, for "horse", "dog", "to come", and the "ma-" of Mary. You can look it up on Google Translate, but I think that one translates "dog" as "sunak", which is a different type of dog; if you look for "thai tongue twisters" you may find the right "dog").

I speak Romanian (natively) and can (almost) perfectly understand Italian, Spanish, Portuguese and French, though I cannot really speak them, or only a little. I was taught 4 years of French in high school, and 4 years of Russian in middle school, but both came into my head through one ear and went out through the other. My mother was a language teacher and she could fluently speak Romanian, French, Russian and Italian (no English, unfortunately).

I tell you all of this so you can see I am not completely outside this domain; I have heard this kind of debate and participated in them "for ages".

Anyhow... The article is quite biased toward western languages. Especially Latin languages, like Italian and French. They don't call them "romance" languages for nothing. Talking a lot and saying nothing (including Romanian, and you have the proof here in front of your eyes...).

They analyzed "telephone conversation". C'mon! It may have nothing to do with the language itself, but with the people. I have been working with Germans (in German-owned companies) for more than 15 years, and I really appreciate that they all say the things they want to say, directly. No matter the language.

We have customers and vendors from different parts of the world. Italians talk half a day about nothing before telling you what they want. Japanese are even worse, they are the Italians of Asia. They talk a lot about the weather and things of no significance, and they add kokomoto, mokoyoto, and other particles with no meaning (but which take half an hour to pronounce) at the end of every word.

In the same way, Indians are the French of Asia - they always believe they know better, try to cheat you on prices, and will explain to you for days why their reason is better than yours, even if you are not interested in it.

Language and people evolved together. There was no hen before the egg. Have any of you read Toffler's trilogy? Especially the second book, called "The Third Wave". The hospital as a factory, the school as a factory, the society as a factory...

The people as the language and the language as the people...
Our strength as humanity lies in our being different. We survived because we are different. Otherwise, if we were all the same, the first malice able to kill one would kill all.
Old 2012-07-02, 04:15   #18
only_human
 
"Gang aft agley"
Sep 2002

2·1,877 Posts

Quote:
Originally Posted by Uncwilly
The site asked for my e-mail, then later on asked for my name and hometown. I am not going to give that site that info.
I don't like that kind of thing either.

They do say this in terms:
Quote:
Use of Information Obtained by Duolingo
We may use your contact information to send you notifications regarding new services offered by Duolingo and its partners that we think you may find valuable. Duolingo may also send you service-related announcements from time to time through the general operation of the Service. Generally, you may opt out of such emails, although Duolingo reserves the right to send you notices about your account even if you opt out of all voluntary email notifications.

Profile information is used by Duolingo primarily to be presented back to and edited by you when you access the Service and to be presented to other users. In some cases, other users may be able to supplement your profile, including by submitting comments.

Duolingo may use aggregate or anonymous data collected through the Service, including Activity Data, for any purpose. This data may be used by Duolingo and shared with third parties in any manner.
So they do not promise to never contact you.

As for collecting Activity Data, I am OK with that for this specific site because it is part of why I am there. I want to be another ant in the anthill or a busy bee in the beehive. It lets me feel like I am accomplishing something beyond my ossified cranium. Of course, it is also quid pro quo.

Even so, I do not understand this younger generation that seems to be unconcerned about exposing every vital and/or banal scrap of personal information to data mining and potentially malicious scraping.

Last fiddled with by only_human on 2012-07-02 at 05:03 Reason: tweaks and afterthoughts. trimmed
Old 2012-07-02, 05:04   #19
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Quote:
Originally Posted by only_human
Even so, I do not understand this younger generation that seems to be unconcerned about exposing every vital and/or banal scrap of personal information to data mining and potentially malicious scraping.
I don't like it myself. While I do use Facebook and G+, I don't do any 'liking' or +1 on external sites. I do my best to keep it limited to Facebook alone, and even then, I generally don't accept new apps anymore.


@LaurV: Your point about tonality is much of what the paper said, and why Mandarin and Vietnamese have much higher information density than the Indo-European languages, as reported in the paper...
Old 2012-07-02, 05:56   #20
only_human
 
"Gang aft agley"
Sep 2002

2×1,877 Posts

Quote:
Originally Posted by LaurV
Same shortest syllable said on different tones means different things, like in "the horse and the dog both come to Mary", which would be in Thai much shorter: "ma ma ma mali" (no joke!, there are 4 different "ma" on 4 different tones, for "horse", "dog", "to come", and "ma-" of Mary
And Mandarin has this well known one (Tongue Twister (绕口令) : 妈妈骑马):
Quote:
妈 妈 骑 马
mā ma qí mǎ
Mother is riding a horse.

马 慢
mǎ màn
The horse is moving slowly.

妈 妈 骂 马
mā ma mà mǎ
Mother scolds the horse.
Of course there is also this Mandarin sobriety test (Tongue twisters in many languages):
Quote:
四是四,十是十,十四是十四,四十是四十,四十四隻不識字之石獅子是死的
sì shì sì, shí shì shí, shísì shì shísì, sìshí shì sìshí, sìshísì zhī bù shí zì zhī shí shīzi shì sǐ de
4 is 4, 10 is 10, 14 is 14, 40 is 40, 44 illiterate stone lions are dead.

Last fiddled with by only_human on 2012-07-02 at 06:00 Reason: moved link references to be less ambiguous with quoted material
Old 2012-07-02, 06:03   #21
Dubslow
Basketry That Evening!
 
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

One of my Opa's favorites is a Dutch one about 7 flounder from Scheveningen. (Brian?)

Last fiddled with by Dubslow on 2012-07-02 at 06:05
Old 2012-07-02, 09:55   #22
Brian-E
 
"Brian"
Jul 2007
The Netherlands

2·23·71 Posts

Quote:
Originally Posted by Dubslow
One of my Opa's favorites is a Dutch one about 7 flounder from Scheveningen. (Brian?)
Don't think I've ever heard this one, but "zeven" (seven) and the seaside resort Scheveningen very obviously lend themselves to use in a tongue twister, yes. The compound Dutch consonant "sch", pronounced as a letter "s" followed immediately by a rasping at the back of the throat, is certainly tricky when said many times in succession and combined with instances of bare "s" and "z".
I wonder what Dutch word your opa is using for "flounder". Drawing a blank there. I bet his full tongue twister includes the word "schepen", meaning "ships", though.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.