Chinese most Difficult Language in the World (2)

Written by Uln on November 23rd, 2009

Last Friday I wrote a very long post where I ended up including too many ideas. The main point got a bit obscured as a result, but it was simply this: that vocabulary plays an essential role in learning a language, and that because of this Chinese is not only extremely difficult at an advanced level, but also growing more difficult with time.

I don’t suppose this is groundbreaking research, but it is interesting because most people are not aware of it, and also for its implications at the boundary between language and politics, two fields we like to cultivate in this blog. Here is the argument in full, with conclusions; for examples and details see the previous post and its comments:

  • To learn a new language, the main knowledge required falls into three areas: grammar, phonetics and vocabulary. Grammar and phonetics differ essentially from vocabulary in that the first two are rules applicable to an unlimited number of cases, whereas the latter is raw data. We can call them the Code and the Data elements of the language. The Code elements are finite and not growing. The Data element is practically infinite and growing, to the point that it is not completely mastered even by native speakers.
  • When studying a language, the Code elements play an essential role at the basic and intermediate levels, but at an advanced level the real obstacle to communication, and therefore to progress, is Data. For example, advanced students of German may sometimes use the wrong declension, and advanced students of Spanish may fail to differentiate the “rr” and “r” sounds. These things tend not to hamper communication because human languages are highly redundant: I would never understand “pero” (but) when a speaker says “perro” (dog). Ultimately, imperfections in the Code elements amount to the same as having an accent: most of the time they are only relevant as metadata.
  • But while Code above a certain level is highly redundant, Data remains essential at every level. Borrowing from this great article: the phrase “Jacuzzi is found effective in treating Phlebitis” is meaningless when either or both of the nouns are unknown. A single missing word can often obscure the meaning of a whole paragraph or article.
  • The number of words used passively in real life far exceeds the typical standard lists of language levels. This is because semi-specialized words—such as ionic, jacuzzi or matrix—are not included in vocabulary lists as they are considered too rare. Certainly each of these words is rarely used, but there are so many of them that as a whole they are actually very often used. This Data element is so large that it cannot be memorized in a classroom, and the only way to acquire it is through many years of immersion.
  • The reason most language learners never notice this problem is that they are “cheating”. In most of the world’s languages, this high-level vocabulary is practically identical and doesn’t need to be learned. For each language there is a certain limit level above which most modern words are international and the Data is no longer specific to the language.
  • This limit level of vocabulary convergence is different for every language, but it depends less on the language family or geographical origin than on the size and development of the community of speakers. That is why even non-Indo-European languages like Basque are extremely easy above the intermediate level: the community is not big enough to support complex terms, and all higher Data is adopted from international words. Most people tend to misunderstand the concept of language families and attach too much importance to it, and they come up with absurd lists like this one.
  • The internationalization of vocabulary is growing with the advances in telecoms and globalization, especially since English has become the only language of scientific research. There is little point in inventing new Swedish terms in science, for example, when all the scientific community are reading/writing their papers in English. Often, in spite of political efforts to promote a local vocabulary, the economics of language revert the higher Data back to Internationalese.
  • There is only one language in the World that for historical, political and demographic reasons has remained an exception to this trend: that language is Chinese (Mandarin, Cantonese or others; the difference is irrelevant here). It constitutes a parallel system of high-level Data that has very few words in common with the rest of the World. Japanese and Korean are partial exceptions in that they draw from both the Chinese and the International System, but modern words are increasingly International and these languages are converging with the rest.
  • In addition to this, Chinese has a ridiculously difficult writing system, unique in its lack of a functional phonetic script. This compounds the vocabulary problem: not only are there more words to learn than in any other language, but each word carries much more information, as it needs to be associated with its corresponding characters.
  • Moreover, since there is no standardized way to transcribe foreign proper nouns, even names of places and persons tend to be “translated” into Chinese, sometimes departing completely from the original phonetics and becoming Chinese names in their own right. This adds to the already massive Data element in the Chinese language.
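
The long-tail point in the list above (semi-specialized words that are each rare but together very common) can be illustrated with a rough calculation. This is only a sketch under stated assumptions, not a measurement of any real language: it assumes word frequencies follow an idealized Zipf distribution, where the word of rank r appears with frequency proportional to 1/r, and it assumes a hypothetical total vocabulary of 100,000 words. The function names are made up for this illustration.

```python
def harmonic(n: int) -> float:
    """Sum of 1/r for r = 1..n (total Zipf weight of the n most frequent words)."""
    return sum(1.0 / r for r in range(1, n + 1))


def tail_share(known_words: int, vocabulary_size: int) -> float:
    """Share of running text made up of words OUTSIDE the `known_words`
    most frequent ones, assuming an idealized Zipf distribution
    (frequency of rank r proportional to 1/r). Both the distribution
    and the vocabulary size are assumptions for illustration only.
    """
    if not 0 <= known_words <= vocabulary_size:
        raise ValueError("known_words must be between 0 and vocabulary_size")
    return 1.0 - harmonic(known_words) / harmonic(vocabulary_size)


if __name__ == "__main__":
    VOCABULARY_SIZE = 100_000  # hypothetical total vocabulary
    for known in (1_000, 5_000, 20_000):
        share = tail_share(known, VOCABULARY_SIZE)
        print(f"top {known:>6} words known: {share:.1%} of running text is still unknown")
```

Under these assumptions, even a learner who knows the few thousand most frequent words still meets an unknown word in a sizeable fraction of running text, which is the intuition behind the claim that rare words matter as a whole. Real corpora deviate from the idealized curve, so the exact percentages should not be taken literally.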

All this takes us to the conclusion: Chinese is the most difficult language to learn at a high level, regardless of the origin of the student.

This is particularly interesting because up to now the only right answer to this question was: “it depends on your own mother tongue”. With the possible exception of Japanese and Korean students, this post argues that Chinese is actually the hardest for everyone else. Conversely, it is also very difficult for Chinese speakers to learn other languages, although this is mitigated by the fact that other languages do have functional phonetic scripts.

Another interesting conclusion:  Chinese is not only difficult, it is actually growing in difficulty.

As the World grows more interconnected and technology occupies a more important part of our lives, new semi-specialized vocabulary takes an increasing part in everyday language. Expressions that refer to international concepts such as “spam” or “plasma TV” increasingly take the place of expressions referring to local cultural heritage. In this sense, we can say that all languages in the World are converging, while Chinese is an island diverging from all the rest.

Then there are the political conclusions that we can draw from this, but I am committed to writing shorter posts, so we will leave that for the next day. Comments and corrections are welcome to my arguments above.

Comments so far ↓

  1. John (Nov 23, 10:39 PM)

    You make some good points, but I thought I’d comment on a few things.

    This Data element is so large that it cannot be memorized in a classroom, and the only way to acquire it is through many years of immersion.

    Well, extensive reading will do it too. One might argue that extensive reading (and perhaps exposure to lots of audio) is the best way to acquire all that data, and neither strictly requires total immersion.

    In my own experience, learning words at the technical level is quite simple in Chinese. This is because the morphemes Chinese uses to construct the technical terms are largely the same elements found in everyday words. Whereas European languages borrow from Greek and Latin, Chinese often uses whatever makes sense. Compare:

    word, etymology; 词, 词源

    bug, insect, entomology; 虫子, 昆虫, 昆虫学

    sound, phonology, acoustics; 声音, 音位学, 声学

    This all does nothing to deny that you can’t rely on loanwords or international standard terms with Chinese, although I’m not sure that’s entirely true either. In many fields, IT especially, the English terms are used quite often.

    So the situation is not quite as dire as you portray it. :)


  2. Uln (Nov 24, 12:12 AM)

    Thanks for commenting John.

    Re reading: I agree, that is what I am doing myself, since in Shanghai total immersion is impossible when you work in a foreign company. For the listening I try to do as much as possible on TV and with you guys on chinesepod. This might work when you are in partial immersion in Shanghai, but I find it very difficult to believe that someone sitting in his house in the US can get even close to full fluency.

    Re semi-specialized words: You are right, term formation is logical in many cases, this was also mentioned in various comments yesterday. The problem is that this kind of logic is always much easier to use when you already know the word (it helps to remember it) but often it is useless when you are attempting to say something. More importantly, however “relatively simple” this term creation might be, I am comparing it with languages that actually say “etymology”, “acoustics” and “entomology” just like in English. Chinese is necessarily harder.

    This reminds me of an example: on Saturday we were in Cantina Agave, the Mexican restaurant. It was so noisy you could barely speak, and I told my friend: “This room has very bad acoustics!” My improvised 音学 didn’t work. Back home I checked JdeFrancis Dictionary. What I should have used is: 音响效果很差. Very logical when you already know it, but very difficult to come up with out of the blue. This will never happen to you in Spanish (“la acustica es mala, hombre!”)

    Re IT, I have seen that there are a few terms, like USB, U-pan, DVD. I guess the more complex/specific the component, the more English is used, as less general speakers need to use the word. This is especially true of English abbreviations (DVD), which are surprising exceptions because they are written in Latin script. Might be the start of a trend. Think of it, if they can use “DVD” in writing, why can’t they use “BARACK” or “RONALDINHO”?

    PS. The situation is not dire, it is fascinating and I guess this is what makes Chinese language so addictive to many of us: It is in so many ways unique and different from any other language in the World.

    What I am saying in this post is: we are not learning just a language here, we are learning the whole vocabulary of a different civilisation, perhaps the last parallel system of vocabulary that humanity still has. How cool is that!


  3. John (Nov 24, 12:20 AM)

    Uln,

    Oops, I read your previous post before it had all those comments; sorry to repeat!

    Anyway, glad to hear you don’t see the situation as dire… Also, I actually think that thanks to the internet, both Chinese and Japanese are evolving relatively rapidly AWAY from characters (or at least away from characters with semantic meaning). Perhaps in a few hundred years, the vocabulary system will be notably less parallel.


  4. JP Villanueva (Nov 24, 2:37 AM)

    Uln,

    Thanks for this thoughtful post.

    I don’t dispute that Chinese is a difficult language to learn, although claiming that it is the hardest language in the world is a strong claim… I feel there might be others, and for different reasons.

    Basque grammar has 17 declensions, and from what I’ve seen, pedagogy is pretty primitive. Philippine languages have focus-based morphology, primitive pedagogy… and are even harder since Filipinos are notoriously unconscious bilinguals; it’s hard to get an immersion experience when people look you in the eye and tell you “I’m not speaking English, I’m speaking Tagalog…” IN ENGLISH.

    My point is that there may be other factors that make a language “hard to learn;” Mandarin, for example, doesn’t have inflectional morphology of any sort, and it’s relatively easy to find a true ‘immersion’ experience.

    Also, Mandarin language pedagogy seems to be 50 years behind contemporary American ESL or Spanish pedagogies; people I know that have become superior speakers of Mandarin and Cantonese have done so ‘on their own;’ not because of formal instruction, but in spite of it. I wonder how difficult Mandarin would really be if there were more teachers teaching Communicative, Task-based Method at all levels. (I’m not making a claim here, I’m actually wondering).

    Finally, I want to get on my soapbox for just a second. Chinese languages may be hard to learn for us (certainly for me!) but (putting the writing system aside) Chinese kids have got recursive subordinate clauses down well before age 5, just like every other linguistically normal kid learning any language in the world. My point is that if it’s difficult, we’re probably making it more difficult than it has to be.


  5. Uln (Nov 24, 10:09 AM)

    Hi,

    You say: “putting the writing aside…”. Of course, if Chinese were written in pinyin (phonetics) it would not be such a difficult language. And if it took loanwords, even less so. Unfortunately, the reality is different.

    I agree that the Chinese language in itself is not particularly difficult. The tones are difficult but not overwhelming, and the grammar is very simple. What makes Chinese difficult is writing and vocabulary.

    Then of course, the real discussion would be: How important is Vocabulary as opposed to Grammar/Pronunciation when learning a language? Here is where my specific conditions apply: My argument applies only for advanced students, normally (I dare say necessarily) living in immersion.

    When you are in this situation, the way you progress is by using the language for practical purposes. While it can be difficult even for advanced students of Japanese to get the right case/declension every time, this will not usually hamper the practical use of the language. But if they are missing the keywords in a conversation/article, this is a much more serious obstacle, and they need to look things up in the dictionary all the time (and thank God there are electronic dictionaries now, otherwise it would be even harder).

    This is the core of my argument, but certainly it is not a mathematical proof, and some might argue that Arabic or Korean grammar are so unimaginably difficult that they completely hamper the student even at advanced levels. Or that it’s so hard to even get to that advanced level that the student never reaches this advanced level I describe.

    I don’t know enough Korean/Arabic/etc. to answer that, although from my experience with grammatically difficult languages like Basque, people quickly get into conversations when they have enough vocabulary, and they find ways to communicate very well even if they always get the Subject -k endings wrong. I would be interested to hear other arguments though.


  6. Kyle (Nov 24, 10:47 AM)

    For newcomers to the Chinese world I’d like to stress the point that Uln made: “Chinese is the most difficult language to learn at a high level”, with extra emphasis on “a high level”. I had almost zero training in Chinese, and after two years of practicing in social situations I was fluent enough to defend myself in a fast-paced ‘angry Shanghai girlfriend’ conversation.
    As Uln has said, spoken Chinese is very simple. There is almost no grammar and, as John mentioned, a lot of words have links, and not just technical words. Not only that, but Chinese only has 40 or 50 sounds, so if you have a brain like mine it’s very easy to remember words because they sound the same. (If I were able to write characters I would give you some examples, but since I can’t you’ll have to take my word for it.)
    I want to give Uln credit, as he has mastered English, as you can all see from this excellent article. I have not learned Chinese at a “high” level, so I cannot comment on this statement.
    I will tell you, new Chinese students, don’t be afraid! Since Mandarin is a second language for a lot of Chinese, (they will learn and speak their dialect fluently before they ever go to school) they struggle with the language as much as you and I.
    In my opinion Chinese is easier than Spanish, especially at a beginner and intermediate level, so give it a shot!


    Uln Reply:

    Haha, Kyle, your Chinese is amazing for someone who hasn’t studied it formally; I have seen few foreigners communicate so fluently. You are one of the examples I was thinking of in the “Chinese is easy” part of the previous post.

    I agree with Kyle, and to those who want to follow his advice I say: go for it! And if you are half as talented as he is, you might be chatting freely in Chinese by the end of next year!


  7. spandrell (Nov 24, 1:15 PM)

    Basic Chinese is easy because most Chinese people’s Putonghua mostly sucks. An angry Shanghai girlfriend’s grammar and vocabulary are pretty basic. Try calming an angry Beijing girlfriend, or a Shanghainese in her dialect, and you’ll find it much harder.

    And I’m with John on this one: once you master the Hanzi, most terms are just a combination of them, so it’s pretty straightforward. You don’t have to memorize a bunch of Greek or Latin vocabulary which is incomprehensible at the beginning, like hepatitis vs 肝炎.

    By the way, Uln, I hadn’t realized you are Spanish :) I see that in this world you have to write in English to make a name for yourself.


    Uln Reply:

    Hi, it’s not about making a name for myself. What happened is that I started out in Spanish and people paid no attention to me; I never got any comments. There are very few people in Spain interested in these things.

    Besides, writing in English has the advantage that many Chinese people can read me.


  8. safarinew (Nov 24, 1:55 PM)

    “Chinese is the most difficult language to learn at a high level, regardless of the origin of the student.”

    i back you on that. i remember my high school classmates (we are chinese BTW) deeply adept in this language (esp. four-character idioms) usually scared the **** out of less adept classmates. so when you feel awkward towards certain MADE PATTERNS of chinese, you are not alone. in this case, and since you know chinese, i suggest you read this chinese book lashing out at english. i’d say it has given me a similar kind of joy to the links in your last post.

    http://www.douban.com/subject/1971488/

    i hope you get the joy too


    Uln Reply:

    Will see if I can find the 盗版 on the 三轮车 down the street.


  9. Wukailong (Nov 24, 3:42 PM)

    The word “matrix” in itself is interesting, because a lot of people in China probably don’t know what a 矩阵 is. I was wondering why The Matrix was translated as 黑客帝国 rather than 矩阵, but in retrospect it seems quite natural.

    As an aside, I often find Chinese translations of words more expressive than their English counterparts. 灭火器 is quite obviously some sort of container for the purpose, but if you told a child about a “fire extinguisher” for the first time they would probably not know what kind of thing it was. The same goes for 鼠标, in which you understand it’s not a live mouse you’re talking about.


  10. spandrell (Nov 24, 7:12 PM)

    No, you are doing the right thing; the idea is for people to read you. That is the advantage of speaking English well.

    Chengyu are by far the most difficult part of Chinese, and what makes the difference with other languages. I enjoy watching TV series set in Beijing, and the speed and ease with which they spit out those Chengyu is really amazing. I’m really thinking of taking a long vacation with a Chengyu dictionary and rote learning the whole thing.

    Then again, it’s not really necessary for real life. But I won’t feel happy myself until I get to a real native level.


    Uln Reply:

    Spandrell, my advice on chengyus is: don’t try to learn them actively, just passively. The best way is to read a lot and watch many CCTV serials where they speak as if they were chengyu automatons. See my recent post about the serial “冷箭”: there were almost more chengyus than words! The 50th time I heard 引蛇出洞 and 打草惊蛇 I had to jump out the window (luckily I’m on the ground floor).

    My technique is not to memorize them from a list; rather, every time they come up I try my best to guess at them and then look them up in the dictionary. I find that most of the time, although they are very difficult to use actively, they are easy to remember passively, i.e. it is easy to guess/remember the meaning when you have already come across it before.

    Besides, apart from the common ones like “不可思议”, the rest are rarely used in conversation. My guess is there are no more than a few dozen of these commonly used ones.


  11. Wukailong (Nov 25, 11:15 AM)

    spandrell: “I’m really thinking on getting a long vacation which a Chengyu dictionary and rote learn the whole thing.”

    I’ve done that (not the whole thing, but the most common ones), though not on a long vacation. Chengyu is a bit like words – learn a thousand or so from a small dictionary, to ensure that you learn the common ones, and you should be ready to go. I doubt anyone not specifically schooled in classical Chinese knows more than 2000.

    Then of course, there’s suhua and yanyu, but these might be learned among the words.


3 Trackbacks / Pingbacks

  1. Chinese the most Difficult… (and 3) | CHINAYOUREN
  2. Decompiled :: Mandarin and Wo
  3. Китай в ссылках (26.11.2009) / П.С.И. / Магазета