Languages with (almost) no heterographs

Discussion in 'All Languages' started by Encolpius, Dec 15, 2012.

  1. Encolpius Senior Member

    Praha (Prague)
    magyar (Hungarian)
    Hello, you can read an interesting article about heterographs on this site. It says that Finnish is very close to being a language with no heterographs.

    We all know English and French have a large number of heterographs; there are about 1500 homophones/heterographs in English,
    i.e. words with one pronunciation that can be written in different ways: flower - flour; heal - he'll; two - too...

    Languages that come very close to having no heterographs, but which I know must have at least a few, are Hungarian, Czech, Slovak, Spanish and German, because each has at least one sound with two graphemes [Spanish b-v, German ss-ß, Hungarian ly-j]. I cannot recall any examples in other languages, mainly Italian, Russian, Portuguese, etc.

    Do you remember any examples of heterographs from your language [except English and French, because they have so many]? Thanks
  2. OneStroke Senior Member

    Hong Kong, China
    Chinese - Cantonese (HK)
    Hmm... in Chinese, it's far from difficult to find heterographs. In fact, it's much harder to find a character that does not have a homophone. (That is, of course, different from saying that it's much harder to find a word without a homophone.)
  3. Encolpius Senior Member

    Praha (Prague)
    magyar (Hungarian)
    I agree, Chinese and Japanese also have a vast number of homophones.

    Now my only real doubts are about Finnish and Turkish. I have found two homophone pairs in Italian, which is also almost perfect: anno-hanno and ceco-cieco. Are there any other examples?
    But how about Finnish or Turkish? There really are no homophonic heterographs, are there?
  4. ancalimon Senior Member

    Turkish. I think there are no heterographs in Turkish and most probably all Turkic dialects.

    We always mean what we say. It's just that people usually do not understand what is really meant but that's a different thing :)
  5. Encolpius Senior Member

    Praha (Prague)
    magyar (Hungarian)
    Yes, I expected that answer regarding Turkish, since I had read: "Turkish orthography is highly regular and a word's pronunciation is always completely identified by its spelling." That makes Turkish quite unique; I wonder if one could say the same about Finnish.
  6. kirahvi Senior Member

    I can't come up with any homophonic heterographs in Finnish. The spelling doesn't always convey exactly how a word is pronounced (for example, there are some consonant doublings that aren't marked, such as hernekeitto (pea soup) /hernekkeitto/), but I don't think there's anything even close to heterographs in Finnish.

    That said, there are some heterophonic homographs.
    For example:
    koita (try, imperative) and koita (moth, partitive sg&pl)
    anna (give, imperative) and Anna (the female name)

    At the end of the imperative forms, when they stand alone as single words, there's something that resembles a glottal stop, but it isn't quite that. Maybe a weak glottal stop, if such a thing exists? In sentences the imperatives are always followed by a clear glottal stop or a consonant doubling, e.g. /annaʔ ʔolla/ for "let it be!" and /juoksep pois/ for "run away!" (anna olla and juokse pois being the spellings of these phrases, respectively).


    I just came up with one heterograph in Finnish:

    sian (pig, gen sg) and sijan (place, gen sg) - the pronunciation is at least in [my] casual speech the same, the glide is pretty weak.
    Last edited: Dec 17, 2012
  7. Espectro... New Member

    Actually, Czech/Slovak heterographs are not so few:

    1) In both languages there is so-called 'final-obstruent devoicing', so e.g. Czech words 'led' (ice) and 'let' (flight) are pronounced [let].

    2) Both languages have lost the difference between i/y (í/ý) in pronunciation, but not in writing, so e.g. Czech verbs 'být' (to be) and 'bít' (to beat) or 'mýt' (to wash) and 'mít' (to have) differ only if written.

    There are several other specifically Czech types of heterographs (there might be other specifically Slovak ones as well, of course):

    3) The letter 'ě' is pronounced as [je] after b/p/v/f, so e.g. words 'oběť' (victim) and 'objeď' (go/drive around!) or 'oběd' (lunch/dinner) and 'objet' (to go/drive around) are homophones (the latter case is also an example of the final devoicing phenomenon).

    4) In domestic Czech words the vowel [u:] is written as 'ú' at the beginning of words (and after a prefix) and as 'ů' elsewhere. However, in words of foreign origin it is always 'ú', so e.g. words 'kůra' (bark (of a tree)) and 'kúra' (cure, treatment) are also homophones.

    I am sure there are other similar cases, but for now this is what I am able to remember...
  8. LilianaB Banned

    US New York
    I am sure Polish does not have that many homophones, and neither does Lithuanian. Russian does not have too many either. My feeling is that phonetic (or predominantly phonetic) languages have fewer homophones, or heterographs.
  9. ThomasK Senior Member

    (near) Kortrijk, Belgium
    Belgium, Dutch
    Dutch has quite a few, I'd say, because there are ei/ij words with entirely different origins and meanings. There are some t/d cases as well, I guess (words ending in t/d, but I can't think of any pairs that differ in meaning right now...).
  10. Encolpius Senior Member

    Praha (Prague)
    magyar (Hungarian)
    That's a very good example I had forgotten. And since Polish and Russian have final-obstruent devoicing as well, I bet they must have more than just a few homophones too. But Turkish also uses it for final stops -s-z, so I wonder whether we would be able to find some examples of homophones there as well.
  11. arielipi Senior Member

    Hebrew has some, though it's a bit complicated:
    We have a system called niqqud, and we have the letters. The letters tell me which sound I'm supposed to pronounce, and niqqud tells me how to pronounce it. Take the letter t: it makes the sound t, and adding the vowel a makes it ta. Niqqud corresponds to the a, while the letter is the t.

    So, in original Hebrew we have six letters that make a different sound if they take a dagesh (strong stress): f->p, v->b, th->t, kh->c/k, g, d.
    Modern Hebrew added a tag to some letters to add other sounds not found in original Hebrew: ch, th (it was lost and returned with a tag), French j, j, w.

    On top of all that, we also have four matres lectionis, but they do not always act as such.

    Now, we don't use niqqud in everyday writing.

    So, in conclusion: sound and spelling generally correspond to each other, but there are some heterographs.
  12. LilianaB Banned

    US New York
    Yes, kod and kot in Polish sound the same (a zip code, or other code, and a cat). :D However, I don't think there are too many other examples of that kind in Polish, or Russian.
  13. إسكندراني Senior Member

    أرض الأنجل
    عربي (مصر)ـ | en (gb)
    Arabic has a couple, but they are few in number, and they occur according to rules, in a less random way than in English. Examples are given in the Arabic Wikipedia article.
  14. Rallino Moderatoúrkos

    There are very few (max. 5-6) heterographs in Turkish.

    hal = bazaar vs. hâl = situation
    yar = chasm vs. yâr = lover

    In the nominative they sound the same. But in other declensions, the words with the circumflex are pronounced with a longer "a":
    halin /halin/ (of the bazaar) ; hâlin /ha:lin/ (of the situation)

    Also some personal names have two versions: one with ğ, one without. Both versions are pronounced the same: Tuba / Tuğba ; Kaan / Kağan.

    There are also some homographic heterophones, but this thread is not about that.

    EDIT: I think BSC might be the winner in this. It's the most phonetic language I know.
    Last edited: Jan 2, 2013
  15. rusita preciosa Modus forendi

    USA (Φιλαδέλφεια)
    Russian (Moscow)
    Yes, Russian has lots of homophones. I'd say, most of them are due to two phenomena:

    1. Devoicing of the final consonant (like you suggested):
    луг - лук (meadow/onion)
    плод - плот (fruit/raft)
    код - кот (code/cat)
    порог - порок (threshold / vice, sin)
    труд - трут (labor / they rub)
    род - рот (genus, gender / mouth)

    2. Pronunciation of unstressed vowels:
    компáния - кампáния (company/campaign)
    предáть - придáть (to betray / to ascribe)
    бачо́к - бочо́к (barrel/side)

    One of the most common mistakes native speakers make is due to the silent soft sign in infinitives of some reflexive verbs: бояться (to fear) – боятся (they fear)

    There is a very large number of homographs based on different stress (stressed vowels are marked here but in normal writing they are not):
    бо́льшая (larger – fem.) - больша́я (large- fem.)
    ве́рхом (along the top) — верхо́м (on horseback)

    ве́сти (news) — вести́ (to lead/conduct)
    ви́на (wines) — вина́ (guilt)
    вы́ходить (to treat/cure) — выходи́ть (to go out)
    гво́здик (little nail) — гвозди́к (carnations-gen.)
    до́рог (dear/expensive-masc.) — доро́г (routes-gen.)
    ду́хи (spirits) — духи́ (perfumes)
    ду́ша (shower-gen.) — душа́ (soul)
    е́ду (I am riding) — еду́ (food-accus.)
    жи́ла (vein)— жила́ (she lived)
    Last edited: Jan 2, 2013
  16. e2-e4 X Senior Member

    The homophones of the first group are always homophones, while the homophony of the words in the second group depends on the region (in St Petersburg, for example, the pronunciations of unstressed е/и are sometimes very similar and sometimes quite different; I know nothing of other cities but am sure pronunciations of vowels vary among them), on the exact position of the vowel (sometimes one gives an unstressed vowel something like a slight additional stress), and on the mood of the speakers (whether they happen to speak distinctly or not). Besides, it is difficult to tell when vowels are "identical" and when they are just similar, and there is no easy proof of that…

    There is also a third phenomenon: letters with no effect on articulation, like the soft sign ("ь") after some letters ("ш", "ч"). For example, sometimes (very seldom, only when we personalise them) we distinguish between a male mouse ("мыш") and a female mouse ("мышь"), although most usually we just use the word for mouse in the feminine gender ("мышь") and don't bother.

    As for foreign words, we usually just write them down in Cyrillic letters and read them aloud according to the rules (they are almost never ambiguous, except for the stress).
    Last edited: Jan 2, 2013
  17. rusita preciosa Modus forendi

    USA (Φιλαδέλφεια)
    Russian (Moscow)
    When we make generalizations about a language, we usually talk about the language norm spoken by the majority of native speakers. I'm sure there is a small percentage of people who speak in regional accents and dialects, or distort pronunciation to express something else, e.g. sarcasm (I'm not sure what you mean by "mood"). In normal Russian the pronunciation of компáния - кампáния is identical, which is why so many natives confuse these two words.
  18. LilianaB Banned

    US New York
    In some Northern Russian dialects, o is almost always pronounced as o: Arkhangelsk, Vologda, and maybe Kostroma as well. Other than that, such words are homophones. There might be some other dialects I don't know of where they are not homophones. Mostly older people speak those dialects, I think.
  19. Määränpää Senior Member

    värisepä ("vibrate", 2nd person singular imperative + affective suffix) and väriseppä ("coloursmith", a word that doesn't exist but would be understandable)

    And of course the well-known pair haltia (elf) and haltija (possessor), but I guess they are originally the same word and the heterographic orthography has been chosen because the meanings have later evolved differently.

    Edit: Apparently haltija is the recommended orthography for both meanings, but fantasy literature stubbornly uses haltia. That's news to me!
    Last edited: Jan 24, 2013
