Double letters in Latin words

PersoLatin

Senior Member
UK
Persian - Iran
My question concerns Latin words used in European languages, including the Romance ones: why do some words have double letters in them? Were these ever pronounced differently, and if not, why are they used?
 
  • In Italian, double consonants are still (often?) pronounced as double. So, for example, in railway stations the word “attenzione”, heralding an announcement, sounds something like “ad-tendziony”. So I presume it was the same in Latin, at least in words like this where there is an obvious etymological reason for having two consonants. (Ad-tenere, I guess.)
     
    In Italian, double consonants are still (often?) pronounced as double. So, for example, in railway stations the word “attenzione”, heralding an announcement, sounds something like “ad-tendziony”.
    Thank you, I have noticed that in names too but couldn’t get a definitive answer, e.g. /n/ in Anna is definitely pronounced as double.

    Do we know if this was a rule for double consonants, in the original Latin language?
     
    In Italian, double consonants are still (often?) pronounced as double.
    In standard Italian double consonants are always pronounced as double. Also, other consonants or consonant clusters are pronounced as double in good Italian, at least between vowels (including /j/): z /(t)ts, (d)dz/, gn(i) /(ɲ)ɲ/, sc(i) /(ʃ)ʃ/, gl(i) /(ʎ)ʎ/. Notably, Northerners (especially Venetians) have some trouble pronouncing double consonants correctly.

    So, for example, in railway stations the word “attenzione”, heralding an announcement, sounds something like “ad-tendziony”.
    A better approximation would be at-ten-TSYOH-neh [attenˈtsjoːne].

    So I presume it was the same in Latin, at least in words like this where there is an obvious etymological reason for having two consonants. (Ad-tenere, I guess.)
    Yep, remember that z in Latin (in foreign words, i.e. Greek) always counted as two consonants, so pronounce it /zz/.
     
    “Geminated consonants”: if only I had known the correct technical term and searched first. Thank you.
    "Double(d) consonant" is not incorrect, but has two meanings. One is "gemination" which refers to pronunciation, and the other is "digraph" which refers to writing.
     
    I raised this to find some proof that in some languages that use the Latin alphabet geminated consonants (GC) are pronounced the way I'd heard them in Italian, but couldn't be sure as this didn't seem to be consistent across all Latin languages.

    I am close to finalising a web service that automatically transliterates Persian script to Latin, with all the short vowels that are missing from Perso-Arabic script represented correctly, as well as GCs. Persian has many words with GCs, mainly Arabic loans but some Persian words too, especially compounds. I have seen Latinised Arabic words with double consonants, so the technique is in use. I am hoping the one I am working on becomes an official script one day, as well as a guide for the pronunciation of Perso-Arabic words.
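
The post does not say how the service detects geminated consonants. One possible approach, sketched below purely as an assumption, is to rely on input that carries the tashdid mark (Unicode U+0651 ARABIC SHADDA) and to double the preceding Latin consonant. The table `CONSONANTS` and the function `consonant_skeleton` are hypothetical names, the table covers only a handful of letters, and short vowels are deliberately left out.

```python
SHADDA = "\u0651"  # ARABIC SHADDA (tashdid), the gemination mark

# Tiny illustrative table; a real service would need the full alphabet.
CONSONANTS = {
    "\u0628": "b",  # beh
    "\u062A": "t",  # teh
    "\u062F": "d",  # dal
    "\u0631": "r",  # reh
    "\u0644": "l",  # lam
    "\u0645": "m",  # meem
    "\u0646": "n",  # noon
}


def consonant_skeleton(text):
    """Return the Latin consonant skeleton, doubling a consonant before a shadda."""
    out = []
    for ch in text:
        if ch == SHADDA:
            # Repeat the most recently emitted consonant, if there is one.
            if out:
                out.append(out[-1])
        elif ch in CONSONANTS:
            out.append(CONSONANTS[ch])
        # Any other character (vowel marks, unknown letters) is skipped here.
    return "".join(out)


# meem + dal + shadda + teh (the word usually romanised "moddat")
print(consonant_skeleton("\u0645\u062F\u0651\u062A"))
```

Text written without tashdid, which is common in everyday writing, would need a dictionary lookup instead; that part is outside this sketch.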
     
    Also other consonants or consonant clusters are pronounced as double in good Italian, at least between vowels (including /j/)
    I'm quite certain that /j/ is never phonetically doubled in standard Italian, and is the only such consonant apart of course from /w/. This is easily seen in situations of raddoppiamento fonosintattico.

    But the funny thing is that intervocalic /j/ is phonologically always geminate, which is seen from the fact that the masculine article that goes before it is lo/uno (as before consonant clusters, /d͡z/ and /ɲ/), and not il/un (as before single consonants and vowels): lo studente, lo zucchero, uno gnomo, lo iettatore, uno juventino. So /j/ is a geminate underlyingly, but never on the surface.

    I seem to remember that there were some Italian varieties where /j/ was always phonetically geminate (at least inside the stem), but if these exist, they're certainly rare. It was always geminate inside the stem in Latin, though only variably spelled as double, e.g. MA(I)IOR, RE(I)IECTVS. More often they spelled it with a separate letter called “the long I”.
    Yep, remember that z in Latin (in foreign words, i.e. Greek) always counted as two consonants, so pronounce it /zz/.
    Even the sound value of the Greek <ζ> is impossible to definitely establish; word-initially it seems to behave like other Greek double graphemes <ψ, ξ> and like consonant clusters - sometimes they belong only to the following syllable, sometimes they're split across two syllables.

    Certainly <z> wasn't [zː] word-initially in Latin, because it has no initial geminate consonants. It was probably pronounced as either [d͡z] or [t͡s], so as in Italian today. Inside the stem it could have been pronounced as [zː] or a geminate affricate [dd͡z], or replaced with /ss/ and later /dj/. The affricates were in any case foreign sounds, though they existed not just in Greek, but in many surrounding languages related to Latin, e.g. Oscan húrz /horts/ “hortus”. Whether it was voiced or voiceless probably varied allophonically in many of those languages.
     
    I'm quite certain that /j/ is never phonetically doubled in standard Italian, and is the only such consonant apart of course from /w/. This is easily seen in situations of raddoppiamento fonosintattico.
    Right, but you should also add /z/, which is never geminated (although in Northern varieties it is not a true phoneme and in Southern Italian it doesn't exist, in standard/Tuscan Italian it is definitely a phoneme).

    But the funny thing is that intervocalic /j/ is phonologically always geminate, which is seen from the fact that the masculine article that goes before it is lo/uno (as before consonant clusters, /d͡z/ and /ɲ/), and not il/un (as before single consonants and vowels)
    Nope, it is always simple. In past times you could find il juventino, l'Juventus.

    I seem to remember that there were some Italian varieties where /j/ was always phonetically doubled (at least inside the stem), but if these exist, they're certainly rare.
    Not very rare, if you consider the whole of Central Italy (with the exception of Tuscany). Romanesco is a very clear example of a dialect with internal /-jj-/.

    It was always phonetically doubled inside the stem in Latin, though only variably spelled as double, e.g. MA(I)IOR, RE(I)IECTVS. More often they spelled it with a separate letter called “the long I”.
    Right, and those Latin words are represented in genuine Italian words with -gg(i)-.

    Certainly <z> wasn't [zː] word-initially in Latin, because it has no initial geminate consonants. It was probably pronounced as either [d͡z] or [t͡s], so as in Italian today. The affricates were in any case foreign sounds, though they existed not just in Greek, but in many surrounding languages related to Latin, such as Oscan & Umbrian, e.g. Oscan húrz /horts/ “hortus”. Whether it was voiced or voiceless in them probably varied allophonically.
    Possibly /d͡z/ initially and /zz/ internally, but z (ζ) in metre counts always as a double consonant.
     
    In Italian, double consonants are still (often?) pronounced as double. So, for example, in railway stations the word “attenzione”, heralding an announcement, sounds something like “ad-tendziony”. So I presume it was the same in Latin, at least in words like this where there is an obvious etymological reason for having two consonants. (Ad-tenere, I guess.)
    What is the "y" at the end of the Italian word supposed to represent?
     
    Right, but you should also add /z/, which is never geminated (although in Northern varieties it is not a true phoneme and in Southern Italian it doesn't exist, in standard/Tuscan Italian it is definitely a phoneme).
    Right, I know /s/ occurs in many native words and past participles, but /z/ in Latin, French, and other borrowings in Tuscan.
    Nope, it is always simple. In past times you could find il juventino, l'Juventus.
    Please, if you don't understand what I write, ask me to explain, but don't assume I'm wrong and you're right, and don't contradict my words without adducing any reasoning or explanation.

    You have not understood the entirety of the first half of my reply, where I explain that /j/ is phonologically, underlyingly geminate (“self-geminate”) like /ʎ ɲ ʃ t͡s d͡z/. This statement is supported by the evidence that I provided, which shows that /j/ takes the same forms of masculine articles as these “self-geminate” consonants, and disagreeing with that statement requires adducing either new evidence, or a different theoretical interpretation of the evidence I provided.

    As a native speaker, you can only say that phonetically, in surface pronunciation, /j/ is always single; but in fact originally you said the opposite, that it could be doubled. This is incorrect, and half of my reply to you was in order to explain this. This is why when you're replying to me that /j/ is always simple, you're twice compounding your original mistake.

    In il juventino /j/ counts as a single consonant, and the elision of /o/ in lo in l'Juventus can only be explained if <j> is phonologically a vowel, /i/. I understand from your words that this usage is outdated; in any case it doesn't contradict my analysis of the current usage, but on the contrary supports the different phonological status.
    Not very rare, if you consider the whole of Central Italy (with the exception of Tuscany). Romanesco is a very clear example of a dialect with internal /-jj-/.
    Yes, I've heard that about Romanesco, but as far as I remember I couldn't find many recordings where it was a clear geminate. In any case, I guess a pre-condition for its occurrence is that these dialects must lack the shift of the original geminate /jj/ into /dd͡ʒ/, as happened in most varieties north of Lazio.
    Possibly /d͡z/ initially and /zz/ internally, but z (ζ) in metre counts always as a double consonant.
    No, that was the other major point I wanted to correct, namely when I wrote:
    Even the sound value of the Greek <ζ> is impossible to definitely establish; word-initially it seems to behave like other Greek double graphemes <ψ, ξ> and like consonant clusters - sometimes they belong only to the following syllable, sometimes they're split across two syllables.
    That is, in the middle of a word it always counts as a double consonant (split across two syllables), but at the beginning of the word it can count as a single onset (belonging only to the following syllable). This is also the case in Latin, although generally the poets just avoided short vowels followed by word-initial <z>, evidently because the prescriptive rule was based on Greek and disagreed with the actual Latin pronunciation. Incidentally, the same happened with initial st-.

    Here's one clear example of a light syllable before initial <z>, even commented upon by Servius, who explains what I've just explained.
     
    The y of “happy”. I don’t think standard English has a more similar word-final vowel, and in some parts of England the y of “happy” can sound more like [e] or [ɛ].
    Right, that's known as the “happY-laxing” and also as “schwee” – the two linked articles are nice and have audio illustrations. This is a feature of traditional RP, but in Yorkshire and Manchester accents this vowel tends to be even lower than the vowel of kit and more like that of dress. This is why it's more appropriately transcribed as happeh. Incidentally, Italian also doesn't distinguish [e] from [ɛ] in unstressed syllables, with Tuscan and the standard only having [e] when unstressed.
     
    Right, I know /s/ occurs in many native words and past participles, but /z/ in Latin, French, and other borrowings in Tuscan.
    Yes; weird case: /franˈt͡ʃeze/ vs. /inˈglese/ (I didn't transcribe [ŋ(g)] because it's not a phoneme in Italian).

    You have not understood the entirety of the first half of my reply
    Sorry if I was perceived as rude, it was not my intention.

    /j/ is phonologically, underlyingly geminate (“self-geminate”) like /ʎ ɲ ʃ t͡s d͡z/. This statement is supported by the evidence that I provided, which shows that /j/ takes the same forms of masculine articles as these “self-geminate” consonants, and disagreeing with that statement requires adducing either new evidence, or a different theoretical interpretation of the evidence I provided.

    As a native speaker, you can only say that phonetically, in surface pronunciation, /j/ is always single; but in fact originally you said the opposite, that it could be doubled. This is incorrect, and half of my reply to you was in order to explain this. This is why when you're replying to me that /j/ is always simple, you're twice compounding your original mistake.

    In il juventino /j/ counts as a single consonant, and the elision of /o/ in lo in l'Juventus can only be explained if <j> is phonologically a vowel, /i/. I understand from your words that this usage is outdated; in any case it doesn't contradict my analysis of the current usage, but on the contrary supports the different phonological status.
    Nowadays we say la iena /la ˈjɛna/ (borrowed word) but l'ieri, d'ieri /ˈljɛri, ˈdjɛri/; in past times elisions were more common; I'm sorry to contradict you but I don't think /j/ is "underlying geminate" in Tuscan-Standard Italian (unlike, for example, in Rome).

    I tend to explain the phenomenon thinking about a difference between native words (such as ieri) and learned words (such as iena or yogurt, where the article la/lo is selected).

    Besides, some Italian phonologists don't count /j, w/ as consonants but rather as asyllabic forms ("phonostylemes") of /i, u/ (and transcribe them /i̯, u̯/, especially in the case of "falling diphthongs": cf. Muljačić).

    That is, in the middle of a word it always counts as a double consonant (split across two syllables), but at the beginning of the word it can count as a single onset (belonging only to the following syllable). This is also the case in Latin, although generally the poets just avoided short vowels followed by word-initial <z>, evidently because the prescriptive rule was based on Greek and disagreed with the actual Latin pronunciation. Incidentally, the same happened with initial st-.

    Here's one clear example of a light syllable before initial <z>, even commented upon by Servius, who explains what I've just explained.
    I admit I'm not an expert on Latin metre; you could be right in this very case.

    Incidentally, Italian also doesn't distinguish [e] from [ɛ] in unstressed syllables, with Tuscan and the standard only having [e] when unstressed.
    Yes, the only exception being compound words such as bempensante (also written benpensante) /bɛmpenˈsante/, which retain the original sound even in the unstressed part of the word.
     
    Yes; weird case: /franˈt͡ʃeze/ vs. /inˈglese/ (I didn't transcribe [ŋ(g)] because it's not a phoneme in Italian).
    Inghilterra is clearly older and fundamentally native - I'd like to know how it ended up in its present form.
    Nowadays we say la iena /la ˈjɛna/ (borrowed word) but l'ieri, d'ieri /ˈljɛri, ˈdjɛri/; in past times elisions were more common; I'm sorry to contradict you but I don't think /j/ is "underlying geminate" in Tuscan-Standard Italian (unlike, for example, in Rome).

    I tend to explain the phenomenon thinking about a difference between native words (such as ieri) and learned words (such as iena or yogurt, where the article la/lo is selected).

    Besides, some Italian phonologists don't count /j, w/ as consonants but rather as asyllabic forms ("phonostylemes") of /i, u/ (and transcribe them /i̯, u̯/, especially in the case of "falling diphthongs": cf. Muljačić).
    The “native vs learned” explanation would work if all borrowed words in Italian took the article la, and all native words took l' with elision. This generalisation is incorrect: the correct generalisation is that all words starting in a vowel take l', and all other words take the full form. This feeds into the wider generalisation that in Italian, vowels are elided before other vowels, but not before consonants, though there's lexicalised apocope in words ending in /n, r, l/.

    The “native vs learned” situation would be quite remarkable, and is really only possible in communities of deeply-entrenched diglossia: roughly, frequent language switching by speakers who speak both languages natively results in two different grammars inside the same language, with all borrowed words respecting the phonological system, syntax etc. of the speaker's other native language. Such cases have been described, IIRC. There are also cases of words obviously made up of foreign phonemes, which speakers try to pronounce more or less according to the original, e.g. Italian enjambement. In English, on the other hand, this is seen as very pretentious.
    But in the normal case, speakers aren't aware where the words they use come from, whether they're borrowings or not. So when borrowings phonologically behave differently from native words, this is evidence that they're made up of different phonemes, but is not evidence that the speaker switches his entire grammar mid-sentence. Saying that the word is a borrowing is not therefore a grammatical explanation, but a historical fact that may or may not have any bearing on the grammar. Any such connection must be explained in terms of the grammar itself.

    The correct grammatical, phonological generalisation of the usage of the masculine articles in Italian is that un/il is used before words starting in a single consonant, while uno/lo is used before consonant clusters and /ʎ ɲ ʃ t͡s d͡z/.

    uno/lo also appears before /j/, but it's not a consonant cluster. Therefore it must be grouped together with /ʎ ɲ ʃ t͡s d͡z/, which are described as “self-geminate”. Some of them are actually pronounced as geminate in Tuscan, some aren't, but all of them behave phonologically as non-simple consonants. /j/ is just such a consonant – in most pronunciations it's not pronounced as geminate, but in some, like Romanesco, it's geminate both on the surface and underlyingly.
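
The generalisation above can be sketched as a small function over a word's phonemic onset. This is only an illustration under stated assumptions: the names `VOWELS`, `SELF_GEMINATE` and `masculine_definite_article` are made up here, the transcriptions are rough, and "consonant cluster" is narrowed so that ordinary stop + liquid onsets (il treno) and consonant + glide onsets (il suono) still take il, which the post does not spell out. The l' form before vowels follows the elision rule stated earlier in the same post.

```python
VOWELS = {"a", "e", "ɛ", "i", "o", "ɔ", "u"}
GLIDES = {"j", "w"}
LIQUIDS = {"r", "l"}
SIBILANTS = {"s", "z"}
# "Self-geminate" consonants from the post, with /j/ grouped among them.
SELF_GEMINATE = {"ʎ", "ɲ", "ʃ", "t͡s", "d͡z", "j"}


def masculine_definite_article(phonemes):
    """Pick il / lo / l' from the first one or two phonemes of a word."""
    first = phonemes[0]
    second = phonemes[1] if len(phonemes) > 1 else None
    if first in VOWELS:
        return "l'"  # elision before a vowel
    if first in SELF_GEMINATE:
        return "lo"  # behaves like a non-simple consonant
    # A following glide or vowel does not make a cluster for this purpose.
    is_cluster = second is not None and second not in VOWELS and second not in GLIDES
    if is_cluster and (first in SIBILANTS or second not in LIQUIDS):
        return "lo"  # s + consonant, ps-, pn-, ks- and similar
    return "il"  # single consonant, or stop/fricative + liquid


for word, onset in [
    ("studente", ["s", "t"]),
    ("gnomo", ["ɲ", "ɔ"]),
    ("iettatore", ["j", "e"]),
    ("treno", ["t", "r"]),
    ("amico", ["a", "m"]),
]:
    print(masculine_definite_article(onset), word)
```

Passing an onset rather than a spelling keeps the vowel-versus-consonant status of an initial <i> (ieri versus iena) as a decision made by the caller, which is exactly the point under dispute in the thread.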

    Whether it's pronounced as geminate is irrelevant to its underlying status, which is determined by its phonological behaviour such as the form of the article it selects. But the fact that it is pronounced as geminate in Romanesco supports the conclusion that it's geminate everywhere else as well, just not pronounced as such. Additional strong support for this conclusion is the fact that /j/ can never be pronounced as geminate in those varieties where it's not already always pronounced as geminate. This makes it clear that geminating it phonologically doesn't result in a change of surface pronunciation, so that surface pronunciation cannot be used as a counter-argument.

    Now your observation on the difference between la iena and l'ieri is interesting and valuable, and it clearly suggests that the graphic <i> in these two words represents two different phonological entities – a vowel and a consonant. We've already seen that in the case of Juventus. This isn't that surprising, because the Florentine <ie> comes from and corresponds to /ɛ/ in original Tuscan and in most varieties south of it, and so fits the classical definition of the diphthong, counting as a single vowel. la jena, on the other hand, has been borrowed differently, as a vowel-consonant sequence, and that consonant is underlyingly geminate, but simple in the surface pronunciation.
    Yes, the only exception being compound words such as bempensante (also written benpensante) /bɛmpenˈsante/, retaining the original sound even in unstressed part of a word.
    I remember reading there was variation in regards to this.
     
    Inghilterra is clearly older and fundamentally native - I'd like to know how it ended up in its present form.
    Inghilterra ~ inghilese /-ese/ was the native Tuscan solution, but the learned inglese (with /-ese/ by analogy with portoghese, olandese, and similar words) replaced it. Francese (with /-eze/) belongs to the Northern-influenced "courtly lexicon" (lessico cortese) and so it has lenition.

    Hmm, I've read it and I must admit it works, but nevertheless it has a couple of problems:
    1. It's the first time I have read something like this (my linguistic studies date back two decades and were probably a bit out of date; besides, I'm not a professional linguist, I do other jobs for a living);
    2. the solution of /j/ being underlyingly geminate but simple on the surface is interesting, but it has the fault of creating a discrepancy with /w/ (a marginal phoneme, I admit; in the native lexicon it appears only in /-wɔ-/ and /-kw-/ clusters), and no Italian linguist has postulated a difference between them.
    I remember reading there was variation in regards to this.
    Probably yes, but in Tuscany this is the way we pronounce it.
    I noticed you cited enjambement; it's a good example of a fully learned word, almost impossible to pronounce correctly in Italian; in fact we tend to pronounce it /an(d)ʒambeˈman/, quite far from [ɑ̃ʒɑ̃bmɑ̃]; I honestly prefer Contini's inarcatura.
     
    Inghilterra ~ inghilese /-ese/ was the native Tuscan solution, but the learned inglese (with /-ese/ by analogy with portoghese, olandese, and similar words) replaced it. Francese (with /-eze/) belongs to the Northern-influenced "courtly lexicon" (lessico cortese) and so it has lenition.
    Oh, heh, I hadn't even noticed the discrepancy with the medial /i/. What I was wondering was how the name itself got into Tuscan and by what route. Intuitively it looks like a calque (loan-translation) from Germanic Engelland that might go back to the Migration Period, only I'm wondering if and how the Lombards for example knew of England at all without reading about it in Latin. Maybe they didn't, but loan-translated La. Anglia [terra] as Engelland and then vernacular Romance loan-translated that as Engelterra. In any case, the raising of pre-stress /e/ means it's an old Tuscan word.
    the solution of /j/ being underlying geminate but simple in surface is interesting but it has the fault of creating a discrepancy with /w/ (a marginal phoneme, I admit, in native lexicon it appears only in /-wɔ-/ and /-kw-/ clusters) and no Italian linguist postulated a difference between them.
    Theoretical symmetry and beauty is a natural human drive, but one should be careful with it and not let that drive overshadow the real-world data, which has no need to be symmetrical or beautiful. Already in Latin, the only onset (homosyllabic) clusters /w/ appeared in were [kw] and [ŋ.gw], while /j/ couldn't appear in onset clusters at all, only heterosyllabically, as in ab-jectus. For this and other reasons, [kw] and [ŋ.gw] in Latin are often described as single labiovelar consonants /kʷ/ and /gʷ/, and this situation looks to be preserved in Italian, although in words like cuore it's difficult to decide whether we have /kʷɔre/ or /ku͡ɔre/. Most European languages have /j/ as a phoneme but not /w/, which is an allophone of /v/ – see my replies in this thread.

    Anyway, in this case it appears that there is a parallelism after all. [wɔ], just like [jɛ], is a single vocalic nucleus in native words (so l'uomo), but in borrowings /w/ appears to be a consonant (il Word); additionally, variation such as il whiskey ~ lo whiskey suggests that this consonant can be underlyingly single, as normal consonants, or underlyingly geminate by analogy with /j/, at least in non-written speech.
     
    Oh, heh, I hadn't even noticed the discrepancy with the medial /i/. What I was wondering was how the name itself got into Tuscan and by what route. Intuitively it looks like a calque (loan-translation) from Germanic Engelland that might go back to the Migration Period, only I'm wondering if and how the Lombards for example knew of England at all without reading about it in Latin. Maybe they didn't, but loan-translated La. Anglia [terra] as Engelland and then vernacular Romance loan-translated that as Engelterra. In any case, the raising of pre-stress /e/ means it's an old Tuscan word.
    Every etymological dictionary I have read treats Inghilterra as a Tuscan adaptation of Old French Engleterre, Middle and Modern French Angleterre.

    Theoretical symmetry and beauty is a natural human drive, but one should be careful with it and not let that drive overshadow the real-world data, which has no need to be symmetrical or beautiful. Already in Latin, the only onset (homosyllabic) clusters /w/ appeared in were [kw] and [ŋ.gw], while /j/ couldn't appear in onset clusters at all, only heterosyllabically, as in ab-jectus. For this and other reasons, [kw] and [ŋ.gw] in Latin are often described as single labiovelar consonants /kʷ/ and /gʷ/, and this situation looks to be preserved in Italian, although in words like cuore it's difficult to decide whether we have /kʷɔre/ or /ku͡ɔre/. Most European languages have /j/ as a phoneme but not /w/, which is an allophone of /v/ – see my replies in this thread.

    Anyway, in this case it appears that there is a parallelism after all. [wɔ], just like [jɛ], is a single vocalic nucleus in native words (so l'uomo), but in borrowings /w/ appears to be a consonant (il Word); additionally, variation such as il whiskey ~ lo whiskey suggests that this consonant can be underlyingly single, as normal consonants, or underlyingly geminate by analogy with /j/, at least in non-written speech.
    If we consider /i̯ɛ, u̯ɔ/ as "rising (full) diphthongs" (two homosyllabic vowels: parallelism with /Vi̯, Vu̯/?) and then /kʷ, ɡʷ/ we may get rid of (marginal) phoneme /w/, while /j/ can be considered actually underlyingly geminate.
     
    In Catalan, most double "geminated" consonants are learned words or adapted foreign words.

    bb: abbàssida, subbranquial
    cc: accadi, occamisme
    dd: addictiu, luddisme
    gg: heideggerià
    l·l: instal·lar, hel·lènic, paral·lel, etc. <- By far the most common double one, usually pronounced as one single consonant [l].
    mm: gemma, summe
    nn: Anna, annex, cànnabis, tennis
    pp: grappa, proppassat
    tt: giottesc, wattímetre

    Note: I didn't count those double consonants that are only so in spelling: rr [r], ss [s], zz [ts], cc [ks] (-acció), gg [ʤ] (suggeriment)
     
    Every etymological dictionary I have read treats Inghilterra as a Tuscan adaptation of Old French Engleterre, Middle and Modern French Angleterre.
    Oh, right, that explains it. It also occurs to me that the Tuscan /in-/ might simply be an adaptation by the regular correspondence en- :: in-, as in the prefix.
    If we consider /i̯ɛ, u̯ɔ/ as "rising (full) diphthongs" (two homosyllabic vowels: parallelism with /Vi̯, Vu̯/?) and then /kʷ, ɡʷ/ we may get rid of (marginal) phoneme /w/, while /j/ can be considered actually underlyingly geminate.
    More precisely, most analyses seem to agree that there's a difference between the surface/phonetic [i̯V] and [Vi̯] – the rising [i̯V] is by most accounts a true phonological diphthong, whereas the falling [Vi̯] is simply a sequence of two vowels syllabified together (in fact I have a few questions about this, but they don't belong in this thread). Canalis 2018 serves as an example of such an analysis (see scihub); but van der Veer 2006 concludes that rising surface diphthongs are underlyingly in some cases sequences of two vowels (e.g. in vialetto), and in others glide+vowel sequences (muovente, siepe), the latter fitting the classical notion of the diphthong.
    In Catalan, most double "geminated" consonants are learned words or adapted foreign words.
    How often are these geminates simplified in normal speech? I've checked forvo and only hear true geminates, but in Russian for instance, which has plenty of morphologically motivated and even some lexical geminates (+ an always-geminate phoneme щ /ɕː/), unmotivated geminates tend to be simplified, especially in borrowings from French (where these are only graphic) and Latin. Besides, Catalan geminate voiced stops could be distinguished from singletons just by escaping lenition.
     
    In Catalan, most double "geminated" consonants are learned words or adapted foreign words.

    bb: abbàssida, subbranquial
    cc: accadi, occamisme
    dd: addictiu, luddisme
    gg: heideggerià
    l·l: instal·lar, hel·lènic, paral·lel, etc. <- By far the most common double one, usually pronounced as one single consonant [l].
    mm: gemma, summe
    nn: Anna, annex, cànnabis, tennis
    pp: grappa, proppassat
    tt: giottesc, wattímetre

    Note: I didn't count those double consonants that are only so in spelling: rr [r], ss [s], zz [ts], cc [ks] (-acció), gg [ʤ] (suggeriment)
    I think of geminate consonants as more oral than written. Those double consonant groups you mention bb, gg, dd, nn, pp, tt do look rather foreign or learnèd. Would a Catalan really pronounce ab.bàsida and ad.dictiu?
    I thought in orthography, Catalan usually makes use of the mechanism t + consonant to represent acoustic geminate consonants when the words are authentic native Catalan (exceptions being of course l.l and rr, in which there are probably hundreds of examples)

    setmana (pron. sam.mana), sotmetre (sum.metra)
    atlas (pron. al.las), Atlantic (al.lantic)
    ratlla (pron. rall.lla), vetlla (bell.lla)
    cotna (pron. con.na)
    metge (pron. mej.ja), jutge (juj.ja)

    Also the l triggers gemination
    poble (pron. pob.bla)
    regla (pron. reg.gla)
    article (pron. artic.cla)
    triple (pron. trip.pla)
     
    How often are these geminates simplified in normal speech? I've checked forvo and only hear true geminates, but in Russian for instance, which has plenty of morphologically motivated and even some lexical geminates (+ an always-geminate phoneme щ /ɕː/), unmotivated geminates tend to be simplified, especially in borrowings from French (where these are only graphic) and Latin. Besides, Catalan geminate voiced stops could be distinguished from singletons just by escaping lenition.
    The most obvious geminate for Catalans, the l·l (obvious because we even call it ela geminada), is simplified by almost everyone in normal speech, except perhaps in a few words. But like the rest of the geminates in learned words, it is geminated in careful reading, especially in formal contexts.

    However, the geminates I'm discussing with Merquiades below are mostly respected.

    I thought in orthography, Catalan usually makes use of the mechanism t + consonant to represent acoustic geminate consonants when the words are authentic native Catalan (exceptions being of course l.l and rr, in which there are probably hundreds of examples)

    setmana (pron. sam.mana), sotmetre (sum.metra)
    atlas (pron. al.las), Atlantic (al.lantic)
    ratlla (pron. rall.lla), vetlla (bell.lla)
    cotna (pron. con.na)
    metge (pron. mej.ja), jutge (juj.ja)

    That's right. If we’re talking about gemination with no need for a graphic double consonant, then this is indeed a very frequent phenomenon in Catalan, especially due to assimilation. Not only in combinations such as those you mention (we could add -tb- in futbol, -bm- in submarí, etc), but in most contexts in which the first consonant gets assimilated:

    [b:] - cap vaixell ‘no ship’, tot bé ‘everything ok’
    [d:] - el pot donar ‘he can give it’
    [g:] - un suc gustós ‘a tasty juice’
    [ʒ:] - hi ha més gent ‘there are more people’, un peix gegant ‘a giant fish’
    [k:] - un mag català ‘a Catalan magician’, hem fet curts ‘we’ve run out of it, we don’t have enough’
    [l:] - han fet l’amor ‘they’ve made love’
    [ʎ:] - el llop ‘the wolf’, han torrat llesques ‘they’ve toasted slices of bread’
    [m:] - no sap mentir ‘he can’t lie’, tan malament ‘so badly’
    [n:] - pot nevar ‘it may snow’
    [ɲ:] - han fet nyaps ‘they’ve done botched jobs all over’
    [p:] - tot pot ser ‘everything may happen’
    [ʃ:] - sis xais ‘six lambs’
    [t:] - solitud total ‘total loneliness’
    [z:] - tres zebres ‘three zebras’

    Also the l triggers gemination
    poble (pron. pob.bla)
    regla (pron. reg.gla)
    article (pron. artic.cla)
    triple (pron. trip.pla)

    Not all varieties do but I definitely geminate them. Ending in a schwa, though.
     
    Those double consonant groups you mention bb, gg, dd, nn, pp, tt do look rather foreign or learnèd. Would a Catalan really pronounce ab.bàsida and ad.dictiu?
    Some are geminated. Suggerir is often pronounced as /dʤ/ (well not a gemination maybe, but different from /ʒ/). And tarannà ("character") is always pronounced geminated. But most other words are pronounced with a simple consonant. There isn't gemination in native (patrimonial) words unless as a result of t+consonant as you say. Besides, in the Majorcan dialect, where consonantal assimilations are more common, <ct> becomes /tt/: arquitette, or <rl> becomes /ll/: pal·là instead of parlar. In fact al·lot, where the gemination is always done, comes from arlot.

    metge (pron. mej.ja), jutge (juj.ja)
    These are pronounced /dʤ/
     
    [b:] - cap vaixell ‘no ship’, tot bé ‘everything ok’
    [d:] - el pot donar ‘he can give it’
    [g:] - un suc gustós ‘a tasty juice’
    [ʒ:] - hi ha més gent ‘there are more people’, un peix gegant ‘a giant fish’
    [k:] - un mag català ‘a Catalan magician’, hem fet curts ‘we’ve run out of it, we don’t have enough’
    [l:] - han fet l’amor ‘they’ve made love’
    [ʎ:] - el llop ‘the wolf’, han torrat llesques ‘they’ve toasted slices of bread’
    [m:] - no sap mentir ‘he can’t lie’, tan malament ‘so badly’
    [n:] - pot nevar ‘it may snow’
    [ɲ:] - han fet nyaps ‘they’ve done botched jobs all over’
    [p:] - tot pot ser ‘everything may happen’
    [ʃ:] - sis xais ‘six lambs’
    [t:] - solitud total ‘total loneliness’
    [z:] - tres zebres ‘three zebras’
    I never imagined all those would combine to become geminates. Some of those combinations become complicated to process. Hanfellamó, nossammentí, yameggén
    I think something similar occurs in some dialects of Castilian Spanish too, maybe Manchego and thereabouts. magguapo, marrico, lojjemelos, loccaramelos, el.lo mimmo, dedde que, settiembre, agotto, effermo, mippares, erraro, el.lago
    Not all varieties do but I definitely geminate them. Ending in a schwa, though.
    Yes, I know it's commonly called schwa and for lack of a better word I'll leave it at that. For me schwa is the "e" in French "le" or English "the". I don't know how you are pronouncing it personally but the neutral vowel to me sounds somewhere closer to "a" yet not quite that. But closer. Were it not for Western dialect I think they could have written "las platjas blancas" in lieu of "les platges blanques".
    There isn't gemination in native (patrimonial) words unless as a result of t+consonant as you say. Besides, in the Majorcan dialect, where consonantal assimilations are more common, <ct> becomes /tt/: arquitette, or <rl> becomes /ll/: pal·là instead of parlar. In fact al·lot, where the gemination is always done, comes from arlot.
    @Dymn So the t + consonant was originally pronounced and then after assimilated into the consonant to become a geminate? I always thought someone like Pompeu Fabra preferred reinstating a Latinate t from septimana rather than write semmana.
     
    I never imagined all those would combine to become geminates. Some of those combinations become complicated to process. Hanfellamó, nossammentí, yameggén
    Bear in mind that not all speakers do them (you can also hear a -dl- combination there) and that it also depends on the speaker's speed.

    "Yameggén"? It's not a [g], it's a [ʒ]: jameʒ'ʒen
    I think something similar occurs in some dialects of Castilian Spanish too, maybe Manchego and thereabouts. magguapo, marrico, lojjemelos, loccaramelos, el.lo mimmo, dedde que, settiembre, agotto, effermo, mippares, erraro, el.lago
    It's a different thing, as that usually carries some trace of aspiration. (Except for "marrico", that would be close when done as in northern dialects, as when s drops in los relojes.)

    Yes, I know it's commonly called schwa and for lack of a better word I'll leave it at that. For me schwa is the "e" in French "le" or English "the". I don't know how you are pronouncing it personally but the neutral vowel to me sounds somewhere closer to "a" yet not quite that. But closer. Were it not for Western dialect I think they could have written "las platjas blancas" in lieu of "les platges blanques".

    Catalan standard schwa is a schwa, pronounced in the same way as the a in English about. You might be referring to the [ɐ] sound made by many due to Spanish influence in the Barcelona area, a sound very close to [a] indeed. But that is not considered standard and is often derided.
     
    More precisely, most analyses seem to agree that there's a difference between the surface/phonetic [i̯V] and [Vi̯] – the rising [i̯V] is by most accounts a true phonological diphthong, whereas the falling [Vi̯] is simply a sequence of two vowels syllabified together (in fact I have a few questions about this, but they don't belong in this thread). Canalis 2018 serves as an example of such an analysis (see scihub);
    Yep, I know; nevertheless, /i̯ɛ, u̯ɔ/ show a more vocalic behavior than, say, /jV/ (as in iena or juventino) in the selection of articles, elision and so on. I think (just a hypothesis, eh) they could be more similar to falling /Vi̯, Vu̯/ than to a pure glide + vowel sequence /jV, wV/ (the latter a marginal phoneme). Time to open a new thread?

    but van der Veer 2006 concludes that rising surface diphthongs are underlyingly in some cases sequences of two vowels (e.g. in vialetto), and in others glide+vowel sequences (muovente, siepe), the latter fitting the classical notion of the diphthong.
    Well, vialetto in Tuscany is /vi.aˈletto/ with no diphthong/glide at all; *muovente is not good Italian: we say movente and it is used especially in compound words (semovente, commovente) for two reasons: vicinity to Latin and the neglected rule of dittongo mobile: a stressed /i̯ɛ, u̯ɔ/ loses /i̯, u̯/ switching to unstressed position (piède ~ pedóne, muòvere ~ movènte).
     
    @Dymn So the t + consonant was originally pronounced and then after assimilated into the consonant to become a geminate? I always thought someone like Pompeu Fabra preferred reinstating a Latinate t from septimana rather than write semmana.
    Yes, I think in most cases there was a true t, at least there was in Latin, so it had to be assimilated at some point (spatula > espatlla, cutina > cotna, septimana > setmana...). However, the second sentence is true: by Fabra's time it had long been geminated and "week" was written semmana or senmana; he reinstated the Old Catalan spelling, which also makes visible the relationship with set "seven".
     
    Judging by several forvo recordings (setmana, cotna), some speakers treat those words that are currently spelt with <tm>, <tn> as actually consisting of these phonemes, and from their point of view the phonetic geminate [n:] is by the general assimilation rule as described in message 25. So this isn't just a graphic convention or mechanism, as merquiades calls it in #24.
     
    Sobakus said:
    So this isn't just a graphic convention or mechanism,
    They can't be, the t's are inherited and appear in the same location in other Romance languages, see e.g. Italian settimana. Even the t in vetlla is inherited, from Latin vetula, and the t in jutge, which is from the Latin d in iudex, still present in e.g. Italian giudice, and in metge which is from medicus, see e.g. Italian medico.
     
    Judging by several forvo recordings (setmana, cotna), some speakers treat those words that are currently spelt with <tm>, <tn> as actually consisting of these phonemes, and from their point of view the phonetic geminate [n:] is by the general assimilation rule as described in message 25. So this isn't just a graphic convention or mechanism, as merquiades calls it in #24.
    Oh, forgot to answer this.

    The traditional pronunciation is semmana. But it's getting less common by the day in two different ways: in the colloquial language it's increasingly pronounced semana like in Spanish, and in reading a text aloud or more careful speech the t resurfaces due to influence from the spelling. Something similar happens with tll: words like ametlla /əmɛʎʎə/ ("almond") are often pronounced either amella /əmɛʎə/ or amet-lla /əmɛdʎə/.
     
    I concur.

    One thing must be noted. The -tn- is very rare, which is why cotna is almost always the only example. But -tll- is relatively common (ametlla, rotlle, vetlla, batlle, bitllet, motlle, espatlla, guatlla, ratllar, etc), so it's easy for someone little used to the word to say "codna" yet not hesitate to geminate the ll in -tll-. It must be said, nonetheless, that there is dialectal variation in some of these words, and some dialects would say l·l [l:] instead of ll: for some of them.
     
    So basically the people who pronounce the t instead of a geminate in words like setmana, sotmetre, atlas, vetlla, espatlla, ametlla, cotna, jutge are basically illiterate and are over-correcting.

    I concur.

    One thing must be noted. The -tn- is very rare, which is why cotna is almost always the only example. But -tll- is relatively common (ametlla, rotlle, vetlla, batlle, bitllet, motlle, espatlla, guatlla, ratllar, etc), so it's easy for someone little used to the word to say "codna" yet not hesitate to geminate the ll in -tll-. It must be said, nonetheless, that there is dialectal variation in some of these words, and some dialects would say l·l [l:] instead of ll: for some of them.
    Are you saying some people pronounce amel.la, rol.le, vel.la, bal.le, bil.let, mol.le, espal.la, ral.lar? Who should do that? :confused:
    On Sardinia? I guess that one is not a Spanish influence.
     
    They can't be, the t's are inherited and appear in the same location in other Romance languages, see e.g. Italian settimana. Even the t in vetlla is inherited, from Latin vetula, and the t in jutge, which is from the Latin d in iudex, still present in e.g. Italian giudice, and in metge which is from medicus, see e.g. Italian medico.
    They certainly can be, but aren't. What you write is known as the etymological fallacy (though this term is usually applied to meaning, not pronunciation). That's like saying that the English <gh> in tough can't be merely a graphic convention for /f/ because in Middle English and some dialects and related languages it's pronounced as /x/ or the like.

    The fact that some earlier stage of modern Catalan had, or some more conservative variety still has, a stop consonant there doesn't tell us anything about the synchronic phonological make-up of the words in question in the mental grammar of modern speakers. Even the same phonetic realisation with a geminate consonant can be, and actually is, interpreted by speakers (and phonologists) in two different ways. Some speakers clearly treat the written stops as merely etymological and the geminate consonant as underlyingly so, while others treat it as the result of assimilation of an underlying stop consonant.
     
    So basically the people who pronounce the t instead of a geminate in words like setmana, sotmetre, atlas, vetlla, espatlla, ametlla, cotna, jutge are basically illiterate and are over-correcting.
    Jutge is pronounced with /dʤ/, so with the t pronounced (and assimilated in voicedness to the next consonant).

    They're not illiterate, but yes they're overcorrecting.

    Are you saying some people pronounce amel.la, rol.le, vel.la, bal.le, bil.let, mol.le, espal.la, ral.lar? Who should do that?
    In the Balearics, and parts of Valencia and the Ebro region of Catalonia. In spelling, these varieties use ametla, rotle, etc.
     
    Are you saying some people pronounce amel.la, rol.le, vel.la, bal.le, bil.let, mol.le, espal.la, ral.lar? Who should do that? :confused:
    On Sardinia?
    Well, why the surprise? It may even be the original or most conservative of the two pronunciations.

    And yes, where Dymn said above, and in Sardinia too. That is, in L'Alguer.

    I guess that one is not a Spanish influence.
    How could it be? We're talking about phenomena that probably took place centuries ago.

    Real Spanish influence upon Catalan phonology has only been acting in the last sixty years. It is pervasive these days, but luckily not omnipresent yet. :p
     
    They certainly can be, but aren't. What you write is known as the etymological fallacy (though this term is usually applied to meaning, not pronunciation). That's like saying that the English <gh> in tough can't be merely a graphic convention for /f/ because in Middle English and some dialects and related languages it's pronounced as /x/ or the like.

    The fact that some earlier stage of modern Catalan had, or some more conservative variety still has, a stop consonant there doesn't tell us anything about the synchronic phonological make-up of the words in question in the mental grammar of modern speakers. Even the same phonetic realisation with a geminate consonant can be, and actually is, interpreted by speakers (and phonologists) in two different ways. Some speakers clearly treat the written stops as merely etymological and the geminate consonant as underlyingly so, while others treat it as the result of assimilation of an underlying stop consonant.
    The English gh can't be a mere graphic convention for f, because it's not always pronounced f, see through, thorough, drought, and borough, for example. In fact, I believe the f pronunciation is a minority pronunciation, only seen in a few words, such as tough and laugh.

    Sure, in modern Catalan, the t may have been retained solely to mark geminates, but that doesn't mean it didn't originate as a proper consonant that has assimilated in speech but not in writing.
     
    The English gh can't be a mere graphic convention for f, because it's not always pronounced f, see through, thorough, drought, and borough, for example. In fact, I believe the f pronunciation is a minority pronunciation, only seen in a few words, such as tough and laugh.

    Sure, in modern Catalan, the t may have been retained solely to mark geminates, but that doesn't mean it didn't originate as a proper consonant that has assimilated in speech but not in writing.
    Clearly we're speaking two different languages here. You draw a false dichotomy – mere graphic conventions include rather than exclude etymologically motivated mere graphic conventions, and the English <gh> for /f/ is precisely this. The /f/ sound at the end of tough is no different in any way whatsoever from that in any other word. This is why it's a graphic convention with no implications for the underlying phonological representation, the segments (phonemes) that the word /tʌf/ is made up of, which are /t/, /ʌ/ and /f/.

    How that convention originated (whether it's etymological or not) has no bearing on what I'm talking about, like it makes no difference where the hydrogen molecule in the water you drink originated from. Hydrogen molecules have no memory of their origins, and neither do phonemes. The only information the speakers have reference to is what the segment (phoneme) is synchronically, at this moment in time. Confusing this synchronic state with diachronic (over time) development and claiming that diachronic development determines synchronic state is called the etymological fallacy.

    Essentially, I'm trying to linguistically determine whether the substance in question is H2O or something else, and you're saying that since the hydrogen molecule in it originated in an acid, it's not just water but acid water.
     
    Now, what does actually influence the synchronic state is spelling. If what Dymn says in #30 is right, then the <t> spelling is the newer one, and before that geminates were written instead, which means the <t> spelling isn't an etymological preservation, but an etymologising innovation (except presumably in <tll>). Regardless of its origin (which is, to repeat yet again, irrelevant), this spelling is now standard, and it's this spelling that influences people's synchronic interpretation of what segments the words in question are made up of. Accordingly, many modern Catalan speakers regard these words as containing an actual stop consonant instead of a geminate, based partly on the spelling and partly on those words' relation to other words (setmana–set).
     
    Setmana was the generalized spelling in Catalan medieval literature as well, which was quite codified since the spelling of official documents of the Royal Chancellery of the Crown of Aragon served as the written standard. But the fact that a few examples like senmana (11th century!) or sempmana can be found too shows that the geminated pronunciation must have been a very old phenomenon indeed.
     