Romanian Transliteration

IreneStr

New Member
Dutch - The Netherlands
I am currently working on a transliteration module for as many languages as possible. I'd like to include Romanian as well, but I am not sure about the transliterations.
The transliterations will be used in URLs and in cases where people only have a non-Romanian keyboard available. Therefore, the non-standard Latin characters used in the Romanian alphabet should be transliterated to standard Latin characters. Please imagine you are writing on an old-fashioned British typewriter without diacritics/accents. What would you write in that situation?

This is my current list:

ă --> a
â --> a
î --> i
ș --> s? sh?
ț --> t? tz?

Are the above transliterations correct? What would you write for ș and ț? Do you have additions? Have I overlooked certain characters?

Thank you!
 
  • naicul

    Member
    Romanian
    Wasn't there an effort to support internationalized domain names? Anyway, here is what you should probably use (the same is used even by some Romanian news sites, e.g. HotNews.ro - Actualitate):
    ă - a
    â - a
    î - i
    ș - s (note that some publications wrongly use ş, s with cedilla, instead of ș, s with comma below)
    ț - t (note that some publications wrongly use ţ, t with cedilla, instead of ț, t with comma below)
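
For illustration, the list above could be sketched in Python roughly like this. The names `ROMANIAN_TO_ASCII` and `romanian_to_ascii` are made up for this example and do not come from any existing library. The table covers both the comma-below letters and the cedilla look-alikes, since both turn up in real text.

```python
import unicodedata

# Hypothetical mapping for illustration; escapes are used so the code points are unambiguous.
ROMANIAN_TO_ASCII = {
    "\u0103": "a",  # ă  a with breve
    "\u00e2": "a",  # â  a with circumflex
    "\u00ee": "i",  # î  i with circumflex
    "\u0219": "s",  # ș  s with comma below
    "\u021b": "t",  # ț  t with comma below
    "\u015f": "s",  # ş  s with cedilla (look-alike often used instead of ș)
    "\u0163": "t",  # ţ  t with cedilla (look-alike often used instead of ț)
}


def build_table(mapping):
    """Build a str.translate table covering lowercase and uppercase letters."""
    table = {}
    for src, dst in mapping.items():
        table[ord(src)] = dst
        table[ord(src.upper())] = dst.upper()
    return table


_TABLE = build_table(ROMANIAN_TO_ASCII)


def romanian_to_ascii(text):
    """Replace Romanian letters with their base Latin letters."""
    # Normalize to NFC first so letters typed as base + combining mark
    # are turned into the single code points used in the table.
    return unicodedata.normalize("NFC", text).translate(_TABLE)
```

Usage would be something like `romanian_to_ascii("cămașă")`, and other characters pass through unchanged.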

    And here is the full Romanian alphabet: Romanian alphabet - Wikipedia, the free encyclopedia
     

    patriota

    Senior Member
    pt-BR
    The concept of "transliteration" is about writing a language with a whole different alphabet. Romanian already uses the Latin alphabet. The word you had in mind was "transcription".

    Anyway, URLs in most languages that use Latin letters with diacritics simply use the base forms of those letters when Unicode isn't implemented. That's the case of websites in languages such as Romanian, Portuguese, and Vietnamese. Changing them to something else would be counterproductive to the user experience and search engine rankings.
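
A minimal sketch of the "just use the base letter" approach, assuming Python and its standard `unicodedata` module. The function name `strip_diacritics` is only illustrative. It decomposes each character and drops the combining marks, so it needs no per-language table.

```python
import unicodedata


def strip_diacritics(text):
    """Return text with combining marks removed, keeping the base letters."""
    # NFD splits a letter such as "ă" into "a" followed by a combining breve.
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() is non-zero for combining marks only.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)
```

Note that this only works for letters that Unicode defines as base letter plus mark. A letter such as ø has no such decomposition and is left as it is, so it would still need an explicit replacement.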

    The only special case I'm aware of is German websites, which often follow some patterns, like replacing ü with ue.
     

    IreneStr

    New Member
    Dutch - The Netherlands
    Thank you patriota for your input.
    Changing a character with a diacritic into multiple characters is more frequent than you might think. It is common in many Nordic languages as well. For example Danish ø --> oe (but Faroese ø --> o), Nynorsk and Finnish å --> aa, Swedish ä --> ae. Whether just dropping diacritics or changing a character to multiple characters is better for user experience and rankings therefore depends on the language. That is why I'm working on language-specific modules, and why I like to ask native speakers ;)
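
To make the "depends on the language" point concrete, here is a rough Python sketch of per-language override tables with a generic fallback. The replacements are only the examples quoted in this thread and have not been checked against any standard. The language keys, `OVERRIDES` and `transliterate` are hypothetical names for this example.

```python
import unicodedata

# Per-language overrides, copied from the examples in this thread (unverified).
OVERRIDES = {
    "da": {"\u00f8": "oe"},  # Danish ø -> oe
    "fo": {"\u00f8": "o"},   # Faroese ø -> o
    "de": {"\u00fc": "ue"},  # German ü -> ue
    "sv": {"\u00e4": "ae"},  # Swedish ä -> ae
    "nn": {"\u00e5": "aa"},  # Nynorsk å -> aa
}


def transliterate(text, lang):
    """Apply language-specific replacements, then drop any remaining diacritics."""
    table = OVERRIDES.get(lang, {})
    out = []
    for ch in unicodedata.normalize("NFC", text):
        lower = ch.lower()
        if lower in table:
            rep = table[lower]
            # Keep an initial capital when the source letter was uppercase.
            out.append(rep.capitalize() if ch != lower else rep)
        else:
            out.append(ch)
    # Fallback for everything else: decompose and remove combining marks.
    decomposed = unicodedata.normalize("NFD", "".join(out))
    return "".join(c for c in decomposed if not unicodedata.combining(c))
```

With this layout a language without an entry, such as Romanian, simply falls through to the diacritic-dropping step, while Danish and Faroese can treat the same letter differently.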
     

    jimmyy

    Senior Member
    Romanian
    I agree with patriota; I have done some transliteration from Cyrillic to Latin script. I also believe that this kind of transcription is less frequent. If we are talking about writing properly in a given language, then diacritics would be used; otherwise, when writing with basic Latin characters, say in an SMS or email, one would simplify and just replace å with a single a.

    I've worked with Germans, and one of them, a pedantic man with an umlaut in his name, never complained when I wrote to him without the dots, especially in emails; in official documents it was different.

    I would be curious to learn in which circumstances the transcription that Irene mentioned is used.
     

    IreneStr

    New Member
    Dutch - The Netherlands
    Hi Jimmyy, thank you for your input. The transliteration module I am working on is for website URLs. For example, if the title of your page were 'cămaşă', it would change your URL to mywebsite.com/camasa instead of mywebsite.com/cămaşă, because many browsers would change 'ş' and 'ă' into something illegible.
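
As a sketch of the URL case described above, assuming Python's standard library: `slugify` builds the ASCII path segment, and `percent_encoded` shows the kind of escaped form a non-ASCII path takes when it is sent as UTF-8. Both function names are made up for this example.

```python
import re
import unicodedata
from urllib.parse import quote


def slugify(title):
    """Turn a page title into a lowercase ASCII URL segment."""
    # Decompose and drop combining marks (covers the Romanian letters).
    decomposed = unicodedata.normalize("NFD", title)
    ascii_text = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Collapse every run of other characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def percent_encoded(title):
    """Show the percent-encoded UTF-8 form of the untransliterated title."""
    return quote(unicodedata.normalize("NFC", title))
```

For the title 'cămaşă', `slugify` gives `camasa`, while `percent_encoded` gives `c%C4%83ma%C5%9F%C4%83`, which is the hard-to-read form the module is meant to avoid.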
     

    jimmyy

    Senior Member
    Romanian
    Bedankt, very interesting. I think Wikipedia does such transliterations, or at least they have a system to handle it.
     

    naicul

    Member
    Romanian
    jimmyy wrote: "I would be curious to learn in which circumstances the transcription that Irene mentioned is used."
    I have a (Norwegian) colleague that signs his emails "Kaare". His name is Kåre.
     