Post by Keith_Beef » 2018-02-28 09:13

Hello, all.

I'm trying to understand how Korean text entry works, especially on computers set up principally or exclusively for working in Korean.

Please note that I don't know much Korean, and I do not have complete control over the computer I am using for my tests (no root privileges). Korean text entry is not correctly set up on this computer, so I have to enter text using Unicode codepoints.

I am accessing a remote service, through a Web interface, that checks if a certain term is already in a database. The terms are stored as precomposed Hangul syllables. An example term is 축구.

If I enter this using the Unicode codepoints for the jamo that make up these two syllables (U+110e, U+116e, U+11a8, U+1100, U+116e), I see 축구 as I expect, but when I type this in the web interface, the term is rejected because the text is jamo, not precomposed syllables.
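To make the difference between the two encodings concrete, here is a minimal sketch, assuming Python 3 with its standard `unicodedata` module is available. It applies Unicode normalization form NFC to the five jamo listed above, which composes each run of conjoining jamo into one precomposed syllable. This only illustrates the two representations of the same text; it says nothing about what the web interface or any input method does.

```python
import unicodedata

# The five conjoining jamo from the example above:
# choseong U+110E, jungseong U+116E, jongseong U+11A8, then U+1100, U+116E
jamo = "\u110e\u116e\u11a8\u1100\u116e"

# NFC performs canonical composition, merging each
# leading-consonant + vowel (+ trailing-consonant) run into one syllable.
composed = unicodedata.normalize("NFC", jamo)

print(len(jamo), len(composed))
print([f"U+{ord(c):04X}" for c in composed])
```

The two strings render identically but compare unequal as raw codepoint sequences, which would explain a rejection by a lookup that does not normalize its input.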

What I am doing, here, is consciously entering the codepoint for the choseong (initial consonant) followed by that for the jungseong (medial vowel) and (where necessary) that for the jongseong (final consonant).

As I understand it, from reading around the subject and from watching a YouTube video of somebody typing Korean, the normal way of typing is to press the key corresponding to each jamo shape and to let the input method handle the composition. This goes as far as the input method deciding whether a consonant typed after a vowel is a jongseong, to be kept with the preceding vowel, or a choseong that begins a new syllable.

Now to the heart of my question. Let's imagine a person sitting at a computer set up principally or exclusively for working in Korean. After typing a valid syllable, for example 축, is this kept as the three codepoints U+110e, U+116e, U+11a8 or is it composed by the input method to the Hangul syllable 축 with the codepoint U+cd95?
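For reference, the relationship between the two forms is purely arithmetic. The sketch below uses the Hangul syllable composition formula and constants from the Unicode Standard; the function name `compose_syllable` is my own, chosen for illustration. It shows how the three jamo codepoints map to U+CD95, not which form a Korean input method actually emits.

```python
# Constants from the Unicode Standard's Hangul syllable composition algorithm
S_BASE = 0xAC00   # first precomposed syllable
L_BASE = 0x1100   # first choseong (leading consonant)
V_BASE = 0x1161   # first jungseong (vowel)
T_BASE = 0x11A7   # one before the first jongseong (index 0 = no final)
V_COUNT = 21      # number of jungseong
T_COUNT = 28      # number of jongseong, plus one for "none"


def compose_syllable(lead, vowel, trail=None):
    """Return the precomposed syllable codepoint for the given jamo codepoints.

    Hypothetical helper for illustration; it does no range validation.
    """
    l_index = lead - L_BASE
    v_index = vowel - V_BASE
    t_index = 0 if trail is None else trail - T_BASE
    return S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index


print(f"U+{compose_syllable(0x110E, 0x116E, 0x11A8):04X}")
print(f"U+{compose_syllable(0x1100, 0x116E):04X}")
```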
