Updated 2018-05-29 02:38:12 by ray2501

Things Chinese is a guide to working with Chinese in Tcl.

See Also

Chinlish
taiku: a little Chinese editor
Chinese numbers
Fun with Chinese characters
Animated Kanji
fan2jian and jian2fan
Easy input of Pinyin
Pinyin, ASCII to Unicode Converter

Misc

A Chinese website dedicated to Tcl/Tk: http://www.tclchina.com/ (2006-05-30) (dead, 2014-09-07)

中文成? (Does Chinese work?)

RS 2017-04-22: My Tcl solution for the "Chinese zodiac" task at Rosetta Code http://rosettacode.org/wiki/Chinese_zodiac#Tcl :
proc cn_zodiac year {
   # Year 4 CE started a 60-year cycle (jia-zi), so shift the year by 4
   set year0 [expr {$year - 4}]
   set animals {Rat Ox Tiger Rabbit Dragon Snake Horse Goat Monkey Rooster Dog Pig}
   set elements {Wood Fire Earth Metal Water}
   set stems {jia3 yi3 bing3 ding1 wu4 ji3 geng1 xin1 ren2 gui3}
   set gan {\u7532 \u4E59 \u4E19 \u4E01 \u620A \u5DF1 \u5E9A \u8F9B \u58EC \u7678}
   set branches {zi3 chou3 yin2 mao3 chen2 si4 wu3 wei4 shen1 you3 xu1 hai4}
   set zhi {\u5B50 \u4E11 \u5BC5 \u536F \u8FB0 \u5DF3 \u5348 \u672A \u7533 \u9149 \u620C \u4EA5}
   set m10 [expr {$year0 % 10}]   ;# index of the heavenly stem
   set m12 [expr {$year0 % 12}]   ;# index of the earthly branch
   set res [lindex $gan $m10][lindex $zhi $m12]
   lappend res [lindex $stems $m10]-[lindex $branches $m12]
   # Each element covers two consecutive stems
   lappend res [lindex $elements [expr {$m10 / 2}]]
   lappend res [lindex $animals $m12] ([expr {$year0%2 ? "yin" : "yang"}])
   lappend res year [expr {$year0 % 60 + 1}]
   return $res
}
% cn_zodiac 1984
甲子 jia3-zi3 Wood Rat (yang) year 1
% cn_zodiac 2017
丁酉 ding1-you3 Fire Rooster (yin) year 34

RS 2007-08-29: Here's a converter from Chinese characters in Unicode to decimal GB2312 numbers (two digits for the row, two for the cell):
proc c2gb str {
    set res {} 
    foreach c [split $str {}] {
        binary scan [encoding convertto gb2312 $c] cc a b
        set a [expr {($a+0x100) % 0x100 - 160}]
        set b [expr {($b+0x100) % 0x100 - 160}]
        lappend res [format %02d%02d $a $b]
    }
    set res
}
% c2gb 上海
4147 2603
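
Going the other way, from such decimal numbers back to characters, can be sketched as follows. This is only a minimal sketch and not part of the original code: the name gb2c is hypothetical, it assumes every number is a four-digit row-cell code as produced by c2gb, and it assumes that Tcl's gb2312 encoding uses the byte form with 160 added to each half, as the arithmetic in c2gb implies.

```tcl
# Hypothetical inverse of c2gb: decimal GB2312 row-cell numbers -> characters.
proc gb2c {codes} {
    set res ""
    foreach code $codes {
        # Split e.g. "2603" into row 26 and cell 3; %2d reads plain decimal,
        # so a leading zero is not mistaken for octal.
        if {[scan $code %2d%2d row cell] != 2} {
            error "not a four-digit GB2312 number: $code"
        }
        # Undo the "- 160" of c2gb to get the two encoded bytes back.
        set bytes [binary format cc [expr {$row + 160}] [expr {$cell + 160}]]
        append res [encoding convertfrom gb2312 $bytes]
    }
    return $res
}

puts [gb2c {4147 2603}]
```

With the numbers from the example above, this should give back the two characters that were fed to c2gb.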

ZU 2007-08-09: Could you please tell me how I can use that when I use TclKit to show some text with Chinese characters? I have tried
 puts \u2603  ;# the second number above

And I get ☃. May I ask why?

AM 2007-08-31: That is interesting in itself too :) but the real problem is that the \u escape expects a hexadecimal Unicode code point, whereas the numbers above are decimal GB2312 codes.

RS: Yes. The Unicode code points of these example characters can be scanned out:
 % u2x 上海
 \u4E0A\u6D77
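
The u2x command is not defined on this page. A minimal sketch of what such a helper might look like is given here; it is an assumption rather than RS's actual code, it passes ASCII through unchanged, and it assumes all other characters fit into four hexadecimal digits.

```tcl
# Sketch of a u2x-like helper: replace every non-ASCII character
# by its \uXXXX escape, leaving ASCII characters as they are.
proc u2x {str} {
    set res ""
    foreach c [split $str ""] {
        scan $c %c code            ;# code point of the character
        if {$code < 128} {
            append res $c
        } else {
            append res [format "\\u%04X" $code]
        }
    }
    return $res
}

# The two example characters are written as escapes here so that the
# script does not depend on the encoding of the source file.
puts [u2x "\u4E0A\u6D77"]    ;# prints \u4E0A\u6D77
```

The escapes printed this way can be pasted into a puts command, which is what the attempt with \u2603 was after.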

LV 2008-01-31: I was asked today whether Tcl supports the GBK or GB18030 Chinese character encodings. I don't see either among the encodings listed when I type:
$ tclsh8.5
% encoding names
cp860 cp861 cp862 cp863 tis-620 cp864 cp865 cp866 gb12345 gb2312-raw cp949 cp950 cp869 
dingbats ksc5601 macCentEuro cp874 macUkraine jis0201 gb2312 euc-cn euc-jp macThai iso8859-10 
jis0208 iso2022-jp macIceland iso2022 iso8859-13 jis0212 iso8859-14 iso8859-15 cp737 
iso8859-16 big5 euc-kr macRomania macTurkish gb1988 iso2022-kr macGreek ascii cp437 macRoman 
iso8859-1 iso8859-2 iso8859-3 macCroatian koi8-r iso8859-4 ebcdic iso8859-5 cp1250 macCyrillic 
iso8859-6 cp1251 macDingbats koi8-u iso8859-7 cp1252 iso8859-8 cp1253 iso8859-9 cp1254 cp1255 
cp850 cp1256 cp932 identity cp1257 cp852 macJapan cp1258 shiftjis utf-8 cp855 cp936 symbol cp775 unicode cp857

Has anyone out there worked out the issues?

WJP 2008-01-31

iconv supports both GBK and GB18030, so it might be helpful to look at its source, or possibly just to run iconv.
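
Both routes can be sketched roughly as follows. The encoding list above contains cp936, Microsoft's Simplified Chinese code page, which is closely related to GBK; using it as a stand-in for GBK is an assumption, and it does not cover the four-byte range of GB18030. The helper names pick_encoding and read_via_iconv are hypothetical, and the second one assumes an iconv executable on the PATH that accepts the -f and -t options.

```tcl
# Return the first name in $candidates that this Tcl build lists in
# [encoding names], or an empty string if none of them is available.
proc pick_encoding {candidates} {
    set known [encoding names]
    foreach name $candidates {
        if {[lsearch -exact $known $name] >= 0} {
            return $name
        }
    }
    return ""
}

# Hypothetical helper: let an external iconv decode the file to UTF-8
# and read the result through a pipe.
proc read_via_iconv {file {from GB18030}} {
    set pipe [open |[list iconv -f $from -t UTF-8 $file] r]
    fconfigure $pipe -encoding utf-8
    set text [read $pipe]
    close $pipe
    return $text
}

# Prefer a native encoding if the build has one, most specific first.
set enc [pick_encoding {gb18030 gbk cp936 gb2312}]
if {$enc ne ""} {
    binary scan [encoding convertto $enc "\u4E0A\u6D77"] H* hex
    puts "$enc: $hex"
}
```

Which name is picked depends on the Tcl build. For characters that are already in GB2312 the resulting bytes should be the same under any of the four candidates, so the choice only matters for text that goes beyond that set.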

ray2501 2018-05-29

If you need to convert between Traditional and Simplified Chinese, have a look at the Open Chinese Convert project (OpenCC). OpenCC also provides a library, so I am working on tcl-opencc (Tcl bindings for OpenCC).