Converting GB code to UTF-8

xiaoxiao 2021-03-06

A long time ago I collected a function that converts GB-encoded text into UTF-8. It works together with a GB-to-Unicode mapping table (gb2312.txt) and was used to output Chinese characters in GD. Later I found that the output could come out garbled depending on what the content contained. I then found a modified version of the code, which solved the problem. The two functions are analyzed below.

First, this is the function that converts a Unicode code point to UTF-8. It is the same before and after the modification:

    function u2utf8($c) {
        for ($i = 0; $i [...] >> 6);
            $str .= (0x80 | $c & 0x3f);
        } else if ($c < 0x10000) {
            $str .= (0xE0 | $c >> 12);
            $str .= (0x80 | $c >> 6 & 0x3f);
            $str .= (0x80 | $c & 0x3f);
        } else if ($c < 0x200000) {
            $str .= (0xF0 | $c >> 18);
            $str .= (0x80 | $c >> 12 & 0x3f);
            $str .= (0x80 | $c >> 6 & 0x3f);
            $str .= (0x80 | $c & 0x3f);
        }
        return $str;
    }

This follows the UTF-8 encoding rules: depending on which range the Unicode code point falls in, different shift and bitwise AND operations are applied to convert it to UTF-8. For the rules, see the description at http://www.utf8.org/.

Next is the GB-to-UTF-8 conversion function before the modification. It calls the u2utf8 function above.

    function gb2utf8($gb)
    /* Program Writen by Sadly www.phpx.com */
    {
        if (!trim($gb))
            return $gb;
        $filename = "gb2312.txt";
        $tmp = file($filename);
        $codetable = array();
        while (list($key, $value) = each($tmp))
            $codetable[hexdec(substr($value, 0, 6))] = substr($value, 7, 6);
        $utf8 = "";
        while ($gb) {
            if (ord(substr($gb, 0, 1)) > 127) {
                $this = substr($gb, 0, 2);
                $gb = substr($gb, 2, strlen($gb));
                $utf8 .= u2utf8(hexdec($codetable[hexdec(bin2hex($this)) - 0x8080]));
            } else {
                $gb = substr($gb, 1, strlen($gb));
                $utf8 .= u2utf8(substr($gb, 0, 1));
            }
        }
        $ret = "";
        for ($i = 0; $i [...]
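The range checks and shift/mask steps used by the Unicode-to-UTF-8 function above can be sketched as a standalone helper. This is only an illustration under stated assumptions, not the author's modified code: the name `codepoint_to_utf8` is hypothetical, it returns raw bytes via `chr()` instead of appending the byte values as decimal numbers, and it caps the input at U+10FFFF (the Unicode maximum) rather than 0x200000.

```php
<?php
// Hypothetical helper, not part of the original post: encode a single
// Unicode code point as a UTF-8 byte string.
function codepoint_to_utf8(int $cp): string
{
    if ($cp < 0 || $cp > 0x10FFFF) {
        throw new InvalidArgumentException("code point out of range");
    }
    if ($cp < 0x80) {
        // 1 byte: 0xxxxxxx
        return chr($cp);
    }
    if ($cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

// U+4E2D falls in the 3-byte range.
echo bin2hex(codepoint_to_utf8(0x4E2D)), "\n"; // prints "e4b8ad"
```

Because each byte is produced with `chr()`, the result is a UTF-8 string that can be used directly and needs no later pass to turn digit groups back into bytes.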

