How can I convert UTF-8 to ASCII, or UTF-8 to TIS-620?

I use SQLite to insert Thai text into a database. When I preview the result, the field is unreadable ('อ่านไม่ออก').
I need a function that decodes it to ASCII, or converts UTF-8 to TIS-620.
Here is sample PHP code, but I cannot convert it to VB code:
function utf8_to_tis620($string) {
    $str = $string;
    $res = "";
    for ($i = 0; $i < strlen($str); $i++) {
        if (ord($str[$i]) == 224) {
            $unicode = ord($str[$i+2]) & 0x3F;
            $unicode |= (ord($str[$i+1]) & 0x3F) << 6;
            $unicode |= (ord($str[$i]) & 0x0F) << 12;
            $res .= chr($unicode-0x0E00+0xA0);
            $i += 2;
        } else {
            $res .= $str[$i];
        }
    }
    return $res;
}
 

agraham

Expert
Licensed User
This is a literal translation of that PHP code. However, it will not work in B4ppc, because all strings are UTF-16 within the .NET environment in which B4ppc runs, so a string holds characters rather than raw UTF-8 bytes.
B4X:
Sub string_utf8_to_tis620(string)
   str = string
   res = ""
   For i = 0 To StrLength(str) - 1
      ' 224 (0xE0) is the UTF-8 lead byte used by all Thai characters
      If (Asc(SubString(str,i,1)) = 224) Then
         unicode = Asc(SubString(str,i+2,1)) Mod 64
         unicode = unicode + (Asc(SubString(str,i+1,1)) Mod 64) * 64
         unicode = unicode + (Asc(SubString(str,i,1)) Mod 16) * 4096
         ' 3584 = 0x0E00 (start of the Thai block), 160 = 0xA0 (TIS-620 offset)
         res = res & Chr(unicode-3584+160)
         i = i + 2
      Else
         res = res & SubString(str,i,1)
      End If
   Next
   Return res
End Sub
If the UTF-8 string is available as a byte array within B4ppc, then this might work:
B4X:
Sub bytes_utf8_to_tis620
   ' Assumes bstr() is a global byte array holding the UTF-8 data
   res = ""
   For i = 0 To ArrayLen(bstr()) - 1
      If (bstr(i) = 224) Then
         ' Parentheses make the intended order of Mod and * explicit
         unicode = bstr(i+2) Mod 64
         unicode = unicode + (bstr(i+1) Mod 64) * 64
         unicode = unicode + (bstr(i) Mod 16) * 4096
         'Msgbox(unicode) ' uncomment to inspect each decoded code point
         res = res & Chr(unicode-3584+160)
         i = i + 2
      Else
         res = res & Chr(bstr(i))
      End If
   Next
   Return res
End Sub
But this begs the question as to why you want some sort of "ASCII" data within a Unicode environment :confused:
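For readers who want to check the byte arithmetic used above outside B4ppc, here is a minimal sketch in Python (chosen only for illustration; it is not part of the original posts). It assumes the input is a byte sequence containing only ASCII and Thai characters. The function name `utf8_to_tis620` and the `?` fallback for non-Thai sequences are assumptions of this sketch, not something the PHP code does.

```python
def utf8_to_tis620(data: bytes) -> bytes:
    """Convert UTF-8 bytes holding ASCII and Thai text to TIS-620 bytes."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        # 0xE0 (224) is the lead byte of every 3-byte UTF-8 Thai character.
        if b == 0xE0 and i + 2 < len(data):
            code_point = (
                ((b & 0x0F) << 12)
                | ((data[i + 1] & 0x3F) << 6)
                | (data[i + 2] & 0x3F)
            )
            if 0x0E01 <= code_point <= 0x0E5B:
                # Thai block starts at U+0E00; TIS-620 places it at 0xA0.
                out.append(code_point - 0x0E00 + 0xA0)
            else:
                # Assumption of this sketch: non-Thai characters become '?'.
                out.append(ord("?"))
            i += 3
        else:
            # ASCII bytes are copied unchanged.
            out.append(b)
            i += 1
    return bytes(out)


if __name__ == "__main__":
    sample = "กข abc".encode("utf-8")
    print(utf8_to_tis620(sample))
```

In an environment whose strings are already Unicode, the same result normally comes from the built-in codec at the point where bytes are written, for example `"กข".encode("tis_620")` in Python, which is why the manual loop is rarely needed.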
 
Thank you very much.
I would also like to convert this function from PHP to VB. Thank you.
function tis2utf8($tis) {
    $utf8 = ""; // initialise the result before appending to it
    for ($i = 0; $i < strlen($tis); $i++) {
        $s = substr($tis, $i, 1);
        $val = ord($s);
        if ($val < 0x80) {
            // ASCII bytes are copied unchanged
            $utf8 .= $s;
        } elseif ((0xA1 <= $val and $val <= 0xDA) or (0xDF <= $val and $val <= 0xFB)) {
            // Thai bytes map to U+0E01..U+0E5B and are written as 3-byte UTF-8
            $unicode = 0x0E00 + $val - 0xA0;
            $utf8 .= chr(0xE0 | ($unicode >> 12));
            $utf8 .= chr(0x80 | (($unicode >> 6) & 0x3F));
            $utf8 .= chr(0x80 | ($unicode & 0x3F));
        }
    }
    return $utf8;
}
 

agraham

Expert
Licensed User
You appear to want to convert a single-byte character string in the TIS-620 code page to UTF-8. As I pointed out before, this doesn't work, and may not even be necessary, in a .NET-based application, where strings are UTF-16. You need to analyse why you think you need this.
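To make the conversion being discussed concrete, here is a minimal Python sketch of the same TIS-620 to UTF-8 byte mapping as the PHP function (Python is used only for illustration, and the name `tis620_to_utf8` is hypothetical). It works on raw bytes, which is the only place such a conversion makes sense. Like the PHP code, it silently drops bytes that are not defined in TIS-620.

```python
def tis620_to_utf8(data: bytes) -> bytes:
    """Convert TIS-620 bytes (ASCII plus Thai) to UTF-8 bytes."""
    out = bytearray()
    for b in data:
        if b < 0x80:
            # ASCII bytes are copied unchanged.
            out.append(b)
        elif 0xA1 <= b <= 0xDA or 0xDF <= b <= 0xFB:
            # Map the TIS-620 byte into the Thai block starting at U+0E00.
            code_point = 0x0E00 + b - 0xA0
            # Emit the 3-byte UTF-8 form of the code point.
            out.append(0xE0 | (code_point >> 12))
            out.append(0x80 | ((code_point >> 6) & 0x3F))
            out.append(0x80 | (code_point & 0x3F))
        # Any other byte is undefined in TIS-620 and is dropped, as in the PHP.
    return bytes(out)


if __name__ == "__main__":
    print(tis620_to_utf8(b"\xa1\xa2 abc").decode("utf-8"))
```

When a runtime already stores strings as Unicode, the usual approach is to decode the TIS-620 bytes once on input (in Python, `data.decode("tis_620")`) and encode to UTF-8 only when writing out, rather than mapping bytes by hand.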
 