DarkBASIC Professional Discussion / How do I convert UTF-8 (unicode) to ASCII (Latin-1)?

Tyrone M
18
Years of Service
User Offline
Joined: 25th Jan 2008
Location: Minnesota, USA
Posted: 1st Nov 2014 22:33
Help please.
I'm having problems with reading string characters. I believe that what I'm inputting is in Unicode and therefore causing me problems. The problem arises when I try to perform character by character comparisons on the input strings.

How can I convert this?

You can see the "input" text and the result of a simple read string from file and print here:

http://www.djfunk.com

Thank you
Rudolpho
20
Years of Service
User Offline
Joined: 28th Dec 2005
Location: Sweden
Posted: 2nd Nov 2014 10:26
Those characters aren't part of the ASCII standard, which only covers 128 unique characters. The remaining 128 byte values (128 to 255) can be mapped in several different ways in DBPro. These mappings are referred to as charsets, which you can set using an optional second argument to the SET TEXT FONT function.
See the "Principles/ASCII character codes" section of the DBPro help files for a list of the available ones.
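
To illustrate the charset point, here is a minimal sketch in Python (not DBPro; the function name describe_byte is made up for this example). It decodes a single byte value under three common single-byte charsets, showing that values below 128 agree everywhere while values above 127 depend on the charset in use.

```python
def describe_byte(value: int) -> dict:
    """Decode one byte value under a few single-byte charsets."""
    raw = bytes([value])
    return {
        "cp1252": raw.decode("cp1252"),    # Windows Western European ("ANSI")
        "cp437": raw.decode("cp437"),      # original IBM PC / DOS charset
        "latin-1": raw.decode("latin-1"),  # ISO 8859-1
    }


if __name__ == "__main__":
    print(describe_byte(0x41))  # below 128: the same letter in every charset
    print(describe_byte(0xE9))  # above 127: the result depends on the charset
```

The three charsets are only examples; which ones DBPro actually offers is listed in the help section mentioned above.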

Tyrone M
18
Years of Service
User Offline
Joined: 25th Jan 2008
Location: Minnesota, USA
Posted: 2nd Nov 2014 16:38
Sorry Rudolpho. I knew someone would bring up SET TEXT commands.

My problem has nothing to do with displaying text on the screen. It deals with reading & writing strings to a file.

The TEXT I'm READING FROM a FILE is apparently in UNICODE. Therefore when I WRITE the TEXT back TO a FILE certain characters are garbled.

In the output I put on www.djfunk.com the string data in the file is exactly the same as what's displayed on the screen.

There must be a way to convert the characters most likely using some sort of bit functions. I just don't know how.

Any DBP bit-twiddlers out there? Anybody? Thanks!
Rudolpho
20
Years of Service
User Offline
Joined: 28th Dec 2005
Location: Sweden
Posted: 2nd Nov 2014 20:49 Edited at: 2nd Nov 2014 21:53
Ah, I see.
The problem then is that your desired strings are in Unicode (presumably UTF-16, given the two-byte characters) rather than extended ASCII. The main issue with this is that UTF-16 strings use 2 bytes for most characters, while DBPro's strings only use single-byte characters. Storing such strings directly would therefore require you to implement the entire range of needed string operations yourself.
As for writing a Unicode text file, most text-reading applications will interpret a .txt file as Unicode (UTF-16) if it begins with the special byte-order mark word (0xfeff), and your program could apply the same check when reading text files.
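
As a rough illustration of the bit-level idea, here is a minimal Python sketch (not DBPro; the function name and the "?" replacement are assumptions made for this example). It assumes the input is little-endian UTF-16, skips the byte-order mark if present, joins each byte pair into one 16-bit value, and keeps only the values that fit in a single Latin-1 byte.

```python
def utf16le_to_latin1(data: bytes, replacement: int = ord("?")) -> bytes:
    """Collapse 2-byte UTF-16LE units into single Latin-1 bytes."""
    # Skip the byte-order mark (0xFEFF stored little-endian is FF FE).
    if data[:2] == b"\xff\xfe":
        data = data[2:]
    out = bytearray()
    # Walk the data two bytes at a time; a trailing odd byte is ignored.
    for i in range(0, len(data) - 1, 2):
        code = data[i] | (data[i + 1] << 8)  # low byte first, then high byte
        # Code points 0..255 are identical in Unicode and Latin-1.
        out.append(code if code < 0x100 else replacement)
    return bytes(out)


if __name__ == "__main__":
    sample = b"\xff\xfe" + "Gr\u00fc\u00dfe".encode("utf-16-le")
    print(utf16le_to_latin1(sample))
```

Anything outside the Latin-1 range becomes "?", and characters stored as surrogate pairs would show up as two "?" in this sketch. The same loop should port to any language that can read a file byte by byte.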

Edit: In line with the above, a text file will be interpreted as UTF-8 if the byte sequence 0xef 0xbb 0xbf is written as a header at the start of the file.
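
If the file turns out to be UTF-8 (as the thread title suggests), the conversion can also be done with bit operations. The following Python sketch (not DBPro; the function name and the "?" replacement are assumptions for illustration) strips the 0xef 0xbb 0xbf header, passes plain ASCII through, and rebuilds Latin-1 characters from two-byte sequences of the form 110xxxxx 10yyyyyy.

```python
def utf8_to_latin1(data: bytes, replacement: int = ord("?")) -> bytes:
    """Convert UTF-8 bytes to Latin-1 using only bit operations."""
    # Drop the UTF-8 byte-order mark if the file starts with one.
    if data[:3] == b"\xef\xbb\xbf":
        data = data[3:]
    out = bytearray()
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            # Plain ASCII: copy through unchanged.
            out.append(lead)
            i += 1
        elif ((lead & 0xE0) == 0xC0 and i + 1 < len(data)
              and (data[i + 1] & 0xC0) == 0x80):
            # Two-byte sequence: 5 payload bits, then 6 payload bits.
            code = ((lead & 0x1F) << 6) | (data[i + 1] & 0x3F)
            out.append(code if code < 0x100 else replacement)
            i += 2
        else:
            # Longer or malformed sequence: emit one replacement and
            # skip any continuation bytes (10xxxxxx) that follow.
            out.append(replacement)
            i += 1
            while i < len(data) and (data[i] & 0xC0) == 0x80:
                i += 1
    return bytes(out)


if __name__ == "__main__":
    print(utf8_to_latin1(b"\xef\xbb\xbf" + "Gr\u00fc\u00dfe".encode("utf-8")))
```

Only code points up to 255 exist in Latin-1, so anything encoded with three or four bytes is replaced rather than converted.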

mr_d
DBPro Tool Maker
19
Years of Service
User Offline
Joined: 26th Mar 2007
Location: Somewhere In Australia
Posted: 3rd Nov 2014 15:38 Edited at: 3rd Nov 2014 15:38
Hi Tyrone M. A simple option, if suitable (found on the web through Google), is to use the following command to pre-process your input file:

cmd /a /c type input_unicode.txt>output_ansii.txt

This can easily be done using DBP's EXECUTE FILE command to generate a temporary intermediate file that you can use to read in and do your character comparisons.
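
For anyone who would rather not depend on cmd, the same pre-processing step can be sketched in Python. This is only an illustration under assumptions: the function name convert_to_ansi is made up, the input is assumed to be UTF-16 with a byte-order mark, and Windows-1252 is assumed as the target "ANSI" code page.

```python
import os
import tempfile


def convert_to_ansi(src_path: str, dst_path: str) -> None:
    """Re-encode a UTF-16 text file (with BOM) as Windows-1252."""
    # "utf-16" reads the BOM to pick the byte order and then drops it.
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding="utf-16", newline="") as src:
        text = src.read()
    # errors="replace" writes "?" for anything cp1252 cannot represent.
    with open(dst_path, "w", encoding="cp1252", errors="replace",
              newline="") as dst:
        dst.write(text)


if __name__ == "__main__":
    # Small self-contained demo using temporary files.
    with tempfile.TemporaryDirectory() as folder:
        src = os.path.join(folder, "input_unicode.txt")
        dst = os.path.join(folder, "output_ansii.txt")
        with open(src, "wb") as f:
            f.write(b"\xff\xfe" + "na\u00efve\r\n".encode("utf-16-le"))
        convert_to_ansi(src, dst)
        with open(dst, "rb") as f:
            print(f.read())
```

The resulting file holds one byte per character, so it can be read and compared character by character with ordinary single-byte string functions.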

Tyrone M
18
Years of Service
User Offline
Joined: 25th Jan 2008
Location: Minnesota, USA
Posted: 3rd Nov 2014 18:30
mr_d,
that is a solution that would work. The input file can be pre-processed in that manner.

Thank you
And thanks Rudolpho too.
mr_d
DBPro Tool Maker
19
Years of Service
User Offline
Joined: 26th Mar 2007
Location: Somewhere In Australia
Posted: 4th Nov 2014 02:47
That's good, and you're welcome. Glad this solution works for you.

Guido Italy
20
Years of Service
User Offline
Joined: 25th Dec 2005
Location:
Posted: 30th Jan 2015 21:23
Hi!

Please, how do I use this command?

cmd /a /c type input_unicode.txt>output_ansii.txt

Thanks
Guido Italy
20
Years of Service
User Offline
Joined: 25th Dec 2005
Location:
Posted: 30th Jan 2015 21:26
To be more precise:

I'm Italian. How can I read special characters (e.g. accented letters) from a txt file (in German, for example) and display them correctly in an "edit control" of BlueGui?
Tyrone M
18
Years of Service
User Offline
Joined: 25th Jan 2008
Location: Minnesota, USA
Posted: 30th Jan 2015 21:35
Guido,

I would use Google translate to attempt this.

The suggestion to try:
cmd /a /c type your_input_unicode.txt > your_output_ansii.txt
was meant to convert Unicode to ASCII (English, I would presume). You enter this in Windows at the command prompt: Windows-R (that Windows key

You'll probably get better answers from others.
Guido Italy
20
Years of Service
User Offline
Joined: 25th Dec 2005
Location:
Posted: 30th Jan 2015 22:27
Thanks, Tyrone.

My problem is with "non-Latin" characters and the BlueGui EditorGadget.
