Converting Charsets

RaMireZ
Cadet
Posts: 6
Joined: 22 Aug 2007, 10:25

Post by RaMireZ »

Hi there,

I was trying to solve my problem with the help of the big boards, but it was no use, so I'll try to get a solution on this rather tiny one ;)

I am currently coding an email client. The main functionality is already in place: I am able to show everything that has been sent with the email. Unfortunately, I don't know how to handle the charsets.
When you receive an email, it is (if necessary) encoded in a certain charset. The charset is always given in the header of the email, so it's not an issue to find out which character set I need.
But what comes after that? At which point do I have to convert the text from the given charset? And how do I convert it? Please help me with this... it is the last thing I have to work on for my email client.
TiKu
Administrator
Posts: 832
Joined: 28 Sep 2004, 21:10
Location: München

Post by TiKu »

I'm not sure it is the best way to do this, but I think the MultiByteToWideChar API function is what you're looking for. It converts a string to UTF-16, and you can specify the character set (code page) of the input string.
If UTF-16 isn't what you want, you may use WideCharToMultiByte to convert the result of the first conversion to any other character set.

HTH
TiKu
Crunching for Fab36_Folding-Division at Folding@Home. Join Fab36/Fab30! - Folding@Home and BOINC
Boycott DRM! Boycott HDCP!

Post by RaMireZ »

OK, thanks for now...

What you said sounds good, but I wonder about the following:

1. Does Windows support UTF-16? (Since it seems to be implemented in the API, I think so.)

2. I don't really know what I am doing yet. Can you explain the concept of converting from one charset to another?
This can be independent of a programming language, but I'd be happy if someone has an example for VB.
Please tell me about the process of charset conversion.

3. Does anyone know of tutorials for this?

4. Which DLL contains the function you were talking about?

Post by TiKu »

RaMireZ wrote:1. Does Windows support UTF-16? (Since it seems to be implemented in the API, I think so.)
Yes. In fact, Visual Basic 6.0 uses UTF-16 internally for its strings.
RaMireZ wrote:2. I don't really know what I am doing yet. Can you explain the concept of converting from one charset to another?
This can be independent of a programming language, but I'd be happy if someone has an example for VB.
Please tell me about the process of charset conversion.
Not tested:

Code: Select all

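Private Declare Function MultiByteToWideChar Lib "kernel32.dll" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long

' Converts the bytes in arrInputBytes (encoded in inputCodePage) to a VB string (UTF-16).
' The input must be null-terminated (i. e. end with a 0 byte), because -1 is passed as its length.
Private Function BytesToString(arrInputBytes() As Byte, ByVal inputCodePage As Long) As String
    Dim buffer As String
    Dim bufferSize As Long

    ' first call: ask for the required size in characters, including the terminating null
    bufferSize = MultiByteToWideChar(inputCodePage, 0, VarPtr(arrInputBytes(0)), -1, 0, 0)
    If bufferSize = 0 Then Exit Function ' conversion failed, see Err.LastDllError

    buffer = String$(bufferSize, vbNullChar)
    ' second call: do the actual conversion into the buffer
    MultiByteToWideChar inputCodePage, 0, VarPtr(arrInputBytes(0)), -1, StrPtr(buffer), bufferSize
    ' strip the terminating null; the result is the converted string in UTF-16 format
    BytesToString = Left$(buffer, bufferSize - 1)
End Function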
RaMireZ wrote:4. Which DLL contains the function you were talking about?
kernel32.dll
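
One thing the code above leaves open is where inputCodePage comes from: the mail header carries a charset name, while MultiByteToWideChar expects a numeric code page. A rough, untested sketch of the mapping follows. CharsetToCodePage is a hypothetical helper, not an API function. The numbers are the standard Windows code page identifiers, and the list is far from complete.

```vb
' Hypothetical helper (not part of any API): maps a MIME charset name
' taken from the mail header to a Windows code page identifier.
Public Function CharsetToCodePage(ByVal charsetName As String) As Long
    Select Case LCase$(Trim$(charsetName))
        Case "us-ascii": CharsetToCodePage = 20127
        Case "utf-8": CharsetToCodePage = 65001
        Case "iso-8859-1": CharsetToCodePage = 28591
        Case "iso-8859-2": CharsetToCodePage = 28592
        Case "iso-8859-15": CharsetToCodePage = 28605
        Case "windows-1250": CharsetToCodePage = 1250
        Case "windows-1251": CharsetToCodePage = 1251
        Case "windows-1252": CharsetToCodePage = 1252
        Case "koi8-r": CharsetToCodePage = 20866
        ' unknown name: fall back to 0 (CP_ACP, the system's ANSI code page)
        Case Else: CharsetToCodePage = 0
    End Select
End Function
```

Falling back to the system code page for unknown names is only one possible choice. Showing a warning or treating the text as us-ascii would work as well.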

Post by RaMireZ »

I am currently not at home but at work, so I can't try it.
Usually I wouldn't ask the following questions, because I could just test it myself...

In email text, some character sequences stand for other characters. For example:

Sometimes "=3D" occurs. This sequence needs to be replaced by "=", the character it represents. Will MultiByteToWideChar be able to "translate" that? And when do I use the other function, WideCharToMultiByte?

Post by TiKu »

RaMireZ wrote:In email text, some character sequences stand for other characters. For example:

Sometimes "=3D" occurs. This sequence needs to be replaced by "=", the character it represents. Will MultiByteToWideChar be able to "translate" that?
No, it won't. This is called "quoted-printable" encoding. It is a transfer encoding applied on top of the charset, not a charset itself. Wikipedia has a good article about it. You'll first have to decode the quoted-printable text into bytes in the charset specified in the mail header. Then you can pass those bytes to MultiByteToWideChar to convert them to UTF-16.
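
The decoding step could look roughly like the following untested sketch. DecodeQuotedPrintable is a hypothetical helper name. The sketch assumes the text uses CR/LF line endings and contains only 7-bit characters, and it appends a 0 byte so that the result can be passed to MultiByteToWideChar with a length of -1.

```vb
' Hypothetical helper: decodes quoted-printable text into the raw bytes it stands for.
' Assumes CR/LF line endings and 7-bit input, as quoted-printable text should have.
Public Function DecodeQuotedPrintable(ByVal qpText As String) As Byte()
    Dim result() As Byte
    Dim pos As Long
    Dim outLen As Long
    Dim nextTwo As String

    ' the decoded data is never longer than the input;
    ' the one extra element leaves room for the terminating 0 byte
    ReDim result(0 To Len(qpText))

    pos = 1
    Do While pos <= Len(qpText)
        nextTwo = Mid$(qpText, pos + 1, 2)
        If Mid$(qpText, pos, 1) <> "=" Then
            ' ordinary character: copy it unchanged
            result(outLen) = AscW(Mid$(qpText, pos, 1)) And &HFF
            outLen = outLen + 1
            pos = pos + 1
        ElseIf nextTwo = vbCrLf Then
            ' "=" at the end of a line is a soft line break: skip it
            pos = pos + 3
        ElseIf nextTwo Like "[0-9A-Fa-f][0-9A-Fa-f]" Then
            ' "=XX": XX is the hexadecimal value of one byte
            result(outLen) = CByte(Val("&H" & nextTwo))
            outLen = outLen + 1
            pos = pos + 3
        Else
            ' malformed sequence: keep the "=" literally
            result(outLen) = 61
            outLen = outLen + 1
            pos = pos + 1
        End If
    Loop

    ' trim to the decoded length plus one 0 byte as terminator
    ReDim Preserve result(0 To outLen)
    result(outLen) = 0
    DecodeQuotedPrintable = result
End Function
```

For the input "a=3Db" this yields the bytes 97, 61, 98 and the terminating 0, i. e. "a=b". The bytes are still in the charset from the mail header, so the charset conversion comes after this step.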
RaMireZ wrote:And when do I use the other function, WideCharToMultiByte?
You use it to convert a UTF-16 string to another charset. This may become helpful if you want to send e-mails.
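
A rough, untested sketch of that opposite direction follows. The Declare matches the Win32 signature of WideCharToMultiByte; StringToBytes and its parameter names are hypothetical. Characters that do not exist in the target code page are replaced by the code page's default character (usually "?").

```vb
Private Declare Function WideCharToMultiByte Lib "kernel32.dll" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long, ByVal lpMultiByteStr As Long, ByVal cbMultiByte As Long, ByVal lpDefaultChar As Long, ByVal lpUsedDefaultChar As Long) As Long

' Hypothetical helper: converts a VB string (UTF-16) to bytes in the given code page.
' Returns an unallocated array if the input is empty or the conversion fails.
Public Function StringToBytes(ByVal inputText As String, ByVal outputCodePage As Long) As Byte()
    Dim result() As Byte
    Dim byteCount As Long

    If Len(inputText) = 0 Then Exit Function

    ' first call: ask how many bytes the converted text needs
    byteCount = WideCharToMultiByte(outputCodePage, 0, StrPtr(inputText), Len(inputText), 0, 0, 0, 0)
    If byteCount = 0 Then Exit Function ' conversion failed, see Err.LastDllError

    ReDim result(0 To byteCount - 1)
    ' second call: do the actual conversion into the byte array
    WideCharToMultiByte outputCodePage, 0, StrPtr(inputText), Len(inputText), VarPtr(result(0)), byteCount, 0, 0
    StringToBytes = result
End Function
```

Because the length of the string is passed explicitly, the resulting byte array carries no terminating 0 byte.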

Post by RaMireZ »

Alright, thanks a lot for now.

I will try this at home and tell you whether I was successful.


(I knew those smaller boards are sometimes more useful ;) )

Post by RaMireZ »

I'm sorry for double posting... but I would appreciate it if you could maybe post an example. Only if it's no trouble for you, of course ;)

Post by TiKu »

RaMireZ wrote:I'm sorry for double posting... but I would appreciate it if you could maybe post an example. Only if it's no trouble for you, of course ;)
An example for decoding quoted-printable text and converting it to UTF-16? Sorry, my time is very limited. Also, I have never done this before, so I'm starting from the same point as you.
With the Wikipedia article, Google, and the code I gave you above, it shouldn't be that difficult.

Post by RaMireZ »

Ah OK, sorry, it sounded like you had done this before.

I will tell you later or tomorrow whether I was able to cope with it.