Converting an AnsiString to a Unicode String

I'm converting a D2006 program to D2010. I have a value stored as a single-byte-per-character string in my database, and I need to load it into a control that has a LoadFromStream method. My plan was to write the string to a stream and pass that stream to LoadFromStream, but it did not work. In studying the problem, I found an issue that tells me I don't really understand how conversion from AnsiString to Unicode string works. Here is a piece of standalone code that illustrates the issue I am confused by:

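procedure TForm1.Button1Click(Sender: TObject);
{$O-}
var
  sBuffer: String;
  oStringStream: TStringStream;
  sAnsiString: AnsiString;
  sUnicodeString: String;
  iSize1,
  iSize2: Word;
begin
  sAnsiString := '12345';
  oStringStream := TStringStream.Create(sBuffer);
  sUnicodeString := sAnsiString;  // implicit AnsiString -> UnicodeString conversion
  iSize1 := StringElementSize(sAnsiString);     // bytes per character of the ANSI string
  iSize2 := StringElementSize(sUnicodeString);  // bytes per character of the Unicode string
  oStringStream.WriteString(sUnicodeString);
end;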

If you break on the last line, and inspect the Bytes property of oStringStream, you will see that it looks like this:

Bytes (49 {$31}, 50 {$32}, 51 {$33}, 52 {$34}, 53 {$35}

I was expecting that it might look something like

(49 {$31}, 00 {$00}, 50 {$32}, 00 {$00}, 51 {$33}, 00 {$00}, 
 52 {$34}, 00 {$00}, 53 {$35}, 00 {$00} ...

Apparently my expectations are in error. But then, how do I convert an AnsiString to Unicode?

I'm not getting the right results out of LoadFromStream because it reads from the stream two bytes at a time, but the data it receives is not arranged that way. What should I do to give LoadFromStream a well-formed stream of data based on a Unicode string?

Thank you for your help.

What is the type of the oStringStream.WriteString's parameter? If it is AnsiString, you have an implicit conversion from Unicode to Ansi and that explains your example.

Updated: Now the real question is how TStringStream stores data internally. Consider the following code sample (Delphi 2009):

procedure TForm1.Button1Click(Sender: TObject);
var
  S: string;
  SS: TStringStream;
begin
  S := 'asdfg';
  SS := TStringStream.Create(S);  // 1 byte per char
  try
    Label1.Caption := SS.DataString;
  finally
    SS.Free;
  end;
end;

TStringStream internally uses the default system ANSI encoding (1 byte per char). The constructor and the WriteString method convert a string argument from Unicode to ANSI.

To override this behaviour you must declare the encoding explicitly in the constructor:

procedure TForm1.Button1Click(Sender: TObject);
var
  S: string;
  SS: TStringStream;
begin
  S := 'asdfg';
  SS := TStringStream.Create(S, TEncoding.Unicode);  // 2 bytes per char
  try
    Label1.Caption := SS.DataString;
  finally
    SS.Free;
  end;
end;
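
To see the difference between the two constructors without a form, a minimal console sketch along these lines may help. The program name is made up for illustration, and it assumes Delphi 2009 or later with the classic unit names (newer versions may need System.SysUtils and System.Classes):

```pascal
program StreamSizeSketch;  // hypothetical name, for illustration only
{$APPTYPE CONSOLE}
uses
  SysUtils, Classes;
var
  SS: TStringStream;
begin
  // Default constructor: the text is stored using TEncoding.Default
  SS := TStringStream.Create('asdfg');
  try
    Writeln('Default encoding: ', SS.Size, ' bytes');  // 5 for this ASCII-only text
  finally
    SS.Free;
  end;

  // Explicit UTF-16 encoding: two bytes per character, no BOM is written
  SS := TStringStream.Create('asdfg', TEncoding.Unicode);
  try
    Writeln('TEncoding.Unicode: ', SS.Size, ' bytes'); // 10
  finally
    SS.Free;
  end;
end.
```

Comparing Size rather than inspecting Bytes avoids being misled by any spare capacity in the underlying buffer.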

In recent versions of Delphi you can use TEncoding.
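
The code that originally accompanied this answer is not available here. As a rough sketch of what a TEncoding-based conversion can look like (the program and variable names are invented for illustration, and Delphi 2009 or later is assumed):

```pascal
program EncodingSketch;  // hypothetical name, for illustration only
{$APPTYPE CONSOLE}
uses
  SysUtils;
var
  A: AnsiString;
  U: string;
  AnsiBytes, Utf16Bytes: TBytes;
begin
  A := '12345';
  U := string(A);  // AnsiString -> UnicodeString, using the AnsiString's code page

  // Encode the same text with the default encoding and with UTF-16
  AnsiBytes := TEncoding.Default.GetBytes(U);
  Utf16Bytes := TEncoding.Unicode.GetBytes(U);
  Writeln(Length(AnsiBytes));   // 5
  Writeln(Length(Utf16Bytes));  // 10

  // Byte arrays can also be converted directly between encodings
  Utf16Bytes := TEncoding.Convert(TEncoding.Default, TEncoding.Unicode, AnsiBytes);
  Writeln(Length(Utf16Bytes));  // 10
end.
```

The resulting TBytes can then be written to any stream with WriteBuffer if a two-bytes-per-character layout is what the consumer expects.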


I think you want to use:

LoadFromStream(stream, TEncoding.ASCII);

If your single-byte text is not ASCII but is based on a code page, then this might work:

LoadFromStream(stream, TEncoding.GetEncoding(1252));

where the "1252" is the code page that your single byte text is based on.
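
A sketch of that approach follows. The control in the question is not named, so a TStringList stands in for it here; code page 1252 is only an example, and the program name is made up. The raw single-byte data goes into the stream unchanged, and the encoding argument tells LoadFromStream how to decode it:

```pascal
program LoadAnsiSketch;  // hypothetical name, for illustration only
{$APPTYPE CONSOLE}
uses
  SysUtils, Classes;
var
  A: AnsiString;
  MS: TMemoryStream;
  SL: TStringList;   // stand-in for the real control (assumption)
  Enc: TEncoding;
begin
  A := '12345';
  MS := TMemoryStream.Create;
  SL := TStringList.Create;
  Enc := TEncoding.GetEncoding(1252);  // returns a new instance that the caller frees
  try
    // Write the raw single-byte data, one byte per character
    MS.WriteBuffer(PAnsiChar(A)^, Length(A));
    MS.Position := 0;  // rewind before reading

    // Decode the bytes using the stated code page
    SL.LoadFromStream(MS, Enc);
    Writeln(SL.Text);
  finally
    Enc.Free;
    SL.Free;
    MS.Free;
  end;
end.
```

This only applies if the real control's LoadFromStream offers an overload that accepts a TEncoding, as the answer above assumes.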

The stream format largely depends on TStringStream.Encoding. In your example, the code page used should be the same as that of sBuffer (see the implementation of TStringStream.Create).

Since oStringStream.WriteString(sUnicodeString); seems to save single bytes, I'd assume sBuffer is an AnsiString or a RawByteString.

Now, why does the reading fail? You have yet to show us an example of how you read that stream back in.
