Returning Strings from a C++ API to C#

1. Introduction.

1.1 APIs that return strings are very common. However, both the internals of such APIs and their use from managed code require special attention. This post addresses both concerns.

1.2 I will present several techniques for returning an unmanaged string to managed code. But before that I shall first provide an in-depth explanation of the low-level activities that go on behind the scenes. This will pave the way towards an easier understanding of the code presented later in this post.

2. Behind the Scenes.

2.1 Let’s say we want to declare and use an API written in C++ with the following signature :

char* __stdcall StringReturnAPI01();

This API is to simply return a NULL-terminated character array (a C string).

2.2 To start with, note that a C string has no direct representation in managed code. Hence we cannot simply return a C string and expect the CLR to transform it into a managed string without being told how to interpret it.

2.3 The managed string is non-blittable. It can have several representations in unmanaged code : e.g. C-style strings (ANSI and Unicode-based) and BSTRs. Hence, it is important that you specify this information in the declaration of the unmanaged API, e.g. :

[DllImport("<path to DLL>", CharSet = CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
[return: MarshalAs(UnmanagedType.LPStr)]
public static extern string StringReturnAPI01();

In the above declaration, note that the following line :

[return: MarshalAs(UnmanagedType.LPStr)]

indicates that the return value from the API is to be treated as a NULL-terminated ANSI character array (i.e. a typical C-style string).

2.4 The CLR then uses this unmanaged C-style string return value to create a managed string object. This is likely achieved by using the Marshal.PtrToStringAnsi() method with the incoming string pointer treated as an IntPtr.

2.5 A very important concept which is part and parcel of the whole API calling operation is memory ownership. It is important because it determines who is responsible for the deallocation of the memory. The StringReturnAPI01() API returns a string. The string should thus be considered equivalent to an “out” parameter: it is owned by the receiver of the string, i.e. the C# client code. More precisely, it is the CLR’s Interop Marshaler that is the actual receiver.

2.6 Now being the owner of the returned string, the Interop Marshaler is at liberty to free the memory associated with the string. This is precisely what will happen. When the Interop Marshaler has used the returned string to construct a managed string object, the NULL-terminated ANSI character array pointed to by the returned character pointer will be deallocated.

2.7 Hence it is very important to note the general protocol : the unmanaged code will allocate the memory for the string and the managed side will deallocate it. This is the same basic requirement of “out” parameters.

2.8 Under this protocol, there are 2 basic ways that memory for an unmanaged string can be allocated (in unmanaged code) and then automatically deallocated by the CLR (more specifically, the Interop Marshaler) :

  • CoTaskMemAlloc()/Marshal.FreeCoTaskMem().
  • SysAllocString/Marshal.FreeBSTR().

Hence if the unmanaged side used CoTaskMemAlloc() to allocate the string memory, the CLR will use the Marshal.FreeCoTaskMem() method to free this memory.

The SysAllocString/Marshal.FreeBSTR() pair will only be used if the return type is specified as being a BSTR. This is not relevant to the example given in point 2.1 above. I will demonstrate a use of this pair in section 4 later.

2.9 N.B. : Note that the unmanaged side must not use the “new” keyword or the “malloc()” C function to allocate memory. The Interop Marshaler will not be able to free the memory in these situations. This is because the “new” keyword is compiler dependent and the “malloc()” function is C-library dependent, so the marshaler has no way of reaching the matching deallocator. CoTaskMemAlloc() and SysAllocString(), on the other hand, are standard Windows APIs.

Another important note is that although GlobalAlloc() is also a standard Windows API and Marshal.FreeHGlobal() looks like its managed counterpart, the Interop Marshaler will only use the Marshal.FreeCoTaskMem() method for automatic memory freeing of NULL-terminated strings allocated in unmanaged code. Hence do not use GlobalAlloc() unless you intend to free the allocated memory by hand (an example of manual freeing with Marshal.FreeHGlobal() is given in section 6 below). Note also that Marshal.FreeHGlobal() is documented as a wrapper over the Win32 LocalFree() function, so its documented unmanaged counterpart is LocalAlloc() rather than GlobalAlloc().
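The allocation rule in section 2 can be captured in one small helper so that every string handed to the marshaler comes from the matching allocator. The following is a minimal sketch, not part of the original sample code: the names MarshalerAlloc(), MarshalerFree() and DuplicateForMarshaler() are hypothetical, and the non-Windows branch assumes a .NET runtime whose Marshal.FreeCoTaskMem() ends up calling free().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#ifdef _WIN32
#include <objbase.h>   // CoTaskMemAlloc(), CoTaskMemFree()
#endif

// Hypothetical helper: allocate from the heap the Interop Marshaler frees from.
inline void* MarshalerAlloc(std::size_t cb)
{
#ifdef _WIN32
    return ::CoTaskMemAlloc(cb);
#else
    return std::malloc(cb);   // assumption: the runtime pairs this with free()
#endif
}

// Hypothetical helper: the unmanaged mirror of Marshal.FreeCoTaskMem().
inline void MarshalerFree(void* p)
{
#ifdef _WIN32
    ::CoTaskMemFree(p);
#else
    std::free(p);
#endif
}

// Returns a copy of 'source' that the marshaler may free, or NULL on failure.
char* DuplicateForMarshaler(const char* source)
{
    if (source == nullptr)
    {
        return nullptr;
    }
    // Size in bytes, including the terminating NULL character.
    const std::size_t cb = std::strlen(source) + 1;
    char* copy = static_cast<char*>(MarshalerAlloc(cb));
    if (copy != nullptr)
    {
        std::memcpy(copy, source, cb);
    }
    return copy;
}
```

An exported API would then simply return DuplicateForMarshaler("Hello World"). A NULL return is marshaled to managed code as a null string reference.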

3. Sample Code.

3.1 In this section, based on the principles presented in section 2, I shall present sample code to demonstrate how to return a string from an unmanaged API and how to declare such an API in managed code.

3.2 The following is a listing of the C++ function which uses CoTaskMemAlloc() :

extern "C" __declspec(dllexport) char*  __stdcall StringReturnAPI01()
{
    char szSampleString[] = "Hello World";
    // Size in bytes, including the terminating NULL character.
    ULONG ulSize = strlen(szSampleString) + sizeof(char);
    char* pszReturn = NULL;

    pszReturn = (char*)::CoTaskMemAlloc(ulSize);
    // Return NULL if the allocation failed.
    if (pszReturn == NULL)
    {
        return NULL;
    }
    // Copy the contents of szSampleString
    // to the memory pointed to by pszReturn.
    strcpy(pszReturn, szSampleString);
    // Return pszReturn.
    return pszReturn;
}

3.4 The C# declaration and sample call :

[DllImport("<path to DLL>", CharSet = CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
[return: MarshalAs(UnmanagedType.LPStr)]
public static extern string StringReturnAPI01();

static void CallUsingStringAsReturnValue()
{
  string strReturn01 = StringReturnAPI01();
  Console.WriteLine("Returned string : " + strReturn01);
}

3.5 Note the argument used for the MarshalAsAttribute : UnmanagedType.LPStr. This indicates to the Interop Marshaler that the return string from StringReturnAPI01() is a pointer to a NULL-terminated ANSI character array.

3.6 What happens under the covers is that the Interop Marshaler uses this pointer to construct a managed string. It likely uses the Marshal.PtrToStringAnsi() method to perform this. The Interop Marshaler will then use the Marshal.FreeCoTaskMem() method to free the character array.

4. Using a BSTR.

4.1 In this section, I shall demonstrate how to allocate a BSTR in unmanaged code and return it to managed code, with the memory deallocated automatically.

4.2 Here is a sample C++ code listing :

extern "C" __declspec(dllexport) BSTR  __stdcall StringReturnAPI02()
{
  return ::SysAllocString((const OLECHAR*)L"Hello World");
}

4.3 And the C# declaration and usage :

[DllImport("<path to DLL>", CharSet = CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
[return: MarshalAs(UnmanagedType.BStr)]
public static extern string StringReturnAPI02();

static void CallUsingBSTRAsReturnValue()
{
  string strReturn = StringReturnAPI02();
  Console.WriteLine("Returned string : " + strReturn);
}

Note the argument used for the MarshalAsAttribute : UnmanagedType.BStr. This indicates to the Interop Marshaler that the return string from StringReturnAPI02() is a BSTR.

4.4 The Interop Marshaler then uses the returned BSTR to construct a managed string. It likely uses the Marshal.PtrToStringBSTR() method to perform this. The Interop Marshaler will then use the Marshal.FreeBSTR() method to free the BSTR.
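It may help to see why a BSTR needs its own free routine instead of Marshal.FreeCoTaskMem(). A BSTR pointer refers to the first character, and a 4-byte length (in bytes) is stored immediately before it, so the pointer handed to the caller is not the start of the allocation. The following portable sketch emulates that documented layout with malloc(). The Emulated* names are hypothetical and exist only for illustration; real code must use SysAllocString() and SysFreeString().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <string>

// Emulates the BSTR layout: [4-byte byte count][UTF-16 characters][NULL].
// The returned pointer addresses the first character, not the allocation.
char16_t* EmulatedSysAllocString(const char16_t* src)
{
    const std::size_t chars = std::char_traits<char16_t>::length(src);
    const std::uint32_t bytes = static_cast<std::uint32_t>(chars * sizeof(char16_t));
    unsigned char* block = static_cast<unsigned char*>(
        std::malloc(sizeof(std::uint32_t) + bytes + sizeof(char16_t)));
    if (block == nullptr)
    {
        return nullptr;
    }
    std::memcpy(block, &bytes, sizeof(bytes));                    // length prefix
    std::memcpy(block + sizeof(std::uint32_t), src, bytes);       // characters
    std::memset(block + sizeof(std::uint32_t) + bytes, 0, sizeof(char16_t)); // terminator
    return reinterpret_cast<char16_t*>(block + sizeof(std::uint32_t));
}

// Reads the byte count stored just before the first character.
std::uint32_t EmulatedSysStringByteLen(const char16_t* bstr)
{
    std::uint32_t bytes = 0;
    std::memcpy(&bytes,
                reinterpret_cast<const unsigned char*>(bstr) - sizeof(std::uint32_t),
                sizeof(bytes));
    return bytes;
}

// Frees from the true start of the block, which lies before the pointer.
void EmulatedSysFreeString(char16_t* bstr)
{
    if (bstr != nullptr)
    {
        std::free(reinterpret_cast<unsigned char*>(bstr) - sizeof(std::uint32_t));
    }
}
```

Passing such a pointer to a generic free routine would hand it an address in the middle of a block, which is why the allocate and free calls for a BSTR must always be used as a pair.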

5. Unicode Strings.

5.1 Unicode strings can be returned easily too as the following sample code will demonstrate.

5.2 Here is a sample C++ code listing :

extern "C" __declspec(dllexport) wchar_t*  __stdcall StringReturnAPI03()
{
  // Declare a sample wide character string.
  wchar_t  wszSampleString[] = L"Hello World";
  // Size in bytes, including the terminating NULL character.
  ULONG  ulSize = (wcslen(wszSampleString) * sizeof(wchar_t)) + sizeof(wchar_t);
  wchar_t* pwszReturn = NULL;

  pwszReturn = (wchar_t*)::CoTaskMemAlloc(ulSize);
  // Return NULL if the allocation failed.
  if (pwszReturn == NULL)
  {
    return NULL;
  }
  // Copy the contents of wszSampleString
  // to the memory pointed to by pwszReturn.
  wcscpy(pwszReturn, wszSampleString);
  // Return pwszReturn.
  return pwszReturn;
}

5.3 And the C# declaration and usage :

[DllImport("<path to DLL>", CharSet = CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
[return: MarshalAs(UnmanagedType.LPWStr)]
public static extern string StringReturnAPI03();

static void CallUsingWideStringAsReturnValue()
{
  string strReturn = StringReturnAPI03();
  Console.WriteLine("Returned string : " + strReturn);
}

Because a wide-character string is now returned, the UnmanagedType.LPWStr argument must be used for the MarshalAsAttribute.

5.4 The Interop Marshaler uses the returned wide-character string to construct a managed string. It likely uses the Marshal.PtrToStringUni() method to perform this. The Interop Marshaler will then use the Marshal.FreeCoTaskMem() method to free the wide-character string.
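One detail worth spelling out is the size calculation. UnmanagedType.LPWStr means a NULL-terminated string of 2-byte UTF-16 characters. On Windows, wchar_t is 2 bytes wide, so the listing in 5.2 is correct there. Where wchar_t is wider (4 bytes is typical on Linux and macOS), char16_t describes the expected layout more precisely. The sketch below shows the byte arithmetic with char16_t; Utf16BufferBytes() and DuplicateUtf16() are hypothetical names, and malloc() merely stands in for CoTaskMemAlloc() so that the code compiles anywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Bytes needed for a UTF-16 string, including the 2-byte NULL terminator.
std::size_t Utf16BufferBytes(const char16_t* s)
{
    return (std::char_traits<char16_t>::length(s) + 1) * sizeof(char16_t);
}

// Returns a heap copy of 's', or NULL if 's' is NULL or allocation fails.
// malloc() is a stand-in here; on Windows use CoTaskMemAlloc() instead.
char16_t* DuplicateUtf16(const char16_t* s)
{
    if (s == nullptr)
    {
        return nullptr;
    }
    const std::size_t cb = Utf16BufferBytes(s);
    char16_t* copy = static_cast<char16_t*>(std::malloc(cb));
    if (copy != nullptr)
    {
        std::memcpy(copy, s, cb);   // copies the characters and the terminator
    }
    return copy;
}
```

For "Hello World" this yields 11 characters plus the terminator, i.e. 12 * 2 = 24 bytes.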

6. Low-Level Handling Sample 1.

6.1 In this section, I shall present some code that will hopefully cement the reader’s understanding of the low-level activities that were explained in section 2 above.

6.2 Instead of using the Interop Marshaler to perform the marshaling and automatic memory deallocation, I shall demonstrate how this can be done by hand in managed code.

6.3 I shall use a new API which resembles the StringReturnAPI01() API which returns a NULL-terminated ANSI character array :

extern "C" __declspec(dllexport) char*  __stdcall PtrReturnAPI01()
{
  char   szSampleString[] = "Hello World";
  // Size in bytes, including the terminating NULL character.
  ULONG  ulSize = strlen(szSampleString) + sizeof(char);
  char*  pszReturn = NULL;

  pszReturn = (char*)::GlobalAlloc(GMEM_FIXED, ulSize);
  // Return NULL if the allocation failed.
  if (pszReturn == NULL)
  {
    return NULL;
  }
  // Copy the contents of szSampleString
  // to the memory pointed to by pszReturn.
  strcpy(pszReturn, szSampleString);
  // Return pszReturn.
  return pszReturn;
}

6.4 And the C# declaration :

[DllImport("<path to DLL>", CharSet = CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
public static extern IntPtr PtrReturnAPI01();

Note that this time, I have indicated that the return value is an IntPtr. There is no [return : …] declaration and so no unmarshaling will be performed by the Interop Marshaler.

6.5 And the C# low-level call :

static void CallUsingLowLevelStringManagement()
{
  // Receive the pointer to ANSI character array
  // from API.
  IntPtr pStr = PtrReturnAPI01();
  // Construct a string from the pointer.
  string str = Marshal.PtrToStringAnsi(pStr);
  // Free the memory pointed to by the pointer.
  Marshal.FreeHGlobal(pStr);
  pStr = IntPtr.Zero;
  // Display the string.
  Console.WriteLine("Returned string : " + str);
}

This code demonstrates an emulation of the Interop Marshaler in unmarshaling a NULL-terminated ANSI string. The returned pointer from PtrReturnAPI01() is used to construct a managed string. The pointer is then freed. The managed string remains intact with a copy of the returned string.

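The only difference between this code and what the Interop Marshaler actually does is that the GlobalAlloc()/Marshal.FreeHGlobal() pair is used. The Interop Marshaler always uses Marshal.FreeCoTaskMem() and expects the unmanaged code to use ::CoTaskMemAlloc(). Be aware that Marshal.FreeHGlobal() is documented as calling the Win32 LocalFree() function, so LocalAlloc(LMEM_FIXED, ...) is its documented unmanaged partner; the GlobalAlloc(GMEM_FIXED, ...) pairing shown here is not a documented one.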
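When memory is freed by hand like this, one way to sidestep the question of which managed method matches which unmanaged allocator is to have the DLL export its own free function, so that allocation and deallocation always happen inside the same module. This is a different technique from the one shown in this section, sketched here only as an alternative. PtrReturnAPI01_Alloc(), PtrReturnAPI01_Free() and the SAMPLE_* macros are hypothetical names.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

#ifdef _WIN32
#define SAMPLE_API  extern "C" __declspec(dllexport)
#define SAMPLE_CALL __stdcall
#else
#define SAMPLE_API  extern "C"
#define SAMPLE_CALL
#endif

// Allocates the string with the DLL's own C runtime heap.
SAMPLE_API char* SAMPLE_CALL PtrReturnAPI01_Alloc()
{
    static const char szSampleString[] = "Hello World";
    // sizeof includes the terminating NULL character.
    char* pszReturn = static_cast<char*>(std::malloc(sizeof(szSampleString)));
    if (pszReturn != NULL)
    {
        std::memcpy(pszReturn, szSampleString, sizeof(szSampleString));
    }
    return pszReturn;
}

// Frees a string returned by PtrReturnAPI01_Alloc() with the same heap.
// Passing NULL is harmless because free(NULL) does nothing.
SAMPLE_API void SAMPLE_CALL PtrReturnAPI01_Free(char* pszString)
{
    std::free(pszString);
}
```

On the C# side both functions would be declared with IntPtr, and the call to Marshal.FreeHGlobal() in 6.5 would be replaced by a call to the exported free function after Marshal.PtrToStringAnsi() has copied the string.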

7. Low-Level Handling Sample 2.

7.1 In this final section, I shall present one more low-level string handling technique similar to the one presented in section 6 above.

7.2 Again we do not use the Interop Marshaler to perform the marshaling and memory deallocation. Additionally, we will also not release the memory of the returned string.

7.3 I shall use a new API which simply returns a NULL-terminated Unicode character array that resides in global (static) unmanaged memory :

wchar_t gwszSampleString[] = L"Global Hello World";

extern "C" __declspec(dllexport) wchar_t*  __stdcall PtrReturnAPI02()
{
  return gwszSampleString;
}

This API returns a pointer to the pre-allocated global Unicode string “gwszSampleString”. Because it is allocated in global memory and may be shared by various functions in the DLL, it is crucial that it is not deleted.

7.4 The C# declaration for PtrReturnAPI02() is listed below :

[DllImport("<path to DLL>", CharSet = CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
public static extern IntPtr PtrReturnAPI02();

Again, there is no declaration for interop marshaling (no use of the [return : …] declaration). The IntPtr is returned as is.

7.5 And a sample C# code to manage the returned IntPtr :

static void CallUsingLowLevelStringManagement02()
{
  // Receive the pointer to Unicode character array
  // from API.
  IntPtr pStr = PtrReturnAPI02();
  // Construct a string from the pointer.
  string str = Marshal.PtrToStringUni(pStr);
  // Display the string.
  Console.WriteLine("Returned string : " + str);
}

Here, the returned IntPtr is used to construct a managed string from an unmanaged NULL-terminated Unicode string. The memory of the unmanaged Unicode string is then left alone and is not deleted.

Note that because a mere IntPtr is returned, there is no way to know whether the returned string is ANSI or Unicode. In fact, there is no way to know whether the IntPtr actually points to a NULL-terminated string at all. This must be known in advance.

7.6 Furthermore, the returned IntPtr must not point to some temporary string location (e.g. one allocated on the stack). If it did, the temporary string may be destroyed once the API returns. The following is an example :

extern "C" __declspec(dllexport) char* __stdcall PtrReturnAPI03()
{
  char szSampleString[] = "Hello World";
  return szSampleString;
}

By the time this API returns, the string contained in “szSampleString” may be completely wiped out or be filled with random data. The random data may not contain any NULL character until many bytes later. A crash may ensue from a C# call like the following :

IntPtr pStr = PtrReturnAPI03();
// Construct a string from the pointer.
string str = Marshal.PtrToStringAnsi(pStr);
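A common way to avoid returning a pointer to temporary storage is to let the caller supply the buffer, so that the API never hands out memory of its own. The sketch below assumes a hypothetical function CopyGreeting() that is not part of the sample DLL. It copies a short-lived std::string into the caller's buffer and reports the size required, which also covers the case of a local std::string whose c_str() pointer would dangle after the function returns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Copies the text into 'buffer' if it is large enough.
// Always returns the size needed in bytes, including the NULL terminator,
// so the caller can query first (buffer == NULL) and then allocate.
extern "C" std::size_t CopyGreeting(char* buffer, std::size_t bufferSize)
{
    const std::string text("Hello World");   // local object, destroyed on return
    const std::size_t needed = text.size() + 1;
    if (buffer != nullptr && bufferSize >= needed)
    {
        std::memcpy(buffer, text.c_str(), needed);
    }
    return needed;
}
```

From managed code, a function of this shape is typically called with a StringBuilder or a byte array as the buffer, and no unmanaged memory needs to be freed afterwards.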

About Lim Bio Liong

I've been in software development for nearly 20 years specializing in C++, COM and C#. It's truly an exciting time we live in, with so many resources at our disposal to gain and share knowledge. I hope my blog will serve a small part in this global knowledge sharing network. For many years now I've been deeply involved with C++ development work. However since circa 2010, my current work has required me to focus more and more on C# with a particular emphasis on COM interop. I've also written several articles for CodeProject. However, in recent years I've concentrated my time more on helping others in the MSDN forums. Please feel free to leave a comment whenever you have any constructive criticism over any of my blog posts.

Discussion

18 thoughts on “Returning Strings from a C++ API to C#”

  1. Despite the name, Marshal.FreeHGlobal is not compatible with GlobalAlloc/GlobalFree.

    Posted by Ben Voigt (Visual C++ MVP) | February 20, 2012, 3:26 pm
  2. Really good article, tnx vary much

    Posted by Casper | October 19, 2012, 1:51 am
  3. its really a nice blog. It saved my lot of time.

    Posted by Jeneesh | March 13, 2013, 2:31 pm
  4. Great article. Thanks!

    Posted by Jim | March 14, 2013, 2:48 am
  5. Hi Lim, what about to return a content of a string class object? Let’s say:

    extern “C” __declspec(dllexport) const char* __stdcall PtrStringClass() {
    string str(“Hello World”);
    return str.c_str();
    }

    Is it possible?

    Great article!

    Posted by Luciano | June 28, 2013, 12:35 am
  6. On IOS platform, is there any function to replace this windows API ::CoTaskMemAlloc? Thanks.

    Posted by aa | October 8, 2013, 8:18 am
  7. Superb article, excellent summary. Thank you Lim. Question for you if possible. I have a 3rd party DLL which I have to use in my C# project. I have no source for the DLL, and only know that the function definition from its .H. Absolutely nothing more than this (e.g. length of string returned unknown…but for the sake of argument let’s say less than 1024).

    char * FUNCTYPE ExplainSiteCodeErr(int ret);

    So I have been using the following wrapper, where upon reading your article I also tried the commented out line with the CharSet, CallingConvention, and return specified.

    //[DllImport(“SK32MMTD.dll”, CharSet = CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
    [DllImport(“SK32MMTD.dll”)]
    //[return: MarshalAs(UnmanagedType.LPStr)]
    public static extern string ExplainSiteCodeErr(Int32 nReturn);

    With either DllImport statement, I get errors with the likely scenario when ExplainSiteCodeErr() returns a NULL or unallocated string (no way for me to know…as unfortunately this 3rd party DLL is used for encryption and fails immediately on loading if a debugger is attached…which makes everything even more fun to resolve!).

    Anyhow, I am wondering if you have any ideas on how to catch the returning char* and carefully handle it. I did see this response by Bradley Grainger’s which looked interesting (and he conveyed credibility in knowing his stuff)…but not sure if there is any way to leverage a StringBuilder object that could help.
    http://stackoverflow.com/questions/10856127/passing-string-from-native-c-dll-to-c-sharp-app

    Thanks!

    Posted by David Carr | October 29, 2013, 9:31 pm
    • Hello David,

      1. >> Anyhow, I am wondering if you have any ideas on how to catch the returning char* and carefully handle it…

      1.1 The first thing that comes to mind is a custom marshaler that handles the output char* returned from the ExplainSiteCodeErr() API.

      1.2 Let me do some experiments and revert to you on this.

      2. >> but not sure if there is any way to leverage a StringBuilder object that could help…

      2.1 I do not think a StringBuilder will help.

      2.2 A StringBuilder is best used in situations where an external DLL function takes a character pointer as parameter and then proceeds to fill this buffer with character values.

      2.3 In the case of ExplainSiteCodeErr(), a character pointer is returned.

      – Bio.

      Posted by Lim Bio Liong | November 3, 2013, 9:34 am
  8. I have a 3rd party C dll that I only have the header file for. The dll has a function that returns a const char*, the header declaration follows. The comments on the header declaration states ‘@return A constant string pointer to an internal static string containing the desired information’

    I1_API
    const char* I1_GetGlobalOptionD(const char *key);

    When I use this C# code the marshaling works

    [DllImport(“i1Pro”, EntryPoint = “I1_GetGlobalOptionD”, CharSet = CharSet.Ansi, CallingConvention = CallingConvention.Cdecl)]
    public static extern IntPtr I1_GetGlobalOptionD([InAttribute()] [MarshalAsAttribute(UnmanagedType.LPStr)] string key);

    IntPtr pStr = I1_GetGlobalOptionD(“SDKVersion”);
    string str = Marshal.PtrToStringAnsi(pStr);

    If I try using the MarshalAs(UnmanagedType.LPStr) attribute and string return type in the p/invoke declaration the application crashes without error.

    [DllImport(“i1Pro”, EntryPoint = “I1_GetGlobalOptionD”, CharSet = CharSet.Ansi, CallingConvention = CallingConvention.Cdecl)]
    [return: MarshalAs(UnmanagedType.LPStr)]
    public static extern string I1_GetGlobalOptionD([InAttribute()] [MarshalAsAttribute(UnmanagedType.LPStr)] string key);

    string str = I1_GetGlobalOptionD(“SDKVersion”); //application crashes

    If possible I’d like to have the Interop do the marshaling to simplify the C# code.

    Posted by fkhan | March 5, 2014, 5:00 pm
  9. Great article, but I am having a bit of trouble with the CoTaskMemAlloc call. I am receiving a linker error: LNK2019: unresolved external symbol__imp__CoTaskMemAllow@4 referenced in function _testString@0.

    I tried to manually add the Ole32.dll library in the project options (from the C:\Windows\system32 folder), but that failed as well.

    Any thoughts?

    Posted by James Kern | July 22, 2014, 11:19 pm
  10. Hi,

    I’m passing a delegate from C# code to ‘C’ code which expects a function pointer by doing PInvoke.
    When I’m invoking that delegate from ‘C’ code,I’m passing a “Char *” buffer as IntPtr to callback method encapsulated by delegate in C# code and expects the delegate to write the value in the memory buffer pointed by IntPtr.
    Some value is getting set to the address pointed by IntPtr but when I’m printing that value to log I’m not able to get the value(which is a string) in the readable format in the ‘C’ code.
    Following is the code that may be helpful in highlighting the issue.

    ‘C’ code
    =========
    SWSCANCODE_API INT32 DummyFunction( void )
    {
    f = fopen(“D:\\CallRecog.txt”, “w”);
    fprintf (f, “%s”,”DummyFunction called “);

    char *pRegKey = “HKEY_LOCAL_MACHINE\\HARDWARE\\DESCRIPTION\\System\\SystemBiosVersion”;
    char *pBuffer =”I don’t know what is happening”;
    int iBufSize = 32;
    fprintf (f, “%s”,”DummyFunction called “);

    if(pSWGlobals->pGetRegKeyValue( pRegKey, pBuffer, iBufSize )==TRUE)
    {
    fprintf (f, “%s”,”DummyFunction:No Problem occured in calling delegate “);
    fprintf(f,”%s”,pBuffer);
    }
    else
    {
    fprintf (f, “%s”,”DummyFunction:Problem occured in calling delegate “);
    }
    fclose(f);
    return 12;
    }

    C# code
    =========

    [DllImport(“SW.dll”, EntryPoint = “DummyFunction”, CallingConvention = CallingConvention.Cdecl)]
    public static extern int DummyFunction();

    static void Main(string[] args)
    {
    int i = DummyFunction();
    }

    //This function will be called by pSWGlobals->pGetRegKeyValue( pRegKey, pBuffer, iBufSize )
    // as it stores the address of this callback function
    //I’ve verified the function call and argument value from ‘C’ code is coming fine here

    //Below is just small part of this function but the ‘pBuffer’ is not touched anywhere else in code
    public static int GetRegKeyValue(IntPtr pRegKey, [In,Out]IntPtr pBuffer, int iBufSize)
    {
    string regKeyValueResult = Marshal.PtrToStringAnsi(pBuffer);
    regKeyValueResult = “Bruce”
    Marshal.WriteIntPtr(pBuffer, Marshal.StringToHGlobalAuto(regKeyValueResult));
    }

    Below is the output
    ===============
    DummyFunction called DummyFunction:No Problem occured in calling delegate x†:[Some weired characters]

    Please help me in getting the correct value to pass from C# to ‘C’ code,the memory for which is allocated from unmanaged or ‘C’ code?

    Posted by Bruce Seth | September 26, 2014, 8:54 am
    • Hello Bruce,

      1. If I read you correctly, you appear to want to pass pBuffer to the delegate in C# and have it fill this buffer with some ASCII character string.

      2. First of all, please note that in the C code for DummyFunction(), pBuffer cannot be declared as a char* :

      char *pBuffer = “I don’t know what is happening”;

      instead, it must be declared as a true buffer :

      char pBuffer[] = “I don’t know what is happening”;

      in the above case, pBuffer is a writeable char buffer which has been initialized.

      3. Next, using Marshal.WriteIntPtr() in the way you did in GetRegKeyValue() will cause the 4 byte address value of the string buffer allocated by Marshal.StringToHGlobalAuto(regKeyValueResult) to be written to the first 4 bytes of pBuffer.

      4. This is the reason why you saw some unreadable characters back in the C code.

      5. What you want is to copy the ANSI characters of the regKeyValueResult string to pBuffer :

      [return: MarshalAs(UnmanagedType.Bool)]
      public static bool GetRegKeyValue(IntPtr pRegKey, IntPtr pBuffer, int iBufSize)
      {
      string regKeyValueResult = Marshal.PtrToStringAnsi(pBuffer);
      regKeyValueResult = “Bruce”;

      byte[] byte_array = ASCIIEncoding.ASCII.GetBytes(regKeyValueResult);

      // Copy the ASCII bytes of regKeyValueResult to the destination buffer pointed to by pBuffer.
      Marshal.Copy(byte_array, 0, pBuffer, byte_array.Length);
      // Remember to add a terminating NULL character.
      Marshal.WriteByte(pBuffer, byte_array.Length, (byte)0);

      return true;
      }

      – Bio.

      Posted by Lim Bio Liong | September 27, 2014, 10:46 am
  11. Fantastic article, thanks; I wish all Windows programming related articles were done with this care and thoroughness!

    Posted by William Webber | March 19, 2015, 1:40 am
