Understanding Custom Marshaling Part 2

1. Introduction.

1.1 This article is a continuation of Understanding Custom Marshaling Part 1.

1.2 In part 1, we learned how to code custom marshaling for the purpose of passing an object (a managed string) from managed code to unmanaged.

1.3 Here in part 2, we study how custom marshaling is done in the other direction : from the unmanaged world to the managed.

1.4 We shall continue to use the StringMarshaler class that we created in part 1.

2. Client Application.

2.1 Just as in part 1, we need a client application and a DLL to which we make a p/invoke call.

2.2 We will use the same client application and the DLL that we used in part 1 but will embellish both with new functions.

2.3 In order to show marshaling from the unmanaged to the managed world, the unmanaged DLL function needs to return a pointer to some object.

2.4 There are 2 ways the pointer can be returned :

  • as an “out” parameter.
  • as a direct return value.

2.5 In the example of this part 2, we shall be returning via an “out” parameter. Then later in part 3, we shall examine how to return directly.

2.6 We will also specifically be returning a pointer to a C-style null-terminated string. Towards this end, we shall create the following new API :

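void __stdcall GetString(/*[out]*/ char** ppszStringReceiver)
{
	// The string to hand back to the caller. sizeof() includes the terminating NUL.
	const char lpszString[] = "The quick brown fox jumps over the lazy dog";

	// Allocate with CoTaskMemAlloc() so that the receiver can free with CoTaskMemFree().
	*ppszStringReceiver = (char*)::CoTaskMemAlloc(sizeof(lpszString));

	// CoTaskMemAlloc() returns NULL on failure, so copy only into a valid buffer.
	if (*ppszStringReceiver != NULL)
	{
		strcpy_s(*ppszStringReceiver, sizeof(lpszString), lpszString);
	}
}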

Notice that the parameter “ppszStringReceiver” is declared as a double pointer to a char. In essence, it points to a buffer just large enough to hold a memory address, and that address in turn points to a C-style string. This is the usual convention for returning a C-style string pointer through a parameter in C/C++. Hence when an API with such a parameter is called from .NET, the CLR must also pass a pointer to such an address-holding buffer (a memory stub) as the parameter.

For C/C++, such a memory stub buffer may be intrinsically temporary (e.g. a variable declared in a function body) or otherwise (e.g. a global variable). For a .NET language like C#, as we shall later see, the CLR will allocate the memory stub on behalf of the client application. A pointer to this buffer will be passed as the parameter. This is illustrated in the diagram below :

GetString_Parameter_Memory_View

As a runtime process working alongside the CLR, our client application need not be concerned about whether this buffer is temporary or otherwise. We can assume that it will be properly managed by the CLR.
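To make the idea of the memory stub concrete, here is a minimal sketch of how a native C++ caller might supply one. The names GetStringSketch() and CallGetStringSketch() are hypothetical, and std::malloc()/std::free() stand in for CoTaskMemAlloc()/CoTaskMemFree() only so that the sketch compiles on any platform. The stub here is simply a local pointer variable whose address is passed to the function :

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Portable stand-in for the article's GetString(). std::malloc() replaces
// CoTaskMemAlloc() (an assumption made purely for portability).
void GetStringSketch(char** ppszStringReceiver)
{
    assert(ppszStringReceiver != nullptr);   // the caller must supply a stub

    const char lpszString[] = "The quick brown fox jumps over the lazy dog";

    // Write the address of the new buffer into the caller's stub.
    *ppszStringReceiver = static_cast<char*>(std::malloc(sizeof(lpszString)));
    if (*ppszStringReceiver != nullptr)
    {
        std::memcpy(*ppszStringReceiver, lpszString, sizeof(lpszString));
    }
}

// Hypothetical native caller: the "memory stub" is the local variable pszReceived.
std::string CallGetStringSketch()
{
    char* pszReceived = nullptr;         // the stub, initially zero

    GetStringSketch(&pszReceived);       // pass the address of the stub
    if (pszReceived == nullptr)
    {
        return std::string();            // allocation failed inside the callee
    }

    std::string result(pszReceived);     // copy the C-style string out
    std::free(pszReceived);              // the caller owns the buffer and frees it
    return result;
}
```

In the .NET case, the CLR plays the part of CallGetStringSketch() : it provides the stub and passes its address on our behalf.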

2.7 The C# DllImport declaration for the above function is listed below :

[DllImport("TestDLL.dll", EntryPoint = "GetString", CallingConvention = CallingConvention.StdCall)]
private static extern void GetString
    ([Out][MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(StringMarshaler))] out string strReceiver);

Just as we saw in part 1, we use the MarshalAsAttribute to indicate the custom marshaler.

2.8 As for the client application, we shall add a new function which is listed below :

static void DoTest_GetString()
{
    string str;

    GetString(out str);

    Console.WriteLine(str);
}

It is a simple function which makes a test call to the GetString() API and then displays the output of the API.

3. Client Application In Action : DoTest_GetString().

3.1 Let’s look at DoTest_GetString() and break it down :

  • A managed string “str” is defined.
  • The GetString() API is called with “str” passed as an “out” parameter.
  • After we obtain a value for “str” from GetString(), we display it on the console output.

3.2 The following sequence of steps passes control to the GetString() API and then back to the C# function :

  • The static StringMarshaler::GetInstance() method will be called by the CLR in order to obtain an instance of the StringMarshaler class.
  • Control will then reach the actual GetString() function (inside TestDLL.dll). It will be invoked with a double pointer to a C-style string.
  • With reference to point 2.6, what this means is that in the GetString() API, the “ppszStringReceiver” parameter is a pointer to a memory stub (a small buffer meant to hold an address) which is expected to be filled with another pointer value : that which points to a C-style string.
  • This memory stub (to which “ppszStringReceiver” points) is allocated by the CLR. We can safely assume that it will later be de-allocated by the CLR when it is no longer needed.
  • Now this stub will initially be filled with zero values. It is meant to be eventually filled by the GetString() API with the address to the required C-style string.
  • The following diagram illustrates this :

GetString_Parameter_Memory_View_2

  • The following is a snapshot of GetString() in action with a view of the memory of parameter “ppszStringReceiver” at runtime :

GetString_Parameter_Memory_View

  • As we can see from the diagram above, “ppszStringReceiver” contains value “0x0032ef40” and when we look at the contents in memory at this address, it contains 0x00000000.
  • GetString() then allocates a memory buffer via CoTaskMemAlloc(). The size of this buffer is that of the C-style string contained in the “lpszString” array, including its terminating NUL character.
  • The address returned from CoTaskMemAlloc() is then assigned to the memory buffer pointed to by “ppszStringReceiver” as shown in the diagram below :

CoTaskMemAlloc_ReturnPtr

  • The memory buffer allocated by CoTaskMemAlloc() is then filled with a copy of the string contained in “lpszString” as shown in the diagram below :

strcpy_s_result

  • When GetString() returns, control is passed to the StringMarshaler::MarshalNativeToManaged() function.
  • The parameter to this function is “pNativeData” (an IntPtr) and it is the address of the buffer allocated by CoTaskMemAlloc() in the GetString() API and which contains the string “The quick brown fox jumps over the lazy dog” :

MarshalNativeToManaged_pNativeData_View

  • The purpose of MarshalNativeToManaged() is to return a managed object which is to be used as the return value from the GetString() API from within managed code.
  • Looking at the code for DoTest_GetString() in point 2.8, we see that this return value is to be assigned to “str” (a managed string).
  • MarshalNativeToManaged() will take the value in “pNativeData” and use it to create a managed string via Marshal.PtrToStringAnsi(). It assumes that there is a C-style NULL-terminated string at the memory location pointed to by “pNativeData” and that this string is in ANSI format.
  • The managed string thus created will later be returned and then assigned to “str”. However, before returning back to the DoTest_GetString() function, the StringMarshaler::CleanUpNativeData() will be called by the CLR to free up the native data that was returned from the GetString() API :

CleanUpNativeData_pNativeData_View

  • CleanUpNativeData() frees the buffer pointed to by “pNativeData” via Marshal.FreeCoTaskMem(). The buffer which previously contained the string “The quick brown fox jumps over the lazy dog” is now wiped out as can be seen below (contrast the memory contents of the buffer pointed to by “pNativeData” between the diagrams above and below) :

CleanUpNativeData_pNativeData_MemoryFree

  • Control finally returns to DoTest_GetString() and the cycle of interop marshaling concludes.
  • Incidentally, this is the point where the original buffer, allocated by the CLR to which the parameter “ppszStringReceiver” points in function GetString(), will have its contents set to all zeroes.
  • This can be taken as an indication that the CLR has reclaimed this buffer for re-use.
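The order of events described above can be sketched in plain C++. This is only an analogy under stated assumptions : MarshalerSketch and CallThroughMarshaler() are hypothetical names, std::string plays the part of the managed string, and std::malloc()/std::free() replace CoTaskMemAlloc()/Marshal.FreeCoTaskMem() so that the sketch is portable. It does not reproduce what the CLR does internally. It only mirrors the sequence : call the API, convert the native data, then clean the native data up.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Stand-in for the unmanaged GetString() API (std::malloc replaces CoTaskMemAlloc).
void NativeGetString(char** ppszStringReceiver)
{
    assert(ppszStringReceiver != nullptr);
    const char lpszString[] = "The quick brown fox jumps over the lazy dog";
    *ppszStringReceiver = static_cast<char*>(std::malloc(sizeof(lpszString)));
    if (*ppszStringReceiver != nullptr)
    {
        std::memcpy(*ppszStringReceiver, lpszString, sizeof(lpszString));
    }
}

// Hypothetical C++ analogue of the two StringMarshaler methods used in this part.
struct MarshalerSketch
{
    std::vector<std::string> trace;   // records the order in which steps run

    // Analogue of MarshalNativeToManaged(): copies the C-style string.
    std::string MarshalNativeToManaged(const char* pNativeData)
    {
        trace.push_back("MarshalNativeToManaged");
        return std::string(pNativeData);
    }

    // Analogue of CleanUpNativeData(): frees with the matching deallocator.
    void CleanUpNativeData(char* pNativeData)
    {
        trace.push_back("CleanUpNativeData");
        std::free(pNativeData);
    }
};

// Plays the runtime's role for an "out" string parameter.
std::string CallThroughMarshaler(MarshalerSketch& marshaler)
{
    char* stub = nullptr;                          // zero-filled memory stub

    marshaler.trace.push_back("GetString");
    NativeGetString(&stub);                        // the API fills the stub
    if (stub == nullptr)
    {
        return std::string();                      // nothing to convert or free
    }

    std::string managed = marshaler.MarshalNativeToManaged(stub);
    marshaler.CleanUpNativeData(stub);             // native buffer released
    stub = nullptr;                                // stub no longer refers to it
    return managed;                                // the value assigned to "str"
}
```

Note that the conversion happens before the clean-up : once CleanUpNativeData() has run, the native buffer must not be read again.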

3.3 The following are pertinent points about the underlying activities behind the call to DoTest_GetString() :

  • Recall the DllImport definition for the GetString() function (see section 2.7). The “strReceiver” parameter is designated as an “out” parameter.
  • This means that the “strReceiver” parameter need not be initialized by the caller and will be assigned the value that GetString() returns through it.
  • This means that whatever is returned from GetString() is owned by the caller (i.e. the client application).
  • This of course refers to the low-level C-style string returned from GetString() and this is the reason why StringMarshaler::CleanUpNativeData() must be called to free it.
  • Next, note how the GetString() API creates this string (point 2.6). It does this by allocating a buffer via CoTaskMemAlloc() and then assigns a C-style string value to this buffer.
  • The choice of using CoTaskMemAlloc() is important because the StringMarshaler::CleanUpNativeData() uses Marshal.FreeCoTaskMem() to free the buffer.
  • But readers should realize that this is completely pre-arranged : StringMarshaler will use Marshal.FreeCoTaskMem() to free the returned buffer and so the unmanaged API GetString() must allocate the buffer using CoTaskMemAlloc().
  • Developers are free, of course, to create custom marshalers which use other mechanisms for buffer allocation and de-allocation.
  • In fact, the CLR's default marshaler frees returned unmanaged string buffers with CoTaskMemFree() (which is what Marshal.FreeCoTaskMem() calls), and so in the common situation where the default marshaler is used, unmanaged APIs which return strings must allocate them with CoTaskMemAlloc().
  • Next note why StringMarshaler::MarshalNativeToManaged() is called. It is called because the low-level string allocated in GetString() is a C-style string and as such cannot be used directly in managed code. StringMarshaler::MarshalNativeToManaged() must be called to convert the C-style string to a managed one via Marshal.PtrToStringAnsi().
  • Marshal.PtrToStringAnsi() is used because the C-style string allocated in GetString() is assumed to be in ANSI format. This is also pre-arranged.
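The pre-arranged pairing of allocator and de-allocator can be illustrated with a small C++ sketch. Everything named here is hypothetical : AgreedAlloc() and AgreedFree() stand in for CoTaskMemAlloc() and CoTaskMemFree(), and the counter g_liveBuffers exists only so that the pairing can be observed. The receiving side wraps the returned pointer in a std::unique_ptr whose deleter is the agreed de-allocator, so the buffer is always released by the function that matches its allocation.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical allocator pair standing in for CoTaskMemAlloc()/CoTaskMemFree().
static int g_liveBuffers = 0;   // number of buffers allocated but not yet freed

void* AgreedAlloc(std::size_t cb)
{
    void* p = std::malloc(cb);
    if (p != nullptr) { ++g_liveBuffers; }
    return p;
}

void AgreedFree(void* p)
{
    if (p != nullptr) { --g_liveBuffers; std::free(p); }
}

// The producer allocates with the agreed allocator...
void GetStringPaired(char** ppszStringReceiver)
{
    assert(ppszStringReceiver != nullptr);
    const char lpszString[] = "The quick brown fox jumps over the lazy dog";
    *ppszStringReceiver = static_cast<char*>(AgreedAlloc(sizeof(lpszString)));
    if (*ppszStringReceiver != nullptr)
    {
        std::memcpy(*ppszStringReceiver, lpszString, sizeof(lpszString));
    }
}

// ...and the consumer frees with the agreed de-allocator, never with another one.
struct AgreedDeleter
{
    void operator()(char* p) const { AgreedFree(p); }
};
using AgreedString = std::unique_ptr<char, AgreedDeleter>;

std::string ReceiveString()
{
    char* raw = nullptr;
    GetStringPaired(&raw);
    AgreedString owner(raw);   // takes ownership; AgreedFree() runs at scope exit
    return owner ? std::string(owner.get()) : std::string();
}
```

If the producer were changed to use a different allocator without the consumer being changed to match, the buffer would be released by the wrong function. This is the situation the pre-arrangement between GetString() and StringMarshaler is meant to avoid.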

4. In Summary

4.1 I hope this part 2 has proven interesting to the reader.

4.2 We are now familiar with how marshaling is done from unmanaged code to managed.

4.3 In part 3, we shall remain with the subject of marshaling from the unmanaged to the managed world, except that we shall study the case where the string to be marshaled is returned directly (as a return value) instead of via a double string pointer (as was the case in this part 2).

About Lim Bio Liong

I've been in software development for nearly 20 years specializing in C, COM and C#. It's truly an exciting time we live in, with so many resources at our disposal to gain and share knowledge. I hope my blog will serve a small part in this global knowledge sharing network. For many years now I've been deeply involved with C development work. However since circa 2010, my current work has required me to focus more and more on C# with a particular focus on COM interop. I've also written several articles for CodeProject. However, in recent years I've concentrated my time more on helping others in the MSDN forums. Please feel free to leave a comment whenever you have any constructive criticism over any of my blog posts.
