Understanding Custom Marshaling Part 4

1. Introduction.

1.1 We have reached the 4th part of this series of articles elucidating the basic principles of custom marshaling.

1.2 In the first part, we touched on marshaling one-way from managed to unmanaged code.

1.3 Then in parts 2 and 3, we studied marshaling the other way, from unmanaged to managed.

1.4 Here at last in part 4, we expound on marshaling in both directions : from managed to unmanaged and then back to managed.

2. Client Application.

2.1 As usual, we need a client application and a DLL to which we make a P/Invoke call.

2.2 Once again, we will use the same client application and the same DLL that we used in parts 1 through 3 but will embellish both with a new function.

2.3 This new function is listed below :

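void __stdcall ModifyString(/*[in, out]*/ char** ppszStringToBeModified)
{
	// Display the string passed in by the caller. MessageBoxA() is named explicitly
	// because the buffer holds an ANSI (char) string.
	MessageBoxA(NULL, *ppszStringToBeModified, "Original String", MB_OK);

	// Free the caller's original buffer before re-pointing the parameter.
	::CoTaskMemFree((LPVOID)*ppszStringToBeModified);

	char lpszString [] = "Modified string.";

	// Allocate the replacement buffer (sizeof includes the NULL terminator).
	*ppszStringToBeModified = (char*)::CoTaskMemAlloc(sizeof(lpszString));

	// Copy the new string into the replacement buffer.
	strcpy_s(*ppszStringToBeModified, sizeof(lpszString), lpszString);
}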

The following are some important points about the ModifyString() API :

  • Notice that the parameter “ppszStringToBeModified” is of the same type as that of the GetString() API that we saw in part 2 (function listing below) :
void __stdcall GetString(/*[out]*/ char** ppszStringReceiver)
{
	char lpszString [] = "The quick brown fox jumps over the lazy dog";

	*ppszStringReceiver = (char*)::CoTaskMemAlloc(sizeof(lpszString));

	strcpy_s(*ppszStringReceiver, sizeof(lpszString), lpszString);
}
  • The parameter of GetString() “ppszStringReceiver” is treated as an “out” parameter. This means that it can be either uninitialized or initialized to NULL when it enters the body of the function. GetString() will then allocate a buffer and point “ppszStringReceiver” to it.
  • However, for the ModifyString() API, “ppszStringToBeModified” is initially treated as already pointing to a valid string buffer.
  • If ModifyString() wants to re-assign a buffer pointer to “ppszStringToBeModified”, it must first free the original buffer.
  • ModifyString() recognizes that on entry “ppszStringToBeModified” points to a valid string buffer and it displays the contents of that string buffer.
  • It then frees that original buffer, allocates a new buffer and points “ppszStringToBeModified” to it.
  • ModifyString() then assigns a new character string to this new buffer.
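
The difference between the two contracts can be sketched in portable C++. This is only an illustration and not the code of the actual DLL : TaskAlloc() and TaskFree() are hypothetical stand-ins for CoTaskMemAlloc() and CoTaskMemFree() so that the sketch builds on any platform, and GetStringSketch(), ModifyStringSketch() and RoundTrip() are names made up for this example. Allocation failure checks are omitted for brevity.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-ins for ::CoTaskMemAlloc and ::CoTaskMemFree (assumption of this sketch).
static void* TaskAlloc(std::size_t cb) { return std::malloc(cb); }
static void  TaskFree(void* p)         { std::free(p); }

// [out] contract, as in GetString(): the incoming value of *ppsz is ignored.
void GetStringSketch(char** ppsz)
{
    const char text[] = "The quick brown fox jumps over the lazy dog";
    *ppsz = static_cast<char*>(TaskAlloc(sizeof(text)));
    std::memcpy(*ppsz, text, sizeof(text));   // sizeof includes the terminator
}

// [in, out] contract, as in ModifyString(): *ppsz must point to a valid buffer on entry.
void ModifyStringSketch(char** ppsz)
{
    assert(ppsz != nullptr && *ppsz != nullptr);
    TaskFree(*ppsz);                          // free the original buffer before re-pointing
    const char text[] = "Modified string.";
    *ppsz = static_cast<char*>(TaskAlloc(sizeof(text)));
    std::memcpy(*ppsz, text, sizeof(text));
}

// Caller side: supplies the initial buffer, then owns and frees whatever comes back.
std::string RoundTrip(const char* initial)
{
    const std::size_t cb = std::strlen(initial) + 1;
    char* p = static_cast<char*>(TaskAlloc(cb));
    std::memcpy(p, initial, cb);

    ModifyStringSketch(&p);                   // p now points to a different buffer

    std::string result(p);
    TaskFree(p);                              // the caller releases the returned buffer
    return result;
}
```

Calling RoundTrip("Original String") passes an initialized buffer in and receives the replacement buffer back, whereas GetStringSketch() can be handed a pointer that was never initialized.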

2.4 The following are some important points about ModifyString() when it is called in managed code :

  • When an API like ModifyString() with a parameter such as “ppszStringToBeModified” is called in a client application, the application must somehow pass a double pointer to some unmanaged string buffer as first parameter to the API.
  • In essence, at low-level, “ppszStringToBeModified” must point to a buffer which is large enough to hold a number (a memory address) that points to a C-style string.
  • This is the normal convention by which a pointer to a pointer to a C-style string is passed as a parameter in C/C++.
  • This memory-address-containing buffer is also known as a memory stub.
  • Hence when an API such as ModifyString() is used in .NET, the CLR must internally allocate such a memory stub and then pass the address of the stub as the parameter. After all, this is what ModifyString() expects.
  • Our client application can safely assume that this memory stub buffer will be properly managed by the CLR.
  • This situation (on entry into the ModifyString() API) is illustrated below :

ModifyString_Parameter_Memory_View

  • When ModifyString() executes, it will free the original buffer and will modify the contents of the memory stub.
  • However, the stub itself will remain owned by the CLR and can only be freed by the CLR.
  • This situation (on execution and exit from the ModifyString() API) is illustrated below :

ModifyString_Parameter_Memory_View_2
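
The ownership split between the memory stub and the string buffer can be modeled in portable C++. This is a sketch under stated assumptions : std::malloc()/std::free() stand in for the COM task allocator, the stub is modeled as a small heap allocation, and ReplaceBuffer() and CallThroughStub() are hypothetical names. How the CLR actually allocates its stub is not shown in this article.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Callee in the style of ModifyString(): it swaps the string buffer but never
// frees or relocates the stub's own storage.
void ReplaceBuffer(char** stub)
{
    assert(stub != nullptr && *stub != nullptr);
    std::free(*stub);                                  // release the original buffer
    const char text[] = "Modified string.";
    *stub = static_cast<char*>(std::malloc(sizeof(text)));
    std::memcpy(*stub, text, sizeof(text));            // the stub now holds the new address
}

// Models the caller's side of the call (in the article, the CLR).
std::string CallThroughStub(const std::string& managed)
{
    // 1. The string buffer that the callee will see and later free.
    char* buffer = static_cast<char*>(std::malloc(managed.size() + 1));
    std::memcpy(buffer, managed.c_str(), managed.size() + 1);

    // 2. The stub: pointer-sized storage owned by the caller, holding the buffer's address.
    char** stub = new char*;
    *stub = buffer;

    ReplaceBuffer(stub);           // the callee rewrites the contents of the stub

    std::string result(*stub);     // read the new buffer through the same stub
    std::free(*stub);              // the returned buffer now belongs to the caller
    delete stub;                   // the stub is released only by its owner
    return result;
}
```

Note that the callee touches two things only : the buffer that the stub points to and the address stored inside the stub. The storage of the stub itself is allocated and released on the caller's side.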

2.5 The C# DllImport declaration for the ModifyString() function is listed below :

[DllImport("TestDLL.dll", EntryPoint = "ModifyString", CallingConvention = CallingConvention.StdCall)]
private static extern void ModifyString
    ([In, Out][MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(StringMarshaler))] ref string strToBeModified);

Notice that the parameter “strToBeModified” is marked by 3 attributes : InAttribute, OutAttribute and the MarshalAsAttribute. Furthermore, it is also marked as a parameter passed by reference (the “ref” keyword).

  • The InAttribute is used to indicate that the parameter is passed into the function with an initialized value.
  • The OutAttribute is used to indicate that the parameter is also possibly assigned some value by the ModifyString() API and hence the onus is on the client (our C# application) to release it.
  • The “ref” keyword is used together with InAttribute and OutAttribute to indicate to the CLR that the parameter is passed both in and out.
  • Finally, the MarshalAsAttribute is used to indicate that the StringMarshaler is used as the custom marshaler for parameter passing.

2.6 We shall add the following new C# function to our client application :

static void DoTest_ModifyString()
{
    string str = "Original String";

    ModifyString(ref str);

    Console.WriteLine(str);
}

DoTest_ModifyString() declares a managed string object “str” and initially assigns it the value “Original String”. The ModifyString() API is then called with “str” passed as a ref parameter. The resulting value of “str” is then displayed on the console.

3. Client Application In Action : DoTest_ModifyString().

3.1 Let’s look at DoTest_ModifyString() and examine the steps by which control passes to the ModifyString() API and then back to the C# function :

  • The static StringMarshaler::GetInstance() method will be called by the CLR in order to obtain an instance of the StringMarshaler class.
  • The StringMarshaler::MarshalManagedToNative() function is then called. The parameter to this function is “str” with the string value “Original String”.
  • The Marshal.StringToCoTaskMemAnsi() function is used to convert the managed string “str” into an unmanaged C-style NULL terminated string in ANSI format.
  • At low-level, Marshal.StringToCoTaskMemAnsi() will allocate a buffer to hold a copy of the string value in “str” in ANSI format. It will then copy the string value in “str” to this buffer and will eventually return the memory address of this buffer.
  • This is shown in the diagram below :

MarshalManagedToNative_Memory_Register_View

  • The return value of Marshal.StringToCoTaskMemAnsi() (the address of the allocated string buffer) is stored in the EAX register : 005CFAA8.
  • Looking at the contents of the memory at location 0x005CFAA8, we see that “Original String” (the NULL-terminated C-style string created from “str”) is stored.
  • Next, the ModifyString() API is invoked. The parameter “ppszStringToBeModified” is a number that indicates the address of the memory stub that holds the address of the buffer just previously allocated by Marshal.StringToCoTaskMemAnsi() :

ModifyString_Memory_Register_View

  • From the above diagram, we can see that “ppszStringToBeModified” has value 0x0024f220. This means that the memory stub is located at this address.
  • When we observe the contents of memory at 0x0024f220, we see the value 0x005cfaa8 which is precisely the memory address of the string buffer.
  • ModifyString() will first display a message box showing the value of the string buffer :

MsgBox_OriginalString

  • The string buffer is then freed by CoTaskMemFree().
  • As shown in the diagram below, the memory that previously held “Original String” has been freed :

ModifyString_CoTaskMemFree_Memory_View

  • Then a new string buffer is allocated using CoTaskMemAlloc() and a new string value “Modified string.” is assigned to it. See diagram below :

ModifyString_CoTaskMemAlloc_Memory_View

  • From the diagram above, we see that the new string buffer is at address 0x005C7E90.
  • Notice that the memory stub at address 0x0024F220 now contains 0x005C7E90 which points to the new string buffer.
  • After ModifyString() returns, StringMarshaler.CleanUpManagedData() is called to check if any managed data needs to be freed :

CleanUpManagedData_Memory_View

  • Notice that the parameter is the original “str” that we passed into ModifyString() from within managed code.
  • This call to CleanUpManagedData() is important because the original parameter was passed by reference and on return, a brand new object may be created (as is the case in our example).
  • CleanUpManagedData() offers the custom marshaler a chance to clean up resources associated with the original parameter.
  • In our example, we can leave it to the CLR to reclaim the resources of the original string which is a managed object hence we need not do anything.
  • Next StringMarshaler.MarshalNativeToManaged() is called to convert the native data (the unmanaged C-style string) into a managed object.
  • The Marshal.PtrToStringAnsi() function is used to create a managed string from the unmanaged string at address 0x005c7e90 (“Modified string.”).
  • Finally, StringMarshaler.CleanUpNativeData() is called to free the unmanaged data that was returned from the call to ModifyString() :

CleanUpNativeData_Memory_View

  • Remember that what is returned from the ModifyString() API is actually the low-level pointer to “Modified string.”. And after this has been converted to a managed string, it needs to be freed, hence the call to CleanUpNativeData().
  • Furthermore, the StringMarshaler is called upon to free this string buffer because, since it is a returned value, StringMarshaler owns it and so bears the responsibility to free it.
  • Control then returns to DoTest_ModifyString() and the cycle of interop marshaling concludes.
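
The order of the calls described above can be condensed into a small portable C++ model. It is only a sketch of the sequence observed in this walkthrough and not the CLR's actual implementation : StringMarshalerModel, NativeModifyString() and CallWithRefString() are hypothetical names, std::string plays the role of the managed string, std::malloc()/std::free() stand in for the COM task allocator, and the GetInstance() step is left out.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical C++ model of the four marshaler callbacks named in the article.
struct StringMarshalerModel {
    std::vector<std::string> log;   // records the order in which callbacks run

    char* MarshalManagedToNative(const std::string& managed) {
        log.push_back("MarshalManagedToNative");
        char* p = static_cast<char*>(std::malloc(managed.size() + 1));
        std::memcpy(p, managed.c_str(), managed.size() + 1);
        return p;
    }
    void CleanUpManagedData(const std::string& /*original*/) {
        log.push_back("CleanUpManagedData");    // nothing to release for a plain string
    }
    std::string MarshalNativeToManaged(const char* native) {
        log.push_back("MarshalNativeToManaged");
        return std::string(native);
    }
    void CleanUpNativeData(char* native) {
        log.push_back("CleanUpNativeData");
        std::free(native);                      // frees the buffer returned by the callee
    }
};

// Stand-in for the ModifyString() API.
void NativeModifyString(char** ppsz) {
    assert(ppsz != nullptr && *ppsz != nullptr);
    std::free(*ppsz);
    const char text[] = "Modified string.";
    *ppsz = static_cast<char*>(std::malloc(sizeof(text)));
    std::memcpy(*ppsz, text, sizeof(text));
}

// Models the sequence observed for a "ref string" parameter.
void CallWithRefString(StringMarshalerModel& m, std::string& str) {
    char* stub = m.MarshalManagedToNative(str);   // managed to native
    m.log.push_back("ModifyString");
    NativeModifyString(&stub);                    // the callee swaps the buffer
    m.CleanUpManagedData(str);                    // chance to clean up the original object
    str = m.MarshalNativeToManaged(stub);         // native to managed
    m.CleanUpNativeData(stub);                    // free the returned buffer
}
```

After CallWithRefString() returns, the log holds the five steps in the order listed in this section, and the buffer created in MarshalManagedToNative() has been freed by the callee while the replacement buffer has been freed by CleanUpNativeData().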

3.2 The following are pertinent points about the underlying activities behind the call to DoTest_ModifyString() :

  • Note the DllImport declaration for the ModifyString() function in section 2. The parameter “strToBeModified” is passed by reference (note the “ref” keyword).
  • This does not mean that the ModifyString() API gets to own the object that gets passed to it.
  • No, rather, it is free to modify the object behind the parameter and later return it (in modified form).
  • It is still the client code that owns the returned object.
  • The fact that ModifyString() has to clear the memory associated with the original string only means that the memory clearing is part of the modification process.
  • As for the client code, memory ownership obliges memory freeing : the owner of a memory buffer is responsible for its eventual de-allocation.
  • This is why StringMarshaler::CleanUpNativeData() must be called to free it on behalf of the client application.
  • Note the interesting fact that before StringMarshaler::CleanUpNativeData() is called, StringMarshaler.CleanUpManagedData() is called to allow the custom marshaler a chance to perform clearing of any resources associated with the marshaling process.
  • StringMarshaler::CleanUpNativeData() can certainly be used to undo any extra activities performed in StringMarshaler::MarshalManagedToNative().
  • Note, however, that any information required to perform the cleaning up of native data must be tagged along with the original object passed as parameter to the target of the marshaling process (in our case the ModifyString() API).
  • This parameter will end up as the parameter to StringMarshaler::MarshalManagedToNative().
  • For example, if any additional unmanaged memory allocation is performed, the address of the allocated unmanaged memory must be retained in the passed-in object. In this case, the object must be a structure that has a member that can be used to hold such an address.
  • It cannot be stored in the marshaler itself. This would require that the marshaler hold state on behalf of the parameter which would require a unique marshaler for every marshaling process.
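
One way to read the last few points is sketched below in portable C++. Everything here is hypothetical and is not part of the StringMarshaler used in this series : TrackedString is an object with a slot for the address of extra unmanaged memory, StatelessMarshalerModel is a marshaler that keeps no per-call state, and the clean-up is placed in the callback that receives the original object. A global counter is used only to show that no buffer is leaked.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Counts live scratch buffers so the sketch can show that nothing leaks.
static int g_liveScratchBuffers = 0;

// Hypothetical passed-in object: the string plus a slot for per-call native state.
struct TrackedString {
    std::string text;
    void* scratch = nullptr;    // address of extra unmanaged memory, if any
};

// Stateless by design: one shared instance can serve any number of calls.
struct StatelessMarshalerModel {
    // Allocates extra native memory and records its address in the object itself.
    void MarshalManagedToNative(TrackedString& obj) const {
        assert(obj.scratch == nullptr);
        obj.scratch = std::malloc(64);
        if (obj.scratch != nullptr) {
            ++g_liveScratchBuffers;
        }
    }
    // Receives the original object back, so it can find and release that memory.
    void CleanUpManagedData(TrackedString& obj) const {
        if (obj.scratch != nullptr) {
            std::free(obj.scratch);
            obj.scratch = nullptr;
            --g_liveScratchBuffers;
        }
    }
};
```

Because each object carries its own address, two calls can be in flight through the same marshaler instance and still be cleaned up independently, in any order. Had the address been stored in the marshaler, the second call would have overwritten the first.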

4. In Summary.

4.1 This concludes our basic study of custom marshaling.

4.2 I certainly hope that the entire treatise from part 1 through part 4 has been thorough and will be a good source for future reference.

4.3 Henceforth, I will be writing more articles associated with custom marshaling dealing with other details that I have left out in parts 1 through 4.

4.4 I also look forward to providing some additional sample custom marshalers that can be useful in real-world projects as well as for motivational and learning purposes.

About Lim Bio Liong

I've been in software development for nearly 20 years specializing in C++, COM and C#. It's truly an exciting time we live in, with so many resources at our disposal to gain and share knowledge. I hope my blog will serve a small part in this global knowledge sharing network. For many years now I've been deeply involved with C++ development work. However since circa 2010, my current work has required me to use C# more and more, with a particular focus on COM interop. I've also written several articles for CodeProject. However, in recent years I've concentrated my time more on helping others in the MSDN forums. Please feel free to leave a comment whenever you have any constructive criticism over any of my blog posts.

Discussion

2 thoughts on “Understanding Custom Marshaling Part 4”

  1. Hi Bio, I notice you have the debugger running in both the C++ and C# code simultaneously which is very useful, How do you achieve this ? – Another good article by the way.

    Posted by Pete Kane | November 20, 2013, 9:39 am
    • Hello Pete,

      1. Go to your C# project properties, Debug section.

      2. In the “Enable Debuggers” section, select the “Enable unmanaged code debugging” checkbox.

      3. Then go to your C++ code and put a breakpoint in the function you would like to step into.

      4. Start debugging.

      5. Note that steps 2 and 3 must both be done.

      – Bio.

      Posted by Lim Bio Liong | November 20, 2013, 3:44 pm
