
Understanding Custom Marshaling Part 1

1. Introduction.

1.1 Custom marshaling is a fascinating .NET feature.

1.2 As is the case for standard interop marshaling, it is used to transform data of a managed type into equivalent data of an unmanaged type for the purpose of parameter passing to/from an unmanaged function.

1.3 This article will demonstrate how custom marshaling can be performed.

2. Custom Marshaling In Action.

2.1 As mentioned, custom marshaling is used to transform managed data into unmanaged data and vice versa.

2.2 The following diagram illustrates this :

[Figure: custom marshaling outline]

2.3 One may look at the custom marshaler as a factory that takes managed data as raw material and churns out data that can be used by the unmanaged side.

2.4 Note that being able to produce equivalent unmanaged data does not mean that the original managed data has somehow changed. No, the managed data remains intact. Rather, new data that represents the managed data is created.

2.5 The same goes for the other direction : the custom marshaler serves as a factory for producing managed data from unmanaged data.

2.6 Likewise, the unmanaged data remains what it is. A managed representation of that unmanaged data is created and used in its place in the managed world.
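
The points in 2.4 to 2.6 can be checked with a few lines of C#. The following is only a minimal sketch and not part of the sample code of this article : the class name CopySemanticsDemo and the helper RoundTrip() are hypothetical names introduced purely for illustration. It converts a managed string into a newly allocated unmanaged ANSI string, builds a new managed string from that buffer and then frees the buffer :

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical demo class; not part of the StringMarshaler sample.
public static class CopySemanticsDemo
{
    // Converts a managed string to an unmanaged ANSI copy and back again.
    public static string RoundTrip(string managed)
    {
        // A new unmanaged buffer is allocated; "managed" itself is untouched.
        IntPtr native = Marshal.StringToCoTaskMemAnsi(managed);
        try
        {
            // A new managed string is created from the unmanaged buffer.
            return Marshal.PtrToStringAnsi(native);
        }
        finally
        {
            // The unmanaged copy must be freed explicitly.
            Marshal.FreeCoTaskMem(native);
        }
    }

    public static void Main()
    {
        string original = "My String";
        string copy = RoundTrip(original);

        Console.WriteLine(original == copy);                // compares contents
        Console.WriteLine(ReferenceEquals(original, copy)); // compares object identity
    }
}
```

The two strings have the same contents but are separate objects, and the original is never modified along the way.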

3. An Example Custom Marshaler.

3.1 I will demonstrate a sample marshaler based on a subject that is close to our hearts – the marshaling of strings to and from unmanaged code.

3.2 The code below lists the custom String Marshaler class :

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Runtime.InteropServices;

namespace StringMarshalerClassLib
{
    public class StringMarshaler : ICustomMarshaler
    {
        public object MarshalNativeToManaged(IntPtr pNativeData)
        {
            return Marshal.PtrToStringAnsi(pNativeData);
        }

        public IntPtr MarshalManagedToNative(object ManagedObj)
        {
            // Managed Object must be a string type.
            if (!(ManagedObj is string))
            {
                return IntPtr.Zero;
            }

            return Marshal.StringToCoTaskMemAnsi((string)ManagedObj);
        }

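        public void CleanUpNativeData(IntPtr pNativeData)
        {
            // Free the buffer allocated by Marshal.StringToCoTaskMemAnsi().
            // pNativeData is passed by value, so resetting it here would have no effect.
            Marshal.FreeCoTaskMem(pNativeData);
        }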

        public void CleanUpManagedData(object ManagedObj)
        {
            // Nothing to do
        }

        public int GetNativeDataSize()
        {
            return -1;
        }

        public static ICustomMarshaler GetInstance(string cookie)
        {
            // Always return the same instance
            if (marshaler == null)
            {
                marshaler = new StringMarshaler();
            }

            return marshaler;
        }

        static private StringMarshaler marshaler;
    }
}

3.3 The class StringMarshaler implements the ICustomMarshaler interface, which declares the first five methods listed below. The sixth, GetInstance(), is not a member of the interface but must be implemented as well :

  • MarshalNativeToManaged()
  • MarshalManagedToNative()
  • CleanUpNativeData()
  • CleanUpManagedData()
  • GetNativeDataSize()
  • GetInstance() [a static method]

3.4 The rather unusual requirement is the GetInstance() method which must be declared static.

3.5 GetInstance() must be static because the CLR calls it to obtain the marshaler instance in the first place, before any instance exists. In essence, the CLR will only call GetInstance() once per application domain and will reuse the instance returned. Hence there is at most one custom marshaler instance per application domain. This is so no matter how many times the custom marshaler is used in the application domain.

3.6 Note that you can influence the CLR into calling GetInstance() multiple times by using different Marshal Cookies, thereby potentially instantiating multiple instances of the custom marshaler (we shall explore this in a later part of this series of articles). However, in spirit, the custom marshaler is meant to be a singleton and must not hold any state.
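
To make the idea of a marshal cookie a little more concrete, here is a minimal sketch. It is not the approach of this series (cookies are covered in a later part) : the class CookieAwareStringMarshaler, the cookie values "first" and "second" and the SetStringWithCookie() declaration are all hypothetical. The sketch assumes that the cookie is supplied through the MarshalCookie field of the MarshalAsAttribute and that GetInstance() keeps one instance per distinct cookie. GetInstance() is called by hand here only to show the caching; in a real interop call the CLR makes that call :

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.InteropServices;

// Hypothetical variant of StringMarshaler that keeps one instance per cookie.
public class CookieAwareStringMarshaler : ICustomMarshaler
{
    private static readonly ConcurrentDictionary<string, CookieAwareStringMarshaler> instances =
        new ConcurrentDictionary<string, CookieAwareStringMarshaler>();

    // The cookie this instance was created for (set once, never changed).
    public string Cookie { get; }

    private CookieAwareStringMarshaler(string cookie) { Cookie = cookie; }

    // Receives the MarshalCookie string given in the MarshalAs attribute.
    public static ICustomMarshaler GetInstance(string cookie)
    {
        return instances.GetOrAdd(cookie ?? string.Empty, c => new CookieAwareStringMarshaler(c));
    }

    public IntPtr MarshalManagedToNative(object ManagedObj) =>
        ManagedObj is string s ? Marshal.StringToCoTaskMemAnsi(s) : IntPtr.Zero;
    public object MarshalNativeToManaged(IntPtr pNativeData) => Marshal.PtrToStringAnsi(pNativeData);
    public void CleanUpNativeData(IntPtr pNativeData) => Marshal.FreeCoTaskMem(pNativeData);
    public void CleanUpManagedData(object ManagedObj) { }
    public int GetNativeDataSize() => -1;
}

public static class CookieDemo
{
    // Illustrative declaration only; it is never called here, so TestDLL.dll is not needed.
    [DllImport("TestDLL.dll", EntryPoint = "SetString", CallingConvention = CallingConvention.StdCall)]
    private static extern void SetStringWithCookie(
        [In][MarshalAs(UnmanagedType.CustomMarshaler,
            MarshalTypeRef = typeof(CookieAwareStringMarshaler),
            MarshalCookie = "first")] string strValue);

    public static void Main()
    {
        ICustomMarshaler a1 = CookieAwareStringMarshaler.GetInstance("first");
        ICustomMarshaler a2 = CookieAwareStringMarshaler.GetInstance("first");
        ICustomMarshaler b = CookieAwareStringMarshaler.GetInstance("second");

        Console.WriteLine(ReferenceEquals(a1, a2)); // same cookie
        Console.WriteLine(ReferenceEquals(a1, b));  // different cookies
    }
}
```

The cookie is fixed when the instance is created and is never modified afterwards, which keeps the marshaler free of per-call state.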

3.7 Note also that not all methods need concrete implementation. As you can see above, some of the ICustomMarshaler methods are trivial. This depends on the situation of how the marshaler is used. Furthermore, at runtime, not all methods, even those that have concrete implementations, will be called. It depends on the direction of marshaling required.

3.8 In this series of articles, I intend to go through and explain each method of the ICustomMarshaler interface.

3.9 Towards this end, I think it would be good to use a client application that steps through the relevant methods of the interface. In this part 1, since only one-directional marshaling (from managed to unmanaged) is presented, I shall go through only the ICustomMarshaler methods that necessarily require concrete implementation and that will get invoked by the CLR. This is presented in the next section.

3.10 I shall say more about the actual code in the methods of the StringMarshaler as I go through the example case below. But suffice it to say that it need not be overly sophisticated nor require unsafe code. At least not for this example.

3.11 A note about deployment : the StringMarshaler class may be compiled into a class library and then referenced by a client application or, alternatively, the class .cs file may be added into the client application project. I have personally chosen to compile the StringMarshaler class into a class library.

4. Client Application.

4.1 There are actually 2 sets of client code to be used in this example :

  • C# application.
  • DLL with an exported function called by the C# app.

4.2 We need a C# application of course. But in order for a custom marshaler to be activated, we need to make an interop call and this comes in the form of a DLL exported function.

4.3 In this example, I shall demonstrate single-directional marshaling going from managed code towards unmanaged code.

4.4 For this purpose, the DLL (named TestDLL.dll, say) exports the following function : SetString(). The following lists the code for SetString() :

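#include <windows.h>

// extern "C" prevents C++ name mangling; a .def file may be used to export instead.
extern "C" __declspec(dllexport) void __stdcall SetString(/*[in]*/ const char* lpszString)
{
	// The string is ANSI, so call MessageBoxA() rather than the MessageBox macro.
	MessageBoxA(NULL, lpszString, "TestDLL", MB_OK);
}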

It is very simple : it takes a constant C-style string as input and displays it in a message box. The purpose of such a function is to demonstrate purely one-directional parameter marshaling.

4.5 The following is a listing of the C# client application :

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Runtime.InteropServices;
using StringMarshalerClassLib;

namespace ConsoleClient
{
    class Program
    {
        [DllImport("TestDLL.dll", EntryPoint = "SetString", CallingConvention = CallingConvention.StdCall)]
        private static extern void SetString
            ([In][MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(StringMarshaler))] string strValue);

        static void DoTest_SetString()
        {
            string str = "My String";

            SetString(str);
        }

        static void Main(string[] args)
        {
            DoTest_SetString();
        }
    }
}

4.6 The client application (a console program) is designed to use the SetString() API exported from TestDLL.dll. In order to do this, it has to make p/invoke declarations for the API as follows :

[DllImport("TestDLL.dll", EntryPoint = "SetString", CallingConvention = CallingConvention.StdCall)]
private static extern void SetString
    ([In][MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(StringMarshaler))] string strValue);

Notice that the parameter of the SetString() API is decorated with the MarshalAsAttribute specifying the use of a custom marshaler of type StringMarshaler.

This is significant and indicates to the .NET framework that when the interop call is made to the API, the packaging of the parameter data will not be performed by the Common Language Runtime (CLR). It will instead be performed by a StringMarshaler object.

The invocation of the API (i.e. the act of loading the DLL, looking up the address of the exported function and handing over execution control to the function itself) is performed by the CLR. The transformation of the managed data into unmanaged data, however, is performed by the custom marshaler, which hands the result back to the CLR to be pushed onto the stack for the function call.

4.7 As is clear from the code above, the call to the SetString() API is performed in the static method DoTest_SetString(). We shall go through this function thoroughly in the next section.

5. Client Application In Action : DoTest_SetString().

5.1 The following is a listing of the DoTest_SetString() function :

static void DoTest_SetString()
{
    string str = "My String";

    SetString(str);
}

5.2 It is very simple. A managed string “str” is defined and assigned the value “My String”. And then the SetString() API is called.

5.3 Now when SetString() is activated, “str” is passed as a parameter.

5.4 The following is the sequence of steps that leads to control being passed to the SetString() API and back :

  • The static StringMarshaler::GetInstance() method will be called by the CLR in order to obtain an instance of the StringMarshaler class.
  • Next, the StringMarshaler::MarshalManagedToNative() function will be called in order to transform the managed string (“str”) into an unmanaged (native) object. This unmanaged object must be returned via an IntPtr.
  • Control will now reach the actual SetString() function (inside TestDLL.dll). It will be invoked with a pointer to the unmanaged string created in StringMarshaler::MarshalManagedToNative() (the IntPtr) set as the parameter.
  • There is nothing special about SetString(). It will simply display the following message box using the parameter :

[Figure: message box displayed by SetString()]

  • After this, control will return to managed code and StringMarshaler::CleanUpNativeData() will be called next. This is important because the unmanaged (native) object created in StringMarshaler::MarshalManagedToNative() remains owned by the StringMarshaler instance and must thus be freed by it (more on this below).
  • Control then returns to DoTest_SetString() and the cycle of interop marshaling concludes.
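
The sequence of steps above can be sketched without any native DLL by performing by hand the calls that the CLR would make. This is only an illustration under stated assumptions : TracingStringMarshaler, FakeSetString() and SimulateCall() are hypothetical names, FakeSetString() is a managed stand-in for the native SetString(), and the static log exists purely for tracing (a real custom marshaler should not hold state) :

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical tracing marshaler: same logic as StringMarshaler, plus a call log.
public class TracingStringMarshaler : ICustomMarshaler
{
    public static readonly List<string> Log = new List<string>();
    private static readonly TracingStringMarshaler instance = new TracingStringMarshaler();

    public static ICustomMarshaler GetInstance(string cookie)
    {
        Log.Add("GetInstance");
        return instance;
    }

    public IntPtr MarshalManagedToNative(object ManagedObj)
    {
        Log.Add("MarshalManagedToNative");
        return ManagedObj is string s ? Marshal.StringToCoTaskMemAnsi(s) : IntPtr.Zero;
    }

    public void CleanUpNativeData(IntPtr pNativeData)
    {
        Log.Add("CleanUpNativeData");
        Marshal.FreeCoTaskMem(pNativeData);
    }

    public object MarshalNativeToManaged(IntPtr pNativeData)
    {
        Log.Add("MarshalNativeToManaged");
        return Marshal.PtrToStringAnsi(pNativeData);
    }

    public void CleanUpManagedData(object ManagedObj) { Log.Add("CleanUpManagedData"); }
    public int GetNativeDataSize() { return -1; }
}

public static class SequenceDemo
{
    // Managed stand-in for the native SetString(): it only reads the unmanaged string.
    private static void FakeSetString(IntPtr lpszString)
    {
        Console.WriteLine("SetString received: " + Marshal.PtrToStringAnsi(lpszString));
    }

    // Performs by hand the steps described above for an [In] string parameter.
    public static void SimulateCall(string str)
    {
        ICustomMarshaler m = TracingStringMarshaler.GetInstance(string.Empty);
        IntPtr native = m.MarshalManagedToNative(str);
        try { FakeSetString(native); }
        finally { m.CleanUpNativeData(native); }
    }

    public static void Main()
    {
        SimulateCall("My String");
        Console.WriteLine(string.Join(" -> ", TracingStringMarshaler.Log));
    }
}
```

In this simulation the log records GetInstance, MarshalManagedToNative and CleanUpNativeData, in that order. MarshalNativeToManaged() and CleanUpManagedData() never appear because nothing is marshaled back to managed code.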

5.5 The following are pertinent points about the underlying activities behind the call to DoTest_SetString() :

  • Recall the DllImport declaration for the SetString() function (see section 4.6). The first parameter is decorated with the InAttribute.
  • This means that the “strValue” data flows in one direction only : from managed code into the SetString() function. SetString() is to treat its parameter as read-only, and nothing is marshaled back when it returns.
  • It also means that the “strValue” parameter remains owned by the managed code. It will be freed by the garbage collector at the appropriate time.
  • Note how MarshalManagedToNative() produces the unmanaged string. It uses the Marshal.StringToCoTaskMemAnsi() function to convert the managed string into an unmanaged one and then returns a pointer to it.
  • Because a C-style unmanaged string is required, and a managed string cannot be treated as such, one has to be allocated in memory (and later freed). The allocation of this is done via the unmanaged Windows API CoTaskMemAlloc().
  • CoTaskMemAlloc() returns a pointer which is then used by Marshal.StringToCoTaskMemAnsi() to copy the contents of the managed string into the allocated buffer. The copied string will be in ANSI format.
  • The pointer returned from CoTaskMemAlloc() is then returned from MarshalManagedToNative() as an IntPtr.
  • Additionally, this newly allocated C-style string (the unmanaged equivalent of “strValue”), created by the StringMarshaler, is owned by the StringMarshaler instance which created it.
  • When SetString() returns, the StringMarshaler’s CleanUpNativeData() method is called to free this C-style string.
  • Since CoTaskMemAlloc() was used to allocate the original C-style string, CoTaskMemFree() must be used to free it.
  • We can see that this is done inside CleanUpNativeData() via Marshal.FreeCoTaskMem().
  • In this example, we have used methods of the Marshal class to perform managed to unmanaged data transformation.
  • Other implementers are free to use whatever means to perform this. The only requirements are that the transformed data be passed via IntPtr and that the custom marshaler be responsible for freeing any allocated data.
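
As an illustration of the last point, here is a minimal sketch of a marshaler that does not rely on Marshal.StringToCoTaskMemAnsi(). It is not the marshaler used in this article : the class name Utf8StringMarshaler and the choice of UTF-8 with Marshal.AllocHGlobal() are assumptions made for the sake of the example, and the native function on the receiving end would have to expect UTF-8 text (the SetString() of this article expects ANSI). The two requirements are still met : the data is returned via an IntPtr, and the marshaler frees what it allocated with the matching function :

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical alternative marshaler: UTF-8 bytes in an HGlobal buffer.
public class Utf8StringMarshaler : ICustomMarshaler
{
    private static readonly Utf8StringMarshaler instance = new Utf8StringMarshaler();

    public static ICustomMarshaler GetInstance(string cookie) { return instance; }

    public IntPtr MarshalManagedToNative(object ManagedObj)
    {
        string s = ManagedObj as string;
        if (s == null) return IntPtr.Zero;

        byte[] bytes = Encoding.UTF8.GetBytes(s);
        IntPtr buffer = Marshal.AllocHGlobal(bytes.Length + 1);
        Marshal.Copy(bytes, 0, buffer, bytes.Length);
        Marshal.WriteByte(buffer, bytes.Length, 0); // null terminator
        return buffer;
    }

    public void CleanUpNativeData(IntPtr pNativeData)
    {
        // Must pair with the AllocHGlobal() call in MarshalManagedToNative().
        Marshal.FreeHGlobal(pNativeData);
    }

    public object MarshalNativeToManaged(IntPtr pNativeData)
    {
        if (pNativeData == IntPtr.Zero) return null;

        // Count the bytes up to the null terminator, then decode them.
        int length = 0;
        while (Marshal.ReadByte(pNativeData, length) != 0) length++;
        byte[] bytes = new byte[length];
        Marshal.Copy(pNativeData, bytes, 0, length);
        return Encoding.UTF8.GetString(bytes);
    }

    public void CleanUpManagedData(object ManagedObj) { }
    public int GetNativeDataSize() { return -1; }
}

public static class Utf8Demo
{
    // Marshals a string to native memory and back, then frees the buffer.
    public static string RoundTrip(string value)
    {
        ICustomMarshaler m = Utf8StringMarshaler.GetInstance(string.Empty);
        IntPtr native = m.MarshalManagedToNative(value);
        try { return (string)m.MarshalNativeToManaged(native); }
        finally { m.CleanUpNativeData(native); }
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip("My String"));
    }
}
```

The important design point is the pairing : whichever allocator MarshalManagedToNative() uses, CleanUpNativeData() must use the corresponding release function.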

6. In Summary.

6.1 After reading this part 1, I hope that the reader has gained an interest in custom marshaling.

6.2 In this part 1, I have demonstrated how a managed string can be transformed into an unmanaged one in order to be used in unmanaged code. I have also shown how this unmanaged data will eventually be freed.

6.3 As mentioned previously, the code for a custom marshaler need not be overly complicated. Basic .NET class library functions may be used to achieve its end.

6.4 The key advantage for developers is that we can take part in the marshaling process itself and control how data is transformed between the managed and unmanaged worlds.

6.5 In part 2, we shall explore single-directional marshaling but in the other direction : from unmanaged code to managed.


About Lim Bio Liong

I've been in software development for nearly 20 years specializing in C++, COM and C#. It's truly an exciting time we live in, with so many resources at our disposal to gain and share knowledge. I hope my blog will serve a small part in this global knowledge sharing network. For many years now I've been deeply involved with C++ development work. However since circa 2010, my current work has required me to use C# more and more, with a particular focus on COM interop. I've also written several articles for CodeProject. However, in recent years I've concentrated my time more on helping others in the MSDN forums. Please feel free to leave a comment whenever you have any constructive criticism of any of my blog posts.

Discussion

11 thoughts on “Understanding Custom Marshaling Part 1”

  1. Hi there, I copied and built your example but get the error DllEntryPointNotFound

    Posted by Pete Kane | November 14, 2013, 11:28 am
  2. Hi again, I solved the DllEntryPointNotFound by adding __delspec(dllEcport) to the C++ function and the function does work but I now receive a PInvokeStackImbalance error – any ideas? also the MessageBox parameters insist on LPCWSTR types, which I’ve tried to cast but the output is total gibberish.

    Posted by Pete Kane | November 14, 2013, 11:41 am
    • Sorry I meant __delspec(dllExport)

      Posted by Pete Kane | November 14, 2013, 11:42 am
    • Hello Pete,

      1. I assume that you are using Visual Studio.

      2. In your project for TestDLL.dll, set the project settings to “Use Multi-Byte Character Set” instead of “Use Unicode Character Set”.

      3. The example code in this article uses ANSI characters and not Unicode.

      Hope this helps,
      – Bio.

      Posted by Lim Bio Liong | November 14, 2013, 4:43 pm
      • Hi Bio, I noticed MessageBox was defined as MessageBoxW so I undefined it and redefined it as MessageBoxA and it all worked fine , I’m very interested in your articles calling external dlls created in c++ are you planning any more ? I need to find a way to get a ADODB Recordset object from a c++ dll function which takes a string as a parameter do you think it’s possible ?

        Posted by pete0906 | November 14, 2013, 5:41 pm
  3. Hello Pete,

    1. >> …c++ dll function which takes a string as a parameter do you think it’s possible ?
    Yes, definitely.

    2. Declare the parameter using the MarshalAsAttribute and specify UnmanagedType.LPStr or UnmanagedType.LPWStr as parameter.

    3. Refer to “Returning Strings from a C++ API to C#” (https://limbioliong.wordpress.com/2011/06/16/returning-strings-from-a-c-api/) for more details.

    – Bio.

    Posted by Lim Bio Liong | November 15, 2013, 5:35 am
    • Hi Bio, thanks for your patience, I know ( thanks to you ) how to pass a string in to a c++ function but is it possible to return a com object like ADODB::_Recordsetptr ? I suppose the difficult part is Marshaling it, what “type” would one use ? Thanks again, good articles.

      Pete Kane

      Posted by pete0906 | November 15, 2013, 6:56 am
      • Hello Pete,

        1. >> is it possible to return a com object like ADODB::_Recordsetptr ?
        Yes, definitely.

        2. This is known as COM Interop.

        3. It is a big subject and I suggest that you work through the following Microsoft Tutorial :

        COM Interop Tutorials
        http://msdn.microsoft.com/en-us/library/aa645712%28v=vs.71%29.aspx

        – Bio.

        Posted by Lim Bio Liong | November 15, 2013, 8:03 am
      • Hi Bio, I read the article you suggested but it doesn’t show how to create a c++ dll that exports a function I tried this but the tlbimp command says mydll.dll is not a valid type lib

         
        
        #include
        #include

        using namespace std;

        #import "C:\Program Files\Common Files\System\ADO\msado15.dll" rename("EOF","EndOfFile")
        using namespace ADODB;

        extern "C"
        {
            __declspec(dllexport) ADODB::_RecordsetPtr GetRecordSet()
            {
                _RecordsetPtr spRst;
                spRst.CreateInstance(__uuidof(_RecordsetPtr));
                return(spRst);
            }
        }
        
        

        Posted by pete0906 | November 15, 2013, 10:41 am
  4. Hello Pete,

    1. You need to be familiar with COM interop in general.

    2. On the C++ DLL side, it is no problem to define a function that returns a pointer to a COM interface and use it in managed code.

    3. On the managed side, you need to reference the type libraries that define the ADODB COM objects and then write code that works with these COM objects.

    4. From what I can see from your code above, you would likely need to reference msado15.dll in your managed project (e.g. C#).

    5. From what I can infer from your message above, your DLL (exports GetRecordSet()) is probably not a COM DLL, hence it is not possible to do a TLBIMP on it.

    6. But do go through any reference or tutorial on COM interop. You need to be familiar with this in order to work with COM objects in managed code.

    – Bio.

    Posted by Lim Bio Liong | November 16, 2013, 4:28 am
