The Importance of Proper BSTR Allocation.

1. Introduction.

1.1 Note that code like the following :

BSTR bstr = L"My BSTR";

does not allocate a BSTR.

1.2 To allocate a BSTR, you must use the ::SysAllocString() API (or one of its siblings such as ::SysAllocStringLen()) :

BSTR bstr = ::SysAllocString(L"My BSTR");

2. Explanation.

2.1 We can verify this with a call to the ::SysStringByteLen() API :

UINT uiLen = ::SysStringByteLen(bstr);

2.2 In the case where the BSTR was allocated using ::SysAllocString(), we get uiLen == 14, which is the byte size of the string “My BSTR” in Unicode (UTF-16), not counting the terminating null. Each character takes up 2 bytes and “My BSTR” contains 7 characters, so the byte size of the BSTR is 14 bytes.
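The arithmetic in point 2.2 can be sketched in portable C++. The helper name Utf16ByteLength is hypothetical, and char16_t is used here (instead of the Windows wchar_t) only as an assumption so that each code unit is 2 bytes on every platform :

```cpp
#include <cassert>
#include <cstddef>

// Counts the bytes occupied by a null-terminated UTF-16 string, excluding
// the terminator. This is the quantity a BSTR length prefix records.
std::size_t Utf16ByteLength(const char16_t* text)
{
    assert(text != nullptr);
    std::size_t units = 0;
    while (text[units] != u'\0')
        ++units;                          // one code unit per character here
    return units * sizeof(char16_t);      // 2 bytes per code unit
}
```

Calling Utf16ByteLength(u"My BSTR") counts 7 code units and multiplies by 2, giving the same 14 that ::SysStringByteLen() reports for a properly allocated BSTR.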

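2.3 In the case where the BSTR was “allocated” merely by being pointed to a wide character string, we get a value for uiLen that most likely does not equal 14. ::SysStringByteLen() simply reads whatever 4 bytes happen to precede the string literal in memory, so the value is unpredictable.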

2.4 This is because ::SysAllocString() generates a proper BSTR with a preceding 4 byte length indicator placed in the memory location just before the actual BSTR data. This is illustrated in the diagram below :

The above diagram shows the memory layout of a genuine BSTR containing the string “My BSTR”. The bytes immediately preceding the string, i.e. “0e 00 00 00”, are the 4-byte length indicator. Read as a little-endian 32-bit value, they work out to the number 14.
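The layout described above can be imitated in portable C++. This is only a sketch : SimBstr, SimAllocString(), SimStringByteLen() and SimFreeString() are hypothetical stand-ins that mimic the length-prefixed layout, they are not how the real OLE Automation allocator is implemented, and char16_t is assumed in place of the Windows wchar_t.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for BSTR. Block layout:
// [4-byte byte length][UTF-16 characters][2-byte null terminator]
using SimBstr = char16_t*;

SimBstr SimAllocString(const char16_t* text)
{
    assert(text != nullptr);
    std::uint32_t units = 0;
    while (text[units] != u'\0')
        ++units;
    const std::uint32_t byteLen =
        static_cast<std::uint32_t>(units * sizeof(char16_t));

    // One heap block holds the prefix, the characters and the terminator.
    unsigned char* block = static_cast<unsigned char*>(
        std::malloc(sizeof(byteLen) + byteLen + sizeof(char16_t)));
    if (block == nullptr)
        return nullptr;

    std::memcpy(block, &byteLen, sizeof(byteLen));              // length prefix
    std::memcpy(block + sizeof(byteLen), text,
                byteLen + sizeof(char16_t));                    // data + null
    // The returned pointer addresses the first character, not the prefix.
    return reinterpret_cast<SimBstr>(block + sizeof(byteLen));
}

std::uint32_t SimStringByteLen(SimBstr s)
{
    if (s == nullptr)
        return 0;
    std::uint32_t byteLen = 0;
    // The length lives in the 4 bytes just before the character data.
    std::memcpy(&byteLen,
                reinterpret_cast<const unsigned char*>(s) - sizeof(byteLen),
                sizeof(byteLen));
    return byteLen;
}

void SimFreeString(SimBstr s)
{
    // Step back over the prefix to recover the pointer malloc() returned.
    if (s != nullptr)
        std::free(reinterpret_cast<unsigned char*>(s) - sizeof(std::uint32_t));
}
```

With this sketch, SimStringByteLen(SimAllocString(u"My BSTR")) yields 14 because the allocator wrote the prefix itself. A pointer to a plain string literal has no such prefix, and has no block start for the free function to recover.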

2.5 Code like :

BSTR bstr = L"My BSTR";

merely creates in memory a Unicode string without any preceding 4 byte length indicator.

3. Effects of Calling ::SysFreeString().

3.1 Calling ::SysFreeString() on a BSTR allocated by ::SysAllocString() will properly free the BSTR.

3.2 However, it is not clear what will happen when ::SysFreeString() is called on a BSTR “allocated” by the code snippet in point 1.1. The pointer was never returned by the BSTR allocator, so the call falls outside what ::SysFreeString() is documented to handle.

3.3 Through debugging and observation, the BSTR does not seem to get freed. I have tried testing with the environment variable OANOCACHE set to 1 (which disables BSTR caching so that freed BSTRs are released immediately) and the BSTR is still not freed.

3.4 This is so even if I manually modify the preceding 4-byte length indicator to reflect the actual length of the BSTR.

4. Potential Problems.

4.1 An improperly allocated BSTR can cause problems when passed as an out parameter from an API or a COM method.

4.2 This is because client code relies on the preceding length indicator, which is absent from a BSTR not allocated via ::SysAllocString(). Client code will assume that the length indicator exists and will process the BSTR accordingly.

4.3 As an example, observe the following API written in C++ :

void __stdcall GetString(/*[out]*/ BSTR* pBstrReceiver)
{
  *pBstrReceiver = L"My BSTR";
}

4.4 The following is sample C# client code, including the DllImport declaration for GetString() :

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Runtime.InteropServices;

namespace CSConsoleClient01
{
    class Program
    {
        [DllImport("TestBSTRDLL.dll", CallingConvention=CallingConvention.StdCall)]
        private static extern void GetString([MarshalAs(UnmanagedType.BStr)][Out] out string strReceiver);

        static void TestGetString()
        {
            string strReceiver = null;

            GetString(out strReceiver);

            Console.WriteLine("string : {0:S}.", strReceiver);
        }

        static void Main(string[] args)
        {
            TestGetString();
        }
    }
}

4.5 Now, when the GetString() API is called, the following is the memory layout of the BSTR that is assigned to pBstrReceiver :

The 4 bytes preceding “My BSTR” will inadvertently be interpreted as the length prefix. The bytes displayed above, “c8 26 00 00”, happen to be what appears on my machine. They could be anything. On my machine they work out to the large number 9928 (0x000026c8 read as a little-endian value), so the BSTR is taken to be 9928 bytes long.
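How the four bytes turn into a number can be shown with a small worked example. The function name DecodeLengthPrefix is hypothetical. It only demonstrates the little-endian interpretation of the bytes and assumes nothing about how the marshaler reads them internally :

```cpp
#include <cassert>
#include <cstdint>

// Interprets 4 consecutive bytes as a little-endian 32-bit length,
// least significant byte first.
std::uint32_t DecodeLengthPrefix(const unsigned char* bytes)
{
    assert(bytes != nullptr);
    return  static_cast<std::uint32_t>(bytes[0])
         | (static_cast<std::uint32_t>(bytes[1]) << 8)
         | (static_cast<std::uint32_t>(bytes[2]) << 16)
         | (static_cast<std::uint32_t>(bytes[3]) << 24);
}
```

For the bytes c8 26 00 00 this gives 0xc8 + 0x26 * 256 = 200 + 9728 = 9928, and for the genuine prefix 0e 00 00 00 it gives 14.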

4.6 In the C# code, when GetString() returns, strReceiver will contain a very long string which contains mostly garbage.

4.7 Besides this, there is another problem : a BSTR which is returned as an “out” parameter is owned by the caller of the API or COM method. The onus is on the caller to free the memory occupied by the returned BSTR.

4.8 With C#, the interop marshaler will duly call Marshal.FreeBSTR() to free the returned BSTR. Marshal.FreeBSTR() internally calls ::SysFreeString(). As mentioned in section 3, calling ::SysFreeString() on an invalid BSTR may not free it, so there may potentially be a memory leak.
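Both problems in this section go away when the API allocates the string properly. The following is a minimal sketch of a corrected function, here given the hypothetical name GetStringFixed(). The calling convention and DLL export are left out for brevity. The #else branch is an assumption added only so that the sketch also builds off Windows : it supplies simplified stand-ins for the three Sys* functions that mimic the layout and are not the real implementation.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#include <oleauto.h>
#else
// Stand-ins (assumptions) for non-Windows builds. They mimic only the layout:
// [4-byte byte length][UTF-16 characters][null terminator].
#include <cstdint>
#include <cstdlib>
#include <cstring>
typedef char16_t OLECHAR;
typedef OLECHAR* BSTR;
typedef unsigned int UINT;
#define OLESTR(str) u##str

inline BSTR SysAllocString(const OLECHAR* text)
{
    if (text == nullptr) return nullptr;
    std::uint32_t bytes = 0;
    while (text[bytes / sizeof(OLECHAR)] != 0) bytes += sizeof(OLECHAR);
    unsigned char* block = static_cast<unsigned char*>(
        std::malloc(sizeof(bytes) + bytes + sizeof(OLECHAR)));
    if (block == nullptr) return nullptr;
    std::memcpy(block, &bytes, sizeof(bytes));                          // prefix
    std::memcpy(block + sizeof(bytes), text, bytes + sizeof(OLECHAR));  // data
    return reinterpret_cast<BSTR>(block + sizeof(bytes));
}

inline UINT SysStringByteLen(BSTR bstr)
{
    std::uint32_t bytes = 0;
    if (bstr != nullptr)
        std::memcpy(&bytes,
                    reinterpret_cast<unsigned char*>(bstr) - sizeof(bytes),
                    sizeof(bytes));
    return bytes;
}

inline void SysFreeString(BSTR bstr)
{
    if (bstr != nullptr)
        std::free(reinterpret_cast<unsigned char*>(bstr) - sizeof(std::uint32_t));
}
#endif

// Corrected variant of GetString(): the [out] BSTR is a genuine allocation
// with a valid length prefix, and ownership passes to the caller.
void GetStringFixed(/*[out]*/ BSTR* pBstrReceiver)
{
    assert(pBstrReceiver != nullptr);
    *pBstrReceiver = ::SysAllocString(OLESTR("My BSTR"));
}
```

A native caller would declare BSTR received = NULL, call GetStringFixed(&received), use the string, and then call ::SysFreeString(received). In the C# client the interop marshaler performs that final step on the caller's behalf.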

5. Some Interesting Observations.

5.1 An invalidly allocated BSTR (like the one from the code snippet in point 1.1) whose preceding 4-byte length indicator has been manually modified to reflect the proper size will return a correct value when ::SysStringByteLen() is called.

5.2 Such a modified BSTR will also be correctly used by the interop marshaler when it is used to create a managed string (as is the case in the example C# code in section 4). With a proper length prefix, even if manually set, the managed string will contain data as intended by the original BSTR. But remember that even with the length prefix modified to a correct value, the BSTR may still not be freed when ::SysFreeString() is called on it.
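The observation in 5.1 can be reproduced safely on a buffer that the program owns, rather than on the bytes in front of a string literal. The names ReadPrefix() and PatchPrefix() are hypothetical, and the sketch assumes the caller really does own the 4 bytes before the data pointer :

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads the 4 bytes just before `data` as the byte length, which is how
// length-prefix-aware code treats a BSTR pointer.
std::uint32_t ReadPrefix(const unsigned char* data)
{
    assert(data != nullptr);
    std::uint32_t byteLen = 0;
    std::memcpy(&byteLen, data - sizeof(byteLen), sizeof(byteLen));
    return byteLen;
}

// Overwrites the 4 bytes just before `data`. The caller must own those bytes.
void PatchPrefix(unsigned char* data, std::uint32_t byteLen)
{
    assert(data != nullptr);
    std::memcpy(data - sizeof(byteLen), &byteLen, sizeof(byteLen));
}
```

If a 20-byte buffer holds 4 arbitrary bytes followed by the UTF-16 text “My BSTR”, then after PatchPrefix(buffer + 4, 14) a call to ReadPrefix(buffer + 4) returns 14. The patch changes what length queries report. It does not change who allocated the memory, which is why freeing remains a separate problem.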

About Lim Bio Liong

I've been in software development for nearly 20 years, specializing in C++, COM and C#. It's truly an exciting time we live in, with so many resources at our disposal to gain and share knowledge. I hope my blog will play a small part in this global knowledge sharing network. For many years now I've been deeply involved with C++ development work. However, since circa 2010, my work has required me to focus more and more on C#, particularly on COM interop. I've also written several articles for CodeProject. In recent years, however, I've concentrated my time more on helping others in the MSDN forums. Please feel free to leave a comment whenever you have any constructive criticism of any of my blog posts.

Discussion

3 thoughts on “The Importance of Proper BSTR Allocation.”

  1. Hi Lim,

    I have one query related to BSTR,
    Suppose i have declared BSTR as below

    BSTR xyz;

    then i passed “xyz” as address in function.

    like,

    ::PassingXYZ(&xyz);

    at client end i want to check that “xyz” should be initialized with something, at least NULL or Some Value in it,
    How can i do that?

    Client code will look like below

    PassingXYZ(BSTR *receivedXYZ)
    {

    // Here I have to check that “receivedXYZ” is initialized,
    // because here I am converting “receivedXYZ” to const char *, and if “receivedXYZ” is not initialized then it
    // crashes, and the exception is not even caught by the catch(...) block.

    // Also tell me how I can catch exceptions, if any, in COM related to the Sys functions
    // (SysAllocString, SysStringByteLen etc.).

    // Code snippet which is crashing:

    int length = SysStringByteLen( *receivedXYZ); // Here it is crashing because “BSTR xyz;” is not initialized

    // I have to check “xyz” here. How can I check?
    }

    Posted by Jayesh Patil | August 30, 2012, 12:34 pm
    • Hello Jayesh,

      1. There is no way to check whether a BSTR has been initialized except to check whether it is NULL. Robustness of code is achieved through adherence to protocol (explained below).

      2. A function like PassingXYZ() must be documented to indicate whether its parameter “receivedXYZ” is an [in] or [out] or [in, out] parameter.

      3. That is the convention by which APIs and COM methods determine how to process parameters (including a BSTR type parameter).

      4. If “receivedXYZ” is an [in] parameter, then the onus is on the caller to ensure that it has been initialized properly (including the possibility that it is set to NULL). In this case, the PassingXYZ() function must leave it alone and not change its value. It is also the responsibility of the caller to free the BSTR eventually. In this situation it does not make sense to set “receivedXYZ” to be of type BSTR*, it can simply be a BSTR.

      5. If “receivedXYZ” is an [out] parameter, then the caller need not initialize it. It is the responsibility of the PassingXYZ() function to allocate the BSTR and set its value internally. After the function call, the caller code must free the BSTR. In this case, it does make sense to type “receivedXYZ” as a BSTR*.

      6. If “receivedXYZ” is an [in, out] parameter, then it is the responsibility of the caller to first initialize it and then pass its address to PassingXYZ(). Internally, PassingXYZ() is allowed to change the contents of “receivedXYZ” (including calling SysReAllocString()). PassingXYZ() may also free the BSTR and then re-allocate a new one for “receivedXYZ”. The SysReAllocString() API is a good example of an API that takes an [in, out] BSTR parameter. In this case, “receivedXYZ” should be a BSTR*.

      7. Concerning catching of exceptions due to access violations (e.g. when SysStringByteLen() is called on an invalid BSTR), use Structured Exception Handling (SEH) instead of using C++ exception handling, e.g. :

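      void PassingXYZ(BSTR *receivedXYZ)
      {
        __try
        {
          // An access violation raised here is an SEH exception, not a C++ exception.
          UINT length = SysStringByteLen(*receivedXYZ);
          // Use length here.
        }
        __except(EXCEPTION_EXECUTE_HANDLER)
        {
          // Handle the access violation here.
        }
      }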

      – Bio.

      Posted by Lim Bio Liong | September 3, 2012, 4:21 am
