Inheriting From a Native C++ Class in C#
Hi, this is Jim Springfield, an architect on the Visual C++ team. I have blogged in the past about our IDE and IntelliSense work. I am still heavily focused on that and we are working hard to deliver an improved experience, but this post is about a completely different topic. A few months ago, I started thinking about how to access C++ classes from managed code and came up with this technique, which I haven’t seen mentioned anywhere else.
There are many ways that native code and managed code can interact and call each other. If you have native code that you want to call from C# you have several choices depending on the nature of the API. If you have a flat “C” API, you can use P/Invoke to directly call the API. If the native code is exposed using COM, the CLR’s COM Interop can provide access. If you have a C++ class, you could go add COM support, or write a custom wrapper using C++/CLI and expose a new managed class.
I really wanted something more direct than these. Initially, I was just trying to see if I could call a native C++ class from C#, but as I started playing with it, I realized that I could actually “inherit” from the native class. I put “inherit” in quotes, because you could make an argument that it isn’t truly inheritance, but I will let the reader make the final decision.
Let’s say I have a C++ class exposed from a DLL that I want to consume in C#. The class looks like the following.
#include <stdio.h>

class __declspec(dllexport) CSimpleClass {
public:
    int value;
    CSimpleClass(int value) : value(value)
    {
    }
    ~CSimpleClass()
    {
        printf("~CSimpleClass\n");
    }
    // Non-virtual method that calls each of the virtual functions.
    void M1()
    {
        printf("C++/CSimpleClass::M1()\n");
        V0();
        V1(value);
        V2();
    }
    virtual void V0()
    {
        printf("C++/CSimpleClass::V0()\n");
    }
    virtual void V1(int x)
    {
        printf("C++/CSimpleClass::V1(%d)\n", x);
    }
    virtual void V2()
    {
        printf("C++/CSimpleClass::V2()\n");
    }
};
The __declspec(dllexport) means that the class is exported from the DLL. What this really means is that all of the class methods are exported from the DLL. If I look at the list of exports using dumpbin.exe or depends.exe, I see the following list of exports.
??0CSimpleClass@@QAE@ABV0@@Z
??0CSimpleClass@@QAE@H@Z
??1CSimpleClass@@QAE@XZ
??4CSimpleClass@@QAEAAV0@ABV0@@Z
??_7CSimpleClass@@6B@
?M1@CSimpleClass@@QAEXXZ
?V0@CSimpleClass@@UAEXXZ
?V1@CSimpleClass@@UAEXH@Z
?V2@CSimpleClass@@UAEXXZ
These are decorated (i.e. “mangled”) names. For most of these, you can probably guess what the name is actually referring to.
(Note: Name mangling may change between versions of C++ and mangling is different between x86, x64, and Itanium platforms. The example here works on both VS2008 and the CTP release of VS2010.)
There is a nifty tool called undname.exe that ships with Visual Studio, which can take a mangled name and undecorate it. Running it on each of the names above gives the corresponding output.
public: __thiscall CSimpleClass::CSimpleClass(class CSimpleClass const &)
public: __thiscall CSimpleClass::CSimpleClass(int)
public: __thiscall CSimpleClass::~CSimpleClass(void)
public: class CSimpleClass & __thiscall CSimpleClass::operator=(class CSimpleClass const &)
const CSimpleClass::`vftable'
public: void __thiscall CSimpleClass::M1(void)
public: virtual void __thiscall CSimpleClass::V0(void)
public: virtual void __thiscall CSimpleClass::V1(int)
public: virtual void __thiscall CSimpleClass::V2(void)
Other than the methods we explicitly defined, there are also a compiler-generated copy constructor and assignment operator, plus a reference to the vtable for this class. OK, so I know that using P/Invoke, C# can call into native DLL entry points, and I just happen to have a list of native entry points.
First, however, we need to define a structure in C# that corresponds to the native class. Our native class only has one field: an int. However, it does have virtual methods, so there is also a vtable pointer at the beginning of the class.
(Note: I am only dealing with single inheritance here. With multiple inheritance, there are multiple vtables and vtable pointers.)
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public unsafe struct __CSimpleClass
{
    public IntPtr* _vtable;
    public int value;
}
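The layout assumption behind __CSimpleClass (vtable pointer first, then the int) can be checked from the native side. The following is a minimal sketch, not part of the original code: SimpleLayout and valueOffset are hypothetical names used only for illustration, and the result is an implementation detail of the compiler rather than something the C++ standard guarantees. It holds for the common single-inheritance case on MSVC, GCC, and Clang.

```cpp
#include <cstddef>

// Hypothetical stand-in for the native class: one int field plus a virtual method.
class SimpleLayout {
public:
    int value;
    explicit SimpleLayout(int v) : value(v) {}
    virtual void V0() {}
};

// Measures, at run time, how many bytes precede `value` inside the object.
// With a single vtable pointer stored first, this is the size of one pointer.
std::size_t valueOffset() {
    SimpleLayout obj(10);
    const char* base = reinterpret_cast<const char*>(&obj);
    const char* field = reinterpret_cast<const char*>(&obj.value);
    return static_cast<std::size_t>(field - base);
}
```

If valueOffset() ever returned something other than sizeof(void*), the C# structure would need a matching change.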
Next, I am going to define a C# class that wraps the native class and mimics it. I want to expose synchronous destruction, and the C# equivalent of that is implementing IDisposable, which I do here. I also create a matching constructor and expose the “M1” method of CSimpleClass. I use “DllImport” to specify the DLL name, entry point, and calling convention. The “ThisCall” convention is the default for C++ member functions on x86.
(Note: to be safer, I should explicitly specify calling conventions and structure packing in my native code, but that is left out for brevity. If they aren’t explicitly specified, compiler options can change the defaults.)
There are calls in the code below to Memory.Alloc and Memory.Free. These were implemented by me and just forward to HeapAlloc/HeapFree in kernel32.dll.
public unsafe class CSimpleClass : IDisposable
{
    private __CSimpleClass* _cpp;
    // CSimpleClass constructor and destructor
    [DllImport("cppexp.dll", EntryPoint = "??0CSimpleClass@@QAE@H@Z", CallingConvention = CallingConvention.ThisCall)]
    private static extern int _CSimpleClass_Constructor(__CSimpleClass* ths, int value);
    [DllImport("cppexp.dll", EntryPoint = "??1CSimpleClass@@QAE@XZ", CallingConvention = CallingConvention.ThisCall)]
    private static extern int _CSimpleClass_Destructor(__CSimpleClass* ths);
    // void M1();
    [DllImport("cppexp.dll", EntryPoint = "?M1@CSimpleClass@@QAEXXZ", CallingConvention = CallingConvention.ThisCall)]
    private static extern void _M1(__CSimpleClass* ths);
    public CSimpleClass(int value)
    {
        //Allocate storage for object
        _cpp = (__CSimpleClass*)Memory.Alloc(sizeof(__CSimpleClass));
        //Call constructor
        _CSimpleClass_Constructor(_cpp, value);
    }
    public void Dispose()
    {
        //call destructor
        _CSimpleClass_Destructor(_cpp);
        //release memory
        Memory.Free(_cpp);
        _cpp = null;
    }
    public void M1()
    {
        _M1(_cpp);
    }
}
So, at this point I can create a CSimpleClass in C# and call the “M1” method like this. The “using” statement defines a scope. At the end of the scope, Dispose() will automatically be called on sc.
static void Main(string[] args)
{
    CSimpleClass sc = new CSimpleClass(10);
    using (sc)
    {
        //M1 calls all of the virtual functions V0,V1,V2
        sc.M1();
    }
}
Running this code gives me the following output on the console. M1 calls each of the virtual functions V0, V1, and V2.
C++/CSimpleClass::M1()
C++/CSimpleClass::V0()
C++/CSimpleClass::V1(10)
C++/CSimpleClass::V2()
OK, this is pretty cool, right? That’s what I was thinking anyway. A couple of days later, I picked up this code again and started thinking that it would be really cool if I could override a virtual function. I’ve already got the vtable pointer in my __CSimpleClass structure. I know that the vtable pointer points to an array of function pointers, at least in the simple single inheritance case. (Multiple inheritance and virtual inheritance can add some significant wrinkles to this.) If I can change a function in the vtable, then I’ve overridden it. The vtables themselves are shared by all instances of a class, so I can’t just go pound a slot in the vtable with my own function pointer. I need to actually create my own vtable.
I need to construct an array of native pointers to my virtual method overrides and replace the vtable pointer with a pointer to my vtable. As it turns out, the .NET libraries provide a mechanism to implement callbacks from native code. This is Marshal.GetFunctionPointerForDelegate, and it works just fine for our needs.
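The idea of giving one object its own vtable can be modeled in portable C++ without touching any compiler-generated tables. The sketch below is only an illustration under stated assumptions: Object, Slot, kSharedVtable, and the run functions are hypothetical names, and the "vtable" is a hand-written array of function pointers rather than the real one. It shows the same three steps: copy the shared table, patch one slot, and point the object at the private copy.

```cpp
#include <string>
#include <vector>

struct Object;
using Slot = void (*)(Object*, std::string&);

// Hand-rolled object: table pointer first, then the data, like the native layout.
struct Object {
    const Slot* vtable;
    int value;
};

void baseV0(Object*, std::string& log) { log += "base V0\n"; }
void baseV1(Object* self, std::string& log) {
    log += "base V1(" + std::to_string(self->value) + ")\n";
}
void baseV2(Object*, std::string& log) { log += "base V2\n"; }
void overrideV2(Object*, std::string& log) { log += "override V2\n"; }

// One table shared by every ordinary instance; it is never modified.
const Slot kSharedVtable[3] = { baseV0, baseV1, baseV2 };

// Non-virtual method that dispatches through whatever table the object points at.
void M1(Object* self, std::string& log) {
    for (int i = 0; i < 3; ++i) self->vtable[i](self, log);
}

// Calls M1 on an object that still uses the shared table.
std::string runShared() {
    Object obj{ kSharedVtable, 10 };
    std::string log;
    M1(&obj, log);
    return log;
}

// Calls M1 on an object that has been given a private, patched table.
std::string runWithOverride() {
    Object obj{ kSharedVtable, 10 };
    std::vector<Slot> privateTable(kSharedVtable, kSharedVtable + 3);
    privateTable[2] = overrideV2;        // replace only the V2 slot
    const Slot* saved = obj.vtable;      // remember the original table
    obj.vtable = privateTable.data();
    std::string log;
    M1(&obj, log);
    obj.vtable = saved;                  // restore before the object goes away
    return log;
}
```

Because only the private copy is patched, runShared() still reaches baseV2 while runWithOverride() reaches overrideV2.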
First of all, we need to use DllImport to get access to the virtual functions we are overriding. This is just like what we did to access the M1 method above. The example below only shows the code for V1, but we actually need it for V0 and V2 as well. I chose V1 for the example as it is the only virtual that takes a parameter. The others take no arguments.
[DllImport("cppexp.dll", EntryPoint = "?V1@CSimpleClass@@UAEXH@Z", CallingConvention = CallingConvention.ThisCall)]
private static extern void _V1(__CSimpleClass* ths, int i);
Now, we need to implement our override in the managed version of CSimpleClass. It simply forwards to the _V1 that we defined above, which is a direct call to the native version in cppexp.dll.
public virtual void V1(int i)
{
    _V1(_cpp, i);
}
The tricky part is to get our new virtual function V1 into the vtable. This can be done by creating a delegate in our class. We declare a delegate and specify an instance of it. Again, we need to do this for V0 and V2 as well.
public delegate void V1_Delegate(int i);
public V1_Delegate _v1_Delegate;
In our C# CSimpleClass constructor, we need to create the delegates, call Marshal.GetFunctionPointerForDelegate for each delegate, put the results into an array, and overwrite the vtable pointer in the native object. We also remember the old vtable pointer so that we can restore it in the Dispose method before the native destructor runs. C++ differs from C# in this regard: as an object is constructed, its vtable pointer changes to match the current level in the inheritance hierarchy.
If you look closely at the final class, you will see two helper functions that I’ve defined: InitVtable and ResetVtable. InitVtable copies the function pointers from the managed array into native memory and then patches the vtable pointer of the object. ResetVtable puts the old vtable pointer back and frees the memory of the created vtable.
In C++, a single copy of the vtable is shared by all instances of a class, but here we create a unique vtable for each instance. This is needed because each delegate is bound to the actual managed object rather than being just a pointer to a method that takes a “this” pointer. We don’t actually use the “this” pointer that is passed to us from native code, since the delegate implicitly knows the managed object and the managed object contains a pointer to the native object. Here is what the final class looks like.
public unsafe class CSimpleClass : IDisposable
{
    private __CSimpleClass* _cpp;
    private IntPtr* _oldvtbl;
    private void InitVtable(__CSimpleClass* ths, IntPtr[] arr, int len)
    {
        IntPtr* newvtable = (IntPtr*)Memory.Alloc(len * sizeof(IntPtr));
        for (int i = 0; i < len; i++)
            newvtable[i] = arr[i];
        _oldvtbl = ths->_vtable;
        ths->_vtable = newvtable;
    }
    private void ResetVtable(__CSimpleClass* ths)
    {
        IntPtr* oldvtbl = ths->_vtable;
        ths->_vtable = _oldvtbl;
        Memory.Free(oldvtbl);
    }
    // CSimpleClass constructor and destructor
    [DllImport("cppexp.dll", EntryPoint = "??0CSimpleClass@@QAE@H@Z", CallingConvention = CallingConvention.ThisCall)]
    private static extern int _CSimpleClass_Constructor(__CSimpleClass* ths, int value);
    [DllImport("cppexp.dll", EntryPoint = "??1CSimpleClass@@QAE@XZ", CallingConvention = CallingConvention.ThisCall)]
    private static extern int _CSimpleClass_Destructor(__CSimpleClass* ths);
    // void M1();
    // virtual void V0();
    // virtual void V1(int x);
    // virtual void V2();
    [DllImport("cppexp.dll", EntryPoint = "?M1@CSimpleClass@@QAEXXZ", CallingConvention = CallingConvention.ThisCall)]
    private static extern void _M1(__CSimpleClass* ths);
    [DllImport("cppexp.dll", EntryPoint = "?V0@CSimpleClass@@UAEXXZ", CallingConvention = CallingConvention.ThisCall)]
    private static extern void _V0(__CSimpleClass* ths);
    [DllImport("cppexp.dll", EntryPoint = "?V1@CSimpleClass@@UAEXH@Z", CallingConvention = CallingConvention.ThisCall)]
    private static extern void _V1(__CSimpleClass* ths, int i);
    [DllImport("cppexp.dll", EntryPoint = "?V2@CSimpleClass@@UAEXXZ", CallingConvention = CallingConvention.ThisCall)]
    private static extern void _V2(__CSimpleClass* ths);
    public delegate void V0_Delegate();
    public delegate void V1_Delegate(int i);
    public delegate void V2_Delegate();
    public V0_Delegate _v0_Delegate;
    public V1_Delegate _v1_Delegate;
    public V2_Delegate _v2_Delegate;
    public CSimpleClass(int value)
    {
        //Allocate storage for object
        _cpp = (__CSimpleClass*)Memory.Alloc(sizeof(__CSimpleClass));
        //Call constructor
        _CSimpleClass_Constructor(_cpp, value);
        //Create delegates for the virtual functions
        _v0_Delegate = new V0_Delegate(V0);
        _v1_Delegate = new V1_Delegate(V1);
        _v2_Delegate = new V2_Delegate(V2);
        IntPtr[] arr = new IntPtr[3];
        arr[0] = Marshal.GetFunctionPointerForDelegate(_v0_Delegate);
        arr[1] = Marshal.GetFunctionPointerForDelegate(_v1_Delegate);
        arr[2] = Marshal.GetFunctionPointerForDelegate(_v2_Delegate);
        //Create a new vtable and replace it in the object
        InitVtable(_cpp, arr, 3);
    }
    public void Dispose()
    {
        //reset old vtable pointer
        ResetVtable(_cpp);
        //call destructor
        _CSimpleClass_Destructor(_cpp);
        //release memory
        Memory.Free(_cpp);
        _cpp = null;
    }
    public void M1()
    {
        _M1(_cpp);
    }
    public virtual void V0()
    {
        _V0(_cpp);
    }
    public virtual void V1(int i)
    {
        _V1(_cpp, i);
    }
    public virtual void V2()
    {
        _V2(_cpp);
    }
}
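The remark above that a C++ object's vtable pointer changes as it is constructed can be observed from plain C++. This is a small sketch for illustration only: Base, Derived, and the two helper functions are hypothetical names and are not part of cppexp.dll. A virtual call made inside the base constructor resolves to the base version, because the derived part of the object does not exist yet.

```cpp
#include <string>

struct Base {
    std::string seenInCtor;
    // While Base() runs, the object is still "a Base", so name() means Base::name.
    Base() { seenInCtor = name(); }
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// What the virtual call returned while the base constructor was running.
std::string duringConstruction() {
    Derived d;
    return d.seenInCtor;
}

// What the same virtual call returns once construction has finished.
std::string afterConstruction() {
    Derived d;
    const Base& b = d;
    return b.name();
}
```

Here duringConstruction() yields "Base" and afterConstruction() yields "Derived", which is why the C# wrapper patches the vtable pointer only after the native constructor has returned and restores it before calling the destructor.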
We have a managed CSimpleClass with virtual methods that can be overridden in a derived class. If we create a new C# class that inherits from CSimpleClass, we can override any virtual functions. In CSimpleClassEx, we are overriding V2 and writing out some text.
class CSimpleClassEx : CSimpleClass
{
    public CSimpleClassEx(int value)
        : base(value)
    {
    }
    public override void V2()
    {
        Console.WriteLine("C#/CSimpleClassEx.V2()");
    }
}
If we create an instance of CSimpleClassEx and call M1, we now get the following output.
C++/CSimpleClass::M1()
C++/CSimpleClass::V0()
C++/CSimpleClass::V1(10)
C#/CSimpleClassEx.V2()
So, what do you think? Is it really inheritance? Is it just a stupid trick? It definitely requires a lot of manual code writing to make this work, but let’s do some blue sky thinking for a bit. It is easy to get a list of exports from the DLL, and the mangled names encapsulate a good bit of information, including calling convention, name, return type, and parameters. I could probably write a tool to generate this. And if I have the PDB, I could get the structure of the class including data members, structure packing, etc.
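As a rough idea of what the first step of such a tool might look like, here is a minimal sketch that classifies the decorated names listed earlier. It is not a real demangler: ExportInfo and describeExport are hypothetical names, and the code only recognizes the simple x86 patterns that appear in this post (??0 for a constructor, ??1 for a destructor, ??4 for operator=, ??_7 for the vftable, and a leading U after the class name for a public virtual member). Parameter and return types are ignored, and anything else is reported as unknown.

```cpp
#include <cstddef>
#include <string>

// Coarse description of one decorated export name.
struct ExportInfo {
    std::string kind = "unknown";  // constructor, destructor, operator=, vftable, method
    std::string className;
    std::string member;            // method name; empty for special members
    bool isVirtual = false;
};

ExportInfo describeExport(const std::string& decorated) {
    ExportInfo info;
    if (decorated.size() < 2 || decorated[0] != '?') return info;

    std::size_t pos = 1;
    if (decorated[pos] == '?') {
        // Special member: a short code follows the second '?'.
        ++pos;
        if (decorated.compare(pos, 2, "_7") == 0) { info.kind = "vftable"; pos += 2; }
        else if (decorated[pos] == '0') { info.kind = "constructor"; ++pos; }
        else if (decorated[pos] == '1') { info.kind = "destructor"; ++pos; }
        else if (decorated[pos] == '4') { info.kind = "operator="; ++pos; }
        else return info;
    } else {
        // Ordinary member: the method name runs up to the first '@'.
        std::size_t at = decorated.find('@', pos);
        if (at == std::string::npos) return info;
        info.kind = "method";
        info.member = decorated.substr(pos, at - pos);
        pos = at + 1;
    }

    // The class name runs up to the "@@" terminator.
    std::size_t end = decorated.find("@@", pos);
    if (end == std::string::npos) return ExportInfo();
    info.className = decorated.substr(pos, end - pos);

    // In the names shown in this post, 'U' marks a public virtual member.
    pos = end + 2;
    info.isVirtual = pos < decorated.size() && decorated[pos] == 'U';
    return info;
}
```

For example, describeExport("?V1@CSimpleClass@@UAEXH@Z") reports a virtual method V1 on CSimpleClass, which is enough to decide that it needs both a DllImport entry and a vtable slot. A real generator would still have to decode the parameter list and cope with the differences between compiler versions and platforms noted earlier.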
Now, back to working on C++ IDE performance and scalability for Visual Studio 2010.