Thursday, April 3, 2008

Introducing epy-plugin


Let's say you like programming in Python, you have an application that supports extension plugins (e.g. Total Commander), and you have an idea for such a plugin. How would you use Python to write it?

Normally, if you wanted to use a scripting language to write a plugin for an application, you'd have to embed the interpreter for that language in a DLL and export native functions for the plugin's interface -- stubs that call into the interpreted language. These stubs would have to deal with the parameter marshaling between the two languages. The result would be a DLL that only works with that plugin interface.

Using the ctypes module (which is part of the standard library as of Python 2.5, and which inspired this approach) it is possible to create wrappers for Python functions that look like native functions to the outside world (i.e. ctypes deals with the trouble of creating appropriate stubs for our Python functions). So all we have to do now is come up with a generic DLL that simply forwards calls to the ctypes-generated wrappers, and the native calls will automagically land in our Python code. The goal of the epy-plugin project is to create THAT generic DLL.

How it works (what I've come up with so far):

  • the Python side: we mark all the functions (callables) we want to expose; these basically represent the plugin interface we want to implement in Python. We wrap these functions using ctypes and we compile a list of tuples (apiName, apiAddress) that we pass to the DLL.
  • the DLL (C/C++) side: at load time we get the list of APIs exposed by our script and we patch the DLL's IMAGE_EXPORT_DIRECTORY, redirecting all exported functions to the addresses of the ctypes-generated wrappers.
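
The Python side described above can be sketched roughly as follows. This is only an illustration, not epy-plugin's actual code: the `export` decorator, the `_exports` registry and `api_table()` are hypothetical names. It uses `ctypes.CFUNCTYPE` (cdecl) so that it runs anywhere; a plugin interface that uses stdcall on Windows would presumably use `ctypes.WINFUNCTYPE` instead.

```python
import ctypes

# Hypothetical registry of exported callables (illustrative, not epy-plugin's API).
# Each entry is (name, address, wrapper); the wrapper is stored to keep it alive.
_exports = []


def export(restype, *argtypes):
    """Mark a Python callable as part of the plugin interface."""
    prototype = ctypes.CFUNCTYPE(restype, *argtypes)

    def decorator(func):
        # ctypes builds a native-callable thunk around the Python function.
        wrapper = prototype(func)
        # The thunk's entry point, as a plain integer address.
        address = ctypes.cast(wrapper, ctypes.c_void_p).value
        _exports.append((func.__name__, address, wrapper))
        return func

    return decorator


@export(ctypes.c_int, ctypes.c_int, ctypes.c_int)
def add_numbers(a, b):
    return a + b


def api_table():
    """Return the list of (apiName, apiAddress) tuples to hand to the DLL."""
    return [(name, address) for name, address, _ in _exports]
```

The wrapper objects have to stay referenced for as long as native code may call them; once a wrapper is garbage collected, its address is no longer valid.
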
In the case of Run-time Dynamic Linking (the case of extension plugins) our DLL's export directory doesn't need to be created in advance. For Load-time Dynamic Linking the export directory must already contain the API list because the PE loader checks the exported symbols before calling the DLL's entry-point.
At the moment, the code in the repository only works with run-time dynamic linking: the export directory contains only a dummy export, and the real one is generated on the fly when the module is loaded.
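
Run-time linking is the easier case because the host resolves exports by name only after the DLL has been loaded (LoadLibrary followed by GetProcAddress on Windows), so the export directory can still be patched during loading. The snippet below shows the same kind of by-name lookup from Python via ctypes. It uses the already-loaded Python runtime (`ctypes.pythonapi`) as a stand-in for a plugin DLL, and the `resolve` helper is a hypothetical name used only for illustration.

```python
import ctypes
import sys

# Handle to the Python runtime library that is already loaded in this process.
runtime = ctypes.pythonapi


def resolve(library, symbol_name):
    """Look up an export by name at run time; return None if it is missing."""
    try:
        return getattr(library, symbol_name)
    except AttributeError:
        return None


# Py_GetVersion is a real C API function: const char *Py_GetVersion(void).
get_version = resolve(runtime, "Py_GetVersion")
get_version.restype = ctypes.c_char_p
get_version.argtypes = []
version_text = get_version().decode("ascii")

# A name that is not exported simply fails to resolve.
missing = resolve(runtime, "NoSuchPluginFunction")
```

A host application does much the same thing with a plugin: it asks for each function of the plugin interface by name and treats unresolved names as unimplemented.
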

Because CPython doesn't support multiple independent interpreters in the same address space (the Global Interpreter Lock is effectively a "Global Process Lock"), there are some issues with epy-plugin:
  • you can't have more than one Python-based plugin in a process; for example, in the current implementation of epy-plugin, when one plugin is unloaded the rest will break, because unloading calls Py_Finalize(), which kills the interpreter.
  • the application for which you write the plugin can't already embed Python.
One possible route to investigate would be to make epy-plugin aware of other instances of itself in the same process and fiddle with PyInterpreterState and PyThreadState objects. This would most likely mean that we can't use straight redirection of function calls anymore: a C/C++ stub would have to be generated for each exported function to set the right interpreter/thread state for the current plugin before calling into Python. (This might not solve the issue of the application already embedding Python through means other than epy-plugin.)

The other route (which I'm more inclined to follow) would be to not share python25.dll at all, by using something like Joachim Bauch's MemoryModule to privately load the Python interpreter and its extension dependencies into each instance of epy-plugin. That way the interpreter instances would run completely isolated, and there would be no worry about anyone besides our script messing with the Python runtime. Of course, there is no guarantee that this method works with something as complex as the Python interpreter...

Note: while searching the web for solutions to the "multiple python interpreters in the same address space" problem, I came across an entry on the ctypes wiki that confirms it is possible to privately load multiple Python interpreters (and also shows someone else thought about writing plugins in Python).