2018年8月24日 星期五

Interfacing Python and C: Advanced “ctypes” Features

Interfacing Python and C: Advanced “ctypes” Features


Learn advanced patterns for interfacing Python with native libraries, like dealing with C structs from Python and pass-by-value versus pass-by-reference semantics.
Python ctypes Tutorial
The built-in ctypes module is a powerful feature in Python, allowing you to use existing libraries in other languages by writting simple wrappers in Python itself.
In the first part of this tutorial, we covered the basics of ctypes. In part two we will dig a little deeper, covering:
  • Creating simple Python classes to mirror C structures
  • Dealing with C pointers in Python: Pass-by-value vs Pass-by-reference
  • Expanding our C structure wrappers to hide complexity from Python code
  • Interacting with nested C structures from Python
Again, let’s start by taking a look with the simple C library we will be using and how to build it, and then jump into loading a C library and calling functions in it.

Interfacing Python and C: The C Library Testbed

As with the previous tutorial, all of the code to build and test the examples discussed here (as well as the Markdown for this article) are committed to my GitHub repository.
The library consists of two data structures: Point and Line. A Point is a pair of (x, y) coordinates while a Line has a start and end point. There are also a handful of functions which modify each of these types.
Let’s take a closer look at the Point structure and the functions surrounding it. Here’s the corresponding C code split into a Point.h header file and a Point.cimplementation:
/* Point.h */
/* Simple structure for ctypes example */
typedef struct {
    int x;
    int y;
} Point;
/* Point.c */
/* Display a Point value */
void show_point(Point point) {
    printf("Point in C      is (%d, %d)\n", point.x, point.y);
}

/* Increment a Point which was passed by value */
void move_point(Point point) {
    show_point(point);
    point.x++;
    point.y++;
    show_point(point);
}

/* Increment a Point which was passed by reference */
void move_point_by_ref(Point *point) {
    show_point(*point);
    point->x++;
    point->y++;
    show_point(*point);
}

/* Return by value */
Point get_point(void) {
    static int counter = 0;
    Point point = { counter++, counter++ };
    printf("Returning Point    (%d, %d)\n", point.x, point.y);
    return point;
}
I won’t go into each of these functions in detail as they are fairly straightforward. The most interesting bit here is the difference between move_point and move_point_by_ref. We’ll talk a bit later about this when we discuss pass-by-value and pass-by-reference semantics.
We’ll also be using a Line structure, which is composed of two Points:
/* Line.h */
/* Compound C structure for our ctypes example */
typedef struct {
    Point start;
    Point end;
} Line;
/* Line.c */
void show_line(Line line) {
    printf("Line in C      is (%d, %d)->(%d, %d)\n",
           line.start.x, line.start.y,
           line.end.x, line.end.y);
}

void move_line_by_ref(Line *line) {
    show_line(*line);
    move_point_by_ref(&line->start);
    move_point_by_ref(&line->end);
    show_line(*line);
}

Line get_line(void) {
    Line l = { get_point(), get_point() };
    return l;
}
The Point structure and its associated functions will allow us to show how to wrap structures and deal with memory references in ctypes. The Line structure will allow us to work with nested structures and the complications that arise from that.
The Makefile in the repo is set up to completely build and run the demo from scratch:
all: point wrappedPoint line

clean:
    rm *.o *.so

libpoint.so: Point.o
    gcc -shared $^ -o $@

libline.so: Point.o Line.o
    gcc -shared $^ -o $@

.o: .c
    gcc -c -Wall -Werror -fpic $^

point: libpoint.so
    ./testPoint.py

wrappedPoint: libpoint.so
    ./testWrappedPoint.py

line: libline.so
    ./testLine.py

doc:
    pandoc ctypes2.md > ctypes2.html
    firefox ctypes2.html
To build and run the demo you only need to run the following command in your shell:
$ make

Creating Simple Python Classes to Mirror C Structures

Now that we’ve seen the C code we’ll be using, we can start in on Python and ctypes. We’ll start with a quick wrapper function that will simplify the rest of our code, then we’ll look at how to wrap C structures. Finally, we’ll discuss dealing with C pointers from Python and the differences between pass-by-value and pass-by-reference.

Wrapping ctypes Functions

Before we get into the depths of this tutorial, I’ll show you a utility function we’ll be using throughout. This Python function is called wrap_function. It takes the object returned from ctypes.CDLL and the name of a function (as a string). It returns a Python object which holds the function and the specified restype and argtypes:
def wrap_function(lib, funcname, restype, argtypes):
    """Simplify wrapping ctypes functions"""
    func = lib.__getattr__(funcname)
    func.restype = restype
    func.argtypes = argtypes
    return func
These are concepts covered in my previous ctypes tutorial, so if this doesn’t make sense, it might be worth reviewing part one again.

Mirroring C Structures with Python Classes

Creating Python classes which mirror C structs requires little code, but does have a little magic behind the scenes:
class Point(ctypes.Structure):
    _fields_ = [('x', ctypes.c_int), ('y', ctypes.c_int)]

    def __repr__(self):
        return '({0}, {1})'.format(self.x, self.y)
As you can see above, we make use of the _fields_ attribute of the class. Please note the single underscore—this is not a “dunder” function. This attribute is a list of tuples and allows ctypes to map attributes from Python back to the underlying C structure.
Let’s look at how it’s used:
>>> libc = ctypes.CDLL('./libpoint.so')
>>> show_point = wrap_function(libc, 'show_point', None, [Point])
>>> p = Point(1, 2)
>>> show_point(p)
'(1, 2)'
Notice that we can access the x and y attributes of the Point class in Python in the __repr__ function. We can also pass the Point directly to the show_pointfunction in the C library. Ctypes uses the _fields_ map to manage the conversions automatically for you. Care should be taken with using the _fields_attribute, however. We’ll look at this in a little more detail in the nested structures section below.

Pass-by-value vs Pass-by-reference (pointers)

In Python we get used to referring to things as either mutable or immutable. This controls what happens when you modify an object you’ve passed to a function. For example, number objects are immutable. When you call myfunc in the code below, the value of y does not get modified. The program prints the value 9:
def myfunc(x):
    x = x + 2

y = 9
myfunc(y)
print("this is y", y)
Contrarily, list objects are mutable. In a similar function:
def mylistfunc(x):
    x.append("more data")

z = list()
mylistfunc(z)
print("this is z", z)
As you can see, the list, z, that is passed in to the function is modified and the output is this is z ['more data']
When interfacing with C, we need to take this concept a step further. When we pass a parameter to a function, C always “passes by value”. What this means is that, unless you pass in a pointer to an object, the original object is never changed. Applying this to ctypes, we need to be aware of which values are being passed as pointers and thus need the ctypes.POINTER(Point) type applied to them.
In the example below, we have two versions of the function to move a point: move_point, which passes by value, and move_point_by_ref which passes by reference.
# --- Pass by value ---
print("Pass by value")
move_point = wrap_function(libc, 'move_point', None, [Point])
a = Point(5, 6)
print("Point in Python is", a)
move_point(a)
print("Point in Python is", a)
print()
# --- Pass by reference ---
print("Pass by reference")
move_point_by_ref = wrap_function(libc, 'move_point_by_ref', None,
                                  [ctypes.POINTER(Point)])
a = Point(5, 6)
print("Point in Python is", a)
move_point_by_ref(a)
print("Point in Python is", a)
print()
The output from these two code sections looks like this:
Pass by value
Point in Python is (5, 6)
Point in C      is (5, 6)
Point in C      is (6, 7)
Point in Python is (5, 6)

Pass by reference
Point in Python is (5, 6)
Point in C      is (5, 6)
Point in C      is (6, 7)
Point in Python is (6, 7)
As you can see, when we call move_point, the C code can change the value of the Point, but that change is not reflected in the Python object. When we call move_point_by_ref, however, the change is visible in the Python object. This is because we passed the address of the memory which holds that value and the C code took special care (via using the -> accessor) to modify that memory.
When working in cross-language interfaces, memory access and memory management are important aspects to keep in mind.

Accessing C Structs from Python – An OOP Wrapper

We saw above that providing a simple wrapper to a C structure is quite easy using ctypes. We can also expand this wrapper to make it behave like a “proper” Python class instead of a C struct using object-oriented programming principles.
Here’s an example:
class Point(ctypes.Structure):
    _fields_ = [('x', ctypes.c_int), ('y', ctypes.c_int)]

    def __init__(self, lib, x=None, y=None):
        if x:
            self.x = x
            self.y = y
        else:
            get_point = wrap_function(lib, 'get_point', Point, None)
            self = get_point()

        self.show_point_func = wrap_function(lib, 'show_point', None, [Point])
        self.move_point_func = wrap_function(lib, 'move_point', None, [Point])
        self.move_point_ref_func = wrap_function(lib, 'move_point_by_ref', None,
                                                 [ctypes.POINTER(Point)])

    def __repr__(self):
        return '({0}, {1})'.format(self.x, self.y)

    def show_point(self):
        self.show_point_func(self)

    def move_point(self):
        self.move_point_func(self)

    def move_point_by_ref(self):
        self.move_point_ref_func(self)
You’ll see the _fields_ and __repr__ attributes are the same as we had in our simple wrapper, but now we’ve added a constructor and wrapping functions for each method we’ll use.
The interesting code is all in the constructor. The initial part initializes the x and yfields. You can see that we have two methods to achieve this. If the user passed in values, we can directly assign those to the fields. If the default values were used, we call the get_point function in the library and assign that directly to self.
Once we’ve initialized the fields in our Point class, we then wrap the functions into attributes of our class to allow them to be accessed in a more object orientedmanner.
In the testWrappedPoint module, we do the same tests we did with our Point class but instead of passing the Point class to the function, move_point_by_ref(a), we call the function on the object a.move_point_by_ref().

Accessing Nested C Structures From Python

Finally, we’re going to look at how to use nested structures in ctypes. The obvious next step in our example is to extend a Point to a Line:
class Line(ctypes.Structure):
    _fields_ = [('start', testPoint.Point), ('end', testPoint.Point)]

    def __init__(self, lib):
        get_line = wrap_function(lib, 'get_line', Line, None)
        line = get_line()
        self.start = line.start
        self.end = line.end
        self.show_line_func = wrap_function(lib, 'show_line', None, [Line])
        self.move_line_func = wrap_function(lib, 'move_line_by_ref', None,
                                            [ctypes.POINTER(Line)])

    def __repr__(self):
        return '{0}->{1}'.format(self.start, self.end)

    def show_line(self):
        self.show_line_func(self)

    def moveLine(self):
        self.move_line_func(self)
Most of this class should look fairly familiar if you’ve been following along. The one interesting difference is how we initialize the _fields_ attribute. You’ll remember in the Point class we could assign the returned value from get_point() directly to self. This doesn’t work with our Line wrapper as the entries in the _fields_ list are not basic CTypes types, but rather a subclass of one of them. Assigning these directly tends to mess up how the value is stored so that the Python attributes you add to the class are inaccessible.
The basic rule I’ve found in wrapping structures like this is to only add the Python class attributes at the top level and leave the inner structures (i.e. Point) with the simple _fields_ attribute.

Advanced ctypes Features – Conclusion

In this tutorial we covered some more advanced topics in using the ctypesmodule to interface Python with external C libraries. I found several resources out there while researching:
  • The ctypesgen project has tools which will auto generate Python wrapping modules for C header files. I spent some time playing with this and it looks quite good.
  • The idea for the wrap_function function was lifted shamelessly from some ctypes tips here.
In the first part of this tutorial, we covered the basics of ctypes, so be sure to check there if you’re looking for a ctypes primer. And, finally, if you’d like to see and play with the code I wrote while working on this, please visit my GitHub repository. This tutorial is in the tutorial2 directory.
This article was filed under: ctypesoptimizationprogramming, and python.

Related Articles:
  • Extending Python With C Libraries and the “ctypes” Module – An end-to-end tutorial of how to extend your Python programs with libraries written in C, using the built-in “ctypes” module.
  • Interfacing Python and C: The CFFI Module – How to use Python’s built-in CFFI module for interfacing Python with native libraries as an alternative to the “ctypes” approach.
  • Fundamental Data Structures in Python – In this article series we’ll take a tour of some fundamental data structures and implementations of abstract data types (ADTs) available in Python’s standard library.
  • What Are Python Generators? – Generators are a tricky subject in Python. With this tutorial you’ll make the leap from class-based iterators to using generator functions and the “yield” statement in no time.
  • Debugging memory usage in a live Python web app – I worked on a Python web app a while ago that was struggling with using too much memory in production. A helpful technique for debugging this issue was adding a simple API endpoint that exposed memory stats while the app was running.

About the Author

Jim Anderson
Jim has been programming for a long time in a variety of languages. He has worked on embedded systems, built distributed build systems, done off-shore vendor management, and sat in many, many meetings.
He currently gets paid for writing C++ for embedded systems, but loves Python. Jim is a proud member of PythonistaCafe and blogs over at snowboardingcoder.com.

沒有留言:

張貼留言