Python custom exceptions in C(++) extensions

Python is a nice scripting language, but it has a reputation for being difficult to embed and to integrate with existing projects.

Until now, I have been able to embed it in the software I work on with pybind11 and Binder (half of my MS degree thesis is about that). I had some problems, but I was able to solve most of them. Recently, however, I ran into a problem that is rarely discussed on the Internet: declaring a new exception with some custom methods/properties in a native (C/C++) module.

The existing proposal

I quickly searched the Internet for the problem. The Python C API makes it possible to create an exception type with PyObject *PyErr_NewException(const char *name, PyObject *bases, PyObject *dict): the first parameter is the name of the exception, in the modulename.ExceptionName format; the second is the base or bases of the new exception (a single type or a tuple of types); and the third is a dictionary of custom members and methods for the new type.

However, both the second and the third parameter can be left as NULL: in this case the new type will inherit from PyExc_Exception and will not have any customizations. This is what happens in the majority of cases, and by searching the CPython code I could not find anything that I could use as an example of how to use the dict parameter.
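To get a feeling for what the three parameters mean, here is a pure-Python sketch that mimics the behaviour described above with the three-argument form of type. The helper name new_exception is hypothetical, and the way it handles the defaults is my approximation of the C function, not a copy of it.

```python
def new_exception(qualified_name, bases=None, members=None):
    """Rough Python-level analogue of PyErr_NewException (hypothetical helper)."""
    module_name, _, class_name = qualified_name.rpartition(".")
    # Work on a copy so that this helper never touches the caller's dict.
    namespace = dict(members or {})
    namespace.setdefault("__module__", module_name)
    if bases is None:
        bases = (Exception,)      # like NULL bases: inherit from Exception
    elif not isinstance(bases, tuple):
        bases = (bases,)          # a single type is wrapped in a tuple
    return type(class_name, bases, namespace)


PlainError = new_exception("library.PlainError")
CustomError = new_exception(
    "library.CustomError",
    ValueError,
    {"code": property(lambda self: self.args[1])},
)

print(PlainError.__mro__[1])            # <class 'Exception'>
print(CustomError("oops", 5).code)      # 5
```

In Python the dict can hold ordinary functions and property objects; the hard part in C is producing equivalent objects from C functions.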

When you declare a custom exception with pybind11, it uses the same function and adds some template magic to sweeten the C++ → Python conversion, but it does not expose the dict parameter. I did not succeed in using a custom base with the standard exception class, therefore I decided to use what the BDFL gave us, that is, the C API directly.

I actually found an example of how to achieve my result, but only one. It is a Stack Overflow question which does just what I want, but I honestly do not like the idea of running Python code while initializing my own extension.

Moreover, let’s check what the interpreter thinks of the objects created by the Stack Overflow solution:

>>> library.MyLibraryError.message
<function message at 0x7ff479cacc20>
>>> library.MyLibraryError.code
<property object at 0x7ff479cb0b30>

Here you can see that they are not actual MyLibraryError members: they are ordinary Python objects that have just been attached to it.

By comparison, on a member of a standard exception you would see something like this:

>>> OSError.errno
<member 'errno' of 'OSError' objects>
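The difference can also be checked programmatically. The following sketch recreates the shape of the Stack Overflow result in plain Python (the class body is my own stand-in, not the code from that answer) and prints the type of the raw object stored in each class dictionary.

```python
class MyLibraryError(Exception):
    """Stand-in for an exception whose members were written in Python."""

    def message(self):
        return self.args[0]

    @property
    def code(self):
        return self.args[1]


def kind(cls, name):
    """Return the type name of the raw object stored in the class dict."""
    return type(cls.__dict__[name]).__name__


print(kind(MyLibraryError, "message"))  # function
print(kind(MyLibraryError, "code"))     # property
print(kind(OSError, "errno"))           # member_descriptor
```

The first two are Python-level objects, while the last one is a descriptor created by the interpreter from a C structure, which is what we would like to obtain for our own exception.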

However, that solution gave me a good idea for at least avoiding Python code when declaring the methods.

Idea #1: using an already declared method

The main obstacle is that a C function cannot simply be converted to a PyObject. That conversion does happen, however, when the module is initialized, which comes before the creation of the exception. So it becomes possible to get the function as a PyObject by using PyObject_GetAttrString. But first, let’s create a Python extension to use as a test. Python examples always use spam, or similar Monty Python stuff, whereas I will borrow Davie504’s buzzwords and many of the memes he always refers to 😁️ .

Our starting code (that we will call bass.c) is this:

#include <Python.h>

static PyObject *SMHError;

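static PyObject *bass_omg(PyObject *self, PyObject *args)
{
	PyObject *exc_args = PyTuple_New(2);
	if (!exc_args) {
		return NULL;
	}
	/* PyTuple_SetItem steals the references to the new items */
	PyTuple_SetItem(exc_args, 0, PyUnicode_FromString("unbelivable"));
	PyTuple_SetItem(exc_args, 1, PyLong_FromLong(420));
	PyErr_SetObject(SMHError, exc_args);
	/* PyErr_SetObject does not steal exc_args, so release our reference */
	Py_DECREF(exc_args);
	return NULL;
}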

static PyMethodDef BassMethods[] = {
	{"omg",  bass_omg, METH_VARARGS,
		"Raise an exception that will make you say OMG."},
	{NULL, NULL, 0, NULL}
};

static struct PyModuleDef bassmodule = {
	PyModuleDef_HEAD_INIT,
	"bass", NULL, -1, BassMethods
};

PyMODINIT_FUNC PyInit_bass(void)
{
	PyObject *m;

	m = PyModule_Create(&bassmodule);
	if (!m) {
		return NULL;
	}

	SMHError = PyErr_NewException("bass.SMHError", NULL, NULL);
	Py_XINCREF(SMHError);
	if (PyModule_AddObject(m, "SMHError", SMHError) < 0) {
		Py_XDECREF(SMHError);
		Py_CLEAR(SMHError);
		Py_DECREF(m);
		return NULL;
	}

	return m;
}

I built it directly with GCC, instead of using Python utilities. A similar command would also work on Windows with clang (without the -fPIC option, because of the way Windows DLLs work, with the .pyd extension instead of .so, and with the right paths).

gcc -Wall -fPIC -shared -I/usr/include/python3.7 -lpython3.7m bass.c -o bass.so
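The include path and the file extension depend on the interpreter you build against. As a hedged sketch, the standard sysconfig module can report both, so a command like the one above can be assembled instead of being typed by hand. The flag list is simply copied from the article; the library flag (-lpython3.7m) is left out here and has to be added back on platforms that need it.

```python
import sysconfig

# Directory that contains Python.h for the running interpreter.
include_dir = sysconfig.get_paths()["include"]
# Suffix expected for extension modules; fall back to ".so" if it is not set.
ext_suffix = sysconfig.get_config_var("EXT_SUFFIX") or ".so"

command = [
    "gcc", "-Wall", "-fPIC", "-shared",
    "-I" + include_dir,
    "bass.c",
    "-o", "bass" + ext_suffix,
]
print(" ".join(command))
```

The script only prints the command; running it is left to the reader, since the compiler and its options may differ on your system.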

And in the Python console it will be possible to do something like this:

>>> import bass
>>> bass.omg()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
bass.SMHError: ('unbelivable', 420)
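The tuple in the last line is the default string conversion of an exception with more than one argument. A pure-Python class with no customization behaves in the same way, as this small sketch shows (the class here is a Python stand-in with the same name, not the C type):

```python
class SMHError(Exception):
    """Python stand-in for the exception created by PyErr_NewException."""


print(repr(str(SMHError())))                  # ''
print(str(SMHError("unbelivable")))           # unbelivable
print(str(SMHError("unbelivable", 420)))      # ('unbelivable', 420)
```

With zero arguments the string is empty, with one argument it is the string of that argument, and with more it is the string of the whole args tuple.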

As a first step, we want to obtain only the string in the output. Or, as an even simpler task, we could try to return a fixed custom string, so that we do not have to deal with reference counting yet. Let’s add this function to the code:

static PyObject *bass_str(PyObject *self, PyObject *args)
{
	return PyUnicode_FromString("epic");
}

and make it a module function, by modifying BassMethods in this way:

static PyMethodDef BassMethods[] = {
	{"omg",  bass_omg, METH_VARARGS,
		"Raise an exception that will make you say OMG."},
	{"_smh_str",  bass_str, METH_VARARGS, NULL },
	{NULL, NULL, 0, NULL}
};

Now, let’s also try to use the dict argument of PyErr_NewException in PyInit_bass:

	PyObject *members = PyDict_New();
	PyObject *member_str = PyObject_GetAttrString(m, "_smh_str");
	PyDict_SetItemString(members, "__str__", member_str);
	/* PyObject_GetAttrString returns a new reference, and PyDict_SetItemString
	does not steal it, so decrease the ref count. */
	Py_XDECREF(member_str);
	SMHError = PyErr_NewException("bass.SMHError", NULL, members);
	Py_XDECREF(members);
	Py_XINCREF(SMHError);

In this way, we override the default __str__, but it is still recognized as a function that was not declared as a member, and it can be used directly:

>>> import bass
>>> bass.omg()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
bass.SMHError: epic
>>> bass.SMHError.__str__
<built-in function _smh_str>
>>> bass._smh_str()
'epic'

Moreover, I have no idea how to create a property in this way.
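The reason why _smh_str is shown as a plain built-in function, and why it works as __str__ even though it ignores the instance, can be reproduced in pure Python. Built-in functions do not implement the descriptor protocol, so they are never bound to the instance. The sketch below uses os.getcwd only as a convenient built-in that takes no arguments; it has nothing to do with exceptions.

```python
import os


def python_function(self):
    return "bound to " + type(self).__name__


class Demo(Exception):
    py_str = python_function   # a Python function is a descriptor and gets bound
    c_str = os.getcwd          # a built-in function is returned as it is
    __str__ = os.getcwd        # same trick as _smh_str in the dict above


exc = Demo()
print(hasattr(python_function, "__get__"))  # True
print(hasattr(os.getcwd, "__get__"))        # False
print(exc.py_str())                         # bound to Demo
print(exc.c_str() == os.getcwd())           # True
print(str(exc) == os.getcwd())              # True
```

This also explains why the approach is a dead end for anything that needs the instance: the function never receives it.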

Idea #2: let’s analyze PyErr_NewException

But Open Source Software is fantastic: when you do not understand something, you can go and have a look at its code.

PyErr_NewException is declared in Python/errors.c: it does some processing of the arguments and then calls PyType_Type.

According to the official documentation, that is a callable that can be used to create a new type. The documentation does not say anything useful about how to pass C functions in dict, but it does give one very useful piece of information: the PyObject * returned by PyErr_NewException is not something special related to exceptions, it is a standard PyTypeObject!

So, we could try to cast it to a pointer of that type, instead of using the dictionary. But first, let’s change bass_str a bit:

static PyObject *SMHError_tp_str(PyObject *self)
{
	return PyUnicode_FromString("epic");
}

Then, remove it from BassMethods and modify also PyInit_bass:

	PyTypeObject *smherr_type = (PyTypeObject *)SMHError;
	smherr_type->tp_str = SMHError_tp_str;

If we try again with the interpreter we obtain a nice result:

>>> import bass
>>> bass.omg()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
bass.SMHError: epic
>>> bass.SMHError.__str__
<slot wrapper '__str__' of 'BaseException' objects>
>>> bass.SMHError.__str__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: descriptor '__str__' of 'BaseException' object needs an argument
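The same kind of object can be observed on any exception class that does not define __str__ itself, without compiling anything. A small sketch, using a Python stand-in class and types.WrapperDescriptorType (available since Python 3.7):

```python
import types


class SMHError(Exception):
    """Python stand-in: __str__ is inherited from BaseException."""


wrapper = SMHError.__str__
print(isinstance(wrapper, types.WrapperDescriptorType))  # True
print(wrapper(SMHError("epic")))                         # epic

try:
    wrapper()  # no instance supplied, like bass.SMHError.__str__() above
except TypeError as error:
    print("TypeError:", error)
```

A slot wrapper is the interpreter's way of exposing a C slot such as tp_str as a Python attribute, and it always needs the instance as its first argument.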

So, since it is working, we can change the SMHError_tp_str to return the original string, instead.

static PyObject *SMHError_tp_str(PyObject *self)
{
	if (!PyObject_HasAttrString(self, "args")) {
		// This should never happen
		return PyUnicode_FromString("");
	}

	PyObject *ret = NULL;
	// New reference
	PyObject *args = PyObject_GetAttrString(self, "args");
	if (PyTuple_Size(args) < 1) {
		ret = PyObject_Repr(args);
	} else {
		// Already checked, so we could use PyTuple_GET_ITEM
		PyObject *str = PyTuple_GetItem(args, 0);
		ret = PyObject_Str(str);
		/* str is a borrowed reference (no need to decref), whereas ret is a new
		reference that we transfer to the caller */
	}
	Py_XDECREF(args);
	return ret;
}

It’s becoming long, but now we can see some progress:

>>> import bass
>>> bass.omg()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
bass.SMHError: unbelivable
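For comparison, this is how I would write the same logic in Python. It is only a reading aid for the C function above, with a Python stand-in class; the name smh_str is hypothetical.

```python
def smh_str(exc):
    """Python rendition of the logic in SMHError_tp_str."""
    args = getattr(exc, "args", None)
    if args is None:
        return ""             # should never happen for a real exception
    if len(args) < 1:
        return repr(args)     # empty tuple: fall back to its repr
    return str(args[0])       # otherwise only the first argument


class SMHError(Exception):
    __str__ = smh_str


print(str(SMHError("unbelivable", 420)))  # unbelivable
print(str(SMHError()))                    # ()
```

All the reference counting comments in the C version have no counterpart here, which gives an idea of how much of the C code is bookkeeping.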

Next step: the getter

Since things are going well, let’s try to see if we are lucky also with the code getter. First, create the function and put it in an array with the getters/setters:

static PyObject *SMHError_getcode(PyObject *self, void *closure)
{
	// Like before, but with less checks, it is okay for this function to fail
	PyObject *args = PyObject_GetAttrString(self, "args");
	if (!args) {
		return NULL;
	}

	PyObject *ret = PyTuple_GetItem(args, 1);
	Py_XINCREF(ret);
	Py_XDECREF(args);
	return ret;
}

static PyGetSetDef SMHError_getsetters[] = {
	{"code", SMHError_getcode, NULL, NULL, NULL},
	{NULL}
};

Then, as usual, pass the newly created array in PyInit_bass:

smherr_type->tp_getset = SMHError_getsetters;

And test it:

>>> import bass
>>> ex = bass.SMHError('test', 1)
>>> ex.code
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'SMHError' object has no attribute 'code'

This time we were not so lucky. Looking through the documentation, we find that PyType_Ready copies all the relevant information to the tp_dict field of the type, which I think is the dict we have not yet understood how to populate.

We are not calling that function, but even if we did, nothing would change: it has already been called by PyType_Type, and calling it again does not do anything. We can, however, go and see what it does in the source (Objects/typeobject.c). This teaches us (or at least me 😄️) about the existence of descriptors, the objects that turn C functions into items of that dictionary!

However, descriptors need the type object in order to be created, so we cannot build them before calling PyErr_NewException, whose dict parameter therefore remains useless for our purposes.
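The idea of adding a descriptor to a type that already exists can be tried out in pure Python first. In this sketch ArgsItem is a hypothetical hand-written descriptor and SMHError is a Python stand-in. Unlike the C descriptors, a Python descriptor does not need to know its owner type when it is created.

```python
class ArgsItem:
    """Hypothetical read-only descriptor returning one item of exc.args."""

    def __init__(self, index):
        self.index = index

    def __get__(self, instance, owner=None):
        if instance is None:
            return self           # accessed on the class itself
        return instance.args[self.index]

    def __set__(self, instance, value):
        raise AttributeError("read-only attribute")


class SMHError(Exception):
    pass


# The type already exists at this point, and we add to its dictionary.
SMHError.code = ArgsItem(1)

print(SMHError("test", 1).code)        # 1
print("code" in SMHError.__dict__)     # True
```

Attribute lookup finds the descriptor in the type's dictionary and calls its __get__ with the instance, which is the job that a getter written in C would do through a descriptor object.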

Long story short, replace the assignment of tp_getset with the following code (freely inspired by the private function add_getset of the Python interpreter itself, which also performs some additional checks):

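	PyObject *descr = PyDescr_NewGetSet(smherr_type, SMHError_getsetters);
	if (!descr) {
		Py_DECREF(m);
		return NULL;
	}
	if (PyDict_SetItem(smherr_type->tp_dict, PyDescr_NAME(descr), descr) < 0) {
		Py_DECREF(m);
		Py_DECREF(descr);
		return NULL;
	}
	Py_DECREF(descr);
	/* tp_dict was modified by hand, so invalidate the type's lookup cache */
	PyType_Modified(smherr_type);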

Et voilà!

>>> import bass
>>> try:
...     bass.omg()
... except bass.SMHError as e:
...     ex = e
...
>>> ex
SMHError('unbelivable', 420)
>>> ex.code
420
>>> str(ex)
'unbelivable'
>>> bass.SMHError.code
<attribute 'code' of 'SMHError' objects>

That was hard, but finally we have what we wanted!

Of course, we could have also customized tp_repr, but the default one is okay for me.

With the descriptor functions we could also create some methods, but __str__ is special and has its own slot in the PyTypeObject, so it’s better to keep it like that.

So, what about OSErrors and the others?

I did not think of looking for their declaration until I had managed to create my custom exception, but at this point it is interesting to compare with what Python does for its standard exceptions.

The file to look in is Objects/exceptions.c: all the basic exceptions are declared here, with some macros, in particular ComplexExtendsException for OSError and similar ones.

This macro populates a PyTypeObject, then PyType_Ready is called for each of them in the _PyExc_Init function, one of the few non-static functions, which is in turn called during the interpreter initialization by pycore_init_types. Once again, this is evidence that Python keeps things simple: an exception is just a normal class that inherits from Exception or, to be more precise, from BaseException, which can be verified with a few lines of Python:

>>> class A:
...     pass
...
>>> raise A
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: exceptions must derive from BaseException
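The same facts can be checked from the other direction, by inspecting the built-in exceptions themselves. A short sketch:

```python
# The full inheritance chain of a built-in exception.
print(OSError.__mro__)
# (<class 'OSError'>, <class 'Exception'>, <class 'BaseException'>, <class 'object'>)

# Some exceptions skip Exception and derive from BaseException only.
print(issubclass(KeyboardInterrupt, Exception))      # False
print(issubclass(KeyboardInterrupt, BaseException))  # True

# An exception class is an ordinary type object.
print(type(OSError) is type)                         # True
```

KeyboardInterrupt is the usual example of why the check is made against BaseException and not against Exception.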

Back to C++

Getting this far was a long journey and took me some hours (actually I got things sorted out while I was writing the article), but I am not satisfied yet! I still need to raise these custom exceptions from C++, so I think I can obtain a hybrid solution with pybind11!

So far, we have seen that there are two possibilities: creating a class as a child of Exception, or creating the exception with the C API and then using pybind11 only to convert C++ exceptions.

I would prefer the first method, because it involves basically no Python C API and has fully automated reference counting with RAII. So, let’s try that, without the pybind11 exception conversion code to start with:

#include <exception>
#include <string>

#include <pybind11/pybind11.h>
namespace py = pybind11;

class InstrumentError : public std::exception {
	std::string m_message;
	int m_error_code;

public:
	InstrumentError(const char *message, int code)
		: m_message(message), m_error_code(code)
	{
	}

	const char *what() const noexcept
	{
		return m_message.c_str();
	}

	int get_code() const
	{
		return m_error_code;
	}

	py::str __str__() const
	{
		return py::str(m_message);
	}
};

void pick() {
	throw InstrumentError("Highly illegal", 666);
}

PYBIND11_MODULE(guitar, m) {
	m.def("pick", &pick);

	/* Notice the dynamic_attr: it is needed because native Python classes support
	adding custom attributes to an internal dictionary.
	Exceptions also allow doing that, and it seems it is compulsory for derived
	classes to allow it. See:
	https://pybind11.readthedocs.io/en/stable/classes.html#dynamic-attributes */
	py::class_<InstrumentError>(
			m, "InstrumentError", py::handle(PyExc_Exception), py::dynamic_attr())
		.def("__str__", &InstrumentError::__str__)
		.def_property_readonly("code", &InstrumentError::get_code);
}

So, if we run this:

>>> import guitar
Segmentation fault

Ouch! GDB (or any other debugger) can tell us that pybind11 is trying to get some internal data about PyExc_Exception, but it fails because it holds such data only for pybind11-registered types. I think that working around this “bug” or “unsupported feature” (I do not know what I should call it) might just become a mess, so the only method left is the C API, but this time we can simplify the callbacks a bit with the pybind11 wrappers around Python types.

I will not repeat the code of the extension and of the pick function, as they do not need to be changed (actually you can remove the __str__ method, it will not be used).

static PyObject *InstrumentError_tp_str(PyObject *selfPtr)
{
	py::str ret;
	try {
		py::handle self(selfPtr);
		py::tuple args = self.attr("args");
		ret = py::str(args[0]);
	} catch (py::error_already_set &e) {
		ret = "";
	}

	/* ret will go out of scope when returning, therefore increase its reference
	count, and transfer it to the caller (like PyObject_Str). */
	ret.inc_ref();
	return ret.ptr();
}

static PyObject *InstrumentError_getcode(PyObject *selfPtr, void *closure)
{
	try {
		py::handle self(selfPtr);
		py::tuple args = self.attr("args");
		py::object code = args[1];
		code.inc_ref();
		return code.ptr();
	} catch (py::error_already_set &e) {
		/* We could simply backpropagate the exception with e.restore, but
		exceptions like OSError return None when an attribute is not set. */
		py::none ret;
		ret.inc_ref();
		return ret.ptr();
	}
}

static PyGetSetDef InstrumentError_getsetters[] = {
	{"code", InstrumentError_getcode, NULL, NULL, NULL},
	{NULL}
};

/* Python conventions on code style would call this only InstrumentError, but
that name is now used by the class. */
static PyObject *PyInstrumentError;

PYBIND11_MODULE(guitar, m) {
	m.def("pick", &pick);

	PyInstrumentError = PyErr_NewException("guitar.InstrumentError", NULL, NULL);
	if (PyInstrumentError) {
		PyTypeObject *as_type = reinterpret_cast<PyTypeObject *>(PyInstrumentError);
		as_type->tp_str = InstrumentError_tp_str;
		PyObject *descr = PyDescr_NewGetSet(as_type, InstrumentError_getsetters);
		auto dict = py::reinterpret_borrow<py::dict>(as_type->tp_dict);
		dict[py::handle(PyDescr_NAME(descr))] = py::handle(descr);

		Py_XINCREF(PyInstrumentError);
		m.add_object("InstrumentError", py::handle(PyInstrumentError));
	}

	py::register_exception_translator([](std::exception_ptr p) {
		try {
			if (p) {
				std::rethrow_exception(p);
			}
		} catch (InstrumentError &e) {
			py::tuple args(2);
			args[0] = e.what();
			args[1] = e.get_code();
			PyErr_SetObject(PyInstrumentError, args.ptr());
		}
	});
}

So now, compile and once again try to see what we obtain:

>>> import guitar
>>> try:
...     guitar.pick()
... except guitar.InstrumentError as e:
...     ex = e
...
>>> ex
InstrumentError('Highly illegal', 666)
>>> str(ex)
'Highly illegal'
>>> ex.code
666

Finally, we reached our objective!

Conclusions

Binding custom exceptions is possible, but not very widespread. On the Internet I found only one example, on Stack Overflow, and it was too complicated.

But «Simple is better than complex.» and «Complex is better than complicated.»: every tutorial about custom exceptions in Python tells you to inherit from Exception, so just do the same with the C API.

And if you are using pybind11, you need to do that with the C API, but at least you can use pybind11 wrappers around Python objects.