python-3.6.zip added from Github

README.cosmo contains the necessary links.
This commit is contained in:
ahgamut 2021-08-08 09:38:33 +05:30 committed by Justine Tunney
parent 75fc601ff5
commit 0c4c56ff39
4219 changed files with 1968626 additions and 0 deletions

View file

@ -0,0 +1,167 @@
.. highlightlang:: c
.. _building:
*****************************
Building C and C++ Extensions
*****************************
A C extension for CPython is a shared library (e.g. a ``.so`` file on Linux,
``.pyd`` on Windows), which exports an *initialization function*.
To be importable, the shared library must be available on :envvar:`PYTHONPATH`,
and must be named after the module name, with an appropriate extension.
When using distutils, the correct filename is generated automatically.
The initialization function has the signature:
.. c:function:: PyObject* PyInit_modulename(void)
It returns either a fully-initialized module, or a :c:type:`PyModuleDef`
instance. See :ref:`initializing-modules` for details.
.. highlightlang:: python
For modules with ASCII-only names, the function must be named
``PyInit_<modulename>``, with ``<modulename>`` replaced by the name of the
module. When using :ref:`multi-phase-initialization`, non-ASCII module names
are allowed. In this case, the initialization function name is
``PyInitU_<modulename>``, with ``<modulename>`` encoded using Python's
*punycode* encoding with hyphens replaced by underscores. In Python::
def initfunc_name(name):
try:
suffix = b'_' + name.encode('ascii')
except UnicodeEncodeError:
suffix = b'U_' + name.encode('punycode').replace(b'-', b'_')
return b'PyInit' + suffix
It is possible to export multiple modules from a single shared library by
defining multiple initialization functions. However, importing them requires
using symbolic links or a custom importer, because by default only the
function corresponding to the filename is found.
See the *"Multiple modules in one library"* section in :pep:`489` for details.
.. highlightlang:: c
Building C and C++ Extensions with distutils
============================================
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>
Extension modules can be built using distutils, which is included in Python.
Since distutils also supports creation of binary packages, users don't
necessarily need a compiler and distutils to install the extension.
A distutils package contains a driver script, :file:`setup.py`. This is a plain
Python file, which, in the most simple case, could look like this:
.. code-block:: python3
from distutils.core import setup, Extension
module1 = Extension('demo',
sources = ['demo.c'])
setup (name = 'PackageName',
version = '1.0',
description = 'This is a demo package',
ext_modules = [module1])
With this :file:`setup.py`, and a file :file:`demo.c`, running ::
python setup.py build
will compile :file:`demo.c`, and produce an extension module named ``demo`` in
the :file:`build` directory. Depending on the system, the module file will end
up in a subdirectory :file:`build/lib.system`, and may have a name like
:file:`demo.so` or :file:`demo.pyd`.
In the :file:`setup.py`, all execution is performed by calling the ``setup``
function. This takes a variable number of keyword arguments, of which the
example above uses only a subset. Specifically, the example specifies
meta-information to build packages, and it specifies the contents of the
package. Normally, a package will contain additional modules, like Python
source modules, documentation, subpackages, etc. Please refer to the distutils
documentation in :ref:`distutils-index` to learn more about the features of
distutils; this section explains building extension modules only.
It is common to pre-compute arguments to :func:`setup`, to better structure the
driver script. In the example above, the ``ext_modules`` argument to
:func:`~distutils.core.setup` is a list of extension modules, each of which is
an instance of
the :class:`~distutils.extension.Extension`. In the example, the instance
defines an extension named ``demo`` which is build by compiling a single source
file, :file:`demo.c`.
In many cases, building an extension is more complex, since additional
preprocessor defines and libraries may be needed. This is demonstrated in the
example below.
.. code-block:: python3
from distutils.core import setup, Extension
module1 = Extension('demo',
define_macros = [('MAJOR_VERSION', '1'),
('MINOR_VERSION', '0')],
include_dirs = ['/usr/local/include'],
libraries = ['tcl83'],
library_dirs = ['/usr/local/lib'],
sources = ['demo.c'])
setup (name = 'PackageName',
version = '1.0',
description = 'This is a demo package',
author = 'Martin v. Loewis',
author_email = 'martin@v.loewis.de',
url = 'https://docs.python.org/extending/building',
long_description = '''
This is really just a demo package.
''',
ext_modules = [module1])
In this example, :func:`~distutils.core.setup` is called with additional
meta-information, which
is recommended when distribution packages have to be built. For the extension
itself, it specifies preprocessor defines, include directories, library
directories, and libraries. Depending on the compiler, distutils passes this
information in different ways to the compiler. For example, on Unix, this may
result in the compilation commands ::
gcc -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC -DMAJOR_VERSION=1 -DMINOR_VERSION=0 -I/usr/local/include -I/usr/local/include/python2.2 -c demo.c -o build/temp.linux-i686-2.2/demo.o
gcc -shared build/temp.linux-i686-2.2/demo.o -L/usr/local/lib -ltcl83 -o build/lib.linux-i686-2.2/demo.so
These lines are for demonstration purposes only; distutils users should trust
that distutils gets the invocations right.
.. _distributing:
Distributing your extension modules
===================================
When an extension has been successfully build, there are three ways to use it.
End-users will typically want to install the module, they do so by running ::
python setup.py install
Module maintainers should produce source packages; to do so, they run ::
python setup.py sdist
In some cases, additional files need to be included in a source distribution;
this is done through a :file:`MANIFEST.in` file; see :ref:`manifest` for details.
If the source distribution has been build successfully, maintainers can also
create binary distributions. Depending on the platform, one of the following
commands can be used to do so. ::
python setup.py bdist_wininst
python setup.py bdist_rpm
python setup.py bdist_dumb

View file

@ -0,0 +1,335 @@
.. highlightlang:: c
.. _embedding:
***************************************
Embedding Python in Another Application
***************************************
The previous chapters discussed how to extend Python, that is, how to extend the
functionality of Python by attaching a library of C functions to it. It is also
possible to do it the other way around: enrich your C/C++ application by
embedding Python in it. Embedding provides your application with the ability to
implement some of the functionality of your application in Python rather than C
or C++. This can be used for many purposes; one example would be to allow users
to tailor the application to their needs by writing some scripts in Python. You
can also use it yourself if some of the functionality can be written in Python
more easily.
Embedding Python is similar to extending it, but not quite. The difference is
that when you extend Python, the main program of the application is still the
Python interpreter, while if you embed Python, the main program may have nothing
to do with Python --- instead, some parts of the application occasionally call
the Python interpreter to run some Python code.
So if you are embedding Python, you are providing your own main program. One of
the things this main program has to do is initialize the Python interpreter. At
the very least, you have to call the function :c:func:`Py_Initialize`. There are
optional calls to pass command line arguments to Python. Then later you can
call the interpreter from any part of the application.
There are several different ways to call the interpreter: you can pass a string
containing Python statements to :c:func:`PyRun_SimpleString`, or you can pass a
stdio file pointer and a file name (for identification in error messages only)
to :c:func:`PyRun_SimpleFile`. You can also call the lower-level operations
described in the previous chapters to construct and use Python objects.
.. seealso::
:ref:`c-api-index`
The details of Python's C interface are given in this manual. A great deal of
necessary information can be found here.
.. _high-level-embedding:
Very High Level Embedding
=========================
The simplest form of embedding Python is the use of the very high level
interface. This interface is intended to execute a Python script without needing
to interact with the application directly. This can for example be used to
perform some operation on a file. ::
#include <Python.h>
int
main(int argc, char *argv[])
{
wchar_t *program = Py_DecodeLocale(argv[0], NULL);
if (program == NULL) {
fprintf(stderr, "Fatal error: cannot decode argv[0]\n");
exit(1);
}
Py_SetProgramName(program); /* optional but recommended */
Py_Initialize();
PyRun_SimpleString("from time import time,ctime\n"
"print('Today is', ctime(time()))\n");
if (Py_FinalizeEx() < 0) {
exit(120);
}
PyMem_RawFree(program);
return 0;
}
The :c:func:`Py_SetProgramName` function should be called before
:c:func:`Py_Initialize` to inform the interpreter about paths to Python run-time
libraries. Next, the Python interpreter is initialized with
:c:func:`Py_Initialize`, followed by the execution of a hard-coded Python script
that prints the date and time. Afterwards, the :c:func:`Py_FinalizeEx` call shuts
the interpreter down, followed by the end of the program. In a real program,
you may want to get the Python script from another source, perhaps a text-editor
routine, a file, or a database. Getting the Python code from a file can better
be done by using the :c:func:`PyRun_SimpleFile` function, which saves you the
trouble of allocating memory space and loading the file contents.
.. _lower-level-embedding:
Beyond Very High Level Embedding: An overview
=============================================
The high level interface gives you the ability to execute arbitrary pieces of
Python code from your application, but exchanging data values is quite
cumbersome to say the least. If you want that, you should use lower level calls.
At the cost of having to write more C code, you can achieve almost anything.
It should be noted that extending Python and embedding Python is quite the same
activity, despite the different intent. Most topics discussed in the previous
chapters are still valid. To show this, consider what the extension code from
Python to C really does:
#. Convert data values from Python to C,
#. Perform a function call to a C routine using the converted values, and
#. Convert the data values from the call from C to Python.
When embedding Python, the interface code does:
#. Convert data values from C to Python,
#. Perform a function call to a Python interface routine using the converted
values, and
#. Convert the data values from the call from Python to C.
As you can see, the data conversion steps are simply swapped to accommodate the
different direction of the cross-language transfer. The only difference is the
routine that you call between both data conversions. When extending, you call a
C routine, when embedding, you call a Python routine.
This chapter will not discuss how to convert data from Python to C and vice
versa. Also, proper use of references and dealing with errors is assumed to be
understood. Since these aspects do not differ from extending the interpreter,
you can refer to earlier chapters for the required information.
.. _pure-embedding:
Pure Embedding
==============
The first program aims to execute a function in a Python script. Like in the
section about the very high level interface, the Python interpreter does not
directly interact with the application (but that will change in the next
section).
The code to run a function defined in a Python script is:
.. literalinclude:: ../includes/run-func.c
This code loads a Python script using ``argv[1]``, and calls the function named
in ``argv[2]``. Its integer arguments are the other values of the ``argv``
array. If you :ref:`compile and link <compiling>` this program (let's call
the finished executable :program:`call`), and use it to execute a Python
script, such as:
.. code-block:: python
def multiply(a,b):
print("Will compute", a, "times", b)
c = 0
for i in range(0, a):
c = c + b
return c
then the result should be:
.. code-block:: shell-session
$ call multiply multiply 3 2
Will compute 3 times 2
Result of call: 6
Although the program is quite large for its functionality, most of the code is
for data conversion between Python and C, and for error reporting. The
interesting part with respect to embedding Python starts with ::
Py_Initialize();
pName = PyUnicode_DecodeFSDefault(argv[1]);
/* Error checking of pName left out */
pModule = PyImport_Import(pName);
After initializing the interpreter, the script is loaded using
:c:func:`PyImport_Import`. This routine needs a Python string as its argument,
which is constructed using the :c:func:`PyUnicode_FromString` data conversion
routine. ::
pFunc = PyObject_GetAttrString(pModule, argv[2]);
/* pFunc is a new reference */
if (pFunc && PyCallable_Check(pFunc)) {
...
}
Py_XDECREF(pFunc);
Once the script is loaded, the name we're looking for is retrieved using
:c:func:`PyObject_GetAttrString`. If the name exists, and the object returned is
callable, you can safely assume that it is a function. The program then
proceeds by constructing a tuple of arguments as normal. The call to the Python
function is then made with::
pValue = PyObject_CallObject(pFunc, pArgs);
Upon return of the function, ``pValue`` is either *NULL* or it contains a
reference to the return value of the function. Be sure to release the reference
after examining the value.
.. _extending-with-embedding:
Extending Embedded Python
=========================
Until now, the embedded Python interpreter had no access to functionality from
the application itself. The Python API allows this by extending the embedded
interpreter. That is, the embedded interpreter gets extended with routines
provided by the application. While it sounds complex, it is not so bad. Simply
forget for a while that the application starts the Python interpreter. Instead,
consider the application to be a set of subroutines, and write some glue code
that gives Python access to those routines, just like you would write a normal
Python extension. For example::
static int numargs=0;
/* Return the number of arguments of the application command line */
static PyObject*
emb_numargs(PyObject *self, PyObject *args)
{
if(!PyArg_ParseTuple(args, ":numargs"))
return NULL;
return PyLong_FromLong(numargs);
}
static PyMethodDef EmbMethods[] = {
{"numargs", emb_numargs, METH_VARARGS,
"Return the number of arguments received by the process."},
{NULL, NULL, 0, NULL}
};
static PyModuleDef EmbModule = {
PyModuleDef_HEAD_INIT, "emb", NULL, -1, EmbMethods,
NULL, NULL, NULL, NULL
};
static PyObject*
PyInit_emb(void)
{
return PyModule_Create(&EmbModule);
}
Insert the above code just above the :c:func:`main` function. Also, insert the
following two statements before the call to :c:func:`Py_Initialize`::
numargs = argc;
PyImport_AppendInittab("emb", &PyInit_emb);
These two lines initialize the ``numargs`` variable, and make the
:func:`emb.numargs` function accessible to the embedded Python interpreter.
With these extensions, the Python script can do things like
.. code-block:: python
import emb
print("Number of arguments", emb.numargs())
In a real application, the methods will expose an API of the application to
Python.
.. TODO: threads, code examples do not really behave well if errors happen
(what to watch out for)
.. _embeddingincplusplus:
Embedding Python in C++
=======================
It is also possible to embed Python in a C++ program; precisely how this is done
will depend on the details of the C++ system used; in general you will need to
write the main program in C++, and use the C++ compiler to compile and link your
program. There is no need to recompile Python itself using C++.
.. _compiling:
Compiling and Linking under Unix-like systems
=============================================
It is not necessarily trivial to find the right flags to pass to your
compiler (and linker) in order to embed the Python interpreter into your
application, particularly because Python needs to load library modules
implemented as C dynamic extensions (:file:`.so` files) linked against
it.
To find out the required compiler and linker flags, you can execute the
:file:`python{X.Y}-config` script which is generated as part of the
installation process (a :file:`python3-config` script may also be
available). This script has several options, of which the following will
be directly useful to you:
* ``pythonX.Y-config --cflags`` will give you the recommended flags when
compiling:
.. code-block:: shell-session
$ /opt/bin/python3.4-config --cflags
-I/opt/include/python3.4m -I/opt/include/python3.4m -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
* ``pythonX.Y-config --ldflags`` will give you the recommended flags when
linking:
.. code-block:: shell-session
$ /opt/bin/python3.4-config --ldflags
-L/opt/lib/python3.4/config-3.4m -lpthread -ldl -lutil -lm -lpython3.4m -Xlinker -export-dynamic
.. note::
To avoid confusion between several Python installations (and especially
between the system Python and your own compiled Python), it is recommended
that you use the absolute path to :file:`python{X.Y}-config`, as in the above
example.
If this procedure doesn't work for you (it is not guaranteed to work for
all Unix-like platforms; however, we welcome :ref:`bug reports <reporting-bugs>`)
you will have to read your system's documentation about dynamic linking and/or
examine Python's :file:`Makefile` (use :func:`sysconfig.get_makefile_filename`
to find its location) and compilation
options. In this case, the :mod:`sysconfig` module is a useful tool to
programmatically extract the configuration values that you will want to
combine together. For example:
.. code-block:: pycon
>>> import sysconfig
>>> sysconfig.get_config_var('LIBS')
'-lpthread -ldl -lutil'
>>> sysconfig.get_config_var('LINKFORSHARED')
'-Xlinker -export-dynamic'
.. XXX similar documentation for Windows missing

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,74 @@
.. _extending-index:
##################################################
Extending and Embedding the Python Interpreter
##################################################
This document describes how to write modules in C or C++ to extend the Python
interpreter with new modules. Those modules can not only define new functions
but also new object types and their methods. The document also describes how
to embed the Python interpreter in another application, for use as an extension
language. Finally, it shows how to compile and link extension modules so that
they can be loaded dynamically (at run time) into the interpreter, if the
underlying operating system supports this feature.
This document assumes basic knowledge about Python. For an informal
introduction to the language, see :ref:`tutorial-index`. :ref:`reference-index`
gives a more formal definition of the language. :ref:`library-index` documents
the existing object types, functions and modules (both built-in and written in
Python) that give the language its wide application range.
For a detailed description of the whole Python/C API, see the separate
:ref:`c-api-index`.
Recommended third party tools
=============================
This guide only covers the basic tools for creating extensions provided
as part of this version of CPython. Third party tools like
`Cython <http://cython.org/>`_, `cffi <https://cffi.readthedocs.io>`_,
`SWIG <http://www.swig.org>`_ and `Numba <https://numba.pydata.org/>`_
offer both simpler and more sophisticated approaches to creating C and C++
extensions for Python.
.. seealso::
`Python Packaging User Guide: Binary Extensions <https://packaging.python.org/en/latest/extensions/>`_
The Python Packaging User Guide not only covers several available
tools that simplify the creation of binary extensions, but also
discusses the various reasons why creating an extension module may be
desirable in the first place.
Creating extensions without third party tools
=============================================
This section of the guide covers creating C and C++ extensions without
assistance from third party tools. It is intended primarily for creators
of those tools, rather than being a recommended way to create your own
C extensions.
.. toctree::
:maxdepth: 2
:numbered:
extending.rst
newtypes_tutorial.rst
newtypes.rst
building.rst
windows.rst
Embedding the CPython runtime in a larger application
=====================================================
Sometimes, rather than creating an extension that runs inside the Python
interpreter as the main application, it is desirable to instead embed
the CPython runtime inside a larger application. This section covers
some of the details involved in doing that successfully.
.. toctree::
:maxdepth: 2
:numbered:
embedding.rst

View file

@ -0,0 +1,619 @@
.. highlightlang:: c
*****************************************
Defining Extension Types: Assorted Topics
*****************************************
.. _dnt-type-methods:
This section aims to give a quick fly-by on the various type methods you can
implement and what they do.
Here is the definition of :c:type:`PyTypeObject`, with some fields only used in
debug builds omitted:
.. literalinclude:: ../includes/typestruct.h
Now that's a *lot* of methods. Don't worry too much though -- if you have
a type you want to define, the chances are very good that you will only
implement a handful of these.
As you probably expect by now, we're going to go over this and give more
information about the various handlers. We won't go in the order they are
defined in the structure, because there is a lot of historical baggage that
impacts the ordering of the fields. It's often easiest to find an example
that includes the fields you need and then change the values to suit your new
type. ::
const char *tp_name; /* For printing */
The name of the type -- as mentioned in the previous chapter, this will appear in
various places, almost entirely for diagnostic purposes. Try to choose something
that will be helpful in such a situation! ::
Py_ssize_t tp_basicsize, tp_itemsize; /* For allocation */
These fields tell the runtime how much memory to allocate when new objects of
this type are created. Python has some built-in support for variable length
structures (think: strings, tuples) which is where the :c:member:`~PyTypeObject.tp_itemsize` field
comes in. This will be dealt with later. ::
const char *tp_doc;
Here you can put a string (or its address) that you want returned when the
Python script references ``obj.__doc__`` to retrieve the doc string.
Now we come to the basic type methods -- the ones most extension types will
implement.
Finalization and De-allocation
------------------------------
.. index::
single: object; deallocation
single: deallocation, object
single: object; finalization
single: finalization, of objects
::
destructor tp_dealloc;
This function is called when the reference count of the instance of your type is
reduced to zero and the Python interpreter wants to reclaim it. If your type
has memory to free or other clean-up to perform, you can put it here. The
object itself needs to be freed here as well. Here is an example of this
function::
static void
newdatatype_dealloc(newdatatypeobject *obj)
{
free(obj->obj_UnderlyingDatatypePtr);
Py_TYPE(obj)->tp_free(obj);
}
.. index::
single: PyErr_Fetch()
single: PyErr_Restore()
One important requirement of the deallocator function is that it leaves any
pending exceptions alone. This is important since deallocators are frequently
called as the interpreter unwinds the Python stack; when the stack is unwound
due to an exception (rather than normal returns), nothing is done to protect the
deallocators from seeing that an exception has already been set. Any actions
which a deallocator performs which may cause additional Python code to be
executed may detect that an exception has been set. This can lead to misleading
errors from the interpreter. The proper way to protect against this is to save
a pending exception before performing the unsafe action, and restoring it when
done. This can be done using the :c:func:`PyErr_Fetch` and
:c:func:`PyErr_Restore` functions::
static void
my_dealloc(PyObject *obj)
{
MyObject *self = (MyObject *) obj;
PyObject *cbresult;
if (self->my_callback != NULL) {
PyObject *err_type, *err_value, *err_traceback;
/* This saves the current exception state */
PyErr_Fetch(&err_type, &err_value, &err_traceback);
cbresult = PyObject_CallObject(self->my_callback, NULL);
if (cbresult == NULL)
PyErr_WriteUnraisable(self->my_callback);
else
Py_DECREF(cbresult);
/* This restores the saved exception state */
PyErr_Restore(err_type, err_value, err_traceback);
Py_DECREF(self->my_callback);
}
Py_TYPE(obj)->tp_free((PyObject*)self);
}
.. note::
There are limitations to what you can safely do in a deallocator function.
First, if your type supports garbage collection (using :c:member:`~PyTypeObject.tp_traverse`
and/or :c:member:`~PyTypeObject.tp_clear`), some of the object's members can have been
cleared or finalized by the time :c:member:`~PyTypeObject.tp_dealloc` is called. Second, in
:c:member:`~PyTypeObject.tp_dealloc`, your object is in an unstable state: its reference
count is equal to zero. Any call to a non-trivial object or API (as in the
example above) might end up calling :c:member:`~PyTypeObject.tp_dealloc` again, causing a
double free and a crash.
Starting with Python 3.4, it is recommended not to put any complex
finalization code in :c:member:`~PyTypeObject.tp_dealloc`, and instead use the new
:c:member:`~PyTypeObject.tp_finalize` type method.
.. seealso::
:pep:`442` explains the new finalization scheme.
.. index::
single: string; object representation
builtin: repr
Object Presentation
-------------------
In Python, there are two ways to generate a textual representation of an object:
the :func:`repr` function, and the :func:`str` function. (The :func:`print`
function just calls :func:`str`.) These handlers are both optional.
::
reprfunc tp_repr;
reprfunc tp_str;
The :c:member:`~PyTypeObject.tp_repr` handler should return a string object containing a
representation of the instance for which it is called. Here is a simple
example::
static PyObject *
newdatatype_repr(newdatatypeobject * obj)
{
return PyUnicode_FromFormat("Repr-ified_newdatatype{{size:%d}}",
obj->obj_UnderlyingDatatypePtr->size);
}
If no :c:member:`~PyTypeObject.tp_repr` handler is specified, the interpreter will supply a
representation that uses the type's :c:member:`~PyTypeObject.tp_name` and a uniquely-identifying
value for the object.
The :c:member:`~PyTypeObject.tp_str` handler is to :func:`str` what the :c:member:`~PyTypeObject.tp_repr` handler
described above is to :func:`repr`; that is, it is called when Python code calls
:func:`str` on an instance of your object. Its implementation is very similar
to the :c:member:`~PyTypeObject.tp_repr` function, but the resulting string is intended for human
consumption. If :c:member:`~PyTypeObject.tp_str` is not specified, the :c:member:`~PyTypeObject.tp_repr` handler is
used instead.
Here is a simple example::
static PyObject *
newdatatype_str(newdatatypeobject * obj)
{
return PyUnicode_FromFormat("Stringified_newdatatype{{size:%d}}",
obj->obj_UnderlyingDatatypePtr->size);
}
Attribute Management
--------------------
For every object which can support attributes, the corresponding type must
provide the functions that control how the attributes are resolved. There needs
to be a function which can retrieve attributes (if any are defined), and another
to set attributes (if setting attributes is allowed). Removing an attribute is
a special case, for which the new value passed to the handler is *NULL*.
Python supports two pairs of attribute handlers; a type that supports attributes
only needs to implement the functions for one pair. The difference is that one
pair takes the name of the attribute as a :c:type:`char\*`, while the other
accepts a :c:type:`PyObject\*`. Each type can use whichever pair makes more
sense for the implementation's convenience. ::
getattrfunc tp_getattr; /* char * version */
setattrfunc tp_setattr;
/* ... */
getattrofunc tp_getattro; /* PyObject * version */
setattrofunc tp_setattro;
If accessing attributes of an object is always a simple operation (this will be
explained shortly), there are generic implementations which can be used to
provide the :c:type:`PyObject\*` version of the attribute management functions.
The actual need for type-specific attribute handlers almost completely
disappeared starting with Python 2.2, though there are many examples which have
not been updated to use some of the new generic mechanism that is available.
.. _generic-attribute-management:
Generic Attribute Management
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Most extension types only use *simple* attributes. So, what makes the
attributes simple? There are only a couple of conditions that must be met:
#. The name of the attributes must be known when :c:func:`PyType_Ready` is
called.
#. No special processing is needed to record that an attribute was looked up or
set, nor do actions need to be taken based on the value.
Note that this list does not place any restrictions on the values of the
attributes, when the values are computed, or how relevant data is stored.
When :c:func:`PyType_Ready` is called, it uses three tables referenced by the
type object to create :term:`descriptor`\s which are placed in the dictionary of the
type object. Each descriptor controls access to one attribute of the instance
object. Each of the tables is optional; if all three are *NULL*, instances of
the type will only have attributes that are inherited from their base type, and
should leave the :c:member:`~PyTypeObject.tp_getattro` and :c:member:`~PyTypeObject.tp_setattro` fields *NULL* as
well, allowing the base type to handle attributes.
The tables are declared as three fields of the type object::
struct PyMethodDef *tp_methods;
struct PyMemberDef *tp_members;
struct PyGetSetDef *tp_getset;
If :c:member:`~PyTypeObject.tp_methods` is not *NULL*, it must refer to an array of
:c:type:`PyMethodDef` structures. Each entry in the table is an instance of this
structure::
typedef struct PyMethodDef {
const char *ml_name; /* method name */
PyCFunction ml_meth; /* implementation function */
int ml_flags; /* flags */
const char *ml_doc; /* docstring */
} PyMethodDef;
One entry should be defined for each method provided by the type; no entries are
needed for methods inherited from a base type. One additional entry is needed
at the end; it is a sentinel that marks the end of the array. The
:attr:`ml_name` field of the sentinel must be *NULL*.
The second table is used to define attributes which map directly to data stored
in the instance. A variety of primitive C types are supported, and access may
be read-only or read-write. The structures in the table are defined as::
typedef struct PyMemberDef {
char *name;
int type;
int offset;
int flags;
char *doc;
} PyMemberDef;
For each entry in the table, a :term:`descriptor` will be constructed and added to the
type which will be able to extract a value from the instance structure. The
:attr:`type` field should contain one of the type codes defined in the
:file:`structmember.h` header; the value will be used to determine how to
convert Python values to and from C values. The :attr:`flags` field is used to
store flags which control how the attribute can be accessed.
The following flag constants are defined in :file:`structmember.h`; they may be
combined using bitwise-OR.
+---------------------------+----------------------------------------------+
| Constant | Meaning |
+===========================+==============================================+
| :const:`READONLY` | Never writable. |
+---------------------------+----------------------------------------------+
| :const:`READ_RESTRICTED` | Not readable in restricted mode. |
+---------------------------+----------------------------------------------+
| :const:`WRITE_RESTRICTED` | Not writable in restricted mode. |
+---------------------------+----------------------------------------------+
| :const:`RESTRICTED` | Not readable or writable in restricted mode. |
+---------------------------+----------------------------------------------+
.. index::
single: READONLY
single: READ_RESTRICTED
single: WRITE_RESTRICTED
single: RESTRICTED
An interesting advantage of using the :c:member:`~PyTypeObject.tp_members` table to build
descriptors that are used at runtime is that any attribute defined this way can
have an associated doc string simply by providing the text in the table. An
application can use the introspection API to retrieve the descriptor from the
class object, and get the doc string using its :attr:`__doc__` attribute.
As with the :c:member:`~PyTypeObject.tp_methods` table, a sentinel entry with a :attr:`name` value
of *NULL* is required.
.. XXX Descriptors need to be explained in more detail somewhere, but not here.
Descriptor objects have two handler functions which correspond to the
\member{tp_getattro} and \member{tp_setattro} handlers. The
\method{__get__()} handler is a function which is passed the descriptor,
instance, and type objects, and returns the value of the attribute, or it
returns \NULL{} and sets an exception. The \method{__set__()} handler is
passed the descriptor, instance, type, and new value;
Type-specific Attribute Management
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
For simplicity, only the :c:type:`char\*` version will be demonstrated here; the
type of the name parameter is the only difference between the :c:type:`char\*`
and :c:type:`PyObject\*` flavors of the interface. This example effectively does
the same thing as the generic example above, but does not use the generic
support added in Python 2.2. It explains how the handler functions are
called, so that if you do need to extend their functionality, you'll understand
what needs to be done.
The :c:member:`~PyTypeObject.tp_getattr` handler is called when the object requires an attribute
look-up. It is called in the same situations where the :meth:`__getattr__`
method of a class would be called.
Here is an example::
static PyObject *
newdatatype_getattr(newdatatypeobject *obj, char *name)
{
if (strcmp(name, "data") == 0)
{
return PyLong_FromLong(obj->data);
}
PyErr_Format(PyExc_AttributeError,
"'%.50s' object has no attribute '%.400s'",
tp->tp_name, name);
return NULL;
}
The :c:member:`~PyTypeObject.tp_setattr` handler is called when the :meth:`__setattr__` or
:meth:`__delattr__` method of a class instance would be called. When an
attribute should be deleted, the third parameter will be *NULL*. Here is an
example that simply raises an exception; if this were really all you wanted, the
:c:member:`~PyTypeObject.tp_setattr` handler should be set to *NULL*. ::
static int
newdatatype_setattr(newdatatypeobject *obj, char *name, PyObject *v)
{
PyErr_Format(PyExc_RuntimeError, "Read-only attribute: %s", name);
return -1;
}
Object Comparison
-----------------
::
richcmpfunc tp_richcompare;
The :c:member:`~PyTypeObject.tp_richcompare` handler is called when comparisons are needed. It is
analogous to the :ref:`rich comparison methods <richcmpfuncs>`, like
:meth:`__lt__`, and also called by :c:func:`PyObject_RichCompare` and
:c:func:`PyObject_RichCompareBool`.
This function is called with two Python objects and the operator as arguments,
where the operator is one of ``Py_EQ``, ``Py_NE``, ``Py_LE``, ``Py_GT``,
``Py_LT`` or ``Py_GT``. It should compare the two objects with respect to the
specified operator and return ``Py_True`` or ``Py_False`` if the comparison is
successful, ``Py_NotImplemented`` to indicate that comparison is not
implemented and the other object's comparison method should be tried, or *NULL*
if an exception was set.
Here is a sample implementation, for a datatype that is considered equal if the
size of an internal pointer is equal::
static PyObject *
newdatatype_richcmp(PyObject *obj1, PyObject *obj2, int op)
{
PyObject *result;
int c, size1, size2;
/* code to make sure that both arguments are of type
newdatatype omitted */
size1 = obj1->obj_UnderlyingDatatypePtr->size;
size2 = obj2->obj_UnderlyingDatatypePtr->size;
switch (op) {
case Py_LT: c = size1 < size2; break;
case Py_LE: c = size1 <= size2; break;
case Py_EQ: c = size1 == size2; break;
case Py_NE: c = size1 != size2; break;
case Py_GT: c = size1 > size2; break;
case Py_GE: c = size1 >= size2; break;
}
result = c ? Py_True : Py_False;
Py_INCREF(result);
return result;
}
Abstract Protocol Support
-------------------------
Python supports a variety of *abstract* 'protocols;' the specific interfaces
provided to use these interfaces are documented in :ref:`abstract`.
A number of these abstract interfaces were defined early in the development of
the Python implementation. In particular, the number, mapping, and sequence
protocols have been part of Python since the beginning. Other protocols have
been added over time. For protocols which depend on several handler routines
from the type implementation, the older protocols have been defined as optional
blocks of handlers referenced by the type object. For newer protocols there are
additional slots in the main type object, with a flag bit being set to indicate
that the slots are present and should be checked by the interpreter. (The flag
bit does not indicate that the slot values are non-*NULL*. The flag may be set
to indicate the presence of a slot, but a slot may still be unfilled.) ::
PyNumberMethods *tp_as_number;
PySequenceMethods *tp_as_sequence;
PyMappingMethods *tp_as_mapping;
If you wish your object to be able to act like a number, a sequence, or a
mapping object, then you place the address of a structure that implements the C
type :c:type:`PyNumberMethods`, :c:type:`PySequenceMethods`, or
:c:type:`PyMappingMethods`, respectively. It is up to you to fill in this
structure with appropriate values. You can find examples of the use of each of
these in the :file:`Objects` directory of the Python source distribution. ::
hashfunc tp_hash;
This function, if you choose to provide it, should return a hash number for an
instance of your data type. Here is a simple example::
static Py_hash_t
newdatatype_hash(newdatatypeobject *obj)
{
Py_hash_t result;
result = obj->some_size + 32767 * obj->some_number;
if (result == -1)
result = -2;
return result;
}
:c:type:`Py_hash_t` is a signed integer type with a platform-varying width.
Returning ``-1`` from :c:member:`~PyTypeObject.tp_hash` indicates an error,
which is why you should be careful to avoid returning it when hash computation
is successful, as seen above.
::
ternaryfunc tp_call;
This function is called when an instance of your data type is "called", for
example, if ``obj1`` is an instance of your data type and the Python script
contains ``obj1('hello')``, the :c:member:`~PyTypeObject.tp_call` handler is invoked.
This function takes three arguments:
#. *self* is the instance of the data type which is the subject of the call.
If the call is ``obj1('hello')``, then *self* is ``obj1``.
#. *args* is a tuple containing the arguments to the call. You can use
:c:func:`PyArg_ParseTuple` to extract the arguments.
#. *kwds* is a dictionary of keyword arguments that were passed. If this is
non-*NULL* and you support keyword arguments, use
:c:func:`PyArg_ParseTupleAndKeywords` to extract the arguments. If you
do not want to support keyword arguments and this is non-*NULL*, raise a
:exc:`TypeError` with a message saying that keyword arguments are not supported.
Here is a toy ``tp_call`` implementation::
static PyObject *
newdatatype_call(newdatatypeobject *self, PyObject *args, PyObject *kwds)
{
PyObject *result;
char *arg1;
char *arg2;
char *arg3;
if (!PyArg_ParseTuple(args, "sss:call", &arg1, &arg2, &arg3)) {
return NULL;
}
result = PyUnicode_FromFormat(
"Returning -- value: [%d] arg1: [%s] arg2: [%s] arg3: [%s]\n",
obj->obj_UnderlyingDatatypePtr->size,
arg1, arg2, arg3);
return result;
}
::
/* Iterators */
getiterfunc tp_iter;
iternextfunc tp_iternext;
These functions provide support for the iterator protocol. Both handlers
take exactly one parameter, the instance for which they are being called,
and return a new reference. In the case of an error, they should set an
exception and return *NULL*. :c:member:`~PyTypeObject.tp_iter` corresponds
to the Python :meth:`__iter__` method, while :c:member:`~PyTypeObject.tp_iternext`
corresponds to the Python :meth:`~iterator.__next__` method.
Any :term:`iterable` object must implement the :c:member:`~PyTypeObject.tp_iter`
handler, which must return an :term:`iterator` object. Here the same guidelines
apply as for Python classes:
* For collections (such as lists and tuples) which can support multiple
independent iterators, a new iterator should be created and returned by
each call to :c:member:`~PyTypeObject.tp_iter`.
* Objects which can only be iterated over once (usually due to side effects of
iteration, such as file objects) can implement :c:member:`~PyTypeObject.tp_iter`
by returning a new reference to themselves -- and should also therefore
implement the :c:member:`~PyTypeObject.tp_iternext` handler.
Any :term:`iterator` object should implement both :c:member:`~PyTypeObject.tp_iter`
and :c:member:`~PyTypeObject.tp_iternext`. An iterator's
:c:member:`~PyTypeObject.tp_iter` handler should return a new reference
to the iterator. Its :c:member:`~PyTypeObject.tp_iternext` handler should
return a new reference to the next object in the iteration, if there is one.
If the iteration has reached the end, :c:member:`~PyTypeObject.tp_iternext`
may return *NULL* without setting an exception, or it may set
:exc:`StopIteration` *in addition* to returning *NULL*; avoiding
the exception can yield slightly better performance. If an actual error
occurs, :c:member:`~PyTypeObject.tp_iternext` should always set an exception
and return *NULL*.
.. _weakref-support:
Weak Reference Support
----------------------
One of the goals of Python's weak reference implementation is to allow any type
to participate in the weak reference mechanism without incurring the overhead on
performance-critical objects (such as numbers).
.. seealso::
Documentation for the :mod:`weakref` module.
For an object to be weakly referencable, the extension type must do two things:
#. Include a :c:type:`PyObject\*` field in the C object structure dedicated to
the weak reference mechanism. The object's constructor should leave it
*NULL* (which is automatic when using the default
:c:member:`~PyTypeObject.tp_alloc`).
#. Set the :c:member:`~PyTypeObject.tp_weaklistoffset` type member
to the offset of the aforementioned field in the C object structure,
so that the interpreter knows how to access and modify that field.
Concretely, here is how a trivial object structure would be augmented
with the required field::
typedef struct {
PyObject_HEAD
PyObject *weakreflist; /* List of weak references */
} TrivialObject;
And the corresponding member in the statically-declared type object::
static PyTypeObject TrivialType = {
PyVarObject_HEAD_INIT(NULL, 0)
/* ... other members omitted for brevity ... */
.tp_weaklistoffset = offsetof(TrivialObject, weakreflist),
};
The only further addition is that ``tp_dealloc`` needs to clear any weak
references (by calling :c:func:`PyObject_ClearWeakRefs`) if the field is
non-*NULL*::
static void
Trivial_dealloc(TrivialObject *self)
{
/* Clear weakrefs first before calling any destructors */
if (self->weakreflist != NULL)
PyObject_ClearWeakRefs((PyObject *) self);
/* ... remainder of destruction code omitted for brevity ... */
Py_TYPE(self)->tp_free((PyObject *) self);
}
More Suggestions
----------------
In order to learn how to implement any specific method for your new data type,
get the :term:`CPython` source code. Go to the :file:`Objects` directory,
then search the C source files for ``tp_`` plus the function you want
(for example, ``tp_richcompare``). You will find examples of the function
you want to implement.
When you need to verify that an object is a concrete instance of the type you
are implementing, use the :c:func:`PyObject_TypeCheck` function. A sample of
its use might be something like the following::
if (!PyObject_TypeCheck(some_object, &MyType)) {
PyErr_SetString(PyExc_TypeError, "arg #1 not a mything");
return NULL;
}
.. seealso::
Download CPython source releases.
https://www.python.org/downloads/source/
The CPython project on GitHub, where the CPython source code is developed.
https://github.com/python/cpython

View file

@ -0,0 +1,896 @@
.. highlightlang:: c
.. _defining-new-types:
**********************************
Defining Extension Types: Tutorial
**********************************
.. sectionauthor:: Michael Hudson <mwh@python.net>
.. sectionauthor:: Dave Kuhlman <dkuhlman@rexx.com>
.. sectionauthor:: Jim Fulton <jim@zope.com>
Python allows the writer of a C extension module to define new types that
can be manipulated from Python code, much like the built-in :class:`str`
and :class:`list` types. The code for all extension types follows a
pattern, but there are some details that you need to understand before you
can get started. This document is a gentle introduction to the topic.
.. _dnt-basics:
The Basics
==========
The :term:`CPython` runtime sees all Python objects as variables of type
:c:type:`PyObject\*`, which serves as a "base type" for all Python objects.
The :c:type:`PyObject` structure itself only contains the object's
:term:`reference count` and a pointer to the object's "type object".
This is where the action is; the type object determines which (C) functions
get called by the interpreter when, for instance, an attribute gets looked up
on an object, a method called, or it is multiplied by another object. These
C functions are called "type methods".
So, if you want to define a new extension type, you need to create a new type
object.
This sort of thing can only be explained by example, so here's a minimal, but
complete, module that defines a new type named :class:`Custom` inside a C
extension module :mod:`custom`:
.. note::
What we're showing here is the traditional way of defining *static*
extension types. It should be adequate for most uses. The C API also
allows defining heap-allocated extension types using the
:c:func:`PyType_FromSpec` function, which isn't covered in this tutorial.
.. literalinclude:: ../includes/custom.c
Now that's quite a bit to take in at once, but hopefully bits will seem familiar
from the previous chapter. This file defines three things:
#. What a :class:`Custom` **object** contains: this is the ``CustomObject``
struct, which is allocated once for each :class:`Custom` instance.
#. How the :class:`Custom` **type** behaves: this is the ``CustomType`` struct,
which defines a set of flags and function pointers that the interpreter
inspects when specific operations are requested.
#. How to initialize the :mod:`custom` module: this is the ``PyInit_custom``
function and the associated ``custommodule`` struct.
The first bit is::
typedef struct {
PyObject_HEAD
} CustomObject;
This is what a Custom object will contain. ``PyObject_HEAD`` is mandatory
at the start of each object struct and defines a field called ``ob_base``
of type :c:type:`PyObject`, containing a pointer to a type object and a
reference count (these can be accessed using the macros :c:macro:`Py_REFCNT`
and :c:macro:`Py_TYPE` respectively). The reason for the macro is to
abstract away the layout and to enable additional fields in debug builds.
.. note::
There is no semicolon above after the :c:macro:`PyObject_HEAD` macro.
Be wary of adding one by accident: some compilers will complain.
Of course, objects generally store additional data besides the standard
``PyObject_HEAD`` boilerplate; for example, here is the definition for
standard Python floats::
typedef struct {
PyObject_HEAD
double ob_fval;
} PyFloatObject;
The second bit is the definition of the type object. ::
static PyTypeObject CustomType = {
PyVarObject_HEAD_INIT(NULL, 0)
.tp_name = "custom.Custom",
.tp_doc = "Custom objects",
.tp_basicsize = sizeof(CustomObject),
.tp_itemsize = 0,
.tp_new = PyType_GenericNew,
};
.. note::
We recommend using C99-style designated initializers as above, to
avoid listing all the :c:type:`PyTypeObject` fields that you don't care
about and also to avoid caring about the fields' declaration order.
The actual definition of :c:type:`PyTypeObject` in :file:`object.h` has
many more :ref:`fields <type-structs>` than the definition above. The
remaining fields will be filled with zeros by the C compiler, and it's
common practice to not specify them explicitly unless you need them.
We're going to pick it apart, one field at a time::
PyVarObject_HEAD_INIT(NULL, 0)
This line is mandatory boilerplate to initialize the ``ob_base``
field mentioned above. ::
.tp_name = "custom.Custom",
The name of our type. This will appear in the default textual representation of
our objects and in some error messages, for example:
.. code-block:: pycon
>>> "" + custom.Custom()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can only concatenate str (not "custom.Custom") to str
Note that the name is a dotted name that includes both the module name and the
name of the type within the module. The module in this case is :mod:`custom` and
the type is :class:`Custom`, so we set the type name to :class:`custom.Custom`.
Using the real dotted import path is important to make your type compatible
with the :mod:`pydoc` and :mod:`pickle` modules. ::
.tp_basicsize = sizeof(CustomObject),
.tp_itemsize = 0,
This is so that Python knows how much memory to allocate when creating
new :class:`Custom` instances. :c:member:`~PyTypeObject.tp_itemsize` is
only used for variable-sized objects and should otherwise be zero.
.. note::
If you want your type to be subclassable from Python, and your type has the same
:c:member:`~PyTypeObject.tp_basicsize` as its base type, you may have problems with multiple
inheritance. A Python subclass of your type will have to list your type first
in its :attr:`~class.__bases__`, or else it will not be able to call your type's
:meth:`__new__` method without getting an error. You can avoid this problem by
ensuring that your type has a larger value for :c:member:`~PyTypeObject.tp_basicsize` than its
base type does. Most of the time, this will be true anyway, because either your
base type will be :class:`object`, or else you will be adding data members to
your base type, and therefore increasing its size.
We set the class flags to :const:`Py_TPFLAGS_DEFAULT`. ::
.tp_flags = Py_TPFLAGS_DEFAULT,
All types should include this constant in their flags. It enables all of the
members defined until at least Python 3.3. If you need further members,
you will need to OR the corresponding flags.
We provide a doc string for the type in :c:member:`~PyTypeObject.tp_doc`. ::
.tp_doc = "Custom objects",
To enable object creation, we have to provide a :c:member:`~PyTypeObject.tp_new`
handler. This is the equivalent of the Python method :meth:`__new__`, but
has to be specified explicitly. In this case, we can just use the default
implementation provided by the API function :c:func:`PyType_GenericNew`. ::
.tp_new = PyType_GenericNew,
Everything else in the file should be familiar, except for some code in
:c:func:`PyInit_custom`::
if (PyType_Ready(&CustomType) < 0)
return;
This initializes the :class:`Custom` type, filling in a number of members
to the appropriate default values, including :attr:`ob_type` that we initially
set to *NULL*. ::
PyModule_AddObject(m, "Custom", (PyObject *) &CustomType);
This adds the type to the module dictionary. This allows us to create
:class:`Custom` instances by calling the :class:`Custom` class:
.. code-block:: pycon
>>> import custom
>>> mycustom = custom.Custom()
That's it! All that remains is to build it; put the above code in a file called
:file:`custom.c` and:
.. code-block:: python
from distutils.core import setup, Extension
setup(name="custom", version="1.0",
ext_modules=[Extension("custom", ["custom.c"])])
in a file called :file:`setup.py`; then typing
.. code-block:: shell-session
$ python setup.py build
at a shell should produce a file :file:`custom.so` in a subdirectory; move to
that directory and fire up Python --- you should be able to ``import custom`` and
play around with Custom objects.
That wasn't so hard, was it?
Of course, the current Custom type is pretty uninteresting. It has no data and
doesn't do anything. It can't even be subclassed.
.. note::
While this documentation showcases the standard :mod:`distutils` module
for building C extensions, it is recommended in real-world use cases to
use the newer and better-maintained ``setuptools`` library. Documentation
on how to do this is out of scope for this document and can be found in
the `Python Packaging User's Guide <https://packaging.python.org/tutorials/distributing-packages/>`_.
Adding data and methods to the Basic example
============================================
Let's extend the basic example to add some data and methods. Let's also make
the type usable as a base class. We'll create a new module, :mod:`custom2` that
adds these capabilities:
.. literalinclude:: ../includes/custom2.c
This version of the module has a number of changes.
We've added an extra include::
#include <structmember.h>
This include provides declarations that we use to handle attributes, as
described a bit later.
The :class:`Custom` type now has three data attributes in its C struct,
*first*, *last*, and *number*. The *first* and *last* variables are Python
strings containing first and last names. The *number* attribute is a C integer.
The object structure is updated accordingly::
typedef struct {
PyObject_HEAD
PyObject *first; /* first name */
PyObject *last; /* last name */
int number;
} CustomObject;
Because we now have data to manage, we have to be more careful about object
allocation and deallocation. At a minimum, we need a deallocation method::
static void
Custom_dealloc(CustomObject *self)
{
Py_XDECREF(self->first);
Py_XDECREF(self->last);
Py_TYPE(self)->tp_free((PyObject *) self);
}
which is assigned to the :c:member:`~PyTypeObject.tp_dealloc` member::
.tp_dealloc = (destructor) Custom_dealloc,
This method first clears the reference counts of the two Python attributes.
:c:func:`Py_XDECREF` correctly handles the case where its argument is
*NULL* (which might happen here if ``tp_new`` failed midway). It then
calls the :c:member:`~PyTypeObject.tp_free` member of the object's type
(computed by ``Py_TYPE(self)``) to free the object's memory. Note that
the object's type might not be :class:`CustomType`, because the object may
be an instance of a subclass.
.. note::
The explicit cast to ``destructor`` above is needed because we defined
``Custom_dealloc`` to take a ``CustomObject *`` argument, but the ``tp_dealloc``
function pointer expects to receive a ``PyObject *`` argument. Otherwise,
the compiler will emit a warning. This is object-oriented polymorphism,
in C!
We want to make sure that the first and last names are initialized to empty
strings, so we provide a ``tp_new`` implementation::
static PyObject *
Custom_new(PyTypeObject *type, PyObject *args, PyObject *kwds)
{
CustomObject *self;
self = (CustomObject *) type->tp_alloc(type, 0);
if (self != NULL) {
self->first = PyUnicode_FromString("");
if (self->first == NULL) {
Py_DECREF(self);
return NULL;
}
self->last = PyUnicode_FromString("");
if (self->last == NULL) {
Py_DECREF(self);
return NULL;
}
self->number = 0;
}
return (PyObject *) self;
}
and install it in the :c:member:`~PyTypeObject.tp_new` member::
.tp_new = Custom_new,
The ``tp_new`` handler is responsible for creating (as opposed to initializing)
objects of the type. It is exposed in Python as the :meth:`__new__` method.
It is not required to define a ``tp_new`` member, and indeed many extension
types will simply reuse :c:func:`PyType_GenericNew` as done in the first
version of the ``Custom`` type above. In this case, we use the ``tp_new``
handler to initialize the ``first`` and ``last`` attributes to non-*NULL*
default values.
``tp_new`` is passed the type being instantiated (not necessarily ``CustomType``,
if a subclass is instantiated) and any arguments passed when the type was
called, and is expected to return the instance created. ``tp_new`` handlers
always accept positional and keyword arguments, but they often ignore the
arguments, leaving the argument handling to initializer (a.k.a. ``tp_init``
in C or ``__init__`` in Python) methods.
.. note::
``tp_new`` shouldn't call ``tp_init`` explicitly, as the interpreter
will do it itself.
The ``tp_new`` implementation calls the :c:member:`~PyTypeObject.tp_alloc`
slot to allocate memory::
self = (CustomObject *) type->tp_alloc(type, 0);
Since memory allocation may fail, we must check the :c:member:`~PyTypeObject.tp_alloc`
result against *NULL* before proceeding.
.. note::
We didn't fill the :c:member:`~PyTypeObject.tp_alloc` slot ourselves. Rather
:c:func:`PyType_Ready` fills it for us by inheriting it from our base class,
which is :class:`object` by default. Most types use the default allocation
strategy.
.. note::
If you are creating a co-operative :c:member:`~PyTypeObject.tp_new` (one
that calls a base type's :c:member:`~PyTypeObject.tp_new` or :meth:`__new__`),
you must *not* try to determine what method to call using method resolution
order at runtime. Always statically determine what type you are going to
call, and call its :c:member:`~PyTypeObject.tp_new` directly, or via
``type->tp_base->tp_new``. If you do not do this, Python subclasses of your
type that also inherit from other Python-defined classes may not work correctly.
(Specifically, you may not be able to create instances of such subclasses
without getting a :exc:`TypeError`.)
We also define an initialization function which accepts arguments to provide
initial values for our instance::
static int
Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
{
static char *kwlist[] = {"first", "last", "number", NULL};
PyObject *first = NULL, *last = NULL, *tmp;
if (!PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist,
&first, &last,
&self->number))
return -1;
if (first) {
tmp = self->first;
Py_INCREF(first);
self->first = first;
Py_XDECREF(tmp);
}
if (last) {
tmp = self->last;
Py_INCREF(last);
self->last = last;
Py_XDECREF(tmp);
}
return 0;
}
by filling the :c:member:`~PyTypeObject.tp_init` slot. ::
.tp_init = (initproc) Custom_init,
The :c:member:`~PyTypeObject.tp_init` slot is exposed in Python as the
:meth:`__init__` method. It is used to initialize an object after it's
created. Initializers always accept positional and keyword arguments,
and they should return either ``0`` on success or ``-1`` on error.
Unlike the ``tp_new`` handler, there is no guarantee that ``tp_init``
is called at all (for example, the :mod:`pickle` module by default
doesn't call :meth:`__init__` on unpickled instances). It can also be
called multiple times. Anyone can call the :meth:`__init__` method on
our objects. For this reason, we have to be extra careful when assigning
the new attribute values. We might be tempted, for example to assign the
``first`` member like this::
if (first) {
Py_XDECREF(self->first);
Py_INCREF(first);
self->first = first;
}
But this would be risky. Our type doesn't restrict the type of the
``first`` member, so it could be any kind of object. It could have a
destructor that causes code to be executed that tries to access the
``first`` member; or that destructor could release the
:term:`Global interpreter Lock` and let arbitrary code run in other
threads that accesses and modifies our object.
To be paranoid and protect ourselves against this possibility, we almost
always reassign members before decrementing their reference counts. When
don't we have to do this?
* when we absolutely know that the reference count is greater than 1;
* when we know that deallocation of the object [#]_ will neither release
the :term:`GIL` nor cause any calls back into our type's code;
* when decrementing a reference count in a :c:member:`~PyTypeObject.tp_dealloc`
handler on a type which doesn't support cyclic garbage collection [#]_.
We want to expose our instance variables as attributes. There are a
number of ways to do that. The simplest way is to define member definitions::
static PyMemberDef Custom_members[] = {
{"first", T_OBJECT_EX, offsetof(CustomObject, first), 0,
"first name"},
{"last", T_OBJECT_EX, offsetof(CustomObject, last), 0,
"last name"},
{"number", T_INT, offsetof(CustomObject, number), 0,
"custom number"},
{NULL} /* Sentinel */
};
and put the definitions in the :c:member:`~PyTypeObject.tp_members` slot::
.tp_members = Custom_members,
Each member definition has a member name, type, offset, access flags and
documentation string. See the :ref:`Generic-Attribute-Management` section
below for details.
A disadvantage of this approach is that it doesn't provide a way to restrict the
types of objects that can be assigned to the Python attributes. We expect the
first and last names to be strings, but any Python objects can be assigned.
Further, the attributes can be deleted, setting the C pointers to *NULL*. Even
though we can make sure the members are initialized to non-*NULL* values, the
members can be set to *NULL* if the attributes are deleted.
We define a single method, :meth:`Custom.name()`, that outputs the objects name as the
concatenation of the first and last names. ::
static PyObject *
Custom_name(CustomObject *self)
{
if (self->first == NULL) {
PyErr_SetString(PyExc_AttributeError, "first");
return NULL;
}
if (self->last == NULL) {
PyErr_SetString(PyExc_AttributeError, "last");
return NULL;
}
return PyUnicode_FromFormat("%S %S", self->first, self->last);
}
The method is implemented as a C function that takes a :class:`Custom` (or
:class:`Custom` subclass) instance as the first argument. Methods always take an
instance as the first argument. Methods often take positional and keyword
arguments as well, but in this case we don't take any and don't need to accept
a positional argument tuple or keyword argument dictionary. This method is
equivalent to the Python method:
.. code-block:: python
def name(self):
return "%s %s" % (self.first, self.last)
Note that we have to check for the possibility that our :attr:`first` and
:attr:`last` members are *NULL*. This is because they can be deleted, in which
case they are set to *NULL*. It would be better to prevent deletion of these
attributes and to restrict the attribute values to be strings. We'll see how to
do that in the next section.
Now that we've defined the method, we need to create an array of method
definitions::
static PyMethodDef Custom_methods[] = {
{"name", (PyCFunction) Custom_name, METH_NOARGS,
"Return the name, combining the first and last name"
},
{NULL} /* Sentinel */
};
(note that we used the :const:`METH_NOARGS` flag to indicate that the method
is expecting no arguments other than *self*)
and assign it to the :c:member:`~PyTypeObject.tp_methods` slot::
.tp_methods = Custom_methods,
Finally, we'll make our type usable as a base class for subclassing. We've
written our methods carefully so far so that they don't make any assumptions
about the type of the object being created or used, so all we need to do is
to add the :const:`Py_TPFLAGS_BASETYPE` to our class flag definition::
.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE,
We rename :c:func:`PyInit_custom` to :c:func:`PyInit_custom2`, update the
module name in the :c:type:`PyModuleDef` struct, and update the full class
name in the :c:type:`PyTypeObject` struct.
Finally, we update our :file:`setup.py` file to build the new module:
.. code-block:: python
from distutils.core import setup, Extension
setup(name="custom", version="1.0",
ext_modules=[
Extension("custom", ["custom.c"]),
Extension("custom2", ["custom2.c"]),
])
Providing finer control over data attributes
============================================
In this section, we'll provide finer control over how the :attr:`first` and
:attr:`last` attributes are set in the :class:`Custom` example. In the previous
version of our module, the instance variables :attr:`first` and :attr:`last`
could be set to non-string values or even deleted. We want to make sure that
these attributes always contain strings.
.. literalinclude:: ../includes/custom3.c
To provide greater control, over the :attr:`first` and :attr:`last` attributes,
we'll use custom getter and setter functions. Here are the functions for
getting and setting the :attr:`first` attribute::
static PyObject *
Custom_getfirst(CustomObject *self, void *closure)
{
Py_INCREF(self->first);
return self->first;
}
static int
Custom_setfirst(CustomObject *self, PyObject *value, void *closure)
{
PyObject *tmp;
if (value == NULL) {
PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute");
return -1;
}
if (!PyUnicode_Check(value)) {
PyErr_SetString(PyExc_TypeError,
"The first attribute value must be a string");
return -1;
}
tmp = self->first;
Py_INCREF(value);
self->first = value;
Py_DECREF(tmp);
return 0;
}
The getter function is passed a :class:`Custom` object and a "closure", which is
a void pointer. In this case, the closure is ignored. (The closure supports an
advanced usage in which definition data is passed to the getter and setter. This
could, for example, be used to allow a single set of getter and setter functions
that decide the attribute to get or set based on data in the closure.)
The setter function is passed the :class:`Custom` object, the new value, and the
closure. The new value may be *NULL*, in which case the attribute is being
deleted. In our setter, we raise an error if the attribute is deleted or if its
new value is not a string.
We create an array of :c:type:`PyGetSetDef` structures::
static PyGetSetDef Custom_getsetters[] = {
{"first", (getter) Custom_getfirst, (setter) Custom_setfirst,
"first name", NULL},
{"last", (getter) Custom_getlast, (setter) Custom_setlast,
"last name", NULL},
{NULL} /* Sentinel */
};
and register it in the :c:member:`~PyTypeObject.tp_getset` slot::
.tp_getset = Custom_getsetters,
The last item in a :c:type:`PyGetSetDef` structure is the "closure" mentioned
above. In this case, we aren't using a closure, so we just pass *NULL*.
We also remove the member definitions for these attributes::
static PyMemberDef Custom_members[] = {
{"number", T_INT, offsetof(CustomObject, number), 0,
"custom number"},
{NULL} /* Sentinel */
};
We also need to update the :c:member:`~PyTypeObject.tp_init` handler to only
allow strings [#]_ to be passed::
static int
Custom_init(CustomObject *self, PyObject *args, PyObject *kwds)
{
static char *kwlist[] = {"first", "last", "number", NULL};
PyObject *first = NULL, *last = NULL, *tmp;
if (!PyArg_ParseTupleAndKeywords(args, kwds, "|UUi", kwlist,
&first, &last,
&self->number))
return -1;
if (first) {
tmp = self->first;
Py_INCREF(first);
self->first = first;
Py_DECREF(tmp);
}
if (last) {
tmp = self->last;
Py_INCREF(last);
self->last = last;
Py_DECREF(tmp);
}
return 0;
}
With these changes, we can assure that the ``first`` and ``last`` members are
never *NULL* so we can remove checks for *NULL* values in almost all cases.
This means that most of the :c:func:`Py_XDECREF` calls can be converted to
:c:func:`Py_DECREF` calls. The only place we can't change these calls is in
the ``tp_dealloc`` implementation, where there is the possibility that the
initialization of these members failed in ``tp_new``.
We also rename the module initialization function and module name in the
initialization function, as we did before, and we add an extra definition to the
:file:`setup.py` file.
Supporting cyclic garbage collection
====================================
Python has a :term:`cyclic garbage collector (GC) <garbage collection>` that
can identify unneeded objects even when their reference counts are not zero.
This can happen when objects are involved in cycles. For example, consider:
.. code-block:: pycon
>>> l = []
>>> l.append(l)
>>> del l
In this example, we create a list that contains itself. When we delete it, it
still has a reference from itself. Its reference count doesn't drop to zero.
Fortunately, Python's cyclic garbage collector will eventually figure out that
the list is garbage and free it.
In the second version of the :class:`Custom` example, we allowed any kind of
object to be stored in the :attr:`first` or :attr:`last` attributes [#]_.
Besides, in the second and third versions, we allowed subclassing
:class:`Custom`, and subclasses may add arbitrary attributes. For any of
those two reasons, :class:`Custom` objects can participate in cycles:
.. code-block:: pycon
>>> import custom3
>>> class Derived(custom3.Custom): pass
...
>>> n = Derived()
>>> n.some_attribute = n
To allow a :class:`Custom` instance participating in a reference cycle to
be properly detected and collected by the cyclic GC, our :class:`Custom` type
needs to fill two additional slots and to enable a flag that enables these slots:
.. literalinclude:: ../includes/custom4.c
First, the traversal method lets the cyclic GC know about subobjects that could
participate in cycles::
static int
Custom_traverse(CustomObject *self, visitproc visit, void *arg)
{
int vret;
if (self->first) {
vret = visit(self->first, arg);
if (vret != 0)
return vret;
}
if (self->last) {
vret = visit(self->last, arg);
if (vret != 0)
return vret;
}
return 0;
}
For each subobject that can participate in cycles, we need to call the
:c:func:`visit` function, which is passed to the traversal method. The
:c:func:`visit` function takes as arguments the subobject and the extra argument
*arg* passed to the traversal method. It returns an integer value that must be
returned if it is non-zero.
Python provides a :c:func:`Py_VISIT` macro that automates calling visit
functions. With :c:func:`Py_VISIT`, we can minimize the amount of boilerplate
in ``Custom_traverse``::
static int
Custom_traverse(CustomObject *self, visitproc visit, void *arg)
{
Py_VISIT(self->first);
Py_VISIT(self->last);
return 0;
}
.. note::
The :c:member:`~PyTypeObject.tp_traverse` implementation must name its
arguments exactly *visit* and *arg* in order to use :c:func:`Py_VISIT`.
Second, we need to provide a method for clearing any subobjects that can
participate in cycles::
static int
Custom_clear(CustomObject *self)
{
Py_CLEAR(self->first);
Py_CLEAR(self->last);
return 0;
}
Notice the use of the :c:func:`Py_CLEAR` macro. It is the recommended and safe
way to clear data attributes of arbitrary types while decrementing
their reference counts. If you were to call :c:func:`Py_XDECREF` instead
on the attribute before setting it to *NULL*, there is a possibility
that the attribute's destructor would call back into code that reads the
attribute again (*especially* if there is a reference cycle).
.. note::
You could emulate :c:func:`Py_CLEAR` by writing::
PyObject *tmp;
tmp = self->first;
self->first = NULL;
Py_XDECREF(tmp);
Nevertheless, it is much easier and less error-prone to always
use :c:func:`Py_CLEAR` when deleting an attribute. Don't
try to micro-optimize at the expense of robustness!
The deallocator ``Custom_dealloc`` may call arbitrary code when clearing
attributes. It means the circular GC can be triggered inside the function.
Since the GC assumes reference count is not zero, we need to untrack the object
from the GC by calling :c:func:`PyObject_GC_UnTrack` before clearing members.
Here is our reimplemented deallocator using :c:func:`PyObject_GC_UnTrack`
and ``Custom_clear``::
static void
Custom_dealloc(CustomObject *self)
{
PyObject_GC_UnTrack(self);
Custom_clear(self);
Py_TYPE(self)->tp_free((PyObject *) self);
}
Finally, we add the :const:`Py_TPFLAGS_HAVE_GC` flag to the class flags::
.tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC,
That's pretty much it. If we had written custom :c:member:`~PyTypeObject.tp_alloc` or
:c:member:`~PyTypeObject.tp_free` handlers, we'd need to modify them for cyclic
garbage collection. Most extensions will use the versions automatically provided.
Subclassing other types
=======================
It is possible to create new extension types that are derived from existing
types. It is easiest to inherit from the built in types, since an extension can
easily use the :c:type:`PyTypeObject` it needs. It can be difficult to share
these :c:type:`PyTypeObject` structures between extension modules.
In this example we will create a :class:`SubList` type that inherits from the
built-in :class:`list` type. The new type will be completely compatible with
regular lists, but will have an additional :meth:`increment` method that
increases an internal counter:
.. code-block:: pycon
>>> import sublist
>>> s = sublist.SubList(range(3))
>>> s.extend(s)
>>> print(len(s))
6
>>> print(s.increment())
1
>>> print(s.increment())
2
.. literalinclude:: ../includes/sublist.c
As you can see, the source code closely resembles the :class:`Custom` examples in
previous sections. We will break down the main differences between them. ::
typedef struct {
PyListObject list;
int state;
} SubListObject;
The primary difference for derived type objects is that the base type's
object structure must be the first value. The base type will already include
the :c:func:`PyObject_HEAD` at the beginning of its structure.
When a Python object is a :class:`SubList` instance, its ``PyObject *`` pointer
can be safely cast to both ``PyListObject *`` and ``SubListObject *``::
static int
SubList_init(SubListObject *self, PyObject *args, PyObject *kwds)
{
if (PyList_Type.tp_init((PyObject *) self, args, kwds) < 0)
return -1;
self->state = 0;
return 0;
}
We see above how to call through to the :attr:`__init__` method of the base
type.
This pattern is important when writing a type with custom
:c:member:`~PyTypeObject.tp_new` and :c:member:`~PyTypeObject.tp_dealloc`
members. The :c:member:`~PyTypeObject.tp_new` handler should not actually
create the memory for the object with its :c:member:`~PyTypeObject.tp_alloc`,
but let the base class handle it by calling its own :c:member:`~PyTypeObject.tp_new`.
The :c:type:`PyTypeObject` struct supports a :c:member:`~PyTypeObject.tp_base`
specifying the type's concrete base class. Due to cross-platform compiler
issues, you can't fill that field directly with a reference to
:c:type:`PyList_Type`; it should be done later in the module initialization
function::
PyMODINIT_FUNC
PyInit_sublist(void)
{
PyObject* m;
SubListType.tp_base = &PyList_Type;
if (PyType_Ready(&SubListType) < 0)
return NULL;
m = PyModule_Create(&sublistmodule);
if (m == NULL)
return NULL;
Py_INCREF(&SubListType);
PyModule_AddObject(m, "SubList", (PyObject *) &SubListType);
return m;
}
Before calling :c:func:`PyType_Ready`, the type structure must have the
:c:member:`~PyTypeObject.tp_base` slot filled in. When we are deriving an
existing type, it is not necessary to fill out the :c:member:`~PyTypeObject.tp_alloc`
slot with :c:func:`PyType_GenericNew` -- the allocation function from the base
type will be inherited.
After that, calling :c:func:`PyType_Ready` and adding the type object to the
module is the same as with the basic :class:`Custom` examples.
.. rubric:: Footnotes
.. [#] This is true when we know that the object is a basic type, like a string or a
float.
.. [#] We relied on this in the :c:member:`~PyTypeObject.tp_dealloc` handler
in this example, because our type doesn't support garbage collection.
.. [#] We now know that the first and last members are strings, so perhaps we
could be less careful about decrementing their reference counts, however,
we accept instances of string subclasses. Even though deallocating normal
strings won't call back into our objects, we can't guarantee that deallocating
an instance of a string subclass won't call back into our objects.
.. [#] Also, even with our attributes restricted to strings instances, the user
could pass arbitrary :class:`str` subclasses and therefore still create
reference cycles.

View file

@ -0,0 +1,137 @@
.. highlightlang:: c
.. _building-on-windows:
****************************************
Building C and C++ Extensions on Windows
****************************************
This chapter briefly explains how to create a Windows extension module for
Python using Microsoft Visual C++, and follows with more detailed background
information on how it works. The explanatory material is useful for both the
Windows programmer learning to build Python extensions and the Unix programmer
interested in producing software which can be successfully built on both Unix
and Windows.
Module authors are encouraged to use the distutils approach for building
extension modules, instead of the one described in this section. You will still
need the C compiler that was used to build Python; typically Microsoft Visual
C++.
.. note::
This chapter mentions a number of filenames that include an encoded Python
version number. These filenames are represented with the version number shown
as ``XY``; in practice, ``'X'`` will be the major version number and ``'Y'``
will be the minor version number of the Python release you're working with. For
example, if you are using Python 2.2.1, ``XY`` will actually be ``22``.
.. _win-cookbook:
A Cookbook Approach
===================
There are two approaches to building extension modules on Windows, just as there
are on Unix: use the :mod:`distutils` package to control the build process, or
do things manually. The distutils approach works well for most extensions;
documentation on using :mod:`distutils` to build and package extension modules
is available in :ref:`distutils-index`. If you find you really need to do
things manually, it may be instructive to study the project file for the
:source:`winsound <PCbuild/winsound.vcxproj>` standard library module.
.. _dynamic-linking:
Differences Between Unix and Windows
====================================
.. sectionauthor:: Chris Phoenix <cphoenix@best.com>
Unix and Windows use completely different paradigms for run-time loading of
code. Before you try to build a module that can be dynamically loaded, be aware
of how your system works.
In Unix, a shared object (:file:`.so`) file contains code to be used by the
program, and also the names of functions and data that it expects to find in the
program. When the file is joined to the program, all references to those
functions and data in the file's code are changed to point to the actual
locations in the program where the functions and data are placed in memory.
This is basically a link operation.
In Windows, a dynamic-link library (:file:`.dll`) file has no dangling
references. Instead, an access to functions or data goes through a lookup
table. So the DLL code does not have to be fixed up at runtime to refer to the
program's memory; instead, the code already uses the DLL's lookup table, and the
lookup table is modified at runtime to point to the functions and data.
In Unix, there is only one type of library file (:file:`.a`) which contains code
from several object files (:file:`.o`). During the link step to create a shared
object file (:file:`.so`), the linker may find that it doesn't know where an
identifier is defined. The linker will look for it in the object files in the
libraries; if it finds it, it will include all the code from that object file.
In Windows, there are two types of library, a static library and an import
library (both called :file:`.lib`). A static library is like a Unix :file:`.a`
file; it contains code to be included as necessary. An import library is
basically used only to reassure the linker that a certain identifier is legal,
and will be present in the program when the DLL is loaded. So the linker uses
the information from the import library to build the lookup table for using
identifiers that are not included in the DLL. When an application or a DLL is
linked, an import library may be generated, which will need to be used for all
future DLLs that depend on the symbols in the application or DLL.
Suppose you are building two dynamic-load modules, B and C, which should share
another block of code A. On Unix, you would *not* pass :file:`A.a` to the
linker for :file:`B.so` and :file:`C.so`; that would cause it to be included
twice, so that B and C would each have their own copy. In Windows, building
:file:`A.dll` will also build :file:`A.lib`. You *do* pass :file:`A.lib` to the
linker for B and C. :file:`A.lib` does not contain code; it just contains
information which will be used at runtime to access A's code.
In Windows, using an import library is sort of like using ``import spam``; it
gives you access to spam's names, but does not create a separate copy. On Unix,
linking with a library is more like ``from spam import *``; it does create a
separate copy.
.. _win-dlls:
Using DLLs in Practice
======================
.. sectionauthor:: Chris Phoenix <cphoenix@best.com>
Windows Python is built in Microsoft Visual C++; using other compilers may or
may not work (though Borland seems to). The rest of this section is MSVC++
specific.
When creating DLLs in Windows, you must pass :file:`pythonXY.lib` to the linker.
To build two DLLs, spam and ni (which uses C functions found in spam), you could
use these commands::
cl /LD /I/python/include spam.c ../libs/pythonXY.lib
cl /LD /I/python/include ni.c spam.lib ../libs/pythonXY.lib
The first command created three files: :file:`spam.obj`, :file:`spam.dll` and
:file:`spam.lib`. :file:`Spam.dll` does not contain any Python functions (such
as :c:func:`PyArg_ParseTuple`), but it does know how to find the Python code
thanks to :file:`pythonXY.lib`.
The second command created :file:`ni.dll` (and :file:`.obj` and :file:`.lib`),
which knows how to find the necessary functions from spam, and also from the
Python executable.
Not every identifier is exported to the lookup table. If you want any other
modules (including Python) to be able to see your identifiers, you have to say
``_declspec(dllexport)``, as in ``void _declspec(dllexport) initspam(void)`` or
``PyObject _declspec(dllexport) *NiGetSpamData(void)``.
Developer Studio will throw in a lot of import libraries that you do not really
need, adding about 100K to your executable. To get rid of them, use the Project
Settings dialog, Link tab, to specify *ignore default libraries*. Add the
correct :file:`msvcrtxx.lib` to the list of libraries.