Info Node: (python2.1-api.info)Thread State and the Global Interpreter Lock

Thread State and the Global Interpreter Lock
============================================

The Python interpreter is not fully thread safe.  In order to support
multi-threaded Python programs, there's a global lock that must be held
by the current thread before it can safely access Python objects.
Without the lock, even the simplest operations could cause problems in
a multi-threaded program: for example, when two threads simultaneously
increment the reference count of the same object, the reference count
could end up being incremented only once instead of twice.
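
The lost update described above can be reproduced deterministically
without real threads.  The following is a minimal sketch, not
interpreter code: `ToyObject' and both function names are hypothetical,
and the two "threads" are interleaved by hand inside one function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an object that carries a reference count. */
typedef struct { int refcount; } ToyObject;

/* Two unprotected increments whose steps interleave: both "threads"
 * load the old value before either one stores its result. */
int interleaved_increments(ToyObject *obj)
{
    int seen_by_a, seen_by_b;

    assert(obj != NULL);
    seen_by_a = obj->refcount;      /* thread A loads the count      */
    seen_by_b = obj->refcount;      /* thread B loads the same value */
    obj->refcount = seen_by_a + 1;  /* thread A stores old + 1       */
    obj->refcount = seen_by_b + 1;  /* thread B overwrites A's store */
    return obj->refcount;
}

/* The same two increments when a lock forces each load-add-store
 * sequence to finish before the next one starts. */
int serialized_increments(ToyObject *obj)
{
    assert(obj != NULL);
    obj->refcount = obj->refcount + 1;  /* thread A, holding the lock    */
    obj->refcount = obj->refcount + 1;  /* thread B, after A released it */
    return obj->refcount;
}
```

Starting from a count of 1, the interleaved version ends at 2 while the
serialized version ends at 3, which is the difference the global lock
is there to prevent.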

Therefore, the rule exists that only the thread that has acquired the
global interpreter lock may operate on Python objects or call Python/C
API functions.  So that other threads get a chance to run, the
interpreter regularly releases and reacquires the lock -- by default,
every ten bytecode instructions (this can be changed with
`sys.setcheckinterval()').  The lock is also released and reacquired
around potentially blocking I/O operations such as reading or writing a
file, so that other threads can run while the thread that requested the
I/O waits for it to complete.

The Python interpreter needs to keep some bookkeeping information
separately for each thread -- for this it uses a data structure called
`PyThreadState'.  This is new in Python 1.5; in earlier versions, such
state was stored in global variables, and switching threads could cause
problems.  In particular, exception handling is now thread safe,
provided the application uses `sys.exc_info()' to access the exception
last raised in the current thread.

There's one global variable left, however: the pointer to the current
`PyThreadState' structure.  While most thread packages have a way to
store "per-thread global data," Python's internal platform-independent
thread abstraction doesn't support this yet.  Therefore, the current
thread state must be manipulated explicitly.

This is easy enough in most cases.  Most code manipulating the global
interpreter lock has the following simple structure:

     Save the thread state in a local variable.
     Release the interpreter lock.
     ...Do some blocking I/O operation...
     Reacquire the interpreter lock.
     Restore the thread state from the local variable.

This is so common that a pair of macros exists to simplify it:

     Py_BEGIN_ALLOW_THREADS
     /* ...Do some blocking I/O operation... */
     Py_END_ALLOW_THREADS

The `Py_BEGIN_ALLOW_THREADS' macro opens a new block and declares a
hidden local variable; the `Py_END_ALLOW_THREADS' macro closes the
block.  Another advantage of using these two macros is that when Python
is compiled without thread support, they are defined empty, which
avoids the thread state and lock manipulations altogether.

When thread support is enabled, the block above expands to the
following code:

         PyThreadState *_save;
     
         _save = PyEval_SaveThread();
         /* ...Do some blocking I/O operation... */
         PyEval_RestoreThread(_save);
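
One practical consequence of the macros opening and closing a block is
that any result computed inside must be stored in a variable declared
before the opening macro.  The sketch below illustrates this with toy
stand-ins rather than the real Python/C API: every `toy_' and `TOY_'
name is hypothetical, and the "lock" is only a flag, so nothing here
is actually concurrent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the thread state and the global lock. */
typedef struct { int id; } ToyThreadState;

static ToyThreadState toy_main_state = { 1 };
static ToyThreadState *toy_current = &toy_main_state;
static int toy_lock_held = 1;            /* 1 while the toy lock is held */
static int toy_lock_seen_during_io = -1; /* recorded by the fake I/O call */

/* Models the save step: fetch the pointer, then give up the lock. */
static ToyThreadState *toy_save_thread(void)
{
    ToyThreadState *saved = toy_current;
    assert(toy_lock_held);
    toy_current = NULL;
    toy_lock_held = 0;
    return saved;
}

/* Models the restore step: take the lock, then store the pointer. */
static void toy_restore_thread(ToyThreadState *saved)
{
    assert(!toy_lock_held && saved != NULL);
    toy_lock_held = 1;
    toy_current = saved;
}

/* Same shape as the documented expansions: a brace opens, a hidden
 * variable is declared, and the matching macro closes the brace. */
#define TOY_BEGIN_ALLOW_THREADS { ToyThreadState *_save; _save = toy_save_thread();
#define TOY_END_ALLOW_THREADS   toy_restore_thread(_save); }

/* Stand-in for a blocking call; notes whether the lock was held. */
static int toy_blocking_read(void)
{
    toy_lock_seen_during_io = toy_lock_held;
    return 42;
}

int read_without_lock(void)
{
    int result;  /* declared outside the block the macros create */

    TOY_BEGIN_ALLOW_THREADS
    result = toy_blocking_read();
    TOY_END_ALLOW_THREADS

    return result;
}
```

Had `result' been declared between the two macros, it would have gone
out of scope at the closing macro and could not be returned.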

Using even lower level primitives, we can get roughly the same effect
as follows:

         PyThreadState *_save;
     
         _save = PyThreadState_Swap(NULL);
         PyEval_ReleaseLock();
         /* ...Do some blocking I/O operation... */
         PyEval_AcquireLock();
         PyThreadState_Swap(_save);

There are some subtle differences; in particular,
`PyEval_RestoreThread()' saves and restores the value of the global
variable `errno', since the lock manipulation does not guarantee that
`errno' is left alone.  Also, when thread support is disabled,
`PyEval_SaveThread()' and `PyEval_RestoreThread()' don't manipulate
the lock; in this case, `PyEval_ReleaseLock()' and
`PyEval_AcquireLock()' are not available.  This is done so that
dynamically loaded extensions compiled with thread support enabled can
be loaded by an interpreter that was compiled with disabled thread
support.

The global interpreter lock is used to protect the pointer to the
current thread state.  When releasing the lock and saving the thread
state, the current thread state pointer must be retrieved before the
lock is released (since another thread could immediately acquire the
lock and store its own thread state in the global variable).
Conversely, when acquiring the lock and restoring the thread state, the
lock must be acquired before storing the thread state pointer.
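
The ordering rule can be illustrated with a small single-threaded
model.  This is only a sketch under stated assumptions: the names
below are hypothetical, and `toy_release_lock()' simulates the worst
case in which another thread grabs the lock the instant it is released
and installs its own state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thread states for two threads, A and B. */
typedef struct { const char *name; } ToyThreadState;

static ToyThreadState state_a = { "A" };
static ToyThreadState state_b = { "B" };

/* The "current thread state" pointer, protected by the toy lock. */
static ToyThreadState *toy_current = &state_a;

/* Releasing the lock lets thread B run at once: B acquires the lock
 * and stores its own thread state in the shared pointer. */
static void toy_release_lock(void)
{
    toy_current = &state_b;
}

/* Correct order: retrieve the pointer while the lock is still held. */
ToyThreadState *save_then_release(void)
{
    ToyThreadState *saved;

    assert(toy_current != NULL);
    saved = toy_current;
    toy_current = NULL;
    toy_release_lock();
    return saved;
}

/* Wrong order: by the time the pointer is read, it belongs to B. */
ToyThreadState *release_then_save(void)
{
    toy_release_lock();
    return toy_current;
}
```

With thread A current, `save_then_release()' hands back A's state,
whereas `release_then_save()' hands back B's, which A would later
restore by mistake.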

Why am I going on with so much detail about this?  Because when threads
are created from C, they don't have the global interpreter lock, nor is
there a thread state data structure for them.  Such threads must
bootstrap themselves into existence, by first creating a thread state
data structure, then acquiring the lock, and finally storing their
thread state pointer, before they can start using the Python/C API.
When they are done, they should reset the thread state pointer, release
the lock, and finally free their thread state data structure.
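
The order of these steps can be sketched with a toy model.  The code
below does not call the Python/C API; all `Toy' and `toy_' names are
hypothetical, the lock is a plain flag, and the comments name the
functions from this section that each step roughly corresponds to.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the interpreter and thread state. */
typedef struct { int interp_id; } ToyInterp;
typedef struct { ToyInterp *interp; } ToyThreadState;

static int toy_lock_held = 0;              /* lock is free at the start */
static ToyThreadState *toy_current = NULL; /* no current thread state   */
static int toy_api_calls = 0;

/* Stand-in for any API call: legal only with the lock held and a
 * current thread state installed. */
static int toy_api_call(void)
{
    assert(toy_lock_held && toy_current != NULL);
    toy_api_calls++;
    return toy_current->interp->interp_id;
}

/* Body of a thread created from C.  Returns the interpreter id seen
 * by the API call, or -1 if the thread state could not be allocated. */
int run_c_thread(ToyInterp *interp)
{
    int seen;
    ToyThreadState *ts;

    /* 1. Create a thread state (roughly PyThreadState_New()). */
    ts = malloc(sizeof *ts);
    if (ts == NULL)
        return -1;
    ts->interp = interp;

    /* 2 and 3. Acquire the lock, then store the thread state pointer
     * (roughly PyEval_AcquireThread()). */
    assert(!toy_lock_held);
    toy_lock_held = 1;
    toy_current = ts;

    seen = toy_api_call();  /* the thread may now use the API */

    /* 4 and 5. Reset the pointer, then release the lock (roughly
     * PyEval_ReleaseThread()).  With the real API,
     * PyThreadState_Clear() would be called before this point,
     * since it requires the lock. */
    toy_current = NULL;
    toy_lock_held = 0;

    /* 6. Free the thread state (roughly PyThreadState_Delete()). */
    free(ts);
    return seen;
}
```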

When creating a thread data structure, you need to provide an
interpreter state data structure.  The interpreter state data structure
holds global data that is shared by all threads in an interpreter, for
example the module administration (`sys.modules').  Depending on your
needs, you can either create a new interpreter state data structure, or
share the interpreter state data structure used by the Python main
thread (to access the latter, you must obtain the thread state and
access its `interp' member; this must be done by a thread that is
created by Python or by the main thread after Python is initialized).

`PyInterpreterState'
     This data structure represents the state shared by a number of
     cooperating threads.  Threads belonging to the same interpreter
     share their module administration and a few other internal items.
     There are no public members in this structure.

     Threads belonging to different interpreters initially share
     nothing, except process state like available memory, open file
     descriptors and such.  The global interpreter lock is also shared
     by all threads, regardless of to which interpreter they belong.

`PyThreadState'
     This data structure represents the state of a single thread.  The
     only public data member is `PyInterpreterState *interp', which
     points to this thread's interpreter state.

`void PyEval_InitThreads()'
     Initialize and acquire the global interpreter lock.  It should be
     called in the main thread before creating a second thread or
     engaging in any other thread operations such as
     `PyEval_ReleaseLock()' or `PyEval_ReleaseThread(TSTATE)'.  It is
     not needed before calling `PyEval_SaveThread()' or
     `PyEval_RestoreThread()'.

     This is a no-op when called for a second time.  It is safe to call
     this function before calling `Py_Initialize()'.

     When only the main thread exists, no lock operations are needed.
     This is a common situation (most Python programs do not use
     threads), and the lock operations slow the interpreter down a bit.
     Therefore, the lock is not created initially.  This situation is
     equivalent to having acquired the lock: when there is only a
     single thread, all object accesses are safe.  Therefore, when this
     function initializes the lock, it also acquires it.  Before the
     Python `thread' module creates a new thread, knowing that either
     it has the lock or the lock hasn't been created yet, it calls
     `PyEval_InitThreads()'.  When this call returns, it is guaranteed
     that the lock has been created and that it has acquired it.

     It is *not* safe to call this function when it is unknown which
     thread (if any) currently has the global interpreter lock.

     This function is not available when thread support is disabled at
     compile time.

`void PyEval_AcquireLock()'
     Acquire the global interpreter lock.  The lock must have been
     created earlier.  If this thread already has the lock, a deadlock
     ensues.  This function is not available when thread support is
     disabled at compile time.

`void PyEval_ReleaseLock()'
     Release the global interpreter lock.  The lock must have been
     created earlier.  This function is not available when thread
     support is disabled at compile time.

`void PyEval_AcquireThread(PyThreadState *tstate)'
     Acquire the global interpreter lock and then set the current thread
     state to TSTATE, which should not be `NULL'.  The lock must have
     been created earlier.  If this thread already has the lock,
     deadlock ensues.  This function is not available when thread
     support is disabled at compile time.

`void PyEval_ReleaseThread(PyThreadState *tstate)'
     Reset the current thread state to `NULL' and release the global
     interpreter lock.  The lock must have been created earlier and
     must be held by the current thread.  The TSTATE argument, which
     must not be `NULL', is only used to check that it represents the
     current thread state -- if it isn't, a fatal error is reported.
     This function is not available when thread support is disabled at
     compile time.

`PyThreadState* PyEval_SaveThread()'
     Release the interpreter lock (if it has been created and thread
     support is enabled) and reset the thread state to `NULL',
     returning the previous thread state (which is not `NULL').  If the
     lock has been created, the current thread must have acquired it.
     (This function is available even when thread support is disabled at
     compile time.)

`void PyEval_RestoreThread(PyThreadState *tstate)'
     Acquire the interpreter lock (if it has been created and thread
     support is enabled) and set the thread state to TSTATE, which must
     not be `NULL'.  If the lock has been created, the current thread
     must not have acquired it, otherwise deadlock ensues.  (This
     function is available even when thread support is disabled at
     compile time.)

The following macros are normally used without a trailing semicolon;
look for example usage in the Python source distribution.

`Py_BEGIN_ALLOW_THREADS'
     This macro expands to `{ PyThreadState *_save; _save =
     PyEval_SaveThread();'.  Note that it contains an opening brace; it
     must be matched with a following `Py_END_ALLOW_THREADS' macro.
     See above for further discussion of this macro.  It is a no-op
     when thread support is disabled at compile time.

`Py_END_ALLOW_THREADS'
     This macro expands to `PyEval_RestoreThread(_save); }'.  Note that
     it contains a closing brace; it must be matched with an earlier
     `Py_BEGIN_ALLOW_THREADS' macro.  See above for further discussion
     of this macro.  It is a no-op when thread support is disabled at
     compile time.

`Py_BLOCK_THREADS'
     This macro expands to `PyEval_RestoreThread(_save);': it is
     equivalent to `Py_END_ALLOW_THREADS' without the closing brace.
     It is a no-op when thread support is disabled at compile time.

`Py_UNBLOCK_THREADS'
     This macro expands to `_save = PyEval_SaveThread();': it is
     equivalent to `Py_BEGIN_ALLOW_THREADS' without the opening brace
     and variable declaration.  It is a no-op when thread support is
     disabled at compile time.
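
A typical use for the last two macros is to retake the lock briefly in
the middle of a region that otherwise runs without it, for instance to
record an error.  The following sketch models that with toy stand-ins,
not the real macros: all `toy_' and `TOY_' names are hypothetical and
the lock is only a flag.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int id; } ToyThreadState;

static ToyThreadState toy_main_state = { 1 };
static ToyThreadState *toy_current = &toy_main_state;
static int toy_lock_held = 1;
static int toy_errors_recorded = 0;

static ToyThreadState *toy_save_thread(void)
{
    ToyThreadState *saved = toy_current;
    assert(toy_lock_held);
    toy_current = NULL;
    toy_lock_held = 0;
    return saved;
}

static void toy_restore_thread(ToyThreadState *saved)
{
    assert(!toy_lock_held && saved != NULL);
    toy_lock_held = 1;
    toy_current = saved;
}

/* Toy macros with the same shape as the documented expansions. */
#define TOY_BEGIN_ALLOW_THREADS { ToyThreadState *_save; _save = toy_save_thread();
#define TOY_END_ALLOW_THREADS   toy_restore_thread(_save); }
#define TOY_BLOCK_THREADS       toy_restore_thread(_save);
#define TOY_UNBLOCK_THREADS     _save = toy_save_thread();

/* Stand-in for touching interpreter data; needs the lock. */
static void toy_record_error(void)
{
    assert(toy_lock_held && toy_current != NULL);
    toy_errors_recorded++;
}

/* Scan the items without the lock, retaking it only long enough to
 * record each negative item.  Returns the number of errors recorded. */
int count_bad_items(const int *items, int n)
{
    int i;

    TOY_BEGIN_ALLOW_THREADS
    for (i = 0; i < n; i++) {
        if (items[i] < 0) {
            TOY_BLOCK_THREADS      /* reacquire the lock */
            toy_record_error();
            TOY_UNBLOCK_THREADS    /* release it again   */
        }
    }
    TOY_END_ALLOW_THREADS

    return toy_errors_recorded;
}
```

Both inner macros rely on the hidden `_save' variable, so they are only
usable between a matching begin and end pair.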

All of the following functions are only available when thread support
is enabled at compile time, and must be called only when the
interpreter lock has been created.

`PyInterpreterState* PyInterpreterState_New()'
     Create a new interpreter state object.  The interpreter lock need
     not be held, but may be held if it is necessary to serialize calls
     to this function.

`void PyInterpreterState_Clear(PyInterpreterState *interp)'
     Reset all information in an interpreter state object.  The
     interpreter lock must be held.

`void PyInterpreterState_Delete(PyInterpreterState *interp)'
     Destroy an interpreter state object.  The interpreter lock need
     not be held.  The interpreter state must have been reset with a
     previous call to `PyInterpreterState_Clear()'.

`PyThreadState* PyThreadState_New(PyInterpreterState *interp)'
     Create a new thread state object belonging to the given interpreter
     object.  The interpreter lock need not be held, but may be held if
     it is necessary to serialize calls to this function.

`void PyThreadState_Clear(PyThreadState *tstate)'
     Reset all information in a thread state object.  The interpreter
     lock must be held.

`void PyThreadState_Delete(PyThreadState *tstate)'
     Destroy a thread state object.  The interpreter lock need not be
     held.  The thread state must have been reset with a previous call
     to `PyThreadState_Clear()'.

`PyThreadState* PyThreadState_Get()'
     Return the current thread state.  The interpreter lock must be
     held.  When the current thread state is `NULL', this issues a fatal
     error (so that the caller needn't check for `NULL').

`PyThreadState* PyThreadState_Swap(PyThreadState *tstate)'
     Swap the current thread state with the thread state given by the
     argument TSTATE, which may be `NULL'.  The interpreter lock must
     be held.

`PyObject* PyThreadState_GetDict()'
     Return a dictionary in which extensions can store thread-specific
     state information.  Each extension should use a unique key to
     store its state in the dictionary.  If this function returns
     `NULL', an exception has been raised and the caller should allow
     it to propagate.

