Ahh, threads: those little things that seem so simple to understand and to use, but are so hard to use correctly. The less you have used threads, the simpler they appear, but speak to any experienced thread programmer, and they can regale you with tales of all-night sessions tracking down threading-related bugs that were nearly impossible to reproduce.
So why an appendix instead of a chapter? Indeed, the question should be why an appendix instead of a complete book! Threads are deceptively complex, usually due to the synchronization required between the different threads to coordinate their activity. These considerations aren't specific to Windows, and there are plenty of excellent references on threaded programming that are independent of operating system and programming language.
We make no attempt to cover the basics of thread programming, and we certainly won't present any complex or real-world threaded examples. Instead, this appendix restricts itself to the Windows-specific threading issues that may affect you when using Python on the Windows platform.
Python supports threads through a number of built-in modules. For many threading-related tasks, you are likely to find these modules more than adequate: your code is portable across all platforms that support threading, and you have access to the copious documentation and examples available.
The most general-purpose of these modules is the threading module, which provides an interface modeled after Java's threading support. This module provides a Thread class to manage your threads and also a number of synchronization objects necessary in any nontrivial threading application.
Here's a trivial example using the threading module. Let's create a new thread as a subclass of the threading.Thread class and override the run() method that implements the thread. Each thread loops five times, printing a message each time. It should be noted that subclassing the Thread class is just one way to implement this thread; it's also possible to implement the new thread in a normal function.
The main thread creates three worker threads, then waits for them all to complete, using the join() method provided by the base class:
# SimpleThreads.py
#
# Trivial example of using the Python threading module.
import threading
import time
import random

class SimpleThread(threading.Thread):
    def run(self):
        for i in range(5):
            print self, i
            time.sleep(random.random())

if __name__=='__main__':
    threads = []
    for i in range(3):
        thread = SimpleThread()
        thread.start()
        threads.append(thread)
    # Now we wait for them to finish.
    for thread in threads:
        thread.join()
    print "All threads finished"
Running this script should produce output similar to:
<SimpleThread(Thread-8, started)> 0
<SimpleThread(Thread-9, started)> 0
<SimpleThread(Thread-10, started)> 0
<SimpleThread(Thread-8, started)> 1
<SimpleThread(Thread-10, started)> 1
<SimpleThread(Thread-8, started)> 2
<SimpleThread(Thread-9, started)> 1
<SimpleThread(Thread-9, started)> 2
<SimpleThread(Thread-10, started)> 2
<SimpleThread(Thread-8, started)> 3
<SimpleThread(Thread-10, started)> 3
<SimpleThread(Thread-9, started)> 3
<SimpleThread(Thread-8, started)> 4
<SimpleThread(Thread-10, started)> 4
<SimpleThread(Thread-9, started)> 4
All threads finished
As you can see, implementing, starting, and waiting for completion of threads is quite simple. Please see the Python documentation for more information on the threading module, including the various synchronization objects supported by the module.
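As mentioned earlier, a thread can also be implemented in a normal function rather than as a Thread subclass. The following is a minimal sketch of that style, which also uses a threading.Lock, one of the module's synchronization objects, to guard a shared list. The names worker and run_workers are illustrative only and are not part of the sample above; the print call is written with a single formatted argument so that it behaves the same whether or not your Python treats print as a function.

```python
import threading

def worker(name, count, results, lock):
    # Hypothetical thread body: record `count` items in the shared list.
    for i in range(count):
        # Hold the lock only for the duration of the shared update.
        with lock:
            results.append((name, i))

def run_workers(num_threads=3, count=5):
    results = []             # shared between all worker threads
    lock = threading.Lock()  # guards every access to `results`
    threads = []
    for n in range(num_threads):
        # Pass the function as `target` instead of overriding run().
        t = threading.Thread(target=worker,
                             args=("worker-%d" % n, count, results, lock))
        t.start()
        threads.append(t)
    # Wait for every worker to finish before reading the results.
    for t in threads:
        t.join()
    return results

if __name__ == '__main__':
    print("Recorded %d items" % len(run_workers()))
```

Passing the list and the lock as arguments, rather than using module-level globals, keeps each call to run_workers() independent of the others.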
As Python runs on many operating systems, the Python thread support is limited to a reasonable subset of what a platform can be expected to provide in the way of threading. Windows provides a number of additional features that relate to threads and synchronization, and we discuss some of these here.
The win32process module provides access to the beginthreadex() function provided by the Microsoft Visual C++ runtime library. This function allows you to specify a function as the thread, as well as some custom Win32 settings for the thread.
There are only a small number of situations when it's necessary to use this function in preference to the standard threading module. The first is when you need access to the Win32-specific features, such as the security for the object or flags that indicate the thread should be created in a suspended mode. Another common situation is when the main thread requires the Win32 HANDLE of the new thread; this is not easy using the other Python threading modules (where only the new thread itself has easy access to this information).
Each application has different synchronization requirements. Some programs may need to wait for threads to complete, while some threads may need to wait for a file operation to complete or mutexes to become available. To cater to these various requirements, Windows bases all its synchronization primitives around Windows HANDLEs. When you wish to wait for something of significance, you usually pass a handle. For example, you can wait for a thread or process to complete by specifying its handle; you can wait for a file operation to complete by waiting on the handle in the OVERLAPPED object you specified. You can wait for a mutex, semaphore, event, or other object by passing the handle you obtained when opening or creating the object. Thus, regardless of the type of object or event you are waiting for, you always use handles and can use the same Win32 functions.
There are three functions exposed by win32event that wait for Win32 objects: WaitForSingleObject(), WaitForMultipleObjects(), and MsgWaitForMultipleObjects(). Each of these functions allows you to wait for one or more handles to become signaled, but exactly what signaled means depends on the object. For example, a signaled synchronization object typically means you have acquired the object, a signaled thread or process handle means the thread or process has terminated, and so forth.
Here are the three functions.
WaitForSingleObject()
As the name implies, this function allows you to wait for a single object to become signaled. It takes two parameters: the handle to the object you wish to wait for, and a timeout in milliseconds (or win32event.INFINITE for no timeout). The return value from the function is win32event.WAIT_OBJECT_0 if the object becomes signaled, win32event.WAIT_TIMEOUT if the timeout interval expired, or win32event.WAIT_ABANDONED in certain situations for mutexes (see the Win32 documentation).
WaitForMultipleObjects()
Allows you to wait for one or all of a number of objects. The first parameter is a sequence (e.g., list or tuple) of handles, while the second is a boolean flag indicating if you wish the function to return when all objects are signaled (true) or when any one of the objects becomes signaled (false). The third parameter is a timeout interval, as for WaitForSingleObject(). The return code from this function is similar to WaitForSingleObject(), except the result may range from win32event.WAIT_OBJECT_0 to win32event.WAIT_OBJECT_0 + len(handles)-1. If you indicate you wish to wait for only one of the objects, this tells which object became signaled. Chapter 18, Windows NT Services, contains examples of using WaitForMultipleObjects() to wait for either a service control request or a client connection and demonstrates how to decode the return values.
MsgWaitForMultipleObjects()
Almost identical to WaitForMultipleObjects() but also allows you to detect that a Windows message is ready to be processed by the thread. This information is particularly relevant for both GUI programs that make use of threading, and objects that use apartment-threaded COM objects, as described later in this appendix. Please see Appendix C, The Python Database API Version 2.0, for a description of these functions or the final sample in this appendix for an example of this function's usage.
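The return codes described above can be decoded with a few comparisons. The following is a minimal sketch under stated assumptions: decode_wait_result() is a hypothetical helper, not part of win32event, and the three constants are copied from the Win32 headers so that the sketch runs without the Win32 extensions installed. In real code you would normally use the constants of the same names from the win32event module.

```python
# Numeric values as defined by the Win32 headers (assumed here so the
# sketch is self-contained); win32event exposes the same names.
WAIT_OBJECT_0 = 0x00
WAIT_ABANDONED_0 = 0x80
WAIT_TIMEOUT = 0x102

def decode_wait_result(rc, num_handles, msg_wait=False):
    """Return a (kind, index) tuple describing a wait return code.

    kind is "signaled", "abandoned", "message", "timeout" or "unknown";
    index is the position in the handle list, or None if not applicable.
    Pass msg_wait=True for codes from MsgWaitForMultipleObjects().
    """
    if WAIT_OBJECT_0 <= rc < WAIT_OBJECT_0 + num_handles:
        return ("signaled", rc - WAIT_OBJECT_0)
    if msg_wait and rc == WAIT_OBJECT_0 + num_handles:
        # One past the last handle: a Windows message is waiting.
        return ("message", None)
    if WAIT_ABANDONED_0 <= rc < WAIT_ABANDONED_0 + num_handles:
        return ("abandoned", rc - WAIT_ABANDONED_0)
    if rc == WAIT_TIMEOUT:
        return ("timeout", None)
    return ("unknown", None)
```

For example, with three handles, a return code of WAIT_OBJECT_0 + 2 identifies the third handle in the list, while WAIT_OBJECT_0 + 3 from MsgWaitForMultipleObjects() means a message is ready to be processed.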
For various, mainly historic, reasons, COM has the concept of threading models. Most often it's an implementation detail of little importance, so it can safely be skipped by the casual COM user. A full discussion of COM's threading models is beyond the scope of this book (and seemingly beyond the scope of the COM documentation!); however, some detailed information about this esoteric part of COM may help explain odd behavior you may encounter.
Each object lives in what COM terms a threading apartment, of which there are two types, free-threaded and single-threaded. A process can have zero or one free-threaded apartments and any number of single-threaded apartments (one for each thread with a single-threaded object).
The apartment is nothing more than a conceptual framework invented by COM to explain the rules and other nuances of using threading with COM. An apartment is a grouping of objects by their threading characteristics. Before a thread can use COM, it must indicate its threading model (that is, if a new single-threaded apartment should be created, or if this thread should live in the free-threaded apartment). The apartment an object lives in is determined either by the implementation of the object or the thread that created the object, as we discuss later.
The point of the COM threading models is to allow simple objects that aren't written with threads in mind to be used by another object that is thread-aware. If an object is written with the assumption that concurrent access to the object isn't possible, then using such an object from multiple threads is likely to be disastrous. Therefore, threads that reside in the same apartment can make unrestricted use of all objects in that apartment, but whenever threads from different apartments (that is, two threads that are not both in the free-threaded apartment) need to use an object, COM steps in. COM uses what is known as a proxy to automatically synchronize the threads so the object is correctly called from a thread in that object's apartment. COM also imposes rules to allow this mechanism to work.
The obvious question to arise from this is "How do I control the apartment for my threads or objects?" There is no simple answer.
Fortunately, the rules for threads are quite simple. Before a thread can use COM, it must call one of the CoInitialize() or CoInitializeEx() functions, and when it's done with COM, it must call CoUninitialize(); these functions are exposed to Python by the pythoncom module. CoInitialize() predates the COM threading models, so it initializes a new single-threaded apartment for the thread. CoInitializeEx() takes an additional parameter that allows you to specify the threading model; thus, you must use this function to have your thread in the free-threaded apartment. The first single-threaded apartment created (that is, the first thread that calls either CoInitialize() or CoInitializeEx() with the COINIT_APARTMENTTHREADED flag) is given special significance as we discuss later, and is known as the main single-threaded apartment.
To hide some of this complexity, Python calls CoInitializeEx() automatically as soon as the pythoncom module is imported, and this is significant for the following reasons:
• The threading apartment for the first Python thread that imports the pythoncom module is controlled by this automatic process. By default, this thread is initialized in a single-threaded apartment, but this can be controlled by adding a coinit_flags attribute to the sys module before importing pythoncom (see the final sample in this appendix). If this attribute exists, it's passed unchanged to the CoInitializeEx() function by the PythonCOM framework. For example, you could execute the following code to ensure the main thread is initialized in the multithreaded apartment:
import sys
sys.coinit_flags = 0 # pythoncom.COINIT_MULTITHREADED = 0
import pythoncom
• As the default behavior is to initialize a single-threaded apartment, this Python thread may also become the main single-threaded apartment, as discussed previously. The implications for the main single-threaded apartment are discussed later in this appendix.
• Only this main thread has CoInitializeEx() called automatically.* Any other threads you create need to call pythoncom.CoInitializeEx() explicitly before using COM and pythoncom.CoUninitialize() when complete.
The rules for which apartment an object lives in are slightly more complex. If the COM object in question is implemented in any way other than an InProc DLL (for example, a LocalServer or RemoteServer EXE-based object), the question becomes moot, as the object is running in a separate process and therefore cannot be in the same apartment. For DLL-implemented objects, the apartment is determined both by the apartment of the thread creating the object and the threading models actually supported by the object.
When an InProc object is registered, part of the information written to the registry is the threading models supported by the object. This can be either Apartment, to indicate the object must live in a single-threaded apartment, Free to indicate the object must live in the multithreaded apartment, or Both if the object supports either technique. As discussed in Chapter 12, Advanced Python and COM, this is controlled for Python objects via the _reg_threading_ attribute, with the default for Python objects being Both.
If the thread creating the object and the object itself have compatible threading models, the object is created in the thread's apartment. If the object is an old COM object (indicated by the lack of threading information in the object's registration information), the object may be created in the main single-threaded apartment. If a multithreaded apartment needs to create a single-threaded object, COM automatically creates a new single-threaded apartment for the new object.
* CoUninitialize() isn't called for the main Python thread automatically, as doing so can often cause more problems than it solves. This function can still be called manually from the main thread.
For all this complicated machinery to work, there are a number of rules COM imposes on programs that use COM.
The synchronization of calls between different threads is achieved using Windows messages. This means that every thread living in a single-threaded apartment must run a message loop to allow this mechanism to work. If the program is a GUI (such as PythonWin) this is no problem, but for most other applications, including Windows Services, this may not be an existing requirement. In practice, this means that if any of your threads in a single-threaded COM apartment need to wait on some synchronization object, you may need to use win32event.MsgWaitForMultipleObjects() (passing a one-element list if you are waiting on a single handle) so you can still process messages at the appropriate time. If you have no other message requirements, calling pythoncom.PumpWaitingMessages() processes all messages currently in the thread's queue. This technique is demonstrated in the example in the next section.
The other major rule imposed by the COM threading models is that it's illegal to pass COM interface pointers (and therefore the Python wrappers) directly between threads. As you may be passing the pointer from one apartment to a different apartment, you may be avoiding or violating the synchronization mechanisms (and other optimizations) provided by COM. To pass interface objects between threads, you must use the pythoncom.CoMarshalInterThreadInterfaceInStream() and pythoncom.CoGetInterfaceAndReleaseStream() functions. These functions are demonstrated next.
It's time to demonstrate some of these concepts. To do this, we develop three COM objects, each of which supports one of the various threading models discussed previously. These COM objects are quite simple and expose only two methods: GetCreatedThreadId() to return the thread ID of the thread that created the object, and GetCurrentThreadId() to return the thread ID of the current thread (that is, the thread receiving the call). If you have read Chapter 12, there will be nothing new in this example. The only points worth mentioning are that we use win32api.GetCurrentThreadId() to obtain the Win32 thread ID, and that we use a Python base class for the raw COM functionality, and subclasses for the object-specific registration information. The COM objects are implemented in ThreadingModelsServer.py:
# ThreadingModelsServer.py
# Python COM objects that demonstrate COM threading models.
#
# Exposes 3 Python objects, all of which have identical functionality,
# but each indicates it supports a different threading model.
import win32api

# A base class for our 3 trivial objects.
class ThreadDemoObject:
    _public_methods_ = [ 'GetCurrentThreadId', 'GetCreatedThreadId' ]
    def __init__(self):
        self.created_id = win32api.GetCurrentThreadId()
    def GetCreatedThreadId(self):
        return self.created_id
    def GetCurrentThreadId(self):
        # Simply return an integer with the Win32 thread ID.
        return win32api.GetCurrentThreadId()

class ThreadApartmentObject(ThreadDemoObject):
    _reg_threading_ = "Apartment" # Tell COM to synchronize
    _reg_progid_ = "PythonThreadDemo.Apartment"
    _reg_clsid_ = "{511BB541-4625-11D3-855B-204C4F4F5020}"

class ThreadFreeObject(ThreadDemoObject):
    _reg_threading_ = "Free"
    _reg_progid_ = "PythonThreadDemo.Free"
    _reg_clsid_ = "{511BB542-4625-11D3-855B-204C4F4F5020}"

class ThreadBothObject(ThreadDemoObject):
    _reg_threading_ = "Both"
    _reg_progid_ = "PythonThreadDemo.Both"
    _reg_clsid_ = "{511BB543-4625-11D3-855B-204C4F4F5020}"

if __name__=='__main__':
    import win32com.server.register
    win32com.server.register.UseCommandLine(
        ThreadApartmentObject,
        ThreadFreeObject,
        ThreadBothObject)
Before moving to the client sample code, these objects must be registered in the normal way.
The code that uses these COM objects is considerably more complex because it's here that the COM object, and the threads that use it, are created. The general intent of the code is to create the single-threaded object we defined and then create three threads that use this object. The code confirms that so long as you follow the COM rules, COM ensures that regardless of the thread actually calling the object, the object will see the call on its single thread (i.e., the thread that created it). You then execute the same code but create the free-threaded version of the object, and observe the differences.
Before launching into the code, there are some points to discuss:
• The main thread needs to wait for the subthreads to complete, but as you will be running single-threaded objects in the apartment, you need to process Windows messages. Therefore, use the win32process.beginthreadex() function to create the thread, so that you can use the thread handles with win32event.MsgWaitForMultipleObjects().
• All the threads exist in separate single-threaded apartments: the main thread because we haven't overridden the default Python initialization by setting sys.coinit_flags, and each worker thread because each calls pythoncom.CoInitialize() rather than pythoncom.CoInitializeEx(). Because all the threads are in different apartments, you must use the CoMarshalInterThreadInterfaceInStream() and CoGetInterfaceAndReleaseStream() functions to transfer the COM object between threads.
• MsgWaitForMultipleObjects() has a quirk that usually prevents effective use of the bWaitAll parameter. If set to true, the function waits until all objects have been signaled and input is also available. Generally, you need to know when either all objects are signaled or input is available. You can avoid this restriction by setting bWaitAll to false and, each time a thread completes, removing its handle from the list before waiting again.
• The main body of the sample code accepts the name of the COM object as a parameter. This lets you run the same code with both the single-threaded and free-threaded versions of the COM object.
The code is presented in SingleThreadedApartment.py:
# SingleThreadedApartment.py
# Demonstrate the use of multiple threads, each in their own
# single-threaded apartment.
# As we do not set sys.coinit_flags=0
# before the pythoncom import, Python
# initializes the main thread for single threading.
from pythoncom import \
    CoInitialize, CoUninitialize, IID_IDispatch, \
    CoMarshalInterThreadInterfaceInStream, \
    CoGetInterfaceAndReleaseStream, \
    PumpWaitingMessages
from win32event import \
    MsgWaitForMultipleObjects, \
    QS_ALLINPUT, WAIT_TIMEOUT, WAIT_OBJECT_0
from win32com.client import Dispatch
from win32process import beginthreadex
from win32api import GetCurrentThreadId

def Demo( prog_id ):
    # First create the object
    object = Dispatch(prog_id)
    print "Thread", GetCurrentThreadId(), "creating object"
    created_id = object.GetCreatedThreadId()
    print "Object reports it was created on thread", created_id
    # Now create the threads, remembering the handles.
    handles = []
    for i in range(3):
        # As we are not allowed to pass the object directly between
        # apartments, we need to marshal it.
        object_stream = CoMarshalInterThreadInterfaceInStream(
            IID_IDispatch, object )
        # Build an argument tuple for the thread.
        args = (object_stream,)
        handle, id = beginthreadex(None, 0, WorkerThread, args, 0)
        handles.append(handle)
    # Now we have all the threads running, wait for them to terminate.
    # Also remember how many times we are asked to pump messages.
    num_pumps = 0
    while handles:
        # A quirk in MsgWaitForMultipleObjects means we must wait
        # for each event one at a time
        rc = MsgWaitForMultipleObjects(handles, 0, 5000, QS_ALLINPUT)
        if rc >= WAIT_OBJECT_0 and rc < WAIT_OBJECT_0+len(handles):
            # A thread finished - remove its handle.
            del handles[rc-WAIT_OBJECT_0]
        elif rc==WAIT_OBJECT_0 + len(handles):
            # Waiting message
            num_pumps = num_pumps + 1
            PumpWaitingMessages()
        else:
            print "Nothing seems to be happening",
            print "but I will keep waiting anyway..."
    print "Pumped messages", num_pumps, "times"
    print "Demo of", prog_id, "finished."

def WorkerThread(object_stream):
    # First step - initialize COM
    CoInitialize() # Single-threaded.
    # Unmarshal the IDispatch object.
    object = CoGetInterfaceAndReleaseStream(
        object_stream, IID_IDispatch)
    # The object we get back is a PyIDispatch, rather
    # than a friendly Dispatch instance, so we convert
    # to a usable object.
    object = Dispatch(object)
    this_id = GetCurrentThreadId()
    that_id = object.GetCurrentThreadId()
    message = "Thread is %d, and object is on thread %d" % \
        (this_id, that_id)
    print message
    # Be a good citizen and finalize COM, but
    # first remove our object references.
    object = None
    CoUninitialize()

if __name__=='__main__':
    print "Running with Apartment Threaded object"
    Demo("PythonThreadDemo.Apartment")
    print "Running with Free Threaded object"
    Demo("PythonThreadDemo.Free")
You should run this code from a command prompt rather than PythonWin or IDLE, just to ensure that the threading doesn't interfere with these applications. When run, the output from this script should be similar to:
Running with Apartment Threaded object
Thread 355 creating object
Object reports it was created on thread 355
Thread is 354, and object is on thread 355
Thread is 314, and object is on thread 355
Thread is 306, and object is on thread 355
Pumped messages 9 times
Demo of PythonThreadDemo.Apartment finished.
Running with Free Threaded object
Thread 355 creating object
Object reports it was created on thread 318
Thread is 326, and object is on thread 433
Thread is 399, and object is on thread 433
Thread is 362, and object is on thread 318
Pumped messages 0 times
Demo of PythonThreadDemo.Free finished.
The first half of the output (up to "Demo of PythonThreadDemo.Apartment finished.") represents the single-threaded object, so let's examine that first. The main Python thread reports itself as thread 355, and the object itself was also created on thread 355, as expected. Each of the three threads that started to use the object does indeed get a unique thread ID, but regardless of the thread making the call, the object always sees the call on thread 355, the thread that created the object. You can also see that while this simple test was running, and the main thread was waiting for the threads to terminate, you were called upon nine times to process Windows messages.
The second half of the output (from "Running with Free Threaded object" onward) represents the same test code, but uses the free-threaded object. As you can see, the same main thread is creating the COM object, but this time the object reports it was created on thread 318. As the main thread is in a single-threaded apartment, and the COM object insists on the free-threaded apartment, COM has spun up a new thread to host the object. As each thread calls the object, the object itself isn't restricted to receiving the call on a single thread. Because the callers and the object are in different apartments, the calling thread and the receiving thread are still different, but the same single-threaded restrictions don't apply. Also notice that in this scenario, it's not strictly necessary to run a message pump, as there are no single-threaded COM objects being hosted on the main thread.
To help complete the picture, we now present a fully multithreaded example. It's almost identical to the one just presented, with the following changes:
• The main thread is forced into the free-threaded apartment by setting sys.coinit_flags to zero before importing pythoncom. Each worker thread is forced into the same apartment by calling pythoncom.CoInitializeEx() with the COINIT_MULTITHREADED flag.
• Since all the threads are in the free-threaded apartment, you can freely pass COM objects between threads and avoid those functions with the huge names!
• Since you don't need to process messages, replace the convoluted MsgWaitForMultipleObjects() call with a single WaitForMultipleObjects() call. Ideally this code would use the Python threading module, but we've kept the basic code layout so it's easy to compare the differences.
The code is presented in FreeThreadedApartment.py:
# FreeThreadedApartment.py
# Demonstrate the use of multiple threads all in the same
# multithreading apartment.
# Before the pythoncom import, we specify free-threading.
import sys
sys.coinit_flags=0
from pythoncom import \
    CoInitializeEx, CoUninitialize, \
    COINIT_MULTITHREADED
from win32event import \
    WaitForMultipleObjects, \
    WAIT_TIMEOUT
from win32com.client import Dispatch
from win32process import beginthreadex
from win32api import GetCurrentThreadId

def Demo( prog_id ):
    # First create the object
    object = Dispatch(prog_id)
    print "Thread", GetCurrentThreadId(), "creating object"
    created_id = object.GetCreatedThreadId()
    print "Object reports it was created on thread", created_id
    # Now create the threads, remembering the handles.
    handles = []
    for i in range(3):
        # Multi-threaded - just pass the objects directly to the thread.
        args = (object,)
        handle, id = beginthreadex(None, 0, WorkerThread, args, 0)
        handles.append(handle)
    # Now we have all the threads running, wait for them to terminate.
    # No need for message pump, so we can simply wait for all objects
    # in one call.
    rc = WaitForMultipleObjects(handles, 1, 5000)
    # WAIT_TIMEOUT is the code returned when the 5 seconds expire.
    if rc == WAIT_TIMEOUT:
        print "Gave up waiting for the threads to finish!"
    print "Demo of", prog_id, "finished."

def WorkerThread(object):
    # First step - initialize COM
    CoInitializeEx(COINIT_MULTITHREADED)
    this_id = GetCurrentThreadId()
    that_id = object.GetCurrentThreadId()
    message = "Thread is %d, and object is on thread %d" % \
        (this_id, that_id)
    print message
    # Be a good citizen and finalize COM, but
    # first remove our object references.
    object = None
    CoUninitialize()

if __name__=='__main__':
    print "Running Free threaded with Free Threaded object"
    Demo("PythonThreadDemo.Free")
When you run this script, the output should be similar to:
Running Free threaded with Free Threaded object
Thread 329 creating object
Object reports it was created on thread 329
Thread is 340, and object is on thread 340
Thread is 324, and object is on thread 324
Thread is 444, and object is on thread 444
Demo of PythonThreadDemo.Free finished.
This is exactly as expected: each call to the object is completely transparent, just like a regular function call, and always occurs on the thread that initiated the call. If you wish to get your hands even dirtier, you may wish to modify these examples to demonstrate every other possible combination of threads, objects, and threading apartments!
There are a number of technical articles and snippets from books available from Microsoft, from MSDN, or online at http://msdn.microsoft.com/. A good starting point for more information is Knowledge Base Article Q150777 (online at http://support.microsoft.com/support/kb/articles/q150/7/77.asp).
Windows has a fairly flexible threading interface when it comes to the windows and other parts of the user interface, but there are still a number of restrictions.
Almost any thread in the system can create a window, but only the thread that created the window can process messages for it. In addition, it's generally not a good idea to call window functions from threads other than the thread that created the window. Using the PostMessage functions is fine, but be careful using any function that either directly or indirectly causes a message to be sent to a window, bypassing the message queue. As this is a Win32 restriction rather than a Python one, the restriction applies whether you use PythonWin, Tkinter, wxPython or some other GUI framework on Win32.
Microsoft provides some excellent articles on threading considerations when using windows from the Win32 API, which you should review for further information.
This appendix has discussed some of the major threading issues you will encounter using Python on Win32. We have made absolutely no attempt to explain either thread programming in general, or the many threading and synchronization functions available on the Win32 platforms. We simply tried to explain the Python-specific issues when using these functions.
For further information on Python's standard threading capabilities, please see the Python documentation optionally installed with the Python binaries. For more information on the Win32 threading and synchronization capabilities, please see the Microsoft Win32 documentation.