When is a Lock not a Lock?

Tags: ASCOM, software engineering, software-engineering, threading

 

Answer: When it’s in the same thread.

Locks are thread synchronization primitives, so if your code could somehow run twice in the same thread, then a lock will not protect you. Sounds impossible, right? How can your code be running twice in the same thread? Well, if you happen to be running inside a Single Threaded Apartment (STA) thread, then it is quite likely to happen.

How come? The STA threading model is a COM concept that was designed to let developers assume that only one thread will ever touch their objects, and so ignore all the gnarly issues that come with multi-threaded programming. Seems like a nice idea, until you write programs in a .NET language, where you normally don’t have to think about single threaded apartments at all. Many .NET developers, myself included until quite recently, are oblivious to such arcane concepts. However, if your .NET code runs inside an application based on Windows Forms, for example, or is loaded as a COM object via COM Interop, as is the case for an ASCOM driver, then you can suddenly find yourself running in an STA thread.

STA threads are sometimes known as the ‘UI thread’ because user interfaces such as Windows Forms rely on COM, are inherently not thread-safe and insist on running in an STA thread. There can be more than one apartment though, and they are not always user interface threads. If you’ve ever written a Windows Forms application, take a look at your Main() method: it will be decorated with an [STAThread] attribute, meaning it must run in a single threaded apartment.

Now STA threads are single threaded, but they are not isolated from things like events, calls from other threads, asynchronous callbacks and so on. However, these things cannot call directly into the STA thread. All calls to an object in an STA thread must come from within the STA thread itself. The thread can’t simply be interrupted by another thread, event or async callback. The COM runtime normally takes care of all this behind the scenes, but in .NET applications where COM isn’t around to police the threading, things can get a bit messed up. You’ll experience this if you ever try to perform a cross-thread update of a Windows Forms control. If you are lucky, Visual Studio will catch the problem and complain loudly; if you are unlucky, you’ll have a crazy bug on your hands. In .NET, when a cross-thread call to an STA thread is needed, a SynchronizationContext object is used (sometimes behind the scenes) to post an item into the Windows message queue for the destination STA thread. The message contains all the information needed to call the right code on the STA thread, but there it sits in the queue, along with the mouse move and button click messages, until something processes it.

That thing is called a Message Pump. The heart of a WinForms application is a loop which pops messages off the message queue and dispatches (executes) them. Messages can result in your code being called, for example a button Click handler. Once the handler is complete, it returns to the message pump loop and the next message is processed. Everything that happens on the STA thread goes through this process. In a typical .NET Windows Forms app, the message pump is started by the Application.Run() call in your Main() method. When the message pump loop exits, your application exits too.
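To make the loop concrete, here is a toy model of a message pump. It is only a sketch: the real pump inside Application.Run() reads the Win32 message queue and waits when the queue is empty, whereas this version simply stops when it runs out of work. The ToyMessagePump and PumpDemo names, and their Post, Quit and Run members, are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

// Toy model of a message pump. NOT the real Windows Forms implementation.
public sealed class ToyMessagePump
{
    // Single-threaded toy: a real queue must accept posts from other threads.
    private readonly Queue<Action> _messages = new Queue<Action>();
    private bool _quit;

    // Queue a message; nothing runs until the pump dispatches it.
    public void Post(Action message) => _messages.Enqueue(message);

    // Ask the loop to exit once the current message has been handled.
    public void Quit() => _quit = true;

    // The pump loop: pop one message, dispatch it, repeat.
    public void Run()
    {
        while (!_quit && _messages.Count > 0)
        {
            Action message = _messages.Dequeue();
            message(); // the handler runs to completion before the next message
        }
    }
}

public static class PumpDemo
{
    public static List<string> RunDemo()
    {
        var log = new List<string>();
        var pump = new ToyMessagePump();
        pump.Post(() => log.Add("mouse move"));
        pump.Post(() => log.Add("button click handler"));
        pump.Post(() => { log.Add("close"); pump.Quit(); });
        pump.Post(() => log.Add("never dispatched")); // still queued when the loop exits
        pump.Run();
        return log;
    }

    public static void Main()
    {
        foreach (string entry in RunDemo()) Console.WriteLine(entry);
    }
}
```

The point to notice is that handlers run one at a time, in queue order, and a handler that never returns would stall everything queued behind it.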

One of the cast iron rules of STA threads is: thou shalt not block the thread. Doing so would be bad for several reasons, including freezing the user interface and the potential for creating deadlock. So what happens if we issue an instruction like ManualResetEvent.WaitOne() or Thread.Join(), or wait on a contended lock? In .NET applications, we expect those to be blocking calls, and in most circumstances they are! However, on an STA thread that would break the grand taboo of ‘thou shalt not block’, so when such a wait is encountered on an STA thread, the thread doesn’t actually stop dead. Most things that can stop the thread are implemented in such a way that they pump messages while your code is ‘waiting’. (Thread.Sleep() is a notable exception: it is documented as not pumping.) This can result in another part of your application being dispatched from the message queue, all while you think your thread is asleep. The illusion of blocking is strong, but in fact you cannot convincingly stop an STA thread without trying really hard to do so.

The implication of all this is that your code that is ‘waiting’ in WaitOne() or whatever can get called again by another part of the program (or perhaps even a client application). This re-entrancy cannot be blocked by a construct such as lock(), because the second call arrives on the same thread that already holds the lock. A .NET lock (Monitor) is re-entrant: the owning thread can acquire it again without waiting, so it only protects you from other threads.
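The effect can be reproduced without COM at all. The sketch below is a simulation, not real STA behaviour: the Pending queue stands in for the thread's message queue, and PumpingWait() stands in for a blocking call that pumps while it waits. All the names are hypothetical. Everything runs on one thread, and the lock never gets in the way.

```csharp
using System;
using System.Collections.Generic;

public static class ReentrancyDemo
{
    private static readonly object Sync = new object();

    // Stand-in for the STA thread's message queue (illustration only).
    private static readonly Queue<Action> Pending = new Queue<Action>();

    // Stand-in for a "blocking" call that pumps messages while it waits.
    private static void PumpingWait()
    {
        while (Pending.Count > 0)
        {
            Pending.Dequeue()(); // dispatch on the SAME thread
        }
    }

    // Imagine this writes a command to a device and then reads the reply.
    public static void SendCommand(string name, List<string> log)
    {
        lock (Sync) // re-entrant: the thread that owns the lock walks straight in
        {
            log.Add(name + ": write");
            PumpingWait(); // "waiting" for the reply
            log.Add(name + ": read");
        }
    }

    public static List<string> RunDemo()
    {
        var log = new List<string>();
        // A second request is already sitting in the queue...
        Pending.Enqueue(() => SendCommand("B", log));
        // ...when the first one starts and then "blocks".
        SendCommand("A", log);
        return log;
    }

    public static void Main()
    {
        foreach (string entry in RunDemo()) Console.WriteLine(entry);
        // Prints:
        // A: write
        // B: write
        // B: read
        // A: read
    }
}
```

Command B runs to completion in the middle of command A, even though both are wrapped in the same lock. On a real device that is exactly the jumbled command and response pattern you would expect to see.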

This can lead to very subtle and hard-to-diagnose issues. The particular issue I have run into is with ASCOM in-process drivers being loaded into VB6 applications, especially if one of the applications is a hub (POTH for example). The symptoms are that commands and responses sent and received over the serial port get jumbled up, despite careful thread synchronization.

So how do we fix this? Well, that will be the subject of another post, as we are still trying various techniques to see what works best. One interesting line of investigation is to use the Reactive Extensions for .NET, sometimes known as ‘LINQ to Events’, which offer a novel approach to handling sequences of data and solve a lot of concurrency issues into the bargain. Other solutions could include:

  • Ensuring that the COM registration declares the ThreadingModel as Free (MTA) rather than Apartment (STA).
  • Ensuring that all serial I/O is handed off to a worker thread that does not run within the STA thread.
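
As a rough idea of what the worker-thread option might look like, here is a minimal sketch. It is not a tested fix for the ASCOM problem, only an illustration of the shape. SerialWorker, SendAsync and the transact delegate are hypothetical names, and the delegate stands in for ‘write the command, then read the response’ on a real port. Each transaction is queued as a single unit and runs on one dedicated thread, so two transactions cannot interleave, whatever the calling thread gets up to.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: all device I/O happens on one dedicated worker thread.
public sealed class SerialWorker : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _thread;
    private readonly Func<string, string> _transact;

    // 'transact' is a stand-in for a complete write-then-read exchange.
    public SerialWorker(Func<string, string> transact)
    {
        _transact = transact;
        // A plain worker thread, never marked as STA, with no message pump.
        _thread = new Thread(() =>
        {
            foreach (Action work in _queue.GetConsumingEnumerable()) work();
        }) { IsBackground = true };
        _thread.Start();
    }

    // Queue one whole transaction; the task completes with the response.
    public Task<string> SendAsync(string command)
    {
        var tcs = new TaskCompletionSource<string>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        _queue.Add(() =>
        {
            try { tcs.SetResult(_transact(command)); }
            catch (Exception ex) { tcs.SetException(ex); }
        });
        return tcs.Task;
    }

    public void Dispose()
    {
        _queue.CompleteAdding(); // let the worker drain the queue and exit
        _thread.Join();
        _queue.Dispose();
    }
}

public static class WorkerDemo
{
    public static string[] RunDemo()
    {
        // Fake device: pretends each exchange takes a little while.
        using (var worker = new SerialWorker(cmd => { Thread.Sleep(10); return cmd + " -> OK"; }))
        {
            Task<string> first = worker.SendAsync("CMD1");
            Task<string> second = worker.SendAsync("CMD2");
            return new[] { first.Result, second.Result };
        }
    }

    public static void Main()
    {
        foreach (string response in RunDemo()) Console.WriteLine(response);
    }
}
```

Note that a caller on an STA thread that waits for the task may still pump messages and be re-entered. The difference is that a re-entrant call can only queue another complete transaction behind the first, rather than cutting into the middle of it.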

 

I will post again with some examples in code when we are confident that we have a cast iron solution.
