Introduction to Async and Parallel Programming in .NET 4

Welcome to this review of the Pluralsight course Introduction to Async and Parallel Programming in .NET 4 by Dr Joe Hummel.

Joe has a PhD in the field of high-performance computing and has been specializing in Microsoft technologies since 1992.

He is well-versed in Microsoft’s High-Performance Computing initiative (HPC Server, Compute Cluster Server, MPI, MPI.NET, OpenMP, PFx), web technologies (ASP.NET and Ajax Extensions for ASP.NET), the desktop (WinForms), LINQ, .NET Framework, and its most popular languages (VC++, C#, F# and VB).

This course was originally released in early 2012 and does not cover more recent innovations such as async/await.

To learn about the latest asynchronous technologies you can either watch:

Getting Started with Asynchronous Programming in .NET by Filip Ekberg, or

Asynchronous C# 5.0 by Jon Skeet

Nevertheless, this course still covers many fundamental concepts of async and parallel programming that are not widely taught elsewhere.

.NET 4 arrived with a new programming model based on Tasks. What is a Task and how do we use them for asynchronous programming?

Tasks and Task based programming

Motivation: Responsiveness and Performance

Joe begins by explaining two main topics of this course. Each has a separate motivation:

Async Programming

Responsiveness: hide latency of potentially long-running or blocking operations (e.g. I/O) by starting operations in the background

Parallel programming

Performance: reduce time of CPU bound operations by dividing workload and executing tasks simultaneously

.NET 4.0 introduced the Task Parallel Library.

Why do we need this approach when we already have:

  • Threads
  • Asynchronous Programming Model (APM)
  • Event-based Asynchronous Pattern (EAP) (e.g. BackgroundWorker)
  • QueueUserWorkItem

Joe explains that Microsoft’s async model has evolved over time.

.NET 4.0 was an evolutionary step forward and provided benefits including:

  • Cancellation
  • Easier exception handling
  • Higher-level constructs

What’s a task?

Joe gives a couple of definitions for a Task:

Simply: a unit of work

More precisely: an object denoting an ongoing operation or computation

Creating a task

We are walked through the following code:

using System.Threading.Tasks;

Task T = new Task(code); //computation to perform

T.Start(); //tells .NET that task “can” start, then returns immediately

Joe says we need to really train our minds to imagine what happens next.

The code splits, or “forks”, and two code streams now execute concurrently. The code that was originally executing continues to run, and the Task executes as the other fork.
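To make the fork concrete, here is a minimal console sketch of my own (not Joe's demo code). The class name ForkSketch and the thread-ID bookkeeping are illustrative assumptions; only new Task, Start and Wait come from the TPL.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ForkSketch
{
    // Starts a task, keeps working on the calling thread, then joins.
    // Returns { caller thread id, thread id that ran the task body }.
    public static int[] Run()
    {
        int taskThreadId = -1;

        // The "code" from the walkthrough: the computation the task performs.
        Task t = new Task(() =>
        {
            taskThreadId = Thread.CurrentThread.ManagedThreadId;
        });

        t.Start();   // returns immediately; the task may or may not have begun yet

        // The original code stream carries on here while the task runs.
        int callerThreadId = Thread.CurrentThread.ManagedThreadId;

        t.Wait();    // join: block until the forked stream has finished
        return new[] { callerThreadId, taskThreadId };
    }
}
```

Calling ForkSketch.Run() usually reports two different thread IDs, but that is not guaranteed: if the task has not started by the time Wait is called, .NET is allowed to run it on the waiting thread.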

Execution model

The basic concept is code-based tasks are executed by a thread running on some processor in the machine.

A thread is dedicated to a task until the task completes.

To see this from two different perspectives, we are shown illustrations of a computer with a single core, and a server blade with 8 cores.

Single Core machine

On a single core machine, multiple threads will end up sharing that single core.

Joe says in many cases running a main thread and a worker thread on the same core makes perfect sense.

If the main thread has very little work to do (e.g. it’s just monitoring the UI) that allows lots of CPU cycles for worker threads even on a single core.

That is the idea behind asynchronous programming.

Multi-core machine

Here the main thread runs on one core, and each worker thread runs on another core.

So each thread can be running in parallel.

Task completion

Joe explains that the code block exits either naturally or by throwing an exception. When this happens the task is complete.
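The two ways a task can finish are easy to observe from the Task object. The following is a small sketch under my own assumptions (the class and method names are hypothetical); it relies only on Task.Status and on Wait rethrowing failures wrapped in an AggregateException.

```csharp
using System;
using System.Threading.Tasks;

public static class CompletionSketch
{
    // Final status of a task whose code block exits naturally.
    public static TaskStatus NormalExit()
    {
        Task t = Task.Factory.StartNew(() => { /* falls off the end */ });
        t.Wait();
        return t.Status;
    }

    // Final status of a task whose code block throws.
    public static TaskStatus ExceptionExit()
    {
        Action failing = () => { throw new InvalidOperationException("simulated failure"); };
        Task t = Task.Factory.StartNew(failing);

        try
        {
            t.Wait();   // rethrows the failure wrapped in an AggregateException
        }
        catch (AggregateException)
        {
            // Observed here so the exception does not go unhandled.
        }
        return t.Status;
    }
}
```

NormalExit() should come back as RanToCompletion and ExceptionExit() as Faulted; either way the task counts as complete.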

Demo 1

We see a demonstration of asynchronous programming for responsiveness.

This is an Asian Options finance modeling application and we start off with a synchronous version of the program.

It takes 4.4 seconds to calculate the price.

The problem is we cannot get any other work done while this calculation is being done. For example we cannot move the window position until the calculation completes.

Adding Tasks

Joe explains that even on a single core machine, adding Task based programming will improve responsiveness.

Inside the Task Joe creates a lambda expression. If you’ve never used lambda expressions before you might want to see Deborah Kurata’s Practical LINQ course, although lambda expressions are also covered later in this module.

The whole code block is put inside the lambda, so that all of that code will run on a separate thread.

Finally we add T.Start() to kick off our new task.

Our app is now responsive, but then it crashes!

We see the message “AsianOptions has stopped working. Windows is checking for a solution to the problem…”

Yes Windows, good luck finding that one!

Why did it crash?

Joe coded it this way intentionally so that we can see some of the ramifications of doing asynchronous and parallel programming.

In .NET the only thread that’s allowed to update the user interface is the thread that created it. This is known as the UI thread.

First solution attempt

When we start a task we can tell .NET which context we want the task to run in:

T.Start(TaskScheduler.FromCurrentSynchronizationContext()); //UI thread

The app no longer crashes but it is no longer responsive. So we’re back to square one!

Correct solution

We need to split the one task into two tasks:

  1. Computation task – for running the simulation
  2. UI update task 

We see how to split out the code into two blocks, and we use the TPL ContinueWith method:

Task T2 = T.ContinueWith((antecedent) => { //code block
}, TaskScheduler.FromCurrentSynchronizationContext() );
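The two-task split can be tried outside WPF too. This is a console sketch of my own, so there is no UI thread and I leave out the scheduler argument; the Math.Sqrt call is just a stand-in for the simulation, and ContinuationSketch is a hypothetical name. It also passes the value through the antecedent's Result rather than a shared variable, which is a variation on what the demo does.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinuationSketch
{
    public static string Run()
    {
        // Task 1: the computation (a stand-in for the simulation).
        Task<double> compute = Task.Factory.StartNew(() => Math.Sqrt(16.0));

        // Task 2: runs only after the antecedent has finished.
        // In a WPF app, TaskScheduler.FromCurrentSynchronizationContext()
        // would be passed as a second argument so this runs on the UI thread.
        Task<string> display = compute.ContinueWith(
            antecedent => "price = " + antecedent.Result);

        return display.Result;   // blocks until the continuation is done
    }
}
```

ContinuationSketch.Run() returns the string built by the continuation, which can only be produced once the computation task has a result.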

Demo 1 Summary

Run the simulation as a separate task
Update UI in context of UI thread

Creating tasks more efficiently

This code is equivalent, but slightly more efficient because it’s one call instead of two:

Task T = Task.Factory.StartNew( code );

Demo 2: programming for performance

Now that our app is responsive, how can we make it perform better?

We see that Joe’s PC has four cores, but only one of them is being used properly.

Joe just removes the code that disables the button after it is clicked, and the line that re-enables it at the end of the computation.

We can now click the button multiple times and we see that it gets run on a different core the second time.

Shared variables

There’s a slight problem to fix. We want to keep a count of how many tasks are running and only stop the spinning UI control when there are no tasks left.

Joe adds a label to the app and increments the counter each time the button is clicked. When the task completes we decrement the counter. We update the label and stop the spinner when the value is 0.

Our app is now both responsive and performant.

Verifying correctness

We should always think about shared resources when doing asynchronous and parallel programming.

This could be files, collections, or just a single scalar variable.

Joe demonstrates that in this particular program, we do not have a problem and the program is correct.

Language support: lambda expressions

TPL takes advantage of:

Lambda expressions – these are unnamed blocks of code. They make it easier to create tasks

The => arrow is the key syntactic element identifying a lambda expression

Implementation of lambdas

lambda expression == custom class + delegate

Joe shows us an example of a lambda expression as a parameter to Task.Factory.StartNew, and the IL code that gets generated from it.

The lambda expression is turned into a class that contains a method with some arbitrary name.

Joe says a delegate is really nothing more than an object, which typically points to two things:

  1. the method to be called (it’s like a function pointer)
  2. an instance of the class
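You can inspect both halves of a delegate yourself. Here is a short sketch, assuming a made-up Greeter class (not from the course); Method and Target are the standard properties on every delegate.

```csharp
using System;

public class Greeter
{
    private readonly string name;
    public Greeter(string name) { this.name = name; }
    public string Greet() { return "Hello, " + name; }
}

public static class DelegateSketch
{
    public static string Run()
    {
        var greeter = new Greeter("Joe");
        Func<string> d = greeter.Greet;   // delegate bound to a method plus an instance

        // d.Method identifies the method to call; d.Target is the instance.
        bool sameInstance = object.ReferenceEquals(d.Target, greeter);
        return d.Method.Name + ":" + sameInstance + ":" + d();
    }
}
```

DelegateSketch.Run() reports the method name, whether Target is the very object we created, and the result of invoking the delegate.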

For more information on Delegates see Dan Wahlin’s course C# Events, Delegates and Lambdas

Language support: closures

Closures == code + supporting data environment

The compiler computes the closure in response to a lambda expression.

It analyzes the lambda expression and closes over the needed variables, and has to figure out a way to pass those variables to that block of code.

Joe says this is also an incredibly convenient form of parameter passing.

We can create our lambda expression and just wrap it with curly braces, and the compiler does the hard work of figuring out what parameters need to be passed into that code to make it compile and run.

Closures: Pass by reference!

How are closure variables passed? By reference!

Think pointers. There’s no copying going on. We have a pointer to a shared memory location.

This means closure variables become shared variables.

Beware if those variables are read and written!

Shared variables can lead to race conditions.

We can easily create the situation where we unknowingly have shared variables.
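A tiny experiment shows the sharing. This is my own sketch with hypothetical names; it demonstrates that a lambda sees later writes to a captured variable, and that a task writing the variable changes what the caller sees.

```csharp
using System;
using System.Threading.Tasks;

public static class ClosureSharingSketch
{
    public static int[] Run()
    {
        int counter = 1;

        // The lambda closes over 'counter' itself, not a copy of its value.
        Func<int> read = () => counter;

        counter = 2;                   // written after the lambda was created
        int seenByLambda = read();     // observes the later write

        // A task writing the same variable: both code streams share it.
        Task t = Task.Factory.StartNew(() => { counter = 10; });
        t.Wait();                      // without this wait, the read below would race

        return new[] { seenByLambda, counter };
    }
}
```

The lambda sees 2, not 1, and after the task finishes the caller's own variable holds 10. Remove the Wait and the second value depends on timing.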

Implementation of closures

We previously learned that closure variables are passed by reference. That’s a good way to think about it conceptually.

But technically, that’s not accurate. Closure variables are actually stored within a compiler-generated class.

For every closure variable identified by the compiler, there’s a field that’s put into the same compiler-generated class to represent the shared variable.

Joe illustrates this better than I can explain it here using the C# code and the IL code side by side and drawing lines from the C# to the relevant IL.
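As a rough approximation of what the compiler produces, here is a hand-written equivalent. The names ClosureClass and Body are hypothetical (the real generated names are different and not valid C# identifiers), and this is a sketch of the idea rather than the exact generated code.

```csharp
using System;
using System.Threading.Tasks;

// Hand-written stand-in for the compiler-generated class.
public class ClosureClass
{
    public int result;        // one field per closed-over variable

    public void Body()        // the lambda's code becomes a method
    {
        result = 6 * 7;
    }
}

public static class ClosureLoweringSketch
{
    public static int Run()
    {
        // Roughly what this source turns into:
        //   int result = 0;
        //   Task t = Task.Factory.StartNew(() => { result = 6 * 7; });
        var closure = new ClosureClass();
        closure.result = 0;

        Task t = Task.Factory.StartNew(new Action(closure.Body));
        t.Wait();

        return closure.result;   // the "local" now lives on a heap object
    }
}
```

Because the variable is really a field on a shared object, the task and the caller are reading and writing the same memory, which is why thinking "by reference" works as a mental model.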

Demo 3: reverse-engineer implementation

There’s one variable that’s shared by both tasks in our app, and being read and written.

That is the result variable. It is being written by Task T, and then read and displayed by Task T2.

We only know that we are safe because we know that T2 never runs until T has finished.

Joe uses Redgate Reflector to view the IL code. He opens MainWindow – the code for our WPF window – and sees a class called DisplayClass2.

There’s one field for every variable that’s closed over.

Code vs. Facade tasks

Code tasks are thready – they have explicit code and require a thread to execute.

Another type of task is a facade over existing operations (e.g. asynchronous I/O)

The code to execute a facade is not explicitly provided, but implied elsewhere.

These tasks can be threadless – for example the hardware could be performing the operation.

To put a facade over an existing operation we create a new TaskCompletionSource and then grab the Task property of it.
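Here is a minimal sketch of a facade task, under my own assumptions: a one-shot timer stands in for the external operation, and the value 123 is arbitrary. TaskCompletionSource<TResult>, its Task property and TrySetResult are the real TPL pieces.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class FacadeSketch
{
    public static Task<int> StartOperation()
    {
        var op = new TaskCompletionSource<int>();

        // Stand-in for an external operation (hypothetical). This constructor
        // passes the timer itself as 'state' and does not start it yet.
        var timer = new Timer(state =>
        {
            op.TrySetResult(123);        // completes the facade task
            ((Timer)state).Dispose();
        });
        timer.Change(50, Timeout.Infinite);   // fire once after 50 ms

        // No code was handed to this task; it just reflects the operation.
        return op.Task;
    }
}
```

Callers treat the returned task like any other: FacadeSketch.StartOperation().Result blocks until the timer fires and then yields the value.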


Tasks are the new model for async & parallel programming

Tasks denote a unit of work, an ongoing computation

Your job as a developer is to create tasks
.NET’s job is to execute tasks as efficiently as possible

UI updates must be executed in the context of the UI thread
Closures yield the possibility of shared variables

Working with tasks: creating, waiting and harvesting results

Technologies in .NET 4

Async/Parallel components of .NET 4:

Concurrent Data Structures, Parallel LINQ, etc.
Task Parallel Library (TPL)
Task Scheduler
Resource Manager

Review – what’s a task?

Task == object representing an ongoing computation

Task object provides a means to check status, wait, harvest results, store exceptions, etc.

Code tasks: Task T1 = Task.Factory.StartNew(() => { /*code*/ });
T1 will contain an object reference to the underlying task object

Facade tasks:
var op = new TaskCompletionSource<T>();
Task T2 = op.Task;

Create a facade task to provide a common object model, the “task”, for interfacing with both asynchronous and parallel operations.

Demo 1: Stock History app

The min, max and average calculations were originally sequential.

Code tasks:

Task t_min = Task.Factory.StartNew(() =>
{
    decimal min = data.Prices.Min();
});

Task t_max = Task.Factory.StartNew(() =>
{
    decimal max = data.Prices.Max();
});

Task t_avg = Task.Factory.StartNew(() =>
{
    decimal avg = data.Prices.Average();
});

Problem: we can’t output the min, max and average until the tasks have finished.

Facade tasks:
GetDataFromInternet uses IAsyncResult and WaitHandle, which are pre-.NET 4 technologies.
We can wrap these with the new Task-based API. At the call site, replace:

iars[0] = GetDataFromXXXAsync(…)

with:

Task t_XXX = GetDataFromXXXAsync(…)

and add:
Task.WaitAny(new Task[]{ t_yahoo, t_nasdaq, t_msn });

Change the method signature from:

private static IAsyncResult GetDataFromXXXAsync(..)

to:

private static Task GetDataFromXXXAsync(..)

and add:
TaskCompletionSource<T> tc = new TaskCompletionSource<T>(iar); //generic in .NET 4; T is the result type
return tc.Task;
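The same wrapping idea can be tried on any Begin/End pair. The sketch below is my own, not the Stock History code: it uses Stream.BeginRead and EndRead on a MemoryStream as a stand-in for the web requests, and the names ApmFacadeSketch and ReadAsTask are hypothetical.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class ApmFacadeSketch
{
    // Wraps the Begin/End (IAsyncResult) pair of Stream in a Task<int>.
    public static Task<int> ReadAsTask(Stream stream, byte[] buffer)
    {
        var tc = new TaskCompletionSource<int>();

        stream.BeginRead(buffer, 0, buffer.Length, iar =>
        {
            try
            {
                tc.SetResult(stream.EndRead(iar));   // number of bytes read
            }
            catch (Exception ex)
            {
                tc.SetException(ex);                 // surface failures via the task
            }
        }, null);

        return tc.Task;
    }

    public static int Run()
    {
        var stream = new MemoryStream(new byte[] { 1, 2, 3, 4, 5 });
        var buffer = new byte[16];
        return ReadAsTask(stream, buffer).Result;    // blocks until the read completes
    }
}
```

The TPL also ships Task.Factory.FromAsync for this pattern; writing it by hand with TaskCompletionSource just makes the mechanics visible.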

Review Complete, Agenda for rest of module

Waiting: Call .Wait on task object…
Task t = Task.Factory.StartNew( code );

t.Wait(); //returns immediately if already finished, or blocks the caller until finished
when finished, the task's Status is one of: RanToCompletion, Canceled or Faulted

Demo 2: Waiting for Tasks to finish

//declare at outer scope level (potentially dangerous)
decimal min, max, avg;
min = max = avg = 0;

Task t_min = Task.Factory.StartNew(() =>
{
    min = data.Prices.Min();
}); //etc

//if no multi-core hardware, tasks will queue up sequentially in the same way

We see a whole lot of zeros because we forgot to wait for the tasks to finish!!
t_xxx.Wait(); //etc

stderr is still zero because stderr requires stddev as an input, so we must wait for it!

Start without debugging: the results are slightly different!

This is one of the most common errors, a race condition: the timing of how things run changes under the debugger.
We missed the wait for avg in t_stddev.

alternative, within t_stddev:
decimal l_avg = data.Prices.Average();
//tradeoff: redundant computation, good thing if lots of cores, not if few or unknown number of cores
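The dependency between the two tasks looks roughly like this. It is my own sketch with assumed details: I use double rather than decimal so that Math.Sqrt applies, and I compute a population standard deviation, which may not match the formula in the course demo.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class DependentTasksSketch
{
    // Returns { average, standard deviation } for the given prices.
    public static double[] Run(double[] prices)
    {
        double avg = 0, stddev = 0;   // shared via closures, as in the demo

        Task t_avg = Task.Factory.StartNew(() =>
        {
            avg = prices.Average();
        });

        Task t_stddev = Task.Factory.StartNew(() =>
        {
            t_avg.Wait();   // without this, 'avg' may still be 0 here
            double sumSq = prices.Sum(p => (p - avg) * (p - avg));
            stddev = Math.Sqrt(sumSq / prices.Length);
        });

        t_stddev.Wait();    // implies t_avg has finished too
        return new[] { avg, stddev };
    }
}
```

Drop the t_avg.Wait() line and the result becomes timing dependent, which is exactly the kind of bug that shows up only when running without the debugger.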

Harvesting results

MUCH Better way:

Task<int> t = Task.Factory.StartNew( code );

int r = t.Result; //implicit call to .Wait
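Applied to min, max and average, harvesting might look like the sketch below. The prices array parameter is my stand-in for data.Prices and HarvestSketch is a hypothetical name; Task<TResult> and Result are the real API.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class HarvestSketch
{
    // Returns { min, max, average } for the given prices.
    public static decimal[] Run(decimal[] prices)
    {
        // Each lambda returns a value, so StartNew yields a Task<decimal>.
        Task<decimal> t_min = Task.Factory.StartNew(() => prices.Min());
        Task<decimal> t_max = Task.Factory.StartNew(() => prices.Max());
        Task<decimal> t_avg = Task.Factory.StartNew(() => prices.Average());

        // .Result blocks until the task has finished, then hands back the value.
        return new[] { t_min.Result, t_max.Result, t_avg.Result };
    }
}
```

There are no outer-scope variables for the tasks to write to, so there is nothing to forget to wait for: asking for the result is the wait.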

Demo 3: Harvesting Task Results

remove wait calls
replace min with t_min.Result etc.

Waiting on multiple tasks

Task[] tasks = { t1,t2,t3};
Task.WaitAll( tasks); //wait for ALL to finish
int index = Task.WaitAny( tasks);
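To see the difference between the two calls, here is a small sketch of my own in which one task is deliberately held back by a ManualResetEventSlim. The names and the gate mechanism are assumptions made for the example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WaitSketch
{
    // Returns { index reported by WaitAny, number of completed tasks after WaitAll }.
    public static int[] Run()
    {
        var gate = new ManualResetEventSlim(false);

        // 'slow' cannot finish until the gate is opened.
        Task<string> slow = Task.Factory.StartNew(() => { gate.Wait(); return "slow"; });
        Task<string> fast = Task.Factory.StartNew(() => "fast");

        Task[] tasks = { slow, fast };

        int first = Task.WaitAny(tasks);   // index of a task that has finished
        gate.Set();                        // now let the slow task proceed
        Task.WaitAll(tasks);               // returns once every task is done

        int completed = 0;
        foreach (Task t in tasks) { if (t.IsCompleted) completed++; }
        return new[] { first, completed };
    }
}
```

WaitAny returns the array index of a finished task (here the fast one, at index 1), so you can start processing it while the others are still running.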
