Execute code in multiple threads (even with SharePoint)

Since SharePoint 2010 uses .NET 3.5, you can not use the fancy new functions from .NET 4 🙂

So if we need e.g. multi-threaded execution of code, we’ll need to write the code ourselves. But, as you can see, this really isn’t so hard. The basic idea behind this solution of executing code parallel in threads, is that you have an IEnumerable of some kind. This can be a List, or any other IEnumerable.

So let us for example take a list of Guids, which are the IDs of all SPWebs in a SiteCollection. Then we are iterating each web, and write the itemCount of all lists to the Console.

class ParallelExecutionTest
{
   private static int _overallItemCount;
   private static readonly object Lock = new object();

   public static void AddItemCount(int itemCount)
   {
      lock (Lock)
      {
         // only let one thread write to the setter 
         _overallItemCount += itemCount;
      }
   }

   public static void CountListitemsInAllWebs(string siteUrl)
   {
      using (var site = new SPSite(siteUrl))
      {
         // perform the method/action on any web in the sitecollection 
         site.AllWebs.Select(w => w.ID).EachParallel(webId =>
         {
            CountListitems(site.ID, webId);
         }, Environment.ProcessorCount);
         Console.WriteLine("Overall Itemcount: " + _overallItemCount);
      }
   }

   private static void CountListitems(Guid siteId, Guid webId)
   {
      // use new instances for each web 
      using (var site = new SPSite(siteId))
      using (var web = site.OpenWeb(webId))
      {
         var itemCount = web.Lists.Cast<SPList>().Sum(list => list.ItemCount);
         Console.WriteLine("Web {0} has {1} items in all lists.", web.Title, itemCount);
         AddItemCount(itemCount);
      }
   }
}

That doesn’t look too complicated, does it? The little method EachParallel is all it takes for running the code in multiple threads. You have to decide if your code can run parallel, and if makes sense!

Note: Remember that SharePoint will most likely not work, if you access the same objects in multiple threads. So to be safe, create new instances of SharePoint objects in each Thread!

The sample above will create as much threads, as your system has CPUs. On my notebook with i7 and HyperThreading in 8 threads. And here comes the point to remember. Think carefully about the pitfalls on running your code parallel. Here are some drawbacks, compared to the sequentiell execution:

  • Overhead for creating new SharePoint objects (calls to the SQL server)
  • Additional load on the SQL server by querying more data simultaneously (think about a 4 processor server board with x cores and HyperThreading)
  • Possibly more load on the local SharePoint server by writing logfiles
  • Exception handling. With sequential code you can abort. Multiple threads keep running

Enough for now. Lets look at the Extension method which makes all this possible.

public static class Extensions
{
   /// <summary> 
   /// Enumerates through each item and start the action in a new thread 
   /// </summary> 
   /// <typeparam name="T"></typeparam> 
   /// <param name="enumerable"></param> 
   /// <param name="action"></param> 
   /// <param name="maxHandles">e.g. Environment.ProcessorCount</param> 
   public static void EachParallel<T>(this IEnumerable<T> enumerable, Action<T> action, int maxHandles)
   {
      // enumerate the passed IEnumerable so it can't change during execution 
      var itemArray = enumerable.ToArray();
      var count = itemArray.Length;

      if (count == ) return;
      if (count == 1)
      {
         // if there's only one element, just execute 
         action(itemArray.First());
      }
      else
      {
         // maxHandles must not be greatet than the count of actions, or nothing will be done 
         if (maxHandles > count) maxHandles = count;
         var resetEvents = new ManualResetEvent[maxHandles];

         for (var offset = ; offset <= count / maxHandles; offset++)
         {
            EachAction(action, maxHandles, itemArray, offset, resetEvents);
            // Wait for all threads to execute 
            WaitHandle.WaitAll(resetEvents);
         }
      }
   }

   private static void EachAction<T>(Action<T> action, int maxHandles, IEnumerable<T> itemArray, int offset, ManualResetEvent[] resetEvents)
   {
      int i = ;
      foreach (var item in itemArray.Skip(offset * maxHandles).Take(maxHandles))
      {
         resetEvents[i] = new ManualResetEvent(false);

         ThreadPool.QueueUserWorkItem(data =>
         {
            var index = (int)((object[])data)[];
            try
            {
               // Execute the method and pass in the enumerated item 
               action((T)((object[])data)[1]);
            }
            catch (Exception ex)
            {
               // Exception handling 
               Console.WriteLine(ex.ToString());
            }

            // Tell the calling thread that we're done 
            resetEvents[index].Set();
         }, new object[] { i, item });
         i++;
      }
   }
}

All items in the IEnumerable are iterated. If there is free slot, the action will be executed in a new thread. There is no guarantee, that the code is executed in the same order, as the items in your IEnumerable. Here is an examples of IDs in an array, and the execution order:

Order in List Execution order 0 3 1 6 2 7 3 1 4 4 5 0 6 2 7 5

Summary: Depending on your code, and its requirements, multiple threads can be a good way to improve the speed of you code. It even can be a life-saver (thx Christopher!) for very long running operations. Take your time to think about it, before you implement the “little” change to your code to run in multiple threads!

One last word. I mentioned .NET 4 at the beginning. Here is a sample.

var ids = new List<int> { , 1, 2, 3, 4, 5 }; 
ids.AsParallel().ForAll(id => { Console.WriteLine("Id: " + id); });

Nice, ain’t it?