<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss'><id>tag:blogger.com,1999:blog-29331675</id><updated>2010-03-09T07:51:22.027+01:00</updated><title type='text'>The Delphi Geek</title><subtitle type='html'>random ramblings on Delphi, programming, Delphi programming, and all the rest</subtitle><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/blogger.html'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default?start-index=26&amp;max-results=25'/><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://17slon.com/blogs/gabr/atom.xml'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>176</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-29331675.post-565081378571836025</id><published>2010-03-08T09:49:00.001+01:00</published><updated>2010-03-08T09:49:16.626+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='OmniThreadLibrary'/><title type='text'>OmniThreadLibrary 1.05a</title><content type='html'>&lt;p&gt;OmniThreadLibrary 1.05a has just been released. 
It is available via   &lt;br /&gt;&lt;a href="http://omnithreadlibrary.googlecode.com/svn/tags/release-1.05a" target="_blank"&gt;SVN&lt;/a&gt; or as a &lt;a href="http://code.google.com/p/omnithreadlibrary/downloads/list" target="_blank"&gt;ZIP archive&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;This is mostly a bugfix release:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;Bug fixed: TOmniTaskControl.OnMessage(eventHandler: TOmniTaskMessageEvent) was broken.&lt;/li&gt;    &lt;li&gt;Bug fixed: TOmniTaskControl.OnMessage/OnTerminate uses event monitor&amp;#160; created in the context of the task controller thread (was using a global event monitor created in the main thread). &lt;/li&gt;    &lt;li&gt;Implemented TOmniEventMonitorPool, per-thread TOmniEventMonitor&amp;#160; allocator.&lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;Upgrade is recommended for all 1.05 users.&lt;/p&gt;  &lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-565081378571836025?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/565081378571836025/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2010/03/omnithreadlibrary-105a.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/565081378571836025'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/565081378571836025'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2010/03/omnithreadlibrary-105a.html' title='OmniThreadLibrary 
1.05a'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-5601362077707105561</id><published>2010-03-05T18:11:00.001+01:00</published><updated>2010-03-05T18:11:45.244+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='source code'/><category scheme='http://www.blogger.com/atom/ns#' term='The Delphi Magazine'/><title type='text'>TDM Rerun #15: Many Faces Of An Application</title><content type='html'>&lt;blockquote&gt;   &lt;p&gt;&lt;em&gt;That all sounds easy, but how can we combine the windows (forms-based) aspect of an application with something completely different, for example an SvCom-based service application? The problem here is that the GUI part of an application uses forms while the SvCom service is based on another Application object, based on the SvCom_NTService unit. How can we combine the GUI Application.Initialize (where Application is an object in the Forms unit) with a service Application.Initialize (where Application is an object in the SvCom_NTService unit)? By fully qualifying each object, of course.&lt;/em&gt;&lt;/p&gt;    &lt;p&gt;&lt;em&gt;- Many Faces Of An Application, The Delphi Magazine 107, July 2004&lt;/em&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;In the 2004 July issue I described an approach that allows the programmer to put multiple application front-ends inside one .exe file by manually tweaking the project’s .dpr file. This is the technique I’m still using in my programs. 
For example, most of the services I write can be configured by starting the exe with the &lt;em&gt;/config &lt;/em&gt;switch.&lt;/p&gt;  &lt;p&gt;Links: &lt;a title="TDM 107: Many Faces Of An Application [article]" href="http://17slon.com/blogs/gabr/TDM/tdm107-gp.pdf" target="_blank"&gt;article&lt;/a&gt; (PDF, 126 KB), &lt;a title="TDM 107: Many Faces Of An Application [source code]" href="http://17slon.com/blogs/gabr/TDM/tdm107-gp.zip" target="_blank"&gt;source code&lt;/a&gt; (ZIP, 1 MB).&lt;/p&gt;  &lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-5601362077707105561?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/5601362077707105561/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2010/03/tdm-rerun-15-many-faces-of-application.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/5601362077707105561'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/5601362077707105561'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2010/03/tdm-rerun-15-many-faces-of-application.html' title='TDM Rerun #15: Many Faces Of An Application'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total 
xmlns:thr='http://purl.org/syndication/thread/1.0'>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-5991488281661501944</id><published>2010-02-26T18:51:00.001+01:00</published><updated>2010-02-26T18:51:44.456+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='not programming'/><title type='text'>On satisfaction</title><content type='html'>&lt;p&gt;It is a great feeling when an elegant piece of code &lt;a href="http://17slon.com/blogs/gabr/2010/01/parallelfor.html" target="_blank"&gt;comes together&lt;/a&gt;. Even if it can’t be compiled yet.&lt;/p&gt;  &lt;pre class="pas-source"&gt;  Parallel.ForEach(nodeQueue &lt;span class="pas-kwd"&gt;as&lt;/span&gt; IOmniValueEnumerable)&lt;br /&gt;    .NumTasks(numTasks)&lt;br /&gt;    .CancelWith(cancelToken)&lt;br /&gt;    .Execute(&lt;br /&gt;      &lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; (&lt;span class="pas-kwd"&gt;const&lt;/span&gt; elem: TOmniValue)&lt;br /&gt;      &lt;span class="pas-kwd"&gt;var&lt;/span&gt;&lt;br /&gt;        childNode: TNode;&lt;br /&gt;        node     : TNode;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;        node := TNode(elem.AsObject);&lt;br /&gt;        &lt;span class="pas-kwd"&gt;if&lt;/span&gt; node.Value = value &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;          nodeResult := node;&lt;br /&gt;          nodeQueue.CompleteAdding;&lt;br /&gt;          cancelToken.Signal;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;end&lt;/span&gt;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;else&lt;/span&gt; &lt;span class="pas-kwd"&gt;for&lt;/span&gt; childNode &lt;span class="pas-kwd"&gt;in&lt;/span&gt; node.Children &lt;span class="pas-kwd"&gt;do&lt;/span&gt;&lt;br /&gt;          nodeQueue.TryAdd(childNode);&lt;br /&gt;      &lt;span class="pas-kwd"&gt;end&lt;/span&gt;);&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;It is even a better feeling when a code 
that seems to be &lt;a href="http://17slon.com/blogs/gabr/2010/02/dynamic-lock-free-queue-doing-it-right.html" target="_blank"&gt;impossible to write&lt;/a&gt; starts to work.&lt;/p&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;And the best one – that happens when the code is working so well that you are not afraid of &lt;a href="http://17slon.com/blogs/gabr/2010/02/omnithreadlibrary-105.html" target="_blank"&gt;releasing it to the public&lt;/a&gt;.&lt;/p&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;Well, make this almost the best. Because there’s something even better – when people call back to tell you that they like using the code and it is helping them to do their work faster and better.&lt;/p&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;The feeling that cannot be surpassed comes when such a happy user says something like: “Thanks for the code, it helps me a lot. There’s an Amazon gift certificate, spend it as you like.”&lt;/p&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;&lt;a href="http://17slon.com/blogs/gabr/files/Onsatisfaction_6FD2/01.png" target="_blank"&gt;&lt;img style="border-right-width: 0px; display: block; float: none; border-top-width: 0px; border-bottom-width: 0px; margin-left: auto; border-left-width: 0px; margin-right: auto" title="01" border="0" alt="01" src="http://17slon.com/blogs/gabr/files/Onsatisfaction_6FD2/01_thumb.jpg" width="364" height="364" /&gt;&lt;/a&gt;&lt;/p&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;I can only respond with: “Rico, thanks!”. 
&lt;a href="http://otl.17slon.com" target="_blank"&gt;OmniThreadLibrary&lt;/a&gt; 1.05 is dedicated to you.&lt;/p&gt;  &lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-5991488281661501944?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/5991488281661501944/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/on-satisfaction.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/5991488281661501944'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/5991488281661501944'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/on-satisfaction.html' title='On satisfaction'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-8321930019599413454</id><published>2010-02-25T20:27:00.001+01:00</published><updated>2010-02-25T20:31:48.962+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='OmniThreadLibrary'/><category scheme='http://www.blogger.com/atom/ns#' term='multithreading'/><category 
scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>OmniThreadLibrary 1.05</title><content type='html'>&lt;p&gt;As there were no error reports related to &lt;a href="http://otl.17slon.com" target="_blank"&gt;OmniThreadLibrary&lt;/a&gt; 1.05 RC, I’ve released the final 1.05 version just a few moments ago. There are almost no changes between the RC and final release – one demo was added and the Parallel.Join code was tweaked a little.&lt;/p&gt;  &lt;p&gt;You can download OTL 1.05 from &lt;a href="http://code.google.com/p/omnithreadlibrary/downloads/list"&gt;Google Code&lt;/a&gt;. Alternatively, you can update SVN trunk (&lt;a href="http://code.google.com/p/omnithreadlibrary/source/checkout"&gt;checkout instructions&lt;/a&gt;) or check out the &lt;a href="http://omnithreadlibrary.googlecode.com/svn/tags/release-1.05"&gt;release-1.05 tag&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;Support is available on the &lt;a href="http://otl.17slon.com/forum/"&gt;web discussion forum&lt;/a&gt;.&lt;/p&gt;  &lt;h2&gt;Big rename&lt;/h2&gt;  &lt;p&gt;Many internal classes and interfaces were renamed. This should not affect most users.&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;TOmniBaseStack –&amp;gt; TOmniBaseBoundedStack &lt;/li&gt;    &lt;li&gt;TOmniStack –&amp;gt; TOmniBoundedStack &lt;/li&gt;    &lt;li&gt;TOmniBaseQueue –&amp;gt; TOmniBaseBoundedQueue &lt;/li&gt;    &lt;li&gt;TOmniQueue –&amp;gt; TOmniBoundedQueue &lt;/li&gt;    &lt;li&gt;IInterfaceDictionary –&amp;gt; IOmniInterfaceDictionary &lt;/li&gt;    &lt;li&gt;IInterfaceDictionaryEnumerator –&amp;gt; IOmniInterfaceDictionaryEnumerator &lt;/li&gt;    &lt;li&gt;TInterfaceDictionaryPair –&amp;gt; TOmniInterfaceDictionaryPair &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;I’m sorry for that. 
Some names were badly chosen and some did not follow the OTL naming conventions.&lt;/p&gt;  &lt;h2&gt;Dynamic lock-free queue&lt;/h2&gt;  &lt;p&gt;Implemented a dynamically allocated, thread-safe, lock-free queue with O(1) enqueue and dequeue. Class TOmniBaseQueue contains the base implementation while TOmniQueue adds observer support. Both classes live in the &lt;a href="http://code.google.com/p/omnithreadlibrary/source/browse/trunk/OtlContainers.pas" target="_blank"&gt;OtlContainers&lt;/a&gt; unit.&lt;/p&gt;  &lt;p&gt;Read more about the TOmniQueue: &lt;a href="http://17slon.com/blogs/gabr/2010/02/dynamic-lock-free-queue-doing-it-right.html" target="_blank"&gt;Dynamic lock-free queue – doing it right&lt;/a&gt;.&lt;/p&gt;  &lt;h2&gt;Inverse semaphore&lt;/h2&gt;  &lt;p&gt;Implemented TOmniResourceCount, a resource counter with empty-state signalling (unit &lt;a href="http://code.google.com/p/omnithreadlibrary/source/browse/trunk/OtlSync.pas" target="_blank"&gt;OtlSync&lt;/a&gt;).&lt;/p&gt;  &lt;p&gt;Read more: &lt;a href="http://17slon.com/blogs/gabr/2010/01/three-steps-to-blocking-collection-1.html" target="_blank"&gt;Three steps to the blocking collection: [1] Inverse semaphore&lt;/a&gt;.&lt;/p&gt;  &lt;h2&gt;Blocking collection&lt;/h2&gt;  &lt;p&gt;New unit &lt;a href="http://code.google.com/p/omnithreadlibrary/source/browse/trunk/OtlCollections.pas" target="_blank"&gt;OtlCollections&lt;/a&gt; contains the blocking collection implementation TOmniBlockingCollection.&lt;/p&gt;  &lt;p&gt;Read more: &lt;a href="http://17slon.com/blogs/gabr/2010/02/three-steps-to-blocking-collection-3.html" target="_blank"&gt;Three steps to the blocking collection: [3] Blocking collection&lt;/a&gt;.&lt;/p&gt;  &lt;h2&gt;Parallel&lt;/h2&gt;  &lt;p&gt;New high-level parallelism support (unit &lt;a href="http://code.google.com/p/omnithreadlibrary/source/browse/trunk/OtlParallel.pas" target="_blank"&gt;OtlParallel&lt;/a&gt;). 
Requires at least Delphi 2009.&lt;/p&gt;  &lt;p&gt;Two parallel control structures are supported: &lt;em&gt;for each&lt;/em&gt; (with optional aggregator) and &lt;em&gt;join&lt;/em&gt;.&lt;/p&gt;  &lt;p&gt;The demo for Parallel.ForEach can be found in project 35_ParallelFor. The same code is reprinted near the end of the &lt;a href="http://17slon.com/blogs/gabr/2010/02/three-steps-to-blocking-collection-3.html" target="_blank"&gt;Three steps to the blocking collection: [3] Blocking collection&lt;/a&gt; post.&lt;/p&gt;  &lt;p&gt;Parallel.ForEach.Aggregate was described in the &lt;a href="http://17slon.com/blogs/gabr/2010/02/parallelforeachaggregate.html"&gt;Parallel.ForEach.Aggregate&lt;/a&gt; post and is demoed in project 36_ParallelAggregate.&lt;/p&gt;  &lt;p&gt;At the moment ForEach is fairly limited. It can iterate over a range of numbers or over a collection supporting the IOmniValueEnumerable interface (TOmniBlockingCollection, for example). The second limitation will be removed in the future. The plan is to support any collection that implements IEnumerable.&lt;/p&gt;  &lt;p&gt;Parallel.Join is very simple code that executes multiple tasks and waits for their completion. It was designed to execute simple tasks that don’t require communication with the owner. It is demoed in project 37_ParallelJoin.&lt;/p&gt;  &lt;h2&gt;Environment&lt;/h2&gt;  &lt;p&gt;Unit &lt;a href="http://code.google.com/p/omnithreadlibrary/source/browse/trunk/OtlCommon.pas"&gt;OtlCommon&lt;/a&gt; contains a new interface IOmniEnvironment and function Environment that returns a singleton of this type. Environment can be used to query some basic information on system, process and thread. 
Some information (for example process and thread affinity) can also be modified using the same interface.&lt;/p&gt;  &lt;pre class="pas-source"&gt;  IOmniAffinity = &lt;span class="pas-kwd"&gt;interface&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;property&lt;/span&gt; AsString: &lt;span class="pas-kwd"&gt;string&lt;/span&gt;;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;property&lt;/span&gt; Count: integer;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;property&lt;/span&gt; Mask: DWORD;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;&lt;br /&gt;  IOmniProcessEnvironment = &lt;span class="pas-kwd"&gt;interface&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;property&lt;/span&gt; Affinity: IOmniAffinity;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;property&lt;/span&gt; Memory: TOmniProcessMemoryCounters;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;property&lt;/span&gt; PriorityClass: TOmniProcessPriorityClass;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;property&lt;/span&gt; Times: TOmniProcessTimes;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;&lt;br /&gt;  IOmniSystemEnvironment = &lt;span class="pas-kwd"&gt;interface&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;property&lt;/span&gt; Affinity: IOmniAffinity;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;&lt;br /&gt;  IOmniThreadEnvironment = &lt;span class="pas-kwd"&gt;interface&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;property&lt;/span&gt; Affinity: IOmniAffinity;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;property&lt;/span&gt; ID: cardinal;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;&lt;br /&gt;  IOmniEnvironment = &lt;span class="pas-kwd"&gt;interface&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;property&lt;/span&gt; Process: IOmniProcessEnvironment;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;property&lt;/span&gt; System: IOmniSystemEnvironment;&lt;br /&gt;    
&lt;span class="pas-kwd"&gt;property&lt;/span&gt; Thread: IOmniThreadEnvironment;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;/pre&gt;&lt;p&gt;Newer demos are using some parts of the Environment interface. For example, in demo 33_BlockingCollection, process affinity is set with&lt;/p&gt;&lt;pre class="pas-source"&gt;  Environment.Process.Affinity.Count := inpNumCPU.Value; &lt;/pre&gt;&lt;p&gt;while the demo 35_ParallelFor uses the following code fragment to query process affinity&lt;/p&gt;&lt;pre class="pas-source"&gt;  numTasks := Environment.Process.Affinity.Count; &lt;/pre&gt;&lt;h2&gt;Cancellation token&lt;/h2&gt;&lt;p&gt;New interface IOmniCancellationToken is used in Parallel.ForEach (see post &lt;a href="http://17slon.com/blogs/gabr/2010/02/three-steps-to-blocking-collection-3.html" target="_blank"&gt;Three steps to the blocking collection: [3] Blocking collection&lt;/a&gt; for the example) and in IOmniTaskControl.TerminateWhen.&lt;/p&gt;&lt;p&gt;IOmniTaskControl and IOmniTask implement the CancellationToken: IOmniCancellationToken property which can be used by the task and task controller.&lt;/p&gt;&lt;p&gt;IOmniCancellationToken is just a simple wrapper around the Win32 event primitive and is defined in the &lt;a href="http://code.google.com/p/omnithreadlibrary/source/browse/trunk/OtlSync.pas" target="_blank"&gt;OtlSync&lt;/a&gt; unit.&lt;/p&gt;&lt;pre class="pas-source"&gt;  IOmniCancellationToken = &lt;span class="pas-kwd"&gt;interface&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; Clear;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  IsSignaled: boolean;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; Signal;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;property&lt;/span&gt; Handle: THandle;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ IOmniCancellationToken }&lt;/span&gt;&lt;/pre&gt;&lt;h2&gt;Message 
dispatcher&lt;/h2&gt;&lt;p&gt;IOmniTaskControl now implements a message dispatching setter in the form OnMessage(msgID, handler). Use it to route specific message IDs to specific functions when a global TOmniEventMonitor is not used.&lt;/p&gt;&lt;p&gt;An example from one of my applications:&lt;/p&gt;&lt;pre class="pas-source"&gt;  spmDatabaseConn := CreateTask(&lt;br /&gt;      TSttdbPlaylistDatabaseWorker.Create(), &lt;br /&gt;      &lt;span class="pas-str"&gt;'Playlist Monitor Database Connection'&lt;/span&gt;)&lt;br /&gt;    .SetParameters([serverAddress, serverPort, username, password])&lt;br /&gt;    .SetTimer(&lt;span class="pas-num"&gt;15&lt;/span&gt;*&lt;span class="pas-num"&gt;1000&lt;/span&gt;, @TSttdbPlaylistDatabaseWorker.CheckDBVersion)&lt;br /&gt;    .OnMessage(MSG_DB_ERROR,   HandleError)&lt;br /&gt;    .OnMessage(MSG_DB_STATUS,  HandleDatabaseStatus)&lt;br /&gt;    .OnMessage(MSG_DB_VERSION, HandleDatabaseVersion)&lt;br /&gt;    .Run; &lt;/pre&gt;&lt;h2&gt;UserData[]&lt;/h2&gt;&lt;p&gt;Implemented IOmniTaskControl.UserData[]. The application can store any values in this array. It can be accessed via an integer or string index. This storage area can only be accessed from the task controller side. Access is not thread-safe so you should use it only from one thread or create your own protection mechanism.&lt;/p&gt;&lt;h2&gt;Small changes&lt;/h2&gt;&lt;ul&gt;  &lt;li&gt;IOmniTask implements the Implementor property which points back to the worker instance&amp;#160; (but only if the worker is TOmniWorker-based). &lt;/li&gt;  &lt;li&gt;Refactored and enhanced TOmniValueContainer. &lt;/li&gt;  &lt;li&gt;TOmniTaskFunction now takes a 'const' parameter.&amp;#160; &lt;br /&gt;TOmniTaskFunction = reference to procedure(const task: IOmniTask). &lt;/li&gt;  &lt;li&gt;Implemented TOmniValue.IsInteger. &lt;/li&gt;&lt;/ul&gt;&lt;h2&gt;Bugs fixed&lt;/h2&gt;&lt;ul&gt;  &lt;li&gt;TOmniEventMonitor.OnTaskUndeliveredMessage was missing the 'message' parameter. 
&lt;/li&gt;  &lt;li&gt;Set package names and designtime/runtime type in D2009/D2010 packages. &lt;/li&gt;&lt;/ul&gt;&lt;h2&gt;New demos&lt;/h2&gt;&lt;ul&gt;  &lt;li&gt;32_Queue: Stress test for new TOmniBaseQueue and TOmniQueue. &lt;/li&gt;  &lt;li&gt;33_BlockingCollection: Stress test for new TOmniBlockingCollection, also demoes&amp;#160; the use of Environment to set process affinity. &lt;/li&gt;  &lt;li&gt;34_TreeScan: Parallel tree scan using TOmniBlockingCollection. &lt;/li&gt;  &lt;li&gt;35_ParallelFor: Parallel tree scan using Parallel.ForEach (Delphi 2009 and newer). &lt;/li&gt;  &lt;li&gt;36_ParallelAggregate: Parallel calculations using Parallel.ForEach.Aggregate&amp;#160; (Delphi 2009 and newer). &lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-8321930019599413454?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/8321930019599413454/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/omnithreadlibrary-105.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/8321930019599413454'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/8321930019599413454'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/omnithreadlibrary-105.html' title='OmniThreadLibrary 1.05'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty 
xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-5422205136113908635</id><published>2010-02-22T20:54:00.000+01:00</published><updated>2010-02-22T20:54:34.122+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='OmniThreadLibrary'/><category scheme='http://www.blogger.com/atom/ns#' term='multithreading'/><category scheme='http://www.blogger.com/atom/ns#' term='source code'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>Three steps to the blocking collection: [3] Blocking collection</title><content type='html'>&lt;p&gt;About two months ago I started working on Delphi clone of .NET 4 &lt;a href="http://msdn.microsoft.com/en-us/library/dd267312(VS.100).aspx" target="_blank"&gt;BlockingCollection&lt;/a&gt;. Initial release was completed just before the end of 2009 and I started to write a series of articles on TOmniBlockingCollection in early January but then I got stuck in the dynamic lock-free queue implementation. Instead of writing articles I spent most of my free time working on that code.&lt;/p&gt;  &lt;p&gt;Now it is (finally) time to complete the journey. 
Everything that had to be said about the infrastructure has been said and I only have to show you the internal workings of the blocking collection itself.&lt;/p&gt;  &lt;p&gt;[Step 1: &lt;a href="http://17slon.com/blogs/gabr/2010/01/three-steps-to-blocking-collection-1.html" target="_blank"&gt;Three steps to the blocking collection: [1] Inverse semaphore&lt;/a&gt;]&lt;/p&gt;  &lt;p&gt;[Step 2: &lt;a href="http://17slon.com/blogs/gabr/2010/02/dynamic-lock-free-queue-doing-it-right.html" target="_blank"&gt;Dynamic lock-free queue – doing it right&lt;/a&gt;]&lt;/p&gt;  &lt;p&gt;The blocking collection is exposed as an interface that lives in the &lt;a href="http://code.google.com/p/omnithreadlibrary/source/browse/trunk/OtlCollections.pas" target="_blank"&gt;OtlCollections&lt;/a&gt; unit.&lt;/p&gt;  &lt;pre class="pas-source"&gt;  IOmniBlockingCollection = &lt;span class="pas-kwd"&gt;interface&lt;/span&gt;(IGpTraceable) &lt;br /&gt;    [&lt;span class="pas-str"&gt;'{208EFA15-1F8F-4885-A509-B00191145D38}'&lt;/span&gt;]&lt;br /&gt;    &lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; Add(&lt;span class="pas-kwd"&gt;const&lt;/span&gt; value: TOmniValue);&lt;br /&gt;    &lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; CompleteAdding;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  GetEnumerator: IOmniValueEnumerator;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  IsCompleted: boolean;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  Take(&lt;span class="pas-kwd"&gt;var&lt;/span&gt; value: TOmniValue): boolean;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  TryAdd(&lt;span class="pas-kwd"&gt;const&lt;/span&gt; value: TOmniValue): boolean;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  TryTake(&lt;span class="pas-kwd"&gt;var&lt;/span&gt; value: TOmniValue; timeout_ms: cardinal = &lt;span class="pas-num"&gt;0&lt;/span&gt;): boolean;&lt;br /&gt;  &lt;span 
class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ IOmniBlockingCollection }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;There’s also a class TOmniBlockingCollection which implements this interface. This class is public and can be used or reused in your code.&lt;/p&gt;&lt;p&gt;The blocking collection works in the following way:&lt;/p&gt;&lt;ul&gt;  &lt;li&gt;&lt;em&gt;Add&lt;/em&gt; adds a new value to the collection, which is internally implemented as a FIFO (first in, first out) queue. &lt;/li&gt;  &lt;li&gt;&lt;em&gt;CompleteAdding &lt;/em&gt;tells the collection that all data is in the queue. From now on, calling &lt;em&gt;Add &lt;/em&gt;will raise an exception. &lt;/li&gt;  &lt;li&gt;&lt;em&gt;TryAdd &lt;/em&gt;is the same as &lt;em&gt;Add &lt;/em&gt;except that it doesn’t raise an exception but returns False if the value can’t be added. &lt;/li&gt;  &lt;li&gt;&lt;em&gt;IsCompleted &lt;/em&gt;returns True after &lt;em&gt;CompleteAdding &lt;/em&gt;has been called. &lt;/li&gt;  &lt;li&gt;&lt;em&gt;Take &lt;/em&gt;reads the next value from the collection. If there’s no data in the collection, &lt;em&gt;Take&lt;/em&gt; will block until the next value is available. If, however, any other thread calls &lt;em&gt;CompleteAdding&lt;/em&gt; while &lt;em&gt;Take&lt;/em&gt; is blocked, &lt;em&gt;Take&lt;/em&gt; will unblock and return False. &lt;/li&gt;  &lt;li&gt;&lt;em&gt;TryTake&lt;/em&gt; is the same as &lt;em&gt;Take&lt;/em&gt; except that it has a &lt;em&gt;timeout&lt;/em&gt; parameter specifying the maximum time the call is allowed to wait for the next value. &lt;/li&gt;  &lt;li&gt;The enumerator calls &lt;em&gt;Take&lt;/em&gt; in the &lt;em&gt;MoveNext &lt;/em&gt;method and returns that value. The enumerator will therefore block when there is no data in the collection. The usual way to stop the enumerator is to call &lt;em&gt;CompleteAdding &lt;/em&gt;which will unblock all pending &lt;em&gt;MoveNext &lt;/em&gt;calls and stop enumeration. 
[For another approach see the example at the end of this article.]&lt;/li&gt;&lt;/ul&gt;&lt;h2&gt;The trivial parts&lt;/h2&gt;&lt;p&gt;Most of the blocking collection code is fairly trivial.&lt;/p&gt;&lt;p&gt;&lt;em&gt;Add&lt;/em&gt; just calls &lt;em&gt;TryAdd&lt;/em&gt; and raises an exception if &lt;em&gt;TryAdd&lt;/em&gt; fails.&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; TOmniBlockingCollection.Add(&lt;span class="pas-kwd"&gt;const&lt;/span&gt; value: TOmniValue);&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;if&lt;/span&gt; &lt;span class="pas-kwd"&gt;not&lt;/span&gt; TryAdd(value) &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;raise&lt;/span&gt; ECollectionCompleted.Create(&lt;span class="pas-str"&gt;'Adding to completed collection'&lt;/span&gt;);&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniBlockingCollection.Add }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;&lt;em&gt;CompleteAdding&lt;/em&gt; sets two “completed” flags – one boolean flag and one Windows event. 
The former is used for speed in non-blocking tests, while the latter is used when &lt;em&gt;TryTake&lt;/em&gt; has to block.&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; TOmniBlockingCollection.CompleteAdding;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;if&lt;/span&gt; &lt;span class="pas-kwd"&gt;not&lt;/span&gt; obcCompleted &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;    obcCompleted := true;&lt;br /&gt;    Win32Check(SetEvent(obcCompletedSignal));&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniBlockingCollection.CompleteAdding }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;&lt;em&gt;Take &lt;/em&gt;calls &lt;em&gt;TryTake &lt;/em&gt;with an INFINITE timeout.&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; TOmniBlockingCollection.Take(&lt;span class="pas-kwd"&gt;var&lt;/span&gt; value: TOmniValue): boolean;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  Result := TryTake(value, INFINITE);&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniBlockingCollection.Take }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;&lt;em&gt;TryAdd&lt;/em&gt; checks whether &lt;em&gt;CompleteAdding&lt;/em&gt; has been called. If not, the value is stored in the dynamic queue.&lt;/p&gt;&lt;p&gt;There’s a potential problem hiding in &lt;em&gt;TryAdd&lt;/em&gt;: between the time the &lt;em&gt;completed&lt;/em&gt; flag is checked and the time the value is enqueued, another thread may call &lt;em&gt;CompleteAdding&lt;/em&gt;. Strictly speaking, &lt;em&gt;TryAdd&lt;/em&gt; should not succeed in that case. 
However, I cannot foresee a parallel algorithm where this could cause a problem.&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; TOmniBlockingCollection.TryAdd(&lt;span class="pas-kwd"&gt;const&lt;/span&gt; value: TOmniValue): boolean;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-comment"&gt;// CompleteAdding and TryAdd are not synchronised&lt;/span&gt;&lt;br /&gt;  Result := &lt;span class="pas-kwd"&gt;not&lt;/span&gt; obcCompleted;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;if&lt;/span&gt; Result &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;    obcCollection.Enqueue(value);&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniBlockingCollection.TryAdd }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Easy peasy.&lt;/p&gt;&lt;h2&gt;The not so trivial part&lt;/h2&gt;&lt;p&gt;And now for something completely different …&lt;/p&gt;&lt;p&gt;&lt;em&gt;TryTake&lt;/em&gt; is a whole different beast. It must:&lt;/p&gt;&lt;ul&gt;  &lt;li&gt;retrieve the data&lt;/li&gt;  &lt;li&gt;observe &lt;em&gt;IsCompleted&lt;/em&gt;&lt;/li&gt;  &lt;li&gt;block when there’s no data &lt;strong&gt;and&lt;/strong&gt; the collection is not yet &lt;em&gt;completed&lt;/em&gt;&lt;/li&gt;  &lt;li&gt;observe the timeout limitations&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Not so easy.&lt;/p&gt;&lt;p&gt;In addition to the &lt;em&gt;obcCompletedSignal&lt;/em&gt; (&lt;em&gt;completed&lt;/em&gt; event) and &lt;em&gt;obcCollection&lt;/em&gt; (dynamic data queue), it will also use &lt;em&gt;obcObserver&lt;/em&gt; (a queue change mechanism used inside the OTL) and &lt;em&gt;obcResourceCount&lt;/em&gt;, which is an instance of the TOmniResourceCount (inverse semaphore, introduced in &lt;a href="http://17slon.com/blogs/gabr/2010/01/three-steps-to-blocking-collection-1.html" target="_blank"&gt;Part 1&lt;/a&gt;). 
All these are created in the constructor:&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;constructor&lt;/span&gt; TOmniBlockingCollection.Create(numProducersConsumers: integer);&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;inherited&lt;/span&gt; Create;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;if&lt;/span&gt; numProducersConsumers &amp;gt; &lt;span class="pas-num"&gt;0&lt;/span&gt; &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;    obcResourceCount := TOmniResourceCount.Create(numProducersConsumers);&lt;br /&gt;  obcCollection := TOmniQueue.Create;&lt;br /&gt;  obcCompletedSignal := CreateEvent(&lt;span class="pas-kwd"&gt;nil&lt;/span&gt;, true, false, &lt;span class="pas-kwd"&gt;nil&lt;/span&gt;);&lt;br /&gt;  obcObserver := CreateContainerWindowsEventObserver;&lt;br /&gt;  obcSingleThreaded := (Environment.Process.Affinity.Count = &lt;span class="pas-num"&gt;1&lt;/span&gt;);&lt;br /&gt;  &lt;span class="pas-kwd"&gt;if&lt;/span&gt; obcSingleThreaded &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;    obcCollection.ContainerSubject.Attach(obcObserver, coiNotifyOnAllInserts);&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniBlockingCollection.Create }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;&lt;em&gt;TryTake&lt;/em&gt; is pretty long so I’ve split it into two parts. Let’s take a look at the non-blocking part first.&lt;/p&gt;&lt;p&gt;First, the code tries to retrieve data from the dynamic queue. If there’s data available, it is returned. End of story.&lt;/p&gt;&lt;p&gt;Otherwise, the &lt;em&gt;completed&lt;/em&gt; flag is checked. If &lt;em&gt;CompleteAdding&lt;/em&gt; has been called, &lt;em&gt;TryTake&lt;/em&gt; returns immediately. It also returns if &lt;em&gt;timeout&lt;/em&gt; is 0.&lt;/p&gt;&lt;p&gt;Otherwise, the code prepares for the blocking wait. 
Resource counter is allocated (reasons for this will be provided later), and observer is attached to the blocking collection. This observer will wake the blocking code when new value is stored in the collection. &lt;/p&gt;&lt;p&gt;[In the code below you can see a small optimization – if the code is running on a single core then the observer is attached in the TOmniBlockingCollection constructor and detached in the destructor. Before this optimization was introduced, Attach and Detach spent much too much time in busy-wait code (on a single-core computer).]&lt;/p&gt;&lt;p&gt;After all that is set, the code waits for the value (see the next code block), observer is detached from the queue and resource counter is released.&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; TOmniBlockingCollection.TryTake(&lt;span class="pas-kwd"&gt;var&lt;/span&gt; value: TOmniValue;&lt;br /&gt;  timeout_ms: cardinal): boolean;&lt;br /&gt;&lt;span class="pas-kwd"&gt;var&lt;/span&gt;&lt;br /&gt;  awaited    : DWORD;&lt;br /&gt;  startTime  : int64;&lt;br /&gt;  waitHandles: &lt;span class="pas-kwd"&gt;array&lt;/span&gt; [&lt;span class="pas-num"&gt;0&lt;/span&gt;..&lt;span class="pas-num"&gt;2&lt;/span&gt;] &lt;span class="pas-kwd"&gt;of&lt;/span&gt; THandle;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;if&lt;/span&gt; obcCollection.TryDequeue(value) &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;    Result := true&lt;br /&gt;  &lt;span class="pas-kwd"&gt;else&lt;/span&gt; &lt;span class="pas-kwd"&gt;if&lt;/span&gt; IsCompleted &lt;span class="pas-kwd"&gt;or&lt;/span&gt; (timeout_ms = &lt;span class="pas-num"&gt;0&lt;/span&gt;) &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;    Result := false&lt;br /&gt;  &lt;span class="pas-kwd"&gt;else&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;if&lt;/span&gt; 
assigned(obcResourceCount) &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;      obcResourceCount.Allocate;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;try&lt;/span&gt;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;if&lt;/span&gt; &lt;span class="pas-kwd"&gt;not&lt;/span&gt; obcSingleThreaded &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;        obcCollection.ContainerSubject.Attach(obcObserver, coiNotifyOnAllInserts);&lt;br /&gt;      &lt;span class="pas-kwd"&gt;try&lt;/span&gt;&lt;br /&gt;        &lt;span class="pas-comment"&gt;//wait for the value, see the next code block below&lt;/span&gt;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;finally&lt;/span&gt;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;if&lt;/span&gt; &lt;span class="pas-kwd"&gt;not&lt;/span&gt; obcSingleThreaded &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;          obcCollection.ContainerSubject.Detach(obcObserver, coiNotifyOnAllInserts);&lt;br /&gt;      &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;finally&lt;/span&gt;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;if&lt;/span&gt; assigned(obcResourceCount) &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;        obcResourceCount.Release;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniBlockingCollection.TryTake }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Blocking part starts by storing the current time (millisecond-accurate TimeGetTime is used) and preparing wait handles. 
Then it enters a loop that repeats until &lt;em&gt;CompleteAdding&lt;/em&gt; has been called, the timeout has elapsed (checked by the &lt;em&gt;Elapsed&lt;/em&gt; function, which I’m not showing here for the sake of simplicity; see the source), or a value has been dequeued.&lt;/p&gt;&lt;p&gt;In the loop, the code tries again to dequeue a value from the dynamic queue and exits the loop if the dequeue succeeds. Otherwise, &lt;em&gt;WaitForMultipleObjects&lt;/em&gt; is called. This call waits for one of three conditions:&lt;/p&gt;&lt;ul&gt;  &lt;li&gt;&lt;em&gt;Completed&lt;/em&gt; event. If this event is signalled, &lt;em&gt;CompleteAdding &lt;/em&gt;has been called and &lt;em&gt;TryTake&lt;/em&gt; must exit.&lt;/li&gt;  &lt;li&gt;&lt;em&gt;Observer&lt;/em&gt; event. If this event is signalled, a new value was enqueued into the dynamic queue and the code must try to dequeue this value.&lt;/li&gt;  &lt;li&gt;&lt;em&gt;Resource count&lt;/em&gt; event. If this event is signalled, all resources are used and the code must exit (more on that later).&lt;/li&gt;&lt;/ul&gt;&lt;pre class="pas-source"&gt;        startTime := DSiTimeGetTime64;&lt;br /&gt;        waitHandles[&lt;span class="pas-num"&gt;0&lt;/span&gt;] := obcCompletedSignal;&lt;br /&gt;        waitHandles[&lt;span class="pas-num"&gt;1&lt;/span&gt;] := obcObserver.GetEvent;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;if&lt;/span&gt; assigned(obcResourceCount) &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;          waitHandles[&lt;span class="pas-num"&gt;2&lt;/span&gt;] := obcResourceCount.Handle;&lt;br /&gt;        Result := false;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;while&lt;/span&gt; &lt;span class="pas-kwd"&gt;not&lt;/span&gt; (IsCompleted &lt;span class="pas-kwd"&gt;or&lt;/span&gt; Elapsed) &lt;span class="pas-kwd"&gt;do&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;          &lt;span class="pas-kwd"&gt;if&lt;/span&gt; obcCollection.TryDequeue(value) &lt;span 
class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;            Result := true;&lt;br /&gt;            break; &lt;span class="pas-comment"&gt;//while&lt;/span&gt;&lt;br /&gt;          &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;          awaited := WaitForMultipleObjects(IFF(assigned(obcResourceCount), &lt;span class="pas-num"&gt;3&lt;/span&gt;, &lt;span class="pas-num"&gt;2&lt;/span&gt;),&lt;br /&gt;                       @waitHandles, false, TimeLeft_ms);&lt;br /&gt;          &lt;span class="pas-kwd"&gt;if&lt;/span&gt; awaited &amp;lt;&amp;gt; WAIT_OBJECT_1 &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;            &lt;span class="pas-kwd"&gt;if&lt;/span&gt; awaited = WAIT_OBJECT_2 &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;br /&gt;              CompleteAdding;&lt;br /&gt;            Result := false;&lt;br /&gt;            break; &lt;span class="pas-comment"&gt;//while&lt;/span&gt;&lt;br /&gt;          &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;/pre&gt;&lt;p&gt;If new value was enqueued into the dynamic queue, &lt;em&gt;TryDequeue&lt;/em&gt; is called again. It is entirely possible that another thread calls that function first and removes the value causing &lt;em&gt;TryDequeue&lt;/em&gt; to fail and &lt;em&gt;WaitForMultipleObjects&lt;/em&gt; to be called again. Such is life in the multithreaded world.&lt;/p&gt;&lt;h2&gt;Enumerating the blocking collection&lt;/h2&gt;&lt;p&gt;TOmniBlockingCollection enumerator is slightly more powerful than the usual Delphi enumerator. 
In addition to the usual methods it contains function &lt;em&gt;Take&lt;/em&gt; which is required by the &lt;em&gt;Parallel&lt;/em&gt; architecture (see &lt;a href="http://17slon.com/blogs/gabr/2010/01/parallelfor.html" target="_blank"&gt;Parallel.For&lt;/a&gt; and &lt;a href="http://17slon.com/blogs/gabr/2010/02/parallelforeachaggregate.html" target="_blank"&gt;Parallel.ForEach.Aggregate&lt;/a&gt; for more information).&lt;/p&gt;&lt;pre class="pas-source"&gt;  IOmniValueEnumerator = &lt;span class="pas-kwd"&gt;interface&lt;/span&gt; [&lt;span class="pas-str"&gt;'{F60EBBD8-2F87-4ACD-A014-452F296F4699}'&lt;/span&gt;]&lt;br /&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  GetCurrent: TOmniValue;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  MoveNext: boolean;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  Take(&lt;span class="pas-kwd"&gt;var&lt;/span&gt; value: TOmniValue): boolean;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;property&lt;/span&gt; Current: TOmniValue &lt;span class="pas-kwd"&gt;read&lt;/span&gt; GetCurrent;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ IOmniValueEnumerator }&lt;/span&gt;&lt;/pre&gt;&lt;pre class="pas-source"&gt;  TOmniBlockingCollectionEnumerator = &lt;span class="pas-kwd"&gt;class&lt;/span&gt;(TInterfacedObject,&lt;br /&gt;                                            IOmniValueEnumerator)&lt;br /&gt;    &lt;span class="pas-kwd"&gt;constructor&lt;/span&gt; Create(collection: TOmniBlockingCollection);&lt;br /&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt; GetCurrent: TOmniValue; &lt;span class="pas-kwd"&gt;inline&lt;/span&gt;;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt; MoveNext: boolean; &lt;span class="pas-kwd"&gt;inline&lt;/span&gt;;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt; Take(&lt;span class="pas-kwd"&gt;var&lt;/span&gt; value: TOmniValue): boolean;&lt;br /&gt;    &lt;span 
class="pas-kwd"&gt;property&lt;/span&gt; Current: TOmniValue &lt;span class="pas-kwd"&gt;read&lt;/span&gt; GetCurrent;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniBlockingCollectionEnumerator }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;The implementation is trivial.&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;constructor&lt;/span&gt; TOmniBlockingCollectionEnumerator.Create(collection: TOmniBlockingCollection);&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  obceCollection_ref := collection;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniBlockingCollectionEnumerator.Create }&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; TOmniBlockingCollectionEnumerator.GetCurrent: TOmniValue;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  Result := obceValue;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniBlockingCollectionEnumerator.GetCurrent }&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; TOmniBlockingCollectionEnumerator.MoveNext: boolean;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  Result := obceCollection_ref.Take(obceValue);&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniBlockingCollectionEnumerator.MoveNext }&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; TOmniBlockingCollectionEnumerator.Take(&lt;span class="pas-kwd"&gt;var&lt;/span&gt; value: TOmniValue): boolean;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  Result := MoveNext;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;if&lt;/span&gt; Result &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;    value := obceValue;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span 
class="pas-comment"&gt;{ TOmniBlockingCollectionEnumerator.Take }&lt;/span&gt;&lt;/pre&gt;&lt;h2&gt;Example&lt;/h2&gt;&lt;p&gt;A not-so-simple &lt;em&gt;how to&lt;/em&gt; on using the blocking collection can be seen in the demo 34_TreeScan. It uses the blocking collection to scan a tree with multiple parallel threads. This demo works in Delphi 2007 and newer.&lt;/p&gt;&lt;p&gt;A better example of using the blocking collection is in the demo 35_ParallelFor. Actually, it uses the same approach as demo 34 to scan the tree, except that the code is implemented as an anonymous method which causes it to be much simpler than the D2007 version. Of course, this demo works only in Delphi 2009 and above. &lt;/p&gt;&lt;p&gt;This is the full parallel scanner from the 35_ParallelFor demo:&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; TfrmParallelForDemo.ParaScan(rootNode: TNode; value: integer): TNode;&lt;br /&gt;&lt;span class="pas-kwd"&gt;var&lt;/span&gt;&lt;br /&gt;  cancelToken: IOmniCancellationToken;&lt;br /&gt;  nodeQueue  : IOmniBlockingCollection;&lt;br /&gt;  nodeResult : TNode;&lt;br /&gt;  numTasks   : integer;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  nodeResult := &lt;span class="pas-kwd"&gt;nil&lt;/span&gt;;&lt;br /&gt;  cancelToken := CreateOmniCancellationToken;&lt;br /&gt;  numTasks := Environment.Process.Affinity.Count;&lt;br /&gt;  nodeQueue := TOmniBlockingCollection.Create(numTasks);&lt;br /&gt;  nodeQueue.Add(rootNode);&lt;br /&gt;  Parallel.ForEach(nodeQueue &lt;span class="pas-kwd"&gt;as&lt;/span&gt; IOmniValueEnumerable)&lt;br /&gt;    .NumTasks(numTasks) &lt;span class="pas-comment"&gt;// must be same number of task as in &lt;br /&gt;                           nodeQueue to ensure stopping&lt;/span&gt;&lt;br /&gt;    .CancelWith(cancelToken)&lt;br /&gt;    .Execute(&lt;br /&gt;      &lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; (&lt;span class="pas-kwd"&gt;const&lt;/span&gt; 
elem: TOmniValue)&lt;br /&gt;      &lt;span class="pas-kwd"&gt;var&lt;/span&gt;&lt;br /&gt;        childNode: TNode;&lt;br /&gt;        node     : TNode;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;        node := TNode(elem.AsObject);&lt;br /&gt;        &lt;span class="pas-kwd"&gt;if&lt;/span&gt; node.Value = value &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;          nodeResult := node;&lt;br /&gt;          nodeQueue.CompleteAdding;&lt;br /&gt;          cancelToken.Signal;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;end&lt;/span&gt;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;else&lt;/span&gt; &lt;span class="pas-kwd"&gt;for&lt;/span&gt; childNode &lt;span class="pas-kwd"&gt;in&lt;/span&gt; node.Children &lt;span class="pas-kwd"&gt;do&lt;/span&gt;&lt;br /&gt;          nodeQueue.TryAdd(childNode);&lt;br /&gt;      &lt;span class="pas-kwd"&gt;end&lt;/span&gt;);&lt;br /&gt;  Result := nodeResult;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TfrmParallelForDemo.ParaScan }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;The code first creates a cancellation token which will be used to stop the &lt;em&gt;Parallel.ForEach&lt;/em&gt; loop. Number of tasks is set to number of cores accessible from the process and a blocking collection is created. Resource count for this collection is initialized to the number of tasks (parameter to the &lt;em&gt;TOmniBlockingCollection.Create&lt;/em&gt;). The root node of the tree is added to the blocking collection.&lt;/p&gt;&lt;p&gt;Then the &lt;em&gt;Parallel.ForEach&lt;/em&gt; is called. The &lt;em&gt;IOmniValueEnumerable&lt;/em&gt; aspect of the blocking collection is passed to the &lt;em&gt;ForEach&lt;/em&gt;. Currently, this is the only way to provide &lt;em&gt;ForEach&lt;/em&gt; with data. This interface just tells the &lt;em&gt;ForEach&lt;/em&gt; how to generate enumerator for each worker thread. 
[At the moment, each worker requires a separate enumerator. This may change in the future.]&lt;/p&gt;&lt;pre class="pas-source"&gt;  IOmniValueEnumerable = &lt;span class="pas-kwd"&gt;interface&lt;/span&gt; [&lt;span class="pas-str"&gt;'{50C1C176-C61F-41F5-AA0B-6FD215E5159F}'&lt;/span&gt;]&lt;br /&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  GetEnumerator: IOmniValueEnumerator;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ IOmniValueEnumerable }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;The code also passes cancellation token to the&lt;em&gt; ForEach&lt;/em&gt; loop and starts the parallel execution (call to &lt;em&gt;Execute&lt;/em&gt;). In each parallel task, the following code is executed (this code is copied from the full &lt;em&gt;ParaScan&lt;/em&gt; example above):&lt;/p&gt;&lt;pre class="pas-source"&gt;      &lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; (&lt;span class="pas-kwd"&gt;const&lt;/span&gt; elem: TOmniValue)&lt;br /&gt;      &lt;span class="pas-kwd"&gt;var&lt;/span&gt;&lt;br /&gt;        childNode: TNode;&lt;br /&gt;        node     : TNode;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;        node := TNode(elem.AsObject);&lt;br /&gt;        &lt;span class="pas-kwd"&gt;if&lt;/span&gt; node.Value = value &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;          nodeResult := node;&lt;br /&gt;          nodeQueue.CompleteAdding;&lt;br /&gt;          cancelToken.Signal;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;end&lt;/span&gt;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;else&lt;/span&gt; &lt;span class="pas-kwd"&gt;for&lt;/span&gt; childNode &lt;span class="pas-kwd"&gt;in&lt;/span&gt; node.Children &lt;span class="pas-kwd"&gt;do&lt;/span&gt;&lt;br /&gt;          nodeQueue.TryAdd(childNode);&lt;br /&gt;      &lt;span class="pas-kwd"&gt;end&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;The code is provided with one 
element from the blocking collection (&lt;em&gt;ForEach&lt;/em&gt; takes care of that). If the &lt;em&gt;Value&lt;/em&gt; field is the value we’re searching for, &lt;em&gt;nodeResult&lt;/em&gt; is set, blocking collection is put into &lt;em&gt;CompleteAdding&lt;/em&gt; state (so that enumerators in other tasks will terminate blocking wait (if any)) and &lt;em&gt;ForEach&lt;/em&gt; is cancelled.&lt;/p&gt;&lt;p&gt;Otherwise (not the value we’re looking for), all the children of the current node are added to the blocking collection. &lt;em&gt;TryAdd&lt;/em&gt; is used (and its return value ignored) because another thread may call &lt;em&gt;CompleteAdding&lt;/em&gt; while the &lt;em&gt;for childNode&lt;/em&gt; loop is being executed.&lt;/p&gt;&lt;p&gt;That’s all! There is a blocking collection into which nodes are put (via the &lt;em&gt;for childNode&lt;/em&gt; loop) and from which they are removed (via the &lt;em&gt;ForEach&lt;/em&gt; infrastructure). If child nodes are not provided fast enough, blocking collection will block on &lt;em&gt;Take&lt;/em&gt; and one or more tasks may sleep for some time until new values appear. 
Only when the value is found are the blocking collection and the &lt;em&gt;ForEach&lt;/em&gt; loop completed and cancelled, respectively.&lt;/p&gt;&lt;p&gt;This is very similar to &lt;a href="http://blogs.msdn.com/pfxteam/archive/2009/11/06/9918363.aspx" target="_blank"&gt;the code&lt;/a&gt; that was my inspiration for writing the blocking collection:&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;var targetNode = …;  &lt;br&gt;var bc = new BlockingCollection&amp;lt;Node&amp;gt;(startingNodes); &lt;br&gt;// since we expect GetConsumingEnumerable to block, limit parallelism to the number of  &lt;br&gt;// procs, avoiding too much thread injection &lt;br&gt;var parOpts = new ParallelOptions() { MaxDegreeOfParallelism = Environment.ProcessorCount };  &lt;br&gt;Parallel.ForEach(bc.GetConsumingEnumerable(), parOpts, (node,loop) =&amp;gt;  &lt;br&gt;{    &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; if (node == targetNode)    &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; {&lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; Console.WriteLine("hooray!");    &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; bc.CompleteAdding();&lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; loop.Stop();&lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; }&lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; else    &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; {    &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; foreach(var neighbor in node.Neighbors) bc.Add(neighbor);    &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; }&lt;br /&gt;});&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;However, this C# code exhibits a small problem. If the value is not present in the tree, the code never stops. Why? All tasks eventually block in the Take method (because the complete tree has been scanned) and nobody calls &lt;em&gt;CompleteAdding&lt;/em&gt; or &lt;em&gt;loop.Stop&lt;/em&gt;. 
Does the Delphi code contain the very same problem?&lt;/p&gt;&lt;p&gt;Definitely not! That’s exactly why the resource counter was added to the blocking collection!&lt;/p&gt;&lt;p&gt;If the blocking collection is initialized with a number of resources greater than zero, it will create a resource counter in the constructor. A resource is allocated from this counter just before the thread blocks in &lt;em&gt;TryTake&lt;/em&gt; and released after that. Each blocking wait in &lt;em&gt;TryTake&lt;/em&gt; waits for this resource counter to become signalled. If all threads try to execute a blocking wait, this resource counter drops to zero, signals itself and unblocks all &lt;em&gt;TryTake&lt;/em&gt; calls!&lt;/p&gt;&lt;p&gt;This elegant solution has only one problem: the resource counter &lt;strong&gt;must&lt;/strong&gt; be initialized to the number of threads that will be reading from the blocking collection. That’s why in the code above (&lt;em&gt;ParaScan&lt;/em&gt;) the same number is passed to the blocking collection constructor (resource counter initialization) and to the &lt;em&gt;ForEach.NumTasks &lt;/em&gt;method (number of parallel threads).&lt;/p&gt;&lt;h2&gt;Download&lt;/h2&gt;&lt;p&gt;TOmniBlockingCollection will be available in the &lt;a href="http://otl.17slon.com" target="_blank"&gt;OmniThreadLibrary&lt;/a&gt; 1.05, which will be released in a few days.&lt;/p&gt;&lt;p&gt;For the impatient there is &lt;a href="http://17slon.com/blogs/gabr/2010/02/omnithreadlibrary-105-release-candidate.html" target="_blank"&gt;OTL 1.05 Release Candidate&lt;/a&gt;. 
The only code that will change between 1.05 RC and release are possible bug fixes.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-5422205136113908635?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/5422205136113908635/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/three-steps-to-blocking-collection-3.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/5422205136113908635'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/5422205136113908635'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/three-steps-to-blocking-collection-3.html' title='Three steps to the blocking collection: [3] Blocking collection'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-747113479395494077</id><published>2010-02-22T20:22:00.001+01:00</published><updated>2010-02-22T20:22:11.497+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='OmniThreadLibrary'/><title type='text'>OmniThreadLibrary 1.05 Release Candidate</title><content 
type='html'>&lt;p&gt;Next OTL release is coming soon. Brave souls are invited to download and test 1.05 RC. If you find any problem, make sure to report it in the &lt;a href="http://otl.17slon.com/forum/index.php/board,2.0.html" target="_blank"&gt;OmniThreadLibrary forum&lt;/a&gt; or here in comments.&lt;/p&gt;  &lt;p&gt;&lt;a href="http://omnithreadlibrary.googlecode.com/files/OmniThreadLibrary-1.05RC.zip" target="_blank"&gt;OmniThreadLibrary-1.05RC.zip&lt;/a&gt;    &lt;br /&gt;&lt;a href="https://omnithreadlibrary.googlecode.com/svn/tags/1.05-RC" target="_blank"&gt;1.05RC tag in SVN&lt;/a&gt;    &lt;br /&gt;&lt;a href="http://code.google.com/p/omnithreadlibrary/source/browse/trunk/history.txt" target="_blank"&gt;list of changes&lt;/a&gt;&lt;/p&gt;  &lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-747113479395494077?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/747113479395494077/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/omnithreadlibrary-105-release-candidate.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/747113479395494077'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/747113479395494077'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/omnithreadlibrary-105-release-candidate.html' title='OmniThreadLibrary 1.05 Release 
Candidate'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-3472550202464156890</id><published>2010-02-18T18:52:00.003+01:00</published><updated>2010-02-19T14:52:42.232+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='OmniThreadLibrary'/><category scheme='http://www.blogger.com/atom/ns#' term='multithreading'/><category scheme='http://www.blogger.com/atom/ns#' term='source code'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>Dynamic lock-free queue – doing it right</title><content type='html'>&lt;p&gt;&lt;em&gt;Some history required …&lt;/em&gt;&lt;/p&gt;&lt;p&gt;First there was a good idea with somewhat patchy implementation: &lt;a href="http://17slon.com/blogs/gabr/2010/01/three-steps-to-blocking-collection-2.html" target="_blank"&gt;Three steps to the blocking collection: [2] Dynamically allocated queue&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;Then there was a partial solution, depending on me being able to solve another problem. 
Still, it was a good solution: &lt;a href="http://17slon.com/blogs/gabr/2010/02/releasing-queue-memory-without-mrew.html" target="_blank"&gt;Releasing queue memory without the MREW lock&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;In the end, the final (actually, the original) problem was also solved: &lt;a href="http://17slon.com/blogs/gabr/2010/02/bypassing-aba-problem.html" target="_blank"&gt;Bypassing the ABA problem&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;&lt;em&gt;And now to the results …&lt;/em&gt;&lt;/p&gt;&lt;style type="text/css"&gt;.style1 {font-family: consolas, "Courier New", courier, monospace;}.example {font-family: consolas, "Courier New", courier, monospace; padding: 0.5ex 4em 0.5ex 4ex;}.style3 { text-align: center;}.style4 { text-align: center; font-variant: small-caps;}.style5 {color: #008000;}.style6 {text-align: center;color: #008000;}&lt;/style&gt;  &lt;p&gt;This article describes a &lt;a title="lock-freedom @ wikipedia" href="http://en.wikipedia.org/wiki/Lock-free#Lock-freedom" target="_blank"&gt;lock-free&lt;/a&gt;, (nearly) O(1) insert/remove, dynamically allocated queue that doesn’t require a garbage collector. It can be implemented on any hardware that supports an 8-byte compare-and-swap operation (in the Intel world, that means at least a Pentium). The code uses an 8-byte atomic move in some places, but those can easily be replaced with an 8-byte CAS if the platform doesn’t support such an operation. In the current implementation, the Move64 (8-byte move) function uses SSE2 instructions and therefore requires a Pentium 4. The code, however, can be conditionally compiled with CAS64 instead of Move64, which enables it to run on Pentium 1 to 3. (See the notes in the code for more information.) The code requires a memory manager that allows memory to be released in a thread different from the one where the allocation occurred. 
[Obviously, Windows on the Intel platform satisfies all of these conditions.]&lt;/p&gt;  &lt;p&gt;Although the dynamic queue has been designed with the &lt;a href="http://otl.17slon.com" target="_blank"&gt;OmniThreadLibrary&lt;/a&gt; (OTL for short) in mind, there’s also a small sample implementation that doesn’t depend on the OTL: &lt;a href="http://code.google.com/p/omnithreadlibrary/source/browse/trunk/src/GpLockFreeQueue.pas"&gt;GpLockFreeQueue.pas&lt;/a&gt;. This implementation can store int64 elements only (or anything you can cast into 8 bytes) while the OTL implementation from &lt;a href="http://code.google.com/p/omnithreadlibrary/source/browse/trunk/OtlContainers.pas"&gt;OtlContainers&lt;/a&gt; stores TOmniValue data. [The latter is a kind of variant record used inside the OTL to store “anything” from a byte to a string/wide string/object/interface.] Because of that, the GpLockFreeQueue implementation is smaller and faster, but slightly more limited. Both are released under the &lt;a href="http://www.opensource.org/licenses/bsd-license.php"&gt;BSD license&lt;/a&gt;.&lt;/p&gt;  &lt;h2&gt;Memory layout&lt;/h2&gt;    &lt;p&gt;Data is stored in &lt;em&gt;slots&lt;/em&gt;. Each slot uses 16 bytes and contains a byte-size &lt;em&gt;tag&lt;/em&gt;, a word-size &lt;em&gt;offset&lt;/em&gt; and up to 13 bytes of data. The implementation in OtlContainers uses all of those 13 bytes to store a TOmniValue while the implementation in GpLockFreeQueue uses only 8 bytes and leaves the rest unused.&lt;/p&gt;  &lt;p&gt;The following notation is used to represent a slot: &lt;font face="Consolas, Courier New"&gt;[&lt;em&gt;tag&lt;/em&gt;|&lt;em&gt;offset&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;].&lt;/font&gt;&lt;/p&gt;  &lt;p&gt;In reality, the &lt;em&gt;value&lt;/em&gt; field is first in the record because it must be 4-aligned. The reason for that will be revealed in a moment. 
In GpLockFreeQueue, a slot is defined as:&lt;/p&gt; &lt;!-- Highlighted Pascal code generated by DelphiDabbler PasH --&gt;  &lt;pre class="pas-source"&gt;  TGpLFQueueTaggedValue = &lt;span class="pas-kwd"&gt;packed&lt;/span&gt; &lt;span class="pas-kwd"&gt;record&lt;/span&gt;&lt;br /&gt;    Value   : int64;&lt;br /&gt;    Tag     : TGpLFQueueTag;&lt;br /&gt;    Offset  : word;&lt;br /&gt;    Stuffing: &lt;span class="pas-kwd"&gt;array&lt;/span&gt; [&lt;span class="pas-num"&gt;1&lt;/span&gt;..&lt;span class="pas-num"&gt;5&lt;/span&gt;] &lt;span class="pas-kwd"&gt;of&lt;/span&gt; byte;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TGpLFQueueTaggedValue }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Slots do not stand by themselves; they are allocated in &lt;em&gt;blocks&lt;/em&gt;. The default block size is 64 KB (4096 slots), but it can be varied from 64 bytes (four slots) to 1 MB (65536 slots). In this article, I’ll be using 5-slot blocks, as they are big enough to demonstrate all the nooks and crannies of the algorithm and small enough to fit in one line of text.&lt;/p&gt;&lt;p&gt;During allocation, each block is formatted as follows:&lt;/p&gt;&lt;p class="example"&gt;&lt;span class="style1"&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;&lt;/span&gt;&lt;font face="Consolas, Courier New"&gt;]&lt;/font&gt;&lt;/p&gt;&lt;p&gt;The first slot is marked as a &lt;em&gt;Header&lt;/em&gt; and has the &lt;em&gt;value&lt;/em&gt; field initialized to “number of slots in the block minus one”. 
[The highest value that can be stored in the header’s value field is 65535; therefore the maximum number of slots in a block is 65536.] This value is atomically decremented each time a slot is dequeued. When the number drops to zero, the block can be released. (More on that in: &lt;a href="http://17slon.com/blogs/gabr/2010/02/releasing-queue-memory-without-mrew.html"&gt;Releasing queue memory without the MREW lock&lt;/a&gt;.) InterlockedDecrement, which is used to decrement this value, requires its argument to be 4-aligned, and that is why the &lt;em&gt;value&lt;/em&gt; field is stored first in the slot.&lt;/p&gt;&lt;p&gt;The second slot is a &lt;em&gt;Sentinel&lt;/em&gt;. Slots from the third onwards are tagged &lt;em&gt;Free&lt;/em&gt; and are used to store data. The last slot is tagged &lt;em&gt;EndOfList &lt;/em&gt;and is used to link two blocks. All slots have the &lt;em&gt;offset&lt;/em&gt; field initialized to the sequence number of the slot – in the &lt;em&gt;Header&lt;/em&gt; this value is 0, in the &lt;em&gt;Sentinel &lt;/em&gt;1, and so on up to the &lt;em&gt;EndOfList&lt;/em&gt; with the value set to 4 (number of slots in the block minus 1). This value is used in Dequeue to calculate the address of the header slot just before the header’s value is decremented.&lt;/p&gt;&lt;p&gt;In addition to dynamically allocated (and released) memory blocks, the queue uses &lt;strong&gt;head &lt;/strong&gt;and &lt;strong&gt;tail &lt;/strong&gt;tagged pointers. Both are 8-byte values, consisting of two 4-byte fields – &lt;em&gt;slot&lt;/em&gt; and &lt;em&gt;tag&lt;/em&gt;. The following notation is used to represent a tagged pointer: &lt;font face="Consolas, Courier New"&gt;[&lt;em&gt;slot&lt;/em&gt;|&lt;em&gt;tag&lt;/em&gt;].&lt;/font&gt;&lt;/p&gt;&lt;p&gt;The &lt;em&gt;slot&lt;/em&gt; field contains the address of the current head/tail slot while the &lt;em&gt;tag&lt;/em&gt; field contains the tag of the current slot. 
The motivation behind this scheme is explained in the &lt;a href="http://17slon.com/blogs/gabr/2010/02/bypassing-aba-problem.html"&gt;Bypassing the ABA problem&lt;/a&gt; post.&lt;/p&gt;&lt;p&gt;The tail and head pointers are modified using 8-byte CAS and Move commands and must therefore be 8-aligned.&lt;/p&gt;&lt;p&gt;By putting all that together, we get a snapshot of the queue state. This is the initial state of a queue with five-slot blocks:&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas, Courier New"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;em&gt;B1:2&lt;/em&gt;|&lt;em&gt;Free&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;em&gt;B1:1&lt;/em&gt;|&lt;em&gt;Sentinel&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;p&gt;The memory block begins at address &lt;strong&gt;B1&lt;/strong&gt; and contains five slots, initialized as described before. The &lt;strong&gt;tail&lt;/strong&gt; pointer points to the second slot of block B1 (&lt;em&gt;B1:1&lt;/em&gt;; I’m using the form &lt;em&gt;address:offset&lt;/em&gt;), which is tagged &lt;em&gt;Sentinel&lt;/em&gt;, and the &lt;strong&gt;head&lt;/strong&gt; pointer points to the third slot (B1:2), the first &lt;em&gt;Free&lt;/em&gt; slot. Here we see the sole reason for the &lt;em&gt;Sentinel&lt;/em&gt; – it stands between the &lt;strong&gt;tail&lt;/strong&gt; and the &lt;strong&gt;head&lt;/strong&gt; when the queue is empty.&lt;/p&gt;&lt;h2&gt;Enqueue&lt;/h2&gt;&lt;p&gt;In theory, the enqueue operation is simple. 
The element is stored in the next available slot and the queue head is advanced. In practice, however, multithreading makes things much more complicated. &lt;/p&gt;&lt;p&gt;To prevent thread conflicts, each enqueueing thread must first &lt;em&gt;take ownership&lt;/em&gt; of the head. It does this by swapping the queue &lt;strong&gt;head&lt;/strong&gt; tag from &lt;em&gt;Free&lt;/em&gt; to &lt;em&gt;Allocating&lt;/em&gt; or from &lt;em&gt;EndOfList&lt;/em&gt; to &lt;em&gt;Extending&lt;/em&gt;. To prevent ABA problems, both the head pointer and the head tag are swapped with the same head pointer and the new tag in one atomic 8-byte compare-and-swap.&lt;/p&gt;&lt;p&gt;Enqueue then does its work and at the end swaps (head pointer, tag) to (next head pointer, &lt;em&gt;Free|EndOfList&lt;/em&gt;), which allows other threads to proceed with their enqueue operation.&lt;/p&gt;&lt;p&gt;Let’s start with the empty queue.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas, Courier New"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;em&gt;B1:2&lt;/em&gt;|&lt;em&gt;Free&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;em&gt;B1:1&lt;/em&gt;|&lt;em&gt;Sentinel&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;p&gt;Enqueue first swaps &lt;font face="Consolas, Courier New"&gt;[&lt;em&gt;B1:2&lt;/em&gt;|&lt;em&gt;Free&lt;/em&gt;]&lt;/font&gt; with &lt;font face="Consolas, Courier New"&gt;[&lt;em&gt;B1:2&lt;/em&gt;|&lt;em&gt;Allocating&lt;/em&gt;]&lt;/font&gt;.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas, Courier 
New"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;em&gt;&lt;strong&gt;&lt;font color="#008000"&gt;B1:2&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;|&lt;em&gt;&lt;font color="#008000"&gt;&lt;strong&gt;Allocating&lt;/strong&gt;&lt;/font&gt;&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;em&gt;B1:1&lt;/em&gt;|&lt;em&gt;Sentinel&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;p&gt;The green colour indicates an atomic change.&lt;/p&gt;&lt;p&gt;Only the head tag has changed; the data in the B1 memory block is not modified. &lt;strong&gt;Head&lt;/strong&gt; still points to a slot tagged &lt;em&gt;Free &lt;/em&gt;(slot B1:2). This is fine, as enqueueing threads take no interest in this tag at all.&lt;/p&gt;&lt;p&gt;Data is then stored in the slot and its tag is changed to &lt;em&gt;Allocated&lt;/em&gt;. This again makes no difference to enqueuers, as the head pointer has not been updated yet. 
It also doesn’t allow a dequeue operation on this slot to proceed, because the &lt;strong&gt;head &lt;/strong&gt;is adjacent to the &lt;strong&gt;tail&lt;/strong&gt;, which points to a &lt;em&gt;Sentinel&lt;/em&gt;, and in this case Dequeue treats the queue as empty (as we’ll see later).&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas, Courier New"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;em&gt;B1:2&lt;/em&gt;|&lt;em&gt;&lt;font color="#000000"&gt;Allocating&lt;/font&gt;&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;em&gt;B1:1&lt;/em&gt;|&lt;em&gt;Sentinel&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;&lt;font color="#ff0000"&gt;&lt;strong&gt;Allocated&lt;/strong&gt;&lt;/font&gt;&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;&lt;font color="#ff0000"&gt;&lt;strong&gt;42&lt;/strong&gt;&lt;/font&gt;&lt;/em&gt;] [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;p&gt;Red colour marks an “unsafe” modification.&lt;/p&gt;&lt;p&gt;At the end, the &lt;strong&gt;head&lt;/strong&gt; is unlocked by storing the address of the next slot (the first free slot, B1:3) and the next slot’s tag (&lt;em&gt;Free&lt;/em&gt;).&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas, Courier New"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;em&gt;&lt;font color="#008080"&gt;&lt;strong&gt;B1:3&lt;/strong&gt;&lt;/font&gt;&lt;/em&gt;|&lt;em&gt;&lt;font color="#008080"&gt;&lt;strong&gt;Free&lt;/strong&gt;&lt;/font&gt;&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;em&gt;B1:1&lt;/em&gt;|&lt;em&gt;Sentinel&lt;/em&gt;] 
&lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;p&gt;Teal colour marks an atomic 8-byte move used to move new data into the &lt;strong&gt;head&lt;/strong&gt; pointer. If the target platform doesn’t support such a move, an 8-byte CAS could be used instead.&lt;/p&gt;&lt;p&gt;After those changes, &lt;strong&gt;head&lt;/strong&gt; is pointing to the next free slot and data is stored in the queue.&lt;/p&gt;&lt;p&gt;Let’s assume that another Enqueue is called and stores the number 17 in the queue. Nothing new happens here, except that the &lt;strong&gt;head&lt;/strong&gt; now lands on the last slot of the block and picks up its &lt;em&gt;EndOfList&lt;/em&gt; tag.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas, Courier New"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B1:4&lt;/em&gt;|&lt;em&gt;EndOfList&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;em&gt;B1:1&lt;/em&gt;|&lt;em&gt;Sentinel&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;p&gt;The next Enqueue must do something new as there are no free slots in the current block. 
To extend the queue, the thread first swaps the &lt;em&gt;EndOfList &lt;/em&gt;tag with the &lt;em&gt;Extending &lt;/em&gt;tag. By doing this, the thread takes ownership of the queue &lt;strong&gt;head&lt;/strong&gt;.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas, Courier New"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;&lt;strong&gt;&lt;font color="#008000"&gt;B1:4&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;|&lt;/font&gt;&lt;font color="#008000"&gt;&lt;em&gt;&lt;strong&gt;Extending&lt;/strong&gt;&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;em&gt;B1:1&lt;/em&gt;|&lt;em&gt;Sentinel&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;p&gt;A new block gets allocated and initialized (see the chapter on memory management, below).&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas, Courier New"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B1:4&lt;/em&gt;|&lt;em&gt;Extending&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;em&gt;B1:1&lt;/em&gt;|&lt;em&gt;Sentinel&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font 
color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] &lt;br/&gt;&lt;font color="#ff0000"&gt;&lt;strong&gt;B2:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;&lt;p&gt;Data is stored in the first free slot of the block B2.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas, Courier New"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B1:4&lt;/em&gt;|&lt;em&gt;Extending&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;em&gt;B1:1&lt;/em&gt;|&lt;em&gt;Sentinel&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B2:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;&lt;strong&gt;&lt;font 
color="#ff0000"&gt;Allocated&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;&lt;font color="#ff0000"&gt;&lt;strong&gt;57&lt;/strong&gt;&lt;/font&gt;&lt;/em&gt;] [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;p&gt;The last slot of block B1 is modified to point to the second slot of the new block (the &lt;em&gt;Sentinel&lt;/em&gt;). The &lt;em&gt;BlockPointer &lt;/em&gt;tag is also stored into that slot.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B1:4&lt;/em&gt;|&lt;em&gt;Extending&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;em&gt;B1:1&lt;/em&gt;|&lt;em&gt;Sentinel&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;&lt;strong&gt;&lt;font color="#ff0000"&gt;BlockPointer&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;&lt;strong&gt;&lt;font color="#ff0000"&gt;B2:1&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B2:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;&lt;font color="#000000"&gt;Allocated&lt;/font&gt;&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;&lt;font color="#000000"&gt;57&lt;/font&gt;&lt;/em&gt;] [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] 
[&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;p&gt;At the end, the &lt;strong&gt;head&lt;/strong&gt; is updated to point to the first free slot (B2:3).&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;&lt;strong&gt;&lt;font color="#008080"&gt;B2:3&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;|&lt;/font&gt;&lt;em&gt;&lt;strong&gt;&lt;font color="#008080"&gt;Free&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;em&gt;B1:1&lt;/em&gt;|&lt;em&gt;Sentinel&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;BlockPointer&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;B2:1&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;B2:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;&lt;font color="#000000"&gt;Allocated&lt;/font&gt;&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;&lt;font color="#000000"&gt;57&lt;/font&gt;&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt; &lt;/p&gt;&lt;p&gt;That completes the Enqueue. 
List &lt;strong&gt;head&lt;/strong&gt; is now unlocked.&lt;/p&gt;&lt;p&gt;The actual code is not more complicated than this description (code taken from GpLockFreeQueue).&lt;/p&gt;&lt;!-- Highlighted Pascal code generated by DelphiDabbler PasH --&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; TGpLockFreeQueue.Enqueue(&lt;span class="pas-kwd"&gt;const&lt;/span&gt; value: int64);&lt;br /&gt;&lt;span class="pas-kwd"&gt;var&lt;/span&gt;&lt;br /&gt;  extension: PGpLFQueueTaggedValue;&lt;br /&gt;  next     : PGpLFQueueTaggedValue;&lt;br /&gt;  head     : PGpLFQueueTaggedValue;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;repeat&lt;/span&gt;&lt;br /&gt;    head := obcHeadPointer.Slot;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;if&lt;/span&gt; (obcHeadPointer.Tag = tagFree)&lt;br /&gt;       &lt;span class="pas-kwd"&gt;and&lt;/span&gt; CAS64(head, Ord(tagFree), head, Ord(tagAllocating), obcHeadPointer^)&lt;br /&gt;    &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;      break &lt;span class="pas-comment"&gt;//repeat&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;else&lt;/span&gt; &lt;span class="pas-kwd"&gt;if&lt;/span&gt; (obcHeadPointer.Tag = tagEndOfList)&lt;br /&gt;            &lt;span class="pas-kwd"&gt;and&lt;/span&gt; CAS64(head, Ord(tagEndOfList), head, Ord(tagExtending), obcHeadPointer^)&lt;br /&gt;    &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;      break &lt;span class="pas-comment"&gt;//repeat&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;else&lt;/span&gt;  &lt;span class="pas-comment"&gt;// very temporary condition, retry quickly&lt;/span&gt;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;asm&lt;/span&gt; &lt;span class="pas-asm"&gt;pause&lt;/span&gt;&lt;span class="pas-asm"&gt;;&lt;/span&gt; &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;until&lt;/span&gt; false;&lt;br /&gt;  &lt;span 
class="pas-kwd"&gt;if&lt;/span&gt; obcHeadPointer.Tag = tagAllocating &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt; &lt;span class="pas-comment"&gt;// enqueueing&lt;/span&gt;&lt;br /&gt;    next := NextSlot(head);&lt;br /&gt;    head.Value := value;&lt;br /&gt;    head.Tag := tagAllocated;&lt;br /&gt;    Move64(next, Ord(next.Tag), obcHeadPointer^); &lt;span class="pas-comment"&gt;// release the lock&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;else&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt; &lt;span class="pas-comment"&gt;// allocating memory&lt;/span&gt;&lt;br /&gt;    extension := AllocateBlock; &lt;span class="pas-comment"&gt;// returns pointer to the header&lt;/span&gt;&lt;br /&gt;    Inc(extension, &lt;span class="pas-num"&gt;2&lt;/span&gt;);          &lt;span class="pas-comment"&gt;// move over header and sentinel to the first data slot&lt;/span&gt;&lt;br /&gt;    extension.Tag := tagAllocated;&lt;br /&gt;    extension.Value := value;&lt;br /&gt;    Dec(extension);             &lt;span class="pas-comment"&gt;// forward reference points to the sentinel&lt;/span&gt;&lt;br /&gt;    head.Value := int64(extension);&lt;br /&gt;    head.Tag := tagBlockPointer;&lt;br /&gt;    Inc(extension, &lt;span class="pas-num"&gt;2&lt;/span&gt;); &lt;span class="pas-comment"&gt;// get to the first free slot&lt;/span&gt;&lt;br /&gt;    Move64(extension, Ord(extension.Tag), obcHeadPointer^); &lt;span class="pas-comment"&gt;// release the lock&lt;/span&gt;&lt;br /&gt;    PreallocateMemory; &lt;span class="pas-comment"&gt;// preallocate memory block&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TGpLockFreeQueue.Enqueue }&lt;/span&gt;&lt;/pre&gt;&lt;h2&gt;Dequeue&lt;/h2&gt;&lt;p&gt;Enqueue is simple but Dequeue is a whole new bag of 
problems. It has to handle the &lt;em&gt;Sentinel &lt;/em&gt;slot and because of that there are five possible scenarios:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Skip the sentinel. &lt;/li&gt;&lt;li&gt;Read the data (tail doesn’t catch the head). &lt;/li&gt;&lt;li&gt;Read the data (tail does catch the head). &lt;/li&gt;&lt;li&gt;The queue is empty. &lt;/li&gt;&lt;li&gt;Follow the &lt;em&gt;BlockPointer&lt;/em&gt; tag. &lt;/li&gt;&lt;/ol&gt;&lt;p&gt;To prevent thread conflicts, the dequeueing thread takes ownership of the &lt;strong&gt;tail&lt;/strong&gt;. It does this by swapping the tail tag from &lt;em&gt;Allocated&lt;/em&gt; or &lt;em&gt;Sentinel&lt;/em&gt; to &lt;em&gt;Removing&lt;/em&gt; or from &lt;em&gt;BlockPointer&lt;/em&gt; to &lt;em&gt;Destroying&lt;/em&gt;. Again, those changes are done atomically by swapping both the tail pointer and the tail tag in one go.&lt;/p&gt;&lt;p&gt;Let’s walk through all five scenarios now.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;1 – Skip the sentinel&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Let’s start with a queue state where two slots are allocated and &lt;strong&gt;head&lt;/strong&gt; points to the &lt;em&gt;EndOfList&lt;/em&gt; slot.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas, Courier New"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B1:4&lt;/em&gt;|&lt;em&gt;EndOfList&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;em&gt;B1:1&lt;/em&gt;|&lt;em&gt;Sentinel&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] 
&lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt; &lt;/p&gt;&lt;p&gt;The code first locks the &lt;strong&gt;tail&lt;/strong&gt;.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B1:4&lt;/em&gt;|&lt;em&gt;EndOfList&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;em&gt;&lt;strong&gt;&lt;font color="#008000"&gt;B1:1&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;|&lt;em&gt;&lt;strong&gt;&lt;font color="#008000"&gt;Removing&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;p&gt;As there is no data in the &lt;em&gt;Sentinel&lt;/em&gt; slot, the &lt;strong&gt;tail&lt;/strong&gt; is immediately updated to point to the next slot.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B1:4&lt;/em&gt;|&lt;em&gt;EndOfList&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;&lt;strong&gt;&lt;font color="#008080"&gt;B1:2&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;|&lt;/font&gt;&lt;font color="#008080"&gt;&lt;em&gt;&lt;strong&gt;Allocated&lt;/strong&gt;&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] 
[&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;p&gt;There’s no need to update the tag in slot 1 as no other thread can reach it again. Because the slot is now unreachable, the code decrements the count in B1’s &lt;em&gt;Header&lt;/em&gt; slot (from 4 to 3).&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B1:4&lt;/em&gt;|&lt;em&gt;EndOfList&lt;/em&gt;&lt;/font&gt;]&lt;strong&gt; &lt;br/&gt;Tail:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B1:2&lt;/em&gt;|&lt;em&gt;Allocated&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;&lt;strong&gt;&lt;font color="#008000"&gt;3&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;p&gt;Because the original tag was &lt;em&gt;Sentinel&lt;/em&gt;, the code retries from the beginning immediately. 
The queue is now in scenario 2 (data, the &lt;strong&gt;tail &lt;/strong&gt;is not immediately before the &lt;strong&gt;head&lt;/strong&gt;).&lt;/p&gt;&lt;p&gt;&lt;strong&gt;2 - Read the data (tail doesn’t catch the head)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Again, the &lt;strong&gt;tail&lt;/strong&gt; is locked.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B1:4&lt;/em&gt;|&lt;em&gt;EndOfList&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;&lt;strong&gt;&lt;font color="#008000"&gt;B1:2&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;|&lt;/font&gt;&lt;font color="#008000"&gt;&lt;em&gt;&lt;strong&gt;Removing&lt;/strong&gt;&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;&lt;font color="#000000"&gt;3&lt;/font&gt;&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;p&gt;The code then reads the value from the slot (42) and advances the &lt;strong&gt;tail&lt;/strong&gt; to the slot B1:3.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B1:4&lt;/em&gt;|&lt;em&gt;EndOfList&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;&lt;strong&gt;&lt;font color="#008080"&gt;B1:3&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;|&lt;/font&gt;&lt;em&gt;&lt;strong&gt;&lt;font 
color="#008080"&gt;Allocated&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;&lt;font color="#000000"&gt;3&lt;/font&gt;&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;p&gt;Again, there is no need to change the slot tag. The slot 2 is now unreachable and the &lt;em&gt;Header&lt;/em&gt; count is decremented.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B1:4&lt;/em&gt;|&lt;em&gt;EndOfList&lt;/em&gt;&lt;/font&gt;]&lt;strong&gt; &lt;br/&gt;Tail:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B1:3&lt;/em&gt;|&lt;em&gt;Allocated&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;&lt;strong&gt;&lt;font color="#008000"&gt;2&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;p&gt;The code has retrieved the data and can now return from the Dequeue method.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;3 - Read the data (tail does catch the head)&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;If the Dequeue 
is now called for the second time, we have scenario 3 – there is data in the queue, but the &lt;strong&gt;head&lt;/strong&gt; pointer is next to the &lt;strong&gt;tail&lt;/strong&gt; pointer. Because of that, the &lt;strong&gt;tail&lt;/strong&gt; cannot be advanced. Instead, the code replaces the &lt;strong&gt;tail&lt;/strong&gt; tag with &lt;em&gt;Sentinel&lt;/em&gt;.&lt;/p&gt;&lt;p&gt;It is entirely possible that the &lt;strong&gt;head&lt;/strong&gt; will change the very next moment, which means that the &lt;em&gt;Sentinel&lt;/em&gt; would not really be needed. Luckily, that doesn’t hurt much – the next Dequeue would skip the &lt;em&gt;Sentinel&lt;/em&gt;, retry and fetch the next element from the queue.&lt;/p&gt;&lt;p&gt;The code starts in the familiar manner, by taking ownership of the &lt;strong&gt;tail&lt;/strong&gt;.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B1:4&lt;/em&gt;|&lt;em&gt;EndOfList&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;&lt;strong&gt;&lt;font color="#008000"&gt;B1:3&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;|&lt;/font&gt;&lt;font color="#008000"&gt;&lt;em&gt;&lt;strong&gt;Removing&lt;/strong&gt;&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;&lt;font color="#000000"&gt;2&lt;/font&gt;&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&amp;#160;&lt;/p&gt;&lt;p&gt;The code then reads the value from the 
slot, but because the &lt;strong&gt;head&lt;/strong&gt; was next to the &lt;strong&gt;tail&lt;/strong&gt; when Dequeue was called, the code doesn’t advance the &lt;strong&gt;tail&lt;/strong&gt; and doesn’t decrement the &lt;em&gt;Header&lt;/em&gt; counter. Instead, the &lt;em&gt;Sentinel&lt;/em&gt; tag is put into the &lt;strong&gt;tail&lt;/strong&gt; tag.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B1:4&lt;/em&gt;|&lt;em&gt;EndOfList&lt;/em&gt;&lt;/font&gt;]&lt;strong&gt; &lt;br/&gt;Tail:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;&lt;strong&gt;&lt;font color="#008080"&gt;B1:3&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;|&lt;/font&gt;&lt;font color="#008080"&gt;&lt;em&gt;&lt;strong&gt;Sentinel&lt;/strong&gt;&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;&lt;font color="#000000"&gt;2&lt;/font&gt;&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&amp;#160;&lt;/p&gt;&lt;p&gt;It doesn’t matter that the slot tag is still &lt;em&gt;Allocated&lt;/em&gt;, as no one will read it again.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;4 - The queue is empty&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;If Dequeue were called now, it would return immediately with status &lt;em&gt;empty&lt;/em&gt; because the &lt;strong&gt;tail&lt;/strong&gt; tag is &lt;em&gt;Sentinel&lt;/em&gt; and because the &lt;strong&gt;tail&lt;/strong&gt; has caught the 
&lt;strong&gt;head&lt;/strong&gt;.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;5 - Follow the &lt;em&gt;BlockPointer&lt;/em&gt; tag&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;In the last scenario, the &lt;strong&gt;tail&lt;/strong&gt; is pointing to a &lt;em&gt;BlockPointer&lt;/em&gt;.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B2:3&lt;/em&gt;|&lt;em&gt;Free&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;em&gt;B1:4&lt;/em&gt;|&lt;em&gt;EndOfList&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;BlockPointer&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;B2:1&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;B2:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;&lt;font color="#000000"&gt;Allocated&lt;/font&gt;&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;&lt;font color="#000000"&gt;57&lt;/font&gt;&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;p&gt;As expected, the code first takes the ownership of the &lt;strong&gt;tail.&lt;/strong&gt;&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B2:3&lt;/em&gt;|&lt;em&gt;Free&lt;/em&gt;&lt;/font&gt;] 
&lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;em&gt;&lt;strong&gt;&lt;font color="#008000"&gt;B1:4&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;|&lt;em&gt;&lt;strong&gt;&lt;font color="#008000"&gt;Destroying&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;BlockPointer&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;B2:1&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;B2:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;&lt;font color="#000000"&gt;Allocated&lt;/font&gt;&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;&lt;font color="#000000"&gt;57&lt;/font&gt;&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt; &lt;/p&gt;&lt;p&gt;We know that the first slot in the next block is a &lt;em&gt;Sentinel&lt;/em&gt;. We also know that the &lt;strong&gt;head&lt;/strong&gt; is not pointing to this slot because that’s how Enqueue works (when a new block is allocated, &lt;strong&gt;head&lt;/strong&gt; points to the first slot &lt;em&gt;after&lt;/em&gt; the &lt;em&gt;Sentinel&lt;/em&gt;). 
Therefore, it is safe to update the &lt;strong&gt;tail&lt;/strong&gt; to point to the &lt;em&gt;Sentinel&lt;/em&gt; slot of the B2 block.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B2:3&lt;/em&gt;|&lt;em&gt;Free&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;em&gt;&lt;strong&gt;&lt;font color="#008080"&gt;B2:1&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;|&lt;em&gt;&lt;strong&gt;&lt;font color="#008080"&gt;Sentinel&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;BlockPointer&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;B2:1&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;B2:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;&lt;font color="#000000"&gt;Allocated&lt;/font&gt;&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;&lt;font color="#000000"&gt;57&lt;/font&gt;&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt; &lt;/p&gt;&lt;p&gt;By doing the swap, the ownership of the &lt;strong&gt;tail&lt;/strong&gt; is released.&lt;/p&gt;&lt;p&gt;The &lt;em&gt;Header&lt;/em&gt; count is then decremented.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font 
color="#000000"&gt;&lt;em&gt;B2:3&lt;/em&gt;|&lt;em&gt;Free&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B2:1&lt;/em&gt;|&lt;em&gt;Sentinel&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;&lt;strong&gt;&lt;font color="#008000"&gt;0&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;BlockPointer&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;B2:1&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;B2:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;&lt;font color="#000000"&gt;Allocated&lt;/font&gt;&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;&lt;font color="#000000"&gt;57&lt;/font&gt;&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt; &lt;/p&gt;&lt;p&gt;Because the count is now zero, the code destroys the B1 block. (Note that the &lt;em&gt;Header&lt;/em&gt; count decrement is atomic and only one thread can actually reach the zero.) 
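&lt;/p&gt;&lt;p&gt;Why exactly one thread sees the zero can be illustrated with a minimal, single-threaded sketch. It assumes a Delphi 2009/2010-era compiler whose Windows unit exports InterlockedDecrement (the call used in the Dequeue code); the program name, the slotCount variable and the printed messages are hypothetical and are not part of GpLockFreeQueue.&lt;/p&gt;&lt;pre class="pas-source"&gt;program HeaderCountSketch;&lt;br /&gt;{$APPTYPE CONSOLE}&lt;br /&gt;uses&lt;br /&gt;  Windows;&lt;br /&gt;var&lt;br /&gt;  slotCount: integer;&lt;br /&gt;  i        : integer;&lt;br /&gt;begin&lt;br /&gt;  slotCount := 3; // hypothetical: three slots of the block are still reachable&lt;br /&gt;  for i := 1 to 3 do&lt;br /&gt;    // InterlockedDecrement returns the decremented value, so exactly one call observes 0&lt;br /&gt;    if InterlockedDecrement(slotCount) = 0 then&lt;br /&gt;      Writeln('call ', i, ': count is zero, release the block')&lt;br /&gt;    else&lt;br /&gt;      Writeln('call ', i, ': block is still in use');&lt;br /&gt;end.&lt;/pre&gt;&lt;p&gt;InterlockedDecrement returns the value it has just written, so even when several threads decrement the same counter concurrently, each of them receives a different result and only the last one receives 0. That thread alone is responsible for releasing the block.&lt;/p&gt;&lt;p&gt;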
While the block is being destroyed, other threads may be calling Dequeue.&lt;/p&gt;&lt;p class="example"&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;Head:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B2:3&lt;/em&gt;|&lt;em&gt;Free&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strong&gt;Tail:&lt;/strong&gt;[&lt;font color="#000000"&gt;&lt;em&gt;B2:1&lt;/em&gt;|&lt;em&gt;Sentinel&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;strike&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;&lt;strong&gt;&lt;font color="#008000"&gt;0&lt;/font&gt;&lt;/strong&gt;&lt;/em&gt;] [&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;42&lt;/em&gt;&lt;/font&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;17&lt;/em&gt;] [&lt;font color="#000000"&gt;&lt;em&gt;BlockPointer&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;B2:1&lt;/em&gt;&lt;/font&gt;] &lt;br/&gt;&lt;/strike&gt;&lt;strong&gt;B2:&lt;/strong&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Sentinel&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;&lt;font color="#000000"&gt;Allocated&lt;/font&gt;&lt;/em&gt;|&lt;em&gt;2&lt;/em&gt;|&lt;em&gt;&lt;font color="#000000"&gt;57&lt;/font&gt;&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;3&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;4&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt; &lt;/p&gt;&lt;p&gt;Because the &lt;strong&gt;tail&lt;/strong&gt; tag was originally &lt;em&gt;BlockPointer&lt;/em&gt;, the code retries immediately and continues with scenario 1.&lt;/p&gt;&lt;p&gt;The actual code is tricky because some code paths are shared between scenarios (code taken from GpLockFreeQueue).&lt;/p&gt;&lt;!-- Highlighted Pascal code generated by 
DelphiDabbler PasH --&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; TGpLockFreeQueue.Dequeue(&lt;span class="pas-kwd"&gt;var&lt;/span&gt; value: int64): boolean;&lt;br /&gt;&lt;span class="pas-kwd"&gt;var&lt;/span&gt;&lt;br /&gt;  caughtTheHead: boolean;&lt;br /&gt;  tail         : PGpLFQueueTaggedValue;&lt;br /&gt;  header       : PGpLFQueueTaggedValue;&lt;br /&gt;  next         : PGpLFQueueTaggedValue;&lt;br /&gt;  tag          : TGpLFQueueTag;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  tag := tagSentinel;&lt;br /&gt;  Result := true;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;while&lt;/span&gt; Result &lt;span class="pas-kwd"&gt;and&lt;/span&gt; (tag = tagSentinel) &lt;span class="pas-kwd"&gt;do&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;repeat&lt;/span&gt;&lt;br /&gt;      tail := obcTailPointer.Slot;&lt;br /&gt;      caughtTheHead := NextSlot(obcTailPointer.Slot) = obcHeadPointer.Slot; &lt;br /&gt;      &lt;span class="pas-kwd"&gt;if&lt;/span&gt; (obcTailPointer.Tag = tagAllocated)&lt;br /&gt;         &lt;span class="pas-kwd"&gt;and&lt;/span&gt; CAS64(tail, Ord(tagAllocated), tail, Ord(tagRemoving), obcTailPointer^) &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;        tag := tagAllocated;&lt;br /&gt;        break; &lt;span class="pas-comment"&gt;//repeat&lt;/span&gt;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;end&lt;/span&gt;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;else&lt;/span&gt; &lt;span class="pas-kwd"&gt;if&lt;/span&gt; (obcTailPointer.Tag = tagSentinel) &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;if&lt;/span&gt; caughtTheHead &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;          Result := 
false;&lt;br /&gt;          break; &lt;span class="pas-comment"&gt;//repeat&lt;/span&gt;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;end&lt;/span&gt;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;else&lt;/span&gt; &lt;span class="pas-kwd"&gt;if&lt;/span&gt; CAS64(tail, Ord(tagSentinel), tail, Ord(tagRemoving), obcTailPointer^) &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;          tag := tagSentinel;&lt;br /&gt;          break; &lt;span class="pas-comment"&gt;//repeat&lt;/span&gt;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;end&lt;/span&gt;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;end&lt;/span&gt;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;else&lt;/span&gt; &lt;span class="pas-kwd"&gt;if&lt;/span&gt; (obcTailPointer.Tag = tagBlockPointer)&lt;br /&gt;              &lt;span class="pas-kwd"&gt;and&lt;/span&gt; CAS64(tail, Ord(tagBlockPointer), tail, Ord(tagDestroying), obcTailPointer^) &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;        tag := tagBlockPointer;&lt;br /&gt;        break; &lt;span class="pas-comment"&gt;//repeat&lt;/span&gt;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;end&lt;/span&gt;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;else&lt;/span&gt;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;asm&lt;/span&gt; &lt;span class="pas-asm"&gt;pause&lt;/span&gt;&lt;span class="pas-asm"&gt;;&lt;/span&gt; &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;until&lt;/span&gt; false;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;if&lt;/span&gt; Result &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt; &lt;span class="pas-comment"&gt;// dequeueing&lt;/span&gt;&lt;br /&gt;      header := tail;&lt;br /&gt;      Dec(header, header.Offset);&lt;br /&gt;      &lt;span class="pas-kwd"&gt;if&lt;/span&gt; tag &lt;span class="pas-kwd"&gt;in&lt;/span&gt; 
[tagSentinel, tagAllocated] &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;        next := NextSlot(tail);&lt;br /&gt;        &lt;span class="pas-kwd"&gt;if&lt;/span&gt; tag = tagAllocated &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-comment"&gt;// sentinel doesn't contain any useful value&lt;/span&gt;&lt;br /&gt;          value := tail.Value;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;if&lt;/span&gt; caughtTheHead &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin &lt;font color="#000000"&gt; &lt;/font&gt;&lt;span class="pas-comment"&gt;// release the lock; as this is the last element, don't move forward&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;          Move64(tail, Ord(tagSentinel), obcTailPointer^);&lt;br /&gt;          header := &lt;span class="pas-kwd"&gt;nil&lt;/span&gt;; &lt;span class="pas-comment"&gt;// do NOT decrement the counter; this slot will be retagged again&lt;/span&gt;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;end&lt;/span&gt;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;else&lt;/span&gt;&lt;br /&gt;          Move64(next, Ord(next.Tag), obcTailPointer^); &lt;span class="pas-comment"&gt;// release the lock&lt;/span&gt;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;end&lt;/span&gt;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;else&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt; &lt;span class="pas-comment"&gt;// releasing memory&lt;/span&gt;&lt;br /&gt;        next := PGpLFQueueTaggedValue(tail.Value); &lt;span class="pas-comment"&gt;// next points to the sentinel&lt;/span&gt;&lt;br /&gt;        Move64(next, Ord(tagSentinel), obcTailPointer^); &lt;span class="pas-comment"&gt;// release the lock&lt;/span&gt;&lt;br /&gt;        tag := tagSentinel; &lt;span class="pas-comment"&gt;// retry&lt;/span&gt;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;if&lt;/span&gt; 
assigned(header) &lt;span class="pas-kwd"&gt;and&lt;/span&gt; (InterlockedDecrement(PInteger(header)^) = &lt;span class="pas-num"&gt;0&lt;/span&gt;) &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;        ReleaseBlock(header);&lt;br /&gt;    &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;//while Result and (tag = tagSentinel)&lt;/span&gt;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TGpLockFreeQueue.Dequeue }&lt;/span&gt;&lt;/pre&gt;&lt;h2&gt;Memory management&lt;/h2&gt;&lt;p&gt;In the dynamic queue described above, special consideration goes to memory allocation and deallocation because most of the time that will be the slowest part of the enqueue/dequeue.&lt;/p&gt;&lt;p&gt;Memory is always released &lt;em&gt;after&lt;/em&gt; the queue &lt;strong&gt;tail&lt;em&gt; &lt;/em&gt;&lt;/strong&gt;is unlocked. That way, other threads may dequeue from the same queue while the thread is releasing the memory.&lt;/p&gt;&lt;p&gt;The allocation is trickier, because the Enqueue only knows that it will need the memory &lt;em&gt;after&lt;/em&gt; the &lt;strong&gt;head&lt;/strong&gt; is locked. The trick here is to use one preallocated memory block which is reused inside the Enqueue. This is much faster than calling the allocator. After the &lt;strong&gt;head&lt;/strong&gt; is unlocked, Enqueue preallocates next block of memory. This will slow down the current thread, but will not block other threads from enqueueing into the same queue.&lt;/p&gt;&lt;p&gt;Dequeue also tries to help with that. If the preallocated block is not present when a block must be released, Dequeue will store the released block away for the next Enqueue to use.&lt;/p&gt;&lt;p&gt;Also, there's one such block preallocated when the queue is initially created.&lt;/p&gt;&lt;p&gt;If this explanation is unclear, look at the program flow below. 
It describes the flow through an Enqueue that has to allocate a memory block and through a Dequeue that has to release one. Identifiers in parentheses name the methods listed below.&lt;/p&gt;&lt;p&gt;Enqueue:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;lock the head &lt;/li&gt;&lt;li&gt;detect &lt;em&gt;EndOfList&lt;/em&gt; &lt;/li&gt;&lt;li&gt;use the cached block if available, otherwise allocate a new block (AllocateBlock) &lt;/li&gt;&lt;li&gt;unlock the head &lt;/li&gt;&lt;li&gt;if there is no cached block, allocate a new block and store it away (PreallocateMemory) &lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Dequeue:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;lock the tail &lt;/li&gt;&lt;li&gt;process the last slot in the block &lt;/li&gt;&lt;li&gt;unlock the tail &lt;/li&gt;&lt;li&gt;decrement the header count &lt;/li&gt;&lt;li&gt;as the header count has dropped to zero: &lt;ul&gt;&lt;li&gt;if the cached block is empty, store this one away (ReleaseBlock) &lt;/li&gt;      &lt;li&gt;otherwise release the block &lt;/li&gt;    &lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;All manipulations of the cached block are done atomically. All allocations are optimistic – if the preallocated block is empty, a new memory block is allocated and partitioned, and only then does the code try to swap it into the preallocated block variable. If the compare-and-swap fails at this point, another thread went through the same routine, just slightly faster, and the freshly allocated (and partitioned) block is thrown away. 
It may look as if quite some work is done in vain, but in reality the preallocated block is rarely thrown away.&lt;/p&gt;&lt;p&gt;I tested other, more complicated schemes (for example, a small 4-slot stack), but they invariably behaved worse than this simple approach.&lt;/p&gt;&lt;!-- Highlighted Pascal code generated by DelphiDabbler PasH --&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; TGpLockFreeQueue.AllocateBlock: PGpLFQueueTaggedValue;&lt;br /&gt;&lt;span class="pas-kwd"&gt;var&lt;/span&gt;&lt;br /&gt;  cached: PGpLFQueueTaggedValue;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  cached := obcCachedBlock;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;if&lt;/span&gt; assigned(cached) &lt;span class="pas-kwd"&gt;and&lt;/span&gt; CAS32(cached, &lt;span class="pas-kwd"&gt;nil&lt;/span&gt;, obcCachedBlock) &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;    Result := cached&lt;br /&gt;  &lt;span class="pas-kwd"&gt;else&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;    Result := AllocMem(obcBlockSize);&lt;br /&gt;    PartitionMemory(Result);&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TGpLockFreeQueue.AllocateBlock }&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="pas-kwd"&gt;&lt;br /&gt;procedure&lt;/span&gt; TGpLockFreeQueue.PreallocateMemory;&lt;br /&gt;&lt;span class="pas-kwd"&gt;var&lt;/span&gt;&lt;br /&gt;  memory: PGpLFQueueTaggedValue;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;if&lt;/span&gt; &lt;span class="pas-kwd"&gt;not&lt;/span&gt; assigned(obcCachedBlock) &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;    memory := AllocMem(obcBlockSize);&lt;br /&gt;    PartitionMemory(memory);&lt;br /&gt;    &lt;span class="pas-kwd"&gt;if&lt;/span&gt; &lt;span 
class="pas-kwd"&gt;not&lt;/span&gt; CAS32(&lt;span class="pas-kwd"&gt;nil&lt;/span&gt;, memory, obcCachedBlock) &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;      FreeMem(memory);&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TGpLockFreeQueue.PreallocateMemory }&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; TGpLockFreeQueue.ReleaseBlock(firstSlot: PGpLFQueueTaggedValue; forceFree: boolean);&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;if&lt;/span&gt; forceFree &lt;span class="pas-kwd"&gt;or&lt;/span&gt; assigned(obcCachedBlock) &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;    FreeMem(firstSlot)&lt;br /&gt;  &lt;span class="pas-kwd"&gt;else&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;    ZeroMemory(firstSlot, obcBlockSize);&lt;br /&gt;    PartitionMemory(firstSlot);&lt;br /&gt;    &lt;span class="pas-kwd"&gt;if&lt;/span&gt; &lt;span class="pas-kwd"&gt;not&lt;/span&gt; CAS32(&lt;span class="pas-kwd"&gt;nil&lt;/span&gt;, firstSlot, obcCachedBlock) &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;      FreeMem(firstSlot);&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TGpLockFreeQueue.ReleaseBlock }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;As you can see in the code fragments above, memory is also initialized (formatted into slots) when memory is allocated. 
This also helps with the general performance.&lt;/p&gt;&lt;h2&gt;Performance&lt;/h2&gt;&lt;p&gt;Tests were again performed using the 32_Queue project in the Tests branch of the OTL tree.&lt;/p&gt;&lt;p&gt;The test framework sets up the following data path:&lt;/p&gt;&lt;p class="example"&gt;source queue –&amp;gt; N threads –&amp;gt; channel queue –&amp;gt; M threads –&amp;gt; destination queue&lt;/p&gt;&lt;p&gt;Source queue is filled with numbers from 1 to 1.000.000. Then 1 to 8 threads are set up to read from the source queue and write into the channel queue and another 1 to 8 threads are set up to read from the channel queue and write to the destination queue. Application then starts the clock and starts all threads. When all numbers are moved to the destination queue, clock is stopped and contents of the destination queue are verified. Thread creation time is not included in the measured time.&lt;/p&gt;&lt;p&gt;All in all this results in 2 million reads and 2 million writes distributed over three queues. Tests are very brutal as all threads are just hammering on the queues, doing nothing else. The table below contains average, min and max time of 5 runs on a 2.67 GHz computer with two 4-core CPUs. Data from the current implementation (&amp;quot;new code&amp;quot;) is compared to the &lt;a href="http://17slon.com/blogs/gabr/2010/01/three-steps-to-blocking-collection-2.html"&gt;original implementation&lt;/a&gt; (&amp;quot;old code&amp;quot;). 
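&lt;/p&gt;&lt;p&gt;The "millions of queue operations per second" column in the table below can be derived from the measured time: each run performs 4 million queue operations in total, so the throughput is 4,000,000 divided by the elapsed time. A minimal sketch of that arithmetic follows; the program, constant and function names are hypothetical and are not part of the 32_Queue test project.&lt;/p&gt;&lt;pre class="pas-source"&gt;program ThroughputSketch;&lt;br /&gt;{$APPTYPE CONSOLE}&lt;br /&gt;uses&lt;br /&gt;  SysUtils;&lt;br /&gt;const&lt;br /&gt;  CNumOperations = 4000000; // 2 million reads + 2 million writes&lt;br /&gt;&lt;br /&gt;// Converts a measured time in milliseconds into millions of operations per second.&lt;br /&gt;function MillionOpsPerSecond(elapsed_ms: integer): double;&lt;br /&gt;begin&lt;br /&gt;  Result := CNumOperations / elapsed_ms / 1000;&lt;br /&gt;end;&lt;br /&gt;&lt;br /&gt;begin&lt;br /&gt;  Writeln(Format('%.2f', [MillionOpsPerSecond(590)])); // N = 1, M = 1, new code&lt;br /&gt;  Writeln(Format('%.2f', [MillionOpsPerSecond(707)])); // N = 1, M = 1, old code&lt;br /&gt;end.&lt;/pre&gt;&lt;p&gt;For the 590 ms average this gives 4,000,000 / 0.590 s, or roughly 6.78 million operations per second, which matches the first row of the table; the 707 ms average of the old code works out to roughly 5.66 million.&lt;/p&gt;&lt;p&gt;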
Best times are marked &lt;span class="style5"&gt;green&lt;/span&gt;.&lt;/p&gt;&lt;div align="center"&gt;&lt;table border="0" cellspacing="0" cellpadding="2" width="800" align="center"&gt;&lt;tbody&gt; &lt;tr&gt;        &lt;td class="style3" valign="top" width="158"&gt;&amp;#160;&lt;/td&gt;        &lt;td class="style4" valign="top" width="235" colspan="2"&gt;&lt;strong&gt;New code&lt;/strong&gt;&lt;/td&gt;        &lt;td class="style4" valign="top" width="205" colspan="2"&gt;&lt;strong&gt;Old code&lt;/strong&gt;&lt;/td&gt;      &lt;/tr&gt;      &lt;tr&gt;        &lt;td class="style3" valign="top" width="158"&gt;&amp;#160;&lt;/td&gt;        &lt;td class="style3" valign="top" width="235"&gt;&lt;strong&gt;average [min-max] &lt;br/&gt;&lt;/strong&gt;all data in milliseconds&lt;/td&gt;        &lt;td class="style3" valign="top" width="205"&gt;&lt;strong&gt;millions of queue operations per second&lt;/strong&gt;&lt;/td&gt;        &lt;td class="style3" valign="top" width="205"&gt;&lt;strong&gt;average [min-max]&lt;br/&gt;&lt;/strong&gt;all data in milliseconds&lt;/td&gt;        &lt;td class="style3" valign="top" width="205"&gt;&lt;strong&gt;millions of queue operations per second&lt;/strong&gt;&lt;/td&gt;      &lt;/tr&gt;      &lt;tr&gt;        &lt;td class="style3" valign="top" width="159"&gt;N = 1, M = 1&lt;/td&gt;        &lt;td class="style3" valign="top" width="235"&gt;590 [559 – 682]&lt;/td&gt;        &lt;td class="style6" valign="top" width="204"&gt;6.78&lt;/td&gt;        &lt;td class="style3" valign="top" width="204"&gt;707 [566-834]&lt;/td&gt;        &lt;td class="style3" valign="top" width="204"&gt;5.66&lt;/td&gt;      &lt;/tr&gt;      &lt;tr&gt;        &lt;td class="style3" valign="top" width="159"&gt;N = 2, M = 2&lt;/td&gt;        &lt;td class="style3" valign="top" width="235"&gt;838 [758 – 910]&lt;/td&gt;        &lt;td class="style6" valign="top" width="204"&gt;4.77&lt;/td&gt;        &lt;td class="style3" valign="top" width="204"&gt;996 [950-1031]&lt;/td&gt;       
 &lt;td class="style3" valign="top" width="204"&gt;4.02&lt;/td&gt;      &lt;/tr&gt;      &lt;tr&gt;        &lt;td class="style3" valign="top" width="159"&gt;N = 3, M = 3&lt;/td&gt;        &lt;td class="style3" valign="top" width="235"&gt;1095 [1054 – 1173]&lt;/td&gt;        &lt;td class="style3" valign="top" width="204"&gt;3.65&lt;/td&gt;        &lt;td class="style3" valign="top" width="204"&gt;1065 [1055-1074]&lt;/td&gt;        &lt;td class="style6" valign="top" width="204"&gt;3.76&lt;/td&gt;      &lt;/tr&gt;      &lt;tr&gt;        &lt;td class="style3" valign="top" width="159"&gt;N = 4, M = 4&lt;/td&gt;        &lt;td class="style3" valign="top" width="235"&gt;1439 [1294 – 1535]&lt;/td&gt;        &lt;td class="style3" valign="top" width="204"&gt;2.78&lt;/td&gt;        &lt;td class="style3" valign="top" width="204"&gt;1313 [1247-1358]&lt;/td&gt;        &lt;td class="style6" valign="top" width="204"&gt;3.04&lt;/td&gt;      &lt;/tr&gt;      &lt;tr&gt;        &lt;td class="style3" valign="top" width="159"&gt;N = 8, M = 8&lt;/td&gt;        &lt;td class="style3" valign="top" width="235"&gt;1674 [1303 – 2217]&lt;/td&gt;        &lt;td class="style3" valign="top" width="204"&gt;2.39&lt;/td&gt;        &lt;td class="style3" valign="top" width="204"&gt;1520 [1482-1574]&lt;/td&gt;        &lt;td class="style6" valign="top" width="204"&gt;2.63&lt;/td&gt;      &lt;/tr&gt;      &lt;tr&gt;        &lt;td class="style3" valign="top" width="159"&gt;N = 1, M = 7&lt;/td&gt;        &lt;td class="style3" valign="top" width="235"&gt;1619 [1528 – 1822]&lt;/td&gt;        &lt;td class="style6" valign="top" width="204"&gt;2.47&lt;/td&gt;        &lt;td class="style3" valign="top" width="204"&gt;3880 [3559-4152]&lt;/td&gt;        &lt;td class="style3" valign="top" width="204"&gt;1.03&lt;/td&gt;      &lt;/tr&gt;      &lt;tr&gt;        &lt;td class="style3" valign="top" width="159"&gt;N = 7, M = 1&lt;/td&gt;        &lt;td class="style3" valign="top" width="235"&gt;1525 [1262 – 1724]&lt;/td&gt;     
   &lt;td class="style3" valign="top" width="204"&gt;2.62&lt;/td&gt;        &lt;td class="style3" valign="top" width="204"&gt;1314 [1299-1358]&lt;/td&gt;        &lt;td class="style6" valign="top" width="204"&gt;3.04&lt;/td&gt;      &lt;/tr&gt;    &lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;/div&gt;&lt;p&gt;The new implementation is faster when fewer threads are used and slightly slower when the number of threads increases. The best thing is that there is no weird speed drop in the N = 1, M = 7 case. The small slowdown with a higher number of threads doesn't bother me much, as this test case really stresses the queue. In any practical application there should be much more code doing real work, and the load on the queue would drop rapidly. &lt;/p&gt;&lt;p&gt;If your code depends on accessing a shared queue from many threads that enqueue/dequeue most of the time, there's a simple solution - change the code! I believe that multithreaded code should not fight over every piece of data, but cooperate. A possible solution is to split the data into packets and schedule the packets to the shared queue. Each thread would then dequeue one packet and process all the data stored within.&lt;/p&gt;&lt;h2&gt;Wrapup&lt;/h2&gt;&lt;p&gt;The code will be released in OmniThreadLibrary 1.5 (but you can use it already if you fetch the HEAD from the &lt;a href="http://code.google.com/p/omnithreadlibrary/source/checkout" target="_blank"&gt;SVN&lt;/a&gt;). It passed a very rigorous stress test and I believe it is working. If you find any problems, please let me know. 
I’m also interested in any ports to different languages (a C version would be nice).&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-3472550202464156890?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/3472550202464156890/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/dynamic-lock-free-queue-doing-it-right.html#comment-form' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/3472550202464156890'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/3472550202464156890'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/dynamic-lock-free-queue-doing-it-right.html' title='Dynamic lock-free queue – doing it right'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-7147081862518285190</id><published>2010-02-10T20:07:00.001+01:00</published><updated>2010-02-18T16:18:38.613+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='OmniThreadLibrary'/><category 
scheme='http://www.blogger.com/atom/ns#' term='multithreading'/><category scheme='http://www.blogger.com/atom/ns#' term='source code'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>Bypassing the ABA problem</title><content type='html'>&lt;p&gt;On Saturday my wife and I went to a classical music concert. Although I like this kind of music, the particular combination of instruments (harp and violin) didn’t really interest me that much, especially when they played modernist Slovenian composers. [Debussy, on the other hand, was superb.]&lt;/p&gt;  &lt;p&gt;Anyway, I got immersed in the music, half of my brain switched off, and I got all sorts of weird programming ideas. The first was how to solve the ABA problem in the &lt;a href="http://17slon.com/blogs/gabr/2010/01/three-steps-to-blocking-collection-2.html" target="_blank"&gt;initial dynamic queue implementation&lt;/a&gt;. [Total failure, that idea, it didn’t work at all.] The second, however, proved to be very useful as it &lt;a href="http://17slon.com/blogs/gabr/2010/02/releasing-queue-memory-without-mrew.html" target="_blank"&gt;solved the memory release problem&lt;/a&gt; (provided that ABA gets fixed, of course).&lt;/p&gt;  &lt;p&gt;The new memory release scheme brought with it a new strength of will. If I could solve that one, then maybe, just maybe, I could also solve the ABA problem, I thought to myself and returned to the code. And then it dawned on me …&lt;/p&gt;  &lt;p&gt;The problem with the initial implementation was that the head/tail pointer and the corresponding tag were accessed in two separate steps, not atomically. In a multithreaded environment, that is always a problem. Somehow I had to modify them at the same time, but that didn’t look feasible as the tag was living in the dynamically allocated block and the head/tail pointer was stored in the object itself. I couldn’t put the head/tail into the block, but I could put a tag next to the head/tail pointer! 
[A copy of the tag, actually, as I still needed the tags to be stored in the data block.] Then I could use an 8-byte compare-and-swap to change both the pointer and the tag at the same time!&lt;/p&gt;  &lt;p&gt;There was one problem, though. In the initial implementation, the tail was allowed to catch the head. If that happened with the new scheme, both head and tail pointers would be the same (and pointing to a &lt;em&gt;tagFree&lt;/em&gt; slot) but the first enqueue operation would only modify the head tag, although the tail tag would in reality also change! It seemed like I was just pushing the ABA problem from place to place :(&lt;/p&gt;  &lt;p&gt;Still, there is a simple (at least for some values of that word) solution to such problems – introduce the sentinel. This is a special element signifying that some pointer (tail, in my case) has reached the end of the list. A good idea, but could it be made to work?&lt;/p&gt;  &lt;p&gt;I fired up my trusty spreadsheet (very good stuff for simulations) and in a few hours I had the basic plan laid out.&lt;/p&gt;  &lt;p&gt;&lt;a href="http://17slon.com/blogs/gabr/files/BypassingtheABAproblem_A5A8/image.png"&gt;&lt;img style="border-bottom: 0px; border-left: 0px; display: block; float: none; margin-left: auto; border-top: 0px; margin-right: auto; border-right: 0px" title="image" border="0" alt="image" src="http://17slon.com/blogs/gabr/files/BypassingtheABAproblem_A5A8/image_thumb.png" width="484" height="260" /&gt;&lt;/a&gt;&lt;/p&gt;  &lt;p align="center"&gt;[Yes, that’s the picture from &lt;a href="http://17slon.com/blogs/gabr/2010/02/aba-problem.html" target="_blank"&gt;yesterday’s teaser&lt;/a&gt;.]&lt;/p&gt;  &lt;p&gt;Converting this to code was really simple. After fixing a few bugs, I had the new queue running, faster than ever before!&lt;/p&gt;  &lt;p&gt;I’ll put together a long article describing all the tricks inside the new dynamic queue, but that will take some time, sorry. 
In the meantime, you can check out the &lt;a href="http://code.google.com/p/omnithreadlibrary/source/browse/trunk/OtlContainers.pas" target="_blank"&gt;current OtlContainers&lt;/a&gt; and read the pseudo-code documentation.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;TOmniQueue        &lt;br /&gt;=============== &lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;tags:        &lt;br /&gt;&amp;#160; tagFree         &lt;br /&gt;&amp;#160; tagAllocating         &lt;br /&gt;&amp;#160; tagAllocated         &lt;br /&gt;&amp;#160; tagRemoving         &lt;br /&gt;&amp;#160; tagEndOfList         &lt;br /&gt;&amp;#160; tagExtending         &lt;br /&gt;&amp;#160; tagBlockPointer         &lt;br /&gt;&amp;#160; tagDestroying         &lt;br /&gt;&amp;#160; tagHeader         &lt;br /&gt;&amp;#160; tagSentinel &lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;header contains:        &lt;br /&gt;&amp;#160; head         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; slot = 4 bytes         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; tag&amp;#160; = 4 bytes         &lt;br /&gt;&amp;#160; tail         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; slot = 4 bytes         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; tag&amp;#160; = 4 bytes         &lt;br /&gt;all are 4-aligned &lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;slot contains:        &lt;br /&gt;&amp;#160; TOmniValue = 13 bytes         &lt;br /&gt;&amp;#160; tag&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; = 1 byte         &lt;br /&gt;&amp;#160; offset&amp;#160;&amp;#160;&amp;#160;&amp;#160; = 2 bytes         &lt;br /&gt;TOmniValues are 4-aligned &lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;block is initialized to:        &lt;br /&gt;[tagHeader, num slots - 1, 0] [tagSentinel, 0, 1] [tagFree, 0, 2] .. 
[tagFree, 0, num slots - 2] [tagEndOfList, 0, num slots - 1] &lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;Enqueue:        &lt;br /&gt;&amp;#160; repeat         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; tail = header.tail.slot         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; old_tag = header.tail.tag         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; if header.tail.CAS(tail, tagFree, tail, tagAllocating) then         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; tail.tag = tagAllocating         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; break         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; else if header.tail.CAS(tail, tagEndOfList, tail, tagExtending) then         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; tail.tag = tagExtending         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; break         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; else         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; yield         &lt;br /&gt;&amp;#160; forever         &lt;br /&gt;&amp;#160; if old_tag = tagFree then         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; store &amp;lt;value, tagAllocated&amp;gt; into slot         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; header.tail.CAS(tail, tagAllocating, NextSlot(tail), NextSlot(tail).tag)         &lt;br /&gt;&amp;#160; else         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; allocate block // from cache, if possible         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; next = second data slot in the new block         &lt;br 
/&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; set next to &amp;lt;tagAllocated, value&amp;gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; set last slot in the original block to &amp;lt;new block address, tagBlockPointer&amp;gt;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; header.tail.CAS(tail, tagExtending, next, next.tag)         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; // queue is now unlocked         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; preallocate block &lt;/font&gt;&lt;/p&gt;    &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;Dequeue:        &lt;br /&gt;&amp;#160; repeat         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; if header.head.tag = tagFree then         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; return false         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; head = header.head.slot         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; old_tag = header.head.tag         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; caughtTheTail := NextSlot(header.head.slot) = header.tail.slot;         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; if header.head.CAS(head, tagAllocated, head, tagRemoving) then         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; head.tag = tagRemoving         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; break         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; else if header.head.tag = tagSentinel then         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; if caughtTheTail then         &lt;br 
/&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; return false         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; else if header.head.CAS(head, tagSentinel, head, tagRemoving) then         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; head.tag = tagRemoving         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; break         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; else if header.head.CAS(head, tagBlockPointer, head, tagDestroying) then         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; head.tag = tagDestroying         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; break         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; else         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; yield         &lt;br /&gt;&amp;#160; forever         &lt;br /&gt;&amp;#160; firstSlot = head - head.Offset // point to first slot         &lt;br /&gt;&amp;#160; if old_tag in [tagSentinel, tagAllocated] then         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; next = NextSlot(head)         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; if old_tag = tagAllocated then         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; fetch stored value         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; if caughtTheTail then         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; header.head.CAS(head, 
tagRemoving, head, tagSentinel)         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; firstSlot = nil // do not decrement the header counter         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; else         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; header.head.CAS(head, tagRemoving, next, next.tag)         &lt;br /&gt;&amp;#160; else         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; next = head.value // points to the next block's sentinel         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; header.head.CAS(head, tagDestroying, next, tagSentinel)         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; old_tag = tagSentinel // force retry         &lt;br /&gt;&amp;#160; // queue is now unlocked         &lt;br /&gt;&amp;#160; if assigned(firstSlot) and (InterlockedDecrement(firstSlot.value) = 0) then         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; release block         &lt;br /&gt;&amp;#160; if old_tag = tagSentinel         &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; retry from beginning&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;div class="blogger-post-footer"&gt;&lt;font size="-2"&gt;---     &lt;br /&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://www.blogger.com/feeds/29331675/7147081862518285190/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/bypassing-aba-problem.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/7147081862518285190'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/7147081862518285190'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/bypassing-aba-problem.html' title='Bypassing the ABA problem'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-4238479547675882195</id><published>2010-02-10T08:38:00.001+01:00</published><updated>2010-02-10T08:38:33.818+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='not programming'/><title type='text'>Technical problems</title><content type='html'>&lt;p&gt;For the last two days, the Delphi Geek was down due to a faulty hard drive and a RAID that didn’t want to rebuild itself when a new drive was inserted :(&lt;/p&gt;  &lt;p&gt;Data has been restored from the latest backup, the last two articles have been reposted, and now everything should be in order. 
If you find any problems, please notify me in comments.&lt;/p&gt;  &lt;p&gt;I apologize for the inconvenience.&lt;/p&gt;  &lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-4238479547675882195?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/4238479547675882195/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/technical-problems.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/4238479547675882195'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/4238479547675882195'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/technical-problems.html' title='Technical problems'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-4571528880743854301</id><published>2010-02-07T18:44:00.002+01:00</published><updated>2010-02-07T18:47:37.736+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='multithreading'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>The ABA problem</title><content type='html'>&lt;img style="border-right-width: 0px; display: inline; border-top-width: 0px; 
border-bottom-width: 0px; margin-left: 0px; border-left-width: 0px; margin-right: 0px" title="queue" border="0" alt="queue" align="right" src="http://17slon.com/blogs/gabr/files/Thelastproblem_10599/queue.png" width="244" height="129" /&gt;   &lt;p&gt;I have discovered a truly remarkable solution which this margin is too small to contain.&lt;/p&gt;  &lt;p&gt;&lt;img alt="Devil" src="http://us.i1.yimg.com/us.yimg.com/i/mesg/emoticons7/19.gif" /&gt;&lt;/p&gt;  &lt;p&gt;Jokes aside, I’m running the stress test suite now and the results look good. The queue is even faster than before. If the test survives 12 hours, I’ll check the code into the SVN. (And then I’ll write the article. Promise.)&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-4571528880743854301?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/4571528880743854301/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/aba-problem.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/4571528880743854301'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/4571528880743854301'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/aba-problem.html' title='The ABA problem'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' 
value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-7060248330588069444</id><published>2010-02-07T13:12:00.003+01:00</published><updated>2010-02-18T16:19:52.842+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='multithreading'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>Releasing queue memory without the MREW lock</title><content type='html'>&lt;p&gt;I know how to implement a no-wait release in my &lt;a href="http://17slon.com/blogs/gabr/2010/01/three-steps-to-blocking-collection-2.html" target="_blank"&gt;dynamic queue&lt;/a&gt; &lt;strong&gt;if&lt;/strong&gt; somehow the ABA problem gets solved. (I also had some ideas on how to solve the ABA problem &lt;strong&gt;if&lt;/strong&gt;&lt;em&gt; &lt;/em&gt;the tail pointer never catches the head one. But that’s still very much in the design phase.)&lt;/p&gt;  &lt;p&gt;Each block gets a header element (with a &lt;em&gt;tagHeader&lt;/em&gt; tag). 
Each slot in the block uses the previously unused bytes (stuffing) to store its position (index) inside the block.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;[&lt;em&gt;Header&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;|&lt;em&gt;1023&lt;/em&gt;] [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;1&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;1022&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;1023&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;The number of not-yet-released slots is stored in the header’s value field.&lt;/p&gt;  &lt;p&gt;The second part of the Dequeue code is changed:&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;if tag = tagAllocated then       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; get value        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; increment tail        &lt;br /&gt;else        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; set tail to new block's slot 1        &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; get value        &lt;br /&gt;// new code that executes in all code paths:        &lt;br /&gt;interlocked decrement number of not-yet-released slots in the header        &lt;br /&gt;if the decrement resulted in value 0, release the block&lt;/font&gt;&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;So what is going on here?&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;The additional index in the previously unused bytes is used to quickly jump to the header slot.&lt;/li&gt;    &lt;li&gt;Decrement-and-test is executed last in the Dequeue code and we know that the code won’t reference the current block anymore. Therefore it is safe to release the block at this point.&lt;/li&gt;    &lt;li&gt;Every change (tagAllocated –&amp;gt; tagRemoving, tagBlockPointer –&amp;gt; tagDestroying) decrements this counter. 
Only when all tags are set to tagRemoving/tagDestroying is the block destroyed.&lt;/li&gt;    &lt;li&gt;We know that at this point no other thread may be referencing the block (because every other thread has already decremented the counter – otherwise the counter wouldn’t be 0 yet).&lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;I tested this approach by putting the initial tag switching (tagFree –&amp;gt; tagAllocating etc.) into a critical section, thus bypassing the ABA problem. It works, but the critical section really killed the performance when the number of threads got high (the N = 4, M = 4 case worked well, N = 8, M = 8 did not).&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-7060248330588069444?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/7060248330588069444/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/releasing-queue-memory-without-mrew.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/7060248330588069444'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/7060248330588069444'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/releasing-queue-memory-without-mrew.html' title='Releasing queue memory without the MREW lock'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' 
value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-1488081948103738418</id><published>2010-02-04T11:11:00.005+01:00</published><updated>2010-02-18T16:22:42.245+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='OmniThreadLibrary'/><category scheme='http://www.blogger.com/atom/ns#' term='multithreading'/><category scheme='http://www.blogger.com/atom/ns#' term='source code'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>Three steps to the blocking collection: [2] Dynamically allocated queue</title><content type='html'>&lt;p&gt;[Step one: &lt;a href="http://17slon.com/blogs/gabr/2010/01/three-steps-to-blocking-collection-1.html" target="_blank"&gt;Inverse semaphore&lt;/a&gt;.]&lt;/p&gt;  &lt;p&gt;When I started thinking about the blocking collection internals (see &lt;a href="http://17slon.com/blogs/gabr/2010/01/three-steps-to-blocking-collection-1.html" target="_blank"&gt;step one&lt;/a&gt;) two facts became fairly obvious:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;The underlying data storage should be some kind of a queue. The blocking collection only needs the data storage to support enqueue (Add) and dequeue (Take). &lt;/li&gt;    &lt;li&gt;The underlying data storage should be allocated dynamically. Users will be using the blocking collection on structures for which the size cannot be determined quickly (trees, for example) and therefore the code cannot preallocate »just big enough« data storage. &lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;I was all set on using a locking queue for the data storage but then I had an idea about how to implement a dynamically allocated queue with microlocking. 
Instead of locking the whole queue, each thread would lock only one element, and for such a short time that competing threads would simply busy-wait (spin in a tight loop).&lt;/p&gt;  &lt;p&gt;Then the usual thing happened. I ran into the &lt;a href="http://en.wikipedia.org/wiki/ABA_problem" target="_blank"&gt;ABA problem&lt;/a&gt;. And I solved it – in a way. The queue works, behaves well and is extremely useful. It’s just not as perfect as I thought it would be.&lt;/p&gt;  &lt;p&gt;So here it is - a microlocking, (mostly) O(1) insert/remove, dynamically allocated queue with a garbage collector. All yours for a measly 16 bytes per unit of data. Hey, you have to pay the price at some point!&lt;/p&gt;  &lt;p&gt;[I “discovered” this approach all by myself. That doesn’t mean that this is original work; most probably this is just a variation of some well-known method. If you know of any similar approach, practical or only theoretical, please post the link in the comments.]&lt;/p&gt;  &lt;h2&gt;Tagged elements&lt;/h2&gt;  &lt;p&gt;The basic queue element is made of two parts, a &lt;em&gt;tag &lt;/em&gt;and a &lt;em&gt;value&lt;/em&gt;. As this implementation is to be used in the OmniThreadLibrary framework, the value is represented by a TOmniValue record. This record can handle almost anything from a &lt;em&gt;byte&lt;/em&gt; to an &lt;em&gt;int64&lt;/em&gt;, and can also store interfaces. The only downside is that it uses 13 bytes of memory.&lt;/p&gt;  &lt;p&gt;As the &lt;em&gt;tag&lt;/em&gt; uses only one byte and 1+13 = 14, which is not an elegant value, a queue element also contains two unused bytes. That rounds its size to a pretty 16 bytes. [I’m joking, of course. There is a very good reason why the size must be divisible by 4. 
I’ll come back to that later.]&lt;/p&gt;  &lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;type&lt;/span&gt;&lt;br /&gt;  TOmniQueueTag = (tagFree, tagAllocating, tagAllocated, tagRemoving, &lt;br /&gt;    tagEndOfList, tagExtending, tagBlockPointer, tagDestroying);&lt;br /&gt;  TOmniTaggedValue = &lt;span class="pas-kwd"&gt;packed&lt;/span&gt; &lt;span class="pas-kwd"&gt;record&lt;/span&gt; &lt;br /&gt;    Tag     : TOmniQueueTag; &lt;br /&gt;    Stuffing: word; &lt;br /&gt;    Value   : TOmniValue; &lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;/pre&gt;&lt;p&gt;In the following explanation, I’ll use the shorthand &lt;font face="Consolas, Courier New"&gt;[&lt;em&gt;tag&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;]&lt;/font&gt; to represent an instance of the TOmniTaggedValue record.&lt;/p&gt;&lt;p&gt;Queue data is managed in &lt;em&gt;blocks&lt;/em&gt;. The size of a block is 64 KB. Divide this by 16 and you’ll find that a block contains 4096 elements. Upon allocation, each block is formatted as&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;In other words, a block is mostly initialized to zero (as Ord(tagFree) = 0). 
When the queue object is created, one block is allocated and both &lt;em&gt;tail &lt;/em&gt;and &lt;em&gt;head&lt;/em&gt; pointers point to the first element.&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;&lt;strong&gt;H&lt;/strong&gt;:&lt;strong&gt;T&lt;/strong&gt;:[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;h2&gt;Enqueue&lt;/h2&gt;&lt;p&gt;The first thing Enqueue does is to lock the head element. To do this it first checks if &lt;em&gt;head&lt;/em&gt; points to a &lt;em&gt;tagFree&lt;/em&gt; or &lt;em&gt;tagEndOfList&lt;/em&gt;. If that’s not the case, another thread has just locked this element and the current thread must wait a little and retry.&lt;/p&gt;&lt;p&gt;Then it (atomically!) swaps the current tag value with either &lt;em&gt;tagAllocating &lt;/em&gt;(if the previous value was &lt;em&gt;tagFree&lt;/em&gt;) or &lt;em&gt;tagExtending&lt;/em&gt; (if it was &lt;em&gt;tagEndOfList&lt;/em&gt;). If this atomic swap fails, another thread has overtaken this one and the thread has to retry from the beginning.&lt;/p&gt;&lt;p&gt;But let’s assume that the tag was properly swapped. 
We now have the following situation:&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;&lt;strong&gt;H&lt;/strong&gt;:&lt;strong&gt;T&lt;/strong&gt;:[&lt;em&gt;Allocating&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;The thread then increments the head pointer to the next slot …&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;&lt;strong&gt;T&lt;/strong&gt;:[&lt;em&gt;Allocating&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] &lt;strong&gt;H&lt;/strong&gt;:[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;… and stores&amp;#160; &lt;font face="Consolas"&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;/font&gt;&lt;font face="Georgia"&gt;in the slot it has previously locked.&lt;/font&gt;&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;&lt;strong&gt;T&lt;/strong&gt;:[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;H&lt;/strong&gt;:[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;That completes the Enqueue.&lt;/p&gt;&lt;p&gt;In pseudocode:&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;repeat &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; fetch tag from current head &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; if tag = tagFree and CAS(tag, tagAllocating) then &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; break &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; if tag = tagEndOfList and 
CAS(tag, tagExtending) then &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; break &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; yield&amp;#160; &lt;br /&gt;forever&amp;#160; &lt;br /&gt;if tag = tagFree then&amp;#160; &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; increment head&amp;#160; &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; store (tagAllocated, value) into locked slot&amp;#160; &lt;br /&gt;else&amp;#160; &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; // ignore this for a moment&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;Let’s think about possible problems.&lt;/p&gt;&lt;ol&gt;&lt;br /&gt;  &lt;li&gt;Two (or more) threads can simultaneously find out that &lt;em&gt;head^.tag = tagFree&lt;/em&gt; and try to swap in &lt;em&gt;tagAllocating&lt;/em&gt;. As this is implemented using an atomic &lt;em&gt;compare-and-swap&lt;/em&gt;, one thread will succeed and the others will fail and retry. No problem here. &lt;/li&gt;&lt;br /&gt;  &lt;li&gt;The thread increments the &lt;em&gt;head&lt;/em&gt; pointer. At that moment it is suspended and another thread calls Enqueue. The second thread finds that the &lt;em&gt;head&lt;/em&gt; points to a free element and continues with execution. It’s possible that the second thread will finish the operation before the first thread is resumed and that for some short time the queue storage would look like this:&lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; &lt;font face="Consolas"&gt;&lt;strong&gt;T&lt;/strong&gt;:[&lt;em&gt;Allocating&lt;/em&gt;|0] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;H&lt;/strong&gt;:[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] &lt;br /&gt;&lt;/font&gt;&lt;font face="Georgia"&gt;Again, no problem here. 
Dequeue will take care of this situation.&lt;/font&gt; &lt;/li&gt;&lt;br /&gt;&lt;/ol&gt;&lt;p&gt;Another interesting situation occurs when &lt;em&gt;head&lt;/em&gt; is pointing to the last element in the block, the one with the &lt;em&gt;EndOfList&lt;/em&gt; tag. In this case, a new block is allocated and the current element is changed so that it points to the new block.&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;if tag = tagFree then &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; // we covered that already &lt;br /&gt;else // tag = tagEndOfList &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; allocate and initialize new block &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; set head to new block's slot 1 &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; store (tagAllocated, value) into new block's slot 0 &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; store (tagBlockPointer, pointer to new block) into locked slot&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;After that, memory is laid out as follows:&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;T&lt;/strong&gt;:[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] … [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] [&lt;em&gt;BlockPointer&lt;/em&gt;|&lt;em&gt;B2&lt;/em&gt;] &lt;br /&gt;&lt;/font&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;B2&lt;/strong&gt;:&lt;font face="Consolas, Courier New"&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;H&lt;/strong&gt;:[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/font&gt; &lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;Compare and Swap&lt;/h2&gt;&lt;p&gt;In the pseudo-code above I’ve used the CAS method as if it’s something that everybody in this world knows and loves, but probably it deserves some 
explanation.&lt;/p&gt;&lt;p&gt;CAS, or Compare-and-Swap, is an atomic function that compares some memory location with a value and puts in a new value if the memory location was equal to that value. Otherwise, it does nothing.&amp;#160; And the best thing is that all this behaviour executes &lt;em&gt;atomically&lt;/em&gt;. In other words – if two threads are attempting to CAS the same destination, only one of them will succeed.&lt;/p&gt;&lt;p&gt;In plain Delphi, CAS could be written as&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; CAS32(&lt;span class="pas-kwd"&gt;const&lt;/span&gt; oldValue, newValue: cardinal; destination: PCardinal): boolean;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-comment"&gt;{ cs is assumed to be a global critical section initialized elsewhere }&lt;/span&gt;&lt;br /&gt;  EnterCriticalSection(cs);&lt;br /&gt;  Result := (destination^ = oldValue);&lt;br /&gt;  &lt;span class="pas-kwd"&gt;if&lt;/span&gt; Result &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;    destination^ := newValue;&lt;br /&gt;  LeaveCriticalSection(cs);&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;/pre&gt;&lt;br /&gt;However, that is quite slow so in reality we go down to the hardware and use the &lt;em&gt;lock cmpxchg&lt;/em&gt; instruction that was designed exactly for this purpose. 
&lt;br /&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; CAS32(&lt;span class="pas-kwd"&gt;const&lt;/span&gt; oldValue, newValue: cardinal; &lt;span class="pas-kwd"&gt;var&lt;/span&gt; destination): boolean;&lt;br /&gt;&lt;span class="pas-kwd"&gt;asm&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-asm"&gt;lock&lt;/span&gt; &lt;span class="pas-asm"&gt;cmpxchg&lt;/span&gt; &lt;span class="pas-asm"&gt;dword&lt;/span&gt; &lt;span class="pas-asm"&gt;ptr&lt;/span&gt; &lt;span class="pas-asm"&gt;[&lt;/span&gt;&lt;span class="pas-asm"&gt;destination&lt;/span&gt;&lt;span class="pas-asm"&gt;]&lt;/span&gt;&lt;span class="pas-asm"&gt;,&lt;/span&gt; &lt;span class="pas-asm"&gt;newValue&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-asm"&gt;setz&lt;/span&gt;  &lt;span class="pas-asm"&gt;al&lt;/span&gt;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;/pre&gt;&lt;p&gt;The Win32 function &lt;a href="http://msdn.microsoft.com/en-us/library/ms683560(VS.85).aspx" target="_blank"&gt;InterlockedCompareExchange&lt;/a&gt; implements the same behaviour, except that it is slower than the assembler version.&lt;/p&gt;&lt;p&gt;The Tag field of the TOmniTaggedValue&amp;#160; record uses only one byte of storage. The CAS32 function requires four bytes to be compared and swapped. [Even more, those 4 bytes must be 4-aligned. 
That’s why SizeOf(TOmniTaggedValue) is 16 – so that each Tag falls on a memory location whose address is evenly divisible by 4.]&lt;/p&gt;&lt;p&gt;Therefore, TOmniTaggedValue record implements the CASTag function which does some bit-fiddling to work around the problem.&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; TOmniTaggedValue.CASTag(oldTag, newTag: TOmniQueueTag): boolean;&lt;br /&gt;&lt;span class="pas-kwd"&gt;var&lt;/span&gt;&lt;br /&gt;  newValue: DWORD;&lt;br /&gt;  oldValue: DWORD;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  oldValue := PDWORD(@Tag)^ &lt;span class="pas-kwd"&gt;AND&lt;/span&gt; &lt;span class="pas-hex"&gt;$FFFFFF00&lt;/span&gt; &lt;span class="pas-kwd"&gt;OR&lt;/span&gt; DWORD(ORD(oldTag));&lt;br /&gt;  newValue := oldValue &lt;span class="pas-kwd"&gt;AND&lt;/span&gt; &lt;span class="pas-hex"&gt;$FFFFFF00&lt;/span&gt; &lt;span class="pas-kwd"&gt;OR&lt;/span&gt; DWORD(Ord(newTag));&lt;br /&gt;  Result := CAS32(oldValue, newValue, Tag);&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniTaggedValue.CASTag }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;First, we need an “old” 4-byte value. It is constructed by taking the &lt;em&gt;oldTag&lt;/em&gt; parameter and OR-ing it with bytes 2, 3, and 4 of the record.&lt;/p&gt;&lt;p&gt;Then we need a “new” 4-byte value, which is constructed in a similar way.&lt;/p&gt;&lt;p&gt;Only then can we call the CAS32 function to compare-and-swap “old” value with the “new” one. &lt;/p&gt;&lt;h2&gt;Dequeue&lt;/h2&gt;&lt;p&gt;Dequeue is not much different from the Enqueue. First it locks the tail element by swapping it from &lt;em&gt;tagAllocated&lt;/em&gt; to &lt;em&gt;tagRemoving &lt;/em&gt;(a normal element) or from &lt;em&gt;tagBlockPointer&lt;/em&gt; to &lt;em&gt;tagDestroying &lt;/em&gt;(a pointer to the next block). 
If the tag is &lt;em&gt;tagFree&lt;/em&gt;, then the queue is empty and Dequeue can return. In all other cases, it will loop and retry.&lt;/p&gt;&lt;p&gt;If &lt;em&gt;tagAllocated&lt;/em&gt; was found, Dequeue can remove the current element in two easy steps. First it increments the &lt;em&gt;tail&lt;/em&gt; pointer and with that unlocks the queue tail. Only then does it fetch the value from the queue. The tag is not modified again; it stays &lt;em&gt;tagRemoving&lt;/em&gt;.&lt;/p&gt;&lt;p&gt;In pseudocode:&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;&amp;#160; repeat &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; fetch tag from current tail &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; if tag = tagFree then &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; return Empty &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; if tag = tagAllocated and CAS(tag, tagRemoving) then &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; break &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; if tag = tagBlockPointer and CAS(tag, tagDestroying) then&amp;#160; &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; break&lt;/font&gt;&lt;font face="Consolas, Courier New"&gt; &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; yield &lt;br /&gt;&amp;#160; forever &lt;br /&gt;&amp;#160; if tag = tagAllocated then&amp;#160; &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; increment tail &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; get value &lt;br /&gt;&amp;#160; else &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; // ignore this for a moment&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;Let’s look at a simple example. 
Assume that there are two elements in the queue originally and that Dequeue was just called.&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;&lt;strong&gt;T&lt;/strong&gt;:[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value1&lt;/em&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value2&lt;/em&gt;] &lt;strong&gt;H&lt;/strong&gt;:[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;Dequeue first swaps the tail tag.&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;&lt;strong&gt;T&lt;/strong&gt;:[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value1&lt;/em&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value2&lt;/em&gt;] &lt;strong&gt;H&lt;/strong&gt;:[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;Then it increments the &lt;em&gt;tail&lt;/em&gt; pointer.&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value1&lt;/em&gt;] &lt;strong&gt;T&lt;/strong&gt;:[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value2&lt;/em&gt;] &lt;strong&gt;H&lt;/strong&gt;:[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;At this moment, another thread may drop in and start dequeueing.&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value1&lt;/em&gt;] &lt;strong&gt;T&lt;/strong&gt;:[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value2&lt;/em&gt;] &lt;strong&gt;H&lt;/strong&gt;:[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;It is entirely possible that the second thread will finish before 
the first one. The queue is empty now although the first thread has not yet completed its dequeue.&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value1&lt;/em&gt;] [&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value2&lt;/em&gt;] &lt;strong&gt;H&lt;/strong&gt;:&lt;strong&gt;T&lt;/strong&gt;:[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;The first thread then continues execution, fetches the value from the slot and exits. &lt;/p&gt;&lt;p&gt;End-of-block pointer handling is only slightly more complicated.&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;&amp;#160; if tag = tagAllocated then&amp;#160; &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; // we covered that already &lt;br /&gt;&amp;#160; else &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; // we know that the first slot in new block is allocated&amp;#160; &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; set tail to new block's slot 1&lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; get value &lt;/font&gt;&lt;font face="Consolas, Courier New"&gt;&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Assume the following situation:&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas"&gt;[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] [&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] … [&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;T&lt;/strong&gt;:[&lt;em&gt;BlockPointer&lt;/em&gt;|&lt;em&gt;B2&lt;/em&gt;] &lt;br /&gt;&lt;/font&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;B2&lt;/strong&gt;:&lt;font face="Consolas, Courier New"&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;H&lt;/strong&gt;:[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … 
[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/font&gt; &lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;Dequeue first swaps &lt;em&gt;tagBlockPointer&lt;/em&gt; with &lt;em&gt;tagDestroying&lt;/em&gt;.&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas"&gt;[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] [&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] … [&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;T&lt;/strong&gt;:[&lt;em&gt;Destroying&lt;/em&gt;|&lt;em&gt;B2&lt;/em&gt;] &lt;br /&gt;&lt;/font&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;B2&lt;/strong&gt;:&lt;font face="Consolas, Courier New"&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;H&lt;/strong&gt;:[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/font&gt; &lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;Then it follows the pointer to the next block. It knows that the first slot in this block will be allocated (because that’s how Enqueue is implemented) and moves the &lt;em&gt;tail&lt;/em&gt; pointer directly to the second slot. By doing this, the tail pointer is released. 
This is entirely safe to do as no other thread could have locked the &lt;em&gt;tail&lt;/em&gt; slot in the meantime because it contains tag &lt;em&gt;Destroying.&lt;/em&gt;&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas"&gt;[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] [&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] … [&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] [&lt;em&gt;Destroying&lt;/em&gt;|&lt;em&gt;B2&lt;/em&gt;] &lt;br /&gt;&lt;/font&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;B2&lt;/strong&gt;:&lt;font face="Consolas, Courier New"&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;H&lt;/strong&gt;:&lt;strong&gt;T&lt;/strong&gt;:[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/font&gt; &lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;Last, the Dequeue fetches the value and exits.&lt;/p&gt;&lt;p&gt;That’s all – the &lt;em&gt;tail&lt;/em&gt; pointer was safely moved to the new block and element was fetched (and marked as such).&lt;/p&gt;&lt;p&gt;But … is that really it? A careful reader may have noticed that something was not yet done. The first block is still allocated although no thread is referencing it anymore. Somebody has to release it – but who?&lt;/p&gt;&lt;h2&gt;ABA Strikes&lt;/h2&gt;&lt;p&gt;The first idea is just to release a block at this point. After all, no other dequeuers are doing anything with this block as all tags are known to be marked as &lt;em&gt;tagRemoving&lt;/em&gt; or &lt;em&gt;tagDestroying&lt;/em&gt;. So we can safely release the memory, no?&lt;/p&gt;&lt;p&gt;Actually, we can’t. And the reason for that is the ABA problem – and a tough one. 
It took me many days to find the reason behind the constant crashes I was experiencing when testing that initial approach.&lt;/p&gt;&lt;p&gt;Assume the following situation:&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;T&lt;/strong&gt;:[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] … [&lt;em&gt;BlockPointer&lt;/em&gt;|&lt;em&gt;B2&lt;/em&gt;] &lt;br /&gt;&lt;/font&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;B2:&lt;/strong&gt;&lt;font face="Consolas, Courier New"&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/font&gt; &lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;Thread 1 starts executing the Dequeue method. It reads the tag from the &lt;em&gt;tail&lt;/em&gt; pointer and is suspended before it can CAS the &lt;em&gt;tagRemoving&lt;/em&gt; tag into the tail slot.&lt;/p&gt;&lt;p&gt;&lt;font face="Consolas"&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; fetch tag from current tail &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; if tag = tagFree then // &amp;lt;- here we stop &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; return Empty &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; if tag = tagAllocated and CAS(tag, tagRemoving) then &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; break &lt;/font&gt;&lt;/p&gt;&lt;p&gt;Now the fun begins. We have a suspended thread whose remembered copy of the &lt;em&gt;tail&lt;/em&gt; pointer points to the second slot of the B1 block. 
I’ll mark this pointer with S: (for Suspended).&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;S&lt;/strong&gt;:&lt;strong&gt;T&lt;/strong&gt;:[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] … [&lt;em&gt;BlockPointer&lt;/em&gt;|&lt;em&gt;B2&lt;/em&gt;] &lt;br /&gt;&lt;/font&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;B2:&lt;/strong&gt;&lt;font face="Consolas, Courier New"&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/font&gt; &lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;Another thread takes over and initiates the Dequeue. As the suspended thread was not yet able to change the tag, the second thread succeeds in dequeueing from the second slot and then from the third one and so on, up to the end of the block.&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;S&lt;/strong&gt;:[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] … &lt;strong&gt;T&lt;/strong&gt;:[&lt;em&gt;BlockPointer&lt;/em&gt;|&lt;em&gt;B2&lt;/em&gt;]&lt;/font&gt;&lt;font face="Consolas"&gt;&lt;strong&gt; &lt;br /&gt;B2:&lt;/strong&gt;&lt;font face="Consolas, Courier New"&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt;&lt;/font&gt; &lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;During the next dequeue, the second thread destroys block B1.&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;strike&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] 
&lt;strong&gt;S&lt;/strong&gt;:[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] … [&lt;em&gt;Destroying&lt;/em&gt;|&lt;em&gt;B2&lt;/em&gt;]&lt;/font&gt;&lt;font face="Consolas"&gt;&lt;/font&gt;&lt;/strike&gt;&lt;strong&gt;&lt;strike&gt; &lt;br /&gt;&lt;/strike&gt;&lt;font face="Consolas"&gt;B2:&lt;/font&gt;&lt;/strong&gt;&lt;font face="Consolas"&gt;[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;T:H:&lt;/strong&gt;[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt; &lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;Then the third thread writes into all elements of block B2. &lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;strike&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;S&lt;/strong&gt;:[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] … [&lt;em&gt;Destroying&lt;/em&gt;|&lt;em&gt;B2&lt;/em&gt;]&lt;/font&gt;&lt;font face="Consolas"&gt;&lt;/font&gt;&lt;/strike&gt;&lt;strong&gt;&lt;strike&gt; &lt;br /&gt;&lt;/strike&gt;&lt;font face="Consolas"&gt;B2:&lt;/font&gt;&lt;/strong&gt;&lt;font face="Consolas"&gt;[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] … &lt;strong&gt;H:&lt;/strong&gt;[&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt; &lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;During the next write a memory block is allocated. It may happen (with a high probability, because FastMM memory manager tries to reuse recently released memory blocks) that this memory block will be located at address B1. 
The block is emptied during the allocation but the suspended thread’s copy of the &lt;em&gt;tail&lt;/em&gt; pointer still points into it.&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;H:S&lt;/strong&gt;:[&lt;em&gt;Free&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;] … [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt; &lt;br /&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;B2:&lt;/strong&gt;[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] … [&lt;em&gt;BlockPointer&lt;/em&gt;|&lt;em&gt;B1&lt;/em&gt;]&lt;/font&gt; &lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;Then another slot gets enqueued.&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;S&lt;/strong&gt;:[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;H: &lt;/strong&gt;… [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt; &lt;br /&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;B2:&lt;/strong&gt;[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;T:&lt;/strong&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] … [&lt;em&gt;BlockPointer&lt;/em&gt;|&lt;em&gt;B1&lt;/em&gt;]&lt;/font&gt; &lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;At that point, the original thread is resumed. 
It continues the execution with &lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas"&gt;&amp;#160; if tag = tagAllocated and CAS(tag, tagRemoving) then&amp;#160; &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; break&amp;#160; &lt;br /&gt;&lt;/font&gt;&lt;font face="Consolas"&gt;&amp;#160; if tag = tagAllocated then &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; increment tail &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; get value &lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;Tag is still &lt;em&gt;tagAllocated&lt;/em&gt; (well, it is &lt;strong&gt;again&lt;/strong&gt; set to &lt;em&gt;tagAllocated&lt;/em&gt;, but our poor thread doesn’t know that) so it swaps it with &lt;em&gt;tagRemoving&lt;/em&gt;. That is weird as we now have a &lt;em&gt;tagRemoving &lt;/em&gt;slot in the middle of &lt;em&gt;tagAllocated&lt;/em&gt; ones, but that’s maybe something we could live with. The biggest problem lies in the next line, which sets the &lt;em&gt;tail &lt;/em&gt;pointer to the next slot. And by that I don’t mean the slot following the current tail but the slot following the stored &lt;em&gt;tail&lt;/em&gt; pointer! In other words, the third slot of block B1.&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;B1:&lt;/strong&gt;[&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] [&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] &lt;strong&gt;T:H: &lt;/strong&gt;… [&lt;em&gt;EndOfList&lt;/em&gt;|&lt;em&gt;0&lt;/em&gt;]&lt;/font&gt; &lt;br /&gt;&lt;font face="Consolas"&gt;&lt;strong&gt;B2:&lt;/strong&gt;[&lt;em&gt;Removing&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] [&lt;em&gt;Allocated&lt;/em&gt;|&lt;em&gt;value&lt;/em&gt;] … [&lt;em&gt;BlockPointer&lt;/em&gt;|&lt;em&gt;B1&lt;/em&gt;]&lt;/font&gt; &lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;And now we have a total mess of a memory layout. 
From this point onwards, nothing works &lt;img alt="Surprise" src="http://us.i1.yimg.com/us.yimg.com/i/mesg/emoticons7/13.gif" /&gt;&lt;/p&gt;&lt;p&gt;Even worse, that is not the only problem. For instance, the B1 block may have been reallocated by another thread for another purpose, and CAS may still succeed if the expected bytes happen to be found at that location. Fat chance, I know, but as Pratchett likes to say, million-to-one chances crop up nine times out of ten. &lt;/p&gt;&lt;p&gt;The same scenario can happen during the Enqueue.&lt;/p&gt;&lt;p&gt;&lt;font face="Consolas"&gt;&amp;#160;&amp;#160;&amp;#160; fetch tag from current head &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; //thread pauses here&lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; if tag = tagFree and CAS(tag, tagAllocating) then &lt;/font&gt;&lt;/p&gt;&lt;p&gt;The problem is more likely to occur at this point because CAS will expect the source to be&amp;#160; $00000000. It is entirely possible that another thread allocates this block, clears it for further use, and just the moment after that our suspended thread kicks in and a) destroys this block by CAS-ing &lt;em&gt;tagAllocating&lt;/em&gt; in and b) points the &lt;em&gt;head&lt;/em&gt; pointer into that block. Utter disaster.&lt;/p&gt;&lt;h2&gt;Garbage Collector&lt;/h2&gt;&lt;p&gt;There is just one thing that can be done – never release a memory block while any thread is using it. In a way, we need a garbage collector.&lt;/p&gt;&lt;p&gt;It is hard to answer the question: “Is any thread using this memory block?” [Not impossible, I must add. Just very hard. And any solution would be totally impractical.] We have to be satisfied with less. Another way to look at the problem is: “When is it safe to release a memory block?” That, at least, we can answer: “When Enqueue and Dequeue are not executing. At all. 
In any thread.” That is also the solution which I’ve implemented.&lt;/p&gt;&lt;p&gt;We must allow many Enqueue/Dequeue paths to execute at the same time, but we only want one thread to be releasing memory, and during that time no Enqueue/Dequeue may execute. Does this remind you of anything? Of course, a Multi-Readers-Exclusive-Writer lock!&lt;/p&gt;&lt;p&gt;Enqueue/Dequeue acquire the read lock while they execute. The garbage collector acquires the write lock, releases the memory and releases the lock. Simple.&lt;/p&gt;&lt;p&gt;The garbage collector is very simple and is implemented in place (as opposed to the implementation in a separate thread). The thread that found the &lt;em&gt;tagBlockPointer &lt;/em&gt;is responsible for freeing the memory block.&lt;/p&gt;&lt;p&gt;Enqueue is simply wrapped in the read lock.&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;acquire read access to GC &lt;br /&gt;// do the Enqueue &lt;br /&gt;release read access to GC&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;Dequeue is slightly more complicated. If the &lt;em&gt;tagBlockPointer&lt;/em&gt; is found then the code releases the read lock, acquires the write lock and releases the memory block. In other words, Dequeue switches from the dequeueing mode (by releasing the read lock) into garbage collecting mode (by acquiring the write lock).&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;acquire read access to GC &lt;br /&gt;// do the Dequeue &lt;br /&gt;release read access to GC &lt;br /&gt;if block has to be released &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; acquire write access to GC &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; release the block &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; release write access to GC&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;The MREW implementation is very simple and could theoretically lead to starvation. 
However, practical tests showed no sign of this happening.&lt;/p&gt;&lt;p&gt;One number is used for locking. If it is greater than zero, there are readers active. Each reader increments the number on enter and decrements it on exit.&lt;/p&gt;&lt;p&gt;The writer waits until this number is 0 (no readers) and decrements it to –1. When exiting, it just sets the number back to 0.&lt;/p&gt;&lt;p&gt;Of course, all those increments and decrements are done atomically.&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; TOmniBaseQueue.EnterReader;&lt;br /&gt;&lt;span class="pas-kwd"&gt;var&lt;/span&gt;&lt;br /&gt;  value: integer;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;repeat&lt;/span&gt;&lt;br /&gt;    value := obcRemoveCount.Value;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;if&lt;/span&gt; value &amp;gt;= &lt;span class="pas-num"&gt;0&lt;/span&gt; &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt; &lt;span class="pas-comment"&gt;// begin/end keeps the else below attached to this test&lt;/span&gt;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;if&lt;/span&gt; obcRemoveCount.CAS(value, value + &lt;span class="pas-num"&gt;1&lt;/span&gt;) &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;        break;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;end&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;else&lt;/span&gt; &lt;br /&gt;      DSiYield; &lt;span class="pas-comment"&gt;// let the GC do its work&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;until&lt;/span&gt; false;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniBaseQueue.EnterReader }&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; TOmniBaseQueue.EnterWriter;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;while&lt;/span&gt; &lt;span class="pas-kwd"&gt;not&lt;/span&gt; ((obcRemoveCount.Value = &lt;span class="pas-num"&gt;0&lt;/span&gt;) &lt;span class="pas-kwd"&gt;and&lt;/span&gt; (obcRemoveCount.CAS(&lt;span 
class="pas-num"&gt;0&lt;/span&gt;, -&lt;span class="pas-num"&gt;1&lt;/span&gt;))) &lt;span class="pas-kwd"&gt;do&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;asm&lt;/span&gt; &lt;span class="pas-asm"&gt;pause&lt;/span&gt;&lt;span class="pas-asm"&gt;;&lt;/span&gt; &lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;// retry after slight pause&lt;/span&gt;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniBaseQueue.EnterWriter }&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; TOmniBaseQueue.LeaveReader;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  obcRemoveCount.Decrement;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniBaseQueue.LeaveReader }&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; TOmniBaseQueue.LeaveWriter;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  obcRemoveCount.Value := &lt;span class="pas-num"&gt;0&lt;/span&gt;;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniBaseQueue.LeaveWriter }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;This implementation of the queue passed a 24-hour stress test in which millions of messages were enqueued and dequeued every second, with 1 to 8 threads acting as writers and another 1 to 8 as readers. Every four million messages, the threads were stopped and the contents of the queues were checked for validity. 
No problems were found.&lt;/p&gt;&lt;h2&gt;What About Performance?&lt;/h2&gt;&lt;p&gt;To test the workings of the queue and to measure its performance I wrote a simple test, located in folder 32_Queue in the Tests branch of the OTL tree.&lt;/p&gt;&lt;p&gt;The test framework sets up the following data path:&lt;/p&gt;&lt;blockquote&gt;  &lt;p&gt;&lt;font face="Consolas, Courier New"&gt;source queue –&amp;gt; N threads –&amp;gt; channel queue –&amp;gt; M threads –&amp;gt; destination queue&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;Source queue is filled with numbers from 1 to 1.000.000. Then 1 to 8 threads are set up to read from the source queue and write into the channel queue and another 1 to 8 threads are set up to read from the channel queue and write to the destination queue. Application then starts the clock and starts all threads. When all numbers are moved to the destination queue, clock is stopped and contents of the destination queue are verified. Thread creation time is not included in the measured time.&lt;/p&gt;&lt;p&gt;All in all this results in 2 million reads and 2 million writes distributed over three queues. Tests are very brutal as all threads are just hammering on the queues, doing nothing else. 
The table below contains average, min and max time of 5 runs on a 2.67 GHz computer with two 4-core CPUs.&lt;/p&gt;&lt;div align="center"&gt;  &lt;table border="0" cellspacing="0" cellpadding="2" width="600" align="center"&gt;&lt;tbody&gt;      &lt;tr&gt;        &lt;td valign="top" width="158"&gt;&amp;#160;&lt;/td&gt;        &lt;td valign="top" width="235"&gt;&lt;strong&gt;average [min-max]            &lt;/strong&gt;all data in milliseconds&lt;/td&gt;        &lt;td valign="top" width="205"&gt;&lt;strong&gt;millions of queue operations per second&lt;/strong&gt;&lt;/td&gt;      &lt;/tr&gt;      &lt;tr&gt;        &lt;td valign="top" width="159"&gt;N = 1, M = 1&lt;/td&gt;        &lt;td valign="top" width="235"&gt;707 [566-834]&lt;/td&gt;        &lt;td valign="top" width="204"&gt;5,66&lt;/td&gt;      &lt;/tr&gt;      &lt;tr&gt;        &lt;td valign="top" width="159"&gt;N = 2, M = 2&lt;/td&gt;        &lt;td valign="top" width="235"&gt;996 [950-1031]&lt;/td&gt;        &lt;td valign="top" width="204"&gt;4,02&lt;/td&gt;      &lt;/tr&gt;      &lt;tr&gt;        &lt;td valign="top" width="159"&gt;N = 3, M = 3&lt;/td&gt;        &lt;td valign="top" width="235"&gt;1065 [1055-1074]&lt;/td&gt;        &lt;td valign="top" width="204"&gt;3,76&lt;/td&gt;      &lt;/tr&gt;      &lt;tr&gt;        &lt;td valign="top" width="159"&gt;N = 4, M = 4&lt;/td&gt;        &lt;td valign="top" width="235"&gt;1313 [1247-1358]&lt;/td&gt;        &lt;td valign="top" width="204"&gt;3,04&lt;/td&gt;      &lt;/tr&gt;      &lt;tr&gt;        &lt;td valign="top" width="159"&gt;N = 8, M = 8&lt;/td&gt;        &lt;td valign="top" width="235"&gt;1520 [1482-1574]&lt;/td&gt;        &lt;td valign="top" width="204"&gt;2,63&lt;/td&gt;      &lt;/tr&gt;      &lt;tr&gt;        &lt;td valign="top" width="159"&gt;N = 1 , M = 7&lt;/td&gt;        &lt;td valign="top" width="235"&gt;3880 [3559-4152]&lt;/td&gt;        &lt;td valign="top" width="204"&gt;1,03&lt;/td&gt;      &lt;/tr&gt;      &lt;tr&gt;        &lt;td valign="top" 
width="159"&gt;N = 7, M = 1&lt;/td&gt;        &lt;td valign="top" width="235"&gt;1314 [1299-1358]&lt;/td&gt;        &lt;td valign="top" width="204"&gt;3,04&lt;/td&gt;      &lt;/tr&gt;    &lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;/div&gt;&lt;p&gt;The queue performs well even when there are twice as many threads as cores in the computer. The only anomalous data is in the N=1, M=7 row, where there were only eight threads but the performance was quite low. It looks like the single writer was not able to put enough data into the channel queue for seven readers to read and that caused excessive looping in the MREW. But I have no proof of that.&lt;/p&gt;&lt;h2&gt;What Is It Good For?&lt;/h2&gt;&lt;p&gt;Absolutely &lt;strike&gt;nothing&lt;/strike&gt; everything! Of course, this queue was designed as a backing storage for the blocking collection, but it is also useful for any multi-threaded and single-threaded use. Just don’t use it in situations where you can’t control the growth of the data.&lt;/p&gt;&lt;p&gt;For example, all internal messaging queues in the OTL will still use bounded (fixed-size) queues. That way, a message recipient that blocks at some point only causes the queue to fill up, which then triggers an exception. 
If a dynamic queue were used for messaging, it could fill up all the virtual memory on the computer and only then crash the program (or some other thread would run out of memory and you’d have no idea where the cause of the problem lies).&lt;/p&gt;&lt;p&gt;Use it sparingly, use it wisely.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-1488081948103738418?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/1488081948103738418/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2010/01/three-steps-to-blocking-collection-2.html#comment-form' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/1488081948103738418'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/1488081948103738418'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2010/01/three-steps-to-blocking-collection-2.html' title='Three steps to the blocking collection: [2] Dynamically allocated queue'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-8026419860194985198</id><published>2010-02-01T18:52:00.005+01:00</published><updated>2010-02-07T18:49:28.613+01:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='OmniThreadLibrary'/><category scheme='http://www.blogger.com/atom/ns#' term='multithreading'/><category scheme='http://www.blogger.com/atom/ns#' term='source code'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>Parallel.ForEach.Aggregate</title><content type='html'>&lt;p&gt;Totally out-of-band posting (I’m still working on the second part of the “blocking collection” trilogy), posted just because I’m happy that the code works.&lt;/p&gt;  &lt;p&gt;This code.&lt;/p&gt;  &lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; TfrmParallelAggregateDemo.btnCountParallelClick(Sender: TObject);&lt;br /&gt;&lt;span class="pas-kwd"&gt;var&lt;/span&gt;&lt;br /&gt;  numPrimes: integer;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  numPrimes :=&lt;br /&gt;    Parallel.ForEach(&lt;span class="pas-num"&gt;1&lt;/span&gt;, inpMaxPrime.Value)&lt;br /&gt;    .Aggregate(&lt;br /&gt;      &lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; (&lt;span class="pas-kwd"&gt;var&lt;/span&gt; aggregate: int64; value: int64) &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;        aggregate := aggregate + value;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;end&lt;/span&gt;)&lt;br /&gt;    .Execute(&lt;br /&gt;      &lt;span class="pas-kwd"&gt;function&lt;/span&gt; (&lt;span class="pas-kwd"&gt;const&lt;/span&gt; value: TOmniValue): TOmniValue &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;if&lt;/span&gt; IsPrime(value) &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;          Result := &lt;span class="pas-num"&gt;1&lt;/span&gt;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;else&lt;/span&gt;&lt;br /&gt;          Result := &lt;span 
class="pas-num"&gt;0&lt;/span&gt;;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;end&lt;/span&gt;);&lt;br /&gt;  Log(&lt;span class="pas-str"&gt;'%d primes from 1 to %d'&lt;/span&gt;, [numPrimes, inpMaxPrime.Value]);&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;/pre&gt;&lt;p&gt;Anonymous methods are simply great. They make the code unreadable, but they are oh so useful!&lt;/p&gt;&lt;p&gt;Everything is in the &lt;a href="http://code.google.com/p/omnithreadlibrary/source/checkout" target="_blank"&gt;trunk&lt;/a&gt;. Check it out and enjoy.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;font size="-2"&gt;---&lt;br /&gt;    &lt;br /&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/8026419860194985198/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/parallelforeachaggregate.html#comment-form' title='19 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/8026419860194985198'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/8026419860194985198'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2010/02/parallelforeachaggregate.html' 
title='Parallel.ForEach.Aggregate'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>19</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-262522453141290622</id><published>2010-01-08T19:34:00.007+01:00</published><updated>2010-02-07T18:49:50.315+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='OmniThreadLibrary'/><category scheme='http://www.blogger.com/atom/ns#' term='multithreading'/><category scheme='http://www.blogger.com/atom/ns#' term='source code'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>Parallel.For</title><content type='html'>&lt;p&gt;Just thinking out loud:&lt;/p&gt;  &lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; TfrmParallelForDemo.ParaScan(rootNode: TNode; value: integer): TNode;&lt;br /&gt;&lt;span class="pas-kwd"&gt;var&lt;/span&gt;&lt;br /&gt;  nodeResult: TNode;&lt;br /&gt;  nodeQueue : TOmniBlockingCollection;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  nodeResult := &lt;span class="pas-kwd"&gt;nil&lt;/span&gt;;&lt;br /&gt;  nodeQueue := TOmniBlockingCollection.Create;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;try&lt;/span&gt;&lt;br /&gt;    nodeQueue.Add(rootNode);&lt;br /&gt;    Parallel.ForEach(nodeQueue.GetEnumerator).Timeout(&lt;span class="pas-num"&gt;10&lt;/span&gt;*&lt;span class="pas-num"&gt;1000&lt;/span&gt;).Execute(&lt;br /&gt;      &lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; (&lt;span class="pas-kwd"&gt;const&lt;/span&gt; elem: TOmniValue)&lt;br /&gt;      &lt;span 
class="pas-kwd"&gt;var&lt;/span&gt;&lt;br /&gt;        node : TNode;&lt;br /&gt;        iNode: integer;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;        node := TNode(elem.AsPointer);&lt;br /&gt;        &lt;span class="pas-kwd"&gt;if&lt;/span&gt; node.Value = value &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;          nodeResult := node;&lt;br /&gt;          nodeQueue.CompleteAdding;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;end&lt;/span&gt;&lt;br /&gt;        &lt;span class="pas-kwd"&gt;else&lt;/span&gt; &lt;span class="pas-kwd"&gt;for&lt;/span&gt; iNode := &lt;span class="pas-num"&gt;0&lt;/span&gt; &lt;span class="pas-kwd"&gt;to&lt;/span&gt; node.NumChild - &lt;span class="pas-num"&gt;1&lt;/span&gt; &lt;span class="pas-kwd"&gt;do&lt;/span&gt;&lt;br /&gt;          nodeQueue.TryAdd(node.Child[iNode]);&lt;br /&gt;      &lt;span class="pas-kwd"&gt;end&lt;/span&gt;);&lt;br /&gt;  &lt;span class="pas-kwd"&gt;finally&lt;/span&gt; FreeAndNil(nodeQueue); &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;  Result := nodeResult;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TfrmParallelForDemo.ParaScan }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;I can make it compile (just did) and I think I can make it work.&lt;/p&gt;&lt;p&gt;Useful? Simple enough? 
What do you think?&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;font size="-2"&gt;---&lt;br /&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/262522453141290622/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2010/01/parallelfor.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/262522453141290622'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/262522453141290622'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2010/01/parallelfor.html' title='Parallel.For'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-8924600292131012429</id><published>2010-01-07T16:57:00.001+01:00</published><updated>2010-02-07T18:50:20.954+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category 
scheme='http://www.blogger.com/atom/ns#' term='OmniThreadLibrary'/><category scheme='http://www.blogger.com/atom/ns#' term='multithreading'/><category scheme='http://www.blogger.com/atom/ns#' term='source code'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>Three steps to the blocking collection: [1] Inverse semaphore</title><content type='html'>&lt;p&gt;In &lt;a href="http://blogs.msdn.com/pfxteam/archive/2009/11/06/9918363.aspx" target="_blank"&gt;What’s new for the coordination data structures in Beta 2?&lt;/a&gt; Joshua Phillips published the following algorithm demonstrating the use of BlockingCollection in Parallel Extensions Beta 2.&lt;/p&gt;  &lt;blockquote&gt;   &lt;p&gt;var targetNode = …;      &lt;br /&gt;var bc = new BlockingCollection&amp;lt;Node&amp;gt;(startingNodes);       &lt;br /&gt;// since we expect GetConsumingEnumerable to block, limit parallelism to the number of       &lt;br /&gt;// procs, avoiding too much thread injection       &lt;br /&gt;var parOpts = new ParallelOptions() { MaxDegreeOfParallelism = Environment.ProcessorCount };       &lt;br /&gt;Parallel.ForEach(bc.GetConsumingEnumerable(), parOpts, (node,loop) =&amp;gt;       &lt;br /&gt;{       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; if (node == targetNode)       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; {       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; Console.WriteLine("hooray!");       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; bc.CompleteAdding();       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; loop.Stop();       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; }       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; else       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160; {       &lt;br /&gt;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160;&amp;#160; foreach(var neighbor in node.Neighbors) bc.Add(neighbor);       &lt;br 
/&gt;&amp;#160;&amp;#160;&amp;#160; }       &lt;br /&gt;});&lt;/p&gt; &lt;/blockquote&gt;  &lt;p&gt;Even if you’re not familiar with C# and Parallel Extensions, this code is fairly simple to read. It implements a parallel search in a tree and writes “hooray” when a node is found.&lt;/p&gt;  &lt;p&gt;When I saw this code I thought to myself: “Hmmm, this BlockingCollection really looks neat. Maybe I can add it to the OmniThreadLibrary.” And as I had some free time (shockingly, I know) I started coding. Soon I noticed a problem in that search algorithm. While it looks like a nice piece of code, it exhibits a problem that makes it mostly unusable in real world applications. Can you spot it? [I should add that the authors are aware of the problem and they decided to ignore it while writing this code fragment for the sake of simplicity.]&lt;/p&gt;  &lt;p&gt;To solve it, I needed an inverse semaphore. This is an interesting synchronisation tool which is sadly not implemented in Win32 API. It differs from the ordinary semaphore in one important way – ordinary semaphore is signalled while greater than zero and inverse semaphore is signalled when it &lt;strong&gt;is equal to &lt;/strong&gt;zero. You have no idea what I’m talking about? Here’s a more elaborate description … &lt;/p&gt;  &lt;p&gt;[BTW, I googled for an authoritative definition of the inverse semaphore but couldn’t find one so this is my own approximation of the concept.]&lt;/p&gt;  &lt;p&gt;A semaphore is a counting synchronisation object that starts at some value (typically greater than 0). This value typically represents a number of available resources (concurrent connections etc). To allocate a semaphore, one &lt;strong&gt;waits&lt;/strong&gt; on it. If the semaphore count is &amp;gt; 0, the semaphore is signalled, wait will succeed and semaphore count gets decremented by 1. [Of course, all of this executes atomically.] 
If the semaphore count is 0, the semaphore is not signalled and wait will block until the timeout expires or until another thread &lt;strong&gt;releases&lt;/strong&gt; the semaphore, which increments the semaphore’s count and puts it into the signalled state. [‘Nuff said. If you want to read more about semaphores, I recommend &lt;a href="http://www.greenteapress.com/semaphores/" target="_blank"&gt;The Little Book of Semaphores&lt;/a&gt;, a free textbook on all things semaphorical.]&lt;/p&gt;  &lt;p&gt;An inverse semaphore, on the other hand, gets signalled when &lt;strong&gt;the count drops to 0&lt;/strong&gt;. This allows another thread to execute a blocking wait, which will succeed only when the semaphore’s count is 0. Why is that good, you’ll ask? Because it simplifies resource exhaustion detection. If you have an inverse semaphore and this semaphore becomes signalled, then you know that the resource is fully used. And why is &lt;strong&gt;that&lt;/strong&gt; good, you’ll ask? Well, you’ll have to wait until Part 3 to learn the answer.&lt;/p&gt;  &lt;p&gt;OTL’s inverse semaphore lives in the &lt;a href="http://code.google.com/p/omnithreadlibrary/source/browse/trunk/OtlSync.pas" target="_blank"&gt;OtlSync&lt;/a&gt; unit and is called TOmniResourceCount. It also implements IOmniResourceCount in case you want to use it through the interface. 
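&lt;/p&gt;&lt;p&gt;To make the idea a bit more concrete, here is a minimal usage sketch. It is not taken from the OTL sources. It assumes that &lt;em&gt;TOmniResourceCount&lt;/em&gt; and &lt;em&gt;IOmniResourceCount&lt;/em&gt; are both exported by the OtlSync unit and behave exactly as described in the rest of this post; the program name and the initial count of 2 are made up for the illustration.&lt;/p&gt;&lt;pre class="pas-source"&gt;program InverseSemaphoreSketch; // hypothetical name, illustration only&lt;br /&gt;{$APPTYPE CONSOLE}&lt;br /&gt;uses&lt;br /&gt;  Windows, OtlSync; // assumption: OtlSync exports both the class and the interface&lt;br /&gt;var&lt;br /&gt;  rc: IOmniResourceCount;&lt;br /&gt;begin&lt;br /&gt;  rc := TOmniResourceCount.Create(2); // two resources free, Handle is not signalled&lt;br /&gt;  rc.Allocate;                        // count drops from 2 to 1&lt;br /&gt;  rc.Allocate;                        // count drops from 1 to 0, Handle becomes signalled&lt;br /&gt;  if WaitForSingleObject(rc.Handle, 0) = WAIT_OBJECT_0 then&lt;br /&gt;    Writeln('all resources are in use');&lt;br /&gt;  rc.Release;                         // count goes back to 1, Handle is reset&lt;br /&gt;  if WaitForSingleObject(rc.Handle, 0) = WAIT_TIMEOUT then&lt;br /&gt;    Writeln('at least one resource is free again');&lt;br /&gt;end.&lt;/pre&gt;&lt;p&gt;If the class behaves as described, both messages should be printed. In a real program the wait would typically use a longer timeout and run in a different thread than the one doing the allocating. Only the waiting side touches the handle; the counting itself goes through the interface, which is declared like this. 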
&lt;/p&gt;  &lt;pre class="pas-source"&gt;IOmniResourceCount = &lt;span class="pas-kwd"&gt;interface&lt;/span&gt; [&lt;span class="pas-str"&gt;'{F5281539-1DA4-45E9-8565-4BEA689A23AD}'&lt;/span&gt;]&lt;br /&gt;  &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  GetHandle: THandle;&lt;br /&gt;  &lt;span class="pas-comment"&gt;//&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  Allocate: cardinal; &lt;br /&gt;  &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  Release: cardinal;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  TryAllocate(&lt;span class="pas-kwd"&gt;var&lt;/span&gt; resourceCount: cardinal; timeout_ms: cardinal = &lt;span class="pas-num"&gt;0&lt;/span&gt;): boolean;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;property&lt;/span&gt; Handle: THandle &lt;span class="pas-kwd"&gt;read&lt;/span&gt; GetHandle;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ IOmniResourceCount }&lt;/span&gt; &lt;/pre&gt;&lt;p&gt;Resource count (let’s call it that from now on) starts at some count (passed to the constructor). &lt;em&gt;Allocate&lt;/em&gt; will block if this count is 0 (until the count becomes greater than 0), otherwise it will decrement the count. The new value of the counter is returned as a function result. [Keep in mind that this number may not be valid even at the time the function returned if other threads are using the same resource count.]&lt;/p&gt;&lt;p&gt;&lt;em&gt;Release&lt;/em&gt; increments the count and unblocks waiting &lt;em&gt;Allocates. 
&lt;/em&gt;New resource count (potentially invalid at the moment caller will see it) is returned as the result.&lt;/p&gt;&lt;p&gt;Then there is &lt;em&gt;TryAllocate&lt;/em&gt; – a safer version of &lt;em&gt;Allocate&lt;/em&gt; taking a timeout parameter (which may be set to &lt;em&gt;INFINITE&lt;/em&gt;) and returning success/fail status as a function result.&lt;/p&gt;&lt;p&gt;Finally, there is a &lt;em&gt;Handle&lt;/em&gt; property exposing a handle which is signalled when resource count is 0 and unsignalled otherwise.&lt;/p&gt;&lt;pre class="pas-source"&gt;  TOmniResourceCount = &lt;span class="pas-kwd"&gt;class&lt;/span&gt;(TInterfacedObject, IOmniResourceCount)&lt;br /&gt;  strict &lt;span class="pas-kwd"&gt;private&lt;/span&gt;&lt;br /&gt;    orcAvailable   : TDSiEventHandle;&lt;br /&gt;    orcHandle      : TDSiEventHandle;&lt;br /&gt;    orcLock        : TOmniCS;&lt;br /&gt;    orcNumResources: TGp4AlignedInt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;protected&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt; GetHandle: THandle;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;public&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;constructor&lt;/span&gt; Create(initialCount: cardinal);&lt;br /&gt;    &lt;span class="pas-kwd"&gt;destructor&lt;/span&gt;  Destroy; &lt;span class="pas-kwd"&gt;override&lt;/span&gt;;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  Allocate: cardinal; &lt;span class="pas-kwd"&gt;inline&lt;/span&gt;;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  Release: cardinal;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  TryAllocate(&lt;span class="pas-kwd"&gt;var&lt;/span&gt; resourceCount: cardinal; timeout_ms: cardinal = &lt;span class="pas-num"&gt;0&lt;/span&gt;): boolean;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;property&lt;/span&gt; Handle: THandle &lt;span class="pas-kwd"&gt;read&lt;/span&gt; GetHandle;&lt;br /&gt;  &lt;span 
class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniResourceCount }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Internally, &lt;em&gt;orcNumResources&lt;/em&gt; is used to manage the resource count, &lt;em&gt;orcLock&lt;/em&gt; provides internal locking (I never said that my inverse semaphore is lock free), &lt;em&gt;orcHandle&lt;/em&gt; is externally visible event that gets signalled when resource count drops to zero and &lt;em&gt;orcAvailable&lt;/em&gt; is an internal event which is signalled when resource count is above zero (just like in a standard semaphore).&lt;/p&gt;&lt;p&gt;Some parts are really really (really!) simple.&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;constructor&lt;/span&gt; TOmniResourceCount.Create(initialCount: cardinal);&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;inherited&lt;/span&gt; Create;&lt;br /&gt;  orcHandle := CreateEvent(&lt;span class="pas-kwd"&gt;nil&lt;/span&gt;, true, (initialCount = &lt;span class="pas-num"&gt;0&lt;/span&gt;), &lt;span class="pas-kwd"&gt;nil&lt;/span&gt;);&lt;br /&gt;  orcAvailable := CreateEvent(&lt;span class="pas-kwd"&gt;nil&lt;/span&gt;, true, (initialCount &amp;lt;&amp;gt; &lt;span class="pas-num"&gt;0&lt;/span&gt;), &lt;span class="pas-kwd"&gt;nil&lt;/span&gt;);&lt;br /&gt;  orcNumResources.Value := initialCount;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniResourceCount.Create }&lt;/span&gt; &lt;br /&gt;&lt;br /&gt;&lt;span class="pas-kwd"&gt;destructor&lt;/span&gt; TOmniResourceCount.Destroy;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  DSiCloseHandleAndNull(orcHandle);&lt;br /&gt;  DSiCloseHandleAndNull(orcAvailable);&lt;br /&gt;  &lt;span class="pas-kwd"&gt;inherited&lt;/span&gt;;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniResourceCount.Destroy }&lt;/span&gt;&lt;br /&gt;&lt;br 
/&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; TOmniResourceCount.GetHandle: THandle;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  Result := orcHandle;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniResourceCount.GetHandle }&lt;/span&gt; &lt;br /&gt;&lt;br /&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; TOmniResourceCount.Allocate: cardinal;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  TryAllocate(Result, INFINITE);&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniResourceCount.Allocate }&lt;/span&gt; &lt;/pre&gt;&lt;p&gt;Release is only slightly more complicated, as it has to provide an atomic ‘change, test and signal’ operation.&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; TOmniResourceCount.Release: cardinal;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  orcLock.Acquire;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;try&lt;/span&gt;&lt;br /&gt;    Result := cardinal(orcNumResources.Increment);&lt;br /&gt;    &lt;span class="pas-kwd"&gt;if&lt;/span&gt; Result = &lt;span class="pas-num"&gt;1&lt;/span&gt; &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;      ResetEvent(orcHandle);&lt;br /&gt;      SetEvent(orcAvailable);&lt;br /&gt;    &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;finally&lt;/span&gt; orcLock.Release; &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TOmniResourceCount.Release }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Now TryAllocate, that’s the problematic one. 
Let’s take a look at how it would be defined if there were no &lt;em&gt;timeout&lt;/em&gt; parameter.&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; TOmniResourceCount.TryAllocate(&lt;span class="pas-kwd"&gt;var&lt;/span&gt; resourceCount: cardinal): boolean;&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  Result := false;&lt;br /&gt;  orcLock.Acquire;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;repeat&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;if&lt;/span&gt; orcNumResources.Value = &lt;span class="pas-num"&gt;0&lt;/span&gt; &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;      orcLock.Release;&lt;br /&gt;      &lt;span class="pas-kwd"&gt;if&lt;/span&gt; WaitForSingleObject(orcAvailable, INFINITE) &amp;lt;&amp;gt; WAIT_OBJECT_0 &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;        Exit;&lt;br /&gt;      orcLock.Acquire;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;if&lt;/span&gt; orcNumResources.Value &amp;gt; &lt;span class="pas-num"&gt;0&lt;/span&gt; &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;      resourceCount := cardinal(orcNumResources.Decrement);&lt;br /&gt;      &lt;span class="pas-kwd"&gt;if&lt;/span&gt; resourceCount = &lt;span class="pas-num"&gt;0&lt;/span&gt; &lt;span class="pas-kwd"&gt;then&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;        SetEvent(orcHandle);&lt;br /&gt;        ResetEvent(orcAvailable);&lt;br /&gt;      &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;      Result := true;&lt;br /&gt;      break; &lt;span class="pas-comment"&gt;//repeat&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;until&lt;/span&gt; false;&lt;br /&gt;  orcLock.Release; &lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span 
class="pas-comment"&gt;{ TOmniResourceCount.TryAllocate }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;The code first acquires the internal lock. If there are no free resources, it releases the lock (so that a &lt;em&gt;Release&lt;/em&gt; in another thread can execute), waits for the &lt;em&gt;available&lt;/em&gt; event to become signalled (which happens when &lt;em&gt;Release&lt;/em&gt; is called) and reacquires the lock. It then checks the resource count again (another thread may have allocated the resource between the &lt;em&gt;WaitForSingleObject&lt;/em&gt; and &lt;em&gt;Acquire&lt;/em&gt;) and decrements it if a resource is available. When the resource count drops to zero, the events are set/reset appropriately.&lt;/p&gt;&lt;p&gt;In reality, TryAllocate doesn’t loop indefinitely (well, it does if you pass it the INFINITE timeout); it calculates an appropriate timeout and passes it to &lt;em&gt;WaitForSingleObject&lt;/em&gt;. Check the &lt;a href="http://code.google.com/p/omnithreadlibrary/source/browse/trunk/OtlSync.pas#360" target="_blank"&gt;source&lt;/a&gt; if you want to learn how that is done.&lt;/p&gt;&lt;p&gt;That’s about all that can be written about TOmniResourceCount. 
Next time, I’ll tackle something much more interesting – a microlocking, O(1) insert/remove (well, most of the time ;) ), dynamically allocated queue.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-8924600292131012429?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/8924600292131012429/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2010/01/three-steps-to-blocking-collection-1.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/8924600292131012429'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/8924600292131012429'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2010/01/three-steps-to-blocking-collection-1.html' title='Three steps to the blocking collection: [1] Inverse semaphore'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-4070748011852359357</id><published>2009-12-30T22:18:00.003+01:00</published><updated>2010-02-07T18:50:51.797+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category 
scheme='http://www.blogger.com/atom/ns#' term='OmniThreadLibrary'/><category scheme='http://www.blogger.com/atom/ns#' term='multithreading'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>A gift to all multithreaded Delphi programmers</title><content type='html'>&lt;p&gt;A (very) prerelease version 1.05, available via &lt;a href="http://omnithreadlibrary.googlecode.com/svn/tags/pre-1.05" target="_blank"&gt;SVN&lt;/a&gt; or as a &lt;a href="http://omnithreadlibrary.googlecode.com/files/OmniThreadLibrary-pre-1.05.zip" target="_blank"&gt;ZIP archive&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;I’ve managed to produce two interesting data structures:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;TOmniQueue (existing class TOmniQueue was renamed to TOmniBoundedQueue) is a dynamically allocated, O(1) enqueue and dequeue, threadsafe,&amp;#160; microlocking queue. The emphasis is on &lt;em&gt;dynamically allocated&lt;/em&gt;. In other words – it grows and shrinks!&lt;/li&gt;    &lt;li&gt;TOmniBlockingCollection is a partial clone (with some enhancements) of .NET’s &lt;a href="http://msdn.microsoft.com/en-us/library/dd267312(VS.100).aspx" target="_blank"&gt;BlockingCollection&lt;/a&gt;.&lt;/li&gt; &lt;/ul&gt;  &lt;p&gt;Have fun and happy new year to all Delphi programmers!&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-4070748011852359357?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/4070748011852359357/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2009/12/gift-for-all-multithreaded-delphi.html#comment-form' 
title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/4070748011852359357'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/4070748011852359357'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2009/12/gift-for-all-multithreaded-delphi.html' title='A gift to all multithreaded Delphi programmers'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-8988932452100729304</id><published>2009-12-18T20:46:00.002+01:00</published><updated>2010-02-07T18:51:11.821+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='OmniThreadLibrary'/><category scheme='http://www.blogger.com/atom/ns#' term='multithreading'/><category scheme='http://www.blogger.com/atom/ns#' term='source code'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>OmniThreadLibrary 1.04b – It’s all Embarcadero’s fault</title><content type='html'>&lt;p&gt;Delphi 2010 Update 2/3 broke OmniThreadLibrary, but as this update was revoked, I didn’t look into the problem at all.&lt;/p&gt;  &lt;p&gt;Now that Update 4/5 is out and OTL is still broken I had no choice but to fix it. 
Luckily for me, &lt;em&gt;ahwux&lt;/em&gt; &lt;a href="http://otl.17slon.com/forum/index.php?topic=65.0" target="_blank"&gt;did most of the work&lt;/a&gt; in detecting the problem and providing an (at least partial) fix.&lt;/p&gt;  &lt;p&gt;OTL is written without resorting to ugly hacks (at least whenever possible). So what could they do to break my code?&lt;/p&gt;  &lt;p&gt;OTL uses RTTI information to implement a &lt;a title="Erlangenizing the OmniThreadLibrary" href="http://17slon.com/blogs/gabr/2008/10/erlangenizing-omnithreadlibrary.html" target="_blank"&gt;‘call by name’ mechanism&lt;/a&gt;. And that’s not the basic RTTI, implemented in the TypInfo unit, but the extended class RTTI from ObjAuto. [In case you want to take a peek at the code – the relevant bits can be found in method TOmniTaskExecutor.GetMethodAddrAndSignature inside the &lt;a href="http://code.google.com/p/omnithreadlibrary/source/browse/trunk/OtlTaskControl.pas" target="_blank"&gt;OtlTaskControl&lt;/a&gt; unit.] The code checks the method signature (number of parameters, their types and the way they are passed to the method) to see if it matches one of three supported signatures.&lt;/p&gt;  &lt;p&gt;For example, the first parameter must be the &lt;em&gt;Self&lt;/em&gt; object and the code checked this by testing &lt;em&gt;(params^.Flags = []) and (paramType^.Kind = tkClass)&lt;/em&gt;. This worked in Delphi 2007, 2009, and 2010 – but only in the original release and Update 1. Starting with Update 2, &lt;em&gt;params^.Flags&lt;/em&gt; equals &lt;em&gt;[pfAddress]&lt;/em&gt; in this case.&lt;/p&gt;  &lt;p&gt;Similarly, constant parameters had flags &lt;em&gt;[pfVar]&lt;/em&gt; up to D2010 Update 1 while this changed to &lt;em&gt;[pfConst, pfReference] &lt;/em&gt;in D2010 Update 2.&lt;/p&gt;  &lt;p&gt;I’m not against those changes. After all, the RTTI parameter description is now much more accurate. But why do they have to make this change in an update!? 
[Yes, I’m screaming.]&lt;/p&gt;  &lt;p&gt;The problem here is that I can’t detect at compile time whether Update 4 has been installed. I can easily check for Delphi 2010, but that’s all – there’s no way (that I’m aware of) to detect which update is installed. So now my code looks like this:&lt;/p&gt;  &lt;pre class="pas-source"&gt;  &lt;span class="pas-kwd"&gt;function&lt;/span&gt; VerifyObjectFlags(flags, requiredFlags: TParamFlags): boolean;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;    Result := ((flags * requiredFlags) = requiredFlags);&lt;br /&gt;    &lt;span class="pas-kwd"&gt;if&lt;/span&gt; &lt;span class="pas-kwd"&gt;not&lt;/span&gt; Result &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br /&gt;      Exit;&lt;br /&gt;    flags := flags - requiredFlags;&lt;br /&gt;    &lt;span class="pas-preproc"&gt;{$IF CompilerVersion &amp;lt; 21}&lt;/span&gt;&lt;br /&gt;    Result := (flags = []);&lt;br /&gt;    &lt;span class="pas-preproc"&gt;{$ELSEIF CompilerVersion = 21}&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-comment"&gt;// Delphi 2010 original and Update 1: []&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-comment"&gt;// Delphi 2010 Update 2 and 4: [pfAddress]&lt;/span&gt;&lt;br /&gt;    Result := (flags = []) &lt;span class="pas-kwd"&gt;or&lt;/span&gt; (flags = [pfAddress]);&lt;br /&gt;    &lt;span class="pas-preproc"&gt;{$ELSE}&lt;/span&gt; &lt;span class="pas-comment"&gt;// best guess&lt;/span&gt;&lt;br /&gt;    Result := (flags = [pfAddress]);&lt;br /&gt;    &lt;span class="pas-preproc"&gt;{$IFEND}&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ VerifyObjectFlags }&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;function&lt;/span&gt; VerifyConstFlags(flags: TParamFlags): boolean;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-preproc"&gt;{$IF CompilerVersion &amp;lt; 
21}&lt;/span&gt;&lt;br /&gt;    Result := (flags = [pfVar]);&lt;br /&gt;    &lt;span class="pas-preproc"&gt;{$ELSEIF CompilerVersion = 21}&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-comment"&gt;// Delphi 2010 original and Update 1: [pfVar]&lt;/span&gt;&lt;br /&gt;    &lt;span class="pas-comment"&gt;// Delphi 2010 Update 2 and 4: [pfConst, pfReference]&lt;/span&gt;&lt;br /&gt;    Result := (flags = [pfVar]) &lt;span class="pas-kwd"&gt;or&lt;/span&gt; (flags = [pfConst, pfReference]);&lt;br /&gt;    &lt;span class="pas-preproc"&gt;{$ELSE}&lt;/span&gt; &lt;span class="pas-comment"&gt;// best guess&lt;/span&gt;&lt;br /&gt;    Result := (flags = [pfConst, pfReference]);&lt;br /&gt;    &lt;span class="pas-preproc"&gt;{$IFEND}&lt;/span&gt;&lt;br /&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ VerifyConstFlags }&lt;/span&gt;&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;Ugly!&lt;/p&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;If anybody from Embarcadero is reading this: Could you please refrain from making such changes in IDE updates? 
Thanks in advance.&lt;/p&gt;&lt;br /&gt;&lt;br /&gt;&lt;p&gt;Oh, I almost forgot – OTL 1.04b is &lt;a href="http://code.google.com/p/omnithreadlibrary/downloads/list" target="_blank"&gt;available on the Google Code.&lt;/a&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-8988932452100729304?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/8988932452100729304/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2009/12/omnithreadlibrary-104b-its-all.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/8988932452100729304'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/8988932452100729304'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2009/12/omnithreadlibrary-104b-its-all.html' title='OmniThreadLibrary 1.04b – It’s all Embarcadero’s fault'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-3576203589319782746</id><published>2009-12-13T15:41:00.001+01:00</published><updated>2009-12-13T15:41:17.710+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><title type='text'>DsiWin31 1.53a</title><content type='html'>&lt;p&gt;&lt;a href="http://gp.17slon.com/gp/dsiwin32.htm" target="_blank"&gt;This release&lt;/a&gt; fixes a nasty bug (introduced in release 1.51) which caused various TDSiRegistry functions (and other DSi code using those functions) to fail on Delphi 2009/2010.&lt;/p&gt;  &lt;p&gt;Other changes:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;Implemented DSiDeleteRegistryValue.&lt;/li&gt;    &lt;li&gt;Added parameter 'access' to the DSiKillRegistry.&lt;/li&gt;    &lt;li&gt;[Mitja] Fixed allocation in DSiGetUserName.&lt;/li&gt;    &lt;li&gt;[Mitja] Also catch 'error' output in DSiExecuteAndCapture.&lt;/li&gt;    &lt;li&gt;DSiAddApplicationToFirewallExceptionList renamed to DSiAddApplicationToFirewallExceptionListXP.&lt;/li&gt;    &lt;li&gt;Added DSiAddApplicationToFirewallExceptionListAdvanced which uses Advanced Firewall interface, available on Vista+.&lt;/li&gt;    &lt;li&gt;DSiAddApplicationToFirewallExceptionList now calls either DSiAddApplicationToFirewallExceptionListXP or DSiAddApplicationToFirewallExceptionListAdvanced, depending on OS version.&lt;/li&gt;    &lt;li&gt;Implemented functions to remove application from the firewall exception list: DSiRemoveApplicationFromFirewallExceptionList, DSiRemoveApplicationFromFirewallExceptionListAdvanced, DSiRemoveApplicationFromFirewallExceptionListXP.&lt;/li&gt; &lt;/ul&gt;  &lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-3576203589319782746?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/3576203589319782746/comments/default' title='Post 
Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2009/12/dsiwin31-153a.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/3576203589319782746'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/3576203589319782746'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2009/12/dsiwin31-153a.html' title='DsiWin31 1.53a'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-5619904183900974532</id><published>2009-12-13T15:04:00.002+01:00</published><updated>2010-02-07T18:51:32.911+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='OmniThreadLibrary'/><category scheme='http://www.blogger.com/atom/ns#' term='multithreading'/><title type='text'>OmniThreadLibrary 1.04a</title><content type='html'>&lt;p&gt;This minor release was published mostly because of exception handling problems when the thread pool was used in version 1.04. If you’re using the thread pool feature and have OTL 1.04 installed, I’d strongly urge you to upgrade.&lt;/p&gt;  &lt;p&gt;Besides the code fix, I sneaked in a small API upgrade. The IOmniTask interface now defines the methods RegisterWaitObject/UnregisterWaitObject, which the task can use to wait on any waitable object when using the TOmniWorker approach (no main thread loop). 
There’s also a new demo application 31_WaitableObjects which demonstrates the use of this feature.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-5619904183900974532?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/5619904183900974532/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2009/12/omnithreadlibrary-104a.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/5619904183900974532'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/5619904183900974532'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2009/12/omnithreadlibrary-104a.html' title='OmniThreadLibrary 1.04a'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-1570873311312221531</id><published>2009-11-30T08:58:00.001+01:00</published><updated>2010-02-07T18:51:55.284+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='OmniThreadLibrary'/><category scheme='http://www.blogger.com/atom/ns#' 
term='multithreading'/><category scheme='http://www.blogger.com/atom/ns#' term='source code'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>OmniThreadLibrary patterns – Task controller needs an owner</title><content type='html'>&lt;p&gt;Pop quiz. What’s wrong with this code?&lt;/p&gt;&lt;pre class="pas-source"&gt;CreateTask(MyWorker).Run;&lt;/pre&gt;&lt;p&gt;Looks fine, but it doesn’t work. In most cases, running this code fragment would cause an immediate access violation.&lt;/p&gt;&lt;p&gt;This is a common problem amongst new OTL users. Heck, even I have fallen into this trap!&lt;/p&gt;&lt;p&gt;The problem here is that &lt;a href="http://17slon.com/blogs/gabr/2008/09/omnithreadlibrary-patterns-how-to-not.html" target="_blank"&gt;CreateTask&lt;/a&gt; returns the IOmniTaskControl interface, or &lt;em&gt;task controller&lt;/em&gt;. This interface must be stored in some persistent location, or the task controller will be destroyed immediately after Run is called (because the reference count would fall to 0).&lt;/p&gt;&lt;p&gt;A common solution is to just store the interface in some field.&lt;/p&gt;&lt;pre class="pas-source"&gt;FTaskControl := CreateTask(MyWorker).Run;&lt;/pre&gt;&lt;p&gt;When you don’t need the background worker anymore, you should terminate the task and free the task controller.&lt;/p&gt;&lt;pre class="pas-source"&gt;FTaskControl.Terminate;&lt;br /&gt;&lt;br /&gt;FTaskControl := &lt;span class="pas-kwd"&gt;nil&lt;/span&gt;;&lt;/pre&gt;&lt;p&gt;This works for background workers with a long life span – for example if there’s a background thread running all the time the program itself is running. But what if you are starting a short-term background task? 
In this case you should monitor it with TOmniEventMonitor and clean up the task controller reference in the OnTerminate event handler.&lt;/p&gt;&lt;pre class="pas-source"&gt;FTaskControl := CreateTask(MyWorker).MonitorWith(eventMonitor).Run;&lt;/pre&gt;&lt;p&gt;In eventMonitor.OnTerminate:&lt;/p&gt;&lt;pre class="pas-source"&gt;FTaskControl := &lt;span class="pas-kwd"&gt;nil&lt;/span&gt;;&lt;/pre&gt;&lt;p&gt;As it turns out, the event monitor keeps the task controller interface stored in its own list, which will also keep the task controller alive. That’s why the following code also works.&lt;/p&gt;&lt;pre class="pas-source"&gt;CreateTask(MyWorker).MonitorWith(eventMonitor).Run;&lt;/pre&gt;&lt;p&gt;Since OTL v1.04 you have another possibility – write a method to free the task controller and pass it to OnTerminated.&lt;/p&gt;&lt;pre class="pas-source"&gt;FTaskControl := CreateTask(MyWorker).OnTerminated(FreeTaskControl).Run;&lt;br /&gt;&lt;br /&gt;&lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; FreeTaskControl(&lt;span class="pas-kwd"&gt;const&lt;/span&gt; task: IOmniTaskControl);&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  FTaskControl := &lt;span class="pas-kwd"&gt;nil&lt;/span&gt;;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;/pre&gt;&lt;p&gt;If you’re using Delphi 2009 or 2010, you can put the cleanup code in an anonymous method.&lt;/p&gt;&lt;pre class="pas-source"&gt;FTaskControl := CreateTask(MyWorker).OnTerminated(&lt;br /&gt;&lt;span class="pas-kwd"&gt;procedure&lt;/span&gt;(&lt;span class="pas-kwd"&gt;const&lt;/span&gt; task: IOmniTaskControl) &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  FTaskControl := &lt;span class="pas-kwd"&gt;nil&lt;/span&gt;;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;)&lt;br /&gt;.Run;&lt;/pre&gt;&lt;p&gt;OnTerminated does its magic by hooking the task controller into an internal event monitor. 
Therefore, you can get really tricky and just write a “null” OnTerminated.&lt;/p&gt;&lt;pre class="pas-source"&gt;CreateTask(MyWorker).OnTerminated(DoNothing).Run;&lt;br /&gt;&lt;br /&gt;&lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; DoNothing(&lt;span class="pas-kwd"&gt;const&lt;/span&gt; task: IOmniTaskControl);&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;/pre&gt;&lt;p&gt;As that looks quite ugly, I added the method Unobserved just a few days before version 1.04 was released. This method does essentially the same as the “null” OnTerminated approach, except that the code looks nicer and the programmer’s intentions are more clearly expressed.&lt;/p&gt;&lt;pre class="pas-source"&gt;CreateTask(MyWorker).Unobserved.Run;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-1570873311312221531?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/1570873311312221531/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2009/11/omnithreadlibrary-patterns-task.html#comment-form' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/1570873311312221531'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/1570873311312221531'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2009/11/omnithreadlibrary-patterns-task.html' title='OmniThreadLibrary patterns – Task controller needs an 
owner'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-7345052967914416155</id><published>2009-11-23T08:54:00.002+01:00</published><updated>2010-02-07T18:52:14.863+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='OmniThreadLibrary'/><category scheme='http://www.blogger.com/atom/ns#' term='multithreading'/><category scheme='http://www.blogger.com/atom/ns#' term='source code'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>OmniThreadLibrary 1.04</title><content type='html'>&lt;p&gt;Stable release is out! 
Get it while it’s still hot!&lt;/p&gt;  &lt;p align="left"&gt;&lt;a href="http://code.google.com/p/omnithreadlibrary/downloads/list" target="_blank"&gt;Click to download!&lt;/a&gt;&lt;/p&gt;  &lt;p align="left"&gt;New since 1.04 alpha:&lt;/p&gt;  &lt;ul&gt;   &lt;li&gt;     &lt;div align="left"&gt;Bugfixes in the thread pool code.&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;Implemented IOmniTaskControl.Unobserved behaviour modifier.&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;D2010 designtime package fixed.&lt;/div&gt;   &lt;/li&gt;    &lt;li&gt;     &lt;div align="left"&gt;D2009 packages and test project group updated (thanks to mghie).&lt;/div&gt;   &lt;/li&gt; &lt;/ul&gt;  &lt;p align="left"&gt;New since 1.03: &lt;a href="http://17slon.com/blogs/gabr/2009/11/omnithreadlibrary-104-alpha.html" target="_blank"&gt;read full list&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-7345052967914416155?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/7345052967914416155/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2009/11/omnithreadlibrary-104.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/7345052967914416155'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/7345052967914416155'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2009/11/omnithreadlibrary-104.html' title='OmniThreadLibrary 
1.04'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-5298995792601809584</id><published>2009-11-17T13:17:00.002+01:00</published><updated>2010-02-07T18:52:34.408+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='OmniThreadLibrary'/><category scheme='http://www.blogger.com/atom/ns#' term='multithreading'/><category scheme='http://www.blogger.com/atom/ns#' term='source code'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>OmniThreadLibrary 1.04 now in beta</title><content type='html'>&lt;p&gt;I’ve released OTL 1.04 beta, which is functionally the same as the &lt;a href="http://17slon.com/blogs/gabr/2009/11/omnithreadlibrary-104-alpha.html" target="_blank"&gt;alpha release&lt;/a&gt; but contains some bug fixes. You can download it from &lt;a href="http://code.google.com/p/omnithreadlibrary/downloads/list" target="_blank"&gt;Google Code&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;1.04 final will be released on 2009-11-23, i.e. 
next Monday.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-5298995792601809584?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/5298995792601809584/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2009/11/omnithreadlibrary-104-now-in-beta.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/5298995792601809584'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/5298995792601809584'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2009/11/omnithreadlibrary-104-now-in-beta.html' title='OmniThreadLibrary 1.04 now in beta'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-4243047362712076524</id><published>2009-11-13T20:44:00.002+01:00</published><updated>2010-02-07T18:52:59.189+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='OmniThreadLibrary'/><category scheme='http://www.blogger.com/atom/ns#' term='multithreading'/><category scheme='http://www.blogger.com/atom/ns#' 
term='source code'/><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>OmniThreadLibrary 1.04 alpha</title><content type='html'>&lt;p&gt;Not yet beta as I still have to fix a few TODOs …&lt;/p&gt;  &lt;p&gt;&lt;a title="OmniThreadLibrary @ Google Code" href="http://code.google.com/p/omnithreadlibrary/downloads/list" target="_blank"&gt;Get it here.&lt;/a&gt;&lt;/p&gt;  &lt;h2&gt;COMPATIBILITY ISSUES&lt;/h2&gt;  &lt;ul&gt;   &lt;li&gt;Changed semantics in comm event notifications! When you get the 'new message' event, read all messages from the queue in a loop! &lt;/li&gt;    &lt;li&gt;The message is passed to the TOmniEventMonitor.OnTaskMessage handler. There's no need to read from the Comm queue in the handler. &lt;/li&gt;    &lt;li&gt;Exceptions in tasks are now visible by default. To hide them, use IOmniTaskControl.SilentExceptions. Test 13_Exceptions was improved to demonstrate this behaviour. &lt;/li&gt; &lt;/ul&gt;  &lt;h2&gt;Other changes&lt;/h2&gt;  &lt;ul&gt;   &lt;li&gt;Works with Delphi 2010. &lt;/li&gt;    &lt;li&gt;Default communication queue size reduced to 1000 messages. &lt;/li&gt;    &lt;li&gt;Support for 'wait and send' in IOmniCommunicationEndpoint.SendWait. &lt;/li&gt;    &lt;li&gt;Communication subsystem implements the observer pattern. &lt;/li&gt;    &lt;li&gt;WideStrings can be sent over the communication channel. &lt;/li&gt;    &lt;li&gt;New event TOmniEventMonitor.OnTaskUndeliveredMessage is called after the task is terminated for all messages still waiting in the message queue. &lt;/li&gt;    &lt;li&gt;Implemented automatic event monitor with methods IOmniTaskControl.OnMessage and OnTerminated. Both support 'procedure of object' and 'reference to procedure' parameters. &lt;/li&gt;    &lt;li&gt;New unit OtlSync contains (old) TOmniCS and IOmniCriticalSection together with (new) OmniMREW - very simple and extremely fast multi-reader-exclusive-writer - and atomic CompareAndSwap functions. 
&lt;/li&gt;    &lt;li&gt;New unit OtlHooks contains API that can be used by external libraries to hook into OTL thread creation/destruction process and into exception chain. &lt;/li&gt;    &lt;li&gt;All known bugs fixed. &lt;/li&gt; &lt;/ul&gt;  &lt;h2&gt;New demos&lt;/h2&gt;  &lt;ul&gt;   &lt;li&gt;25_WaitableComm: Demo for ReceiveWait and SendWait. &lt;/li&gt;    &lt;li&gt;26_MultiEventMonitor: How to run multiple event monitors in parallel. &lt;/li&gt;    &lt;li&gt;27_RecursiveTree: Parallel tree processing. &lt;/li&gt;    &lt;li&gt;28_Hooks: Demo for the new hook system. &lt;/li&gt;    &lt;li&gt;29_ImplicitEventMonitor: Demo for OnMessage and OnTerminated, named method approach. &lt;/li&gt;    &lt;li&gt;30_AnonymousEventMonitor: Demo for OnMessage and OnTerminated, anonymous method approach. &lt;/li&gt; &lt;/ul&gt;  &lt;h2&gt;A teaser from demo 30&lt;/h2&gt;  &lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; TfrmAnonymousEventMonitorDemo.btnHelloClick(Sender: TObject);&lt;br /&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;  btnHello.Enabled := false;&lt;br /&gt;  FAnonTask := CreateTask(&lt;br /&gt;    &lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; (task: IOmniTask) &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;      task.Comm.Send(&lt;span class="pas-num"&gt;0&lt;/span&gt;, Format(&lt;span class="pas-str"&gt;'Hello, world! 
Reporting from thread %d'&lt;/span&gt;,&lt;br /&gt;        [GetCurrentThreadID]));&lt;br /&gt;    &lt;span class="pas-kwd"&gt;end&lt;/span&gt;,&lt;br /&gt;    &lt;span class="pas-str"&gt;'HelloWorld'&lt;/span&gt;)&lt;br /&gt;  .OnMessage(&lt;br /&gt;    &lt;span class="pas-kwd"&gt;procedure&lt;/span&gt;(&lt;span class="pas-kwd"&gt;const&lt;/span&gt; task: IOmniTaskControl; &lt;span class="pas-kwd"&gt;const&lt;/span&gt; msg: TOmniMessage) &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;      lbLog.ItemIndex := lbLog.Items.Add(Format(&lt;span class="pas-str"&gt;'%d:[%d/%s] %d|%s'&lt;/span&gt;,&lt;br /&gt;        [GetCurrentThreadID, task.UniqueID, task.&lt;span class="pas-kwd"&gt;Name&lt;/span&gt;, msg.msgID,&lt;br /&gt;         msg.msgData.AsString]));&lt;br /&gt;    &lt;span class="pas-kwd"&gt;end&lt;/span&gt;)&lt;br /&gt;  .OnTerminated(&lt;br /&gt;    &lt;span class="pas-kwd"&gt;procedure&lt;/span&gt;(&lt;span class="pas-kwd"&gt;const&lt;/span&gt; task: IOmniTaskControl) &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br /&gt;      lbLog.ItemIndex := lbLog.Items.Add(Format(&lt;span class="pas-str"&gt;'[%d/%s] Terminated'&lt;/span&gt;,&lt;br /&gt;        [task.UniqueID, task.&lt;span class="pas-kwd"&gt;Name&lt;/span&gt;]));&lt;br /&gt;      btnHello.Enabled := true;&lt;br /&gt;      FAnonTask := &lt;span class="pas-kwd"&gt;nil&lt;/span&gt;;&lt;br /&gt;    &lt;span class="pas-kwd"&gt;end&lt;/span&gt;)&lt;br /&gt;  .Run;&lt;br /&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-4243047362712076524?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://www.blogger.com/feeds/29331675/4243047362712076524/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2009/11/omnithreadlibrary-104-alpha.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/4243047362712076524'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/4243047362712076524'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2009/11/omnithreadlibrary-104-alpha.html' title='OmniThreadLibrary 1.04 alpha'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-607831959336695084</id><published>2009-11-06T23:46:00.001+01:00</published><updated>2009-11-06T23:46:22.926+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' term='this I believe'/><title type='text'>Do we need DelphiOverflow.com?</title><content type='html'>&lt;p&gt;Today I was interviewed for &lt;a title="The Podcast at Delphi.org" href="http://www.delphi.org/category/podcast/" target="_blank"&gt;the greatest Delphi podcast of them all&lt;/a&gt; and Jim asked me a question I didn’t know how to answer: “Do you think there should be a Delphi equivalent of &lt;a href="http://stackoverflow.com" target="_blank"&gt;StackOverflow.com&lt;/a&gt;?” I’m afraid my answer was somewhere along the lines of: “Hmph. Yes. Very good question. Very good. Let’s talk about something else.”&lt;/p&gt;  &lt;p&gt;And now I can’t get it out of my head. 
Should there be &lt;font color="#004080"&gt;delphioverflow.com&lt;/font&gt;? What could we get out of it? I would be the first to admit that the StackOverflow model is the greatest thing since Belgian waffles and that having Delphi questions and answers in such a form would be very useful.&lt;/p&gt;  &lt;p&gt;But wait – there already &lt;strong&gt;are&lt;/strong&gt; Delphi questions on StackOverflow! Not as many as C# questions, but still enough that Delphi is seen on the front page and that other users can read about it and see that it is alive and well. Even more – there are enough knowledgeable Delphi programmers on SO that most questions get great answers in less than five minutes.&lt;/p&gt;  &lt;p&gt;What other positive result could such a site bring? Maybe Embarcadero people would be more eager to participate and answer questions on their own server? Maybe, but I’m not sure. The Delphi R&amp;amp;D team is very busy and sometimes they can’t even find time to answer newsgroup questions. And I’m pretty sure that – whatever such a change would bring – newsgroups wouldn’t go away.&lt;/p&gt;  &lt;p&gt;Let’s take a look from another perspective. What would be the negative consequences? Fewer Delphi questions on StackOverflow. And that’s a Bad Thing because it lowers Delphi’s discoverability. We want to talk about Delphi in public places, not on some secluded server!&lt;/p&gt;  &lt;p&gt;Now I know how to answer. No, I don’t think we need DelphiOverflow. 
We need more Delphi R&amp;amp;D people answering questions on StackOverflow.&lt;/p&gt;  &lt;p&gt;(Your comments on the topic are very much welcome, as always!)&lt;/p&gt;  &lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-607831959336695084?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/607831959336695084/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2009/11/do-we-need-delphioverflowcom.html#comment-form' title='21 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/607831959336695084'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/607831959336695084'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2009/11/do-we-need-delphioverflowcom.html' title='Do we need DelphiOverflow.com?'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>21</thr:total></entry><entry><id>tag:blogger.com,1999:blog-29331675.post-6136837241794873072</id><published>2009-11-04T09:00:00.002+01:00</published><updated>2009-11-04T09:00:01.616+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open source'/><category scheme='http://www.blogger.com/atom/ns#' term='Delphi'/><category scheme='http://www.blogger.com/atom/ns#' 
term='programming'/><title type='text'>GpStuff 1.19 &amp; GpLists 1.43</title><content type='html'>&lt;p&gt;I’ll finish my short overview of changes in various Gp units with the new GpStuff and GpLists.&lt;/p&gt;&lt;p&gt;Let’s deal with the latter first. There were only two changes. Firstly, the &lt;em&gt;Slice&lt;/em&gt;, &lt;em&gt;Walk&lt;/em&gt; and &lt;em&gt;WalkKV&lt;/em&gt; enumerators got a &lt;em&gt;step&lt;/em&gt; parameter. Now Delphi is really as powerful as Basic!&lt;/p&gt;&lt;p&gt;Secondly, I’ve added the &lt;em&gt;FreeObjects&lt;/em&gt; method to the &lt;em&gt;TStringList&lt;/em&gt; helper. It will walk the string list and free all associated objects – something that is not done automatically in the &lt;em&gt;TStringList&lt;/em&gt; destructor. A very useful helper, if I may say so.&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; TGpStringListHelper.FreeObjects;&lt;br/&gt;&lt;span class="pas-kwd"&gt;var&lt;/span&gt;&lt;br/&gt;  iObject: integer;&lt;br/&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br/&gt;  &lt;span class="pas-kwd"&gt;for&lt;/span&gt; iObject := &lt;span class="pas-num"&gt;0&lt;/span&gt; &lt;span class="pas-kwd"&gt;to&lt;/span&gt; Count - &lt;span class="pas-num"&gt;1&lt;/span&gt; &lt;span class="pas-kwd"&gt;do&lt;/span&gt; &lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br/&gt;    Objects[iObject].Free;&lt;br/&gt;    Objects[iObject] := &lt;span class="pas-kwd"&gt;nil&lt;/span&gt;;&lt;br/&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br/&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TGpStringListHelper.FreeObjects }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Changes in GpStuff were more significant.&lt;/p&gt;&lt;p&gt;There are new enumerator factories. 
&lt;em&gt;EnumStrings&lt;/em&gt; lets you do stuff like this:&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;for&lt;/span&gt; s &lt;span class="pas-kwd"&gt;in&lt;/span&gt; EnumStrings([&lt;span class="pas-str"&gt;'one'&lt;/span&gt;, &lt;span class="pas-str"&gt;'two'&lt;/span&gt;, &lt;span class="pas-str"&gt;'three'&lt;/span&gt;]) &lt;span class="pas-kwd"&gt;do&lt;/span&gt;&lt;br/&gt;  &lt;span class="pas-comment"&gt;// ...&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;&lt;em&gt;EnumValues&lt;/em&gt; will do the same for integer arrays. &lt;em&gt;EnumPairs&lt;/em&gt; is similar to &lt;em&gt;EnumStrings&lt;/em&gt; but returns (key, value) pairs:&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;var&lt;/span&gt;&lt;br/&gt;  kv: TGpStringPair;&lt;br/&gt;&lt;br/&gt;&lt;span class="pas-kwd"&gt;for&lt;/span&gt; kv &lt;span class="pas-kwd"&gt;in&lt;/span&gt; EnumPairs([&lt;span class="pas-str"&gt;'1'&lt;/span&gt;, &lt;span class="pas-str"&gt;'one'&lt;/span&gt;, &lt;span class="pas-str"&gt;'2'&lt;/span&gt;, &lt;span class="pas-str"&gt;'two'&lt;/span&gt;]) &lt;span class="pas-kwd"&gt;do&lt;/span&gt;&lt;br/&gt;  &lt;span class="pas-comment"&gt;// kv.key = '1', kv.value = 'one'&lt;/span&gt;&lt;br/&gt;  &lt;span class="pas-comment"&gt;// kv.key = '2', kv.value = 'two'&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;There is also &lt;em&gt;EnumList&lt;/em&gt;, which enumerates lists of items (where the whole list itself is a string):&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;for&lt;/span&gt; s &lt;span class="pas-kwd"&gt;in&lt;/span&gt; EnumList(&lt;span class="pas-str"&gt;'one,two,&amp;quot;one,two,three&amp;quot;'&lt;/span&gt;, &lt;span class="pas-str"&gt;','&lt;/span&gt;, &lt;span class="pas-str"&gt;'&amp;quot;'&lt;/span&gt;) &lt;span class="pas-kwd"&gt;do&lt;/span&gt;&lt;br/&gt;  &lt;span class="pas-comment"&gt;// s = 'one'&lt;/span&gt;&lt;br/&gt;  &lt;span class="pas-comment"&gt;// s = 'two'&lt;/span&gt;&lt;br/&gt;  &lt;span 
class="pas-comment"&gt;// s = 'one,two,three'&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;There were some changes in &lt;em&gt;TGp4AlignedInt&lt;/em&gt; internals – now all values are integer, not cardinal (because the underlying Windows implementation works with integers). There is also a new “Compare and Swap” function (&lt;em&gt;CAS&lt;/em&gt;) in &lt;em&gt;TGp4AlignedInt&lt;/em&gt; and &lt;em&gt;TGp8AlignedInt64&lt;/em&gt; (which was previously called &lt;em&gt;TGp8AlignedInt&lt;/em&gt;).&lt;/p&gt;&lt;p&gt;Finally, there is a new interface and class: &lt;em&gt;IGpTraceable&lt;/em&gt; and &lt;em&gt;TGpTraceable&lt;/em&gt;.&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;type&lt;/span&gt;&lt;br/&gt;  IGpTraceable = &lt;span class="pas-kwd"&gt;interface&lt;/span&gt;(IInterface)&lt;br/&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  GetTraceReferences: boolean; &lt;span class="pas-kwd"&gt;stdcall&lt;/span&gt;;&lt;br/&gt;    &lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; SetTraceReferences(&lt;span class="pas-kwd"&gt;const&lt;/span&gt; value: boolean); &lt;span class="pas-kwd"&gt;stdcall&lt;/span&gt;;&lt;br/&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  _AddRef: integer; &lt;span class="pas-kwd"&gt;stdcall&lt;/span&gt;;&lt;br/&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  _Release: integer; &lt;span class="pas-kwd"&gt;stdcall&lt;/span&gt;;&lt;br/&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  GetRefCount: integer; &lt;span class="pas-kwd"&gt;stdcall&lt;/span&gt;;&lt;br/&gt;    &lt;span class="pas-kwd"&gt;property&lt;/span&gt; TraceReferences: boolean &lt;span class="pas-kwd"&gt;read&lt;/span&gt; GetTraceReferences &lt;span class="pas-kwd"&gt;write&lt;/span&gt; SetTraceReferences;&lt;br/&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ IGpTraceable }&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;  TGpTraceable = &lt;span class="pas-kwd"&gt;class&lt;/span&gt;(TInterfacedObject, 
IGpTraceable)&lt;br/&gt;  &lt;span class="pas-kwd"&gt;private&lt;/span&gt;&lt;br/&gt;    gtTraceRef: boolean;&lt;br/&gt;  &lt;span class="pas-kwd"&gt;public&lt;/span&gt;&lt;br/&gt;    &lt;span class="pas-kwd"&gt;destructor&lt;/span&gt;  Destroy; &lt;span class="pas-kwd"&gt;override&lt;/span&gt;;&lt;br/&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  _AddRef: integer; &lt;span class="pas-kwd"&gt;stdcall&lt;/span&gt;;&lt;br/&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  _Release: integer; &lt;span class="pas-kwd"&gt;stdcall&lt;/span&gt;;&lt;br/&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  GetRefCount: integer; &lt;span class="pas-kwd"&gt;stdcall&lt;/span&gt;;&lt;br/&gt;    &lt;span class="pas-kwd"&gt;function&lt;/span&gt;  GetTraceReferences: boolean; &lt;span class="pas-kwd"&gt;stdcall&lt;/span&gt;;&lt;br/&gt;    &lt;span class="pas-kwd"&gt;procedure&lt;/span&gt; SetTraceReferences(&lt;span class="pas-kwd"&gt;const&lt;/span&gt; value: boolean); &lt;span class="pas-kwd"&gt;stdcall&lt;/span&gt;;&lt;br/&gt;    &lt;span class="pas-kwd"&gt;property&lt;/span&gt; TraceReferences: boolean &lt;span class="pas-kwd"&gt;read&lt;/span&gt; GetTraceReferences &lt;span class="pas-kwd"&gt;write&lt;/span&gt; SetTraceReferences;&lt;br/&gt;  &lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TGpTraceable }&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;The &lt;em&gt;TGpTraceable&lt;/em&gt; class helps me debug interface problems. It exposes the &lt;em&gt;GetRefCount&lt;/em&gt; function, which returns the reference count, and it can trigger a debugger interrupt on each reference count change if the &lt;em&gt;TraceReferences&lt;/em&gt; property is set. 
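&lt;/p&gt;&lt;p&gt;To make that a bit more concrete, here is a minimal usage sketch (it is not part of GpStuff itself). It assumes that &lt;em&gt;IGpTraceable&lt;/em&gt; and &lt;em&gt;TGpTraceable&lt;/em&gt; are exported from the GpStuff unit exactly as declared above and that the unit and its dependencies are on the search path; the program name and the variables &lt;em&gt;first&lt;/em&gt; and &lt;em&gt;second&lt;/em&gt; are made up for the illustration. Run it under the debugger, because the breakpoint instruction is not meant to be hit without one attached.&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;program&lt;/span&gt; TraceableSketch; &lt;span class="pas-comment"&gt;// hypothetical name&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;{$APPTYPE CONSOLE}&lt;br/&gt;&lt;br/&gt;&lt;span class="pas-kwd"&gt;uses&lt;/span&gt;&lt;br/&gt;  GpStuff; &lt;span class="pas-comment"&gt;// assumption: declares IGpTraceable and TGpTraceable&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;span class="pas-kwd"&gt;var&lt;/span&gt;&lt;br/&gt;  first, second: IGpTraceable;&lt;br/&gt;&lt;br/&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br/&gt;  first := TGpTraceable.Create;   &lt;span class="pas-comment"&gt;// 'first' holds the only reference&lt;/span&gt;&lt;br/&gt;  Writeln(first.GetRefCount);&lt;br/&gt;  second := first;                &lt;span class="pas-comment"&gt;// the assignment calls _AddRef&lt;/span&gt;&lt;br/&gt;  Writeln(first.GetRefCount);&lt;br/&gt;  first.TraceReferences := true;  &lt;span class="pas-comment"&gt;// from now on _AddRef and _Release execute int 3&lt;/span&gt;&lt;br/&gt;  second := &lt;span class="pas-kwd"&gt;nil&lt;/span&gt;;                  &lt;span class="pas-comment"&gt;// calls _Release; the debugger should break inside it&lt;/span&gt;&lt;br/&gt;  first.TraceReferences := false; &lt;span class="pas-comment"&gt;// stop tracing before the final release&lt;/span&gt;&lt;br/&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;.&lt;/pre&gt;&lt;p&gt;Tracing is switched off again at the end so that the release performed during program shutdown doesn’t stop in the debugger. The two methods that do the actual work look like this: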
&lt;/p&gt;&lt;pre class="pas-source"&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; TGpTraceable._AddRef: integer;&lt;br/&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br/&gt;  Result := &lt;span class="pas-kwd"&gt;inherited&lt;/span&gt; _AddRef;&lt;br/&gt;  &lt;span class="pas-kwd"&gt;if&lt;/span&gt; gtTraceRef &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br/&gt;    &lt;span class="pas-kwd"&gt;asm&lt;/span&gt; &lt;span class="pas-asm"&gt;int&lt;/span&gt; &lt;span class="pas-asm"&gt;3&lt;/span&gt;&lt;span class="pas-asm"&gt;;&lt;/span&gt; &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br/&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TGpTraceable._AddRef }&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;span class="pas-kwd"&gt;function&lt;/span&gt; TGpTraceable._Release: integer;&lt;br/&gt;&lt;span class="pas-kwd"&gt;begin&lt;/span&gt;&lt;br/&gt;  &lt;span class="pas-kwd"&gt;if&lt;/span&gt; gtTraceRef &lt;span class="pas-kwd"&gt;then&lt;/span&gt;&lt;br/&gt;    &lt;span class="pas-kwd"&gt;asm&lt;/span&gt; &lt;span class="pas-asm"&gt;int&lt;/span&gt; &lt;span class="pas-asm"&gt;3&lt;/span&gt;&lt;span class="pas-asm"&gt;;&lt;/span&gt; &lt;span class="pas-kwd"&gt;end&lt;/span&gt;;&lt;br/&gt;  Result := &lt;span class="pas-kwd"&gt;inherited&lt;/span&gt; _Release;&lt;br/&gt;&lt;span class="pas-kwd"&gt;end&lt;/span&gt;; &lt;span class="pas-comment"&gt;{ TGpTraceable._Release }&lt;/span&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;font size=-2&gt;---&lt;br/&gt;Published under the &lt;a href="http://creativecommons.org/licenses/by/3.0/"&gt;Creative Commons Attribution 3.0&lt;/a&gt; license&lt;/font&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/29331675-6136837241794873072?l=17slon.com%2Fblogs%2Fgabr%2Fblogger.html' 
alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/6136837241794873072/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://17slon.com/blogs/gabr/2009/11/gpstuff-119-gplists-143.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/6136837241794873072'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/29331675/posts/default/6136837241794873072'/><link rel='alternate' type='text/html' href='http://17slon.com/blogs/gabr/2009/11/gpstuff-119-gplists-143.html' title='GpStuff 1.19 &amp;amp; GpLists 1.43'/><author><name>gabr</name><uri>http://www.blogger.com/profile/06903558857617342477</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='12941540036853261067'/></author><thr:total xmlns:thr='http://purl.org/syndication/thread/1.0'>0</thr:total></entry></feed>