Friday, July 04, 2008

OmniThreadLibrary internals - OtlTask

image [Please note - new snapshot is available on the OTL home page.]

In next few posts, I'll try to present a rough overview of the OTL internals.

The most important unit is OtlTask. It hosts interfaces describing threaded task. Currently, it also provides thread descendant to run those tasks, but that one will most probably be moved to the OtlThread unit.

The most important interface in OtlTask is IOmniTaskControl. This task control interface is returned from the CreateTask function. It is used to communicate with the task and to control its behavior. Four CreateTask functions map directly to four constructors, which store execution parameters into internal field and initialize some events used for task control.

    constructor Create(worker: IOmniWorker; const taskName: string); overload;
constructor Create(worker: TOmniWorker; const taskName: string); overload;
constructor Create(executor: TOmniTaskMethod; const taskName: string); overload;
constructor Create(executor: TOmniTaskProcedure; const taskName: string); overload;

SetParameter* methods are used to set parameters that are passed to the threaded worker.

    function  SetParameter(const paramName: string; paramValue: TOmniValue): IOmniTaskControl; overload;
function SetParameter(paramValue: TOmniValue): IOmniTaskControl; overload;
function SetParameters(parameters: array of TOmniValue): IOmniTaskControl;

Next four methods are used to control internal event/message loop behavior. They are only used when working with IOmniWorker or TOmniWorker tasks. Alertable causes message wait to be alertable and MsgWait causes message wait to process messages (see TOmniTaskControl.DispatchMessages for more information). SetTimer causes timerMessage message to be sent to the worker in specified intervals. If timerMessage is -1, worker's Timer method is called instead. FreeOnTerminate is only available when worker is a descendant of the TOmniWorker class and causes the worker to be destroyed after the task is completed.

    function  Alertable: IOmniTaskControl;
function FreeOnTerminate: IOmniTaskControl;
function MsgWait(wakeMask: DWORD = QS_ALLEVENTS): IOmniTaskControl;
function SetTimer(interval_ms: cardinal; timerMessage: integer = -1): IOmniTaskControl;

SetMonitor and RemoveMonitor are used to attach external message watching monitor. Typically that would be the TOmniTaskEventDispatch component and those TOmniTaskControl messages would be called component's Monitor and Detach functions.

    function  RemoveMonitor: IOmniTaskControl;
function SetMonitor(hWindow: THandle): IOmniTaskControl;

To start the task, you must execute either Run (creates a new thread and starts executing the task) or Schedule (schedules task to be executed in a thread pool; not implemented yet).

    function  Run: IOmniTaskControl;
function Schedule(threadPool: IOmniThreadPool = nil {default pool}): IOmniTaskControl;

There are four other methods related to the execution control. Terminate stops the task and waits the specified amount of time for the task to complete gracefully. At the moment it returns False if task does not terminate in the allotted time. In the future, it will maybe kill the thread. I'm not sure yet.


TerminateWhen is not implemented yet. It was supposed to assign another synchronization object to the thread. That object would cause the task to stop. The idea was to use one event to stop multiple tasks at once, but I'm not sure if that is useful at all.


WaitFor waits for the task to finish execution and returns True if task has stopped. WaitForInit waits for IOmniWorker/TOmniWorker worker to execute its Initialize method and returns initialization status (return value of the Initialize method).

    function  Terminate(maxWait_ms: cardinal = INFINITE): boolean;
function TerminateWhen(event: THandle): IOmniTaskControl;
function WaitFor(maxWait_ms: cardinal): boolean;
function WaitForInit: boolean;

The Comm property gives the owner access to the communication subsystem. This will be covered in a separate post.


ExitCode and ExitMessage will return task's exit code and message. Not implemented yet.


Name returns task's name, as set in the constructor and UniqueID returns task's unique ID. This identifier is unique inside the application.


Options gives you direct access to execution options, usually set with Alertable, MsgWait, and FreeOnTerminate.

    property Comm: IOmniCommunicationEndpoint read GetComm;
property ExitCode: integer read GetExitCode;
property ExitMessage: string read GetExitMessage;
property Name: string read GetName;
property Options: TOmniTaskControlOptions read otcOptions write otcOptions;
property UniqueID: cardinal read GetUniqueID;

Execution


Run first makes parameters immutable, then creates thread-side task interface (IOmniTask), creates new thread, assigns that task to the thread and starts the thread.

function TOmniTaskControl.Run: IOmniTaskControl;
var
task: IOmniTask;
begin
otcParameters.Lock;
task := TOmniTask.Create(otcExecutor, otcTaskName, otcParameters, otcCommChannel,
otcUniqueID, otcTerminateEvent, otcTerminatedEvent, otcMonitorWindow);
otcThread := TOmniThread.Create(task);
otcThread.Resume;
Result := Self;
end; { TOmniTaskControl.Run }

Thread's Execute method passes control to the TOmniTaskControl's Execute method.

procedure TOmniThread.Execute;
begin
{$IFNDEF OTL_DontSetThreadName}
SetThreadName(otTask.Name);
{$ENDIF OTL_DontSetThreadName}
(otTask as IOmniTaskExecutor).Execute;
end; { TOmniThread.Execute }

Execute calls either procedure or method worker. When working with IOmniWorker/TOmniWorker worker, Execute calls TOmniTaskControl.DispatchMessages. In all cases, IOmniTask is passed as a parameter.

procedure TOmniTaskExecutor.Execute(task: IOmniTask);
begin
case ExecutorType of
etMethod:
Method(task);
etProcedure:
Proc(task);
else
raise Exception.Create('TOmniTaskExecutor.Execute: Executor is not set');
end;
end; { TOmniTaskExecutor.Execute }

IOmniTask


IOmniTask interface allows the worker to communicate with the owner.

  IOmniTask = interface
procedure SetExitStatus(exitCode: integer; const exitMessage: string);
procedure Terminate;
property Comm: IOmniCommunicationEndpoint read GetComm;
property Name: string read GetName;
property Param[idxParam: integer]: TOmniValue read GetParam;
property ParamByName[const paramName: string]: TOmniValue read GetParamByName;
property TerminateEvent: THandle read GetTerminateEvent;
property UniqueID: cardinal read GetUniqueID;
end; { IOmniTask }

SetExitStatus sets exit code and message, which are then mapped to task control's ExitCode and ExitMessage parameters. Not implemented yet.


Terminate sets task's TerminateEvent. Thus the task can terminate itself. TerminateEvent is also set when code calls task control interface's Terminate method.


Comm returns task's communication endpoint. More details will be provided in the next post.


Name returns the task name, as set in the CreateTask call. UniqueID returns task's unique ID.


Param and ParamByName can be used to access parameters passed to the task.


More information on task writing is available in examples (and already published posts on the OTL topic).

Labels: , , , , ,

Thursday, July 03, 2008

OmniThreadLibrary Example #4a: Bidirectional communication, with objects

imageThis is a variation of Example #4 that uses TOmniWorker object instead of IOmniWorker interface. Code is stored in folder tests\6_TwoWayHello_with_object_worker.

I strongly suggest that you look at Example #4 first, as this post only lists the differences between those two examples.

Firstly, we don't need TAsyncHello.Initialize. We'll set initial message text in the constructor. [Admittedly, this is something that can also be done when using IOmniWorker approach.] Message handlers are identical to the Example #4.

  TAsyncHello = class(TOmniWorker)
strict private
aiMessage: string;
public
constructor Create(const initialMessage: string);
procedure OMChangeMessage(var msg: TOmniMessage); message MSG_CHANGE_MESSAGE;
procedure OMSendMessage(var msg: TOmniMessage); message MSG_SEND_MESSAGE;
end;
constructor TAsyncHello.Create(const initialMessage: string);
begin
aiMessage := initialMessage;
end;

Secondly, we have to tweak the task creation.

procedure TfrmTestOTL.actStartHelloExecute(Sender: TObject);
begin
FHelloTask :=
OmniTaskEventDispatch1.Monitor(CreateTask(TAsyncHello.Create('Hello'), 'Hello')).
SetIdle(1000, MSG_SEND_MESSAGE).
SetParameter('Delay', 1000).
FreeOnTerminate.
Run;
end;

TAsyncHello object can now be passed directly to the CreateTask method. There is no need to set the Message parameter. Also new is the FreeOnTerminate call which tells the task manager to destroy the worker object when task is terminated. Alternatively, you can store away the worker object and destroy it "manually" after the FHelloTask is terminated.


The rest of the code does not differ from the Example #4.

Labels: , , , , ,

OmniThreadLibrary Example #4: Bidirectional communication, the OTL way

[Please note - today's snapshot is available on the OTL home page.]

image Today's topic is writing message-driven tasks without having to write an event/message processing loop. Of course, the loop is still there, but it is hidden inside the OtlTask unit.

The demo is stored in folder tests\5_TwoWayHello_without_loop. OTL package is required to open the main form.

This time the worker is not a method or procedure, but a whole object. It must either implement IOmniWorker interface or be a descendant of the TOmniWorker class (which implements IOmniWorker).

The worker class will process two messages. When MSG_CHANGE_MESSAGE is received, worker will change internal message text. When MSG_SEND_MESSAGE is received, worker will send this message text to the owner (just like in yesterday's example). We'll also override the Initialize method to set initial message text. Initialize and it's counterpart Cleanup are executed in the context of the background thread executing the task so they can be used to allocate/deallocate thread-sensitive objects.

const
MSG_CHANGE_MESSAGE = 1;
MSG_SEND_MESSAGE = 2;

type
TAsyncHello = class(TOmniWorker)
strict private
aiMessage: string;
public
function Initialize: boolean; override;
procedure OMChangeMessage(var msg: TOmniMessage); message MSG_CHANGE_MESSAGE;
procedure OMSendMessage(var msg: TOmniMessage); message MSG_SEND_MESSAGE;
end;

We can write those two message handlers identical to the way we write Windows message handlers. Message IDs are arbitrary, I usually start with 1.

{ TAsyncHello }

function TAsyncHello.Initialize: boolean;
begin
aiMessage := Task.ParamByName['Message'];
Result := true;
end;

procedure TAsyncHello.OMChangeMessage(var msg: TOmniMessage);
begin
aiMessage := msg.MsgData;
end;

procedure TAsyncHello.OMSendMessage(var msg: TOmniMessage);
begin
Task.Comm.Send(0, aiMessage);
end;

Worker code (above) is trivial, task creation code only slightly less.

procedure TfrmTestOTL.actStartHelloExecute(Sender: TObject);
var
worker: IOmniWorker;
begin
worker := TAsyncHello.Create;
FHelloTask :=
OmniTaskEventDispatch1.Monitor(CreateTask(worker, 'Hello')).
SetIdle(1000, MSG_SEND_MESSAGE).
SetParameter('Delay', 1000).
SetParameter('Message', 'Hello').
Run;
end;

First we create  the worker object and store it in the interface variable. CreateTasks then takes this worker interface instead of worker method/procedure. SetParameters are same as in the previous example (except that the parameter order has been changed to name, value in today's snapshot). The only mystery here is the SetIdle call. It tells the internal event/message loop to send the MSG_SEND_MESSAGE every 1000 milliseconds. [The second parameter - message ID - is optional. When left out, worker's Idle method is called instead.]


A word of warning: SetIdle will be most probably renamed to SetTimer in the next release.


What about MSG_CHANGE_MESSAGE? It is generated in completely the same way as in the previous example.

procedure TfrmTestOTL.actChangeMessageExecute(Sender: TObject);
begin
FHelloTask.Comm.Send(MSG_CHANGE_MESSAGE, 'Random ' + IntToStr(Random(1234)));
end;

Same goes for the task termination - just call FHelloTask.Terminate and set FHelloTask to nil. Worker object will be destroyed automatically.


So, what you say? Simple enough?

Labels: , , , , ,

Wednesday, July 02, 2008

OmniThreadLibrary Example #3: Bidirectional communication

image In the third installment of my introduction to the OmniThreadLibrary, I'll show how you can implement bidirectional communication with the threaded task. This time there will be more code to write so you should probably just open example in tests\4_TwoWayHello_with_package folder (or tests\2_TwoWayHello if you don't want to install the OTL package).

For your convenience, I've uploaded snapshot of the current repository state to the projects home address http://code.google.com/p/omnithreadlibrary/.

For the readers that are following my exposé in the browser, here is the screenshot of  today's example in action.

image

There are three buttons - first starts the threaded task, second instructs the task to change the message and thirds stops the task. There are also a TOmniTaskEventDispatch and a TActionList components on a form.

The actStart action is connected to the first button and starts the threaded task.

procedure TfrmTestOTL.actStartHelloExecute(Sender: TObject);
begin
FHelloTask :=
OmniTaskEventDispatch1.Monitor(CreateTask(RunHello, 'Hello')).
SetParameter(1000, 'Delay').
SetParameter('Hello', 'Message').
Run;
end;

The task is created in an already familiar manner (CreateTask output passed to the Monitor method). Then, two named parameters are set. First parameter to the SetParameter method contains a Variant value and second (optional) contains parameter's name.


At the end, Run is called to start the task and task control interface is stored in a field.

strict private
FHelloTask: IOmniTaskControl;

Stopping the task is trivial.

procedure TfrmTestOTL.actStopHelloExecute(Sender: TObject);
begin
FHelloTask.Terminate;
FHelloTask := nil;
end;

When you click the second button (Change message), standard communication channel (introduced in previous installment) is used to send a message containing new text.

procedure TfrmTestOTL.actChangeMessageExecute(Sender: TObject);
begin
FHelloTask.Comm.Send(MSG_CHANGE_MESSAGE, 'Random ' + IntToStr(Random(1234)));
end;

The OmniTaskEventDispatch1's event handlers are identical to the Hello, world! example.


The part that is new today is the worker code which must receive and process MSG_CHANGE_MESSAGE messages. Today, this code is a normal procedure and not a method (just to show that this option is available, too).

procedure RunHello(task: IOmniTask);
var
msg : string;
msgData: TOmniValue;
msgID : word;
begin
msg := task.ParamByName['Message'];
repeat
case DSiWaitForTwoObjects(task.TerminateEvent, task.Comm.NewMessageEvent,
false, task.ParamByName['Delay'])
of
WAIT_OBJECT_1:
begin
task.Comm.Receive(msgID, msgData);
if msgID = MSG_CHANGE_MESSAGE then
msg := msgData;
end;
WAIT_TIMEOUT:
task.Comm.Send(0, msg);
else
break; //repeat
end;
until false;
end;

RunHello first retrieves the value of the Message parameter. Then it enters an infinite loop (repeat .. until false) in which it waits for either task.TerminateEvent or task.Comm.NewMessageEvent. DSiWin32 is used for brevity only. Normal WaitForMultipleObjects API call could be used insted.


Task.TerminateEvent is signaled in the Terminate method (when user clicks the Stop "Hello" button). In this case, code simply breaks out of the repeat..until loop.


Task.Comm.NewMessageEvent is signaled when a message is waiting in the communication queue. In this case (WAIT_OBJECT_1), message is received and processed.


If nothing was signaled for <value of the 'Delay' parameter> milliseconds, the WAIT_TIMEOUT path is taken and message msg (whatever the current value may be) is sent to the owner where it is displayed.


If you did any threaded pr0gramming then you certainly recognized this loop - this is fairly standard approach when writing multithreaded code. Nothing special here except the messaging subsystem. Still, OTL doesn't stop here. As you'll see in the next example, it is possible to rewrite this code in much simpler way.

Labels: , , , , ,

OmniThreadLibrary Example #2: Hello, world!

imageLast time I've shown how to write a trivial threaded Beep in two lines.  Today I'll do something more complicated - a threaded Hello, world program that communicates with its owner. Communication will be unidirectional; I'll leave bidirectional communication for the next example.

First, compile and install the OmniThreadLibrary_D2007 package (from the packages\ subfolder). This will add one component to your palette - TOmniTaskEventDispatch. This component simplifies communication between a threaded task and primary thread.

As before, start with an empty form and drop a button on it. You'll also need a listbox to show "Hello, world!" messages. You'll also need to drom one TOmniTaskEventDispatch component on the form.

Write an OnClick handler for the button.

procedure TfrmTestOTL.btnHelloClick(Sender: TObject);
begin
btnHello.Enabled := false;
OmniTaskEventDispatch1.Monitor(CreateTask(RunHelloWorld, 'HelloWorld')).Run;
end;

The code is quite similar to the one you had to write in the first example, with one important modification - it passes the task interface, created in CreateTask, to the TOmniTaskEventDispatch component. That will tell  the component to start monitoring messages from the threaded task.


As you can see, TOmniTaskEventDispatch.Monitor takes as a parameter the interface returned from CreateTask (I'll give more attention to this interface in latter posts) and it also returns that same interface as its result so that you can immediately call the Run method.


The code above also disables the button. It will be re-enabled when the threaded task exits.


Let's write the code that does some work now. Method RunHelloWorld will be executed in the background thread.

procedure TfrmTestOTL.RunHelloWorld(task: IOmniTask);
begin
task.Comm.Send(0, 'Hello, world!');
end;

The OTL automatically sets up two-way channel between task control interface (the one that is returned from the CreateTask) and the threaded task (the IOmniTask interface passed to the worker method). Both expose a property named Comm that provides this communication. At this time, the communication channel is very simplistic - you can only send (word, Variant) pairs (message ID and data). In this case, I'm setting message ID to 0 and message data to 'Hello, world!'.


That is everything that has to be done on the threaded side. On the GUI side, the OmniTaskEventDispatch1 component is already set to monitor messages from the threaded task. We only have to write an OnTaskMessage handler.

procedure TfrmTestOTL.OmniTaskEventDispatch1TaskMessage(task: IOmniTaskControl);
var
msgID : word;
msgData: TOmniValue;
begin
task.Comm.Receive(msgID, msgData);
lbLog.ItemIndex := lbLog.Items.Add(Format('[%d/%s] %d|%s', [task.UniqueID, task.Name, msgID, msgData]));
end;

The OnTaskMessage handler is called each time the message is available. The code reads this message and shows it in the listbox. As you can also see, each task is assigned an unique ID.


To enable the button when task is done, you have to write an OnTaskTerminated event handler.

procedure TfrmTestOTL.OmniTaskEventDispatch1TaskTerminated(task: IOmniTaskControl);
begin
lbLog.ItemIndex := lbLog.Items.Add(Format('[%d/%s] Terminated', [task.UniqueID, task.Name]));
btnHello.Enabled := true;
end;

The code doesn't check which task has completed as we can only have one task running.


Run it and press the button few times. You should see that the threaded task gets allocated new unique ID every time it is run.


image


This example is available in the OTL repository in folder tests\3_HelloWorld_with_package.


If you don't want to install the package, you can create TOmniTaskEventDispatch component on the fly. An example using this approach is available in the repository in folder tests\1_HelloWorld.

Labels: , , , , ,

Why is software third time lucky?

In the beginning, there is a Problem.

Customer knows that the Problem exists, but can't tell exactly what the Problem is.

So the Customer goes to to a Programmer and asks him [or her, of course, but let me keep this simple]: "Can you build me a program that does That?"

[Here lies the original sin. Customer never asks: "Can you help me solve the Problem?" Because of that, Programmer is writing a program to do That instead of program that does This.]

Programmer says: "I can," because we programmers, by our nature, want to believe that we can code anything. So the Programmer goes to work and he works and works and although he has no idea what he is working on and how it should be built, he makes the Version One. It is merely a research project, a tool that allows the Programmer to understand the problem (not the Problem, just the problem of coding a program that does That), but the pressure of the modern world forces him to deliver it to the Customer.

Of course, Version One is unusable. After all, it was merely a sandbox where the Programmer tried to understand the problem. So the Customer pretends that he is using it and from time to time orders a small change, just to make the Programmer think that his software is being used.

After some time, Customer starts to think that he should really get something useful for his money, not just the unusable Version One and thus commissions the Programmer to write a new, improved version. Mind you, the Customer still hasn't a faintest clues what the Problem is, and so doesn't the Programmer. Even more, Programmer still has no idea that he has to solve the Problem. His task is to write a program that does That.

Still, something has improved since the day 1 - Programmer's understanding on how to write a program that does That. So he goes and codes much better program, which is built on his previous experiences. And thus the Version Two is born.

Version Two is much better version. It actually works. It doesn't look like a five year old has put it together in one afternoon and it doesn't crash every five minutes. And so, the Customer starts using it. And after some time, he sadly recognizes that the Programmer has indeed written a wonderful software that does That, but that this, alas, doesn't help him solve the Problem at all.

But now the Customer can finally tell the Programmer what the Version Two should do differently. And the Programmer listens and finally says in awe: "You mean that it should do This instead of That?!? Why didn't you say that in first place?"

And thus, the Version Three is born. It is written well. It solves the Problem. The Customer is happy. And the Programmer lives on to fight yet another battle.

This I believe.

Labels: ,

Monday, June 30, 2008

OmniThreadLibrary Example #1: Beep, world!

image In my previous post, I promised to publish some examples of OTL usage. Here is the first one - a simplification of the Hello, World program. This example doesn't write anything to the screen, it just makes a noise.

Start by creating a new Delphi VCL application. Drop a button on the form and write OnClick handler.

procedure TfrmTestOTL.btnBeepClick(Sender: TObject);
begin
CreateTask(Beep, 'Beep').Run;
end;

This will create a threaded task with name 'Beep' and main method Beep. Then the code will create a new thread and start executing this task in the context of the newly created thread. That's all, folks!


To make the example fully functional, add OtlTask unit to the uses list and write the Beep method.

procedure TfrmTestOTL.Beep(task: IOmniTask);
begin
MessageBeep(MB_ICONEXCLAMATION);
end;

Run the program, click the button. It will beep!


If you don't believe that the code is executed in a secondary thread, place a breakpoint on the MessageBeep call and check the Thread Status window. It will clearly indicate that the breakpoint was triggered from a secondary thread.


image


This example is included in the OTL repository in folder tests/0_Beep.

Labels: , , , , ,

OmniThreadLibrary - threading library for Delphi [WIP]

imageIn my $$$ line of programming, everything revolves around threads. Servers, multithreaded processing on clients, real time hardware control ... you name it. In all the years I've learned many dos and don'ts of multithreaded programming. I've also written few frameworks to simplify this job and most of them are already retired. Sometimes a potentially good approach turns out not to be so good at all ...

In the last year I've become more and more unhappy with my current framework and decided that it's time for something better, something simpler. Something built from scratch. This time I didn't start with the implementation but tried to visualize how I'd like to use my framework in real-life applications. After all, I had all those test cases lying around in form of my old applications using old framework. And so, the OmniThreadLibrary was born.

Currently OTL is in a very fluent alpha state. I'm already using it in my projects but some more important interfaces are changing from day to day. Still, you're invited to try it and to comment on its usefulness. If you can give me an example of real-world problem that can't be implemented using OTL, all the better. Maybe I'll be able to enhance the OTL to suite your purpose more.

OTL is living on the Google Code hosting and is BSD licensed. Current implementation only supports Delphi 2007 (although I'd been told that only small modifications are required to make it work with Delphi 2006) and unless I'm given a very convicting business case support for older Delphis won't be included. Let's look into future, not into past.

Some examples are included in the repository, but not all parts of the current infrastructure are explored. I'll post more on OTL and describe examples and OTL's inner workings on this blog in the next few days.

Labels: , , , , ,

Sunday, May 11, 2008

Delphi compatibility restored

A recent update to my GpLists unit broke compatibility with all Delphi versions that don't have enumerator support. Sorry :(

This has now been fixed.

Labels: , , ,

Monday, April 21, 2008

Enumerating XML nodes

I just submitted an update to OmniXmlUtils.pas - a helper library for the Delphi implementation of the XML DOM model, the OmniXML. It allows you to walk over child nodes with a simple enumerator (but you already saw that coming, huh?).

In OmniXML (and in MS XML on which the OmniXML interface is based), you access child nodes via IXMLNodeList interface. Actually, when you access any subset of nodes (for example, when you call SelectNodes and pass it an XPath expression), you are using IXMLNodeList. More and more I was using it, more I hated its stupid interface.

  IXMLCustomList = interface
function GetLength: Integer;
function GetItem(const Index: Integer): IXMLNode;
property Item[const Index: Integer]: IXMLNode read GetItem;
property Length: Integer read GetLength;
end;

IXMLNodeList = interface(IXMLCustomList)
procedure Reset;
function NextNode: IXMLNode;
end;
[I removed declarations of all functions that are not used when walking over the list.]


Basically, I have two options with the current design. I can use a standard for loop using Length and Item[] or I can use NextNode and while loop.


Or, of course, I can write an enumerator - and that's what I did. XMLEnumNodes takes an IXMLNodeList and wraps it in an enumerator. Even better - it takes an xml node or document and an XPath expression and runs SelectNodes automatically for me so I can write code like this:

for nodePassword in XMLEnumNodes(xmlConfig, '//*/PasswordHash') do
SetTextChild(nodePassword, '(removed)');

Now that's what I call a nice code!


If you can't wait for tomorrow update to the OmniXML daily snapshot, you can download OmniXMLUtils 1.25 here.

Labels: , , ,

Thursday, March 27, 2008

GpLists emergency update

TGpInt64List.Move was totally broken. Great thanks go to 'Intra' for noticing it.

Fixed now.

Labels: , , ,

Friday, March 21, 2008

Walking the key-value container

The recent discussion in comments to my latest articles (Fun with enumerators - Walking the (integer) list, On bad examples and smelly code) caused a shift in my perspective. I was always treating my TGpIntegerObjectList and TGpInt64ObjectList as lists with some baggage attached, but in practice I'm almost never using them that way. Most of the time I treat them as a background storage for key-value pairs.

What's the difference? Most of the time, I don't care about item indices. When I use my lists as containers, I never need to know where in the list some (Key, Value) pair is stored. OK, almost never. When I'm deleting from the list, I sometimes use the index, just for the performance purposes. And when I access the Value part, I have to find the index with IndexOf and then use it to reference the Objects property. There are probably other cases too - but that's something that you just have to take into account if you want to use a list as a storage media.

From time to time I'm extending lists in my GpLists unit with small wrappers that help me not to use list indices in some specific situation. Today, I used the new Walk enumerator in some code and asked myself: "Why does it have to return list index? Why couldn't it return a (Key, Value) pair?"

Good question. Why couldn't it? It turns out that it could.

A little enumerator that could

Let's set up some assumptions first. Assume that I have this pointless list.

il := TGpIntegerObjectList.Create;
il.AddObject(1, TGpString.Create('one'));
il.AddObject(2, TGpString.Create('two'));
il.AddObject(3, TGpString.Create('three'));
Further assume that I want to walk over it and display both number and text for each item. I can do this with a standard loop.
for idx := 0 to il.Count - 1 do
Log('%d: %s', [il[idx], TGpString(il.Objects[idx]).Value]);
Or with my index-returning walker.
for idx in il.Walk do
Log('%d: %s', [il[idx], TGpString(il.Objects[idx]).Value]);
But what I'd really like to do is.
for kv in il.WalkKV do
Log('%d: %s', [kv.Key, TGpString(kv.Value).Value]);
Or even better.
for kv in il.WalkKV do
Log('%d: %s', [kv.Key, kv.StrValue]);

In other words, I want to return not a single item, but a pair of items from the enumerator. Of course, Delphi expressions can only return a single result and not a tuple so we have to provide a wrapper for enumerated (Key, Value) pairs. We have to pack them in a record, class or interface.


Getting all self-destructive


My first idea was to return an interface to a key-value object from the enumerator.

  IGpKeyValue = interface
function GetKey: int64;
function GetValue: TObject;
property Key: int64 read GetKey;
property Value: TObject read GetValue;
end; { IGpKeyValue }

TGpKeyValue = class(TInterfacedObject, IGpKeyValue)
private
kvKey: int64;
kvValue: TObject;
protected
function GetKey: int64;
function GetValue: TObject;
public
constructor Create(key: int64; value: TObject);
property Key: int64 read GetKey;
property Value: TObject read GetValue;
end; { TGpKeyValue }

TGpIntegerObjectListWalkKVEnumerator = class
//...
function GetCurrent: IGpKeyValue;
property Current: IGpKeyValue read GetCurrent;
end; { TGpIntegerObjectListWalkKVEnumerator }

function TGpIntegerObjectListWalkKVEnumerator.GetCurrent: IGpKeyValue;
var
idx: integer;
begin
idx := wkeListEnumerator.GetCurrent;
Result := TGpKeyValue.Create(wkeListEnumerator.List[idx],
TGpIntegerObjectList(wkeListEnumerator.List).Objects[idx]);
end; { TGpIntegerObjectListWalkKVEnumerator.GetCurrent }

That surely works fine, but guess what - it's incredibly slow. I wouldn't expect anything else - after all an object is allocated for every enumerated value, plus all that complications with interface reference counting...


I did some testing, of course. Thousand iterations over a list with 10.000 elements. Results are quite interesting. 5 milliseconds for a standard for loop. 50 milliseconds for my Walk enumerator. 5 seconds for interface-based WalkKV. Ouch! Let's return to the drawing board...


One allocation is enough


My next idea was to return not an interface, but an object. When you return an object, you actually return a pointer to the object data, which is quite fast. It would not help much if I would recreate this object every time the GetCurrent is called, but luckily there is no real reason to do that. I can create the object when enumerator is created and destroy it when enumerator is destroyed.

  TGpKeyValue = class
private
kvKey : int64;
kvValue: TObject;
public
property Key: int64 read kvKey write kvKey;
property Value: TObject read kvValue write kvValue;
end; { TGpKeyValue }

TGpIntegerObjectListWalkKVEnumerator = class
private
wkeCurrentKV: TGpKeyValue;
wkeListEnumerator: TGpIntegerListWalkEnumerator;
public
constructor Create(aList: TGpIntegerObjectList; idxFrom, idxTo: integer);
destructor Destroy; override;
//...
function GetCurrent: TGpKeyValue;
property Current: TGpKeyValue read GetCurrent;
end; { TGpIntegerObjectListWalkKVEnumerator }

constructor TGpIntegerObjectListWalkKVEnumerator.Create(aList: TGpIntegerObjectList;
idxFrom, idxTo: integer);
begin
inherited Create;
wkeListEnumerator := TGpIntegerListWalkEnumerator.Create(aList, idxFrom, idxTo);
wkeCurrentKV := TGpKeyValue.Create;
end; { TGpIntegerObjectListWalkKVEnumerator.Create }

destructor TGpIntegerObjectListWalkKVEnumerator.Destroy;
begin
FreeAndNil(wkeCurrentKV);
FreeAndNil(wkeListEnumerator);
inherited;
end; { TGpIntegerObjectListWalkKVEnumerator.Destroy }

function TGpIntegerObjectListWalkKVEnumerator.GetCurrent: TGpKeyValue;
var
idx: integer;
begin
idx := wkeListEnumerator.GetCurrent;
wkeCurrentKV.Key := wkeListEnumerator.List[idx];
wkeCurrentKV.Value := TGpIntegerObjectList(wkeListEnumerator.List).Objects[idx];
Result := wkeCurrentKV;
end; { TGpIntegerObjectListWalkKVEnumerator.GetCurrent }

BTW, you can see another trick in this implementation - enumeration by delegation. I'm reusing my Walk enumerator to do the actual walking.


That works much faster than the interface-based version - 300 ms for my test case. It's still six times slower than the Walk enumerator, though, and it is not really obvious why the difference is so big. Leave that be, for a moment.


The third approach would be to use a record to store the current (Key, Value) pair. Then there is no allocation/deallocation at all, but resulting code is not faster. Record-based enumerator needs about 500 ms to run the test case.


This slowdown occurs because record is not returned as a pointer, but as a full copy. In the class-based scenario, GetCurrent returns a pointer to the TGpKeyValue object and that pointer is passed in a register. In the record version, GetCurrent returns not a pointer to the record, but the record itself - it copies full record to the stack so the caller can use this copy - and that is waaaay slower.


The speed difference


Let's return to that major speed difference between Walk and WalkKV. I looked at the code, but couldn't find any good reason. Then I turned to the CPU view and it was evident. The problem lied not in the enumerator, but in my poorly designed benchmarking code :(


You see, I was timing multiple repetitions of these three loops:

for idx := 1 to 10000 do 
;

for idx in il.Walk do
;

for kv in il.WalkKV do
;

Three empty loops, so I'm timing just the enumeration, yes? No!


First loop just runs from 1 to 10000. Trivial job and compiler will generate great code.


Second loop does the same, but with more overhead.


Third loop does much more than that. It also accesses il[] and il.Objects[] (inside the GetCurrent).


In reality, code inside the for statement in the first two cases would have to access il[] and il.Objects[] by itself. The code inside the third for statement has no need for that - it gets data neatly prepared in kv.Key and kv.Value.


I changed the loops to:

for idx := 0 to il.Count - 1 do begin
il[idx];
obj := il.Objects[idx];
end;

for idx in il.Walk do begin
il[idx];
obj := il.Objects[idx];
end;

for kv in il.WalkKV do begin
kv.Key;
obj := kv.Value;
end;

I'm just accessing and throwing the Items[]/Key information away and then I copy the Objects[]/Value information into the local variable. All loops are now running comparable jobs.


Results are now significantly different. Simple for loop needs 300 ms to run the test, Walk version needs 310 ms (really small difference) and WalkKV version needs 470 ms. It is still slower, but by less than the factor of two. If there would be a real code inside the for loop, the difference would be unnoticeable.


Morale? You should benchmark, but you should also check that you're benchmarking the right thing!


Syntactic sugar


The generic version of WalkKV only supports this kind of code:

for kv in il.WalkKV do
Log('%d: %s', [kv.Key, TGpString(kv.Value).Value]);

But there's a really neat trick to change this into

for kv in il.WalkKV do
Log('%d: %s', [kv.Key, kv.StrValue]);

without modifying the WalkKV itself. Can you guess it? Class helpers, of course.


You just have to declare a simple helper in the WalkKV consumer (no need to change the WalkKV itself) and you can use the StrValue instead of TGpString(kv.Value).Value.

  TGpKeyStrValue = class helper for TGpKeyValue
public
function StrValue: string; inline;
end; { TGpKeyStrValue }

function TGpKeyStrValue.StrValue: string;
begin
Result := TGpString(Self.Value).Value;
end;

Conclusion: class helpers are great. Key-value walker is useful. Enumerator-induced performance loss is negligible. Enumerators are fun.


---


I've tagged this and all previous enumerator-related articles with the enumerators label so you can now access them all at once via this simple link.

Labels: , , , , ,

Sunday, March 16, 2008

On bad examples and smelly code

I've received lots of criticism on my latest article on enumerators. For the convenience of my readers, here are some excerpts.

Removal loops should be iterated backwards to prevent index shuffling.

... while I can certainly see the usefulness in your Slice- and Walk-enumerators ... I don't think they're the best solution to the two examples you quoted.

Simplicity does not justify an example that fails to work correctly. A bad example is still a bad example, no matter how simple.

I'd also like to echo the other comments w.r.t your initial example. Reversing the direction of the iteration is the simplest wasy to simplify the exemplar task, not invoking enumerators and the such like.

They can be summarized in one sentence: "Your examples suck." The main rationale for that reasoning seems to be the accepted way to delete stuff from a TList/TObjectList/TAnyList - start at the end and proceed backwards.

Well, I have two things to say. [I apologize for responding in a new post, but I feel that this is an interesting topic by itself.]

1. Yes, my examples are simplified. Terribly so. Maybe they are even simplistic. At least some part of my readers seems to think so. But that's just the way I like them.

I'm not trying to enforce my ideas on my readers. I'm only displaying them to the public. That's why I tend to minimize all my examples. I'll show you the code, I'll show you the usage, but you should think about the solution and how it applies to your problems. Maybe you'll see that it can be useful and you'll adopt it. Maybe not. Maybe you'll remember it a year from now and suddenly see the light. Maybe you'll use it for a month or two and then decide that I'm a twit and that your previous ways were better. [In this case, please refer to the last paragraph in this post.] I don't really care. I'm giving my ideas out, I'm glad of every feedback and I'm happy if somebody else is using my code. I've received a lot from the Delphi community in past ten years and I like to give back as much as I can.

2. I don't like the 'traverse the list backwards and all will be well' approach to list deletion. I have a really strong aversion to it. It looks fine, it works fine but it smells really bad.

I was thinking a lot about why I dislike it so much and I think that the main reason is that it relies too much on the list internals. Admittedly, this is a very well documented behavior, which will never change in the future. Let's try again - it relies too much on the internals of the underlying structure. If at some point in the future you change the structure, this assumption may fail. OK, it is hard to imagine an underlying structure that would behave almost the same as the TList but would behave differently on Delete, but what if you just forgot to adapt this part of the code to the new structure? Unlikely, I agree.

People will complain that I can't give a strong reason for my fears and they will be correct. I don't have any good cause against this kind of list cleanup.

You see, if I learned something in my programming career, it's that some code should never be used. When you program, some things work out and some not. Some code works fine from the beginning and some causes problem after a problem and bug report after a bug report. A good programmer recognizes that and stops using the code/fragments/patterns that are likely to cause problems in the future. Somehow I have classified the 'downto deletion' in the same category. I don't remember why. That's why I'm not saying it is bad, only that it smells bad.

The other reason is the code readability. If find the code that explicitly states its intention more readable and more maintainable. Your point may, of course, differ.

---

May my ramblings not stop you from commenting. I'm happy when I receive comments that point to my errors, however justifiable they may be. Even if I don't agree with them, they make me rethink my strategies so at the end I'm even more sure in my approach.

Flame on!

Labels: ,

Friday, March 14, 2008

Fun with enumerators - Walking the (integer) list

Do you hate such code?

    idx := 0;
while idx < il.Count do
if ShouldBeRemoved(il[idx]) then
il.Delete(idx)
else
Inc(idx);

If you do, read on!


TGpIntegerList [For the purpose of this blog entry, TGpIntegerList token is redefined to implicitly expands to "TGpIntegerList, TGpInt64List, and all classes derived from them."]  included enumeration support since Delphi 2005 times. You could, for example, write

  il := TGpIntegerList.Create([1, 3, 5, 7]);
try
for i in il do
Log('%d', [i]);
finally FreeAndNil(il); end;

and get numbers 1, 3, 5, and 7 in the log.


Slicing ...


Recently, I've extended TGpIntegerList with two additional enumerators, both (of course) springing from practice. I've noticed that I still write lots of 'old school' for loops because I want to do something special with the first element. For example, take this simple code fragment that finds a maximum element in the list.

    max := il[0];
for idx := 1 to il.Count - 1 do
if il[idx] > max then
max := il[idx];

To do such thing with for-each, you have to introduce a boolean flag.

    gotMax := false;
for i in il do
if not gotMax then begin
max := i;
gotMax := true;
end
else if i > max then
max := i;

Ugly.


Much nicer implementation can be written with the new Slice() enumerator.

    max := il[0];
for i in il.Slice(1) do
if i > max then
max := i;

Slice takes two parameters representing starting and ending index in the list and returns all values between them (indices are inclusive). Ending index can be omitted in which case it will default to Count-1 (i.e. to the end of the list).


Slice enumerator is implemented in a traditional manner - with the enumerator factory. For more details see my older articles and GpLists source code.


... And Walking


The second and much more interesting enumerator solves the problem shown in the introduction to this article. I noticed that I write lots of code that iterates over some list and removes some entries while keeping others intact. Typical scenario: removing timed-out clients from the server's list.


Something like that is impossible to do with the TGpIntegerList default enumerator. Firstly because this enumerator returns list elements and not list indices (and we typically have to access the object stored in the .Objects[] part of the list during the process) and secondly because the enumerator does not work correctly if enumerated list is modified behind its back. The second issue also prevents the use of standard for loop.


To solve both problems at once, I've introduced the Walk enumerator. It returns list indices instead of list elements and functions correctly if Add, Insert or Delete are used on the list while enumerator is active. (The second part is implemented by using TGpIntegerList internal notification mechanism, introduced just for this purpose.)


Now I can write:

    for idx in il.Walk do
if ShouldBeRemoved(il[idx]) then
il.Delete(idx);

Implementation? Check the source, Luke. It's not complicated.


Now if only I could add such enumerator to the TObjectList ...

Labels: , , , ,

Gp* update

GpHugeFile 5.05

  • Delphi 2007 changed the implementation of CreateFmtHelp so that it clears the Win32 'last error'. Therefore, it was impossible to detect windows error in detection handler when EGpHugeFile help context was hcHFWindowsError. To circumvent the problem, all EGpHugeFile help contexts were changed to a negative value. HcHFWindowsError constant was removed. All Win32 API errors are passed in the HelpContext unchanged. IOW, if HelpContext > 0, it contains an Win32 API error code, otherwise it contains one of hcHF constants.
  • Added method TGpHugeFile.GetTime and property TGpHugeFile.FileTime.

GpStreamS 1.22

  • Added AppendToFile helper functions (two overloads).
  • Added ReadTag and WriteTag support for int64 and WideString data.
  • Added two overloaded SafeCreateFileStream versions returning exception message.
  • Added Append stream helper.
  • Added AutoDestroyWrappedStream property to the TGpStreamWindow class.
  • Added TGpBufferedStream class. At the moment, only reading is buffered while writing is implemented as a pass-through operation.
  • Made 'count' parameter to CopyStream optional, the same way as TStream.CopyFrom is implemented.
  • Check for < 0 position in TGpStreamWindow.Seek.
  • Fixed reading/writing of zero bytes in TGpStreamWindow.
  • Added bunch of 'inline' directives.

GpLists 1.35

  • Implemented TGpClassList, a TStringList-based list of classes. Class names must be unique. Useful when you need to implement a class registry to generate objects from class names.
  • Added Walk enumerator to the TGpIntegerList and TGpInt64List. This enumerator allows list modifications (Add, Insert, Delete) from the enumeration consumer. IOW, you can do this:
    for idx in list.Walk do
    if SomeCondition(list[idx]) then
    list.Delete(idx);

  • Modified TGpCountedInt64List to store int64 counters.
  • Added property ItemCounter[] to TGpCountedIntegerList and TGpCountedInt64List.
  • Added Slice(from, to) enumerator to TGpIntegerList and TGpInt64List.
  • Add TGpObjectRingBuffer and TGpDoublyLinkedList enumerators. Both lock access to the list during the enumeration process if multithreaded mode is enabled.
  • Added method Contains to TGpIntegerList, TGpInt64List, TGpCountedStringList, TGpTMethodList.
  • When TGpIntegerObjectList or TGpInt64ObjectList was Sorted and with Duplicates set to dupIgnore, calling AddObject with item that was already in the list caused internal exception.
  • Enumerators changed to records with GetCurrent inlined, as suggested in More fun with Enumerators.

GpTimezone 1.22



  • Implemented DateLT, DateLE, DateGT, DateGE.

DSiWin32 1.36a



  • Added procedures DSiCenterRectInRect and DSiMakeRectFullyVisibleOnRect.
  • Added DSiTerminateProcessById procedure.
  • Added three performance counter helpers DSiPerfCounterToMS, DSiPerfCounterToUS, and DSiQueryPerfCounterAsUS.
  • Added function DSiTimeGetTime64.
  • Added function DSiGetProcessFileName.
  • Added function DSiEditShortcut.
  • Added function DSiInitFontToSystemDefault.
  • Added many SHGetSpecialFolderLocation folder constants.
  • Added ShGetSpecialFolderLocation flags CSIDL_FLAG_DONT_UNEXPAND and CSIDL_FLAG_DONT_VERIFY.
  • Added dynamically loaded API forwarder DSiSetSuspendState.
  • Added dynamically loaded API forwarders DSiEnumProcessModules, DSiGetModuleFileNameEx, and DSiGetProcessImageFileName.
  • Changed DSiIsAdmin to use big enough buffer for token data.
  • Changed DSiIsAdmin to ignore SE_GROUP_ENABLED attribute because function was sometimes incorrectly returning False.
  • Added parameter 'parameters' to DSiCreateShortcut and DSiGetShortcutInfo.
  • More stringent checking in DSiGetProcessWindow.

Labels: , , , ,

Monday, March 03, 2008

Debugging With Lazy Breakpoints

Sometimes you're hunting down a problem that occurs very rarely. When it does, an exception is triggered. If you're debugging, Delphi should pop up (it does) and allow you to look at the stack and inspect local variables (it doesn't - happens a lot to me). What can you do?

You could set a conditional breakpoint just before the problem occurs and test for the problem causing condition here. That would work on a short run, but if you're hunting the problem for many weeks, the breakpoint may move around or even get disabled. Delphi's breakpoint storage mechanism is somewhat unstable if you edit the code around the breakpoint a lot (and even more if you edit the code on multiple computers).

More stable solution is to do everything in code, from testing if debugger is running to pausing the program and breaking into the debugger. Following fragment illustrates the approach:

if DebugHook <> 0 then // test if debugger is running
if problem = true then //test the condition
asm int 3 end; //break into debugger

Not very obvious and full of arcane knowledge, but it works.

Labels: , ,

Saturday, March 01, 2008

Paint It Black (And Red)

Just in case you're not following Julian's blog, I'd like to bring your attention to his new series on red-black trees. Red-black trees are great data structure, but are notoriously hard to code correctly as the algorithm is full of weird edge cases.

I've always been a follower of his 'algorithms and data structures' articles in The Delphi Magazine and I'm sure his new series will be presented at the equally high standard. He's using C#, VB and .NET but I'm sure the series will be interesting for us Delphi users, too. [And he already gave us RB tree implementation in the EZDSL library - thanks, Julian!]

Of course, if you own his DADS book, you can read all about RB trees there.

Labels: , ,

Wednesday, February 06, 2008

TWinControl.Controls Enumerator, Revisited

Fredrik Loftheim made an interesting observation to my previous article on TWinControl.Controls enumerator - namely that the enumerator factory doesn't have to be interface based. It can be replaced by a record. Source is almost the same as before, but we can skip the interface declaration. Generated code is simpler as the enumerator factory is simply allocated from the stack and automatically destroyed when it goes out of scope. No interface support and no reference counting is required. Simpler and faster, that's the way to go.

interface

type
TControlEnumerator = record
strict private
ceClass : TClass;
ceIndex : integer;
ceParent: TWinControl;
public
constructor Create(parent: TWinControl; matchClass: TClass);
function GetCurrent: TControl;
function MoveNext: boolean;
property Current: TControl read GetCurrent;
end; { TControlEnumerator }

TControlEnumeratorFactory = record
strict private
cefClass : TClass;
cefParent: TWinControl;
public
constructor Create(parent: TWinControl; matchClass: TClass);
function GetEnumerator: TControlEnumerator;
end; { TControlEnumeratorFactory }

function EnumControls(parent: TWinControl; matchClass: TClass = nil): TControlEnumeratorFactory;

implementation

function EnumControls(parent: TWinControl; matchClass: TClass = nil): TControlEnumeratorFactory;
begin
Result := TControlEnumeratorFactory.Create(parent, matchClass);
end; { EnumControls }

{ TControlEnumerator }

constructor TControlEnumerator.Create(parent: TWinControl; matchClass: TClass);
begin
ceParent := parent;
ceClass := matchClass;
ceIndex := -1;
end; { TControlEnumerator.Create }

function TControlEnumerator.GetCurrent: TControl;
begin
Result := ceParent.Controls[ceIndex];
end; { TControlEnumerator.GetCurrent }

function TControlEnumerator.MoveNext: boolean;
begin
Result := false;
while ceIndex < (ceParent.ControlCount - 1) do begin
Inc(ceIndex);
if (ceClass = nil) or (ceParent.Controls[ceIndex].InheritsFrom(ceClass)) then begin
Result := true;
break; //while
end;
end; //while
end; { TControlEnumerator.MoveNext }

{ TControlEnumeratorFactory }

constructor TControlEnumeratorFactory.Create(parent: TWinControl; matchClass: TClass);
begin
cefParent := parent;
cefClass := matchClass;
end; { TControlEnumeratorFactory.Create }

function TControlEnumeratorFactory.GetEnumerator: TControlEnumerator;
begin
Result := TControlEnumerator.Create(cefParent, cefClass);
end; { TControlEnumeratorFactory.GetEnumerator }

The new version of the TWinControl.Controls enumerator plus some other stuff is available at http://gp.17slon.com/gp/files/gpvcl.zip.

Labels: , , , ,

Friday, February 01, 2008

TWinControl.Controls Enumerator

I'm still having fun with enumerators. The latest I wrote is a filtering enumerator for TWinControl.Controls[] that allows me to write code like this:

var
control: TControl;

for control in EnumControls(pnlAttributes, TSpeedButton) do
TSpeedButton(control).Enabled := false;

I found it interesting that Borland built a Components[] enumerator in the VCL, but not a Controls[] enumerator.


The EnumControls interface is simple. It takes a starting point for enumeration and an optional class filter. By specifying the latter, you tell the enumerator that you're only interested in child controls of a specified class.

function EnumControls(parent: TWinControl; 
matchClass: TClass = nil): IControlEnumeratorFactory;

As it is used in a for..in loop, EnumControl must return a class or interface that implements GetEnumerator function and that is exactly what it does. GetEnumerator, on the other hand, creates an instance of the TControlEnumerator class, which implements the actual enumeration.