Speed comparison: Variant, TValue, and TOmniValue
When I read "TValue is very slow!" on the TURBU Tech blog earlier today, I immediately wondered how fast TOmniValue (the basic data-exchange type in the OmniThreadLibrary) is compared to Variant and TValue. What else could I do but write a benchmark?!

I chose to test the performance in a way that is slightly different from Mason's approach. My test measures not only the store operation but also load and (in some instances) add. The framework is also slightly different and decouples the time-management code from the benchmark.

As you can see, all three tests are fairly similar. They count from 0 to 100,000,000 and the counter is stored in a Variant/TValue/TOmniValue. The Variant test follows the same semantics as if the counter variable were declared as an integer, while the TValue and TOmniValue tests require some help from the programmer to determine how the counter should be interpreted (AsInteger).

The results were interesting. TValue is about 5x slower than Variant, which in turn is 7x slower than TOmniValue. Of course, I wanted to know where this speed difference comes from, so I looked at the assembler code.

Digging into the assembler

Variant
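For orientation, the statement whose compiled form is discussed in this section is the body of the Variant counting loop. A minimal sketch of such a loop might look like the following. The function name CountWithVariant and the limit parameter are assumptions made for illustration; the benchmark's own test method and timing framework are not reproduced.

```pascal
program VariantLoopSketch;
{$APPTYPE CONSOLE}

uses
  Variants; // pulls in the runtime support for Variant arithmetic

// Hypothetical stand-in for the Variant test: the counter itself is a Variant.
function CountWithVariant(limit: integer): integer;
var
  counter: Variant;
begin
  counter := 0;
  while counter < limit do
    counter := counter + 1; // Variant addition on every pass
  Result := counter;        // implicit Variant-to-integer conversion
end;

begin
  Writeln(CountWithVariant(1000)); // prints 1000
end.
```

The loop reads exactly as it would with an integer counter, which is why all of the conversion work ends up hidden in the generated code.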
Very straightforward code. The Variant is copied into a temporary location, the number 1 is converted into a Variant, those two Variants are added, and the result is stored back into the counter variable.

As you can see, Variant calculations are really clumsy. It would be much faster to convert the Variant to an integer, add one, and convert the result back. Like this.

procedure TfrmBenchmark.TestVariant2(var benchRes: integer);

This modified version generates much faster code.
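The idea of converting to an integer, adding one, and converting back can be sketched as shown here. This is an illustration under stated assumptions, not the body of TestVariant2: the function name CountWithVariantViaInteger and the limit parameter are hypothetical, and the cast in the loop condition is my own choice.

```pascal
program VariantViaIntegerSketch;
{$APPTYPE CONSOLE}

uses
  Variants; // runtime support for Variant conversions

// Hypothetical variant of the counting loop: the arithmetic is done on integers,
// and the Variant is used only as storage.
function CountWithVariantViaInteger(limit: integer): integer;
var
  counter: Variant;
begin
  counter := 0;
  while integer(counter) < limit do
    // Variant -> integer, integer addition, integer -> Variant
    counter := integer(counter) + 1;
  Result := counter;
end;

begin
  Writeln(CountWithVariantViaInteger(1000)); // prints 1000
end.
```

The explicit cast replaces a Variant addition with one conversion in each direction plus a plain integer add.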
Benchmarking confirms my theory. The optimized version needed only 1220 ms to complete the test, which makes it almost 5x faster than the original Variant code.

TValue
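The TValue counterpart of the counting loop can be sketched in the same way. The function name CountWithTValue and the limit parameter are assumptions for illustration. The unit is called Rtti in Delphi 2010 and System.Rtti in later versions that use unit scope names.

```pascal
program TValueLoopSketch;
{$APPTYPE CONSOLE}

uses
  Rtti; // System.Rtti in newer Delphi versions

// Hypothetical stand-in for the TValue test: the counter is stored in a TValue
// and read back through AsInteger on every pass.
function CountWithTValue(limit: integer): integer;
var
  counter: TValue;
begin
  counter := 0; // implicit integer-to-TValue conversion
  while counter.AsInteger < limit do
    counter := counter.AsInteger + 1; // read as integer, add, convert back
  Result := counter.AsInteger;
end;

begin
  Writeln(CountWithTValue(1000)); // prints 1000
end.
```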
The TValue code is quite neat. The counter is converted to an integer, one is added, the result is converted into a temporary TValue, and this temporary TValue is copied back into the counter.

Why, then, is the TValue version so much slower? We'll have to look into the implementation to find the answer. First, let's find out why TOmniValue is so fast.

TOmniValue
Weird stuff, huh? The counter is converted to an integer, then a bunch of funny code is executed, and the result is converted back to a TOmniValue. The beginning and the end are easy to understand, but what's going on in between?

The answer is inlining. Much of the TOmniValue implementation is marked inline, and what we are seeing here is the internal implementation of the AsInteger property. I'll return to this later, but first let's check what happens if all these inline modifiers are removed.
The generated code is now almost the same as in the TValue case; only the stack offsets are different. It is also much slower: instead of 839 ms the code took 3119 ms to execute and was only twice as fast as the original Variant code (and much slower than the modified Variant code).

Inlining AsInteger alone couldn't make such a big difference. It looks like CopyRecord is the culprit for the slowdown. I didn't verify this by measurement, but if you look at the _CopyRecord implementation in System.pas, it is obvious that record copying cannot be very fast. The Delphi compiler team would do a lot of good if future versions of the compiler generated custom copying code adapted to each record type.

Use the source, Luke!

What's left for me is to determine the reason for the big speed difference between TValue and TOmniValue. To find it, I had to dig into the implementation of both records. Of most interest to me were the AsInteger getter and the Implicit(from: integer) operator.

TOmniValue

TOmniValue lives in OtlCommon.pas. The AsInteger getter, GetAsInteger, just remaps the call to the GetAsInt64 method. Similarly, Implicit maps to SetAsInt64.

  type
    ovData: int64;
    ovType: (ovtNull, ovtBoolean, ovtInteger, ovtDouble, ovtExtended, ovtString,
      ovtObject, ovtInterface, ovtVariant, ovtWideString, ovtPointer);

  function TOmniValue.GetAsInt64: int64;
  begin
    if IsInteger then
      Result := ovData
    else if IsEmpty then
      Result := 0
    else
      raise Exception.Create('TOmniValue cannot be converted to int64');
  end; { TOmniValue.GetAsInt64 }

  procedure TOmniValue.SetAsInt64(const value: int64);
  begin
    ovData := value;
    ovType := ovtInteger;
  end; { TOmniValue.SetAsInt64 }

The code is quite straightforward. Some error checking is done in the getter and the value is just stored away in the setter. Now the assembler code from the first TOmniValue example makes some sense: we were simply looking at the implementation of GetAsInt64. (The Implicit operator was not inlined.)
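To make the pattern concrete, here is a minimal sketch of a tagged record with an inlined getter and setter. It is a cut-down illustration modeled on the fields quoted above, not the OtlCommon code: TMiniValue, TMiniKind, the mvk... constants, and CountWithMiniValue are all hypothetical names.

```pascal
program TaggedValueSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical type tag; the real record distinguishes many more kinds.
  TMiniKind = (mvkNull, mvkInteger, mvkOther);

  // Hypothetical record: one data field plus a tag describing what it holds.
  TMiniValue = record
  private
    FData: int64;
    FKind: TMiniKind;
    function GetAsInt64: int64; inline;
    procedure SetAsInt64(const value: int64); inline;
  public
    property AsInt64: int64 read GetAsInt64 write SetAsInt64;
  end;

function TMiniValue.GetAsInt64: int64;
begin
  if FKind = mvkInteger then
    Result := FData
  else if FKind = mvkNull then
    Result := 0
  else
    raise Exception.Create('TMiniValue cannot be converted to int64');
end;

procedure TMiniValue.SetAsInt64(const value: int64);
begin
  FData := value;
  FKind := mvkInteger;
end;

// Counting loop in the style of the benchmark, using the property in both directions.
function CountWithMiniValue(limit: int64): int64;
var
  counter: TMiniValue;
begin
  counter.AsInt64 := 0; // also initializes the tag
  while counter.AsInt64 < limit do
    counter.AsInt64 := counter.AsInt64 + 1;
  Result := counter.AsInt64;
end;

begin
  Writeln(CountWithMiniValue(1000)); // prints 1000
end.
```

Because the record holds no managed fields and the accessors are marked inline, the compiler is free to reduce each property access to a tag check and a plain 64-bit load or store. Whether it actually does so depends on the compiler version and settings.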
TValue

The TValue record lives in RTTI.pas. The AsInteger getter gets remapped to the generic version, AsType<Integer>, which calls TryAsType<T>. In a slightly less roundabout manner, Implicit calls From<Integer>.

  function TValue.TryAsType<T>(out AResult: T): Boolean;

  class function TValue.From<T>(const Value: T): TValue;

It's quite obvious that the TValue internals are not optimized for speed. Everything is mapped to generics and the RTTI system, which is fast, but not fast enough to be used in computationally intensive code.

Conclusion
P.S. Using OtlCommon won't bring in any other parts of the OTL library. It does require the following units to compile: DSiWin32, GpStuff, and GpStringHash. Nothing from those units will be linked in, as the TOmniValue implementation doesn't depend on them. The simplest way to get them all is to download the latest stable OmniThreadLibrary release.