Faster evaluation in Analytica 5.0
Since the October 30, 2017 release of Analytica 5.0 is fast approaching, it seemed like a good time to re-run the benchmark timing tests from my earlier blog posting, "Faster evaluation in Analytica 4.6", to see how the upcoming release compares.
Speed enhancements can vary a lot across different models. I am already well aware that models with large arrays (including large Monte Carlo runs) often experience a sizeable speed-up from Analytica 5.0's new multithreaded evaluation capability. But models without large arrays usually don't benefit from this, since the overhead of dividing up a computation can easily outweigh the gains from utilizing multiple cores. Similarly, models that let array abstraction take care of iteration the way it is intended are likely to benefit, whereas code with explicit FOR loops, which circumvent automatic array abstraction, has less opportunity to benefit. Anecdotally, I have already seen individual cases ranging from no speed-up at all to a four-fold speed-up. To come up with an average speed-up, we need some sort of "representative mix" of real problems. That was the concept behind the benchmark suite first reported in the "Faster evaluation in Analytica 4.6" blog post.
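The overhead argument can be sketched in plain Python (a hypothetical analogy, not Analytica's implementation): a computation handed to the engine as a few large chunks amortizes its scheduling cost across many elements, while an element-at-a-time loop gives the scheduler nothing worth dividing up.

```python
from concurrent.futures import ThreadPoolExecutor

def work(chunk):
    # Placeholder for one slice of an array computation.
    return sum(v * v for v in chunk)

data = list(range(100_000))
n_threads = 4

# Array-style: the whole computation is visible, so it can be split into
# a few large chunks, keeping per-task scheduling overhead negligible.
size = len(data) // n_threads
chunks = [data[i * size:(i + 1) * size] for i in range(n_threads)]
with ThreadPoolExecutor(max_workers=n_threads) as pool:
    total = sum(pool.map(work, chunks))

assert total == sum(v * v for v in data)

# Loop-style would submit one tiny task per element; the scheduling
# overhead then swamps any gain from running on multiple cores.
```

The same intuition applies to array abstraction: an array-level expression hands the evaluator a large, divisible unit of work, while an explicit FOR loop pins the iteration down one element at a time.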
I set aside those benchmark models several years ago for the sole purpose of benchmark testing. I excluded them from any profiling or speed measurements during code development in order to prevent intentional or unintentional tuning to the benchmark suite. In fact, this suite of models has been hidden away gathering dust since I published that blog article, except that at some point one more benchmark model (ASB) was added to the suite. The models are all actual, substantial models, in most cases pretty large.
I ran all the benchmarks under four test conditions: Analytica 4.6, and then Analytica 5.0 with 1 thread, 4 threads, and 8 threads. I then repeated all these tests 10 times and averaged the results. For the speed-up percent, I compared to the Analytica 4.6 speed. All timings were run on the same computer, which in fact is the same computer used for the tests in the previous article (Intel Core i7-2600 @ 3.4GHz, 4 cores, 8 logical processors, Windows 7). All tests used Analytica 64-bit.
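The percent columns below compare elapsed times against the 4.6 baseline. The exact formula isn't stated in the table, so as an illustration here is one common convention (an assumption on my part, not necessarily the formula used to produce the table):

```python
def percent_speedup(baseline_sec: float, new_sec: float) -> float:
    """Percent speed-up of the new elapsed time relative to the baseline.

    0% means no change; 100% means twice as fast (half the elapsed time).
    """
    return (baseline_sec / new_sec - 1.0) * 100.0

# A run that drops from 12.0 s to 10.0 s is a 20% speed-up:
print(round(percent_speedup(12.0, 10.0)))  # 20
```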
| Benchmark | Elapsed 4.6 (sec) | Elapsed 5.0 (1 thread) | Elapsed 5.0 (4 threads) | Elapsed 5.0 (8 threads) | Speed-up 5.0 (1) | Speed-up 5.0 (4) | Speed-up 5.0 (8) |
|---|---|---|---|---|---|---|---|
| Ave (w/o PO5): | | | | | 6% | 20% | 21% |
For these parallelizable models, it is interesting to see a performance boost going from 1 to 4 threads, but not much from 4 to 8 threads. I think this might be related to the fact that my computer has 4 cores and 8 logical processors; it seems the 4 physical cores may be the more important number.
The RP1 and ASB benchmarks show a strong improvement, whereas the PO5 benchmark is disappointing. I am aware that RP1 does a fair amount of Monte Carlo simulation (whereas I think Monte Carlo simulation may be a bit underrepresented by the other models in the suite). ASB does a lot of arithmetic on very large arrays. So the speed-up on those is consistent with those types of models benefiting from the utilization of multiple cores.
I ran the tests with an automated script overnight, so I didn't experience the models in the UI. Given the disappointing PO5 result, I took a look at PO5 and discovered that it issues several warnings, which are likely responsible for much of the slowdown (although I should note that 4.6 issues the same warnings, so the warnings aren't new). The model was created in Analytica 3.1, a very long time ago. It seems to rely on 0*INF→0, a behavior that was changed long ago to 0*INF→NaN in accordance with the IEEE 754 floating-point standard. It thus has NaNs propagating through the model, and the testing and handling of those NaNs slows things down somewhat. So my working hypothesis at this point is that these issues are responsible for much of the delta in speed. Having discovered this, I am motivated to edit the benchmark model and fix these issues in order to make it a fairer comparison. I have not done that yet, so I'll leave the result as-is here until I have the data, and then I'll update this posting.
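The behavior at issue holds in any IEEE 754 environment, not just Analytica; here is a minimal Python sketch:

```python
import math

# Under IEEE 754, 0 * infinity is an invalid operation and yields NaN, not 0.
x = 0.0 * math.inf
print(math.isnan(x))   # True

# NaN propagates through arithmetic, so one stray NaN can spread
# through all downstream results...
print(math.isnan(x + 1.0), math.isnan(x * 2.0))  # True True

# ...and detecting it requires an explicit check, since NaN != NaN.
print(x == x)          # False
```

This is why a model that still assumes the old 0*INF→0 behavior can end up with NaNs threaded through many of its results, and why testing for and handling them adds evaluation cost.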
I included the total average with and without the PO5 benchmark. Given its sub-second evaluation times and the discovery about the obsolete operations that are generating warnings, it probably isn't as representative of large-scale computations as the others. I would say a reasonable conclusion is that Analytica 5.0 is somewhere in the range of 15-20% faster on average with multithreading, but with actual results highly dependent on the specifics of the model.
I am sure that future releases will continue to see improvements in evaluation speed, so I'll make an effort to repeat these benchmark measurements and report them in a blog posting with each new release.
Be sure to visit the What's new in Analytica 5.0 page. There's even a video there showcasing the huge number of new features!