Jean-François Agaësse
What you mention looks like the approach familiar to SW engineers through widely adopted benchmarks such as Dhrystone or more recent ones (AnTuTu, ...). However, recent threads/disputes on the web have pointed out that these benchmarks can be somewhat twisted to highlight the benefits of a given architecture and do not provide the full picture. On the other hand, they are so complex that you can pick only ONE part of the outcomes and start creating buzz while disregarding results that do not go in your direction (see the recent "did Intel beat ARM" story).
Without going as far in duplicity, we all know that benchmarks only cover part of the problem, so applying them to what they were not designed for is pure nonsense. For instance, Dhrystone may be used by HW designers to give a figure of merit for their core w.r.t. power dissipation, mainly because it is the only one they have. But we all know that Dhrystone aims at measuring the efficiency of the computation logic and does not exercise realistic use cases, power-wise, where the different levels of memory are exercised. Consequently, in real life, the power figures and the silicon areas where power is burnt will be very different from what the benchmark gives you.
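To make that point concrete, here is a small, purely illustrative C sketch (not a real power benchmark; buffer sizes, strides and iteration counts are my own assumptions). The first loop is Dhrystone-like, keeping a tiny working set in L1 and exercising the integer pipeline; the second walks a buffer far bigger than the last-level cache, so power is burnt mostly in the caches and memory system instead:

/* Illustrative only: a cache-resident, Dhrystone-like kernel vs. a
 * memory-bound one. Same nominal "work", very different silicon areas
 * active, hence very different power profiles. All sizes are assumptions. */
#include <stdio.h>
#include <stdlib.h>

#define SMALL  (4 * 1024)            /* ~16 KB of ints: fits in L1 cache     */
#define LARGE  (16 * 1024 * 1024)    /* ~64 MB of ints: exceeds the LLC      */

/* Dhrystone-style: tight integer work on a tiny working set. */
static long compute_bound(const int *buf, long iters) {
    long acc = 0;
    for (long i = 0; i < iters; i++)
        acc += buf[i % SMALL] * 3 + 1;     /* stays in L1, exercises the ALU */
    return acc;
}

/* Memory-bound: large-stride walk over a huge buffer, mostly cache misses. */
static long memory_bound(const int *buf, long iters) {
    long acc = 0;
    for (long i = 0; i < iters; i++)
        acc += buf[(i * 4099) % LARGE];    /* large stride defeats locality  */
    return acc;
}

int main(void) {
    int *small = calloc(SMALL, sizeof(int));
    int *large = calloc(LARGE, sizeof(int));
    if (!small || !large) return 1;
    printf("%ld %ld\n", compute_bound(small, 100000000L),
                        memory_bound(large, 100000000L));
    free(small);
    free(large);
    return 0;
}

Measure the two loops with whatever power instrumentation you trust and you will see why a Dhrystone number alone says little about the power of a realistic workload.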
Assuming we have managed to solve the business case(s) of having an independent lab run the benchmark (meaning that Aart, Wally and a few others agree on a free and fair comparison), we will still not be home free. Based on my experience in Physical Design, I can tell that a bad tool will not give good results, but a tool proving OK in a benchmark is not guaranteed to exhibit the same technical advantage over its competitors in real life. This is because, unlike the well-identified SW processing situations we find in SW benchmarks, BE-wise all design situations are different (when it comes to BE tools we speak about "heuristics", which means that a small variation in the initial conditions – design, technology, floorplan shape, tool settings – may give very different results; a toy illustration of this sensitivity is sketched below). Moreover, company design flows / styles / practices also differ. The long and fruitful partnerships between semiconductor companies and EDA vendors have resulted in some tools specially trimmed for a kind of dominant design style and flow, and that particular tool will not be at the top for different methodologies.
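Here is a toy C sketch of what I mean by heuristic sensitivity. It is not any real EDA algorithm, just a greedy swap-based "placement" of cells in a 1-D row on a random 2-pin netlist (all sizes and the cost model are my own assumptions). Changing only the random seed, i.e. the initial conditions, typically leaves the heuristic in a different local optimum with a different final wirelength:

/* Toy illustration only: greedy swap-based linear placement.
 * Different seeds (initial placements) usually converge to different
 * local optima, mirroring how BE heuristics react to small changes
 * in design, floorplan or settings. Netlist and sizes are made up. */
#include <stdio.h>
#include <stdlib.h>

#define CELLS 64
#define NETS  128

static int net_a[NETS], net_b[NETS];   /* 2-pin nets between cells       */
static int slot_of[CELLS];             /* cell -> position in a 1-D row  */

/* Total "wirelength": distance between the two pins of every net. */
static int cost(void) {
    int c = 0;
    for (int n = 0; n < NETS; n++)
        c += abs(slot_of[net_a[n]] - slot_of[net_b[n]]);
    return c;
}

static int run(unsigned seed) {
    srand(seed);
    for (int i = 0; i < CELLS; i++) slot_of[i] = i;
    /* random initial placement: Fisher-Yates shuffle of the slots */
    for (int i = CELLS - 1; i > 0; i--) {
        int j = rand() % (i + 1), t = slot_of[i];
        slot_of[i] = slot_of[j]; slot_of[j] = t;
    }
    /* greedy improvement: keep any pairwise swap that reduces cost */
    int improved = 1;
    while (improved) {
        improved = 0;
        for (int i = 0; i < CELLS; i++)
            for (int j = i + 1; j < CELLS; j++) {
                int before = cost();
                int t = slot_of[i]; slot_of[i] = slot_of[j]; slot_of[j] = t;
                if (cost() < before) improved = 1;
                else { t = slot_of[i]; slot_of[i] = slot_of[j]; slot_of[j] = t; }
            }
    }
    return cost();
}

int main(void) {
    srand(12345);                      /* fixed netlist, independent of seed */
    for (int n = 0; n < NETS; n++) {
        net_a[n] = rand() % CELLS;
        net_b[n] = rand() % CELLS;
    }
    for (unsigned seed = 1; seed <= 4; seed++)
        printf("seed %u -> final wirelength %d\n", seed, run(seed));
    return 0;
}

Now imagine the "seed" is not a random number but the design, the technology, the floorplan shape or a tool setting, and you see why one benchmark point says little about how a BE tool will behave on your next chip.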
So when a team is at par with its current solution, people are very unlikely to go benchmarking themselves, or to trust others' results and swap tools for a quite uncertain benefit. For that you need a revolution, like moving from channel routing to surface routing in the mid-1990s.
But, from my perspective, for the last half-decade the conventional EDA area has been more about evolution than revolution. That does not mean that what has been introduced is not good, but it looks more like variations on the 'divide and conquer' strategy and/or more efficient implementations of multithreading.
(But I would be happy to read why I am wrong, or a bit exaggerating.) As a consequence, "independent" official benchmarks are likely never to happen: if they ever showed a revolution, the EDA leaders would start explaining "this is not the right design for our tools" (aha! heard this many times as well?), and if they only showed a limited benefit, no one will make a change for 20% that he has not even verified himself.
In the end it looks like, based on such benchmark results, EDA vendors would assault the design directors, advertising their tools as 'solving some of the problems you may face'. A situation much like the one we all know when winter starts, with ads in the newspapers and/or on TV for pills fighting some of the bacteria and/or viruses responsible for the flu. ... Usually you catch the other ones, the ones the pills you have been sold will not help defeat.
JFA