We are planning a third release of the DaCapo benchmark suite for the first half of 2012. We specifically solicit contributions of: a) suggestions for new workloads, b) fixes and improvements to the suite, and c) suggestions for candidates for retirement. The best way to contribute is concretely, with code in the form of patches or mercurial bundles. Please use the trackers to log your contributions and/or suggestions. If possible, please make your contributions against the mercurial head.
After three years of development, the new release of the DaCapo benchmark suite is finally available. You can grab it here. Please be sure to read the release notes before using this release. This release adds new workloads, retires many old ones, and overhauls all the others. It also includes many improvements to the harness and command-line interface.
Our continuous performance regression pages (release and development) have been updated to include time stamps and direct links to the jars used. This means you can always download the current development jar.
After a huge development effort, we now have early drafts of two benchmarks based on the Apache DayTrader J2EE workload: tradebeans and tradesoap. At this stage both benchmarks are unstable on some platforms, including our testing environment. We strongly encourage feedback (please use the mailing list). Please try them out by downloading the development jar. Read more about these workloads here.
As part of a major push to get a beta version of the suite ready, we have started culling those benchmarks that will not appear in the next release. The following benchmarks have been removed: antlr (jython now uses antlr internally), bloat (we have a number of program analysis tools), chart (batik has some similarity as a vector graphics renderer), hsqldb (replaced by derby).
A new benchmark, avrora, has been added to the suite. Avrora was proposed by Ben Titzer from Sun. Avrora is a parallel discrete event simulator that performs cycle-accurate simulation of a sensor network. Avrora exhibits interesting patterns of parallelism and is unlike any of the existing DaCapo workloads, so it makes a very interesting addition. Please try it out by downloading the development jar here. Thank you very much Ben!
Almost all benchmarks have been updated to reflect the latest publicly available releases. For jython, lusearch, luindex, and pmd this marks significant changes. Only eclipse and xalan remain to be updated.
In anticipation of the upcoming release, we have created a TODO list of work that needs to be done first. Please send email to the mailing list or directly to Steve Blackburn if you think you can help.
A new paper describing the development of the DaCapo benchmark suite and broader issues of methodology appears in August 2008 CACM.
We perform comprehensive, 12-hourly performance regressions, running various VMs against the DaCapo svn head (for the upcoming release), and against the 2006 release.
Slides from a presentation discussing the development of the DaCapo suite and the broader topic of evaluation methodology are now available.
Cliff Click pointed us to the fragger widget, which artificially injects fragmentation into a heap. We are considering including this as a runtime option in the DaCapo suite, as it could be very useful for those using DaCapo for memory management and locality work. Thanks Cliff.
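To make the idea of injected fragmentation concrete, here is a minimal sketch of one way it could be done. It is not Cliff Click's tool and not a DaCapo feature: the class name FragmentationSketch, the method fragment, and its parameters are all hypothetical. Whether the holes it leaves behind persist depends entirely on the collector, since a compacting or copying collector may close them again.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the fragger tool itself.
final class FragmentationSketch {

    // Allocates `count` blocks of `blockBytes` bytes and retains every
    // `keepEvery`-th block. The unretained blocks become garbage, so the
    // survivors may end up separated by free gaps (collector permitting).
    static List<byte[]> fragment(int count, int blockBytes, int keepEvery) {
        List<byte[]> survivors = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            byte[] block = new byte[blockBytes];
            if (i % keepEvery == 0) {
                survivors.add(block); // keep a reference so the block stays live
            }
        }
        return survivors;
    }

    public static void main(String[] args) {
        // The sizes here are arbitrary example values.
        List<byte[]> live = fragment(100_000, 256, 8);
        System.out.println("retained blocks: " + live.size());
    }
}
```

The caller has to hold on to the returned list for the duration of the measurement; once it is dropped, the survivors are garbage too.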
Stability and performance comparisons among a number of JVMs running against the DaCapo suite can now be found here, with tests against the current DaCapo development head available here. These numbers are updated daily. Note that they currently just present raw data, which tends to be very noisy given the run-to-run variation due to the non-determinism of adaptive optimization in modern JVMs.
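Readers who want to tame that noise in their own measurements can summarize repeated runs rather than compare single numbers. The following is a minimal sketch, not something the regression pages do: the class RunSummary and its methods are hypothetical, the timings are made up, and the 1.96 factor assumes a normal approximation (for a handful of runs a Student's t critical value would be more appropriate).

```java
import java.util.Arrays;

// Hypothetical helper for summarizing repeated benchmark timings.
final class RunSummary {

    // Arithmetic mean of the samples.
    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    // Sample standard deviation (n - 1 in the denominator).
    static double stdDev(double[] xs) {
        double m = mean(xs);
        double sumSquares = 0.0;
        for (double x : xs) {
            sumSquares += (x - m) * (x - m);
        }
        return Math.sqrt(sumSquares / (xs.length - 1));
    }

    // Half-width of an approximate 95% confidence interval for the mean,
    // using the normal critical value 1.96 (an assumption, see above).
    static double halfWidth95(double[] xs) {
        return 1.96 * stdDev(xs) / Math.sqrt(xs.length);
    }

    public static void main(String[] args) {
        double[] timesMs = {4120, 3980, 4310, 4055, 4190}; // made-up timings
        System.out.printf("mean = %.1f ms, 95%% CI half-width = %.1f ms%n",
                mean(timesMs), halfWidth95(timesMs));
    }
}
```

Two configurations whose intervals overlap heavily should not be ranked on the strength of a single run.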
We have started overhauling our workloads in anticipation of our next release. This includes the revival of the batik workload and a reworking of hsqldb to use the Apache derby database engine. We have also started overhauling the implementation of multithreading within the DaCapo harness.
We have started evaluating Apache Geronimo and its DayTrader benchmark for potential inclusion in the next DaCapo benchmark suite.
A second minor maintenance release (dacapo-2006-10-MR2) is now available. This includes minor bug fixes and a repackaging of the benchmarks suitable for people wanting to use tools like Soot to perform ahead of time analysis of the code. The workloads are unchanged.
With assistance from Chris Kulla, we have a draft of a new SunFlow benchmark in our subversion repository as a candidate for inclusion in the DaCapo suite. To evaluate the benchmark, grab the svn head ("svn co https://dacapobench.svn.sourceforge.net/svnroot/dacapobench/benchmarks/trunk dacapo"), build the sunflow benchmark ("cd dacapo/benchmarks; ant sunflow.source clean sunflow jar.quick"), then run it ("java -jar dacapo.jar sunflow"). SunFlow, along with other candidates, will be evaluated over the next 6-12 months for possible inclusion in the next release of the suite. Note that SunFlow requires Java 1.5 or later.
A minor maintenance release (dacapo-2006-10-MR1) is now available. This includes a bug fix and minor enhancements to the interface. The workloads are unchanged.
The first full release of the DaCapo benchmark suite (dacapo-2006-10) is now available. This release includes the following bug fix:
- Fixed a bug in pmd sources. Three of the pmd input sources included a non-UTF-8 character, for which the resulting behavior is undefined. Some JVMs (correctly) produced different behavior for these files. The offending character has been removed from the three input files.
We have produced release candidates in anticipation of our initial full release around October 23. These are now available for download and evaluation. Feedback received prior to October 20 may be included in the full release. New features for release candidates include:
- Further improvements to the validation mechanism. We now accommodate the DOS "\" path separator--hopefully this is the last of the OS-specific issues.
- We have added a new build target split-deps. This allows those who build from source to separate out dependencies into a separate jar (this feature was added after feedback on the mailing list, and is used to enable whole program analysis). [Thanks to Eric Bodden of McGill!]
- Improvements to the harness. Fixed a bug that prevented output from subsequent benchmarks being displayed when executing more than one benchmark at a time. Improved usage message. [Thanks to Vladimir Strigun of Intel].
- The eclipse benchmark now uses a self-contained dummy JRE for its build tasks, so it should work on any JVM without special action (previously a third-party JRE needed to be specified on the command line of JVMs which Eclipse did not recognise).
- Minor improvements to the build scripts. This includes updates to source URLs, refactoring, comments.
We have made a new beta release, beta-2006-10, which has a number of improvements including:
- The xalan benchmark has been re-written. This version has a more realistic workload. We also force the use of version 2.4.1 and ensure that xalan is used (rather than a bundled xslt processor). [Thanks to Kev Jones of Intel!]
- The validation mechanism has been completely overhauled. The validation mechanism ensures that benchmarks only "PASS" if they complete correctly. The harness should now work properly on any OS, and deals with the users' pwd which creeps into some benchmark output. [Thanks to Robin Garner of ANU!]
- The lusearch benchmark has been improved, so that each query thread includes a mix of unique and shared queries (previously all queries were unique/non-shared).
- The build system has been improved considerably and is a little better documented.
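The validation item above can be illustrated with a small sketch of output normalization followed by a digest comparison. This is not the DaCapo harness code: the class OutputValidator, its methods, the "$PWD" placeholder, and the choice of SHA-256 are all assumptions made for illustration. The sketch replaces the working directory with a fixed token and converts "\" separators to "/", so the same run yields the same digest regardless of OS or checkout location.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration; the real harness may work differently.
final class OutputValidator {

    // Replaces the working directory with a fixed token, then converts
    // backslash separators to forward slashes. The order matters: a DOS-style
    // working directory must be matched before its backslashes are rewritten.
    static String normalize(String output, String workingDir) {
        return output.replace(workingDir, "$PWD").replace('\\', '/');
    }

    // Hex-encoded SHA-256 digest of the text (digest choice is an assumption).
    static String digestHex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every JVM
        }
    }

    // A run passes only if the normalized output matches the expected digest.
    static boolean passes(String output, String workingDir, String expectedDigest) {
        return digestHex(normalize(output, workingDir)).equals(expectedDigest);
    }

    public static void main(String[] args) {
        // Made-up outputs of the same run on two platforms.
        String unix = normalize("wrote /home/alice/scratch/out.txt", "/home/alice");
        String dos = normalize("wrote C:\\bench\\scratch\\out.txt", "C:\\bench");
        System.out.println(unix.equals(dos));
    }
}
```

Comparing digests rather than full text keeps the expected values small enough to ship alongside each benchmark, at the cost of not showing where a failing output diverged.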