Benchmark

The sbt-hydra plugin offers a hydraBenchmark command that makes it dead simple to benchmark Hydra performance against vanilla Scala on your projects. Assuming Hydra is installed, all you need to do is enter the sbt shell and execute the hydraBenchmark command:

$ sbt
[info] ...
> hydraBenchmark 10

The argument to hydraBenchmark dictates how many runs to perform. The higher the number of runs, the more precise the final comparison. Each run compiles both main and test sources for all dependent and aggregate projects of the currently selected project (when you enter the sbt shell, the current project is the build root project; read the sbt documentation about navigating projects interactively for details).
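For instance, if you are only interested in a specific subproject, you can navigate to it before starting the benchmark (the core project name below is just an example; a higher run count yields more stable medians):

$ sbt
[info] ...
> project core
> hydraBenchmark 20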

There are two important gotchas to ensure the reported results are meaningful:

1) All applications eating up CPU (e.g., IDEs, browsers, VMs, and similar) must be closed (if you are on a Unix/OS X machine, use top to check the machine is idle). Ideally, you may even want to reboot your machine before starting the benchmark.

2) Check that sbt has enough memory to compile your project. If you are unsure that's the case, you can check your project's memory consumption with the help of a profiler. Also, you may find some useful information in the memory tuning section. A minimal example of giving sbt more memory is shown below.
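For point 2, a common way to raise sbt's memory is a .jvmopts file in the project root; the values below are purely illustrative and should be adapted to your project's actual needs:

-Xmx4G
-Xss2M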

Note that if you are using a laptop, it's highly recommended to run the benchmark with your machine plugged into a power supply. Also, check that all energy-saving settings are disabled.

If all looks good, go ahead and execute the hydraBenchmark command on your project:

$ sbt
[info] ...
> hydraBenchmark 10
...
[info] Warming up JVM.

The benchmark's execution starts by warming up the JVM. Why? Because each time we start sbt the JVM is in a “cold” state: the bytecode is not yet JIT-compiled and the project dependencies are not yet in the OS filesystem cache. As the code runs, the JVM “warms up” by JIT-compiling methods that are executed frequently, and build times improve quickly. All versions of the code benefit from a warm JVM. Therefore, for the collected compile times to be stable and meaningful, we must start by warming up the JVM. This boils down to executing a number of compile cycles without recording the compilation time. When the JVM is deemed warm, each of the subprojects is compiled N times (where N is the number of runs passed to the hydraBenchmark command), and the exact compile time for each subproject is tracked.
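Conceptually, the measurement loop is something like the following simplified Scala sketch; it is purely illustrative and not sbt-hydra's actual implementation: a few unrecorded warm-up compilations are followed by N timed ones, from which the median, minimum, and maximum are derived.

// Illustrative sketch only, not sbt-hydra's actual implementation.
object BenchmarkSketch {
  // Wall-clock time of a single compilation, in milliseconds.
  def timeMillis(body: => Unit): Long = {
    val start = System.nanoTime()
    body
    (System.nanoTime() - start) / 1000000
  }

  def benchmark(runs: Int, warmupRuns: Int)(compile: => Unit): Seq[Long] = {
    // Warm-up: compile without recording times, so the JIT and OS caches settle.
    (1 to warmupRuns).foreach(_ => compile)
    // Measured runs: record the duration of each compilation.
    (1 to runs).map(_ => timeMillis(compile))
  }

  // Simple (median, min, max) summary; the median here is just the middle element.
  def summary(times: Seq[Long]): (Long, Long, Long) = {
    val sorted = times.sorted
    (sorted(sorted.size / 2), sorted.head, sorted.last)
  }
}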

Once execution has completed, a report similar to the following is produced:

[info] Hydra compilation statistics summary (#runs = 10):
[info]  coreJS/compile:compile  - median:  14.011s, min:   13.43s, max:  15.429s
[info]  coreJS/test:compile     - median:      50s, min:      50s, max:      51s
[info]  coreJVM/compile:compile - median:  10.146s, min:  10.283s, max:  10.988s
[info]  coreJVM/test:compile    - median:      49s, min:      48s, max:      50s
[info] Vanilla Scala compilation statistics summary (#runs = 10):
[info]  coreJS/compile:compile  - median:      34s, min:  23.235s, max:      37s
[info]  coreJS/test:compile     - median:    3m52s, min:    2m25s, max:    4m15s
[info]  coreJVM/compile:compile - median:  28.412s, min:  18.911s, max:      31s
[info]  coreJVM/test:compile    - median:    3m55s, min:    2m24s, max:    4m19s
[info] Hydra vs vanilla Scala final comparison (#runs = 10):
[info]  coreJS/compile:compile  - Hydra is 59% faster (2.43x speedup) than vanilla Scala.
[info]  coreJS/test:compile     - Hydra is 78% faster (4.64x speedup) than vanilla Scala.
[info]  coreJVM/compile:compile - Hydra is 64% faster (2.80x speedup) than vanilla Scala.
[info]  coreJVM/test:compile    - Hydra is 79% faster (4.80x speedup) than vanilla Scala.

Note that there are actually three different summaries:

  • First, a summary of the compile times obtained with Hydra.
  • Second, a summary of the compile times obtained with vanilla Scala.
  • Third, the final comparison of Hydra performance versus vanilla Scala (the median is used for the comparison).

The first two summaries provide detailed information about how much time it took to compile each subproject's main and test sources, with and without Hydra. In particular, for each project, the median, minimum, and maximum compilation times are reported. For the final comparison to be accurate, there should be little variation among these values. If that's not the case, the likely reason is that there is an application running in the background that's eating up CPU (if so, close it and re-run the benchmark). The third report compares the performance of Hydra versus vanilla Scala, using the median for the comparison.
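As a concrete example, the percentages and speedups in the comparison above follow from dividing the two medians; taking the coreJS/compile:compile row:

34s / 14.011s ≈ 2.43x speedup
1 - 14.011s / 34s ≈ 0.59, i.e. Hydra is about 59% faster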

Notice that if a subproject takes very little time to compile (less than 5 seconds), it is excluded from the report. The reason is to reduce noise and help you focus your attention on the subprojects that take significant time to compile, as those are the ones hampering your productivity. If you would like to see the exact compilation times for these ignored projects as well, just type last in the sbt shell after the hydraBenchmark command has finished executing.
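For example, right after the benchmark completes:

> hydraBenchmark 10
...
> last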

Note also that if Hydra is disabled on a subproject, the compilation time for that subproject is not reported, as there would be no way to compare the vanilla Scala compile time performance against Hydra. There are two possible reasons why Hydra may be disabled on a subproject: 1) you have explicitly disabled the HydraPlugin on the subproject, or 2) the subproject uses a version of Scala that is not supported by Hydra (e.g., Scala 2.10.x).
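For reference, case 1) typically looks like the following in build.sbt, assuming HydraPlugin is an sbt auto-plugin that can be disabled with sbt's standard disablePlugins mechanism (the subproject name is hypothetical):

lazy val legacy = (project in file("legacy"))
  .disablePlugins(HydraPlugin) // compiled with vanilla Scala and excluded from the benchmark report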

Fine print

Benchmarking is hard, and automating it is not a panacea. Keep an eye on the numbers and make sure they make sense. Pay particular attention to large variations in measured times and check the Memory Tuning section for tips.

We noticed a significant performance degradation on AWS m3.2xlarge instances after 8 compilation runs, due to the CodeCache being exhausted. The JVM usually prints a warning message like the one below; however, in this case it didn't. The exact same setup worked fine on m4.2xlarge instances.

Java HotSpot(TM) 64-Bit Server VM warning: CodeCache is full. Compiler has been disabled.
Java HotSpot(TM) 64-Bit Server VM warning: Try increasing the code cache size using -XX:ReservedCodeCacheSize=

To be on the safe side, you can add -XX:+PrintCodeCache to your .jvmopts before running the benchmark. This will make the JVM print information about the CodeCache on exit, so you can double-check there was no issue there.
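For example, a .jvmopts along these lines covers both sizing and monitoring the CodeCache (the size below is illustrative):

-XX:ReservedCodeCacheSize=256M
-XX:+PrintCodeCache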