sbt plugin¶
The Hydra sbt plugin smoothly integrates Hydra into sbt. What the plugin does is override the sbt compile
task to compile your Scala sources with Hydra instead of the vanilla Scala compiler. From your perspective as a user, the only noticeable difference when using the Hydra sbt plugin is that your project's Scala sources are compiled faster.
This section covers the Hydra sbt plugin's functionalities. Read here if you are looking for instructions on how to get started.
How does it work?¶
By adding the sbt-hydra
plugin to your build, the auto plugin HydraPlugin
is automatically enabled in all Scala projects defined in the build.sbt
. This entails that no changes are required in your build file to work with Hydra.
As mentioned before, compilation with Hydra happens in parallel. To do so, Hydra spawns a number of workers equal to the number of available physical cores and, by default, it reaches optimal speedup by automatically tuning each worker's workload. In addition to parallelizing compilation, Hydra also collects and pushes compile-time metrics to a web-based dashboard.
Let's explore next the configuration keys offered by the HydraPlugin
.
Configuration¶
The HydraPlugin
provides a number of configuration keys that can be used for fine-tuning how Hydra works.
Settings and tasks:¶
hydraWorkers: Number of workers to use for compiling a project's Scala sources (defaults to the machine's number of physical cores).
hydraSourcePartitioner: The source partitioner to use on a project: "auto", "explicit", "package", "plain" ("auto" is the default).
hydraPartitionFile: The file used by the explicit partitioner to split sources to workers.
hydraTimingsFile: The CSV file where Hydra will log compilation times. (Deprecated. Use the dashboard to inspect compilation times.)
hydraScalaVersion: The version of Scala Hydra used to compile Scala sources.
hydraIsEnabled: Flag controlling if Hydra is used to compile a project.
hydraMetricsServiceStart: Starts the metrics service that pushes compilation metrics to the dashboard (note that this task is automatically triggered when entering the interactive shell).
hydraMetricsServiceJvmOptions: JVM options used to initialize the metrics service (defaults to Seq("-Xmx256M")).
hydraInvalidateCaches: A task that deletes all information gathered from previous builds. Next compilation will start fresh.
hydraBaseDirectory: The directory where Hydra can write its log, caches and other book-keeping files. By default it is .hydra/sbt, inside the base directory.
hydraCheckForUpdatesEnabled: A flag controlling if availability of Hydra updates should be checked.
hydraDisplayInfo: A flag that controls whether Hydra shows compilation statistics on startup and when certain milestones are hit.
hydraMetricsServiceJvmOptions
is a build-level key. All other keys are scoped to the project, meaning you can provide a different value for each of them on each project. The ones that are relevant for optimizing the execution of Hydra are hydraWorkers
and hydraSourcePartitioner
.
.scala build files
To import these settings in a .scala
build file, add this import:
import com.triplequote.sbt.hydra.HydraPlugin.autoImport._
Commands¶
Hydra provides a few commands for your convenience. sbt commands live outside the scoping system:
hydraBenchmark: Compile all projects with Hydra and vanilla Scala and report speedup values.
hydraCompilationStats: A command to show various statistics related to compilation time, including saved time. It will pick up specific speedups per project from the most recent hydraBenchmark run (if none, it defaults to an estimated 2x speedup).
hydraActivateLicense and hydraDeactivateLicense: Manage your Hydra license. See License for more details.
hydraCheckForUpdates: Check if there is a new version of Hydra. Usually performed on startup.
hydraStartLocalDashboard: Start a local Dashboard server. Requires Docker.
Degree of parallelism¶
The degree of parallelism can be controlled via the key hydraWorkers
. By default hydraWorkers
is equal to the number of physical cores on your system (half the number of cores reported by Java, to account for hyper-threading). You can change this like any other sbt setting:
set every hydraWorkers := 8
Warning
Note that all Hydra settings are project-level settings, so assigning a new value using in Global
won't work. If you want to set a new value for all projects, use set every
in the sbt shell.
Depending on your hardware architecture you may obtain faster compile time by assigning a different value to it. However, you should never assign to it a value smaller than 2 or bigger than the number of available CPUs.
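Because hydraWorkers is a project-level key, it can also be given a different value per project in the build definition. The following is only a minimal sketch: the project names `core` and `docs` and the worker counts are hypothetical, and the right numbers depend on your hardware.

```scala
// build.sbt (sketch; `core` and `docs` are hypothetical subprojects)
lazy val core = (project in file("core"))
  .settings(
    hydraWorkers := 4 // larger module: more workers
  )

lazy val docs = (project in file("docs"))
  .settings(
    hydraWorkers := 2 // smaller module: fewer workers
  )
```

Projects that do not set the key keep the default value.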
Sources partitioning¶
hydraSourcePartitioner
controls how sources are partitioned and assigned to workers. Four strategies are available:
"auto": Automatically balances workers based on compilation times of individual sources. This is the default strategy.
"explicit": Partition sources according to an explicit partition file.
"package": Partition sources respecting package boundaries. This may not balance perfectly between workers, but it may lead to less "cross-talk" between workers.
"plain": Tries hard to assign an equal number of sources to each worker. This works well when each of your sources takes similar time to compile.
The default "auto"
partition strategy will usually deliver optimal results. Read the Tuning section for a more in-depth discussion about partition strategies.
Partition file¶
If you are using the "explicit"
partition strategy, you can use hydraPartitionFile
to tell Hydra where to read the partition file from. This setting is scoped per project and configuration, so you can use a different file for each sub-project and each configuration. For more details, please read the Tuning section.
hydraPartitionFile
must be scoped to a configuration. If you just use hydraPartitionFile := <path>
the setting is ignored. Make sure to always add hydraPartitionFile in Compile := <path>
or hydraPartitionFile in Test := <path>
when modifying it.
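As an illustration of the scoping rule above, here is a sketch of a project that points each configuration at its own partition file. It assumes that hydraPartitionFile takes a File value, which this page does not confirm; the project name and the file names are hypothetical. The "explicit" strategy still has to be selected separately through hydraSourcePartitioner.

```scala
// build.sbt (sketch; assumes hydraPartitionFile is File-valued,
// `core` and the partition file names are hypothetical)
lazy val core = (project in file("core"))
  .settings(
    // scoped to Compile: an unscoped assignment would be ignored
    hydraPartitionFile in Compile := baseDirectory.value / "hydra-compile.partition",
    // scoped to Test: a separate file for test sources
    hydraPartitionFile in Test := baseDirectory.value / "hydra-test.partition"
  )
```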
Invalidate caches¶
Hydra "learns" about your project at each compilation event and uses this information to compile faster in the future. For example, Hydra measures how long each file takes to compile, and uses this information to automatically balance the workload of each worker. In case you need to start "fresh", you can run this task to remove all Hydra data.
> hydraInvalidateCaches
[info] Deleting /Users/dragos/sandbox/unused/.hydra/sbt/core/compile
[info] Deleting /Users/dragos/sandbox/unused/.hydra/sbt/frontend/compile
This task only removes Hydra-specific data. No classfiles are removed, so running compile
right after would not cause a full build.
Warning
This task is scoped to the project and configuration. If you want to remove the caches for tests you'd need to run test:hydraInvalidateCaches
.
The .hydra directory¶
You'll notice that Hydra creates a .hydra subdirectory in your project root. This directory contains information about each compiled project, metrics files and hydra.log
. You can control where this directory is placed by setting hydraBaseDirectory in ThisBuild
.
Note
You should persist this directory between CI builds in order to get the best performance.
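For a CI setup that caches a dedicated folder, the directory could be relocated along these lines. This is a sketch under the assumption that hydraBaseDirectory is a File-valued setting; the `.cache/hydra` location is a hypothetical choice.

```scala
// build.sbt (sketch; assumes hydraBaseDirectory is File-valued,
// ".cache/hydra" is a hypothetical location persisted by your CI)
hydraBaseDirectory in ThisBuild := (baseDirectory in ThisBuild).value / ".cache" / "hydra"
```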
Metrics service¶
The Metrics service pushes compilation metrics to the dashboard after each successful full compile cycle. The Metrics service is automatically started when entering the interactive shell (during onLoad
) and it's run as an external process. In particular, the metrics service process will survive even if you quit the sbt interactive shell.
The hydraMetricsServiceStart
task allows you to explicitly start the metrics service, but you will rarely need this unless the Metrics service was manually stopped.
Timings file (deprecated)¶
This feature is deprecated and you should use the dashboard for analyzing compilation time.
Hydra can append a line in a CSV file each time it builds, making it easier to see how much time is spent actually compiling over a period of time. The file will look like the following:
Time, Tag, Workers, Files, Duration (ms)
2017/05/12 11:32:01, specs2-core/compile, 4, 99, 11810
2017/05/12 11:32:13, specs2-core/test, 4, 103, 12712
The format should be directly importable in any spreadsheet software.
By default, Hydra writes all measurements to .hydra/<build-tool>/timings.csv
in the base directory of your build, regardless of what sub-project it compiles. If you want to change the file to a different name but still use the same file for all projects in your build you could do something like the following:
// global setting, not a project setting
hydraTimingsFile := Some((baseDirectory in ThisBuild).value / "measurements.csv")
You can take advantage of sbt scoping rules to set up different CSV files per project. For example, you could set hydraTimingsFile
at the project level:
lazy val myProject = (project in file("."))
  .settings(
    hydraTimingsFile := Some(baseDirectory.value / "measurements.csv")
  )
Note that
baseDirectory
is used without a scope so it will pick up the project-level value.
To disable this feature set the value to None
:
hydraTimingsFile := None
Hydra Scala version¶
The key hydraScalaVersion
controls the Hydra Scala version to use. By default, its value is automatically determined from the version of sbt-hydra
you are using. Our recommendation is to not touch this key, but rather upgrade the version of sbt-hydra
to use the latest and greatest Hydra Scala.
Disabling Hydra¶
The hydraIsEnabled
key allows you to disable Hydra. As you might know, sbt already provides an API for disabling a plugin, and we recommend that you use disablePlugins(HydraPlugin)
if you decide to disable Hydra on a project.
So why have an additional key for the same purpose? Because it allows us to disable Hydra on projects that use a major version of Scala we don't support. For instance, if you have a multi-project build and some of your subprojects use Scala 2.10, the Hydra sbt plugin will compile all these Scala 2.10 subprojects using the vanilla Scala compiler, without you having to explicitly disable Hydra on these subprojects.
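As a short sketch of the recommended approach, a single project can opt out of Hydra with sbt's standard disablePlugins method. The project name `legacy` is hypothetical.

```scala
// build.sbt (sketch; `legacy` is a hypothetical subproject)
lazy val legacy = (project in file("legacy"))
  .disablePlugins(HydraPlugin) // compiled with the vanilla Scala compiler
```

Other projects in the build are unaffected and continue to be compiled with Hydra.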
Disabling Hydra statistics¶
By default, Hydra will greet you with a message showing some statistics about compilation:
Hydra wishes you a productive day!
Your average compilation time is 17s, you build on average 39.50 times per day and the average number of files per build is 30.82.
All time: Saved: 45 min 58s out of 1 hour 31 min 57s
Type `hydraCompilationStats` to see more statistics about your compilation habits!
You can disable the automatic startup message, as well as the messages shown when certain milestones are reached (e.g. "30 min saved today"), by setting hydraDisplayInfo
to false.
The estimated time savings are based on the most recent run of hydraBenchmark
, and each project and configuration is using the corresponding speedup number. If there wasn't any benchmark run yet, it will use an estimated speedup of 2.0x. These numbers will be recalculated using the actual speedup when hydraBenchmark
is run.
IntegrationTest + Hydra¶
The sbt IntegrationTest
configuration is used to define a project containing integration tests. To set up Hydra on your integration test project simply append inConfig(IntegrationTest)(HydraPlugin.hydraConfigSettings)
to the project settings:
lazy val myIntegrationTestProject = (project in file("."))
  .configs(IntegrationTest)
  .settings(
    Defaults.itSettings,
    // other settings here
  )
  .settings(inConfig(IntegrationTest)(HydraPlugin.hydraConfigSettings)) // always the last set of settings
To access HydraPlugin in a .scala build file you will need to import com.triplequote.sbt.hydra.HydraPlugin.
Logging¶
Hydra outputs a log file hydra.log
inside the .hydra/<build-tool>
folder located in the project's root directory. By default, the log level is set to INFO, but you can change it via the hydra.logLevel JVM system property. The following example sets the log level to DEBUG:
$ sbt -Dhydra.logLevel=DEBUG
You can also change the log filename via hydra.logFile
:
$ sbt -Dhydra.logLevel=DEBUG -Dhydra.logFile=myfile.log
Note that if you try to put the log file under target/ you will lose it after a clean, and there will be no logging from Hydra until you restart sbt.
Concurrent Restrictions¶
sbt allows you to restrict task concurrency via the concurrentRestrictions
setting (read here the related sbt documentation). By default, the sbt-hydra plugin adds Tags.limit(HydraTag, EvaluateTask.SystemProcessors)
to the default concurrent restrictions provided by sbt. This is done so that the number of modules compiled in parallel with Hydra is capped. The goals are to maximize locality and to prevent high memory consumption that can lead to long GC pauses, which in our experience usually delivers the best compile times. If in your project you have modified the value assigned to concurrentRestrictions
, make sure that it still contains an entry to limit compilation with Hydra. To check this, just type show concurrentRestrictions
in the sbt shell:
$ sbt
...
> show concurrentRestrictions
[info] * Limit all to 8
[info] * Limit forked-test-group to 1
[info] * Limit hydra to 8
Notice that the values in the output depend on the number of both physical and logical CPUs of your machine.
If "Limit hydra to 8" is part of the output, you are good and there is no need for you to read further. Otherwise, it's possible that you are overriding the value set for concurrentRestrictions
in your build. You can check if this is the case by grepping for concurrentRestrictions :=
in your project's build files.
1) If you find a hit, and your intention was to add a custom restriction to the default concurrentRestrictions
(but without overwriting the defaults), replace concurrentRestrictions :=
with concurrentRestrictions ++=
. reload
your build and concurrentRestrictions
should now include the expected limit for Hydra.
2) If you find a hit, and your intention was indeed to overwrite the default concurrentRestrictions
, then add Tags.limit(HydraTag, 4)
to the specified restrictions. For instance:
concurrentRestrictions in Global := Seq(
  ... // your custom restrictions
  Tags.limit(HydraTag, 4)
)
3) If you don't find a hit, then it's possible that one of the sbt plugins you are using is overwriting concurrentRestrictions
. In this case you will need to overwrite concurrentRestrictions
yourself, and explicitly provide the restriction for the HydraTag
tag. Here is how you can restore the default sbt concurrentRestrictions
and at the same time limit compilation with Hydra
:
concurrentRestrictions in Global := Tags.limit(HydraTag, EvaluateTask.SystemProcessors) +: Defaults.defaultRestrictions.value
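For case 1 above, the fix might look like the following sketch. The custom rule shown (limiting tasks tagged with sbt's built-in Tags.Test to one at a time) is a hypothetical stand-in for whatever restriction your build adds.

```scala
// build.sbt (sketch; the Tags.Test rule is a hypothetical custom restriction)
// `++=` appends to the existing restrictions instead of replacing them,
// so the limit that sbt-hydra adds for HydraTag is preserved.
concurrentRestrictions in Global ++= Seq(
  Tags.limit(Tags.Test, 1)
)
```

After a reload, show concurrentRestrictions should list both your custom rule and the Hydra limit.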
Parallel execution¶
If task parallelExecution
is enabled (which is the default in sbt) and in your build you have many subprojects that can be compiled independently, finding the optimal limit for the HydraTag
tag may require some experimentation. As a rule of thumb, we recommend that it never exceeds the number of cores available on your machine (typically 12 on modern laptops). For instance, if you'd like to force your projects to be compiled sequentially (as this might improve memory locality), add the following setting to your build:
concurrentRestrictions in Global := Tags.limit(HydraTag, hydraDefaultCpus) +: Defaults.defaultRestrictions.value
If you have many sub-projects that can be compiled in parallel you may find an optimal result if you combine the two approaches. For instance, you may decide to use 4 workers for projects that are at the bottom of the dependency tree and two workers for the leaves, while at the same time restricting overall parallelism to 8. This will allow up to 4 leaf projects to be compiled in parallel by sbt, while each one in turn is parallelized by Hydra on two cores.
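The combined setup described above could be sketched as follows. The project names and layout are hypothetical; the worker counts and the overall limit of 8 are the ones from the example in the text.

```scala
// build.sbt (sketch; `base`, `leafA` and `leafB` are hypothetical projects)

// Bottom of the dependency tree: compiled first, so it gets 4 workers.
lazy val base = (project in file("base"))
  .settings(hydraWorkers := 4)

// Leaves: independent of each other, 2 workers each.
lazy val leafA = (project in file("leaf-a"))
  .dependsOn(base)
  .settings(hydraWorkers := 2)

lazy val leafB = (project in file("leaf-b"))
  .dependsOn(base)
  .settings(hydraWorkers := 2)

// Cap overall Hydra parallelism at 8 while keeping sbt's default restrictions.
concurrentRestrictions in Global :=
  Tags.limit(HydraTag, 8) +: Defaults.defaultRestrictions.value
```

With a limit of 8 and 2 workers per leaf, up to 4 leaf projects can compile at the same time, which matches the arithmetic in the paragraph above.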