Scala compilation metrics
In addition to compiling faster, Hydra collects metrics about where compilation time is spent. There are two kinds of metrics that we collect:
- Project-wide metrics, on every compilation
- Per-file compilation metrics, on full builds only
Project-wide: the timings file
This file contains build times for all submodules in the current project, along with a few more data points. For sbt builds it is located at .hydra/sbt/timings.csv. Other integrations use the same layout, with sbt replaced by the tool name.
Time | Tag | Workers | Files | Duration | GC Time |
---|---|---|---|---|---|
2018/05/01 11:39:52 | core/compile | 2 | 2 | 1250 | 200 |
Here's a breakdown of metrics of interest:
Metric | Description |
---|---|
Tag | Project and configuration |
Workers | The number of workers used during the build |
Files | The total number of files that were compiled |
Duration | Total compilation time in ms |
GC Time | Time spent doing garbage collection while compiling (since 0.9.9) |
Note that GC Time is a JVM-wide number and can't always be attributed directly to the Scala compiler. However, a large value is a good indication that there is a lot of memory pressure during compilation.
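To judge memory pressure, it can help to express GC Time as a share of Duration for each tag. The following is a minimal sketch, not part of Hydra: it assumes the file is comma-separated with a header row whose names match the table above, and the function names `read_timings` and `gc_share` are made up for illustration. Check your own timings.csv before relying on these assumptions.

```python
import csv
import io

# Sample in the column order shown in the table above.
# ASSUMPTION: comma delimiter and a header row with these exact names.
SAMPLE = """Time,Tag,Workers,Files,Duration,GC Time
2018/05/01 11:39:52,core/compile,2,2,1250,200
"""


def read_timings(text):
    """Parse timings CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text), skipinitialspace=True))


def gc_share(rows):
    """Return {tag: GC time / compilation time}, summed over all rows per tag."""
    totals = {}
    for row in rows:
        duration, gc = totals.get(row["Tag"], (0, 0))
        totals[row["Tag"]] = (
            duration + int(row["Duration"]),
            gc + int(row["GC Time"]),
        )
    # Skip tags with no recorded compilation time to avoid dividing by zero.
    return {tag: gc / dur for tag, (dur, gc) in totals.items() if dur > 0}


shares = gc_share(read_timings(SAMPLE))
for tag, share in shares.items():
    print(f"{tag}: {share:.0%} of compilation time spent in GC")
```

To run this against a real build, pass the contents of the timings file to `read_timings` instead of `SAMPLE`. Because GC Time is JVM-wide, treat the result as an upper bound on the compiler's own GC cost.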
Per-file: unit timings file
Our metrics are low-overhead (around 2%) and are enabled for full builds only. Metrics are saved in one file per project and configuration, for example:
.hydra/sbt/core/compile/unit-timings.csv
.hydra/sbt/core/test/unit-timings.csv
This file contains detailed compilation metrics for each file in a full build. These metrics are particularly useful because they indicate where most of the time is spent and hint at possible issues.
Worker ID | File | Total Time | LateFile? | Spans | Parser nodes | Typer nodes | LoC | LoC/s |
---|---|---|---|---|---|---|---|---|
1 | core/../SparkContext.scala | 1421 | false | 1123 | 8012 | 10172 | 1506 | 1060 |
0 | core/../JsonProtocol.scala | 1400 | false | 906 | 5748 | 14903 | 878 | 627 |
Here's a breakdown of each metric of interest:
Metric | Description |
---|---|
Worker ID | The worker on which this file was compiled |
File | The file name |
Total Time | The time it took to compile this file in ms |
LateFile? | Not used, always false |
Spans | The number of timing spans |
Parser nodes | The number of AST nodes after parsing (since 0.9.12) |
Typer nodes | The number of AST nodes after type-checking (since 0.9.12) |
LoC | Lines of code in this file, excluding comments (since 0.9.12) |
LoC/s | Compilation speed in number of compiled lines of code per second (since 0.9.12) |
A large number of typer nodes compared to parser nodes indicates that a lot of macro expansion is happening. This impacts compilation times in two ways: type-checking takes longer, and subsequent phases have a lot more code to generate.
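One way to apply this is to compute the ratio of typer nodes to parser nodes per file and sort by it. The sketch below uses the two rows from the example table. The cut-off of 2.0 is a hypothetical value chosen for illustration, not a Hydra recommendation, and the function names are made up.

```python
# (file, parser nodes, typer nodes), copied from the example table above.
ROWS = [
    ("core/../SparkContext.scala", 8012, 10172),
    ("core/../JsonProtocol.scala", 5748, 14903),
]

# HYPOTHETICAL cut-off; tune it for your own code base.
THRESHOLD = 2.0


def expansion_ratio(parser_nodes, typer_nodes):
    """How many times larger the tree is after type-checking than after parsing."""
    return typer_nodes / parser_nodes


def macro_heavy(rows, threshold=THRESHOLD):
    """Return (file, ratio) pairs at or above the threshold, largest ratio first."""
    ratios = [
        (name, expansion_ratio(parser, typer))
        for name, parser, typer in rows
        if parser > 0  # skip files that report no parser nodes
    ]
    flagged = [(name, ratio) for name, ratio in ratios if ratio >= threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


for name, ratio in macro_heavy(ROWS):
    print(f"{name}: typer/parser ratio {ratio:.2f}")
```

With the sample rows, JsonProtocol.scala has a ratio of about 2.59 and is flagged, while SparkContext.scala stays at about 1.27. Some growth is normal even without macros, so the ratio is most useful for comparing files within the same project.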
Lines of code per second is a single-threaded metric. Since each file is assigned to one worker, its compilation happens on a single thread, so LoC/s shows how fast a single worker can compile one file. Typical values range between 500 LoC/s (for macro-heavy or type-intensive code) and 2000 LoC/s, depending on project code style and Scala version.
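The LoC/s column is consistent with dividing LoC by Total Time converted to seconds. The sketch below reproduces that arithmetic for the two sample rows and compares the result with the typical range quoted above. The helper names and the wording of the labels are illustrative only.

```python
# (file, LoC, Total Time in ms), copied from the example table above.
ROWS = [
    ("core/../SparkContext.scala", 1506, 1421),
    ("core/../JsonProtocol.scala", 878, 1400),
]


def loc_per_second(loc, total_time_ms):
    """Lines of code compiled per second, from LoC and Total Time in ms."""
    return loc * 1000 / total_time_ms


def classify(speed, low=500, high=2000):
    """Place a speed relative to the typical 500-2000 LoC/s range."""
    if speed < low:
        return "below typical range"
    if speed > high:
        return "above typical range"
    return "within typical range"


for name, loc, ms in ROWS:
    speed = loc_per_second(loc, ms)
    print(f"{name}: {round(speed)} LoC/s ({classify(speed)})")
```

For the sample rows this gives 1506 × 1000 / 1421 ≈ 1060 and 878 × 1000 / 1400 ≈ 627, matching the LoC/s column in the table.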