Scala compilation metrics

In addition to compiling faster, Hydra collects metrics about where compilation time is spent. There are two kinds of metrics that we collect:

Project-wide metrics, on every compilation
Per-file compilation metrics, on full builds only

Project-wide: the timings file

This file contains build times for all submodules in the current project, along with a few more data points. For sbt builds it is located at .hydra/sbt/timings.csv; other integrations use the same layout, with sbt replaced by the tool name.

Time	Tag	Workers	Files	Duration	GC Time
2018/05/01 11:39:52	core/compile	2	2	1250	200

Here's a breakdown of metrics of interest:

Metric	Description
Tag	Project and configuration
Workers	The number of workers used during the build
Files	The total number of files that were compiled
Duration	Total compilation time in ms
GC Time	Time spent doing garbage collection while compiling (since 0.9.9)

Note that GC Time is a JVM-wide number and can't always be directly attributed to the Scala compiler. However, large numbers give a good indication of whether there is a lot of memory pressure during compilation.
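To make the memory-pressure check concrete, here is a minimal sketch that computes the fraction of each build's duration spent in garbage collection. It assumes the file's header row matches the column names shown above and that fields are comma-separated; neither is confirmed here, so the delimiter is a parameter. The function name `gc_share` is hypothetical.

```python
import csv
import io


def gc_share(timings_text, delimiter=","):
    """Return (tag, gc_fraction) pairs, one per row of a timings file.

    Assumes a header row with the columns 'Tag', 'Duration' and 'GC Time',
    as in the table above. Both times are taken to be in milliseconds.
    """
    reader = csv.DictReader(io.StringIO(timings_text), delimiter=delimiter)
    shares = []
    for row in reader:
        duration_ms = int(row["Duration"])
        gc_ms = int(row["GC Time"])
        # Guard against a zero duration to avoid dividing by zero.
        fraction = gc_ms / duration_ms if duration_ms else 0.0
        shares.append((row["Tag"], fraction))
    return shares


# Sample row from the table above, written with commas (assumed delimiter).
sample = (
    "Time,Tag,Workers,Files,Duration,GC Time\n"
    "2018/05/01 11:39:52,core/compile,2,2,1250,200\n"
)
print(gc_share(sample))  # [('core/compile', 0.16)]
```

In the sample row, 200 ms of GC over a 1250 ms build is 16% of the duration. Because GC Time is JVM-wide, treat the fraction as a signal to investigate rather than an exact cost of compilation.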

Per-file: unit timings file

Our metrics are low-overhead (around 2%) and are enabled for full builds only. Metrics are saved in one file per project and configuration, for example:

.hydra/sbt/core/compile/unit-timings.csv
.hydra/sbt/core/test/unit-timings.csv

This file contains detailed compilation metrics for each file in a full build. These metrics are particularly useful because they indicate where most of the time is spent and hint at possible issues.

Worker ID	File	Total Time	LateFile?	Spans	Parser nodes	Typer nodes	LoC	LoC/s
1	core/../SparkContext.scala	1421	false	1123	8012	10172	1506	1060
0	core/../JsonProtocol.scala	1400	false	906	5748	14903	878	627

Here's a breakdown of each metric of interest:

Metric	Description
Worker ID	The worker on which this file was compiled
File	The file name
Total Time	The time it took to compile this file in ms
LateFile?	Not used, always false
Spans	The number of timing spans
Parser nodes	The number of AST nodes after parsing (since 0.9.12)
Typer nodes	The number of AST nodes after type-checking (since 0.9.12)
LoC	Lines of code in this file, excluding comments (since 0.9.12)
LoC/s	Compilation speed in number of compiled lines of code per second (since 0.9.12)

A large number of typer nodes compared to parser nodes indicates that a lot of macro expansion is happening. This impacts compilation times in two ways: type-checking takes longer, and subsequent phases have a lot more code to generate.
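One way to apply this is to compute the ratio of typer nodes to parser nodes for each file and flag the outliers. The sketch below uses the two sample rows from the table above. The function name `expansion_ratio` and the threshold of 2.0 are hypothetical, chosen only for illustration; they are not values recommended by Hydra.

```python
def expansion_ratio(parser_nodes, typer_nodes):
    """Ratio of AST nodes after type-checking to AST nodes after parsing."""
    return typer_nodes / parser_nodes


# (file, parser nodes, typer nodes), copied from the sample table above.
rows = [
    ("core/../SparkContext.scala", 8012, 10172),
    ("core/../JsonProtocol.scala", 5748, 14903),
]

# Hypothetical cut-off, for illustration only.
THRESHOLD = 2.0

flagged = [
    name
    for name, parser_nodes, typer_nodes in rows
    if expansion_ratio(parser_nodes, typer_nodes) > THRESHOLD
]

for name, parser_nodes, typer_nodes in rows:
    print(f"{name}: {expansion_ratio(parser_nodes, typer_nodes):.2f}")
print("flagged:", flagged)
```

With these rows, SparkContext.scala has a ratio of about 1.27 and JsonProtocol.scala about 2.59, so only the latter is flagged. A sensible threshold depends on the codebase, so comparing files within the same project is likely more informative than any fixed number.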

Lines of code per second is a single-threaded metric: since each file is assigned to one worker, its compilation happens on a single thread. The value therefore shows how fast a single worker can compile one file. Typical values range between 500 LoC/s (for heavy macro or type-intensive code) and 2000 LoC/s, depending on project code style and Scala version.
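The sample rows above are consistent with LoC/s being LoC divided by Total Time converted to seconds. That relationship is inferred from the sample data rather than stated explicitly, so the following is a sketch under that assumption; `loc_per_second` is a hypothetical helper name.

```python
def loc_per_second(loc, total_time_ms):
    """Lines of code compiled per second, given a total time in milliseconds."""
    return loc / (total_time_ms / 1000)


# Values taken from the sample unit timings table above.
spark_context = round(loc_per_second(1506, 1421))
json_protocol = round(loc_per_second(878, 1400))

print(spark_context)  # 1060
print(json_protocol)  # 627
```

Both results match the LoC/s column in the sample table. JsonProtocol.scala, at 627 LoC/s, sits near the lower end of the typical range, which fits its comparatively high number of typer nodes.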