Greatly speed up doctests by compiling compatible doctests in one file #126245

GuillaumeGomez · 2024-06-10T21:55:26Z

Take 2 at #123974. It should now be much simpler to review since it's based on #125798.

I split the changes as much as possible to make it as easy as possible to review (even though it's a huge lot of code changes...).

The following tests are not included into the combined doctests:

compile_fail
If there are crate attributes (deny/allow/warn are ok)
have invalid AST
test_harness
no capture
--show-output (because the output is nicer without the extra code surrounding it)

Everything else is merged into one file. If this one file fails to compile, we go back to the current strategy: compile each doctest separately.

Because of the edition codeblock attribute, I couldn't make them all into one file directly so I grouped them by edition, but it should be pretty anecdotic.

In case the users want a specific doctest to be opted-out from this doctest merge, I added the standalone codeblock attribute:

/// ```rust,standalone
/// // This doctest will be run in its own process!
/// ```

Now the interesting part, I ran it on a few crates and here are the results (with cargo test --doc to only include doctests):

crate	nb doctests	before this PR	with this PR
sysinfo	227	4.6s	1.11s
geos	157	3.95s	0.45s
core	4604	54.08s	13.5s (merged: 0.9s, standalone: 12.6s)
std	1147	12s	3.56s (merged: 2.1s, standalone: 1.46s)

r? @camelid

try-job: x86_64-msvc
try-job: aarch64-apple

src/doc/rustdoc/src/write-documentation/documentation-tests.md

camelid

First round of review! Wondering if it'd make sense to run this on crater (with the edition-handling temporarily removed so it'd run unconditionally) to check breakage?

src/librustdoc/doctest.rs

camelid · 2024-06-11T19:04:40Z

src/librustdoc/doctest.rs

+
+    run_tests(
+        test_args,
+        nocapture,


nit: Isn't nocapture now passed as part of opts?

It's actually part of RustdocOptions, good catch!

Same goes for test_args, I love when code gets simpler like that. :3

src/librustdoc/doctest/make.rs

src/librustdoc/doctest.rs

tests/rustdoc-ui/doctest/wrong-ast-2024.rs

tests/rustdoc-ui/doctest/wrong-ast.rs

Kobzol · 2024-06-11T20:09:45Z

The comment in documentation claims that the slowest part that limits doctests is the compilation time (not the runtime to actually run all the tests). Instead of merging all the doctests into a single program, so that it runs in the same process (which can cause issues mentioned in the previous PR), could we instead perform the compilation in a smarter way, to generate N binaries, but still invoking the compiler just once, to amortize the compilation costs (something like my rustc daemon experiments)? Do you have any expectation of how much time would be saved if compilation was approx. as fast as compiling a single program, but we still invoked N binaries to run the tests?

Or, as a different alternative, compile everything into a single program, but then execute it in N processes (each process would run just a single test), to avoid the issues?

camelid · 2024-06-11T20:16:42Z

Interesting. I was actually wondering if there was some way to daemonize rustc. However IIRC one of the slowest parts is linking, which would still be an issue with the approach you suggest :(

Kobzol · 2024-06-11T20:19:12Z

It would also be very non-trivial, implementation wise, I imagine. But what about the second approach? We could compile a single binary that looks something like this:

fn test_0() {}
fn test_1() {}

fn main() {
   let arg = args()[0];
   if arg == "test_0" { test_0(); }
   if arg == "test_1" { test_1(); }
}

And the rustdoc driver would then execute this binary N times (in parallel for all threads on the given machine), and run each test in a separate process.

camelid

This doesn't seem to handle the fact that different doctests may use different #![feature)]s.

camelid · 2024-06-11T20:26:45Z

This doesn't seem to handle the fact that different doctests may use different #![feature)]s.

I think this would unfortunately break the second approach too @Kobzol

Kobzol · 2024-06-11T20:29:22Z

I think this would unfortunately break the second approach too @Kobzol

Sure, there are possibly additional issues with the "add everything to a single binary" approach, but as far as we can get that working somehow, it seems to me that moving from "run all tests within a single process" to "run each test in a separate process" should be relatively straightforward, and should resolve the issues with shared state between tests.

camelid · 2024-06-11T21:10:05Z

Agreed. So sounds like the idea would be:

merge doc tests into groups based on each one's combination of edition, no_std/no_core, enabled feature gates, etc.
create unified test for each merged group where only one test is run per execution, dependent on the args provided by rustdoc to the binary
run each instance (i.e. provided args) of each merged test in parallel

This should obviate the need for doing this over an edition boundary since it shouldn't break any code. It would also automatically work with code that configures global state, without needing to know a standalone attribute. We'd also skip introducing this attribute to rustdoc since it wouldn't be necessary.

src/librustdoc/doctest.rs

Scripter17 · 2024-06-11T23:51:40Z

However IIRC one of the slowest parts is linking

Would nightly's new linker (LLD, I think) be a relevant thing to look through for insight?

camelid · 2024-06-12T00:49:37Z

I'm not sure. It'd be worth testing. We should also look into turning off debuginfo for doctests since it's unused.

Kobzol · 2024-06-12T06:10:15Z

Would nightly's new linker (LLD, I think) be a relevant thing to look through for insight?

LLD should help slightly, but it's not a panacea.

I tried LLD for doctesting core, and BFD (the default Linux linker) ran for ~42s, LLD ran for 40s. It seemed that most (maybe 90%) of the time was spent running the tests though, not compiling them (assuming that compilation and running isn't overlapped).

That would also sadly suggest that the overhead from spawning the test processes (even on Linux) is the dominant factor here, which means that my proposed approach wouldn't help that much.

GuillaumeGomez · 2024-06-12T12:46:23Z

Agreed. So sounds like the idea would be:
* merge doc tests into groups based on each one's combination of edition, no_std/no_core, enabled feature gates, etc.

* create unified test for each merged group where only one test is run per execution, dependent on the args provided by rustdoc to the binary

* run each instance (i.e. provided args) of each merged test in parallel
This should obviate the need for doing this over an edition boundary since it shouldn't break any code. It would also automatically work with code that configures global state, without needing to know a standalone attribute. We'd also skip introducing this attribute to rustdoc since it wouldn't be necessary.

I agree but with some small changes: no_std tests should be on their own, however doctests with features should likely not be part of the merged doctests (we can always change that later on) and therefore no_core as well since it requires a feature to be enabled.

However I"m curious about:

* create unified test for each merged group where only one test is run per execution, dependent on the args provided by rustdoc to the binary

* run each instance (i.e. provided args) of each merged test in parallel

I generate merged doctests like rustc does with #[test], so I'm not sure it'd be worth to spawn a process for each test.

Kobzol · 2024-06-12T12:48:53Z

I generate merged doctests like rustc does with #[test], so I'm not sure it'd be worth to spawn a process for each test.

This approach would resolve the problems with sharing state between tests. But it's quite possible that it would also make the perf. gains much smaller.

GuillaumeGomez · 2024-06-12T12:55:38Z

Forking on linux has a huge overhead, so the test code itself would likely take less time than spawning it. I'm really not convinced by this approach. Let me check locally the diff.

Kobzol · 2024-06-12T13:01:25Z

If you have a recent-ish glibc and kernel, forking is relatively fast (you could spawn thousands of doctests per second, if needed). Actually I'm pretty sure that forking on Linux would be much faster than on macOS and Windows, so I would worry about these more 😆

But based on some experiments that I have posted above, it seems to me that for crates that have a lot of doc-tests (e.g. core), the primary bottleneck is not compilation, but rather execution of the doctests. If that is indeed the case, then the process spawning approach will still probably be much slower than what you had here (>2x speedup on core) :(

But maybe it's only because the rustc test machinery has a lot of costs when starting. I wonder what would happen if we just literally did this, or, as an alternative, set up everything with libtest, and then fork the process just before executing each test.

GuillaumeGomez · 2024-06-12T13:07:20Z

True, forking on linux should be pretty fast nowadays. ^^'

But maybe it's only because the rustc test machinery has a lot of costs when starting. I wonder what would happen if we just literally did this, or, as an alternative, set up everything with libtest, and then fork the process just before executing each test.

Hum... I'm definitely not convinced. Especially since you mentioned Windows and MacOS being slow to fork. I can provide a big source code for what rustdoc generates for core later once I'm done with what I'm doing and you can experiment with forking?

…n in the same process (needed for miri)

… one

…thers

…ot part of merged doctest

…new run-make API

…-2024` test

GuillaumeGomez · 2024-08-07T17:55:43Z

@bors try

…2, r=<try> Greatly speed up doctests by compiling compatible doctests in one file Fixes rust-lang#75341. Take 2 at rust-lang#123974. It should now be much simpler to review since it's based on rust-lang#125798. I split the changes as much as possible to make it as easy as possible to review (even though it's a huge lot of code changes...). The following tests are not included into the combined doctests: * `compile_fail` * If there are crate attributes (`deny`/`allow`/`warn` are ok) * have invalid AST * `test_harness` * no capture * `--show-output` (because the output is nicer without the extra code surrounding it) Everything else is merged into one file. If this one file fails to compile, we go back to the current strategy: compile each doctest separately. Because of the `edition` codeblock attribute, I couldn't make them all into one file directly so I grouped them by edition, but it should be pretty anecdotic. In case the users want a specific doctest to be opted-out from this doctest merge, I added the `standalone` codeblock attribute: ```rust /// ```rust,standalone /// // This doctest will be run in its own process! /// ``` ``` Now the interesting part, I ran it on a few crates and here are the results (with `cargo test --doc` to only include doctests): | crate | nb doctests | before this PR | with this PR | | - | - | - | - | | sysinfo | 227 | 4.6s | 1.11s | | geos | 157 | 3.95s | 0.45s | | core | 4604 | 54.08s | 13.5s (merged: 0.9s, standalone: 12.6s) | | std | 1147 | 12s | 3.56s (merged: 2.1s, standalone: 1.46s) | r? `@camelid` try-job: x86_64-msvc try-job: aarch64-apple

bors · 2024-08-07T17:56:55Z

⌛ Trying commit fcaa14e with merge 29ec5d5...

bors · 2024-08-07T19:55:13Z

☀️ Try build successful - checks-actions
Build commit: 29ec5d5 (29ec5d5b8d41ac79dd4a535fdd1dd2c4f791ca1a)

rustbot assigned camelid Jun 10, 2024

rustbot added A-run-make Area: port run-make Makefiles to rmake.rs S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Jun 10, 2024

This comment was marked as resolved.

Sign in to view

GuillaumeGomez mentioned this pull request Jun 10, 2024

Greatly speed up doctests by compiling compatible doctests in one file #123974

Closed

the8472 reviewed Jun 10, 2024

View reviewed changes

src/doc/rustdoc/src/write-documentation/documentation-tests.md Outdated Show resolved Hide resolved

camelid added A-doctests Area: Documentation tests, run by rustdoc A-edition-2024 Area: The 2024 edition needs-fcp This change is insta-stable, so needs a completed FCP to proceed. labels Jun 11, 2024

camelid requested changes Jun 11, 2024

View reviewed changes

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 11, 2024

camelid requested changes Jun 11, 2024

View reviewed changes

traviscross mentioned this pull request Jun 11, 2024

Tracking Issue for Rust 2024: Rustdoc tests in one binary #124853

Open

3 tasks

notriddle reviewed Jun 11, 2024

View reviewed changes

src/librustdoc/doctest.rs Outdated Show resolved Hide resolved

GuillaumeGomez added 25 commits August 7, 2024 19:52

Correctly handle internal_features attribute

3632fcf

If no argument is provided to merged doctests binary, they will be ru…

9448e0e

…n in the same process (needed for miri)

Only show rustdoc doctest compilation output if nocapture is used

b1aa96a

Disable merged doctests by default

b7a0d2b

Correctly handle macros using $crate in merged doctests

6b57b17

Add 2024 edition doctests to cover corner cases

7286d97

Improve code by removing unneeded function arguments

7471156

Add documentation on DocTest and RunnableDoctest structs

b4e3659

Run mergeable doctest as part of standalone doctests if there is only…

486fe80

… one

Reduce merged doctest source code size

b55e9b0

Rename DocTest into DocTestBuilder

59d3b71

If there are crate attributes, we prevent doctest to be merged with o…

e7ff92f

…thers

Improve code readability

0ed4e62

Remove need for unsafe code in merged doctests

b92ad6f

Don't special-case if there is only one merged doctest

c66731f

Unify naming of DocTest

c8280b4

Move is_multiple_tests argument into RunnableDocTest

8a065b6

Improve documentation for internal doctest API

60a7936

Add more merged doctests tests

530a646

Add doctest to ensure that doctests with crate-level attributes are n…

d57e6b5

…ot part of merged doctest

Only keep "useful" code in tests/rustdoc-ui/2024-doctests-checks.rs

7989409

Fix commands syntax in rustdoc-ui tests

10143bb

Run fmt

80d4ccc

Update tests/run-make/doctests-keep-binaries-2024/rmake.rs test to …

1a256bf

…new run-make API

Ignore cross compile check for `tests/run-make/doctests-keep-binaries…

fcaa14e

…-2024` test

GuillaumeGomez force-pushed the doctest-improvement-v2 branch from 492fc27 to fcaa14e Compare August 7, 2024 17:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Greatly speed up doctests by compiling compatible doctests in one file #126245

Greatly speed up doctests by compiling compatible doctests in one file #126245

GuillaumeGomez commented Jun 10, 2024 •

edited

Loading

This comment was marked as resolved.

camelid left a comment

camelid Jun 11, 2024

GuillaumeGomez Jun 20, 2024

GuillaumeGomez Jun 20, 2024

Kobzol commented Jun 11, 2024 •

edited

Loading

camelid commented Jun 11, 2024

Kobzol commented Jun 11, 2024

camelid left a comment

camelid commented Jun 11, 2024

Kobzol commented Jun 11, 2024

camelid commented Jun 11, 2024

Scripter17 commented Jun 11, 2024 •

edited

Loading

camelid commented Jun 12, 2024

Kobzol commented Jun 12, 2024 •

edited

Loading

GuillaumeGomez commented Jun 12, 2024

Kobzol commented Jun 12, 2024

GuillaumeGomez commented Jun 12, 2024

Kobzol commented Jun 12, 2024

GuillaumeGomez commented Jun 12, 2024

GuillaumeGomez commented Aug 7, 2024

bors commented Aug 7, 2024

bors commented Aug 7, 2024

Greatly speed up doctests by compiling compatible doctests in one file #126245

Are you sure you want to change the base?

Greatly speed up doctests by compiling compatible doctests in one file #126245

Conversation

GuillaumeGomez commented Jun 10, 2024 • edited Loading

This comment was marked as resolved.

camelid left a comment

Choose a reason for hiding this comment

camelid Jun 11, 2024

Choose a reason for hiding this comment

GuillaumeGomez Jun 20, 2024

Choose a reason for hiding this comment

GuillaumeGomez Jun 20, 2024

Choose a reason for hiding this comment

Kobzol commented Jun 11, 2024 • edited Loading

camelid commented Jun 11, 2024

Kobzol commented Jun 11, 2024

camelid left a comment

Choose a reason for hiding this comment

camelid commented Jun 11, 2024

Kobzol commented Jun 11, 2024

camelid commented Jun 11, 2024

Scripter17 commented Jun 11, 2024 • edited Loading

camelid commented Jun 12, 2024

Kobzol commented Jun 12, 2024 • edited Loading

GuillaumeGomez commented Jun 12, 2024

Kobzol commented Jun 12, 2024

GuillaumeGomez commented Jun 12, 2024

Kobzol commented Jun 12, 2024

GuillaumeGomez commented Jun 12, 2024

GuillaumeGomez commented Aug 7, 2024

bors commented Aug 7, 2024

bors commented Aug 7, 2024

GuillaumeGomez commented Jun 10, 2024 •

edited

Loading

Kobzol commented Jun 11, 2024 •

edited

Loading

Scripter17 commented Jun 11, 2024 •

edited

Loading

Kobzol commented Jun 12, 2024 •

edited

Loading