A Gcc Compiler Server

Per Bothner

Apple Computer

`<per@bothner.com>`
`<pbothner@apple.com>`

May 2003

Classic compiler structure

Overall structure of the gcc program is the same as original K&R C Compiler:
- A user-mode program (gcc/cc) processes arguments, and decides which other programs to run.
- The compiler proper (cc1/...) is invoked once for each source file.
- The result of cc1 is an assembly file, which is assembled using the as program.
This tried-and-true approach is running into problems.

Slow compilation

The classic approach leads to lots of extra work:
- Forking a new cc1 (and as) for each source file.
- Initializing cc1 internal state (such as predefined declarations).
- Processing external modules (header files) is re-done each time.
The last of these is the most significant, as it leads to O(N²) behavior.

Re-reading header files

C++ inline functions and templates are typically in headers.
Compilation time is often dominated by header files.
- Assume a top-level file on average includes N headers.
- Then compiling M files has to process M*N headers.
This motivates pre-compiled header (PCH) files.
A server can give us comparable benefits, if we can re-use header files.
This might be easier and more flexible than PCH.

Inter-module optimization

The classic approach hurts run-time as well as compile-time.
Compiler has available only information in the current file, plus included headers.
Compiler cannot make use of information in other modules until they get linked together.
Specifically, it cannot inline across modules.
This is why critical information (inline functions and templates) migrates into headers.
This talk focuses on compile speed, but be aware that a compiler server also enables important optimizations.

This is work-in-progress

This requires substantial changes to gcc.
It almost-works for C.
Some progress on support for C++.
Most of what I say about cc1 also applies to other Gcc compilers.
Focus on languages that use cpplib.
Similar issues arise in any language that imports other modules.

Multi-input mode

Read multiple top-level source files, generate single assembler file.
Equivalent to multiple compiles plus linking resulting object files.
Speeds up compilation, enables inter-module optimizations.
Needs to re-initialize front-end for each source file, without re-initializing back-end.
Both gcc and cc1 changes.
Mostly works, but some issues remain, such as renaming statics.
gcj has supported this mode for a while in an ad hoc manner.

Server mode

cc1 waits for compilation requests.
Listens to Unix domain socket bound to ./.cc1-server.
Each request names one or more source files to compile, and an output assembler file.
Works serially: When done with one request, waits for next one.
Speeds up compile-edit-debug cycle.
Speeds up batch compiling of entire directories/projects.
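
As an illustration of the listening side, here is a minimal sketch of a serial server on a Unix domain socket. The function names, the removal of a stale socket file, and the request handler are assumptions made for the example; the talk does not describe the actual cc1 code or its request format.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Create a listening Unix domain socket bound to `path`
   (for example "./.cc1-server"). Returns the descriptor, or -1 on error.
   Hypothetical helper, not taken from the gcc sources. */
int open_server_socket(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;                    /* path does not fit in sun_path */
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);
    unlink(path);                     /* assumption: discard a stale socket file */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0
        || listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Serial loop: finish one request before accepting the next.
   `handle_request` is a hypothetical callback that would read the
   source and output file names and run the compilation. */
void serve(int listen_fd, void (*handle_request)(int conn_fd))
{
    for (;;) {
        int conn = accept(listen_fd, NULL, NULL);
        if (conn < 0)
            break;
        handle_request(conn);
        close(conn);
    }
}
```

Because `serve` handles one connection at a time, no compiler state needs locking, which matches the serial behavior described above.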

Initialization

Before "real work", compiler has to initialize builtins and data structures.
We now have 3 levels of initialization:
- Real one-time initialization.
- Initializing rtl and assembly generation, for each output file.
- Initializing pre-defined macros and identifiers, for each top-level input file.
Existing code tangles these together with needless interdependencies.
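
The three levels can be pictured as nested loops. The sketch below uses hypothetical function names and counters (none are from gcc) only to show how often each level would run in a server that handles several requests, each with several input files.

```c
/* Counters so the call pattern can be observed; purely illustrative. */
int once_inits, output_inits, input_inits;

void init_once(void)   { once_inits++; }    /* real one-time initialization */
void init_output(void) { output_inits++; }  /* rtl and assembly generation */
void init_input(void)  { input_inits++; }   /* pre-defined macros and identifiers */

/* One request: several top-level inputs compiled into one output file. */
void compile_request(int n_inputs)
{
    init_output();
    for (int i = 0; i < n_inputs; i++) {
        init_input();
        /* ... parse and compile input i ... */
    }
}

/* Serve n_requests requests, each naming n_inputs source files. */
void run_server_sketch(int n_requests, int n_inputs)
{
    init_once();
    for (int r = 0; r < n_requests; r++)
        compile_request(n_inputs);
}
```

With 2 requests of 3 files each, the one-time level runs once, the output level twice, and the input level six times.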

Re-using text, tokens, or trees?

When processing a header file, we want to remember what we read so it's faster the next time.
We have a choice between:
- Saving the text in the buffer. Simple, low-overhead, but doesn't buy much.
- Saving the tokens in the buffer, either before or after preprocessing. Requires new memory intensive data structure.
- Saving the semantic data resulting from the header files - i.e. trees.
The last option gives us the biggest potential pay-off, so that is what we do.

Dependencies and invalidation vs re-use

A header file provides (exports) various declarations, including macros, types, external declarations, and inline functions.
Goal: When including a file that has been processed before, just re-use declaration nodes from last time.
Then we can just skip the header.
Complication: A definition may depend on other definitions.
Must check that these have not changed.

Inconsistent header file use

inc.h:
```
struct device { dev_t index; };
```

a.c:
```
#define dev_t int
#include "inc.h"
```

b.c:
```
typedef short dev_t;
#include "inc.h"
```

Extended one-definition rule

Such inconsistencies are rare, but may happen.
C++'s one-definition rule requires that if two compilation units see definitions of the "same" name, they must be token-by-token equivalent.
"Extended one-definition rule":
In a "well-behaved program" -
- a shared definition is defined in a single location in a header file;
- the "meaning" of that definition does not change between compilation units.
We optimize for this assumption, but must tolerate violations.
Bonus: Can warn about violations, which are probably unintended.

Fragments

Checking if we can re-use a header file is complicated by conditional compilation and nested includes.
Re-use and dependency checking is simplified by using smaller units.
Unit of re-use is a fragment between cpp directives.
Makes cpplib changes modest.

cpplib fragment handling

Use cpplib's (existing but disabled) include file cache.
Also remember chain of fragments.
cpplib logic and directive handling mostly unchanged.
After a directive or file start, cpplib calls the enter_fragment call-back.
Before the next directive or file end, cpplib calls the exit_fragment call-back.
If the fragment is seen for the first time, the call-back tells the front-end to remember its declarations.
If the fragment was previously read, the call-back checks its dependencies for validity. On success, the remembered declarations are "pushed" into the top binding_level.
If enter_fragment succeeds, cpplib skips forward to end of fragment.
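
The call-back protocol can be sketched as a small state machine. The struct, the enum, and the `_sketch` function signatures below are hypothetical; only the names enter_fragment and exit_fragment come from the talk, and the dependency check is reduced to a flag.

```c
#include <stddef.h>

/* Hypothetical per-fragment record kept alongside the include file cache. */
struct fragment {
    int seen;               /* non-zero once the fragment has been parsed */
    int deps_valid;         /* stand-in for the front-end's dependency check */
    struct fragment *next;  /* chain of fragments within a file */
};

enum fragment_action {
    FRAGMENT_PARSE,   /* parse normally; the front-end records declarations */
    FRAGMENT_SKIP     /* declarations restored; skip to the end of the fragment */
};

/* Called after a directive or at file start. */
enum fragment_action enter_fragment_sketch(struct fragment *f)
{
    if (f->seen && f->deps_valid)
        return FRAGMENT_SKIP;   /* a front-end would push remembered declarations here */
    return FRAGMENT_PARSE;      /* first visit, or dependencies changed */
}

/* Called before the next directive or at file end, after a normal parse. */
void exit_fragment_sketch(struct fragment *f)
{
    f->seen = 1;        /* declarations parsed since enter are now remembered */
    f->deps_valid = 1;  /* dependencies were recorded against current bindings */
}
```

A fragment is parsed on its first visit, skipped on later visits while its dependencies hold, and parsed again once they no longer do.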

Remembering and restoring declarations

Each front-end is responsible for:
- remembering top-level declarations;
- remembering / checking dependencies;
- restoring declarations if re-use is ok.
Hooks/conventions to make this relatively easy.
Complication: Old tree nodes get modified with new information. E.g. C++ function overloading.
Likely solution: Remember modifications in undo buffers. Before re-starting compilation, we must undo modifications to old trees.
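
One way an undo buffer could work is sketched below, assuming modifications are pointer-sized stores. All names and the fixed buffer size are hypothetical, and the talk presents undo buffers only as a likely solution, so this is not the gcc implementation.

```c
#include <stddef.h>

#define UNDO_MAX 64

/* One saved word: where it lives and what it held before modification. */
struct undo_entry {
    void **slot;
    void *old_value;
};

struct undo_buffer {
    struct undo_entry entries[UNDO_MAX];
    size_t count;
};

/* Overwrite *slot with new_value, remembering the old value.
   Returns 0 on success, -1 if the buffer is full (nothing is modified). */
int undo_set(struct undo_buffer *u, void **slot, void *new_value)
{
    if (u->count == UNDO_MAX)
        return -1;
    u->entries[u->count].slot = slot;
    u->entries[u->count].old_value = *slot;
    u->count++;
    *slot = new_value;
    return 0;
}

/* Restore every recorded slot, newest first, and empty the buffer. */
void undo_all(struct undo_buffer *u)
{
    while (u->count > 0) {
        u->count--;
        *u->entries[u->count].slot = u->entries[u->count].old_value;
    }
}
```

Restoring in reverse order matters when the same slot is modified more than once during a compilation: the oldest saved value is written last.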

Fragment/nesting overlap

Each declaration needs to belong to a single fragment.
Problems if a declaration (such as a struct) is spread out over multiple fragments.
If one of the fragments gets re-used, but for some reason another fragment gets invalidated, then parser may get fed nonsense.

Non-nesting example

```
//  fragment 1
extern void F1(T1);

struct St {
  int value;
#ifdef __cplusplus
//  fragment 2
  int getValue() { return value; }
#endif
//  fragment 3
};

extern void F2(T2);
```

Handling non-nesting

Easy: a nesting counter that is non-zero when inside a declaration.
Pre-invalidate fragments if nesting>0 on fragment enter or exit.
Better: combine fragments. If nesting>0 on fragment boundary, combine neighboring fragments into one.
Difficulty: must test conditionals before first combined fragment.
Disallowed if the conditional would be moved across a #define, #undef, or #include.
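
The combining rule can be sketched as a single pass over the fragment boundaries. The function and its array representation are hypothetical, and the sketch ignores the difficulty of hoisting conditional tests described above.

```c
/* depth_at_boundary[i] is the declaration nesting depth at the boundary
   between fragment i and fragment i+1 (n_fragments - 1 entries).
   On return, group[i] is the index of the combined fragment that
   fragment i belongs to. Returns the number of combined fragments. */
int combine_fragments(const int *depth_at_boundary, int n_fragments, int *group)
{
    int n_groups = 0;
    for (int i = 0; i < n_fragments; i++) {
        if (i == 0 || depth_at_boundary[i - 1] == 0)
            n_groups++;           /* boundary at top level: start a new unit */
        group[i] = n_groups - 1;  /* otherwise join the previous fragment's unit */
    }
    return n_groups;
}
```

For a struct split across three fragments by an #ifdef, both boundaries fall inside the struct, so the depths are {1, 1} and all three fragments become one unit of re-use.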

Other complications

Dependencies are of the form: Fragment F was compiled with identifier I bound to D.
We also have negative dependencies: Fragment F was compiled with I undefined.
Need an efficient way to record these negative dependencies.
See paper for discussion of this and other complications.
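
Both kinds of dependency can share one record if an absent binding is encoded as NULL. The sketch below uses hypothetical types and a linear-search symbol table, so it shows the check itself and not the efficient recording scheme the talk calls for.

```c
#include <stddef.h>
#include <string.h>

/* "Fragment F was compiled with identifier I bound to D."
   A NULL decl encodes a negative dependency: I was undefined. */
struct dependency {
    const char *ident;
    const void *decl;
};

/* A toy symbol table: parallel arrays of names and current bindings. */
struct bindings {
    const char **names;
    const void **decls;
    size_t count;
};

/* Return the current binding of ident, or NULL if it is not bound. */
static const void *lookup_binding(const struct bindings *b, const char *ident)
{
    for (size_t i = 0; i < b->count; i++)
        if (strcmp(b->names[i], ident) == 0)
            return b->decls[i];
    return NULL;
}

/* Returns 1 if every recorded dependency still holds, else 0. */
int fragment_deps_hold(const struct dependency *deps, size_t n,
                       const struct bindings *now)
{
    for (size_t i = 0; i < n; i++)
        if (lookup_binding(now, deps[i].ident) != deps[i].decl)
            return 0;
    return 1;
}
```

A fragment that used dev_t while it was bound to one declaration fails the check in a compilation where dev_t is bound to a different declaration or is not bound at all.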

Preliminary results

Multiple trivial identical C programs. Each just includes Carbon.h, which includes many of Apple's GUI headers.
After the initial compile, subsequent files compile 3 times as fast as without the server.
Compiling a mix of medium-sized gcc .c files is over 30% faster using client+server.
Compiling 9 Tcl .c files yields similar speed-up.

Overhead of using server seems in the noise.
Actual numbers will depend on fraction of header-code re-use.
Numbers may get worse if we add more complete dependency checking.

Conclusion and status

Looks promising, but not ready for real use.
Closest to useful for C, but some work done for C++.
Little checked in so far, but hopefully more soon.
Full patch (relative to 3.4 mainline) available on request.
