Back in the 90s, I implemented precompiled headers for my C++ compiler (Symantec C++). They were very much like modules. There were two modes of operation:
1. all the .h files were compiled, and emitted as a binary that could be rolled in all at once
2. each .h file created its own precompiled header. Sounds like modules, right?
Anyhow, I learned a lot, mostly that without semantic improvements to C++, precompiled headers made compilation much faster but were too sensitive to breakage.
This experience was rolled into the design of D modules, which work like a champ. They were everything I wanted modules to be. In particular,
The semantic meaning of the module is completely independent of where it is imported from.
Anyhow, C++ is welcome to adopt the D design of modules. C++ would get modules that have 25 years of use, and are very satisfying.
Yes, I do understand that the C preprocessor macros are a problem. My recommendation is, find language solutions to replace the preprocessor. C++ is most of the way there, just finish the job and relegate the preprocessor to the dustbin.
> just finish the job and relegate the preprocessor to the dustbin.
Yup, I think this is the core of the problem with C++. The standards committee has drawn a bad line that makes implementing modules cleanly basically impossible. Other languages with good module systems and fast incremental builds don't allow for preprocessor-style craziness without some pretty strict boundaries. Even languages that have gotten it somewhat wrong (such as Rust with its proc macros) have bounds on where and how that sort of metaprogramming can take place.
Even if the preprocessor isn't dustbinned, it should be excluded from the module system. Metaprogramming should be a feature of the language with clear interfaces and interactions. For example, in Java the annotation processor is ultimately what triggers code generation: no annotation, no metaprogramming. It's not perfect, but it's a lot better than the C/C++ free-for-all macro system.
Or the other option is the Go route. Don't make the compiler generate code; instead, have the build system be responsible for code generation (calling code generators). That would be miles better, as it'd allow devs to opt in to that slowdown when they need it.
I can't think of a C++ project I've worked on that didn't rely on being able to include C headers and have things usually just work. Are there ways of banning C macros from "modular" C++ without breaking that? (Many would find it unacceptable if you had to go through every C dependency and write/generate some sort of wrapper.)
D resolved this problem by creating D versions of the C system headers.
Yes, this was tedious, but we do it for each of our supported platforms.
But we can't do it for various C libraries. This created a problem for us, as it is indeed tedious for users. We created a repository where people shared their conversions, but it was still inadequate.
The solution was to build a C compiler into the D compiler. Now, you can simply "import" a C .h file. It works surprisingly well. Sure, some things don't work, as C programmers cannot resist putting some really crazy stuff in the .h files. The solution to that problem turned out to be that the D compiler could create D modules from C code. Then, the user could tweak the nutburger bits by hand.
> The solution was to build a C compiler into the D compiler.
This is the same solution that Apple chose for Swift <-> Objective C interop. I wonder if someone at Apple was inspired by this decision in D!
Did anyone reach out to you for input during the modules standardization process? D seems like the most obvious prior art, but that process seems like it was especially cursed.
Nobody from C++ reached out to me for the modules.
Herb Sutter, Andrei Alexandrescu and myself once submitted an official proposal for "static if" for C++, based on the huge success it has had in D. We received a vehement rejection. It demotivated me from submitting further proposals. ("static if" replaces the C preprocessor #if/#ifdef/#ifndef constructions.)
C++ has gone on to adopt many features of D, but usually with modifications that make them less useful.
static if was more or less added in C++17 under the name `if constexpr`. It's not exactly the same, since the discarded branch is still checked when it doesn't depend on a template parameter, but like most things in C++, it's similar enough to footgun yourself.
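For illustration, a rough sketch of the difference (the function names are made up):

    #include <type_traits>

    template <typename T>
    auto half(T x) {
        if constexpr (std::is_integral_v<T>) {
            return x / 2;     // selected when T is an integer type
        } else {
            return x * 0.5;   // discarded for integers, but still parsed
        }
    }

Outside a template there is nothing dependent, so both branches of an `if constexpr` are fully checked even though only one is ever taken.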
"if constexpr" introduces a new scope, while "static if" does not. A major divergence, enough to make the features very, very different in terms of what they can actually be used for.
That was a giant mistake on C++'s part, as you cannot do this:

    static if (feature) { int bar() { betty(); } }
    ... lots of code ...
    static if (feature) bar();

Forcing a new scope cuts the utility of it about in half, and there's no way around it. But if you need a scope with D's static if:

    static if (expression) {{ int x; foo(x); }}

the extra { } will do it. But, as it turns out, this is rarely desirable.
I would use std::enable_if or C++20 concepts. Both work fine for selectively enabling functions.
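For reference, a minimal sketch of both approaches (the functions here are made up for illustration):

    #include <concepts>
    #include <type_traits>

    // C++20 concepts: only participates in overload resolution for integer types
    template <std::integral T>
    T twice(T x) { return x + x; }

    // Pre-C++20 equivalent using std::enable_if
    template <typename T,
              typename = std::enable_if_t<std::is_floating_point_v<T>>>
    T halve(T x) { return x / 2; }

twice(4) and halve(2.0) compile; twice(2.0) and halve(4) are rejected at the call site.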
C++20 concepts are a step in the right direction, but:
- D had this feature long before C++ did.
- It isn't the same thing as "static if". Without "static if", conditionally compiling variables into classes is much more elaborate, basically requiring subclassing (as sketched below), which is not really how subclassing should be used semantically; the result is also far more confusing and oblique than the equivalent expressed directly with "static if".
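A minimal sketch of that subclassing workaround, with made-up names:

    #include <cstddef>

    template <bool WithCache>
    struct CacheBase {};                 // feature off: no extra member

    template <>
    struct CacheBase<true> {
        std::size_t cache = 0;           // feature on: member exists
    };

    template <bool WithCache>
    struct Table : CacheBase<WithCache> {
        int rows = 0;
        // With D's static if, the conditional member could be declared
        // right here instead of through a base class.
    };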
Yes, but not for accommodating missing definitions or platform-specific syntax. A true static if MUST allow invalid syntax in branches not taken if it's to fully replace the preprocessor.
I've often wondered how the evolution of C and C++ might have been different if a more capable preprocessor (in particular, with more flexible recursive expansion and a better grammar for pattern matching) had caught on. The C++ template engine can be used to work around some of those limits, but always awkwardly, not least due to the way you need Knuth's arrow notation to express the growth in compiler error message volume with template complexity. By the time C++ came out we already had tools like m4 and awk with far more capability than cpp. It's pretty ridiculous that everything else about computing has radically changed since 1970 except the preprocessor and its memory-driven constraints.
C++ modules aren't influenced by macros on import, nor do they export any macros, so I'm curious what the problem is?
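For concreteness, a minimal sketch of that isolation, assuming a C++20 toolchain (file and function names are illustrative):

    // math.ixx (module interface unit)
    export module math;
    #define INTERNAL_FLAG 1              // never visible to importers
    export int twice(int x) { return x + x; }

    // main.cpp
    import math;
    int main() { return twice(21); }     // INTERNAL_FLAG is not defined here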
What has to change in C++ templates for this to work?
It seems particularly tricky to define a template in a module and then instantiate it or specialize it somewhere else.
In order to make things work smoothly, the module has to have its own namespace, and a namespace that is closed.
D also has an `alias` feature, where you can do things like:

    alias Q = abc.T;

where from then on, `abc.T` can be referred to simply as `Q`. This also eliminates a large chunk of purpose behind the preprocessor.
Neat, closed namespaces sound great!
C++ has since adapted the ‘using’ keyword to work fairly similarly to alias, but it can't completely subsume macros, unfortunately.
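For reference, a small sketch of the C++ side (the names are made up):

    namespace abc { struct T { int x; }; }

    // alias-declaration, roughly parallel to D's `alias Q = abc.T;`
    using Q = abc::T;

    Q q{42};   // abc::T under a shorter, scoped, type-checked name

Unlike `#define Q abc::T`, this is hygienic and obeys scope, but it only covers type names; macros that paste tokens or expand to arbitrary code have no `using` equivalent.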
Closed namespaces means the module can be reliably imported anywhere without changing its semantic meaning.
I've really wanted that in C for a long time. It seems like a very trivial thing as well.
Yeah, it's actually easy to implement, too.
It replaces the preprocessor:

    #define Q abc.T

with hygiene. Once you get used to it, it has all kinds of nice uses.
C++ code generally uses `using` for typedefs. Macro-based typedefs are exceptionally rare and frowned upon.
Including or importing a templated class/function should not require bringing in the definition. That's why #includes and imports are so expensive, as we have to parse the entire definition to determine if template instantiations will work.
For normal functions or classes, we have forward declarations. Something similar needs to exist for templates.
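A rough sketch of the contrast being drawn, with made-up names:

    // A plain function can be forward-declared; callers need only this line,
    // and the body can live in a separately compiled file:
    int area(int w, int h);

    // A template can also be declared without its body...
    template <typename T>
    T scale(T x, T y);

    // ...but whether scale(a, b) is valid for a particular T, and what code
    // to emit for it, both depend on the body, so headers end up carrying
    // full template definitions.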
And it doesn't in D. We call such modules ".di files", which consist only of declarations.
D does not require names in global scope to be declared lexically before they are used. C++ only does this for class/struct scopes. For example:
    int bar() { return foo(); }
    int foo() { return bar(); }

compiles and runs (and runs, and runs, and runs!!!).
> D does not require names in global scope to be declared lexically before they are used. C++ only does this for class/struct scopes.
But how do you handle a template substitution failure? In C++:
    template<typename T> auto bar(T x, T y) { return x + y; }

The compiler has no idea whether bar(1, 2) will compile unless it parses the full definition. I don't understand how the compiler can avoid parsing the full definition.
The expensive bit in my experience isn't parsing the declaration, it's parsing the definition. Typically redundantly over thousands of source files for identical types.
I think modularization of templates is really hard. The best thing I can think of is a cache, e.g. for signatures. But then again, this is basically what the mangling already does anyway, in my understanding.
Not just signatures, you need the whole AST more or less.
This seems incredibly wasteful, but of course still marginally better than just #including code, which is the alternative.