Why is the ‘auto’ keyword useful for compiler writers in C?

Author answer:
I just emailed Mr Van der Linden, and here is what he said:

Yes, I agree with the people who answered on stack overflow.
I don’t know for certain, because I never used the language B, but it seems highly plausible to me that “auto” ended up in C because it was in B.

Even when I was doing kernel and compiler programming professionally in C in the 1980s, I cannot recall ever seeing any code that used “auto”.

The key takeaway is that the auto keyword doesn’t add any extra information, and thus is redundant and unneeded. It was a mistake to bring it into C!
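To make the redundancy concrete, here is a minimal sketch of my own (not from Mr Van der Linden's email). The function names are hypothetical. It assumes the file is compiled as C, not C++, where “auto” was later given a different meaning. The two functions differ only in whether “auto” is written out.

```c
#include <assert.h>

/* Sums 1..n; the locals are declared with an explicit `auto`. */
int sum_with_auto(int n)
{
    auto int total = 0;   /* explicitly automatic storage */
    auto int i;
    for (i = 1; i <= n; i++)
        total += i;
    return total;
}

/* Same function without the keyword: locals are automatic anyway. */
int sum_plain(int n)
{
    int total = 0;        /* implicitly automatic storage */
    int i;
    for (i = 1; i <= n; i++)
        total += i;
    return total;
}

/* Hypothetical self-check: both versions must always agree. */
void check_equivalence(void)
{
    int n;
    for (n = 0; n < 10; n++)
        assert(sum_with_auto(n) == sum_plain(n));
}
```

Calling check_equivalence() should pass silently, since the keyword changes nothing about how the locals are stored.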

I also asked him to explain what he meant when he mentioned compiler writing and the symbol table. Here is his response:

Say you are writing a compiler that will translate C source code into linker objects (object files that can be linked).

Whenever your lexer (front end of the compiler) finds a sequence of characters that form a user-defined symbol (might be a variable, might be a function name, might be a constant, etc), the compiler will store that name in a table called the “symbol table”. It will also store everything else it knows about the symbol – if it is a variable, it will store its type, if a constant it will store the value, if a function it will note that it can be invoked, and so on. It will also store the scope of the name (the lines of code in which this symbol is known). The symbol table is one of the core data structures of a compiler, and some of it is carried forward into the object file. The object file needs to know any names that are to be addressable by external code objects, so the linker can associate the use of a name with the object in which it is stored.

Then later, when the compiler comes across the same name, the compiler looks in the symbol table to see if it knows all about the name already. One of the useful items to store about a name is “where the compiler will allocate storage for it”. That storage has to be maintained as long as the symbol remains in scope. So it is useful for the symbol table to know where it should allocate the storage at runtime. I gave 3 examples of different places where a variable might be stored. The “auto” keyword tells the compiler “this is a variable, and you should store this on the stack and its scope is the function it is declared in”.

Only, the compiler doesn’t need to be told this, because this is already true for all variables declared within a function.
I hope this explanation makes sense.
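To picture the symbol table he describes, here is a rough sketch of my own. Every name in it (struct symbol, symtab_insert, symtab_lookup, the enum values) is made up for illustration. The three storage locations are my guess at typical ones, not necessarily the three examples he had in mind. A real compiler would use a hash table and nested scopes, so a flat array is only a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical kinds of symbol and places to store a variable. */
enum symbol_kind { SYM_VARIABLE, SYM_FUNCTION, SYM_CONSTANT };
enum storage { STORAGE_STACK, STORAGE_STATIC, STORAGE_REGISTER };

/* What the compiler remembers about one name. */
struct symbol {
    const char *name;
    enum symbol_kind kind;
    const char *type;        /* e.g. "int" */
    enum storage storage;    /* where storage will be allocated */
    int scope_first_line;    /* scope as a range of source lines */
    int scope_last_line;
};

#define MAX_SYMBOLS 64

struct symbol_table {
    struct symbol entries[MAX_SYMBOLS];
    size_t count;
};

/* Returns the entry for `name`, or NULL if the name is not known yet. */
struct symbol *symtab_lookup(struct symbol_table *t, const char *name)
{
    size_t i;
    for (i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].name, name) == 0)
            return &t->entries[i];
    return NULL;
}

/* Stores a new symbol. Returns the existing entry if the name is
   already known, or NULL if the table is full. */
struct symbol *symtab_insert(struct symbol_table *t, struct symbol s)
{
    struct symbol *found;
    assert(s.name != NULL);
    found = symtab_lookup(t, s.name);
    if (found != NULL)
        return found;
    if (t->count == MAX_SYMBOLS)
        return NULL;
    t->entries[t->count] = s;
    return &t->entries[t->count++];
}
```

In this sketch, a local such as “int counter;” would be inserted with STORAGE_STACK whether or not the source spelled out “auto”, which is exactly why the keyword tells the compiler nothing new.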

I guess I completely misunderstood his statements. I thought auto might have some use when writing a compiler in C, in the code that deals with the symbol table. What he actually meant is that auto is useless, but C compiler writers must still recognize and handle it.
I nevertheless asked him to confirm my mistake, and it was indeed a misunderstanding on my part:

Perhaps the best way to think about this is:

  1. “auto” has no semantic effect in C.
  2. We think it came over from B, but don’t know for sure.
  3. It conveys info to someone writing a compiler for C code.
  4. But that info is a duplicate of other info that the compiler writer has.
  5. So a compiler writer can take note of either piece of info to update the symbol table.
  6. Or indeed, they can check that the two pieces of info are consistent, and if not, issue an error message.
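Points 5 and 6 can be sketched roughly as follows. This is my own illustration, with hypothetical names throughout (resolve_storage and the enums are not from any real compiler). It assumes the traditional rule, at least through C17, that “auto” as a storage-class specifier is not allowed on a declaration at file scope.

```c
#include <assert.h>

/* Where the declaration appears: the info the compiler already has. */
enum decl_context { CTX_FILE_SCOPE, CTX_INSIDE_FUNCTION };

/* What the programmer wrote: the (possibly redundant) keyword. */
enum storage_spec { SPEC_NONE, SPEC_AUTO, SPEC_STATIC };

/* Result recorded in the symbol table, or an error. */
enum storage { STORAGE_ERROR, STORAGE_STACK, STORAGE_STATIC };

/* Combines both pieces of info and checks that they are consistent. */
enum storage resolve_storage(enum decl_context ctx, enum storage_spec spec)
{
    switch (spec) {
    case SPEC_STATIC:
        return STORAGE_STATIC;
    case SPEC_AUTO:
        /* `auto` only agrees with a declaration inside a function. */
        return ctx == CTX_INSIDE_FUNCTION ? STORAGE_STACK : STORAGE_ERROR;
    case SPEC_NONE:
        /* No keyword: the context alone decides. */
        return ctx == CTX_INSIDE_FUNCTION ? STORAGE_STACK : STORAGE_STATIC;
    }
    return STORAGE_ERROR;
}

/* Hypothetical self-check of the cases discussed above. */
void check_examples(void)
{
    /* `auto int x;` and `int x;` inside a function give the same answer. */
    assert(resolve_storage(CTX_INSIDE_FUNCTION, SPEC_AUTO) ==
           resolve_storage(CTX_INSIDE_FUNCTION, SPEC_NONE));
    /* `auto int x;` at file scope is inconsistent: report an error. */
    assert(resolve_storage(CTX_FILE_SCOPE, SPEC_AUTO) == STORAGE_ERROR);
}
```

Inside a function, SPEC_AUTO and SPEC_NONE lead to the same result, so the only thing the keyword can ever contribute is the error case.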
