Sunday, May 17, 2009

It's all C's fault

C's type casting syntax creates a lot of trouble for parsers.

(a)-b

What does the above code mean? Are you casting negative b to type a, or are you subtracting b from a? It's ambiguous: the answer depends on whether a is a type or a value.
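To make the two readings concrete, here is a minimal sketch in C++. The function names cast_reading and subtract_reading are made up for illustration. The expression text is identical in both functions, and only the declaration of a changes what it means:

```cpp
// Reading 1: `a` names a type, so (a)-b is a cast applied to -b.
int cast_reading() {
    typedef int a;      // a is a type
    double b = 2.5;
    return (a)-b;       // casts -2.5 to int, truncating toward zero
}

// Reading 2: `a` names a value, so (a)-b is a subtraction.
int subtract_reading() {
    int a = 10;         // a is a variable
    int b = 3;
    return (a)-b;       // computes 10 - 3
}
```

Calling cast_reading() gives -2 and subtract_reading() gives 7, even though both return statements are spelled the same way.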

Well, actually that's not the whole story. In Java, the only two operators that are ambiguous between binary and unary are + and -, and as unary operators they always produce a numeric primitive value. So any instance of (___)-b can be disambiguated by whether or not the ___ is one of the 7 numeric primitive types (boolean, the eighth primitive, can't be cast to or from a number). This works because Java's grammar only accepts a cast directly in front of a unary + or - when the cast type is primitive. In C++ this is not the case.
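The Java rule needs no knowledge of the program's declarations, only the text between the parentheses. A rough sketch of that check, written in C++ for illustration (is_java_numeric_primitive, JavaReading and classify_java are hypothetical names, not part of any real Java compiler):

```cpp
#include <string>
#include <unordered_set>

// Hypothetical helper: true if `name` is one of Java's seven numeric
// primitive type keywords (boolean is left out on purpose).
bool is_java_numeric_primitive(const std::string& name) {
    static const std::unordered_set<std::string> kinds = {
        "byte", "short", "char", "int", "long", "float", "double"};
    return kinds.count(name) != 0;
}

enum class JavaReading { Cast, Subtraction };

// Classifies "(inner) - b" using only the text between the parentheses.
JavaReading classify_java(const std::string& inner) {
    return is_java_numeric_primitive(inner) ? JavaReading::Cast
                                            : JavaReading::Subtraction;
}
```

Under this sketch, classify_java("int") is read as a cast, while classify_java("Integer") and classify_java("a") are read as subtractions.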

C++ has operator overloading, so unary + and - can apply to user-defined types too, but it barely dodges the bullet by enforcing the prototyping (declare-before-use) rule. To paraphrase: "every identifier, including every type name, must be declared prior to its usage," where "prior" refers to position in the source code. So if the C++ parser is traveling forward, it will have an accurate bank of known types whenever it needs to disambiguate code such as (a)-b. All it needs to do is look up which category a belongs to (type, variable, function, etc.), and it will be able to make the right decision from there.
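The lookup described above could look something like the following. This is only a sketch under simplifying assumptions: SymbolTable, Category and Meaning are invented names, and the table ignores scopes, namespaces and templates, all of which a real C++ parser has to handle.

```cpp
#include <string>
#include <unordered_map>

enum class Category { Type, Variable, Function };
enum class Meaning { Cast, Subtraction, Unknown };

// Hypothetical symbol table: filled as the parser moves forward through
// declarations, so it only ever knows names declared earlier in the source.
class SymbolTable {
public:
    void declare(const std::string& name, Category category) {
        table_[name] = category;
    }

    // Decides how to read "(name) - b" from what has been declared so far.
    Meaning classify(const std::string& name) const {
        auto it = table_.find(name);
        if (it == table_.end()) return Meaning::Unknown;  // not declared yet
        return it->second == Category::Type ? Meaning::Cast
                                            : Meaning::Subtraction;
    }

private:
    std::unordered_map<std::string, Category> table_;
};
```

After declare("a", Category::Type), classify("a") reports a cast. Had a been declared as a variable or function, it would report a subtraction, and a name that has not been declared yet comes back as Unknown.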

The price is that C++ can't cleanly separate parsing from semantic analysis, since the parser has to know what each name refers to, but C++ sucks anyway so who cares.

1 comment:

Andy said...

This post is kind of weak.