
Flex handles ambiguous patterns

Most flex programs are quite ambiguous, with multiple patterns that can match the same input. Flex resolves the ambiguity with two simple rules:

  • Match the longest possible string every time the scanner matches input.
  • In the case of a tie, use the pattern that appears first in the program.

These turn out to do the right thing in the vast majority of cases. Consider this snippet
from a scanner for C source code:

    "+"                     { return ADD; }
    "="                     { return ASSIGN; }
    "+="                    { return ASSIGNADD; }

    "if"                    { return KEYWORDIF; }
    "else"                  { return KEYWORDELSE; }
    [a-zA-Z_][a-zA-Z0-9_]*  { return IDENTIFIER; }

For the first three patterns, the string += is matched as one token, since += is longer than
+. For the last three patterns, a keyword such as if is matched at the same length by both its
keyword pattern and the identifier pattern, so the scanner matches keywords correctly only so
long as the keyword patterns precede the pattern that matches an identifier.
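
The two rules can be sketched by hand in plain C. This is only an illustration of the matching policy, not how flex works internally (flex compiles all patterns into a single state machine); the function names next_token, match_literal, and match_identifier are hypothetical, and the numeric token values are arbitrary.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Token names follow the snippet above; the numeric values are arbitrary. */
enum token { NONE, ADD, ASSIGN, ASSIGNADD, KEYWORDIF, KEYWORDELSE, IDENTIFIER };

/* Length of lit if s starts with lit, otherwise 0. */
static size_t match_literal(const char *s, const char *lit)
{
    size_t n = strlen(lit);
    return strncmp(s, lit, n) == 0 ? n : 0;
}

/* Length of the prefix of s matching [a-zA-Z_][a-zA-Z0-9_]*, or 0.
   Assumes the default "C" locale, where isalpha/isalnum accept only ASCII. */
static size_t match_identifier(const char *s)
{
    size_t n = 0;
    if (!(isalpha((unsigned char)s[0]) || s[0] == '_'))
        return 0;
    while (isalnum((unsigned char)s[n]) || s[n] == '_')
        n++;
    return n;
}

/* Try every rule at the start of s, in program order.
   Returns the winning token and stores the match length in *len. */
enum token next_token(const char *s, size_t *len)
{
    static const struct { const char *lit; enum token tok; } literals[] = {
        { "+", ADD }, { "=", ASSIGN }, { "+=", ASSIGNADD },
        { "if", KEYWORDIF }, { "else", KEYWORDELSE },
    };
    enum token best = NONE;
    size_t best_len = 0;

    assert(s != NULL && len != NULL);

    for (size_t i = 0; i < sizeof literals / sizeof literals[0]; i++) {
        size_t n = match_literal(s, literals[i].lit);
        if (n > best_len) {          /* rule 1: the longest match wins */
            best_len = n;
            best = literals[i].tok;
        }
    }

    /* The identifier rule comes last. The strict '>' means that on a tie
       the earlier rule keeps its win (rule 2). */
    size_t n = match_identifier(s);
    if (n > best_len) {
        best_len = n;
        best = IDENTIFIER;
    }

    *len = best_len;
    return best;
}
```

Under these assumptions, calling next_token on "+=1" yields ASSIGNADD with length 2, "if (x)" yields KEYWORDIF because the keyword rule ties with the identifier rule and comes first, and "iffy" yields IDENTIFIER because the identifier rule matches four characters while the keyword rule matches only two.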

