1. Introduction
Flex is short for Fast Lexical Analyzer; it is a generator of lexical analyzers (scanners). The following introduction is excerpted from SourceForge:
Flex is a tool for generating scanners. A scanner is a program which recognizes lexical patterns in text.
The flex program reads the given input files, or its standard input if no input file names are given, for a
description of a scanner to generate. The description is in the form of pairs of regular expressions and
C code, called rules. Flex generates as output a C source file, lex.yy.c by default, which defines a routine
yylex(). The file can be compiled and linked with the flex runtime library to produce an executable. When
the executable is run, it analyses its input for occurrences of the regular expressions. Whenever it finds one,
it executes the corresponding C code.
2. Format of the Flex Input File
A Flex input file consists of three sections, separated by lines containing only %%.
definitions
%%
rules
%%
user code
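As a minimal sketch of this three-part layout, the following scanner counts the lines and characters in its input. The variable names and the use of %option noyywrap (which removes the need to supply yywrap() or link the flex library) are choices made for this illustration, not something the layout requires.

```lex
%{
/* Definitions section: C declarations used by the rules below. */
#include <stdio.h>
int num_lines = 0;
int num_chars = 0;
%}

%option noyywrap

%%

\n      { ++num_lines; ++num_chars; }
.       { ++num_chars; }

%%

/* User code section: copied to the end of lex.yy.c. */
int main(void)
{
    yylex();    /* scan standard input until end of file */
    printf("lines: %d, chars: %d\n", num_lines, num_chars);
    return 0;
}
```

Assuming the file is saved as count.l (a hypothetical name), it can be built with "flex count.l" followed by "cc lex.yy.c -o count"; the resulting program reads standard input and prints the two totals at end of file.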
2.1 Format of the Definitions Section
The definitions section contains declarations of simple name definitions to simplify the scanner specification,
and declarations of start conditions.
Name definitions have the form:
name definition
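The name begins with a letter or an underscore, followed by zero or more letters, digits, underscores, or dashes ("-"). The definition begins at the first non-whitespace character after the name and runs to the end of the line. The definition can later be referenced as {name}, which expands to (definition). For example: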
DIGIT [0-9]
ID [a-z][a-z0-9]*
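Here DIGIT is defined as a regular expression that matches a single digit, and ID as a regular expression that matches a lowercase letter followed by zero or more lowercase letters or digits. If we then write {DIGIT}+"."{DIGIT}*, it is expanded to ([0-9])+"."([0-9])*.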
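To see the expansion at work, here is a small sketch that uses the DIGIT and ID definitions in actual rules. The rule set, the messages, and the main() function are illustrative assumptions; only the two definitions and the {DIGIT}+"."{DIGIT}* pattern come from the text above.

```lex
%{
#include <stdio.h>
%}

%option noyywrap

DIGIT   [0-9]
ID      [a-z][a-z0-9]*

%%

{DIGIT}+"."{DIGIT}*   { printf("float: %s\n", yytext); }
{DIGIT}+              { printf("integer: %s\n", yytext); }
{ID}                  { printf("identifier: %s\n", yytext); }
[ \t\n]+              { /* skip whitespace */ }
.                     { printf("unrecognized: %s\n", yytext); }

%%

int main(void)
{
    yylex();
    return 0;
}
```

Given the input line "x1 42 3.14", this scanner should report an identifier (x1), an integer (42), and a float (3.14). For 3.14 the float rule wins because flex prefers the rule that matches the longest text.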
An unindented comment (of the form /*...*/) is copied verbatim to the output file (lex.yy.c).
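Any indented text, and any text enclosed between '%{' and '%}', is also copied verbatim to the output file, with the '%{' and '%}' symbols themselves removed. In the lex file, '%{' and '%}' must not be indented.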
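The three copying rules can be sketched in one definitions section as follows. The counters and the word-counting rules are hypothetical, added only so that the copied declarations are actually used.

```lex
/* An unindented comment: copied verbatim to lex.yy.c. */

%{
/* Text between %{ and %}: copied verbatim, without the delimiters. */
#include <stdio.h>
static int word_count = 0;
%}

    /* Indented text: also copied verbatim. */
    static int line_count = 0;

%option noyywrap

%%

[^ \t\n]+   { ++word_count; }
\n          { ++line_count; }
[ \t]+      { /* skip blanks */ }

%%

int main(void)
{
    yylex();
    printf("words: %d, lines: %d\n", word_count, line_count);
    return 0;
}
```

Because both declarations end up at file scope in lex.yy.c, the rule actions and main() can refer to word_count and line_count directly.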
(To be continued)