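Regular Expressions
Where regex can be used in ABAP
Regular expressions can be used in two statements, FIND and REPLACE (through their REGEX addition).
The regex parameter of the following built-in functions also accepts a regular expression: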
- count()
- contains()
- find()
- match()
- matches()
- replace()
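As a minimal sketch of how these functions take a regex, the report below calls each of them once. The report name, the variable names, and the sample text are illustrative assumptions; val, regex, occ, and with are the parameter names of the built-in functions.

```abap
REPORT zdemo_regex_functions. " hypothetical report name

DATA: text   TYPE string,
      cnt    TYPE i,
      off    TYPE i,
      sub    TYPE string,
      masked TYPE string.

text = `ABAP 7.40, ABAP 7.50`.

" count( ) returns the number of non-overlapping matches.
cnt = count( val = text regex = `\d+` ).

" find( ) returns the offset of the first match, or -1 if nothing matches.
off = find( val = text regex = `\d+` ).

" match( ) returns the matched substring; occ = 2 picks the second match.
sub = match( val = text regex = `\d\.\d+` occ = 2 ).

" replace( ) returns a new string; occ = 0 replaces every match.
masked = replace( val = text regex = `\d` with = `#` occ = 0 ).

WRITE: / cnt, / off, / sub, / masked.

" contains( ) and matches( ) are predicate functions, so they appear in conditions.
" contains( ) looks for a match anywhere in val; matches( ) requires all of val to match.
IF contains( val = text regex = `\d\.\d\d` ).
  WRITE / 'contains: yes'.
ENDIF.
IF matches( val = `7.40` regex = `\d\.\d+` ).
  WRITE / 'matches: yes'.
ENDIF.
```

With this sample text, cnt should be 4 (7, 40, 7, 50), off should be 5, sub should be 7.50, and masked should be ABAP #.##, ABAP #.##.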
In addition, two classes work with regex: CL_ABAP_REGEX and CL_ABAP_MATCHER.
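The two system classes for regex are CL_ABAP_REGEX, which holds a pattern, and CL_ABAP_MATCHER, which applies it to a text. The following is a small sketch of how they fit together; the report name, the pattern, and the sample text are made up for illustration.

```abap
REPORT zdemo_regex_classes. " hypothetical report name

DATA: regex   TYPE REF TO cl_abap_regex,
      matcher TYPE REF TO cl_abap_matcher,
      year    TYPE string,
      month   TYPE string.

" CL_ABAP_REGEX holds the compiled pattern.
CREATE OBJECT regex
  EXPORTING
    pattern = `(\d{4})-(\d{2})`.

" CL_ABAP_MATCHER applies that pattern to one text.
matcher = regex->create_matcher( text = `Period 2024-03 closed` ).

" find_next( ) advances to the next match and reports whether one was found.
IF matcher->find_next( ) = abap_true.
  " get_submatch( n ) returns the text captured by group n.
  year  = matcher->get_submatch( 1 ).
  month = matcher->get_submatch( 2 ).
  WRITE: / year, month.
ENDIF.
```

The object form is mainly useful when the same pattern is applied repeatedly or when matches have to be stepped through one at a time.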
Regex syntax rules
The following patterns each match a single character:
Special character | Meaning
. | Any single character
\C | Same as .: any single character
\d | Any single digit
\D | Any single non-digit character
\l | Any single lowercase letter
\L | Any single character that is not a lowercase letter
\s | Any single blank character
\S | Any single non-blank character
\u | Any single uppercase letter
\U | Any single character that is not an uppercase letter
\w | Any single alphanumeric character, including the underscore _
\W | Any single non-alphanumeric character, excluding the underscore _
[ ] | Value set: any single character from the set
[^ ] | Negated set: any single character that is not in the set. If ^ is not the first character inside [...], it is an ordinary ^ and not a negation; in [A^B], for example, ^ matches a literal ^
[ - ] | Range within a value set. If - does not stand between two characters (a form such as a-z-Z does not count), it is an ordinary - and not a range; in [A-Za-z0-9-], for example, the final - matches a literal -
[[:alnum:]] | Union of [:alpha:] and [:digit:] (alphanumeric characters)
[[:alpha:]] | Letters
[[:blank:]] | Blank characters and horizontal tabulators
[[:cntrl:]] | All control characters
[[:digit:]] | Equivalent to \d
[[:graph:]] | All displayable characters apart from blank characters and horizontal tabulators
[[:lower:]] | Lowercase letters
[[:print:]] | Union of [:graph:] and [:blank:]: all displayable characters
[[:punct:]] | All punctuation characters
[[:space:]] | Blank characters, tabulators, and carriage feeds
[[:unicode:]] | Unicode characters with a code larger than 255
[[:upper:]] | Uppercase letters
[[:word:]] | Alphanumeric characters, including _
[[:xdigit:]] | Hexadecimal digits
\a \f \n \r \t \v | Various platform-specific control characters
[..] | Reserved for later enhancements; not currently supported
[==] | Reserved for later enhancements; not currently supported
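A few of the single-character patterns can be tried out with matches( ), which succeeds only if the whole value matches. This is an illustrative sketch; the report name and the sample values are assumptions chosen to show the literal ^ and literal - cases described in the table.

```abap
REPORT zdemo_regex_single_chars. " hypothetical report name

" \u = one uppercase letter, \l = one lowercase letter, \d = one digit.
IF matches( val = `Ab1` regex = `\u\l\d` ).
  WRITE / 'Ab1 matches \u\l\d'.
ENDIF.

" \w includes the underscore but not the hyphen.
IF matches( val = `user_01` regex = `\w+` )
   AND NOT matches( val = `user-01` regex = `\w+` ).
  WRITE / '\w+ accepts user_01 but rejects user-01'.
ENDIF.

" A trailing - inside a set is a literal hyphen.
IF matches( val = `user-01` regex = `[A-Za-z0-9-]+` ).
  WRITE / 'the set with a trailing - accepts user-01'.
ENDIF.

" A ^ that is not the first character of a set is a literal ^.
IF matches( val = `^` regex = `[A^B]` ).
  WRITE / '[A^B] accepts a literal ^'.
ENDIF.

" POSIX-style class for hexadecimal digits.
IF matches( val = `1aF` regex = `[[:xdigit:]]+` ).
  WRITE / '1aF consists of hexadecimal digits'.
ENDIF.
```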
Special character | Meaning
{n} | Exactly n occurrences
{n,m} | Between n and m occurrences
{n,m}? | Reserved for later enhancements; not currently supported. {n,m}?, *?, and +? are the non-greedy forms, which ABAP does not currently support
? | Zero or one occurrence
* | Zero or more occurrences
*? | Reserved for later enhancements; not currently supported
+ | One or more occurrences
+? | Reserved for later enhancements; not currently supported
| | Alternation: links two alternative expressions. r|st is equivalent to r|(?:st), not to (?:r|s)t; r|s+ is equivalent to r|(?:s+), not to (?:r|s)+
( ) | Capturing group
(?: ) | Non-capturing group (groups without capturing)
\1, \2, \3 ... | Back-reference to the text captured by group 1, 2, 3 ...
\Q ... \E | All characters between \Q and \E are treated as literal characters
(? ... ) | Reserved for later enhancements. Apart from (?:...), (?=...), and (?!...), which are currently supported, and the cut operator (?>...) listed with the search patterns, every other (? ... ) form raises the exception CX_SY_INVALID_REGEX
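The precedence of | and the effect of grouping, back-references, and \Q ... \E are easier to see in a short test. The sketch below uses made-up values and a hypothetical report name; it only combines constructs listed in the table.

```abap
REPORT zdemo_regex_groups. " hypothetical report name

DATA: off TYPE i,
      len TYPE i.

" {n,m}: between 2 and 4 digits.
IF matches( val = `123` regex = `\d{2,4}` )
   AND NOT matches( val = `1` regex = `\d{2,4}` ).
  WRITE / '\d{2,4} accepts 123 but rejects 1'.
ENDIF.

" Alternation binds more weakly than concatenation: r|st means r or st.
IF matches( val = `st` regex = `r|st` )
   AND NOT matches( val = `rt` regex = `r|st` ).
  WRITE / 'r|st accepts st but rejects rt'.
ENDIF.

" A non-capturing group restricts the alternation to r or s.
IF matches( val = `rt` regex = `(?:r|s)t` ).
  WRITE / '(?:r|s)t accepts rt'.
ENDIF.

" \1 refers back to what group 1 captured, so (\w)\1 finds a doubled character.
FIND REGEX `(\w)\1` IN `Hello` MATCH OFFSET off MATCH LENGTH len.
IF sy-subrc = 0.
  WRITE: / 'doubled character at offset', off, 'length', len.
ENDIF.

" \Q ... \E turns the . into a literal dot.
IF matches( val = `a.b` regex = `a\Q.\Eb` )
   AND NOT matches( val = `axb` regex = `a\Q.\Eb` ).
  WRITE / 'a\Q.\Eb accepts a.b but rejects axb'.
ENDIF.
```

In `Hello`, the doubled l starts at offset 2 and the match is 2 characters long.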
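Special character | Meaning
^ | Start of the string or start of a line (anchor for the start of a line)
\A | Start of the string
$ | End of the string, the position before a \n at the end of the string, or the end of a line (anchor for the end of a line). The \n itself is not matched; if the string ends with \n, $ matches just before it
\z | End of the string
\Z | End of the string or the position before a \n at the end of the string (like \z, but line breaks at the end of the string are ignored)
\< | Start of a word: matches the word boundary without consuming any character
\> | End of a word: matches the word boundary without consuming any character
\b | Start or end of a word: matches the word boundary without consuming any character
\B | The opposite of \b: a position that is not a word boundary (between characters within a word)
(?= ) | Positive lookahead (preview condition)
(?! ) | Negative lookahead (negated preview condition)
(?> ) | Cut operator: the enclosed subexpression is matched independently, and once it has matched, the engine does not backtrack into it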
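The word-boundary anchors can be compared by counting matches with FIND ... MATCH COUNT. This is a rough sketch with an invented sample sentence and a hypothetical report name.

```abap
REPORT zdemo_regex_word_bounds. " hypothetical report name

DATA: text       TYPE string,
      whole_word TYPE i,
      inner      TYPE i,
      word_start TYPE i.

text = `cat concat cat.`.

" \b on both sides: cat only as a complete word.
FIND ALL OCCURRENCES OF REGEX `\bcat\b` IN text MATCH COUNT whole_word.

" \B in front: cat only where it continues a word (inside concat).
FIND ALL OCCURRENCES OF REGEX `\Bcat` IN text MATCH COUNT inner.

" \< matches only at the start of a word.
FIND ALL OCCURRENCES OF REGEX `\<c` IN text MATCH COUNT word_start.

WRITE: / whole_word, / inner, / word_start.
```

For this text the counts should be 2, 1, and 3: two standalone occurrences of cat, one cat inside concat, and three words starting with c.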
Special character | Meaning
$0, $& | The whole match (placeholder for the whole found location)
$1, $2, $3 ... | Reference to a capturing group (placeholder for the register of subgroups)
$` | All text before the match (placeholder for the text before the found location)
$' | All text after the match (placeholder for the text after the found location)
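(?=...), (?!...)
Note: ABAP currently supports only the lookahead forms (?=...) and (?!...). The lookbehind forms (?<=...) and (?<!...) are not supported, although Java supports both.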
DATA text TYPE string.
DATA result_tab TYPE match_result_tab.
DATA result TYPE match_result.
text = `Shalalala!`.
" Find every "la" that is directly followed by "!"; the "!" itself is not part of the match
FIND ALL OCCURRENCES OF REGEX '(?:la)(?=!)'
IN text RESULTS result_tab.
LOOP AT result_tab INTO result.
WRITE: / result-offset, result-length.
ENDLOOP.
7 2
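For comparison, a negative lookahead on the same string might look like the following sketch. The report name and the variable cnt are assumptions; the pattern only uses the (?!...) form named above.

```abap
REPORT zdemo_regex_neg_lookahead. " hypothetical report name

DATA: text TYPE string,
      cnt  TYPE i.

text = `Shalalala!`.

" (?!!) succeeds only where the next character is not "!",
" so the final "la" (offset 7) is excluded.
FIND ALL OCCURRENCES OF REGEX `la(?!!)` IN text MATCH COUNT cnt.

WRITE: / cnt.
```

The string contains la at offsets 3, 5, and 7; only the first two are not followed by !, so cnt should be 2.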
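Differences between \A, \z, \Z and ^, $
^ The match must occur at the start of the string or at the start of a line (that is, at the first character after a \n).
$ The match must occur at the end of the string, before a \n at the end of the string, or at the end of a line. The \n itself is not matched; if the string ends with \n, $ matches just before it.
\A The match must occur at the start of the string (the Multiline option is ignored).
\z The match must occur at the end of the string (the Multiline option is ignored).
\Z The match must occur at the end of the string or before a \n at the end of the string (the Multiline option is ignored).
Unlike ^ and $, the anchors \A, \z, and \Z never match at inner line boundaries: \A applies only at the very start of the string, and \z and \Z only at its very end.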
DATA text TYPE string.
" To embed a line break in an ABAP string, use a string template |...| with \n; a literal in single quotes does not interpret \n
text = |zzz\nabc\n|.
IF contains( val = text regex = '^abc' ).
WRITE:/ 'Yes'.
ELSE.
WRITE:/ 'No'.
ENDIF.
" With \A in place of ^, the pattern does not match
IF contains( val = text regex = '\Aabc' ).
WRITE:/ 'Yes'.
ELSE.
WRITE:/ 'No'.
ENDIF.
IF contains( val = text regex = 'z$' ).
WRITE:/ 'Yes'.
ELSE.
WRITE:/ 'No'.
ENDIF.
IF contains( val = text regex = 'c$' ).
WRITE:/ 'Yes'.
ELSE.
WRITE:/ 'No'.
ENDIF.
IF contains( val = text regex = 'z\z' ).
WRITE:/ 'Yes'.
ELSE.
WRITE:/ 'No'.
ENDIF.
IF contains( val = text regex = 'z\Z' ).
WRITE:/ 'Yes'.
ELSE.
WRITE:/ 'No'.
ENDIF.
IF contains( val = text regex = 'c\z' ).
WRITE:/ 'Yes'.
ELSE.
WRITE:/ 'No'.
ENDIF.
IF contains( val = text regex = 'c\Z' ).
WRITE:/ 'Yes'.
ELSE.
WRITE:/ 'No'.
ENDIF.
Yes
No
Yes
Yes
No
No
No
Yes
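The same distinction can be made visible by counting whole-line matches. The sketch below reuses the two-line string; the report name and the counter names are assumptions, and the expected counts simply follow from the anchor rules described in this section.

```abap
REPORT zdemo_regex_anchors. " hypothetical report name

DATA: text       TYPE string,
      per_line   TYPE i,
      from_start TYPE i,
      to_end     TYPE i.

text = |zzz\nabc\n|.

" ^ and $ apply to every line, so both zzz and abc qualify.
FIND ALL OCCURRENCES OF REGEX `^\w+$` IN text MATCH COUNT per_line.

" \A pins the match to the start of the string, so only zzz qualifies.
FIND ALL OCCURRENCES OF REGEX `\A\w+$` IN text MATCH COUNT from_start.

" \Z pins the match to the end of the string (before the final \n), so only abc qualifies.
FIND ALL OCCURRENCES OF REGEX `^\w+\Z` IN text MATCH COUNT to_end.

WRITE: / per_line, / from_start, / to_end.
```

Given those rules, the three counts should be 2, 1, and 1.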
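$0…, $&, $`, $'
The replacement string can contain the placeholders $0…, $&, $`, and $'.
$0 and $& both stand for the substring matched by the whole regex, as if the whole regex were enclosed in ( ). $0 is the whole match whether or not the regex is actually wrapped in ( ). Wrapping the whole regex in ( ) does, however, affect the numbered groups that follow: the wrapped regex becomes $1 and each inner group moves up by one. Java numbers capturing groups in the same way.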
DATA text TYPE string.
text = `Yeah!+`.
REPLACE REGEX `((Y)e(a)h(!))` IN text WITH `-$&-$0-$1-$2-$3-$4-`.
WRITE:/ text.
text = `Yeah!`.
REPLACE REGEX `\w+` IN text WITH `-$&-$0-`.
WRITE:/ text.
-Yeah!-Yeah!-Yeah!-Y-a-!-+
-Yeah-Yeah-!
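$` stands for all text before the match. When there are several matches, it is always taken from the original source string, not from the partially replaced result: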
DATA text TYPE string.
text = `again and abc and def`.
REPLACE ALL OCCURRENCES OF REGEX 'and' IN text WITH '($0 $`)'.
WRITE:/ text.
again (and again ) abc (and again and abc ) def
$' stands for all text after the match:
DATA: text TYPE string.
text = `again and again abc`.
REPLACE ALL OCCURRENCES OF REGEX `again ` IN text WITH `($' $0)`.
WRITE:/ text.
(and again abc again )and (abc again )abc
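A common use of the numbered placeholders is to reorder captured parts. As a small illustration (the report name, date format, and sample text are invented), the groups of an ISO-style date can be swapped like this:

```abap
REPORT zdemo_regex_reorder. " hypothetical report name

DATA text TYPE string.

text = `Posted on 2024-03-15`.

" $1 = year, $2 = month, $3 = day; the replacement writes them in reverse order.
REPLACE REGEX `(\d{4})-(\d{2})-(\d{2})` IN text WITH `$3.$2.$1`.

WRITE: / text.
```

After the replacement, text should read Posted on 15.03.2024.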
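\1, \2, \3…
Inside the regex itself, capturing groups can be referenced as \1, \2, \3…. Unlike the $ placeholders, the leftmost, outermost parenthesized subexpression is group 1, rather than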