5. 正则表达式必知必会-重复匹配
5.1 有多少个匹配
匹配电子邮箱。
w@w.w 只能匹配 a@b.c 这样的邮箱,不能匹配 abcd@dbcd.com 这样的邮箱地址。
5.1.1 匹配一个或多个字符
匹配同一个字符或字符集的多次重复,可以用字符 +,匹配一个或多个字符。
const email1 = 'luwl@qq.com';
const email2 = 'luwl2@163.com';
const reg = /w+@w+.w+/; // 此处正则不能使用全局 g,否则,下面第二个 console 打印出来是 false,参考下面
console.log(reg.test(email1)); // true
console.log(reg.test(email2)); // true
匹配一个或多个字符集
const str =
'Send personal email to luwl@qq.com or luwl.p@qq.com. For questions use support@qq.com or support@vip.qq.com.';
const reg = /w+@w+.w+/g;
let match;
while ((match = reg.exec(str))) {
console.log(`匹配项: ${match[0]}, index: ${match.index}`);
}
// 匹配项: luwl@qq.com, index: 23
// 匹配项: p@qq.com, index: 43 <-- 不对
// 匹配项: support@qq.com, index: 71
// 匹配项: support@vip.qq, index: 89 <-- 不对
const str =
'Send personal email to luwl@qq.com or luwl.p@qq.com. For questions use support@qq.com. If your message is urgent try support@vip.qq.com.';
const reg = /[w.]+@[w.]+.w+/g; // <-- 修改正则,匹配字符集 [w.]
let match;
while ((match = reg.exec(str))) {
console.log(`匹配项: ${match[0]}, index: ${match.index}`);
}
// 匹配项: luwl@qq.com, index: 23
// 匹配项: luwl.p@qq.com, index: 38
// 匹配项: support@qq.com, index: 71
// 匹配项: support@vip.qq.com, index: 89
5.1.2 匹配零个或多个字符
元字符 + 匹配一个或多个字符,元字符 * 匹配零个或多个字符。
const str = '.luwl@qq.com';
const reg1 = /[w.]+@[w.]+.w+/;
console.log(reg1.exec(str)); // [".luwl@qq.com", index: 0, input: ".luwl@qq.com", groups: undefined]
// email 的开头是点,不符合 email 的格式
const reg2 = /w+[w.]+@[w.]+.w+/;
console.log(reg2.exec(str)); // ["luwl@qq.com", index: 1, input: ".luwl@qq.com", groups: undefined]
const reg3 = /w+[w.]*@[w.]+.w+/; // <-- * 前面的是可选的
console.log(reg3.exec(str)); // ["luwl@qq.com", index: 1, input: ".luwl@qq.com", groups: undefined]
5.1.3 匹配零个或一个字符
元字符 ?,匹配一个字符或字符集的 0 次或 1次。
const str =
'The url is http://wendys.com/, to connect securely use https://wendys.com/ . This is a wrong url: httpssssss://wendys.com/ .';
const reg1 = /http://[w./]+/g;
let match1;
while ((match1 = reg1.exec(str))) {
console.log(match1[0]);
}
// http://wendys.com/
// 匹配不到 https://wendys.com
const reg2 = /https*://[w./]+/g; // <-- 匹配 0 次或多次 s
let match2;
while ((match2 = reg2.exec(str))) {
console.log(match2[0]);
}
// http://wendys.com/
// https://wendys.com/
// httpssssss://wendys.com/ <-- 匹配到错误的 url
const reg3 = /https?://[w./]+/g; // <-- 匹配 0 次或 1 次 s
let match3;
while ((match3 = reg3.exec(str))) {
console.log(match3[0]);
}
// http://wendys.com/
// https://wendys.com/
5.2 匹配的重复次数
重复次数可以用 {数值} 来表示。
5.2.1 为重复匹配的次数设定一个精确的值
匹配十六进制的颜色。
const color1 = '#d8d8d8';
const color2 = '#666666';
const color3 = '#ddd';
const reg = /#[a-zA-Z0-9]{6}/; // <-- 重复6遍
console.log(reg.exec(color1)); // ["#d8d8d8", index: 0, input: "#d8d8d8", groups: undefined]
console.log(reg.exec(color2)); // ["#666666", index: 0, input: "#666666", groups: undefined]
console.log(reg.exec(color3)); // null
5.2.2 为重复匹配次数设定一个区间
验证日期格式。
const date1 = '07-19-2019';
const date2 = '7-19-2019';
const date3 = '7/19/2019';
const date4 = '7/9/2019';
const date5 = '7/9/19';
const date6 = '7/9/1';
const reg = /d{1,2}[/-]d{1,2}[/-]d{2,4}/; // <-- 不能检查日期值是否有效,只能检查格式是否正确
console.log(reg.exec(date1)); // ["07-19-2019", index: 0, input: "07-19-2019", groups: undefined]
console.log(reg.exec(date2)); // ["7-19-2019", index: 0, input: "7-19-2019", groups: undefined]
console.log(reg.exec(date3)); // ["7/19/2019", index: 0, input: "7/19/2019", groups: undefined]
console.log(reg.exec(date4)); // ["7/9/2019", index: 0, input: "7/9/2019", groups: undefined]
console.log(reg.exec(date5)); // ["7/9/19", index: 0, input: "7/9/19", groups: undefined]
console.log(reg.exec(date6)); // null
5.2.3 匹配 “至少重复多少次“
const str = '1001: $496.80; 1002: $1290.69; 1003: $26.43; 1004: $613.42; 1005: $7.61; 1006: $414.90; 1007: $25.00;';
const reg = /d{4}: $d{3,}.d{2}/g;
let match;
while ((match = reg.exec(str))) {
console.log('匹配项: ' + match[0] + ' index: ' + match.index); // <-- 找出金额大于等于100的
}
// 匹配项: 1001: $496.80 index: 0
// 匹配项: 1002: $1290.69 index: 15
// 匹配项: 1004: $613.42 index: 45
// 匹配项: 1006: $414.90 index: 73
5.3 防止过度匹配
const str = '<p>this is a test, and this is <b>important</b>, this is also <b>important</b>.</p>';
const reg = /<b>.*</b>/g; // <-- 匹配 <b> 中的内容
console.log(reg.exec(str));
// ["<b>important</b>, this is also <b>important</b>", index: 31, input: "<p>this is a test, and this is <b>important</b>, this is also <b>important</b>.</p>", groups: undefined]
// 匹配出来的和预期不一样,预期的应该是两个
- 和 + 是“贪婪型“元字符。
使用这些字符的“懒惰型”版本,就可以了。
贪婪型元字符 | 懒惰型元字符 |
---|---|
* | *? |
+ | +? |
{n, } | {n, }? |
const str = '<p>this is a test, and this is <b>important</b>, this is also <b>important</b>.</p>';
const reg = /<b>.*?</b>/g; // <-- 匹配 <b> 中的内容
let match;
while ((match = reg.exec(str))) {
console.log(`匹配项: ${match[0]}, index: ${match.index}`);
}
// 匹配项: <b>important</b>, index: 31
// 匹配项: <b>important</b>, index: 62