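Introduction to the re library
re is the Python standard library module for processing text with regular expressions.
The re library mainly defines:
- 9 constants
- 12 functions
- 1 exception
Using the re library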
import re
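re library constants
Constants in the re library are values that are not meant to be changed and are generally used as flags. The re module has 9 constants, all of which have int values:
- re.ASCII or re.A
- re.IGNORECASE or re.I
- re.LOCALE or re.L
- re.UNICODE or re.U
- re.MULTILINE or re.M
- re.DOTALL or re.S
- re.VERBOSE or re.X
- re.TEMPLATE or re.T
- re.DEBUG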
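Because the flag values behave like ints, several of them can be combined with the bitwise OR operator and passed as a single flags argument. A minimal sketch, using made-up sample text and variable names that are not from the original article:

```python
import re

# Each flag and its short alias refer to the same object.
same_object = re.IGNORECASE is re.I

# Flags are int-compatible, so they can be combined with |.
combined = re.I | re.M

# Hypothetical sample text with one lowercase and one uppercase line start.
sample = "first line\nSECOND line"

# ^ matches at every line start (re.M) and letter case is ignored (re.I).
words = re.findall(r"^[a-z]+", sample, combined)

print(same_object)                # True
print(isinstance(combined, int))  # True
print(words)                      # ['first', 'SECOND']
```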
re library source code
class RegexFlag(enum.IntFlag):
    ASCII = A = sre_compile.SRE_FLAG_ASCII # assume ascii "locale"
    IGNORECASE = I = sre_compile.SRE_FLAG_IGNORECASE # ignore case
    LOCALE = L = sre_compile.SRE_FLAG_LOCALE # assume current 8-bit locale
    UNICODE = U = sre_compile.SRE_FLAG_UNICODE # assume unicode "locale"
    MULTILINE = M = sre_compile.SRE_FLAG_MULTILINE # make anchors look for newline
    DOTALL = S = sre_compile.SRE_FLAG_DOTALL # make dot match newline
    VERBOSE = X = sre_compile.SRE_FLAG_VERBOSE # ignore whitespace and comments
    # sre extensions (experimental, don't rely on these)
    TEMPLATE = T = sre_compile.SRE_FLAG_TEMPLATE # disable backtracking
    DEBUG = sre_compile.SRE_FLAG_DEBUG # dump pattern after compilation
    def __repr__(self):
        if self._name_ is not None:
            return f're.{self._name_}'
        value = self._value_
        members = []
        negative = value < 0
        if negative:
            value = ~value
        for m in self.__class__:
            if value & m._value_:
                value &= ~m._value_
                members.append(f're.{m._name_}')
        if value:
            members.append(hex(value))
        res = '|'.join(members)
        if negative:
            if len(members) > 1:
                res = f'~({res})'
            else:
                res = f'~{res}'
        return res
    __str__ = object.
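The effect of this __repr__ can be observed from the interpreter. The sketch below assumes Python 3.7 or later, where flags are displayed with the re. prefix. The exact text for a combined flag may differ between versions, so only the presence of the member names is checked; the variable names are illustrative.

```python
import re

# repr of a single named flag.
single = repr(re.IGNORECASE)

# repr of a combination that has no single name of its own.
combined = repr(re.IGNORECASE | re.MULTILINE)

print(single)
print(combined)

# Both member names should appear in the combined representation.
has_both = "re.IGNORECASE" in combined and "re.MULTILINE" in combined
print(has_both)
```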
re.IGNORECASE usage
- Syntax:
re.IGNORECASE or re.I
- Purpose:
- Match without regard to letter case
- Code:
text = "Hello World."
pattern = r"hello world\."  # lowercase pattern, so the two modes give different results
print("Default mode: ", re.findall(pattern, text))
print("Ignore-case mode: ", re.findall(pattern, text, re.I))
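The same flag can also be supplied when compiling a pattern, or written inline as (?i) at the start of the pattern. A small sketch with illustrative sample text (the variable names are not from the original):

```python
import re

sample = "Python python PYTHON"

# Flag passed to re.compile; the compiled pattern remembers it.
compiled = re.compile(r"python", re.IGNORECASE)
via_compile = compiled.findall(sample)

# Inline form: (?i) at the start of the pattern enables the same flag.
via_inline = re.findall(r"(?i)python", sample)

print(via_compile)  # ['Python', 'python', 'PYTHON']
print(via_inline)   # ['Python', 'python', 'PYTHON']
```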
re.ASCII usage
- Syntax:
re.ASCII or re.A
- Purpose:
- Make \w, \W, \b, \B, \d, \D, \s and \S match only ASCII characters rather than the full Unicode range
- Code:
text = "a测试b测试c"
pattern = r"\w+"
print("Unicode:", re.findall(pattern, text))
print("ASCII:", re.findall(pattern, text, re.A))
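The flag affects \d in the same way as \w. As a sketch, the sample string below (an assumption for illustration) mixes ASCII digits with fullwidth digits, which Unicode also classifies as decimal digits:

```python
import re

# "123" in ASCII digits, then "123" in fullwidth digits (U+FF11..U+FF13).
sample = "123 \uff11\uff12\uff13"

# Default (Unicode) matching accepts both runs of digits.
unicode_digits = re.findall(r"\d+", sample)

# With re.ASCII, \d is limited to the characters 0-9.
ascii_digits = re.findall(r"\d+", sample, re.ASCII)

print(unicode_digits)
print(ascii_digits)  # ['123']
```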
re.DOTALL usage
- Syntax:
re.DOTALL or re.S
- Purpose:
- Make . match every character, including the newline
- Code:
text = "测试\n测试"
pattern = r".*"
print("Default mode:", re.findall(pattern, text))
print("DOTALL mode:", re.findall(pattern, text, re.S))
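With the pattern .*, findall also reports empty matches at the positions where the dot stops, which can obscure the effect of the flag. A sketch using re.match on illustrative sample text shows the difference more directly:

```python
import re

sample = "first\nsecond"

# By default . stops at the newline, so only the first line is matched.
default_match = re.match(r".*", sample).group()

# With re.DOTALL, . also matches "\n", so the whole string is matched.
dotall_match = re.match(r".*", sample, re.DOTALL).group()

print(repr(default_match))  # 'first'
print(repr(dotall_match))   # 'first\nsecond'
```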