
Python Regular Expressions

wangzf / 2023-01-09



Introduction to the re Library

The re library is Python's standard-library module for processing text with regular expressions.

The Python re library mainly defines constants, functions, exceptions, and the compiled pattern object.

Using the re Library

import re
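
As a first taste of the module, the sketch below searches a string for a date-like substring with re.search. The sample sentence and the pattern are made up for illustration and are not part of the original article.

```python
import re

# Hypothetical sample text, used only for this illustration.
sentence = "Posted on 2023-01-09 by wangzf"

# re.search scans the string and returns the first match, or None.
match = re.search(r"\d{4}-\d{2}-\d{2}", sentence)
if match:
    print(match.group())  # 2023-01-09
    print(match.span())   # (10, 20)
```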

re Library Constants

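Constants in the re library are values that are not meant to be changed, and they are generally used as flags. In the version of the source shown below, the re module defines 9 such constants. They are members of an enum.IntFlag class, so each value is an int: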
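
Because the flag values behave like integers, several flags can be combined with the bitwise OR operator and passed as a single flags argument. The following is a minimal sketch; the sample text and variable names are assumptions made for illustration.

```python
import re

# Hypothetical sample text, used only for this illustration.
text = "first line\nSECOND line"

# Each flag is an int subclass, so flags combine with bitwise OR.
flags = re.IGNORECASE | re.MULTILINE
print(isinstance(re.IGNORECASE, int))  # True

# MULTILINE lets ^ match at the start of every line;
# IGNORECASE lets "second" match "SECOND".
matches = re.findall(r"^second", text, flags)
print(matches)  # ['SECOND']
```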

re Library Source Code

class RegexFlag(enum.IntFlag):
    ASCII = A = sre_compile.SRE_FLAG_ASCII  # assume ascii "locale"
    IGNORECASE = I = sre_compile.SRE_FLAG_IGNORECASE  # ignore case
    LOCALE = L = sre_compile.SRE_FLAG_LOCALE  # assume current 8-bit locale
    UNICODE = U = sre_compile.SRE_FLAG_UNICODE  # assume unicode "locale"
    MULTILINE = M = sre_compile.SRE_FLAG_MULTILINE  # make anchors look for newline
    DOTALL = S = sre_compile.SRE_FLAG_DOTALL  # make dot match newline
    VERBOSE = X = sre_compile.SRE_FLAG_VERBOSE  # ignore whitespace and comments
    # sre extensions (experimental, don't rely on these)
    TEMPLATE = T = sre_compile.SRE_FLAG_TEMPLATE  # disable backtracking
    DEBUG = sre_compile.SRE_FLAG_DEBUG  # dump pattern after compilation

    def __repr__(self):
        if self._name_ is not None:
            return f're.{self._name_}'
        value = self._value_
        members = []
        negative = value < 0
        if negative:
            value = ~value
        for m in self.__class__:
            if value & m._value_:
                value &= ~m._value_
                members.append(f're.{m._name_}')
        if value:
            members.append(hex(value))
        res = '|'.join(members)
        if negative:
            if len(members) > 1:
                res = f'~({res})'
            else:
                res = f'~{res}'
        return res
    __str__ = object.__str__
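
To see what the __repr__ method above produces, one can print the repr of a single flag and of a combination. This is only a small sketch; the exact text for combined flags may differ between Python versions, so no exact output is claimed for that case.

```python
import re

single = repr(re.IGNORECASE)
combined = repr(re.IGNORECASE | re.DOTALL)

print(single)    # re.IGNORECASE
print(combined)  # a single string containing both flag names
```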

Using re.IGNORECASE

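text = "Hello World."
# Lowercase pattern so that case sensitivity matters; "\." matches a literal dot
pattern = r"hello world\."
print("Default mode:", re.findall(pattern, text))  # []
print("Ignore-case mode:", re.findall(pattern, text, re.I))  # ['Hello World.']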
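
The same flag can also be supplied once when compiling a pattern with re.compile, or written inline as (?i) at the start of the pattern. The sketch below assumes a lowercase pattern chosen for illustration.

```python
import re

text = "Hello World."

# Pass the flag at compile time and reuse the compiled pattern.
compiled = re.compile(r"hello world\.", re.IGNORECASE)
compiled_result = compiled.findall(text)
print(compiled_result)  # ['Hello World.']

# The inline form (?i) at the start of the pattern has the same effect.
inline_result = re.findall(r"(?i)hello world\.", text)
print(inline_result)  # ['Hello World.']
```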

Using re.ASCII

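text = "a测试b测试c"
pattern = r"\w+"
# In Unicode mode (the default for str patterns), \w also matches Chinese characters
print("Unicode:", re.findall(pattern, text))  # ['a测试b测试c']
# With re.A, \w matches only [a-zA-Z0-9_]
print("ASCII:", re.findall(pattern, text, re.A))  # ['a', 'b', 'c']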
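
re.ASCII affects \d in the same way as \w. As an illustrative sketch (the sample text is an assumption), full-width digits count as digits in Unicode mode but not in ASCII mode:

```python
import re

# Sample text mixing full-width digits (１２３) with ASCII digits (456).
text = "１２３ 456"

unicode_digits = re.findall(r"\d+", text)
ascii_digits = re.findall(r"\d+", text, re.ASCII)
print("Unicode:", unicode_digits)  # Unicode: ['１２３', '456']
print("ASCII:", ascii_digits)      # ASCII: ['456']
```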

Using re.DOTALL

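text = "测试\n测试"
pattern = r".*"
# By default "." does not match a newline, so each line is matched separately
print("Default mode:", re.findall(pattern, text))  # ['测试', '', '测试', '']
# With re.S (DOTALL), "." also matches the newline
print("DOTALL mode:", re.findall(pattern, text, re.S))  # ['测试\n测试', '']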
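
A common practical use of re.DOTALL is capturing content that spans several lines between two markers. The tag-like markers and sample text below are hypothetical and serve only as a sketch.

```python
import re

# Hypothetical sample text whose content spans two lines.
text = "<p>line one\nline two</p>"
pattern = r"<p>(.*?)</p>"

# Without DOTALL, "." stops at the newline, so nothing is captured.
default_result = re.findall(pattern, text)
print(default_result)  # []

# With DOTALL, "." crosses the newline and the whole body is captured.
dotall_result = re.findall(pattern, text, re.S)
print(dotall_result)  # ['line one\nline two']
```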

re Library Functions

re Library Exceptions

The Pattern Object
