In this article, I will show you how to use the Python regular expression module (the re module) to parse a string and return either the first matched string or all matched strings.
1. Steps To Parse A String With The Python Regular Expression Module.
- Import the re module.
# import the python regular expression module.
import re
- Create a re.Pattern object by invoking the re.compile function. Pass the pattern format string to re.compile as shown below.
# create the pattern format string.
pattern_format_string = r'\d\d\d-\d\d\d\d\d\d\d\d'
# create the re.Pattern object using the re.compile function.
reg_pattern = re.compile(pattern_format_string)
- To get the first matched string, invoke the re.Pattern object's search method as shown below. The search method returns None when nothing in the string matches.
# invoke the re.Pattern object's search method and pass it the string to be parsed.
search_result = reg_pattern.search(string)
# get the first matched string from the result by invoking its group method.
print('search result: ' + search_result.group())
- To get all matched strings, invoke the re.Pattern object's findall method as shown below.
# invoke the re.Pattern object's findall method.
find_all_result = reg_pattern.findall(string)
print('find all result: ' + str(find_all_result))
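The steps above can be combined into one small script. This is only a minimal sketch: the helper name first_and_all_matches and the sample text are made up for illustration, and the None check is an addition that covers the case where search finds no match.

```python
import re

# Same phone number pattern as in the steps above.
reg_pattern = re.compile(r'\d\d\d-\d\d\d\d\d\d\d\d')


def first_and_all_matches(pattern, text):
    """Hypothetical helper: return (first match or None, list of all matches)."""
    search_result = pattern.search(text)
    # search returns None when there is no match, so guard before calling group().
    first = search_result.group() if search_result is not None else None
    return first, pattern.findall(text)


# Sample text invented for this sketch.
sample_text = 'office: 010-12345678, home: 021-87654321'
print(first_and_all_matches(reg_pattern, sample_text))
print(first_and_all_matches(reg_pattern, 'no phone number here'))
```

Returning the values instead of printing them inside the helper keeps the two calls easy to reuse and to check.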
2. Python Regular Expression Examples.
This example has 2 functions: one invokes the re.Pattern object's search method, and the other invokes its findall method. See the code comments for a detailed explanation.
'''
Created on Sep 22, 2020
@author: songzhao
'''
# import the python regular expression package.
import re


def regexp_search_function(pattern, string):
    '''
    This function invokes the python regexp Pattern object's search method to get the first matched string.
    '''
    searched_result = pattern.search(string)
    # search returns None when nothing matches, so check before calling group().
    if searched_result is None:
        print('search phone number result: no match')
    else:
        print('search phone number result: ' + searched_result.group())


def regexp_find_all_function(pattern, string):
    '''
    This function invokes the python regexp Pattern object's findall method to get all the matched strings in a list.
    '''
    find_all_result = pattern.findall(string)
    print('find all phone number result: ' + str(find_all_result))


if __name__ == '__main__':
    # create a regexp pattern to match a phone number.
    phone_number_format = r'\d\d\d-\d\d\d\d\d\d\d\d'
    phone_number_pattern = re.compile(phone_number_format)
    # this string contains 3 phone numbers, but only the first 2 match the phone number pattern above.
    phone_number_string = 'phone_number_1: 010-88888889;phone_number_2:012-89877987; phone_number_3: 0893-898998'
    regexp_search_function(phone_number_pattern, phone_number_string)
    regexp_find_all_function(phone_number_pattern, phone_number_string)
    # create a regexp pattern that matches a phone number with groups; notice the parentheses in the pattern format string.
    phone_number_format_use_group = r'(\d\d\d)-(\d\d\d\d\d\d\d\d)'
    phone_number_pattern = re.compile(phone_number_format_use_group)
    # with groups in the pattern, findall returns a list of tuples, one tuple of group values per match.
    regexp_find_all_function(phone_number_pattern, phone_number_string)
Below is the execution result of the example above.
search phone number result: 010-88888889
find all phone number result: ['010-88888889', '012-89877987']
find all phone number result: [('010', '88888889'), ('012', '89877987')]
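As the last output line shows, findall returns only the group values when the pattern contains groups. The sketch below is one possible way to get both the whole match and its groups. It assumes a shortened version of the sample string, and the variable names match and full_and_groups are chosen here for illustration.

```python
import re

# Grouped phone number pattern from the example above.
phone_number_pattern = re.compile(r'(\d\d\d)-(\d\d\d\d\d\d\d\d)')
# Shortened sample string, kept to the two numbers that match the pattern.
phone_number_string = 'phone_number_1: 010-88888889;phone_number_2:012-89877987'

match = phone_number_pattern.search(phone_number_string)
if match is not None:
    print(match.group())    # the whole first match
    print(match.group(1))   # the first group (first 3 digits)
    print(match.group(2))   # the second group (last 8 digits)
    print(match.groups())   # all groups as a tuple

# finditer yields one match object per match, so the whole match and its groups stay together.
full_and_groups = [(m.group(), m.groups()) for m in phone_number_pattern.finditer(phone_number_string)]
print(full_and_groups)
```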