Role of Regex in String Manipulation

Sushma Kudipudi
2 min read · Mar 16, 2021

This introductory article shows the power of regular expressions (regex) through an example.

A regular expression is a sequence of characters that specifies a search pattern. It can be used to find matches in a file (pattern matching), to extract data from a website (web scraping), or to pre-process text (data cleaning). Regex makes many string manipulation tasks easy, and knowing it is extremely useful in data science.

A simple analogy: using regex is like using the grep command to search for a pattern in files.
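To make the grep analogy concrete, here is a minimal sketch of a grep-like filter in Python. The sample lines and the pattern are assumptions chosen for illustration, not taken from a real file.

```python
import re

# Hypothetical lines standing in for the contents of a file.
lines = [
    "173.140.74.179 - rippin3809",
    "this line has no address",
]

# Keep only the lines in which the pattern is found,
# much as grep prints only the matching lines.
pattern = re.compile(r"\d+\.\d+\.\d+\.\d+")
matching_lines = [line for line in lines if pattern.search(line)]
print(matching_lines)
```

Only the first line contains four dot-separated numbers, so it is the only one kept.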

A regex cheat sheet from a good source helps in recalling the syntax. I used the one at https://www.rexegg.com/regex-quickstart.html

Let's say we have a file with a list of IP addresses, each followed by the user ID of the user, in the format below.

Example: '173.140.74.179 - rippin3809'

If we are interested in collecting the users' IP addresses, our regex would be
'(.*)\s-\s'

Figure: search for IP address

Here, Python's re module provides the regular expression functions.

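( ) captures a group: whatever the pattern inside the parentheses matches is saved.
.* matches zero or more characters of any kind except a line break: . is any character other than a line break, and * repeats it zero or more times. Since each entry in the file is on its own line, the match stays within one line, and the group captures only the IP address that precedes the separator.
\s matches a whitespace character.
- matches the literal hyphen from the file's format, and it is followed by another \s for the next whitespace character.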
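The pieces above can be put together in a short Python sketch. It uses the article's pattern with re.findall; the variable names and the second sample entry are assumptions added for illustration, and the text is held in a string rather than read from a file.

```python
import re

# Sample text in the article's format. The first entry is the article's example;
# the second is a made-up entry added for illustration.
log_text = """173.140.74.179 - rippin3809
10.0.0.1 - example_user"""

# The pattern from the article: capture everything on the line before " - ".
pattern = re.compile(r"(.*)\s-\s")

# With a single capture group, findall returns the group's text for each match.
ip_addresses = pattern.findall(log_text)
print(ip_addresses)  # ['173.140.74.179', '10.0.0.1']
```

This pattern captures whatever precedes the separator and does not check that it is a well-formed IP address. A stricter pattern would be needed if the file could contain other kinds of lines.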

Thank you for reading the article.
