String comparison is an essential feature of many applications. Many software and database systems need to check whether two strings are equal or to determine how they differ. In this article, you will learn about string comparison and the different ways to compare strings in Python.
What is string comparison?
String comparison is the process of checking two strings against each other. The two strings act as the operands or parameters of the comparison. In most cases, the comparison is based on the ASCII or Unicode value of each character. There are five different programming approaches we can use to compare two strings in Python. Let us now discuss each of them in detail.
Method 1: Using Relational Operator
Relational operators compare two values. Since they are binary operators, they can also be used to compare strings in Python. A relational operator applied to two operands returns either True or False depending on the condition. These operators are also called comparison operators in Python. There are six comparison operators in Python.
| Operator | Description |
|---|---|
| == (Equal to) | Checks whether both operands are equal |
| > (Greater than) | Checks whether the left-hand side operand is greater than the right-hand side operand |
| < (Less than) | Checks whether the left-hand side operand is less than the right-hand side operand |
| >= (Greater than or equal to) | Checks whether the left-hand side operand is greater than or equal to the right-hand side operand |
| <= (Less than or equal to) | Checks whether the left-hand side operand is less than or equal to the right-hand side operand |
| != (Not equal to) | Checks whether the operands are not equal |
When a relational operator is applied to strings, Python compares the Unicode values (code points) of the characters one position at a time, starting from index zero. The first position where the characters differ decides the result. If one string runs out before any difference is found, the shorter string is considered smaller, and the strings are equal only if all characters match and the lengths are the same. Based on this check, a Boolean value is returned.
    print("Karlos" == "Karlos")
    print("Karlos" < "karlos")
    print("Karlos" > "karlos")
    print("Karlos" != "Karlos")

Output:

    True
    True
    False
    False
This simple program uses relational operators to compare the left-hand side operand with the right-hand side operand. The operators compare the Unicode values of the characters in the two strings and return True if the condition holds; otherwise, they return False. Since the code point of "K" is lower than that of "k", "Karlos" < "karlos" is True. The print() function displays True or False for each comparison.
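To see why "Karlos" < "karlos" holds, it can help to look at the code points directly. The following is a minimal sketch using the built-in ord() function; the helper name first_difference is hypothetical and exists only for illustration.

```python
# Uppercase letters have smaller code points than lowercase letters.
print(ord("K"), ord("k"))   # 75 107

def first_difference(left, right):
    """Return the index of the first position where the strings differ,
    or None if one string is a prefix of the other (or they are equal)."""
    for index, (a, b) in enumerate(zip(left, right)):
        if a != b:
            return index
    return None

# The comparison is decided at the first differing character.
print(first_difference("Karlos", "karlos"))   # 0
print("Karlos" < "karlos")                    # True, because 75 < 107

# When one string is a prefix of the other, the shorter one is smaller.
print(first_difference("Karl", "Karlos"))     # None
print("Karl" < "Karlos")                      # True
```

This ordering by code point is also why all uppercase letters sort before all lowercase letters in a plain string comparison.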
Method 2: Using is and is not (Identity) operator
In Python, the == operator compares the values of its two operands when checking for equality. Python's is operator (an identity operator), on the other hand, checks whether both operands refer to the same object. The same distinction holds between the != and is not operators.
    val1 = "Karlos"
    val2 = "Karlos"
    val3 = val1          # val3 refers to the same object as val1
    valn = "karlos"

    # id() returns the identity of an object; hex() shows it in hexadecimal
    print("The ID of val1 is:", hex(id(val1)))
    print("The ID of val2 is:", hex(id(val2)))
    print("The ID of val3 is:", hex(id(val3)))
    print("The ID of valn is:", hex(id(valn)))

    print(val1 is val1)
    print(val1 is val2)
    print(val1 is val3)
    print(valn is val1)

Output (the IDs will differ on your machine):

    The ID of val1 is: 0x21d012c4f70
    The ID of val2 is: 0x21d012c4f70
    The ID of val3 is: 0x21d012c4f70
    The ID of valn is: 0x21d012c7cb0
    True
    True
    True
    False
Here we use the identity operator to compare the strings. We have declared four variables that hold string values. The variables val1 and val2 hold “Karlos”, and val3 is assigned val1, so it refers to the same object as val1. The last variable, valn, holds the string “karlos”. To see which variables refer to the same object, we use hex(id()) to fetch and display the object ID of each variable.
You will notice that the IDs of the first three variables are the same. val3 shares the ID of val1 because it was assigned from it, and val2 shares it because CPython stores identical string literals only once (a space optimization known as interning). This is an implementation detail, so equal strings are not guaranteed to be the same object. The valn object has a different ID because it holds a different string, so valn is val1 is False. This is how the identity operator can be used with strings; when you need to know whether two strings hold the same value, == is the reliable choice.
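The matching IDs in the output above depend on how the interpreter stores string literals. As a minimal sketch, assuming CPython, the snippet below builds an equal string at run time to show that two strings with the same value can still be separate objects. The variable names are illustrative.

```python
literal = "Karlos"
# Build an equal string at run time instead of writing it as a literal.
built = "".join(["Kar", "los"])

# The values are equal, so == reports a match.
print(literal == built)   # True

# Identity depends on the interpreter; in CPython these are typically
# two separate objects, so this is expected to print False.
print(literal is built)
```

For this reason, == is the usual choice when the question is whether two strings contain the same text, while is fits checks such as `value is None`.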
Method 3: Case-Insensitive String Comparison
In the previous sections, the strings had to match exactly. To perform case-insensitive comparisons, we can use the lower() or upper() method, both of which are available on Python's string objects. The upper() method converts the entire string to uppercase, whereas lower() converts the entire string to lowercase.
    listOfCities = ["Mumbai", "Bangaluru", "Noida"]
    currCity = "noiDa"
    for loc in listOfCities:
        print("Case-Insensitive Comparison: %s with %s: %s" % (loc, currCity, loc.lower() == currCity.lower()))

Output:

    Case-Insensitive Comparison: Mumbai with noiDa: False
    Case-Insensitive Comparison: Bangaluru with noiDa: False
    Case-Insensitive Comparison: Noida with noiDa: True
In this program, we have a list of strings with three values and another variable, currCity, which stores the string noiDa. We iterate over the list (listOfCities) to check whether currCity matches any of the strings. We call lower() on both strings to convert them to lowercase and then use the == operator to compare both operands.
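For text that goes beyond plain English letters, lower() may not be enough. Python strings also provide a casefold() method intended for caseless matching. The sketch below uses the German word "Straße" as an illustrative example that is not part of the program above.

```python
german = "Straße"
plain = "STRASSE"

# lower() keeps the "ß" character, so the strings still differ.
print(german.lower() == plain.lower())        # False: "straße" vs "strasse"

# casefold() maps "ß" to "ss", so the strings compare equal.
print(german.casefold() == plain.casefold())  # True: both become "strasse"
```

For city names like the ones in listOfCities, lower() and casefold() give the same result, so the choice only matters for such special characters.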
Method 4: Using user-defined function
Apart from the above techniques, we can also create our own user-defined function using the def keyword and examine the characters of both strings with relational operators. The function takes the two strings to compare as parameters. In the example below, two strings are treated as matching when they contain the same number of digit characters, and the function returns True in that case.
    def strcmpr(strg, strgg):
        # Count the digit characters ("0" to "9") in each string
        cnt1 = 0
        cnt2 = 0
        for ch in strg:
            if "0" <= ch <= "9":
                cnt1 += 1
        for ch in strgg:
            if "0" <= ch <= "9":
                cnt2 += 1
        # The strings "match" when their digit counts are equal
        return cnt1 == cnt2

    print('Compare String 246 and 2468:', strcmpr("246", "2468"))
    print('Compare String KARLOS and karlos:', strcmpr("KARLOS", "karlos"))

Output:

    Compare String 246 and 2468: False
    Compare String KARLOS and karlos: True
This is another, more manual way of comparing strings in Python. The user-defined function walks through each string one character at a time, counts the digits, and returns True if both counts are equal or False otherwise. Note that it does not compare letters or their case at all: neither "KARLOS" nor "karlos" contains a digit, so both counts are zero and the second strcmpr() call returns True.
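A user-defined function can also compare the strings character by character while ignoring case. The following is a minimal sketch of that idea, not a replacement for strcmpr(); the function name equal_ignore_case is hypothetical.

```python
def equal_ignore_case(first, second):
    """Return True if both strings match character by character,
    ignoring upper and lower case."""
    # Strings of different lengths can never match.
    if len(first) != len(second):
        return False
    # Compare the characters pairwise in lowercase form.
    for a, b in zip(first, second):
        if a.lower() != b.lower():
            return False
    return True

print(equal_ignore_case("KARLOS", "karlos"))  # True
print(equal_ignore_case("246", "2468"))       # False
print(equal_ignore_case("Karlos", "Carlos"))  # False
```

Unlike the digit-counting approach, this version returns False for "Karlos" and "Carlos", because the first characters differ.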
Method 5: Using Regular Expression
A regex, or regular expression, defines a search pattern for text. Here, we use a regular expression to look for a pattern in each of the strings being compared. Python supports regular expressions through the re module. We use the compile() function of the re module to build the pattern and its search() method to check each string.
    import re

    stateList = ["Madhya Pradesh", "Tamil Nadu", "Uttar Pradesh", "Punjab"]
    pattern = re.compile("[Pp]radesh")
    for loc in stateList:
        if pattern.search(loc):
            print("%s is matching with the search pattern" % loc)

Output:

    Madhya Pradesh is matching with the search pattern
    Uttar Pradesh is matching with the search pattern
Here, we first import the re (regular expression) module. Then, we define a list containing the names of four states. Next, we use re.compile() to build a pattern that matches “Pradesh” with the ‘p’ in either lowercase or uppercase. The for loop iterates through stateList and, whenever pattern.search() finds a match in loc, prints the message “<value> is matching with the search pattern”.
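Note that search() succeeds when the pattern occurs anywhere in the string. To compare whole strings with a regular expression, one option is fullmatch() together with the re.IGNORECASE flag. The sketch below assumes that approach; the helper name matches_ignore_case is hypothetical.

```python
import re

def matches_ignore_case(candidate, target):
    """Return True if candidate equals target, ignoring case."""
    # re.escape treats target as literal text, not as a pattern.
    pattern = re.compile(re.escape(target), re.IGNORECASE)
    # fullmatch() succeeds only if the entire string matches.
    return pattern.fullmatch(candidate) is not None

print(matches_ignore_case("noiDa", "Noida"))            # True
print(matches_ignore_case("Uttar Pradesh", "pradesh"))  # False: only part matches
```

The second call returns False because fullmatch() requires the whole string to match, whereas search() would have found "Pradesh" inside "Uttar Pradesh".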
Among all of these, the relational operators are the simplest and most efficient way of comparing string values. The identity operator is also fast, but it only tells you whether two names refer to the same object. Some competitive exams might ask you to apply regular expressions or compare strings case-insensitively. In that case, go with method 3 and method 5, but keep in mind that they are not as efficient. If you simply want to compare a count taken from each string, such as the number of digits, then method 4 (the user-defined technique) is for you.