How to Find Mean Mode and Median in Python for Data Science

When you want to summarize a data set, a natural starting point is to calculate its mean (or average), median, and mode. These three values are measures of central tendency, and computing them is often the first step in exploring and understanding data. In this tutorial, you will learn how to compute the mean, median, and mode of a data set in Python, first with hand-written code and then with the built-in statistics module.
Mean, Median, and Mode

Let us first understand what the mean, median, and mode are.

  • Mean: The mean is the average of all the numbers, also called the arithmetic mean. To find it, add all the numbers and divide the sum by how many numbers there are. Suppose you have five numbers (2, 4, 3, 7, 9). To find their mean, add them (2 + 4 + 3 + 7 + 9 = 25) and divide the sum by 5 (because there are five numbers), which gives 5.
  • Median: The median is the middle value in a group of numbers once they are sorted in ascending or descending order. If there is an odd quantity of numbers, the median is the value in the middle, with the same amount of numbers before and after it. Suppose we have 2, 3, 4, 5, 6; then 4 is the median. If there is an even quantity of numbers, the median is the average of the two middle values. For 2, 3, 4, 5, the median is (3 + 4) / 2 = 3.5.
  • Mode: The mode is the number that occurs most often in a group of numbers. There can be more than one mode, or no mode at all if every number occurs only once. Suppose we have 3, 4, 7, 4, 2, 8, 6, 2. Here there are two modes, 4 and 2, because each occurs twice.

Program to find Mean, Median, and Mode without using Libraries:

Mean:

numb = [2, 3, 5, 7, 8]
no = len(numb)
summ = sum(numb)
mean = summ / no
print("The mean or average of all these numbers (", numb, ") is", str(mean))

Output:

The mean or average of all these numbers ( [2, 3, 5, 7, 8] ) is 5.0

Explanation:

In this program, we take a list named numb that holds five numbers. We then create another variable (no) that stores the length of numb, obtained with len(). The sum() function adds up all the values in the list, and the result is stored in the summ variable. To find the mean, we divide summ by no, the number of elements in the list. Finally, we print the mean value.
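The same steps can be wrapped in a small reusable function. This is only a sketch: the function name mean and the empty-input check are additions for illustration and are not part of the program above.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if not values:
        # Dividing by len([]) would raise ZeroDivisionError, so fail early
        # with a clearer message instead.
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


print(mean([2, 3, 5, 7, 8]))  # 5.0
print(mean([2, 4, 3, 7, 9]))  # 5.0
```

Checking for an empty list first is a design choice: without it, the division fails with a less helpful error.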

Median:

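numb = [2, 4, 5, 8, 9]
no = len(numb)
numb.sort()  # the median is defined on sorted data
if no % 2 == 0:
    # Even count: average the two middle values
    median1 = numb[no//2]
    median2 = numb[no//2 - 1]
    median = (median1 + median2)/2
else:
    # Odd count: take the single middle value
    median = numb[no//2]
print("The median of the given numbers  (", numb, ") is", str(median))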

Output:

The median of the given numbers  ( [2, 4, 5, 8, 9] ) is 5

Explanation:

In this program, we take a list named numb that holds five numbers. We then create another variable (no) that stores the length of numb, obtained with len(). Next, sort() sorts the numbers in numb in place. We then check whether no is even or odd. If it is even, there are two middle values: numb[no//2], stored in median1, and numb[no//2 - 1], stored in median2, where // is floor division. The median is the average of these two values, (median1 + median2) / 2. If no is odd, the else branch applies and the median is simply the middle element, numb[no//2]. Finally, we print the calculated median. Here no is 5, so the median is numb[2], which is 5.
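The program above only exercises the odd-length branch. As a hedged sketch, the same logic can be written as a function and tried on an even-length list as well. The function name median and the use of sorted() instead of sort() are choices made for this example, not part of the original program.

```python
def median(values):
    """Return the median of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("median() requires at least one value")
    ordered = sorted(values)  # sorted() leaves the caller's list unchanged
    mid = len(ordered) // 2
    if len(ordered) % 2 == 0:
        # Even count: average the two middle values.
        return (ordered[mid - 1] + ordered[mid]) / 2
    # Odd count: the single middle value.
    return ordered[mid]


print(median([2, 4, 5, 8, 9]))  # 5
print(median([9, 2, 8, 4]))     # 6.0
```

For [9, 2, 8, 4], the sorted list is [2, 4, 8, 9], so the two middle values are 4 and 8 and their average is 6.0.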

Mode:

from collections import Counter
numb = [2, 3, 4, 5, 7, 2]
no = len(numb)
val = Counter(numb)  # maps each number to how often it occurs
highest = max(val.values())  # the largest count, computed once
# Keep every number whose count equals the largest count
mode = [i for i, v in val.items() if v == highest]
if len(mode) == no:
    # Every number occurs exactly once, so nothing stands out
    result = "The group of numbers does not have any mode"
else:
    result = "The mode(s) of the numbers: " + ', '.join(map(str, mode))
print(result)

Output:

The mode(s) of the numbers: 2

Explanation:

First, we import the Counter class from the collections module, which is part of Python's standard library. We take a list named numb that holds six numbers and store its length in no using len(). A Counter is a dictionary-like container that maps each element to the number of times it occurs, so val holds the count of every number in numb. We then find the largest count with max() and store it in highest. The list comprehension iterates over the (number, count) pairs in val and keeps every number whose count equals highest; these are the modes. The if condition then checks whether the length of mode equals no. If it does, every number occurs exactly once, so there is no repetition in the list and we store the string "The group of numbers does not have any mode". Otherwise, we build the result by joining the modes with commas and appending them to the string "The mode(s) of the numbers: ". Finally, we print the result.
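The mode logic can also be packaged as a function that returns a list, which makes the "more than one mode" and "no mode" cases easier to see. This is a sketch under assumptions: the function name modes and the convention of returning an empty list when no value repeats are chosen for illustration.

```python
from collections import Counter


def modes(values):
    """Return a list of the most frequent values, or [] if none repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs exactly once, so no value stands out as a mode.
        return []
    return [value for value, count in counts.items() if count == highest]


print(modes([2, 3, 4, 5, 7, 2]))        # [2]
print(modes([3, 4, 7, 4, 2, 8, 6, 2]))  # [4, 2]
print(modes([1, 2, 3]))                 # []
```

The modes come back in the order in which they first appear in the input, because Counter keeps insertion order.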

Program to find Mean, Median, and Mode using pre-defined library:

Statistics Module:

Calculating the mean, median, and mode is a common task for data analysts and data scientists. For that reason, Python includes this functionality in the statistics module of its standard library to make the task easier.

The statistics module contains various pre-defined functions for handling data, some of which are shown below.

To find the mean, the method is:

import statistics
statistics.mean([5, 3, 6, 8, 9, 12, 5])

To find the median, the method is:

import statistics
statistics.median([5, 3, 6, 8, 9, 12, 5])

To find the mode, the method is:

import statistics
statistics.mode([5, 3, 6, 8, 9, 12, 5])
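The calls above return values but do not print them when run as a script. A minimal sketch that prints the results for the same list is shown here. The variable names data and tied are introduced for this example, and the last two lines assume Python 3.8 or newer, where statistics.multimode() is available and statistics.mode() returns the first mode encountered when several values tie.

```python
import statistics

data = [5, 3, 6, 8, 9, 12, 5]

print(statistics.mean(data))    # 6.857142857142857
print(statistics.median(data))  # 6
print(statistics.mode(data))    # 5

# A list with two equally frequent values (4 and 2 each occur twice).
tied = [3, 4, 7, 4, 2, 8, 6, 2]
print(statistics.mode(tied))       # 4, the first mode encountered
print(statistics.multimode(tied))  # [4, 2], every value tied for the top count
```

If you need every mode rather than just one, multimode() is the safer choice.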

Conclusion:

The mean (or average), the median, and the mode are usually the first things data analysts look at in any sample when trying to get a sense of where the data is centered. Writing the calculations by hand is a good way to understand how they work, but when the surrounding code is already complicated or you need results quickly, the statistics module is the better option.