# HackerRank HackerRant - Mean, Median, and Mode in Python

*HackerRank is an excellent website to create code based on prompt challenges, prepare for coding interviews, search for jobs, and to see how the community has approached the solutions over time. The author wanted to dive into the Python focused solutions, and is in no way affiliated with HackerRank itself.*

## The Challenge: Mean, Median, Mode

From 10 Days of Statistics Day 0: Mean, Median, and Mode:

**Output Format:** Print 3 lines of output in the following order:

- Print the mean on a new line, to a scale of 1 decimal place.
- Print the median on a new line, to a scale of 1 decimal place.
- Print the mode on a new line; if more than one such value exists, print the numerically smallest one.

**Sample Input:** `10` on the first line, then `64630 11735 14216 99233 14470 4978 73429 38120 51135 67060` on the second.

**Sample Output:** `43900.6`, `44627.5`, and `4978`, each on its own line.

The top-voted Python 3 solution came out to be:

*Python 3 - Dont reinvent the wheel ;)*

`import numpy as np from scipy import stats size = int(input()) numbers = list(map(int, input().split())) print(np.mean(numbers)) print(np.median(numbers)) print(int(stats.mode(numbers)[0]))`

To those who were introduced to Python through data science courses and tools, this may look like the obvious solution. However, that is *only* the case if a project already includes the SciPy package.

### Wait, Why Could This Be Bad Practice?

The **scipy** and **numpy** packages are third-party libraries, and they would have to be added to a `requirements.txt`, `setup.py`, or `Pipfile` in order to make use of them in a project. This adds complexity by piling onto the software supply chain.[1]

Installing **scipy** (which includes installing **numpy** as a dependency) results in:

- Downloading roughly 45 MB of files (more than 3,000 files)
- Introducing potential for vulnerabilities in a project

Just this year, **numpy** had an Arbitrary Code Execution (ACE) vulnerability raised around how `numpy.load` was unpickling by default, which has since changed. The **pickle** module is known for this vulnerability risk, and has a big red warning about it in the Python docs.[2]

Using these third-party packages is overkill for a project that doesn't already contain the libraries. Otherwise, you sign up to follow long GitHub Issue conversations and *Common Vulnerabilities and Exposures (CVE)* database entries (such as CVE-2019-6446 in this case), trying to decipher how big a problem this is, if it is a problem at all.

## Using Standard Libraries

How can we solve this problem with standard libraries that come with Python?

```
# With standard lib imports only
from statistics import mean, median


def basicstats(numbers):
    print(round(mean(numbers), 1))
    print(median(numbers))
    # Sorting first makes max() return the smallest value among ties
    print(max(sorted(numbers), key=numbers.count))


input()  # Don't need array length, so ignore input
numbers = list(map(float, input().split()))
basicstats(numbers)
```
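To check the approach against the sample input without typing at a prompt, here is a minimal sketch. It assumes a hypothetical variant, `basic_stats`, that returns the three values instead of printing them. It also shows one detail worth knowing: when the input is parsed as `float`, the mode comes back as `4978.0`, while parsing as `int` gives `4978` as in the sample output.

```python
from statistics import mean, median


def basic_stats(numbers):
    """Return (mean, median, mode) for a non-empty list of numbers."""
    # Mean rounded to one decimal place, as the challenge asks.
    rounded_mean = round(mean(numbers), 1)
    middle = median(numbers)
    # Sorting first makes max() return the smallest value among ties.
    smallest_mode = max(sorted(numbers), key=numbers.count)
    return rounded_mean, middle, smallest_mode


sample = "64630 11735 14216 99233 14470 4978 73429 38120 51135 67060"
as_floats = [float(token) for token in sample.split()]
as_ints = [int(token) for token in sample.split()]

print(basic_stats(as_floats))  # (43900.6, 44627.5, 4978.0)
print(basic_stats(as_ints))    # (43900.6, 44627.5, 4978)
```

Whether the trailing `.0` matters depends on how strictly the output is compared, so treat the `int` variant as an option rather than a requirement.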

### Detailed Code Breakdown

```
from statistics import mean, median
```

- `statistics` has been included with Python 3 since Python 3.4 (released in 2014).
- We only want `mean` and `median` from this library, so we are explicitly importing each rather than importing the entire library.
- Why aren't we using `mode` from `statistics`? Because, through Python 3.7, `mode` raises an error when there is a tie: *"...if there is not exactly **one** most common value, `StatisticsError` is raised."*[3] Since Python 3.8 it returns the first mode encountered instead, which is still not necessarily the smallest.
  - This is a problem, due to the last requirement of the challenge for **mode** output: *"...if more than one such [mode] value exists, print the numerically smallest one."*
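For the tie-breaking requirement, here is a minimal sketch that assumes Python 3.8 or later, where `statistics.multimode` is available. The helper name `smallest_mode` is hypothetical, and the input is assumed to be non-empty.

```python
from statistics import multimode


def smallest_mode(numbers):
    """Return the numerically smallest of the most common values."""
    # multimode() returns every value tied for the highest count,
    # in the order each was first encountered.
    return min(multimode(numbers))


print(smallest_mode([3, 1, 3, 1, 2]))       # 1
print(smallest_mode([4978, 11735, 14216]))  # 4978 (every value appears once)
```

This stays within the standard library, but it will not run on Python 3.7 or earlier, so the `max()` approach in this article remains the more portable choice.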

```
input() # Don't need array length, so ignore input
numbers = list(map(float, input().split()))
```

- We do nothing with the first `input()`, which is meant to be a count of the numbers being input in the second prompt. It is dropped because it is not needed in order to produce the mean, median, and mode output.
- For `numbers`, let's start from the innermost parentheses and move outward:
  - `input().split()` breaks apart the single-string input into a list of strings, as `split()` defaults to whitespace as the *sep* delimiter: *"If **sep** is not specified or is `None`, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace. Consequently, splitting an empty string or a string consisting of just whitespace with a `None` separator returns `[]`."*[4]
  - `map(float, input().split())`: Here, `map()` is being used to convert the resulting list of strings into **float** type values.
  - `list(map(...))`: The reason we need to convert the *map* back into a *list* is that `map()` returns an *iterator*. This means we can only iterate over its elements *once*. If all we wanted was the median, for example, we wouldn't need to convert the *map* to a *list*, because we would not need the values again after the median is returned.
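The parsing steps above can be tried without a prompt. This is a small sketch that uses a hypothetical string, `raw`, in place of the text returned by `input()`. It shows how `split()` handles whitespace and why the `map` iterator is wrapped in `list()`.

```python
raw = "  64630 11735   14216 "  # stand-in for the text returned by input()

tokens = raw.split()  # no sep given: runs of whitespace act as one separator
print(tokens)  # ['64630', '11735', '14216']

lazy = map(float, tokens)  # an iterator; nothing has been converted yet
first_pass = list(lazy)    # consumes the iterator
second_pass = list(lazy)   # already exhausted, so this is empty
print(first_pass)   # [64630.0, 11735.0, 14216.0]
print(second_pass)  # []
```

Because the solution reads `numbers` several times (for the mean, the median, and each `count` call), a reusable list is needed rather than a one-shot iterator.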

**NOTE:** Instead of `list(map(...))`, we could use a list comprehension[5] like so: `numbers = [float(number) for number in input().split()]`

This is argued as a better approach on StackOverflow,[6] and if you are up for an interesting side note of history, you can read about how `map()` was nearly removed from Python 3 at one point.[7]

After we have our list of floats, `basicstats(numbers)` is called, running the following:

```
def basicstats(numbers):
    print(round(mean(numbers), 1))
    print(median(numbers))
    print(max(sorted(numbers), key=numbers.count))
```

- `print(round(mean(numbers), 1))`: Let's start from the innermost parentheses and move outward to see what we are printing out:
  - `mean(numbers)`: Simply returns the **mean** without a third-party package!
  - `round(mean(numbers), 1)` rounds the resulting float to one digit after the decimal point (per requirements).
- `print(median(numbers))`: Simply returns the **median** without a third-party package!
- `print(max(sorted(numbers), key=numbers.count))`: How is this providing the **mode**?
  - `sorted(numbers)`: First, we need the list sorted, as we are only meant to return the lowest-value mode if there is more than one. This is needed for `max(...)` to return the lowest value we want.
  - `max(sorted(numbers), key=numbers.count)`: Providing `key=numbers.count` as an argument ensures we get the value with the highest count within the list. `max()` only returns a single value, and when several values share the highest count it returns the first one it encounters, which is the lowest in the event of a tie (due to us using `sorted(numbers)`).
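To see the tie-breaking at work, here is a small sketch with made-up values in which `1.0` and `3.0` both appear twice. One trade-off to keep in mind: `numbers.count` rescans the list for every element, which is fine for challenge-sized input but grows quadratically with the list length.

```python
numbers = [3.0, 1.0, 3.0, 1.0, 2.0]

# max() keeps the first element that reaches the highest key value,
# so the order of the iterable decides which tied value wins.
print(max(numbers, key=numbers.count))          # 3.0 (first seen in input order)
print(max(sorted(numbers), key=numbers.count))  # 1.0 (smallest tied value)
```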

### Optional Approach to Retrieving Mode: Using `Counter()`

Instead of `max()`, we could alternately use `Counter()`[8] from `collections`, which is argued to be a better approach to this problem.[9] **Counter()** was added to the **collections** module way back with Python 2.7.0 (released in 2010):

```
# With standard lib imports only
from statistics import mean, median
from collections import Counter


def basicstats(numbers):
    print(round(mean(numbers), 1))
    print(median(numbers))
    # Optional approach to 'mode'
    print(Counter(sorted(numbers)).most_common(1)[0][0])


input()  # Don't need array length, so ignore input
numbers = list(map(float, input().split()))
basicstats(numbers)
```

`Counter(sorted(numbers)).most_common(1)[0][0]`, working from the inside out:

- `sorted(numbers)` is needed for the later call of `most_common()` to return the *lowest* mode.
- `Counter(...)`: Creates a dictionary-like object that maps each element in the list to its count.
- `Counter(...).most_common(1)`: Returns a *list* of *tuples*, each holding a value and its count. Using `1` as an argument means it returns only one *tuple*, being the *first* value that appears the most often.
- `Counter(...).most_common(1)[0][0]`: The first `[0]` takes the *tuple* in the `0` index position of the *list*, and the second `[0]` takes the `0` index value of that *tuple*, which is the mode itself.
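Here is a short sketch of the intermediate values, using made-up data with a tie. The tie-breaking relies on `most_common()` listing equal counts in first-encountered order, which the documentation describes for Python 3.7 and later. The last lines show an order-independent alternative; the names `counts`, `highest`, and `smallest_mode` are illustrative.

```python
from collections import Counter

numbers = [3, 1, 3, 1, 2]

counts = Counter(sorted(numbers))
print(counts.most_common(1))        # [(1, 2)] -> value 1 appears 2 times
print(counts.most_common(1)[0][0])  # 1

# Order-independent alternative: pick the smallest value among
# those that share the highest count.
highest = max(counts.values())
smallest_mode = min(value for value, count in counts.items() if count == highest)
print(smallest_mode)  # 1
```

Unlike `max(..., key=numbers.count)`, a `Counter` tallies every element in a single pass, which is one reason it is often preferred for longer lists.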

## Conclusion

There are many ways to come to a solution, and depending on the situation, some are better than others. If packages like **scipy** and/or **numpy** are already included within a project, it certainly makes sense to use them.

That said, it is a great idea to check whether built-in or standard libraries can solve a problem before looking into third-party solutions. This helps you:

- Learn what Python is capable of out-of-the-box
- Make your code more portable for use in other projects without installing additional resources
- Reduce the security complexity of the software supply chain[1] by avoiding unnecessary inclusion of third-party packages

Was this helpful? Have thoughts to add? Please add to the conversation on dev.to!

## Footnotes

[1] Software Supply Chain: Fewer, Better Suppliers. Written by Shannon Lietz @ DevSecOps, 2016.

[5] Comprehending Python's Comprehensions. Written by Dan Bader @ dbader.org.

[7] The Fate of `reduce()` in Python 3. Written by Guido van Rossum, 2005. *NOTE: He's the creator, and previous BDFL, of Python. The article includes thoughts on **map()**, **filter()**, and **lambda**.*

[9] StackOverflow: Python - Find The Item with Maximum Occurrences in A List.