# Statistical Functions

NumPy provides many statistical functions for computing quantities such as the minimum, maximum, median, mean, and variance of an array, either over the whole array or along a chosen axis.

**1.) Order Statistics**

a.) **numpy.amin():** Returns the minimum of an array, or the minimum along a given axis.

b.) **numpy.amax():** Returns the maximum of an array, or the maximum along a given axis.

c.) **numpy.percentile():** Returns the qth percentile of the data along the specified axis. It gives the same result as the median when q=50, the minimum when q=0, and the maximum when q=100.

`import numpy as np`

```
a = np.arange(6).reshape((3,2))
a
```

#### amin() function to find the minimum in an array

#### amax() works the same way to find the maximum

```
## Return minimum along flattened array.
np.amin(a)
```

```
## Minimum along columns
np.amin(a,axis=0)
```

```
## Minimum along rows
np.amin(a,axis=1)
```

#### percentile() function to find percentile along the specified axis

`np.percentile(a,q=0)`

```
print(np.percentile(a,q=50))
print(np.percentile(a,q=50,axis=0))
```

`np.percentile(a,q=100)`
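As a quick check of the relationship between percentiles and the other order statistics, the sketch below compares `np.percentile` against `np.amin`, `np.amax`, and `np.median` on the same array. The variable names `p0`, `p50`, `p100`, and `p50_cols` are introduced here for illustration only, and the comparison relies on NumPy's default (linear) interpolation.

```python
import numpy as np

a = np.arange(6).reshape((3, 2))

# q=0 and q=100 coincide with the minimum and maximum of the flattened array.
p0 = np.percentile(a, q=0)
p100 = np.percentile(a, q=100)
print(p0 == np.amin(a))      # True
print(p100 == np.amax(a))    # True

# q=50 coincides with the median, both overall and per column.
p50 = np.percentile(a, q=50)
p50_cols = np.percentile(a, q=50, axis=0)
print(p50 == np.median(a))                           # True
print(np.array_equal(p50_cols, np.median(a, axis=0)))  # True
```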

**2.) Average and Variances**

a.) **numpy.median():** Computes the median along the specified axis.

b.) **numpy.var():** Computes the variance along the specified axis. It can be called directly in three ways:

**numpy.var(x, axis=0)**: Computes the variance down each column, giving one value per column.

**numpy.var(x, axis=1)**: Computes the variance across each row, giving one value per row.

**numpy.var(x)**: Computes the variance of the whole array, treated as flattened.

If you want to know the details of the calculation behind the variance, read on.

Suppose we have n data points X1, X2, ..., Xn. The variance is then calculated in the three steps below. These steps apply to the whole NumPy array, not to a particular axis; calculating along an axis requires a small modification to the process.

**Step 1:** Calculate the mean of all data points. Using NumPy:

**y=x.mean()**

**Step 2:** Subtract the mean from each data point Xi, square each difference, and sum the squared results. Using NumPy:

**z=np.sum(np.square(x-y))**

**Step 3:** Divide that sum by the number of data points. Using NumPy:

**z/x.size**
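The three steps above amount to the population variance formula. It is written here with $\bar{X}$ for the mean and $n$ for the number of data points; this notation is introduced for illustration and does not appear in the steps themselves.

```latex
\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i,
\qquad
\operatorname{Var}(X) = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2
```

Dividing by $n$ rather than $n-1$ matches the default behaviour of `np.var`, which uses `ddof=0`.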


`import numpy as np`

```
a = np.arange(6).reshape(3,2)
a
```

1. np.median() -> Calculates the median of an array.

If the **number of values is even**, it takes the average of the two middle elements of the sorted data.

If the **number of values is odd**, it simply takes the middle element of the sorted data.

`np.median(a)`

`np.median(a, axis=0)`

`np.median(a, axis=1)`
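To make the even/odd rule concrete, here is a small sketch on two hand-picked arrays. The arrays `odd` and `even` are made up for this example and are not part of the tutorial's data.

```python
import numpy as np

# Odd number of values: the median is the middle element of the sorted data.
odd = np.array([7, 1, 5])          # sorted: 1, 5, 7
median_odd = np.median(odd)
print(median_odd)                  # 5.0

# Even number of values: the median is the average of the two middle elements.
even = np.array([7, 1, 5, 3])      # sorted: 1, 3, 5, 7
median_even = np.median(even)
print(median_even)                 # 4.0
```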

2. np.var() -> Calculates the variance of an array.

`np.var(a)`

`np.var(a,axis=0)`

`np.var(a,axis=1)`

#### Core implementation of variance

#### This is for the whole flattened array

```
y = a.mean()
print(y)
```

```
z = np.sum(np.square(a-y))
print(z)
```

`z/a.size`

#### This is for matrix along axis=1

```
y= a.mean(axis=1)
print(y)
y.reshape(3,1)
```

```
print(a-y.reshape(3,1))
z = np.sum(np.square(a-y.reshape(3,1)),axis=1)
print(z)
```

`z/a.shape[1]`
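The row-wise calculation above can be cross-checked against `np.var` directly. The sketch below is one way to do that. It uses `keepdims=True` instead of an explicit `reshape`, which is a substitution made here for brevity, and the names `row_means` and `manual_var` are illustrative.

```python
import numpy as np

a = np.arange(6).reshape(3, 2)

# keepdims=True keeps the row means as a (3, 1) column, so they broadcast
# against `a` without an explicit reshape.
row_means = a.mean(axis=1, keepdims=True)

# Sum of squared deviations per row, divided by the number of columns.
manual_var = np.sum(np.square(a - row_means), axis=1) / a.shape[1]
print(manual_var)                                   # [0.25 0.25 0.25]

# np.var divides by the number of elements by default (ddof=0), so the two agree.
print(np.allclose(manual_var, np.var(a, axis=1)))   # True
```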