# NumPy - Statistical Functions

Import the module first:

import numpy as np

1.numpy.sum(a, axis=None)/a.sum(axis=None)

Calculates the sum of the elements of array a along the given axis. axis may be an integer or a tuple of integers. If no axis is specified, all elements are summed.

If the shape of a is (d0,d1,..,dn) and axis=(m1,m2,...mi), the result has the shape of a with the dimensions dm1,dm2,...dmi removed, and each element is the sum of the elements along axes m1,m2,...mi.

Example:

a = np.arange(24).reshape((2, 3, 4))
print("array a:\n", a)
print("np.sum(a):", np.sum(a))                   # Sum of all elements
print("np.sum(a, axis=0):\n", np.sum(a, axis=0))   # Sum along axis 0 (outermost)
print("np.sum(a, axis=1):\n", np.sum(a, axis=1))    # Sum along axis 1
print("np.sum(a, axis=(0, 1)):\n", np.sum(a, axis=(0, 1)))  # Sum over axes 0 and 1
print("np.sum(a, axis=(0, 2)):\n", np.sum(a, axis=(0, 2)))  # Sum over axes 0 and 2

Output:

array a:
[[[ 0  1  2  3]
[ 4  5  6  7]
[ 8  9 10 11]]

[[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]]

np.sum(a): 276

np.sum(a, axis=0):
[[12 14 16 18]         # 0+12=12 1+13=14 ...
[20 22 24 26]          # 4+16=20 5+17=22
[28 30 32 34]]

np.sum(a, axis=1):
[[12 15 18 21]         # 0+4+8=12 1+5+9=15 ...
[48 51 54 57]]         # 12+16+20=48 13+17+21=51

np.sum(a, axis=(0, 1)):
[60 66 72 78]          # 0+4+8+12+16+20=60 1+5+9+13+17+21=66...

np.sum(a, axis=(0, 2)):
[ 60  92 124]          # 0+1+2+3+12+13+14+15=60 4+5+6+7+16+17+18+19=92....
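The shape rule can be checked directly. The following is a minimal sketch that reuses the same array a; the variable names s and first are only illustrative.

```python
import numpy as np

a = np.arange(24).reshape((2, 3, 4))

# Reducing over axes (0, 2) removes dimensions 0 and 2, leaving only axis 1.
s = np.sum(a, axis=(0, 2))
print(s.shape)        # (3,)

# The first entry built by hand: fix index 0 on axis 1
# and add up everything along axes 0 and 2.
first = a[:, 0, :].sum()
print(first, s[0])    # 60 60
```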

2.numpy.mean(a, axis=None)/a.mean(axis=None)

Calculates the average of the elements of array a along the given axis. axis may be an integer or a tuple of integers.

If no axis is specified, all elements are averaged. If an axis is specified, the elements are averaged along that axis.

If the shape of a is (d0,d1,..,dn) and axis=(m1,m2,...mi), the result has the shape of a with the dimensions dm1,dm2,...dmi removed, and each element is the average of all elements along axes m1,m2,...mi.

Example:

print("array a:\n", a)
print("np.mean(a):", np.mean(a))    # Average of all elements
print("np.mean(a, axis=0):\n", np.mean(a, axis=0))  # Average on 0 axis
print("np.mean(a, axis=(0, 2)):\n", np.mean(a, axis=(0, 2)))    # Average of 0 and 2 axes

Output:

array a:
[[[ 0  1  2  3]
[ 4  5  6  7]
[ 8  9 10 11]]

[[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]]
np.mean(a): 11.5
np.mean(a, axis=0):
[[ 6.  7.  8.  9.]         # (0+12)/2=6 (1+13)/2=7...
[10. 11. 12. 13.]          # (4+16)/2=10 (5+17)/2=11...
[14. 15. 16. 17.]]         # (8+20)/2=14 (9+21)/2=15..
np.mean(a, axis=(0, 2)):
[ 7.5 11.5 15.5]           # (0+1+2+3+12+13+14+15)/8=7.5..
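As a cross-check, the mean over several axes is the sum over those axes divided by the number of elements reduced. This is a small sketch using the same array a; count and by_hand are illustrative names.

```python
import numpy as np

a = np.arange(24).reshape((2, 3, 4))

# Axes 0 and 2 have lengths 2 and 4, so each mean covers 2 * 4 = 8 elements.
count = a.shape[0] * a.shape[2]
by_hand = np.sum(a, axis=(0, 2)) / count

print(by_hand)                                        # [ 7.5 11.5 15.5]
print(np.allclose(by_hand, np.mean(a, axis=(0, 2))))  # True
```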


3.numpy.average(a,axis=None,weights=None)

Calculates the weighted average of the elements of array a along the given axis.

weights is an array of weights. Its shape should either match the shape of a (weights.shape == a.shape), or, when an axis is specified, it may be a one-dimensional array whose length equals the size of a along that axis.

When weights is not specified, the plain mean is calculated, and the result is the same as numpy.mean.

Example:

print("array a:\n", a)
print("np.average(a, axis=0):\n", np.average(a, axis=0))
print("np.average(a, axis=0, weights=[10, 1]):\n", np.average(a, axis=0, weights=[10, 1]))
wei = np.random.randint(1, 60, (2, 3, 4 ))
print("The weight array is:", wei)
print("np.average(a, axis=(0, 2), weights=wei):\n", np.average(a, axis=(0, 2), weights=wei))

Output:

array a:
[[[ 0  1  2  3]
[ 4  5  6  7]
[ 8  9 10 11]]

[[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]]
np.average(a, axis=0):
[[ 6.  7.  8.  9.]
[10. 11. 12. 13.]
[14. 15. 16. 17.]]
np.average(a, axis=0, weights=[10, 1]):
[[ 1.09090909  2.09090909  3.09090909  4.09090909] # (0*10+12*1)/(10+1)=1.0909
[ 5.09090909  6.09090909  7.09090909  8.09090909]  # (4*10+16*1)/(10+1)=5.0909
[ 9.09090909 10.09090909 11.09090909 12.09090909]]
The weight array is: [[[37  5 50  9]
[ 9 40 17 42]
[45  4 41 29]]

[[17 24 29 37]
[20  8 14 37]
[ 3  1 48 14]]]
np.average(a, axis=(0, 2), weights=wei):
[ 7.73557692 10.92513369 13.96756757]  # (0*37+1*5+2*50+3*9+12*17+13*24+14*29+15*37)/(37+5+50+9+17+24+29+37)=7.7355
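The weighted average can also be written out explicitly as sum(w * a) / sum(w). The sketch below does this for the axis=0, weights=[10, 1] case; the reshape to (2, 1, 1) is just one way to make the weights broadcast against a.

```python
import numpy as np

a = np.arange(24).reshape((2, 3, 4))
w = np.array([10, 1])            # one weight per entry along axis 0

# Weighted average along axis 0 written out: sum(w_i * a_i) / sum(w_i).
# w is reshaped to (2, 1, 1) so that it broadcasts against a's shape (2, 3, 4).
by_hand = (a * w.reshape(2, 1, 1)).sum(axis=0) / w.sum()
builtin = np.average(a, axis=0, weights=w)
print(np.allclose(by_hand, builtin))                            # True

# Without weights, np.average reduces to np.mean.
print(np.allclose(np.average(a, axis=0), np.mean(a, axis=0)))   # True
```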


4.numpy.std(a,axis=None)/a.std(axis=None) and numpy.var(a,axis=None)/a.var(axis=None)

numpy.std(a,axis=None) calculates the population standard deviation of the elements of array a along the given axis (as distinct from the sample standard deviation, which divides by N-1).

That is: $$\sigma=\sqrt{{\frac 1N}\sum_{i=1}^N(x_i-\overline x)^2}$$

std is short for standard deviation, also known as the root-mean-square deviation.

numpy.var(a,axis=None) calculates the population variance of the elements of array a along the given axis.

That is: $$\sigma^2=\frac{\sum_{i=1}^N(x_i-\overline x)^2}{N}$$

var is short for variance.

Example:

b = np.random.randint(1, 30, (2, 3, 4))
print("array b:\n", b)
print("np.std(b, axis=2):\n", np.std(b, axis=2))    # standard deviation
print("np.var(b, axis=2):\n", np.var(b, axis=2))    # variance

Output:

array b:
[[[16  8 27 24]
[12 15 25  8]
[11 19 15 26]]

[[29 15 18 24]
[17  8  4 15]
[ 2 28 10 21]]]
np.std(b, axis=2):
[[7.39509973 6.28490254 5.53962995]
[5.40832691 5.24404424 9.98436277]]
np.var(b, axis=2):
[[54.6875 39.5    30.6875]
[29.25   27.5    99.6875]]

As a check, take the four values 12, 15, 25, 8 along axis 2. The mean is: $$\overline x=15$$

The population standard deviation is: $$\sigma=\sqrt{\frac{(12-15)^2+(15-15)^2+(25-15)^2+(8-15)^2}{4}}=\sqrt{39.5}\approx6.2849$$

The variance is: $$\sigma^2=39.5$$
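The same check can be done in code. The sketch below also shows the sample standard deviation for comparison, using the ddof parameter of numpy.std (ddof=1 divides by N-1); the variable names are illustrative.

```python
import numpy as np

x = np.array([12, 15, 25, 8])

pop_var = np.var(x)              # divides by N (ddof=0, the default)
pop_std = np.std(x)              # square root of the population variance
sample_std = np.std(x, ddof=1)   # divides by N - 1 instead

print(pop_var)       # 39.5
print(pop_std)
print(sample_std)
```

The sample value is always somewhat larger than the population value for the same data, because the divisor is smaller.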

5. Minimum and Maximum Functions

numpy.amin(a,axis=None)/numpy.min(a,axis=None)/a.min(axis=None)

Returns the minimum value along the given axis, or the minimum of all elements if no axis is specified.

numpy.amax(a,axis=None)/numpy.max(a,axis=None)/a.max(axis=None)

Returns the maximum value along the given axis, or the maximum of all elements if no axis is specified.

Example:

c = np.random.randint(1, 60, (2, 3, 4))
print("array c:\n", c)
print("np.min(c):     ", np.min(c))
print("np.amin(c, axis=1):\n", np.amin(c, axis=1))
print("c.min(axis=2): \n", c.min(axis=2))
print("-"*20 + 'Split Line' + '-'*20)
print("np.max(c):    ", np.max(c))
print("np.amax(c, axis=1):\n", np.amax(c, axis=1))
print("c.max(axis=2):\n", c.max(axis=2))

Output:

array c:
[[[15 50 24  6]
[ 2  8 27 53]
[52 23  9 35]]

[[17 38 42 20]
[ 4 32  9 17]
[48 39 17 40]]]
np.min(c):      2
np.amin(c, axis=1):
[[ 2  8  9  6]
[ 4 32  9 17]]
c.min(axis=2):
[[ 6  2  9]
[17  4 17]]
--------------------Split Line--------------------
np.max(c):     53
np.amax(c, axis=1):
[[52 50 27 53]
[48 39 42 40]]
c.max(axis=2):
[[50 53 52]
[42 32 48]]

Strictly speaking, a.min and the like are methods of the array object rather than functions in the numpy namespace.

6. Index of the Minimum and Maximum

numpy.argmin(a,axis=None)/a.argmin(axis=None)

Returns the index of the minimum value along the specified axis. If no axis is specified, the array is flattened to one dimension and the index into the flattened array is returned.

numpy.argmax(a,axis=None)/a.argmax(axis=None)

Returns the index of the maximum value along the specified axis. If no axis is specified, the array is flattened to one dimension and the index into the flattened array is returned.

Example:

print("array c:\n", c)
print("c.argmax():  ", c.argmax())
print("np.argmax(c, axis=2):\n", np.argmax(c, axis=2))
print("-"*20 + 'Split Line' + '-'*20)
print("np.argmin(c):  ", np.argmin(c))
print("c.argmin(axis=1):\n", c.argmin(axis=1))

Output:

array c:
[[[50 44 13 16]
[26 23 31 35]
[ 5 21 42  8]]

[[ 6 53 10 57]
[14  5 18 38]
[40 31  4 55]]]
c.argmax():   15        # 57 is at index 15 of the flattened array
np.argmax(c, axis=2):
[[0 3 2]               # Along axis 2: 50 at index 0, 35 at index 3, 42 at index 2
[3 3 3]]               # 57, 38 and 55 are each at index 3
--------------------Split Line--------------------
np.argmin(c):   22
c.argmin(axis=1):
[[2 2 0 2]
[0 1 2 1]]
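Since the arrays above are random, a small fixed array makes the behaviour easier to follow. This sketch uses a hypothetical 2x3 array and also shows that, when the extreme value occurs more than once, the index of the first occurrence is returned.

```python
import numpy as np

c = np.array([[3, 9, 1],
              [9, 0, 4]])

# Without axis, the index refers to the flattened array [3, 9, 1, 9, 0, 4].
flat_idx = np.argmax(c)
print(flat_idx, c.ravel()[flat_idx])   # 1 9

# 9 appears twice; the first occurrence (index 1) is the one reported above.
# Along axis 0 the result has one index per column.
print(np.argmax(c, axis=0))            # [1 0 1]
print(np.argmin(c, axis=1))            # [2 1]
```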

7.numpy.unravel_index(index, shape)

Converts a one-dimensional (flattened) index into a tuple of multidimensional indices for an array of the given shape. It pairs naturally with the argmax and argmin functions from section 6.

Example:

print("array c:\n", c)
print(np.unravel_index(np.argmax(c), c.shape))

Output:

array c:
 [[[22  4 28 56]
[45 34  3 22]
[59 43 43 27]]

[[32 35 47 53]
[ 7 27 41 18]
[40 32 30 43]]]
(0, 2, 0)  # 59 is the maximum value of the array, and its index is (0, 2, 0)
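The round trip between flat and multidimensional indices can be sketched with the array printed above, entered here as a literal so the result is reproducible. numpy.ravel_multi_index is the inverse operation.

```python
import numpy as np

c = np.array([[[22,  4, 28, 56],
               [45, 34,  3, 22],
               [59, 43, 43, 27]],
              [[32, 35, 47, 53],
               [ 7, 27, 41, 18],
               [40, 32, 30, 43]]])

flat = np.argmax(c)                       # index into the flattened array
idx = np.unravel_index(flat, c.shape)     # tuple of per-axis indices
print(flat, idx)

# The tuple can be used directly to index the array.
print(c[idx])                             # 59

# np.ravel_multi_index converts the tuple back to the flat index.
print(np.ravel_multi_index(idx, c.shape) == flat)
```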

8.numpy.median(a,axis=None)

Returns the median of the array along the specified axis, or the median of all elements if no axis is specified.

Example:

print("array c:\n", c)
print("np.median(c):  ", np.median(c))

Output:

array c:
[[[17 59 14 23]
[27 59  6 12]
[43 16 27 17]]

[[12 10  5 17]
[21 55 18 42]
[41 36 40  5]]]
np.median(c):   19.5
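With an even number of values, the median is the mean of the two middle values after sorting, which is why the result above is 19.5 even though 19.5 is not in the array. A small sketch with a hypothetical 2x4 array, also showing the axis argument:

```python
import numpy as np

x = np.array([[7, 1, 5, 3],
              [2, 8, 6, 4]])

# Each row has an even number of values, so the median is the mean of the
# two middle values after sorting: (3 + 5) / 2 and (4 + 6) / 2.
print(np.median(x, axis=1))    # [4. 5.]

# With no axis, all eight values are pooled: sorted 1..8, middle pair 4 and 5.
print(np.median(x))            # 4.5
```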

9. Other Functions

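numpy.ptp(a,axis=None)/a.ptp(axis=None)

Calculates the difference between the maximum and minimum values (peak to peak) along the specified axis, or over all elements if no axis is specified. Note that the a.ptp method was removed in NumPy 2.0; numpy.ptp(a) works in both older and newer versions.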

Example:

print("array c:\n", c)
print("np.ptp(c):  ", np.ptp(c))
print("c.ptp(axis=1):\n", c.ptp(axis=1))

Output:

array c:
[[[35 28 18 38]
[44 56  7 24]
[ 4 59  2 24]]

[[55 56  5 27]
[18 44 22  1]
[ 3 30 20 43]]]
np.ptp(c):   58     #   59-1=58
c.ptp(axis=1):
[[40 31 16 14]     # 44-4=40 59-28=31 ...
[52 26 17 42]]
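The peak-to-peak value is simply the maximum minus the minimum. The sketch below checks this on the first 3x4 block of the array shown above, using the function form numpy.ptp.

```python
import numpy as np

c = np.array([[35, 28, 18, 38],
              [44, 56,  7, 24],
              [ 4, 59,  2, 24]])

# Peak-to-peak range per column (axis 0) is max minus min.
by_hand = c.max(axis=0) - c.min(axis=0)
print(by_hand)                                      # [40 31 16 14]
print(np.array_equal(by_hand, np.ptp(c, axis=0)))   # True
```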

numpy.percentile(a, q, axis=None)

1. a: input array
2. q: percentile to compute, between 0 and 100
3. axis: axis along which the percentile is computed

Returns the q-th percentile. Roughly speaking, q% of the data lie at or below the returned value and (100-q)% lie at or above it. When the percentile falls between two data points, NumPy interpolates linearly between them by default, so the result need not be a value from the array.

Example:

d = np.random.randint(1, 40, (2, 5))
print("array d:\n", d)
print("np.percentile(d, 40):    ", np.percentile(d, 40))
print("np.percentile(d, 40, axis=1):\n", np.percentile(d, 40, axis=1))

Output:

array d:
[[39 15 35 17 39]
[20 12 36 19 10]]
np.percentile(d, 40):     18.200000000000003
np.percentile(d, 40, axis=1):
[27.8 16.2]
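The value 27.8 for the first row can be reproduced by hand. The sketch below assumes NumPy's default linear interpolation: sort the data, find the fractional position q/100 * (n-1), and interpolate between the two neighbouring values. The helper names are illustrative.

```python
import numpy as np

row = np.array([39, 15, 35, 17, 39])
q = 40

s = np.sort(row)                      # [15 17 35 39 39]
pos = q / 100 * (len(s) - 1)          # fractional position, about 1.6
lo = int(np.floor(pos))               # index of the value just below
hi = int(np.ceil(pos))                # index of the value just above
by_hand = s[lo] + (pos - lo) * (s[hi] - s[lo])

print(by_hand, np.percentile(row, q))
```

Here the position falls 60% of the way from 17 to 35, giving 17 + 0.6 * 18, which is about 27.8.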

Many of these functions also accept a keepdims parameter (default False). When keepdims is True, the reduced axes are kept in the result as dimensions of length 1, so the result has the same number of dimensions as the input array.
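A short sketch of what keepdims changes, using the array a from the earlier sections. One practical benefit is that the result broadcasts against the original array; the variable name centered is illustrative.

```python
import numpy as np

a = np.arange(24).reshape((2, 3, 4))

print(np.sum(a, axis=1).shape)                  # (2, 4)
print(np.sum(a, axis=1, keepdims=True).shape)   # (2, 1, 4)

# Because the reduced axis is kept with length 1, the result broadcasts
# against the original array, e.g. to subtract the mean along axis 1.
centered = a - a.mean(axis=1, keepdims=True)
print(np.allclose(centered.mean(axis=1), 0))    # True
```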


Added by manitoon on Tue, 03 Mar 2020 18:12:50 +0200