27.28.29. Polar map (Nightingale Rose), Venn diagram, Area chart, tree map

26. Polar map (Nightingale Rose)
27. Venn diagram
28. Area chart
29. Tree map

26. Polar map (Nightingale Rose)

The polar map (also known as Nightingale Rose) is radially extended, and each piece will occupy a certain angle. Its radius represents the size of a certain type of data it represents. Its angle size indicates its proportion in the total category.

Nightingale rose, invented by Nightingale, is a British nurse and statistician. When working in the British barracks, he collected the mortality and cause distribution of soldiers during the Crimean War in different months, and effectively moved the senior managers at that time through the rose chart. Therefore, the proposal for medical improvement received strong support, reducing the mortality of soldiers from 42% to 2%. Therefore, this graph was later called Nightingale rose chart.

What scene is the Nightingale rose generally used in? In fact, Nightingale rose chart is similar to pie chart. It is a deformation of pie chart, and its usage is the same. It is mainly used in scenes that need to check the proportion. The only difference between the two is that the pie chart judges the proportion by angle, while the rose chart can judge by radius or sector area.

import numpy as np
import matplotlib.pyplot as plt

# Fixing random state for reproducibility
np.random.seed(19680801)

# Compute pie slices
N = 10
theta = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
radii = 10 * np.random.rand(N)
width = np.pi / 4 * np.random.rand(N)
colors = plt.cm.viridis(radii / 10.)

ax = plt.subplot(111, projection='polar')
ax.bar(theta, radii, width=width, bottom=0.0, color=colors, alpha=0.5)

plt.show()

import matplotlib.pyplot as plt
import numpy as np

N = 7
'''Generate angle value'''
theta = np.arange(0.,2*np.pi,2*np.pi/N)
'''Generate radius value'''
radii = np.array([7,4,5,3,2,4,6])
'''Define shaft type'''
plt.axes([0.025,0.025,0.95,0.95],polar=True)
'''Define the color set, which is used here RGB Value, of course, you can also use color names'''
colors = np.array(['#4bb2c5','#c5b47f','#EAA228','#579575','#839557','#958c12','#953579'])
'''bar()The function requires an angle and radius to be passed in as parameters'''
bars = plt.bar(theta,radii,width=(2*np.pi/N),bottom=0.0,color=colors)
plt.show()

27. Venn diagram

Venn diagram is a diagram showing the logical relationship (intersection, difference, Union) between sets.

Install Matplotlib Venn when using: PIP install Matplotlib Venn

import matplotlib.pyplot as plt
from matplotlib_venn import venn2

# First way to call the 2 group Venn diagram
venn2(subsets=(10, 5, 2), set_labels=('Group A', 'Group B'))
plt.show()

import matplotlib.pyplot as plt
from matplotlib_venn import venn2

# Second way
venn2([set(['A', 'B', 'C', 'D']), set(['D', 'E', 'F'])])
plt.show()

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib_venn import venn2

df = pd.DataFrame({'Product': ['Only cheese', 'Only red wine', 'Both'],
                   'NbClient': [900, 1200, 400]},
                  columns = ['Product', 'NbClient'])
print(df)
'''
Output result:
         Product  NbClient
0    Only cheese       900
1  Only red wine      1200
2           Both       400
'''

# First way
plt.figure(figsize=(8, 11))

v2 = venn2(subsets = {'10': df.loc[0, 'NbClient'],
                      '01': df.loc[1, 'NbClient'],
                      '11': df.loc[2, 'NbClient']},
           set_labels=('', ''))
v2.get_patch_by_id('10').set_color('yellow')
v2.get_patch_by_id('01').set_color('red')
v2.get_patch_by_id('11').set_color('orange')

v2.get_patch_by_id('10').set_edgecolor('none')
v2.get_patch_by_id('01').set_edgecolor('none')
v2.get_patch_by_id('11').set_edgecolor('none')

v2.get_label_by_id('10').set_text('%s\n%d\n(%.0f%%)' % (df.loc[0, 'Product'],
                                                        df.loc[0, 'NbClient'],
                                                        np.divide(df.loc[0, 'NbClient'],
                                                                  df.NbClient.sum())*100))

v2.get_label_by_id('01').set_text('%s\n%d\n(%.0f%%)' % (df.loc[1, 'Product'],
                                                        df.loc[1, 'NbClient'],
                                                        np.divide(df.loc[1, 'NbClient'],
                                                                  df.NbClient.sum())*100))

v2.get_label_by_id('11').set_text('%s\n%d\n(%.0f%%)' % (df.loc[2, 'Product'],
                                                        df.loc[2, 'NbClient'],
                                                        np.divide(df.loc[2, 'NbClient'],
                                                                  df.NbClient.sum())*100))
for text in v2.subset_labels:
    text.set_fontsize(12)

plt.show()

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib_venn import venn2

# Second way
grp1 = set(['cheese-a', 'cheese-b', 'cheese-c', 'cheese-d',
            'cheese-e', 'cheese-f', 'cheese-g', 'cheese-h',
            'cheese-i', 'cheese', 'red wine'])
grp2 = set(['red wine-a', 'red wine-b', 'red wine-c', 'red wine-d',
            'red wine-e', 'red wine-f', 'red wine-g', 'red wine-h',
            'red wine-i', 'red wine-j', 'red wine-k', 'red wine-l',
            'red wine', 'cheese'])

v2 = venn2([grp1, grp2], set_labels = ('', ''))

v2.get_patch_by_id('10').set_color('yellow')
v2.get_patch_by_id('01').set_color('red')
v2.get_patch_by_id('11').set_color('orange')

v2.get_patch_by_id('10').set_edgecolor('none')
v2.get_patch_by_id('01').set_edgecolor('none')
v2.get_patch_by_id('11').set_edgecolor('none')

v2.get_label_by_id('10').set_text('Only cheese\n(36%)')
v2.get_label_by_id('01').set_text('Only red wine\n(48%)')
v2.get_label_by_id('11').set_text('Both\n(16%)')

plt.show()

This Venn diagram can usually be used for the analysis of retail transactions. Suppose we need to study the popularity of cheese and red wine, and 2500 customers answered the questionnaire. According to the above figure, we found that among 2500 customers, 900 customers (36%) like cheese, 1200 customers (48%) like red wine, and 400 customers (16%) like both products.

import matplotlib.pyplot as plt
from matplotlib_venn import venn3

plt.figure(figsize=(12,12))

v3 = venn3(subsets = {'100':30, '010':30, '110':17,
                      '001':30, '101':17, '011':17, '111':5},
           set_labels = ('', '', ''))

v3.get_patch_by_id('100').set_color('red')
v3.get_patch_by_id('010').set_color('yellow')
v3.get_patch_by_id('001').set_color('blue')
v3.get_patch_by_id('110').set_color('orange')
v3.get_patch_by_id('101').set_color('purple')
v3.get_patch_by_id('011').set_color('green')
v3.get_patch_by_id('111').set_color('grey')

v3.get_label_by_id('100').set_text('Math')
v3.get_label_by_id('010').set_text('Computer science')
v3.get_label_by_id('001').set_text('Domain expertise')
v3.get_label_by_id('110').set_text('Machine learning')
v3.get_label_by_id('101').set_text('Statistical research')
v3.get_label_by_id('011').set_text('Data processing')
v3.get_label_by_id('111').set_text('Data science')

for text in v3.subset_labels:
    text.set_fontsize(13)

plt.show()

28. Area chart

The area chart or area chart displays quantitative data graphically. It is based on a line graph. Areas between axes and lines are usually highlighted with colors, textures, and hatches.
It can be used to display or compare quantitative progress over time.

import numpy as np
import matplotlib.pyplot as plt

plt.figure(figsize=(6, 4))

turnover = [2, 7, 14, 17, 20, 27, 30, 38, 25, 18, 6, 1]
plt.fill_between(np.arange(12), turnover, color="skyblue", alpha=0.4)
plt.plot(np.arange(12), turnover, color="Slateblue", alpha=0.6, linewidth=2)

plt.tick_params(labelsize=12)
plt.xticks(np.arange(12), np.arange(1,13))
plt.xlabel('Month', size=12)
plt.ylabel('Turnover (K dollars) of ice-cream', size=12)
plt.ylim(bottom=0)

plt.show()

Suppose the figure above describes the turnover of ice cream sales in one year. According to the figure, it is clear that sales peak in summer and then decline from autumn to winter.

Example 2:

import numpy as np
import matplotlib.pyplot as plt

plt.figure(figsize=(9, 6))

year_n_1 = [1.5, 3, 10, 13, 22, 36, 30, 33, 24.5, 15, 6.5, 1.2]
year_n = [2, 7, 14, 17, 20, 27, 30, 38, 25, 18, 6, 1]

plt.fill_between(np.arange(12), year_n_1, color="lightpink", alpha=0.5, label='year N-1')
plt.fill_between(np.arange(12), year_n, color="skyblue", alpha=0.5, label='year N')

plt.tick_params(labelsize=12)
plt.xticks(np.arange(12), np.arange(1,13))
plt.xlabel('Month', size=12)
plt.ylabel('Turnover (K dollars) of ice-cream', size=12)
plt.ylim(bottom=0)
plt.legend()

plt.show()

29. Tree map

A tree map displays hierarchical (tree structured) data as a set of nested rectangles. Each branch of the tree has a rectangle, which is then tiled with smaller rectangles representing sub branches. The area of the rectangle of the leaf node is proportional to the specified size of the data. Typically, leaf nodes are shaded to show individual dimensions of the data.
The idea of tree map is represented by the area of the box. The larger the area, the larger the value it represents, and vice versa.
Install square when using: PIP install square

(base) C:\Users\toto>pip install squarify
Collecting squarify
  Downloading squarify-0.4.3-py3-none-any.whl (4.3 kB)
Installing collected packages: squarify
Successfully installed squarify-0.4.3

(base) C:\Users\toto>

Function syntax and parameters:

squarify.plot(sizes, 
            norm_x=100, 
            norm_y=100, 
            color=None, 
            label=None, 
            value=None, 
            alpha,
            **kwargs)

 sizes: specify the value corresponding to each level of discrete variable, that is, reflect the area size of tree map sub block;
norm_x: By default, the range of x-axis is limited to 0-100;
norm_y: By default, the range of y-axis is limited to 0-100;
 color: customize the filling color of tree map sub blocks;
 label: assign a label to each sub block;
 value: add a label of value size for each sub block;
• alpha: sets the transparency of the fill color;
 kwargs: keyword parameter, which is similar to the keyword parameter of bar chart, such as setting border color, border thickness, etc.

import matplotlib.pyplot as plt
import squarify

# Chinese and minus sign processing method
plt.rcParams['font.sans-serif'] = 'Microsoft YaHei'
# plt.rcParams['axes.unicode_minus'] = False

# Data creation
name = ['Shanghai GDP', 'Beijing GDP', 'Shenzhen GDP', 'Guangzhou GDP',
        'Chongqing GDP', 'Suzhou GDP', 'Chengdu GDP', 'Wuhan GDP',
        'Hangzhou GDP', 'Tianjin GDP', 'Nanjing GDP', 'Changsha GDP',
        'Ningbo GDP', 'Wuxi GDP', 'Qingdao GDP', 'Zhengzhou GDP',
        'Foshan GDP', 'Quanzhou GDP', 'Dongguan GDP', 'Jinan GDP']
income = [38155, 35371, 26927, 23628, 23605, 19235, 17012, 16900, 15373, 14104,
          14030, 12580, 11985, 11852, 11741, 11380, 10751, 9946, 9482, 9443]

# Drawing details
colors = ['steelblue', '#9999ff', 'red', 'indianred', 'deepskyblue', 'lime', 'magenta', 'violet', 'peru', 'green',
          'yellow', 'orange', 'tomato', 'lawngreen', 'cyan', 'darkcyan', 'dodgerblue', 'teal', 'tan', 'royalblue']
plot = squarify.plot(sizes=income,  # Specify drawing data
                     label=name,  # Specify label
                     color=colors,  # Specify custom colors
                     alpha=0.6,  # Specify transparency
                     value=income,  # Add value label
                     edgecolor='white',  # Set the bounding box to white
                     linewidth=3  # Set the border width to 3
                     )
# Set the label size to 10
plt.rc('font', size=10)
# Set title and font size
plot.set_title('2019 Year city GDP Top 20(RMB100mn)', fontdict={'fontsize': 15})
# Remove axis
plt.axis('off')
# Scale except top and right borders
plt.tick_params(top='off', right='off')
# Graphic display
plt.show()

Added by shadowq on Tue, 08 Mar 2022 23:42:24 +0200