Histogram equalization and normalization

1. Histogram equalization

The gray level of an image can be viewed as a random variable within an interval [0, L-1]. The basic descriptor is a probability density function (PDF) of the gray scale.
Make r the input grayscale variable, s the output grayscale variable, pr(r) and ps(s) represent the probability density function of grayscale random variables R and s respectively, requiring the mapping relationship of grayscale variables R -> s. If the differentiable condition is satisfied, the PDF of the output grayscale variable s can be obtained by the following formula:

\[p_{s}(s)=p_{r}(r)\left|\frac{\mathrm{d} r}{\mathrm{d} s}\right| \]

The transformation functions that are particularly important in image processing take the following forms:

\[s=T(r)=(L-1) \int_{0}^{r} p_{r}(w) \mathrm{d} w \]

Where w is an integral pseudovariable. To the right of the formula is the cumulative distribution function (CDF) of the input grayscale random variable r. Because the PDF is always positive, the integral of a function is the area under the function, and the integral value is 1 (the area under the PDF curve is always 1), and the integral interval is [0, L-1].

For an equalized image, the probability density is uniform, that is, ps(s) =1/ (L-1).
Then the relationship between the transformed gray variable s and the pre-transformed gray variable r can be derived from the upper formula.

For discrete values, the probability of the occurrence of a grayscale rk in an image is approximately:

\[p_{r}\left(r_{k}\right)=\frac{n_{k}}{M N}, \quad k=0,1,2, \cdots, L-1 \]

Corresponding to the above transformation, the discrete form is:

\[s_{k}=T\left(r_{k}\right)=(L-1) \sum_{j=0}^{k} p_{r}\left(r_{j}\right)=\frac{(L-1)}{M N} \sum_{j=0}^{k} n_{j}, \quad k=0,1,2, \cdots, L-1 \]

MN is the total number of pixels in the image, nk is the number of pixels with rk gray scale, and L is the number of gray levels in the image.

The Python code equalizes a 16-bit graph:

1. Read in a 16 bit image matrix using img = cv2.imread('origin.tif', cv2.IMREAD_UNCHANGED) and img_array = np.array(img) read;
2. Compute the image histogram using np.histogram();
3. Calculate the cumulative histogram using np.cumsum();
4. Divide the number of occurrences of each gray level by the total number of occurrences of all gray levels (that is, the width x height of the image);
5. Calculate the gray transformation relationship matrix of 63336 x 63336 dimensions, and transform each gray level of r to each gray level of s;
6. Grayscale transformation of each pixel of the whole picture according to the grayscale transformation relation matrix above;
7. Output image matrix after gray transformation, that is, histogram equalization

def histEqualize(imgArray):
    """Calculate 16 bit The result of histogram equalization of the graph
    
    Args:
        imgArray: Gray Matrix of Original Gray
        
    Returns:
        equalimgArray: Equalized Gray Matrix of Pictures
    
    """
                
    imgHist = np.histogram(imgArray, bins=range(0, 65537))[0]
    # Cumulative Histogram    
    imgCumuHist = np.cumsum(imgHist)
    # Cumulative Histogram Probability
    imgCumuHistPro = imgCumuHist / imgCumuHist[-1]
    # Gray transformation matrix
    grayConvert = np.ceil(np.dot(65535, imgCumuHistPro))
    w,h = imgArray.shape
    equalimgArray = np.zeros((w,h), dtype=np.uint16)
    for i in range(w):
        for j in range(h):
            equalimgArray[i][j] = grayConvert[imgArray[i][j]]
   
    return equalimgArray

2. Histogram specification (matching)

Make R and Z represent the gray levels of the input and output images, respectively, which are histogram-specified. Given that the grayscale probability density function of the input image is pr(r), and pz(z) is the specified probability density function that we want the output image to have, we need to find the r-> Z grayscale variable mapping relationship.
Let's first transform r into a histogram equalization T(r) as an intermediate variable s:

\[s=T(r)=(L-1) \int_{0}^{r} p_{r}(w) \mathrm{d} w \]

Then, Z is transformed into histogram equalization G(z), which is mapped to the intermediate variable s:

\[G(z)=(L-1) \int_{0}^{z} p_{z}(t) \mathrm{d} t=s \]

Three variables are related:

\[z=G^{-1}[T(r)]=G^{-1}(s) \]

Therefore, the histogram specification steps are as follows:

1. Calculate the histogram pr(r) of the image and make histogram equalization transformation. Calculates the mapping relationship of r-> s, rounding s to integers in the range [0, L-1];
2. The registration variable Z is histogram equalized and all the values of transformed G(z) are calculated. Round the value of G(z) to an integer in the range [0,L-1];
3. For each value s k of the grayscale variable s, k = 0,1.2,..., L-1, use the G(z) value stored in step 2 to find the corresponding Z value so that G(z) is closest to S K and store the mappings of these s-> Z.

Python uses numpy and opencv to implement the above steps.
When calculating the mapping relationship of s->z, all the values of s are transformed into a set of collections, so as to avoid repeating the search for the G(z) value closest to the minimum of both Z values for the same s value.
And zero the other gray levels z_Gray = np.zeros(65536), because there is no value in s in the gray level of [0, 65535], it is not necessary to consider, because it does not appear in the mapping relationship of R -> s -> Z.
The way to find the index Z of the closest value of s to G(z) is to first subtract the list of the same dimension with all s values from the G(z) list and then find the index of the minimum value.

In summary, the input image r is mapped to s, then the registration variable Z is mapped to G(z)=s, and the relationship between G(z)=s and Z is found through discrete variables. Finally, the gray value mapping is completed by R - > s - > Z.

def histSpecify(imgArray, specifyHist):
    """For 16 bit Histogram specification (histogram matching)
    
    Args:
        imgArray: Gray Matrix of Original Gray
        specifyHist: Need matching histogram
        
    Returns:
        specifiedImgArray: Specified Grayscale Matrix of Pictures
        
    """     
    
    # Histogram of the original graph
    imgHist = np.histogram(imgArray, bins=range(0, 65537))[0]  
    # Cumulative Histogram    
    imgCumuHist = np.cumsum(imgHist)
    # Cumulative Histogram Probability
    imgCumuHistPro = imgCumuHist / imgCumuHist[-1]
    # Calculate r->s Grayscale Conversion Relation
    s_Gray = np.ceil(np.dot(65535, imgCumuHistPro))
    
    
    # Cumulative Map Matching Histogram    
    specifyCumuHist = np.cumsum(specifyHist)
    specifyCumuHistPro = specifyCumuHist / specifyCumuHist[-1]
    # Calculating z->G(z) Grayscale Conversion Relationships
    Gz_Gray = np.ceil(np.dot(65535, specifyCumuHistPro))
    
    # Calculating s->z Grayscale Conversion Relation
    z_Gray = np.zeros(65536)
    set_s_Gray = set(s_Gray)
    for s in set_s_Gray:
        # Find the index Z of the closest value s to G(z) and map s->z
        diff = np.abs(Gz_Gray - np.full(65536, s))
        z = np.argwhere(diff == min(diff))[0][0]
        z_Gray[int(s)] = z 
    
    # Gray level transformation of image
    w, h = imgArray.shape        
    specifiedImgArray = np.zeros((w,h), dtype = np.uint16)
    for i in range(w):
        for j in range(h):
             # r -> s
             specifiedImgArray[i][j] = s_Gray[imgArray[i][j]]
             # s -> z
             specifiedImgArray[i][j] = z_Gray[specifiedImgArray[i][j]]
    
    return specifiedImgArray

3. Picture testing

if __name__=='__main__':

    ori_img = cv2.imread('origin.tif', cv2.IMREAD_UNCHANGED)
    ori_img_array = np.array(ori_img)
    ori_hist = np.histogram(ori_img_array, bins=range(0, 65537))[0]
    # %varexp --plot ori_in IPython Hist Views Histogram of Original
       
    spec_img = cv2.imread('processed.tif', cv2.IMREAD_UNCHANGED)
    spec_img_array = np.array(spec_img)
    spec_hist = np.histogram(spec_img_array, bins=range(0, 65537))[0]
    # %varexp --plot spec_in IPython Hist Views Histogram of Original
    
    
    # Display histogram equalized pictures in%% cv mode    
    equal_img_array = histEqualize(ori_img_array)
    cv2.imwrite('histeq_img.png', equal_img_array)
    img_scaled = cv2.normalize(equal_img_array, dst=None, alpha=0, beta=65535, norm_type=cv2.NORM_MINMAX)
    cv2.imshow('equal_img_array', img_scaled)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
        
    # Display histogram normalized pictures in%% cv mode    
    specified_img_array = histSpecify(ori_img_array, spec_hist)
    cv2.imwrite('specified_img.png', specified_img_array)
    img_scaled = cv2.normalize(specified_img_array, dst=None, alpha=0, beta=65535, norm_type=cv2.NORM_MINMAX)
    cv2.imshow('specified_img_array', img_scaled)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

Calculating histograms of pictures using OpenCV

cv2.calcHist(images, channels, mask, histSize, ranges[, hist[, accumulate ]]) ->hist

images: Input image
channels: Select the channel for the image
Mask: A mask is an np array of the same size as an image in which the part to be processed is specified as 1, the part not to be processed is specified as 0, and is generally set to None to represent processing the entire image
histSize: How many bin s to use, typically 256, 16 bit figure 65535
ranges: The range of pixel values, typically [0,255] for 0-255 and [0,65535] for 16bit

ori_hist = cv2.calcHist([ori_img], [0], None, [65536], [0, 65535])

GDAL Library Read Pictures

ori_img = cv2.imread('origin.tif', cv2.IMREAD_LOAD_GDAL)

Added by Creech on Wed, 03 Nov 2021 19:55:01 +0200

Programming VIP

Histogram equalization and normalization

1. Histogram equalization

2. Histogram specification (matching)

3. Picture testing

Popular Keywords