imgaug is a Python library for image augmentation. It can transform keypoints and bounding boxes together with the images.
- Project home page: imgaug doc
1. Installation and uninstallation
    # Install from GitHub
    sudo pip install git+https://github.com/aleju/imgaug

    # Install from PyPI
    sudo pip install imgaug

    # Install from a local checkout; replace VERSION with the version
    # you want to install, for example imgaug-0.2.5.tar.gz
    python setup.py sdist && sudo pip install dist/imgaug-VERSION.tar.gz

    # Uninstall
    sudo pip uninstall imgaug
2. Examples
2.1 Basic usage
First define a sequence of transformations, then pass each image batch to it:

    from imgaug import augmenters as iaa

    seq = iaa.Sequential([
        iaa.Crop(px=(0, 16)),  # crop images from each side by 0 to 16px (randomly chosen)
        iaa.Fliplr(0.5),  # 0.5 is the probability, horizontally flip 50% of the images
        iaa.GaussianBlur(sigma=(0, 3.0))  # blur images with a sigma of 0 to 3.0
    ])

    for batch_idx in range(1000):
        # 'images' should be either a 4D numpy array of shape (N, height, width, channels)
        # or a list of 3D numpy arrays, each having shape (height, width, channels).
        # Grayscale images must have shape (height, width, 1) each.
        # All images must have numpy's dtype uint8. Values are expected to be in
        # range 0-255.
        # load_batch() and train_on_images() are placeholders for your own code.
        images = load_batch(batch_idx)
        images_aug = seq.augment_images(images)
        train_on_images(images_aug)
2.2 An example with common transformations

    import cv2
    import numpy as np
    import imgaug as ia
    from imgaug import augmenters as iaa

    # Helper that applies the given augmenter with probability p=0.5
    sometimes = lambda aug: iaa.Sometimes(0.5, aug)

    # Build the augmentation pipeline
    aug = iaa.Sequential(
        [
            iaa.Fliplr(0.5),  # flip 50% of the images horizontally
            iaa.Flipud(0.2),  # flip 20% of the images vertically

            # Crop some of the images by 0 to 10% of their size.
            # Alternative: sometimes(iaa.Crop(px=(0, 16))) crops 0 to 16 pixels
            # from each side.
            sometimes(iaa.Crop(percent=(0, 0.1))),

            # Apply an affine transformation to some of the images
            sometimes(iaa.Affine(
                scale={"x": (0.8, 1.2), "y": (0.8, 1.2)},  # scale to 80-120%
                translate_percent={"x": (-0.2, 0.2), "y": (-0.2, 0.2)},  # translate by ±20%
                rotate=(-45, 45),  # rotate by ±45 degrees
                shear=(-16, 16),  # shear by ±16 degrees (rectangle to parallelogram)
                order=[0, 1],  # nearest-neighbor or bilinear interpolation
                cval=(0, 255),  # fill value, anywhere between black and white
                mode=ia.ALL  # how areas outside the image are filled
            )),

            # Apply between 0 and 5 of the following augmenters per image.
            # Note the use of SomeOf.
            iaa.SomeOf((0, 5),
                [
                    # Represent parts of the image as superpixels
                    sometimes(
                        iaa.Superpixels(p_replace=(0, 1.0), n_segments=(20, 200))
                    ),

                    # Blur with a Gaussian, mean or median filter.
                    # Note the use of OneOf.
                    iaa.OneOf([
                        iaa.GaussianBlur((0, 3.0)),
                        # Kernel size between 2 and 7. With k=((5, 7), (1, 3))
                        # the kernel height is 5-7 and the width is 1-3.
                        iaa.AverageBlur(k=(2, 7)),
                        iaa.MedianBlur(k=(3, 11)),
                    ]),

                    # Sharpen
                    iaa.Sharpen(alpha=(0, 1.0), lightness=(0.75, 1.5)),

                    # Emboss (relief effect)
                    iaa.Emboss(alpha=(0, 1.0), strength=(0, 2.0)),

                    # Edge detection: detected edges are set to 0 or 255 and
                    # then overlaid on the original image
                    sometimes(iaa.OneOf([
                        iaa.EdgeDetect(alpha=(0, 0.7)),
                        iaa.DirectedEdgeDetect(alpha=(0, 0.7), direction=(0.0, 1.0)),
                    ])),

                    # Add Gaussian noise
                    iaa.AdditiveGaussianNoise(
                        loc=0, scale=(0.0, 0.05 * 255), per_channel=0.5
                    ),

                    # Either set 1% to 10% of the pixels to black, or cover
                    # 3% to 15% of the image with black rectangles sampled at
                    # 2% to 5% of the original size
                    iaa.OneOf([
                        iaa.Dropout((0.01, 0.1), per_channel=0.5),
                        iaa.CoarseDropout(
                            (0.03, 0.15), size_percent=(0.02, 0.05), per_channel=0.2
                        ),
                    ]),

                    # With 5% probability invert the intensities: v becomes 255-v
                    iaa.Invert(0.05, per_channel=True),

                    # Add a value between -10 and 10 to the pixels
                    iaa.Add((-10, 10), per_channel=0.5),

                    # Multiply the pixels by a factor between 0.5 and 1.5
                    iaa.Multiply((0.5, 1.5), per_channel=0.5),

                    # Change the contrast to between half and twice the original
                    iaa.ContrastNormalization((0.5, 2.0), per_channel=0.5),

                    # Convert to grayscale and blend with the original image,
                    # weighted by alpha
                    iaa.Grayscale(alpha=(0.0, 1.0)),

                    # Move pixels around locally. This method is known from
                    # MNIST augmentation.
                    sometimes(
                        iaa.ElasticTransformation(alpha=(0.5, 3.5), sigma=0.25)
                    ),

                    # Distort local areas of the image
                    sometimes(iaa.PiecewiseAffine(scale=(0.01, 0.05)))
                ],
                random_order=True  # apply the selected augmenters in random order
            )
        ],
        random_order=True  # apply the top-level augmenters in random order
    )

    # Augment a single image. It is read as a 3-channel (BGR) image here,
    # because the strip assembled at the end has 3 channels.
    image = cv2.imread('1.jpg')
    h = image.shape[0]
    w = image.shape[1]
    enhance_num = 32

    aug_example_img = aug.augment_image(image=image)
    print(image.shape, aug_example_img.shape)

    # Build a batch that contains the same image enhance_num times
    example_images = np.array(
        [image for _ in range(enhance_num)],
        dtype=np.uint8
    )
    aug_imgs = aug(images=example_images)
    # Equivalent: aug_imgs = aug.augment_images(example_images)

    # Display the augmented images in a 4 x 8 grid
    ia.show_grid(aug_imgs, rows=4, cols=8)

    # Save each augmented image to its own file
    for i in range(aug_imgs.shape[0]):
        img = aug_imgs[i]
        cv2.imwrite("aug_%d.jpg" % i, img)

    # Tile all augmented images side by side into one strip, 10 px apart
    write_img = np.zeros(shape=(h, (w + 10) * enhance_num, 3), dtype=np.uint8)
    for j, item in enumerate(aug_imgs):
        write_img[:, j * (w + 10): j * (w + 10) + w, :] = item
3. Commonly used augmenters
Import the augmenters module first:
from imgaug import augmenters as iaa
3.1 iaa.Sequential()
Creates a sequence of augmenters that are applied to the images one after another. Function prototype:
iaa.Sequential(children=None, random_order=False, name=None, deterministic=False, random_state=None)
Parameters:
- children: the augmenter or list of augmenters to apply to the images. None by default.
- random_order: bool, False by default. Whether to apply the augmenters in random order. If True, the order differs from batch to batch, but all images within one batch are processed in the same order.
- deterministic: bool type, False by default.
3.2 iaa.SomeOf()
Applies only some of the given augmenters instead of all of them. For example, you can define 20 transformations but apply only 5 of them each time. It is not possible, however, to force a particular augmenter to always be selected.
Function prototype:
iaa.SomeOf(n=None, children=None, random_order=False, name=None, deterministic=False, random_state=None)
Parameters:
- n: how many of the augmenters to apply. Can be an int, a tuple, a list or a stochastic parameter.
- random_order: whether the selected augmenters are applied in a different order each time.
Example:

    # Apply exactly one of the two flips each time
    seq = iaa.SomeOf(1, [
        iaa.Fliplr(1.0),
        iaa.Flipud(1.0)
    ])
    imgs_aug = seq.augment_images(imgs)

    # Apply 1 to 3 of the augmenters each time, always in the listed order
    seq = iaa.SomeOf((1, 3), [
        iaa.Fliplr(1.0),
        iaa.Flipud(1.0),
        iaa.GaussianBlur(1.0)
    ])
    imgs_aug = seq.augment_images(imgs)

    # Apply one or more of the augmenters each time, in an order that
    # changes from batch to batch
    seq = iaa.SomeOf((1, None), [
        iaa.Fliplr(1.0),
        iaa.Flipud(1.0),
        iaa.GaussianBlur(1.0)
    ], random_order=True)
    imgs_aug = seq.augment_images(imgs)
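The selection rule for n can be illustrated without imgaug. The following is a minimal sketch in plain Python, not imgaug's implementation; `some_of` is a hypothetical helper and the augmenters are represented by their names only. It assumes that a tuple (low, high) means "pick a count between low and high, inclusive" and that None stands for "all".

```python
import random

def some_of(n, children, random_order=False, rng=None):
    """Pick which children to apply (hypothetical helper, not imgaug code).

    n may be an int, a (low, high) tuple, or None (meaning "all").
    A high value of None stands for len(children).
    """
    rng = rng or random.Random()
    total = len(children)
    if n is None:
        count = total
    elif isinstance(n, tuple):
        low, high = n
        high = total if high is None else min(high, total)
        count = rng.randint(low, high)  # inclusive on both ends
    else:
        count = min(n, total)
    # Choose `count` distinct positions, then restore the listed order
    # unless a random order was requested.
    picked = rng.sample(range(total), count)
    if not random_order:
        picked.sort()
    return [children[i] for i in picked]

rng = random.Random(0)
names = ["Fliplr", "Flipud", "GaussianBlur"]
print(some_of(1, names, rng=rng))          # exactly one child
print(some_of((1, None), names, rng=rng))  # between one and all children
```

With random_order=False the selected children keep their position from the list, which matches the second example above.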
3.3 iaa.OneOf()
Applies exactly one augmenter, chosen at random from the given list, each time.
iaa.OneOf(children, name=None, deterministic=False, random_state=None)
The parameters have the same meaning as for iaa.SomeOf().
3.4 iaa.Sometimes()
Applies one set of augmenters to a fraction of the images in a batch and, optionally, another set to the remaining images.
iaa.Sometimes(p=0.5, then_list=None, else_list=None, name=None, deterministic=False, random_state=None)
- p: float. The fraction of images that are augmented with then_list.
- then_list: the augmenter(s) applied to the selected images (probability p).
- else_list: the augmenter(s) applied to the remaining images (probability 1-p). Each image is transformed by either then_list or else_list, never by both.
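The branching described above can be sketched in plain Python. This is only an illustration of the then/else logic, not imgaug code; `sometimes`, `then_fn` and `else_fn` are hypothetical names, and integers stand in for images.

```python
import random

def sometimes(p, then_fn, else_fn=None, rng=None):
    """Return a function that applies then_fn to roughly a fraction p of the
    items in a batch and else_fn (or nothing) to the rest."""
    rng = rng or random.Random()

    def apply(batch):
        out = []
        for item in batch:
            if rng.random() < p:
                out.append(then_fn(item))
            elif else_fn is not None:
                out.append(else_fn(item))
            else:
                out.append(item)  # left untouched
        return out

    return apply

# Integers stand in for images here.
aug = sometimes(0.5, then_fn=lambda v: v + 100, else_fn=lambda v: -v,
                rng=random.Random(42))
print(aug([1, 2, 3, 4]))
```

The decision is made separately for every item, so one batch usually contains a mix of both branches.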
3.5 iaa.WithColorspace()
Applies augmenters in a different color space: the image is first converted from its original color space to the target one, the child augmenters are applied there, and the result is converted back to the original color space.
iaa.WithColorspace(to_colorspace, from_colorspace='RGB', children=None, name=None, deterministic=False, random_state=None)
- to_colorspace: the color space in which the children are applied. The options are: RGB, BGR, GRAY, CIE, YCrCb, HSV, HLS, Lab, Luv
- from_colorspace: the original color space, RGB by default.
- children: the augmenters to apply.

    # Convert the image from RGB to HSV, increase the H channel by 10,
    # then convert it back to RGB.
    aug = iaa.WithColorspace(
        to_colorspace="HSV",
        from_colorspace="RGB",
        children=iaa.WithChannels(0, iaa.Add(10))
    )
3.6 iaa.WithChannels()
Applies augmenters to selected channels of the image only, then merges the transformed channels back into the image.
iaa.WithChannels(channels=None, children=None, name=None, deterministic=False, random_state=None)
Parameters:
- channels: int or list of int. The channel(s) to transform.
- children: the augmenters to apply to the selected channels.
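To make the idea concrete, here is a small sketch that works on a nested-list image instead of a numpy array. It is not how imgaug implements the augmenter; `with_channels` and `add_10` are hypothetical helpers written for this illustration.

```python
def with_channels(image, channels, fn):
    """Apply fn to the selected channel(s) of an H x W x C nested-list image
    and leave the other channels unchanged."""
    if isinstance(channels, int):
        channels = [channels]
    return [
        [
            [fn(v) if c in channels else v for c, v in enumerate(pixel)]
            for pixel in row
        ]
        for row in image
    ]

def add_10(v):
    # Clip to the uint8 range, since image values are expected to stay in 0-255.
    return min(255, v + 10)

image = [[[10, 20, 30], [250, 0, 5]]]  # 1 x 2 image with 3 channels
print(with_channels(image, 0, add_10))
# [[[20, 20, 30], [255, 0, 5]]]
```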
3.7 iaa.Noop()
No transformation. This is useful as a placeholder: you can still call augment_image(), but the image is returned unchanged. It comes in handy during testing, for example.
3.8 iaa.Lambda()
Applies user-defined functions as an augmenter.
iaa.Lambda(func_images, func_keypoints, name=None, deterministic=False, random_state=None)
Parameters:
- func_images: the function called for each batch of images. It must return the transformed images. Its signature is:
function(images, random_state, parents, hooks)
- func_keypoints: the function that transforms the keypoints of each image. It must return the transformed keypoints. Its signature is:
function(keypoints_on_images, random_state, parents, hooks)
Example:

    def func_images(images, random_state, parents, hooks):
        # Set every second row of every image to black
        images[:, ::2, :, :] = 0
        return images

    def func_keypoints(keypoints_on_images, random_state, parents, hooks):
        # Keypoints are returned unchanged
        return keypoints_on_images

    aug = iaa.Lambda(
        func_images=func_images,
        func_keypoints=func_keypoints
    )

This turns every second row of each image into a black stripe and leaves the keypoints unchanged.
3.9 iaa.AssertShape()
Asserts that the images and keypoints to be transformed have the expected shape, and raises an exception otherwise.
iaa.AssertShape(shape, check_images=True, check_keypoints=True, name=None, deterministic=False, random_state=None)
Parameters:
- shape: tuple, usually of the form (N, H, W, C). Each element can be None, an int, a tuple of two ints or a list of ints. If None, any value is accepted. If an int, only exactly that value is accepted at the corresponding position. If a tuple of two ints (a, b), the value x at that position must satisfy a <= x < b. If a list of ints, the value must be one of the list entries.

    # Check that every input image is 32 × 32 × 3. If so, flip horizontally;
    # otherwise raise an error.
    seq = iaa.Sequential([
        iaa.AssertShape((None, 32, 32, 3)),
        iaa.Fliplr(0.5)
    ])

    # Check that the height satisfies 32 <= H < 64, the width is 32 and the
    # number of channels is 1 or 3. If all conditions hold, flip
    # horizontally; otherwise raise an error.
    seq = iaa.Sequential([
        iaa.AssertShape((None, (32, 64), 32, [1, 3])),
        iaa.Fliplr(0.5)
    ])
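The matching rules for one shape entry can be written down in a few lines. The following sketch is an illustration in plain Python rather than imgaug's own check; `dim_ok` and `shape_ok` are hypothetical names, and the half-open interval for tuples follows the "32 <= h < 64" reading of the second example above.

```python
def dim_ok(value, rule):
    """Check one dimension against one entry of the expected shape."""
    if rule is None:
        return True                  # any value is accepted
    if isinstance(rule, int):
        return value == rule         # exact match
    if isinstance(rule, tuple):
        low, high = rule
        return low <= value < high   # half-open interval
    if isinstance(rule, list):
        return value in rule         # one of the listed values
    raise TypeError("unsupported rule: %r" % (rule,))

def shape_ok(shape, expected):
    return len(shape) == len(expected) and all(
        dim_ok(v, r) for v, r in zip(shape, expected)
    )

expected = (None, (32, 64), 32, [1, 3])
print(shape_ok((8, 48, 32, 3), expected))  # True
print(shape_ok((8, 64, 32, 3), expected))  # False: 64 is outside [32, 64)
print(shape_ok((8, 48, 32, 2), expected))  # False: 2 channels are not allowed
```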
3.10 iaa.Scale()
Scales the image to a fixed size.
iaa.Scale(size, interpolation='cubic', name=None, deterministic=False, random_state=None)
Parameters:
- size: the target size. If it is the string "keep", the image keeps its original size and is not scaled. If it is an int n, images are scaled to (n, n). If it is a float v, each image is scaled to (H*v, W*v), so images of different sizes still differ afterwards. If it is a tuple (a, b) and at least one of a and b is a float, a scaling factor is sampled from [a, b]; if both a and b are ints, an integer size is sampled from [a, b]. If it is a list, its entries must be either all ints or all floats (not mixed). If it is a dict, it must have the two keys height and width, and the value of each key is interpreted as described above. In addition, a value can be "keep-aspect-ratio", which means that side is scaled so that the aspect ratio is preserved.
- interpolation: the interpolation method. If it is ia.ALL, one of nearest, linear, area and cubic is chosen at random, possibly a different one for each image. If it is an int, it must be one of cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_AREA, cv2.INTER_CUBIC. If it is a string, that method is always used; it must be one of nearest, linear, area and cubic. If it is a list of ints or strings, one entry is chosen at random for each image.
3.11 iaa.CropAndPad()
Crops or pads the image. By default, padded areas are filled with black.
iaa.CropAndPad(px=None, percent=None, pad_mode='constant', pad_cval=0, keep_size=True, sample_independently=True, name=None, deterministic=False, random_state=None)
Parameters:
- px: the number of pixels to crop (negative values) or pad (positive values). It cannot be set together with percent. If None, pixel-based cropping/padding is not used. An int, int tuple or int list is interpreted as for the parameters described above. If it is a tuple with four elements, they stand for (top, right, bottom, left), and each element can be an int, an int tuple or an int list.
- percent: like px, but the amount is given as a fraction of the image size. It cannot be set together with px.
- pad_mode: the padding mode. Can be ia.ALL, a string or a list of strings. The available modes are: constant, edge, linear_ramp, maximum, median, minimum, reflect, symmetric, wrap. Their meaning is explained in the numpy documentation.
- pad_cval: float, int, float tuple, int tuple, float list or int list. The fill value used when pad_mode is constant.
- keep_size: bool. Cropping or padding changes the image size. If True, the image is resized back to its original size afterwards.
- sample_independently: bool. If False, a single value is sampled from px or percent and used for all four sides.
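How a four-element px tuple changes the image size (before any resizing by keep_size) can be worked out by hand. The sketch below does this arithmetic in plain Python; `size_after_crop_and_pad` is a hypothetical helper, and raising an error when nothing is left of the image is a choice made for this example, not documented imgaug behavior.

```python
def size_after_crop_and_pad(height, width, px):
    """Return (height, width) after applying px = (top, right, bottom, left).

    Negative entries crop that side, positive entries pad it.
    """
    top, right, bottom, left = px
    new_h = height + top + bottom
    new_w = width + left + right
    if new_h < 1 or new_w < 1:
        # Assumption for this sketch: refuse to crop the image away entirely.
        raise ValueError("cropping would remove the whole image")
    return new_h, new_w

print(size_after_crop_and_pad(64, 64, (-10, 0, -10, 0)))  # (44, 64)
print(size_after_crop_and_pad(64, 64, (5, 5, 5, 5)))      # (74, 74)
print(size_after_crop_and_pad(64, 64, (-8, 4, 0, 0)))     # (56, 68)
```

With keep_size enabled, the result would then be resized back to 64 × 64.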
3.12 iaa.Pad()
Same as iaa.CropAndPad(), but it only pads: only positive values are accepted.
3.13 iaa.Crop()
Same as iaa.CropAndPad(), but it only crops. The crop amount is given as a positive value, as in iaa.Crop(px=(0, 16)) in section 2.1.
3.14 iaa.Fliplr()
Horizontal mirror flip.
iaa.Fliplr(p=0, name=None, deterministic=False, random_state=None)
Parameters:
- p: int or float, the probability of flipping each picture
3.15 iaa.Flipud()
Vertical (upside-down) flip. The parameters are the same as for iaa.Fliplr().
3.16 iaa.ChangeColorspace()
Changes the color space of the image.
iaa.ChangeColorspace(to_colorspace, from_colorspace='RGB', alpha=1.0, name=None, deterministic=False, random_state=None)
Parameters:
- to_colorspace: see iaa.WithColorspace() above.
- from_colorspace: see iaa.WithColorspace() above.
- alpha: the blending weight of the new color space when it is overlaid on the old one. Can be an int, float, int tuple or float tuple.
3.17 iaa.Grayscale()
Converts the image to grayscale.
iaa.Grayscale(alpha=0, from_colorspace='RGB', name=None, deterministic=False, random_state=None)
Parameters:
- alpha: the blending weight of the grayscale image when it is overlaid on the original image.
3.18 iaa.GaussianBlur()
Gaussian blur.
iaa.GaussianBlur(sigma=0, name=None, deterministic=False, random_state=None)
Parameters:
- sigma: the standard deviation of the Gaussian kernel. Can be a float or a float tuple. Typical values range from 0 (no blur) to 3.0 (strong blur).
3.19 iaa.AverageBlur()
Blurs the image by replacing each pixel with the mean of its neighborhood.
iaa.AverageBlur(k=1, name=None, deterministic=False, random_state=None)
Parameters:
- k: the window (kernel) size. Can be an int or an int tuple. If it is a tuple whose two elements are themselves tuples, the first is used for the height and the second for the width, so the window does not have to be square.
3.20 iaa.MedianBlur()
Blurs the image by replacing each pixel with the median of its neighborhood.
iaa.MedianBlur(k=1, name=None, deterministic=False, random_state=None)
The parameter k has the same meaning as for iaa.AverageBlur().
3.21 iaa.Convolve()
Applies a convolution to the images.
iaa.Convolve(matrix=None, name=None, deterministic=False, random_state=None)
- matrix: the convolution matrix (kernel).
3.22 iaa.Sharpen()
Sharpen.
iaa.Sharpen(alpha=0, lightness=1, name=None, deterministic=False, random_state=None)
3.23 iaa.Emboss()
Relief effect.
iaa.Emboss(alpha=0, strength=1, name=None, deterministic=False, random_state=None)
3.24 iaa.EdgeDetect()
Edge detection.
iaa.EdgeDetect(alpha=0, name=None, deterministic=False, random_state=None)
3.25 iaa.DirectedEdgeDetect()
Edge detection in a specific direction.
iaa.DirectedEdgeDetect(alpha=0, direction=(0.0, 1.0), name=None, deterministic=False, random_state=None)
3.26 iaa.Add()
Adds a randomly sampled value to all pixels of an image.
iaa.Add(value=0, per_channel=False, name=None, deterministic=False, random_state=None)
3.27 iaa.AddElementwise()
Adds a value to each pixel, sampled separately for every pixel.
iaa.AddElementwise(value=0, per_channel=False, name=None, deterministic=False, random_state=None)
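The difference between the two augmenters is where the random value is drawn. The following plain-Python sketch illustrates this on a tiny grayscale image stored as nested lists; `add`, `add_elementwise` and `clip_uint8` are hypothetical helpers, and clipping to 0-255 is an assumption made to keep the values in the uint8 range.

```python
import random

def clip_uint8(v):
    return max(0, min(255, v))

def add(image, value_range, rng):
    """One offset is drawn per image and added to every pixel."""
    offset = rng.randint(*value_range)
    return [[clip_uint8(v + offset) for v in row] for row in image]

def add_elementwise(image, value_range, rng):
    """A fresh offset is drawn for every single pixel."""
    return [[clip_uint8(v + rng.randint(*value_range)) for v in row]
            for row in image]

rng = random.Random(0)
gray = [[100, 100, 100], [100, 100, 100]]     # 2 x 3 grayscale image
print(add(gray, (-10, 10), rng))              # all pixels change by the same amount
print(add_elementwise(gray, (-10, 10), rng))  # pixels change independently
```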
3.28 iaa.AdditiveGaussianNoise()
Add Gaussian noise.
iaa.AdditiveGaussianNoise(loc=0, scale=0, per_channel=False, name=None, deterministic=False, random_state=None)
3.29 iaa.Multiply()
Multiply each pixel in the image by a value to make the image brighter or darker.
iaa.Multiply(mul=1.0, per_channel=False, name=None, deterministic=False, random_state=None)
3.30 iaa.MultiplyElementwise()
Multiplies each pixel by a value that is sampled separately for every pixel.
iaa.MultiplyElementwise(mul=1.0, per_channel=False, name=None, deterministic=False, random_state=None)
3.31 iaa.Dropout()
Randomly remove some pixels, that is, turn them into 0.
iaa.Dropout(p=0, per_channel=False, name=None, deterministic=False, random_state=None)
3.32 iaa.CoarseDropout()
Sets randomly placed rectangular areas of the image to 0.
iaa.CoarseDropout(p=0, size_px=None, size_percent=None, per_channel=False, min_size=4, name=None, deterministic=False, random_state=None)
3.33 iaa.Invert()
Change each pixel value p to 255-p.
iaa.Invert(p=0, per_channel=False, min_value=0, max_value=255, name=None, deterministic=False, random_state=None)
3.34 iaa.ContrastNormalization()
Change the contrast of the image.
iaa.ContrastNormalization(alpha=1.0, per_channel=False, name=None, deterministic=False, random_state=None)
3.35 iaa.Affine()
Affine transformation, which includes translation, rotation, scaling and shearing. An affine transformation usually creates some new pixels. How these new pixels are filled is controlled by the cval and mode parameters. The order parameter sets the interpolation method.
iaa.Affine(scale=1.0, translate_percent=None, translate_px=None, rotate=0.0, shear=0.0, order=1, cval=0, mode='constant', name=None, deterministic=False, random_state=None)
Parameters:
- scale: the scaling factor. 1 means no scaling, and 0.5 means shrinking to 50%. Can be a float, a float tuple or a dict. If a float, all images are scaled by this factor. If a float tuple (a, b), a factor is sampled at random from [a, b] for each image and used for both axes. If a dict, it should have the two keys x and y, each of which can be a float or a float tuple; the x-axis and y-axis are then scaled independently.
- translate_percent: translation as a fraction of the image size. 0 means no translation, and 0.5 means 50%. Can be a float, a float tuple or a dict, with the same meaning as for scale. The sign gives the direction of the translation.
- translate_px: translation in pixels. Can be an int, an int tuple or a dict, with the same meaning as for translate_percent.
- rotate: rotation angle in degrees, 0 to 360. The sign gives the direction. Can be a float or a float tuple.
- shear: shear angle in degrees, 0 to 360. The sign gives the direction. Can be a float, an int, a float tuple or an int tuple.
- order: the interpolation order, as defined in skimage. Orders 0 and 1 are fast, 3 is slower, and 4 and 5 are particularly slow. Can be an int, an int list or ia.ALL. If ia.ALL, an order is chosen at random from all interpolation methods each time.
  - 0: Nearest-neighbor interpolation.
  - 1: Bilinear interpolation (default).
  - 2: Biquadratic interpolation (not recommended).
  - 3: Bicubic interpolation.
  - 4: Bi-quartic.
  - 5: Bi-quintic.
- cval: the constant value used to fill new pixels. It only takes effect when mode is constant. Can be an int, a float, a tuple or ia.ALL. If ia.ALL, a value is chosen at random from [0, 255].
- mode: how the new, empty pixels are filled after the transformation. Can be a string, a string list or ia.ALL. The available strings are:
  - constant: fill with a constant value.
  - edge: fill by repeating the edge pixels.
  - symmetric: fill by mirroring the image at its edge.
  - reflect: pads with the reflection of the vector mirrored on the first and last values of the vector along each axis.
  - wrap: pads with the wrap of the vector along the axis. The first values are used to pad the end and the end values are used to pad the beginning.
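The fill modes are easier to tell apart on a short 1D example. The sketch below pads a list on both sides and follows the meaning that numpy.pad gives to each mode name; it is an illustration only (`pad_1d` is a hypothetical helper), and it assumes that the augmenter's modes behave the same way along each image axis.

```python
def pad_1d(values, pad, mode, cval=0):
    """Pad a list on both sides, following numpy.pad's meaning of each mode.

    The "reflect" mode needs at least two values.
    """
    n = len(values)

    def source_index(i):
        if mode == "edge":
            return max(0, min(n - 1, i))
        if mode == "wrap":
            return i % n
        if mode == "symmetric":      # mirror, edge value repeated
            j = i % (2 * n)
            return j if j < n else 2 * n - 1 - j
        if mode == "reflect":        # mirror, edge value not repeated
            period = 2 * n - 2
            j = i % period
            return j if j < n else period - j
        raise ValueError("unknown mode: %s" % mode)

    out = []
    for i in range(-pad, n + pad):
        if 0 <= i < n:
            out.append(values[i])
        elif mode == "constant":
            out.append(cval)
        else:
            out.append(values[source_index(i)])
    return out

for mode in ["constant", "edge", "symmetric", "reflect", "wrap"]:
    print("%-9s %s" % (mode, pad_1d([1, 2, 3], 2, mode)))
# constant  [0, 0, 1, 2, 3, 0, 0]
# edge      [1, 1, 1, 2, 3, 3, 3]
# symmetric [2, 1, 1, 2, 3, 3, 2]
# reflect   [3, 2, 1, 2, 3, 2, 1]
# wrap      [2, 3, 1, 2, 3, 1, 2]
```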
3.36 iaa.PiecewiseAffine()
Places a regular grid of points on the image and randomly moves the neighborhood of these points. This leads to local distortions.
iaa.PiecewiseAffine(scale=0, nb_rows=4, nb_cols=4, order=1, cval=0, mode='constant', name=None, deterministic=False, random_state=None)
3.37 iaa.ElasticTransformation()
Transform by moving local pixels.
iaa.ElasticTransformation(alpha=0, sigma=0, name=None, deterministic=False, random_state=None)
4. Keypoint transformation
imgaug can transform the keypoints of an image together with the image. An example:

    import imgaug as ia
    from imgaug import augmenters as iaa
    import matplotlib.pyplot as plt

    ia.seed(1)

    image = ia.quokka(size=(256, 256))

    # Define 4 keypoints
    keypoints = ia.KeypointsOnImage([
        ia.Keypoint(x=65, y=100),
        ia.Keypoint(x=75, y=200),
        ia.Keypoint(x=100, y=100),
        ia.Keypoint(x=200, y=80)
    ], shape=image.shape)

    # Define a transformation sequence
    seq = iaa.Sequential([
        iaa.Multiply((1.2, 1.5)),  # changes the brightness, does not affect the keypoints
        iaa.Affine(
            rotate=10,
            scale=(0.5, 0.7)
        )  # rotates by 10 degrees and scales, affects the keypoints
    ])

    # Freeze the random parameters of the sequence, so that the image and the
    # keypoints are transformed in exactly the same way.
    # Call to_deterministic() once per batch; otherwise every batch is
    # augmented with the same transformation.
    seq_det = seq.to_deterministic()

    # The augment_* functions expect lists or batches. There is only one
    # image here, so use [0] to take out the image and its keypoints.
    image_aug = seq_det.augment_images([image])[0]
    keypoints_aug = seq_det.augment_keypoints([keypoints])[0]

    # Print coordinates before/after augmentation.
    # Use after.x_int and after.y_int to get rounded integer coordinates.
    for i in range(len(keypoints.keypoints)):
        before = keypoints.keypoints[i]
        after = keypoints_aug.keypoints[i]
        print("Keypoint %d: (%.8f, %.8f) -> (%.8f, %.8f)" % (
            i, before.x, before.y, after.x, after.y)
        )

    # Draw the keypoints on the image before/after augmentation
    image_before = keypoints.draw_on_image(image, size=7)
    image_after = keypoints_aug.draw_on_image(image_aug, size=7)

    fig, axes = plt.subplots(2, 1, figsize=(20, 15))
    plt.subplots_adjust(left=0.2, bottom=0.2, right=0.8, top=0.8, hspace=0.3, wspace=0.0)
    axes[0].set_title("image before")
    axes[0].imshow(image_before)
    axes[1].set_title("image after augmentation")
    axes[1].imshow(image_after)
    plt.show()
5. Bounding box transformation
imgaug can transform the bounding boxes in an image together with the image. Bounding box support includes:
- Encapsulating the bounding box as an object
- Transform the bounding box
- Draw the bounding box on the image
- Move the position of the bounding box, map the transformed bounding box to the image, and calculate the IoU of the bounding box.
5.1 basic transformation
Examples are as follows:
    import imgaug as ia
    from imgaug import augmenters as iaa
    import matplotlib.pyplot as plt

    ia.seed(1)

    image = ia.quokka(size=(256, 256))

    # Define 2 bounding boxes
    bbs = ia.BoundingBoxesOnImage([
        ia.BoundingBox(x1=65, y1=100, x2=200, y2=150),
        ia.BoundingBox(x1=150, y1=80, x2=200, y2=130)
    ], shape=image.shape)

    seq = iaa.Sequential([
        iaa.Multiply((1.2, 1.5)),  # changes the brightness, does not affect the bounding boxes
        iaa.Affine(
            translate_px={"x": 40, "y": 60},
            scale=(0.5, 0.7)
        )  # translates and scales, affects the bounding boxes
    ])

    # Freeze the random parameters of the sequence
    seq_det = seq.to_deterministic()

    # Transform the image and the bounding boxes
    image_aug = seq_det.augment_images([image])[0]
    bbs_aug = seq_det.augment_bounding_boxes([bbs])[0]

    # Print coordinates before/after augmentation.
    # Use .x1_int, .y1_int, ... to get integer coordinates.
    for i in range(len(bbs.bounding_boxes)):
        before = bbs.bounding_boxes[i]
        after = bbs_aug.bounding_boxes[i]
        print("BB %d: (%.4f, %.4f, %.4f, %.4f) -> (%.4f, %.4f, %.4f, %.4f)" % (
            i,
            before.x1, before.y1, before.x2, before.y2,
            after.x1, after.y1, after.x2, after.y2)
        )

    # Output:
    # BB 0: (65.0000, 100.0000, 200.0000, 150.0000) -> (130.7524, 171.3311, 210.1272, 200.7291)
    # BB 1: (150.0000, 80.0000, 200.0000, 130.0000) -> (180.7291, 159.5718, 210.1272, 188.9699)

    # Draw the bounding boxes on the image before/after augmentation
    image_before = bbs.draw_on_image(image, thickness=2)
    image_after = bbs_aug.draw_on_image(image_aug, thickness=2, color=[0, 0, 255])

    fig, axes = plt.subplots(2, 1, figsize=(20, 15))
    plt.subplots_adjust(left=0.2, bottom=0.2, right=0.8, top=0.8, hspace=0.3, wspace=0.0)
    axes[0].set_title("image before")
    axes[0].imshow(image_before)
    axes[1].set_title("image after augmentation")
    axes[1].imshow(image_after)
    plt.show()
5.2 translating the bounding box
Call the shift() method:

    import imgaug as ia
    from imgaug import augmenters as iaa

    ia.seed(1)

    # Define an image and two bounding boxes
    image = ia.quokka(size=(256, 256))
    bbs = ia.BoundingBoxesOnImage([
        ia.BoundingBox(x1=25, x2=75, y1=25, y2=75),
        ia.BoundingBox(x1=100, x2=150, y1=25, y2=75)
    ], shape=image.shape)

    # Move both boxes 25 pixels to the right, then move the second box
    # 25 pixels down
    bbs_shifted = bbs.shift(left=25)
    bbs_shifted.bounding_boxes[1] = bbs_shifted.bounding_boxes[1].shift(top=25)

    # Draw the boxes before (green) and after (blue) moving them
    image = bbs.draw_on_image(image, color=[0, 255, 0], thickness=2, alpha=0.75)
    image = bbs_shifted.draw_on_image(image, color=[0, 0, 255], thickness=2, alpha=0.75)
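The coordinate arithmetic behind this example is simple enough to check by hand. The sketch below repeats it with a plain namedtuple instead of imgaug's bounding box class; `Box` and `shift` are hypothetical stand-ins, and `left`/`top` are read as "distance added from the left/top edge", matching the comment in the example.

```python
from collections import namedtuple

Box = namedtuple("Box", "x1 y1 x2 y2")

def shift(box, left=0, top=0):
    """Move a box away from the left/top image edge by the given pixels."""
    return Box(box.x1 + left, box.y1 + top, box.x2 + left, box.y2 + top)

boxes = [Box(25, 25, 75, 75), Box(100, 25, 150, 75)]
shifted = [shift(b, left=25) for b in boxes]  # both boxes move right
shifted[1] = shift(shifted[1], top=25)        # second box also moves down
print(shifted)
# [Box(x1=50, y1=25, x2=100, y2=75), Box(x1=125, y1=50, x2=175, y2=100)]
```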
5.3 mapping of bounding box when the image is zoomed
Call the on() method:

    import imgaug as ia
    from imgaug import augmenters as iaa

    ia.seed(1)

    # Define an image with two bounding boxes
    image = ia.quokka(size=(256, 256))
    bbs = ia.BoundingBoxesOnImage([
        ia.BoundingBox(x1=25, x2=75, y1=25, y2=75),
        ia.BoundingBox(x1=100, x2=150, y1=25, y2=75)
    ], shape=image.shape)

    # Rescale the image and project the bounding boxes onto it
    image_rescaled = ia.imresize_single_image(image, (512, 512))
    bbs_rescaled = bbs.on(image_rescaled)

    # Draw the image before/after rescaling, each with its bounding boxes
    image_bbs = bbs.draw_on_image(image, thickness=2)
    image_rescaled_bbs = bbs_rescaled.draw_on_image(image_rescaled, thickness=2)
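Projecting a box onto a resized image amounts to multiplying its coordinates by the ratio of the new size to the old size. A minimal sketch of that arithmetic in plain Python follows; `project_box` is a hypothetical helper, not an imgaug function.

```python
def project_box(box, from_shape, to_shape):
    """Rescale (x1, y1, x2, y2) from an image of from_shape to one of to_shape.

    Shapes are (height, width) tuples.
    """
    from_h, from_w = from_shape
    to_h, to_w = to_shape
    sx, sy = to_w / from_w, to_h / from_h
    x1, y1, x2, y2 = box
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)

print(project_box((25, 25, 75, 75), (256, 256), (512, 512)))
# (50.0, 50.0, 150.0, 150.0)
print(project_box((100, 25, 150, 75), (256, 256), (512, 512)))
# (200.0, 50.0, 300.0, 150.0)
```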
5.4 calculation of Intersections, Unions and IoU
    import imgaug as ia
    from imgaug import augmenters as iaa
    import numpy as np

    ia.seed(1)

    # Define image with two bounding boxes.
    image = ia.quokka(size=(256, 256))
    bb1 = ia.BoundingBox(x1=50, x2=100, y1=25, y2=75)
    bb2 = ia.BoundingBox(x1=75, x2=125, y1=50, y2=100)

    # Compute intersection, union and IoU value.
    # Intersection and union are both bounding boxes. They are here
    # decreased/increased in size purely for better visualization.
    bb_inters = bb1.intersection(bb2).extend(all_sides=-1)
    bb_union = bb1.union(bb2).extend(all_sides=2)
    iou = bb1.iou(bb2)

    # Draw bounding boxes, intersection, union and IoU value on image.
    image_bbs = np.copy(image)
    image_bbs = bb1.draw_on_image(image_bbs, thickness=2, color=[0, 255, 0])
    image_bbs = bb2.draw_on_image(image_bbs, thickness=2, color=[0, 255, 0])
    image_bbs = bb_inters.draw_on_image(image_bbs, thickness=2, color=[255, 0, 0])
    image_bbs = bb_union.draw_on_image(image_bbs, thickness=2, color=[0, 0, 255])
    image_bbs = ia.draw_text(
        image_bbs, text="IoU=%.2f" % (iou,),
        x=bb_union.x2+10, y=bb_union.y1+bb_union.height//2,
        color=[255, 255, 255], size=13
    )
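For reference, the IoU of the two boxes above can be computed by hand. The following sketch uses plain tuples and the standard definition (intersection area divided by union area); the function names are hypothetical and the code does not call imgaug.

```python
def area(box):
    x1, y1, x2, y2 = box
    return max(0, x2 - x1) * max(0, y2 - y1)

def intersection(a, b):
    """Overlapping rectangle of a and b, or None if they do not overlap."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return (x1, y1, x2, y2) if x1 < x2 and y1 < y2 else None

def iou(a, b):
    inter = intersection(a, b)
    inter_area = area(inter) if inter else 0
    union_area = area(a) + area(b) - inter_area
    return inter_area / union_area if union_area else 0.0

bb1 = (50, 25, 100, 75)    # (x1, y1, x2, y2), same boxes as above
bb2 = (75, 50, 125, 100)
print(intersection(bb1, bb2))      # (75, 50, 100, 75)
print("IoU=%.2f" % iou(bb1, bb2))  # IoU=0.14
```

The intersection covers 25 × 25 = 625 pixels and each box covers 2500, so the union area is 4375 and the IoU is 625 / 4375 ≈ 0.14.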
6. Stochastic parameters
When augmenting, we usually want every image to be transformed differently, which is achieved by randomizing the parameters. Reproducing an earlier transformation then requires determinism, which is cumbersome to handle by hand. Stochastic parameters are used to avoid this. A stochastic parameter is an abstract representation of a probability distribution, such as a normal or uniform distribution. In general, all augmenters accept stochastic parameters, which makes it easy to control the range of each value. They can all be combined with deterministic mode.
Example:

    from imgaug import augmenters as iaa
    from imgaug import parameters as iap

    seq = iaa.Sequential([
        iaa.GaussianBlur(
            sigma=iap.Uniform(0.0, 1.0)
        ),
        iaa.ContrastNormalization(
            iap.Choice(
                [1.0, 1.5, 3.0],
                p=[0.5, 0.3, 0.2]
            )
        ),
        iaa.Affine(
            rotate=iap.Normal(0.0, 30),
            translate_px=iap.RandomSign(iap.Poisson(3))
        ),
        iaa.AddElementwise(
            iap.Discretize(
                (iap.Beta(0.5, 0.5) * 2 - 1.0) * 64
            )
        ),
        iaa.Multiply(
            iap.Positive(iap.Normal(0.0, 0.1)) + 1.0
        )
    ])
All available probability distributions are:
6.1 normal distribution
Normal(loc, scale): the mean value is loc and the standard deviation is scale.
    from imgaug import parameters as iap

    params = [
        iap.Normal(0, 1),
        iap.Normal(5, 3),
        iap.Normal(iap.Choice([-3, 3]), 1),
        iap.Normal(iap.Uniform(-3, 3), 1)
    ]
    iap.show_distributions_grid(params)
6.2 Laplace distribution
Laplace(loc, scale): the peak is located at loc and the width is given by scale:

    from imgaug import parameters as iap

    params = [
        iap.Laplace(0, 1),
        iap.Laplace(5, 3),
        iap.Laplace(iap.Choice([-3, 3]), 1),
        iap.Laplace(iap.Uniform(-3, 3), 1)
    ]
    iap.show_distributions_grid(params)
6.3 other continuous probability distributions include:
- Chi square distribution
- Weibull distribution
- Uniform distribution
- Beta distribution
6.4 discrete probability distribution
- Binomial distribution
- Discrete uniform
- Poisson distribution
6.5 mathematical operation of distribution
imgaug supports arithmetic operations on stochastic parameters. This lets you modify the values sampled from a distribution or combine several distributions with each other. The supported operations are:
- Add
- Subtract
- Multiply
- Divide
- Power
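How arithmetic on a distribution can work is sketched below with a tiny stand-in class. This is an illustration of the idea only and not imgaug's implementation; `Param`, `uniform` and `draw` are hypothetical names. The expression mirrors the `(iap.Beta(0.5, 0.5) * 2 - 1.0) * 64` pattern from the example in this section, with a uniform distribution swapped in for simplicity.

```python
import random

class Param:
    """Minimal stand-in for a stochastic parameter."""

    def __init__(self, sampler):
        self._sampler = sampler  # callable: rng -> number

    def draw(self, rng):
        return self._sampler(rng)

    @staticmethod
    def _wrap(other):
        # Plain numbers behave like parameters that always return themselves.
        return other if isinstance(other, Param) else Param(lambda rng: other)

    def __add__(self, other):
        o = Param._wrap(other)
        return Param(lambda rng: self.draw(rng) + o.draw(rng))

    def __sub__(self, other):
        o = Param._wrap(other)
        return Param(lambda rng: self.draw(rng) - o.draw(rng))

    def __mul__(self, other):
        o = Param._wrap(other)
        return Param(lambda rng: self.draw(rng) * o.draw(rng))

def uniform(a, b):
    return Param(lambda rng: rng.uniform(a, b))

# Map a sample from [0, 1] to [-64, 64].
offset = (uniform(0.0, 1.0) * 2 - 1.0) * 64
rng = random.Random(0)
print([round(offset.draw(rng), 1) for _ in range(3)])
```

Each operator returns a new parameter, so the arithmetic is only evaluated when a sample is drawn.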
6.6 special parameters
The following special parameters are supported:
- Deterministic
- Choice
- Clip
- Discretize
- Absolute
- RandomSign
- ForceSign
- Positive
- Negative
- FromLowerResolution
See the imgaug documentation for their exact meaning and usage.
7. Blending/Overlaying images
An augmenter normally changes the image directly and discards the original. Sometimes we only want to change part of the image, or combine the original image with the transformed one. This can be done by blending the images before and after the transformation with a certain weight (the α parameter) or with a pixel-wise mask.
An example is as follows:

    from imgaug import augmenters as iaa

    # First row
    iaa.Alpha(
        (0.0, 1.0),
        first=iaa.MedianBlur(11),
        per_channel=True
    )

    # Second row
    iaa.SimplexNoiseAlpha(
        first=iaa.EdgeDetect(1.0),
        per_channel=False
    )

    # Third row
    iaa.SimplexNoiseAlpha(
        first=iaa.EdgeDetect(1.0),
        second=iaa.ContrastNormalization((0.5, 2.0)),
        per_channel=0.5
    )

    # Fourth row
    iaa.FrequencyNoiseAlpha(
        first=iaa.Affine(
            rotate=(-10, 10),
            translate_px={"x": (-4, 4), "y": (-4, 4)}
        ),
        second=iaa.AddToHueAndSaturation((-40, 40)),
        per_channel=0.5
    )

    # Fifth row
    iaa.SimplexNoiseAlpha(
        first=iaa.SimplexNoiseAlpha(
            first=iaa.EdgeDetect(1.0),
            second=iaa.ContrastNormalization((0.5, 2.0)),
            per_channel=True
        ),
        second=iaa.FrequencyNoiseAlpha(
            exponent=(-2.5, -1.0),
            first=iaa.Affine(
                rotate=(-10, 10),
                translate_px={"x": (-4, 4), "y": (-4, 4)}
            ),
            second=iaa.AddToHueAndSaturation((-40, 40)),
            per_channel=True
        ),
        per_channel=True,
        aggregation_method="max",
        sigmoid=False
    )
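The weighting itself is a per-pixel linear combination. The sketch below shows it on tiny nested-list images in plain Python; `blend` is a hypothetical helper, and it assumes that the weight alpha belongs to the transformed image and 1 - alpha to the original one.

```python
def blend(original, augmented, alpha):
    """Pixel-wise alpha * augmented + (1 - alpha) * original.

    alpha may be a single float or a mask with one value per pixel.
    """
    out = []
    for r, (row_o, row_a) in enumerate(zip(original, augmented)):
        out_row = []
        for c, (o, a) in enumerate(zip(row_o, row_a)):
            w = alpha[r][c] if isinstance(alpha, list) else alpha
            out_row.append(round(w * a + (1 - w) * o))
        out.append(out_row)
    return out

original = [[100, 100], [100, 100]]
augmented = [[200, 200], [0, 0]]
print(blend(original, augmented, 0.5))               # [[150, 150], [50, 50]]
print(blend(original, augmented, [[1, 0], [0, 1]]))  # [[200, 100], [100, 0]]
```

A single float blends the whole image uniformly, while a mask decides for every pixel how much of the transformed image is visible.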
See the imgaug documentation for details on usage.