
- #Keras data augmentation image mask generator#
- #Keras data augmentation image mask full#
- #Keras data augmentation image mask code#
The Python class ImageDataGenerator_landmarks is available at my GitHub account.

Why data augmentation?

Deep learning models are data greedy, and a model may perform surprisingly badly when the testing images differ a lot from the training images. Data augmentation is an essential technique for making the most of a limited number of training images. In my previous blog post, I saw poor performance from a deep learning model when the testing images contained translated versions of the training images. However, the model performance improved when the training data also contained translated images. This experiment shows that it is essential to increase the data size using data augmentation to develop a robust deep learning model.

Keras's ImageDataGenerator and its limit

Data augmentation could increase the number of training images substantially, which could raise a storage problem. Keras has a powerful API called ImageDataGenerator that resolves this problem. The generator can generate augmented images from the training images on the fly. This generator has been used in many of my previous blog posts, for example:

- CNN modeling with image translations using MNIST data

Although it is a powerful and popular API, it is limited to image classification problems where the target does not depend on the translation of images. For example, the image of a dog is still an image of a dog even if the image is shifted by 3 pixels. So the target label "dog" does not need to be translated. In landmark detection or facial keypoint detection, the target values also need to change when an image is translated. That means that if the image of a face is shifted by 3 pixels, the (x, y) coordinates of the eye location also need to be shifted. I was looking for some existing API that can translate both images and coordinates.

Keras's ImageDataGenerator for facial keypoint detection problem

In my previous blog post Achieving Top 23% in Kaggle's Facial Keypoints Detection with Keras + Tensorflow, I implemented a Python class that can flip the image horizontally and shift the image along both the horizontal and vertical axes while adjusting the landmark coordinates. But there are so many other transformations that I want to do, e.g., shearing, zooming, or all of them at once! And I do not want to code a rotation matrix by myself!

I came up with a rather simple approach that takes full advantage of Keras's ImageDataGenerator. Although this is probably not the most optimized approach, it is very simple, and the method allows us to use all of Keras's ImageDataGenerator functionality for the landmark detection problem.

The idea is simple: I will create a mask having the same size as the image. The pixels of the mask corresponding to a landmark are indexed. The original image is augmented with this mask as the 4th channel (assuming that the image has 3 channels). Then we will pass this 4-channel image to Keras's ImageDataGenerator and find where the indexed landmark will be after image translation.

The code below shows how I implemented this approach.

    from import img_to_array, load_img
    dir_data = "DrivFace/"
    # For this data, we have annotation right eye, left eye, nose, right mouth and left mouth
    landmarks = img = img_to_array(load_img(dir_data + "/DrivImages/20130529_01_Driv_001_f.jpg"))
    row_name = row = # yLM: row = pd. columns = row_name row

    class ImageDataGenerator_landmarks(object):
        def __init__(self, datagen,
                     preprocessing_function=lambda x, y: (x, y),
                     loc_xRE=None, loc_xLE=None, flip_indicies=None,
                     target_shape=None, ignore_horizontal_flip=True):
            '''
            datagen : Keras's ImageDataGenerator
            preprocessing_function : The function that will be applied to each input.
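The mask idea described above can be sketched with NumPy alone. This is only an illustration under stated assumptions, not the ImageDataGenerator_landmarks class itself: the helper names `add_landmark_mask`, `read_landmarks` and `shift_right_down` are hypothetical, the landmark positions are made up, and the whole-pixel shift is a stand-in for whatever transformation Keras's ImageDataGenerator would apply to the 4-channel array.

```python
import numpy as np


def add_landmark_mask(image, landmarks):
    """Append a 4th channel whose pixel at landmark k holds the value k + 1."""
    height, width, _ = image.shape
    mask = np.zeros((height, width, 1), dtype=image.dtype)
    for k, (x, y) in enumerate(landmarks):
        mask[int(y), int(x), 0] = k + 1  # 0 is reserved for "no landmark"
    return np.concatenate([image, mask], axis=-1)


def read_landmarks(image_with_mask, n_landmarks):
    """Recover (x, y) for each landmark index from the mask channel; None if lost."""
    mask = image_with_mask[..., -1]
    recovered = []
    for k in range(n_landmarks):
        ys, xs = np.where(mask == k + 1)
        recovered.append((int(xs[0]), int(ys[0])) if len(xs) else None)
    return recovered


def shift_right_down(arr, dx, dy):
    """Stand-in transform (NOT Keras): shift by whole pixels, zero-filling the border."""
    out = np.zeros_like(arr)
    h, w = arr.shape[:2]
    out[dy:, dx:] = arr[:h - dy, :w - dx]
    return out


image = np.random.rand(32, 32, 3).astype("float32")
landmarks = [(10, 12), (20, 12), (15, 18)]  # hypothetical (x, y) pixel positions

stacked = add_landmark_mask(image, landmarks)    # shape (32, 32, 4)
shifted = shift_right_down(stacked, dx=3, dy=0)  # image and mask move together
new_landmarks = read_landmarks(shifted, len(landmarks))
print(new_landmarks)  # [(13, 12), (23, 12), (18, 18)]
```

Because the image and the mask are transformed as one array, the landmark coordinates never have to be recomputed by hand. One caveat worth keeping in mind: a transformation that interpolates pixel values (rotation, zoom, shear) may smear or drop a single-pixel mark, so a real implementation probably needs to handle landmarks that come back as missing, as `read_landmarks` does here by returning None.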
