By Matthew Millar, R&D Scientist at ユニファ
Purpose:
This blog will cover a method for combining unsupervised learning with supervised learning. I will show how to train an autoencoder and then combine it with a neural network for a classification problem in PyTorch.
Data Processing:
The first step is easy, as the same data loaders can be used for training both the autoencoder and the neural network.
I will be using the CIFAR-10 dataset, as it is available to everyone and is easy to work with.
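As a side note, if you do not already have CIFAR-10 on disk in the folder layout used below, torchvision can download it directly. A minimal sketch (the ./data root is just an example path):

from torchvision import datasets, transforms

# Download CIFAR-10 (50,000 training / 10,000 test images) into ./data if it is not already there
cifar_train = datasets.CIFAR10(root='./data', train=True, download=True,
                               transform=transforms.ToTensor())
cifar_test = datasets.CIFAR10(root='./data', train=False, download=True,
                              transform=transforms.ToTensor())
print(len(cifar_train), len(cifar_test))  # 50000 10000

These dataset objects can be passed to a DataLoader in place of the ImageFolder datasets used in this post.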
# Imports (added for completeness)
import numpy as np
import torch
from torchvision import datasets, transforms
from torch.utils.data import DataLoader, SubsetRandomSampler

use_gpu = torch.cuda.is_available()  # train on the GPU if one is available

# Basic transforms
SIZE = (32, 32)  # Resize the images to this shape

# Test/basic transform. This will reshape the raw image and turn it into a tensor for PyTorch
basic = transforms.Compose([transforms.Resize(SIZE),
                            transforms.ToTensor()])

# Normalization constants (0.4914, 0.4822, 0.4465), (0.247, 0.243, 0.261)
# retrieved from https://github.com/kuangliu/pytorch-cifar/issues/19
mean = (0.4914, 0.4822, 0.4465)  # Mean
std = (0.247, 0.243, 0.261)      # Standard deviation

# This will resize the image and then normalize it
norm_tran = transforms.Compose([transforms.Resize(SIZE),
                                transforms.ToTensor(),
                                transforms.Normalize(mean=mean, std=std)])

# Simple data augmentation
'''
Randomly flip the images horizontally, which covers more orientations.
Randomly rotate the images by up to 15 degrees, adding still more orientations
while limiting the black-border issue of larger rotations.
Randomly resize and crop, which resizes the image and removes any excess to act like a zoom.
Finally, turn each image into a tensor and normalize it.
'''
aug_tran = transforms.Compose([transforms.RandomHorizontalFlip(),
                               transforms.RandomRotation(15),
                               transforms.RandomResizedCrop(SIZE, scale=(0.08, 1.0),
                                                            ratio=(0.75, 1.3333333333333333),
                                                            interpolation=3),
                               transforms.ToTensor(),
                               transforms.Normalize(mean=mean, std=std)])

# Create the datasets
# TRAIN_DIR and TEST_DIR point to ImageFolder-style train/test directories
train_dataset = datasets.ImageFolder(TRAIN_DIR, transform=aug_tran)
test_dataset = datasets.ImageFolder(TEST_DIR, transform=norm_tran)  # No augmentation for the test set

# Data loaders
# Parameters for setting up the data loaders
BATCH_SIZE = 32
NUM_WORKERS = 4
VALIDATION_SIZE = 0.15  # Validation split

num_train = len(train_dataset)    # Number of training samples
indices = list(range(num_train))  # Create an index for each sample
np.random.shuffle(indices)        # Randomly sample by shuffling the indices
split = int(np.floor(VALIDATION_SIZE * num_train))     # Index at which to split off validation
train_idx, val_idx = indices[split:], indices[:split]  # Create the train and validation index sets
train_sampler = SubsetRandomSampler(train_idx)         # Subsample using PyTorch
validation_sampler = SubsetRandomSampler(val_idx)      # Same, but for validation

# Create the data loaders
train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, sampler=train_sampler, num_workers=NUM_WORKERS)
validation_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, sampler=validation_sampler, num_workers=NUM_WORKERS)
test_loader = DataLoader(test_dataset, batch_size=BATCH_SIZE, shuffle=False, num_workers=NUM_WORKERS)
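Before training anything, it is worth pulling one batch from the loader to confirm the shapes; a quick sketch using the loaders defined above:

# Grab a single batch from the training loader and inspect it
images, labels = next(iter(train_loader))
print(images.shape)  # torch.Size([32, 3, 32, 32]) -> (batch, channels, height, width)
print(labels.shape)  # torch.Size([32]) -> one class index per image
print(images.min().item(), images.max().item())  # roughly -2 to 2 after normalization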
Also, I have a list of data loader examples on my Kaggle page for both PyTorch and Keras if you would like to learn how to build custom data loaders and datasets with both frameworks.
https://www.kaggle.com/matthewmillar/pytorchdataloaderexamples
https://www.kaggle.com/matthewmillar/kerasgeneratorexamples
Autoencoder:
An autoencoder is an unsupervised method of learning encodings of data that can be processed efficiently. This is done through dimensionality reduction and by ignoring noise in the dataset. There are two sides to an autoencoder: the encoder and the decoder. The encoder's job is to create a useful encoding that removes unwanted noise from the data while keeping the most important parts. The decoder's job is to take the encoding and reassemble it into the original input form. Below is the autoencoder that we will be using as the feature-extraction system in our combination model.
The approach taken here is to train the autoencoder separately instead of together with the NN. This allows us to check the output of both the encoder and the decoder and see how well they work.
# Define the autoencoder architecture
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    def __init__(self):
        super(ConvAutoencoder, self).__init__()
        ## encoder layers ##
        # conv layer (depth from 3 --> 16), 3x3 kernels
        self.conv1 = nn.Conv2d(3, 16, 3, padding=1)
        # conv layer (depth from 16 --> 4), 3x3 kernels
        self.conv2 = nn.Conv2d(16, 4, 3, padding=1)
        # pooling layer to reduce x-y dims by two; kernel and stride of 2
        self.pool = nn.MaxPool2d(2, 2)
        ## decoder layers ##
        # a kernel of 2 and a stride of 2 will double the spatial dims
        self.t_conv1 = nn.ConvTranspose2d(4, 16, 2, stride=2)
        self.t_conv2 = nn.ConvTranspose2d(16, 3, 2, stride=2)

    def forward(self, x):
        ## encode ##
        # hidden conv layers with relu activations and max pooling after each
        x = torch.relu(self.conv1(x))
        x = self.pool(x)
        x = torch.relu(self.conv2(x))
        x = self.pool(x)  # compressed representation
        ## decode ##
        # transpose conv layers with relu activations
        x = torch.relu(self.t_conv1(x))
        # output layer (sigmoid scales the output to the 0-1 range)
        x = torch.sigmoid(self.t_conv2(x))
        return x

ae_model = ConvAutoencoder()
if use_gpu:
    ae_model = ae_model.cuda()

# Loss and optimizer
loss_function = nn.MSELoss()
optimizer = torch.optim.Adam(ae_model.parameters(), lr=0.001)
# Automatically reduce the learning rate when the loss plateaus
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer=optimizer, mode='min',
                                                       factor=0.1, patience=3, verbose=True)

# Number of epochs to train the model
n_epochs = 35
ae_model_filename = 'cifar_autoencoder.pt'
train_loss_min = np.Inf  # track change in training loss

ae_train_loss_matrix = []
for epoch in range(1, n_epochs + 1):
    # monitor training loss
    train_loss = 0.0

    ###################
    # train the model #
    ###################
    for data in train_loader:
        # _ stands in for the labels, which the autoencoder does not need
        images, _ = data
        if use_gpu:
            images = images.cuda()
        # clear the gradients of all optimized variables
        optimizer.zero_grad()
        # forward pass: compute reconstructions by passing the inputs to the model
        outputs = ae_model(images)
        # calculate the reconstruction loss
        loss = loss_function(outputs, images)
        # backward pass: compute the gradient of the loss with respect to the model parameters
        loss.backward()
        # perform a single optimization step (parameter update)
        optimizer.step()
        # update the running training loss
        train_loss += loss.item() * images.size(0)

    # print average training statistics
    train_loss = train_loss / len(train_loader.sampler)
    scheduler.step(train_loss)
    ae_train_loss_matrix.append([train_loss, epoch])
    print('Epoch: {} \tTraining Loss: {:.6f}'.format(epoch, train_loss))

    # save the model if the training loss has decreased
    if train_loss <= train_loss_min:
        print('Training loss decreased ({:.6f} --> {:.6f}). Saving model ...'.format(
            train_loss_min, train_loss))
        torch.save(ae_model.state_dict(), ae_model_filename)
        train_loss_min = train_loss
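At this point the original post shows a figure comparing test images with their reconstructions. A minimal sketch of how to produce such a comparison with matplotlib, assuming the loaders and normalization constants from above (the inputs need un-normalizing for display, while the sigmoid output is already in the 0-1 range):

import matplotlib.pyplot as plt

# Reconstruct one batch of test images with the trained autoencoder
ae_model.eval()
images, _ = next(iter(test_loader))
if use_gpu:
    images = images.cuda()
with torch.no_grad():
    outputs = ae_model(images)

def unnormalize(img):
    # Undo transforms.Normalize for display: x * std + mean, then CHW -> HWC
    img = img.cpu().numpy().transpose(1, 2, 0)
    return np.clip(img * np.array(std) + np.array(mean), 0, 1)

# Top row: originals; bottom row: reconstructions
fig, axes = plt.subplots(2, 8, figsize=(16, 4))
for i in range(8):
    axes[0, i].imshow(unnormalize(images[i]))
    axes[1, i].imshow(outputs[i].cpu().numpy().transpose(1, 2, 0))
    axes[0, i].axis('off')
    axes[1, i].axis('off')
plt.show()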
Looking at the reconstructions above, the encoder works well enough, so we can use it with confidence.
Neural Network:
This is the classification and supervised-learning part of the model. The first thing we need to do is freeze the autoencoder to ensure that its weights and biases do not get updated during training. Then we define the NN, using the autoencoder's max-pooling layer as the output (the encoder part) and adding fully connected layers on top, along with a dropout layer to help regularize the model.
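Because the autoencoder was trained in a separate run, its weights need to be available before building the combined model. A minimal sketch of reloading the checkpoint saved earlier:

# Recreate the autoencoder and load the weights saved during its training run
ae_model = ConvAutoencoder()
ae_model.load_state_dict(torch.load(ae_model_filename))  # 'cifar_autoencoder.pt'
ae_model.eval()
if use_gpu:
    ae_model = ae_model.cuda()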
Here are the model definition and the training code.
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        # Keep only the encoder layers (drop the two transpose conv layers)
        image_modules = list(ae_model.children())[:-2]
        self.modelA = nn.Sequential(*image_modules)
        # Shape of the encoder output for a 32x32 input = 4 x 16 x 16
        # (nn.Sequential applies the child modules conv1, conv2, pool in order,
        # so only one pooling step happens here)
        self.fc1 = nn.Linear(4 * 16 * 16, 1024)
        self.fc2 = nn.Linear(1024, 512)
        self.out = nn.Linear(512, 10)
        self.drop = nn.Dropout(0.2)

    def forward(self, x):
        x = self.modelA(x)
        x = x.view(x.size(0), 4 * 16 * 16)
        x = torch.relu(self.fc1(x))
        x = self.drop(x)
        x = torch.relu(self.fc2(x))
        x = self.drop(x)
        x = self.out(x)
        return x

model = MyModel()
if use_gpu:
    model = model.cuda()

# Freeze the autoencoder layers so they do not train; we trained them already.
# Train only the linear layers.
for child in model.children():
    if isinstance(child, nn.Linear):
        print("Setting Layer {} to be trainable".format(child))
        for param in child.parameters():
            param.requires_grad = True
    else:
        for param in child.parameters():
            param.requires_grad = False

# Optimizer and loss function
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)
# Automatically reduce the learning rate when the validation loss plateaus
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer=optimizer, mode='min',
                                                       factor=0.1, patience=3, verbose=True)

model_filename = 'model_cifar10.pt'
n_epochs = 40
valid_loss_min = np.Inf  # track change in validation loss

train_loss_matrix = []
val_loss_matrix = []
val_acc_matrix = []

for epoch in range(1, n_epochs + 1):
    # keep track of training and validation loss
    train_loss = 0.0
    valid_loss = 0.0

    ###################
    # train the model #
    ###################
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        # move tensors to the GPU if CUDA is available
        if use_gpu:
            data, target = data.cuda(), target.cuda()
        # clear the gradients of all optimized variables
        optimizer.zero_grad()
        # forward pass: compute predicted outputs by passing the inputs to the model
        output = model(data)
        # calculate the batch loss
        loss = criterion(output, target)
        # backward pass: compute the gradient of the loss with respect to the model parameters
        loss.backward()
        # perform a single optimization step (parameter update)
        optimizer.step()
        # update the training loss
        train_loss += loss.item() * data.size(0)

    ######################
    # validate the model #
    ######################
    model.eval()
    val_acc = 0.0
    for batch_idx, (data, target) in enumerate(validation_loader):
        # move tensors to the GPU if CUDA is available
        if use_gpu:
            data, target = data.cuda(), target.cuda()
        # forward pass: compute predicted outputs by passing the inputs to the model
        output = model(data)
        # calculate the batch loss
        loss = criterion(output, target)
        # update the running validation loss and accuracy
        # (calc_accuracy is a helper the post does not define; a sketch is given below)
        valid_loss += loss.item() * data.size(0)
        val_acc += calc_accuracy(output, target)

    # calculate the average losses
    train_loss = train_loss / len(train_loader.sampler)
    valid_loss = valid_loss / len(validation_loader.sampler)
    scheduler.step(valid_loss)

    # save the losses and accuracy to plot later
    train_loss_matrix.append([train_loss, epoch])
    val_loss_matrix.append([valid_loss, epoch])
    val_acc_matrix.append([val_acc, epoch])

    # print training/validation statistics
    print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}\tValidation Accuracy: {:.6f}'.format(
        epoch, train_loss, valid_loss, val_acc))

    # save the model if the validation loss has decreased
    if valid_loss <= valid_loss_min:
        print('Validation loss decreased ({:.6f} --> {:.6f}). Saving model ...'.format(
            valid_loss_min, valid_loss))
        torch.save(model.state_dict(), model_filename)
        valid_loss_min = valid_loss
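The loop above calls a calc_accuracy helper that the post never defines, and the per-class numbers below are reported without the evaluation code. Here is a minimal sketch of both, assuming calc_accuracy returns the fraction of correct predictions in a batch (with this definition the printed val_acc is a sum over batches rather than an average) and that the class folders follow the standard CIFAR-10 names, which ImageFolder sorts alphabetically:

def calc_accuracy(output, target):
    # Fraction of correct predictions in this batch
    _, preds = torch.max(output, 1)
    return (preds == target).sum().item() / target.size(0)

# Per-class accuracy on the test set
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']
class_correct = [0] * 10
class_total = [0] * 10

model.eval()
with torch.no_grad():
    for data, target in test_loader:
        if use_gpu:
            data, target = data.cuda(), target.cuda()
        output = model(data)
        _, preds = torch.max(output, 1)
        for i in range(target.size(0)):
            label = target[i].item()
            class_total[label] += 1
            class_correct[label] += int(preds[i].item() == label)

for i in range(10):
    print('Test Accuracy of {}: {}% ({}/{})'.format(
        class_names[i], int(100 * class_correct[i] / class_total[i]),
        class_correct[i], class_total[i]))
print('Test Accuracy (Overall): {}% ({}/{})'.format(
    int(100 * sum(class_correct) / sum(class_total)),
    sum(class_correct), sum(class_total)))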
After training, evaluating the model on the test set gives the final accuracy for each class:
Test Accuracy of airplane: 45% (231/504)
Test Accuracy of automobile: 61% (312/504)
Test Accuracy of bird: 18% (91/496)
Test Accuracy of cat: 11% (55/496)
Test Accuracy of deer: 27% (139/504)
Test Accuracy of dog: 35% (181/504)
Test Accuracy of frog: 63% (315/496)
Test Accuracy of horse: 49% (244/496)
Test Accuracy of ship: 59% (298/504)
Test Accuracy of truck: 46% (234/504)

Test Accuracy (Overall): 41% (2100/5008)
Conclusion:
Looking at the loss and validation accuracy, the accuracy climbs steadily (albeit a little jumpily) while both losses decrease, with the validation loss consistently below the training loss. This suggests the model is neither overfitting nor underfitting, so it is learning well. The accuracy is a little low compared to a purely supervised approach, but given enough training time the accuracy could climb higher.