Hot questions for Using Neural networks in torchvision
I am new to PyTorch and have a problem with channels in AlexNet. I am using it for a 'GTA San Andreas self-driving car' project. I collected the dataset from black-and-white images that have one channel, and I am trying to train AlexNet using this script:
from AlexNetPytorch import*
import torchvision
import torchvision.transforms as transforms
import torch.optim as optim
import torch.utils.data
import numpy as np
import torch
from IPython.core.debugger import set_trace

AlexNet = AlexNet()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(AlexNet.parameters(), lr=0.001, momentum=0.9)

all_data = np.load('training_data.npy')
inputs= all_data[:,0]
labels= all_data[:,1]
inputs_tensors = torch.stack([torch.Tensor(i) for i in inputs])
labels_tensors = torch.stack([torch.Tensor(i) for i in labels])
data_set = torch.utils.data.TensorDataset(inputs_tensors,labels_tensors)
data_loader = torch.utils.data.DataLoader(data_set, batch_size=3,shuffle=True, num_workers=2)

if __name__ == '__main__':
    for epoch in range(8):
        runing_loss = 0.0
        for i,data in enumerate(data_loader , 0):
            inputs= data
            inputs = torch.FloatTensor(inputs)
            labels= data
            labels = torch.FloatTensor(labels)
            optimizer.zero_grad()
            # set_trace()
            inputs = torch.unsqueeze(inputs, 1)
            outputs = AlexNet(inputs)
            loss = criterion(outputs , labels)
            loss.backward()
            optimizer.step()
            runing_loss +=loss.item()
            if i % 2000 == 1999: # print every 2000 mini-batches
                print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 2000))
                running_loss = 0.0
    print('finished')
I am using AlexNet from the link: https://github.com/pytorch/vision/blob/master/torchvision/models/alexnet.py
But I changed line 18 from:
nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2)
to:
nn.Conv2d(1, 64, kernel_size=11, stride=4, padding=2)
I made this change because my training images have only one channel, but I get this error:
  File "training_script.py", line 44, in <module>
    outputs = AlexNet(inputs)
  File "C:\Users\Mukhtar\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Mukhtar\Documents\AI_projects\gta\AlexNetPytorch.py", line 34, in forward
    x = self.features(x)
  File "C:\Users\Mukhtar\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Mukhtar\Anaconda3\lib\site-packages\torch\nn\modules\container.py", line 91, in forward
    input = module(input)
  File "C:\Users\Mukhtar\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Mukhtar\Anaconda3\lib\site-packages\torch\nn\modules\pooling.py", line 142, in forward
    self.return_indices)
  File "C:\Users\Mukhtar\Anaconda3\lib\site-packages\torch\nn\functional.py", line 396, in max_pool2d
    ret = torch._C._nn.max_pool2d_with_indices(input, kernel_size, stride, padding, dilation, ceil_mode)
RuntimeError: Given input size: (256x1x1). Calculated output size: (256x0x0). Output size is too small at c:\programdata\miniconda3\conda-bld\pytorch-cpu_1532499824793\work\aten\src\thnn\generic/SpatialDilatedMaxPooling.c:67
I don't know what is wrong. Is it wrong to change the channel size like this? If it is, can you please point me to a neural network that works with one channel? As I said, I am a newbie in PyTorch and I don't want to write the network myself.
Your error is not related to using grayscale images instead of RGB. It is about the spatial dimensions of the input: while the input image was being forwarded through the net, its size (in feature space) shrank to zero. The traceback shows this directly, with a max-pooling layer receiving a 256x1x1 feature map and computing a 256x0x0 output. You can use this nice guide to see what happens to the output size of each layer (conv/pooling) as a function of kernel size, stride and padding. AlexNet expects its input images to be 224 by 224 pixels, so make sure your inputs are of that size.
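To make the shrinking concrete, here is a minimal sketch that traces the spatial size through the conv and pooling layers of torchvision's AlexNet `features` block, using the usual formula floor((n + 2*padding - kernel) / stride) + 1. The helper names `out_size` and `trace_sizes` are made up for illustration, and the 32-pixel input is only a hypothetical example of a too-small image; the question does not state the actual image size.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a conv/pool layer (floor mode, no dilation)."""
    return (n + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding) for the conv and max-pool layers in
# torchvision's AlexNet `features` block; ReLU layers do not change the size.
ALEXNET_FEATURES = [
    ("conv1", 11, 4, 2),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),
    ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1),
    ("conv4", 3, 1, 1),
    ("conv5", 3, 1, 1),
    ("pool3", 3, 2, 0),
]


def trace_sizes(n):
    """Return [(layer name, output size)] for a square input of side n."""
    sizes = []
    for name, kernel, stride, padding in ALEXNET_FEATURES:
        n = out_size(n, kernel, stride, padding)
        sizes.append((name, n))
    return sizes


print(trace_sizes(224))  # ends with ('pool3', 6)
print(trace_sizes(32))   # ends with ('pool3', 0): the last pool sees a 1x1 map
```

With a 224-pixel input the last pooling layer still produces a 6x6 map, while a 32-pixel input is already down to 1x1 before the last pool, which matches the shape reported in the error.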
Other things you overlooked:
You are using the AlexNet architecture, but you are initializing it with random weights instead of pretrained weights (trained on ImageNet). To get a trained copy of AlexNet, instantiate the net like this:
AlexNet = alexnet(pretrained=True)
Once you decide to use the pretrained net, you cannot change its first layer from 3 input channels to one (the trained weights simply won't fit). The easiest fix is to make your input images "colorful" by simply repeating the single channel three times. See repeat() for more info.
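As a minimal sketch of the channel-repeating trick, assuming the batch is laid out as (N, 1, H, W): the random tensor below is only a stand-in for real grayscale images, and the variable names are illustrative.

```python
import torch

# Hypothetical batch of 4 single-channel images, shape (N, 1, H, W).
gray_batch = torch.rand(4, 1, 224, 224)

# Tile the channel dimension three times; the other dimensions are kept.
rgb_like = gray_batch.repeat(1, 3, 1, 1)

print(rgb_like.shape)  # torch.Size([4, 3, 224, 224])
```

All three channels hold the same values, so the image content is unchanged; only the shape now matches what a 3-channel first layer expects.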
I want to train a SqueezeNet 1.1 model using the MNIST dataset instead of the ImageNet dataset. Can I use the same model as torchvision.models.squeezenet? Thanks!
TorchVision provides only an ImageNet-pretrained model for the SqueezeNet architecture. However, you can train your own model on the MNIST dataset by taking only the model architecture (not the pretrained weights) from torchvision.models:
In : import torchvision as tv

# get the model architecture only; ignore `pretrained` flag
In : squeezenet11 = tv.models.squeezenet1_1()

In : squeezenet11.training
Out: True
Now, you can use this architecture to train a model on MNIST data, which should not take too long.
One modification to keep in mind is the number of classes, which is 10 for MNIST. Specifically, the 1000 in the classifier below should be changed to 10, and the kernel size and stride adjusted accordingly.
(classifier): Sequential(
  (0): Dropout(p=0.5)
  (1): Conv2d(512, 1000, kernel_size=(1, 1), stride=(1, 1))
  (2): ReLU(inplace)
  (3): AvgPool2d(kernel_size=13, stride=1, padding=0)
)
Here's the relevant explanation: finetuning_torchvision_models-squeezenet
I use code similar to the following for data augmentation:
from torchvision import transforms
#...
augmentation = transforms.Compose([
    transforms.RandomApply([
        transforms.RandomRotation([-30, 30])
    ], p=0.5),
    transforms.RandomHorizontalFlip(p=0.5),
])
During my testing I want to fix random values to reproduce the same random parameters each time I change the model training settings. How can I do it?
I want to do something similar to np.random.seed(0), so that each time I call a random function with a probability for the first time, it runs with the same rotation angle and probability. In other words, if I do not change the code at all, it must reproduce the same result when I rerun it.
Alternatively, I can separate the transforms, use p=1, fix the angle max to a particular value, and use numpy random numbers to generate results, but my question is whether I can do it while keeping the code above unchanged.
In the __getitem__ of your dataset class, make a numpy random seed and apply it before each transform:
def __getitem__(self, index):
    img = io.imread(self.labels.iloc[index,0])
    target = self.labels.iloc[index,1]

    seed = np.random.randint(2147483647) # make a seed with numpy generator
    random.seed(seed) # apply this seed to img transforms
    if self.transform is not None:
        img = self.transform(img)

    random.seed(seed) # apply this seed to target transforms
    if self.target_transform is not None:
        target = self.target_transform(target)

    return img, target