Hot questions for using neural networks in Azure Machine Learning Studio


Question:

I have a minimal example of a neural network with a back-propagation trainer, and I am testing it on the Iris data set. I started off with 7 hidden nodes and it worked well.

I lowered the number of nodes in the hidden layer to 1 (expecting it to fail), but was surprised to see that the accuracy went up.

I set up the same experiment in Azure ML, just to validate that it wasn't my code. Same thing there: 98.3333% accuracy with a single hidden node.

Can anyone explain to me what is happening here?


Answer:

First, it has been well established that a variety of classification models yield incredibly good results on Iris (Iris is very predictable); see here, for example.

Secondly, we can observe that there are relatively few features in the Iris dataset. Moreover, if you look at the dataset description you can see that two of the features are very highly correlated with the class outcomes.

These correlation values are linear, single-feature correlations, which indicates that one can most likely apply a linear model and observe good results. Neural nets are highly nonlinear; they become more and more complex and capture greater and greater nonlinear feature combinations as the number of hidden nodes and hidden layers is increased.

Taken together, these facts, (a) that there are few features to begin with and (b) that those features have high linear correlations with the class, point to a less complex, near-linear function as the appropriate predictive model. By using a single hidden node, you are very nearly using a linear model.
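
To illustrate why a single hidden node is so close to a linear model, here is a minimal Python sketch. The weights are hypothetical and hand-picked (not trained, and not taken from the asker's network), and the three sample rows are the first setosa, versicolor and virginica rows of Iris. The point is that the predicted class depends on the input only through the one scalar w·x + b, so the class boundaries are parallel hyperplanes.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b, v, c):
    """Class index from a 4-input, 1-hidden-node, 3-output network."""
    # The whole input is squeezed into one number: the hidden activation.
    h = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    # Softmax is monotonic, so the argmax of the raw scores is enough.
    scores = [vk * h + ck for vk, ck in zip(v, c)]
    return scores.index(max(scores))

# Hypothetical weights chosen by hand for illustration only.
w = [0.0, 0.0, 1.0, 1.0]   # hidden node sums petal length and petal width
b = -5.5
v = [-8.0, 0.0, 8.0]       # hidden-to-output weights
c = [4.0, 2.0, -4.0]       # output biases

samples = {
    "setosa":     [5.1, 3.5, 1.4, 0.2],
    "versicolor": [7.0, 3.2, 4.7, 1.4],
    "virginica":  [6.3, 3.3, 6.0, 2.5],
}
predictions = {name: predict(x, w, b, v, c) for name, x in samples.items()}
print(predictions)
```

With these particular weights the three rows land in three different classes, even though the network only ever looks at one linear combination of the features.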

It can also be noted that, in the absence of any hidden layer (i.e., just input and output nodes), and when the logistic transfer function is used, this is equivalent to logistic regression.
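
As a rough sketch of that equivalence: a network with no hidden layer and one logistic output computes sigmoid(w·x + b), and backpropagating the cross-entropy loss gives the update (p - y)·x, which is the usual logistic regression gradient step. The data below is a hypothetical toy set with one centered feature, and the learning rate and epoch count are arbitrary choices.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w, b):
    """Input nodes wired straight to one logistic output node."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train(samples, labels, lr=0.1, epochs=200):
    """Backpropagation with cross-entropy loss; with no hidden layer
    this is stochastic gradient descent for logistic regression."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = forward(x, w, b) - y   # dLoss/dz for sigmoid + cross-entropy
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

# Hypothetical toy data: one centered feature, two classes.
X = [[-1.0], [-1.2], [1.0], [1.3]]
y = [0, 0, 1, 1]
w, b = train(X, y)
probs = [forward(x, w, b) for x in X]
print([round(p, 3) for p in probs])
```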

Question:

I'm working with data scientists who would like to gain insight and understanding of the neural network models that they train using the visual interfaces in Azure Machine Learning Studio/Service. Is it possible to dump out and inspect the internal representation of a neural network model? Is there a way that I could write code that accesses the nodes and weights of a trained neural network in order to visualize the network as a graph structure? Or if Azure Machine Learning Studio/Service doesn't support this I'd appreciate advice on a different machine learning framework that might be more appropriate for this kind of analysis.

Things I have tried:

  • Train Model outputs an ILearnerDotNet (AML Studio) or Model (AML Service). I looked for items to drag into the workspace where I could write custom code such as Execute Python Script. They seem to accept datasets, but not ILearnerDotNet/Model as input.
  • I wasn't able to locate documentation about the ILearnerDotNet/Model interfaces.
  • Selecting the Train Model output offers the option to Save as Trained Model. This creates a trained model object and that would help me reference the trained model in other places, but I didn't find a way to use this to get at its internals.

I'm new to the Azure Machine Learning landscape, and could use some help with how to get started on how to access this data.


Answer:

Quote from Azure ML Exam reference:

By default, the architecture of neural networks is limited to a single hidden layer with sigmoid as the activation function and softmax in the last layer. You can change this in the properties of the model, opening the Hidden layer specification dropdown list, and selecting a Custom definition script. A text box will appear in which you will be able to insert a Net# script. This script language allows you to define neural networks architectures.

For instance, if you want to create a network with two hidden layers, you could use the following script:

input Picture [28, 28];
hidden H1 [200] from Picture all;
hidden H2 [200] from H1 all;
output Result [10] softmax from H2 all;
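
To get a feel for the size of this network, the following Python sketch counts its trainable parameters. It assumes that every `all` bundle is a plain fully connected layer and that each hidden and output node carries one bias weight; the helper name `dense_params` is made up for this illustration.

```python
def dense_params(n_in, n_out, bias=True):
    """Weights in a fully connected layer, plus one bias per output node."""
    return n_in * n_out + (n_out if bias else 0)

# Layer sizes read off the Net# script: Picture [28, 28], H1, H2, Result.
layer_sizes = [28 * 28, 200, 200, 10]

counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(counts)

print(counts)  # [157000, 40200, 2010]
print(total)   # 199210
```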

Nevertheless, with Net# you will face certain limitations: it does not accept regularization (neither L2 nor dropout). Also, there is no ReLU activation, which is commonly used in deep learning because of its benefits in backpropagation. You cannot modify the batch size of Stochastic Gradient Descent (SGD), and you cannot use other optimization algorithms: you can use SGD with momentum, but not alternatives such as Adam or RMSprop. You also cannot define recurrent or recursive neural networks.

Another great tool is CNTK (Cognitive Toolkit), which lets you define your own computational graph and create a fully customizable model. Quote from the documentation:

It is a Microsoft open source deep learning toolkit. Like other deep learning tools, CNTK is based on the construction of computational graphs and their optimization using automatic differentiation. The toolkit is highly optimized and scales efficiently (from CPU, to GPU, to multiple machines). CNTK is also very portable and flexible; you can use it with programming languages like Python, C#, or C++, but you can also use a model description language called BrainScript.
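
Whichever framework you end up using, once the weights are available as plain arrays, turning them into a graph does not depend on the framework. The Python sketch below assumes the weights have already been exported as nested lists (how to get them out of a given tool is not covered here) and writes them as Graphviz DOT text. The function name `network_to_dot` and the weight values are hypothetical.

```python
def network_to_dot(layer_weights, layer_names):
    """Build Graphviz DOT text for a layered network.

    layer_weights[k][i][j] is the weight from node i of layer k
    to node j of layer k + 1.
    """
    lines = ["digraph network {", "  rankdir=LR;"]
    for k, weights in enumerate(layer_weights):
        src, dst = layer_names[k], layer_names[k + 1]
        for i, row in enumerate(weights):
            for j, weight in enumerate(row):
                # One labelled edge per connection.
                lines.append(f'  "{src}{i}" -> "{dst}{j}" [label="{weight:.2f}"];')
    lines.append("}")
    return "\n".join(lines)

# Made-up weights for a tiny 2-2-1 network.
weights = [
    [[0.5, -1.2], [0.3, 0.8]],   # input (2 nodes) -> hidden (2 nodes)
    [[1.1], [-0.7]],             # hidden (2 nodes) -> output (1 node)
]
dot = network_to_dot(weights, ["in", "h", "out"])
print(dot)
```

The resulting text can be saved to a file and rendered with any tool that reads DOT.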