Hot questions for using neural networks in skflow
I recently switched from TensorFlow to skflow. In TensorFlow we would add our `lambda * tf.nn.l2_loss(weights)` term to our loss. Now I have the following code in skflow:
```python
def deep_psi(X, y):
    layers = skflow.ops.dnn(X, [5, 10, 20, 10, 5], keep_prob=0.5)
    preds, loss = skflow.models.logistic_regression(layers, y)
    return preds, loss

def exp_decay(global_step):
    return tf.train.exponential_decay(learning_rate=0.01,
                                      global_step=global_step,
                                      decay_steps=1000,
                                      decay_rate=0.005)

deep_cd = skflow.TensorFlowEstimator(model_fn=deep_psi,
                                     n_classes=2,
                                     steps=10000,
                                     batch_size=10,
                                     learning_rate=exp_decay,
                                     verbose=True)
```
How and where do I add a regularizer here? Illia hints at something here, but I couldn't figure it out.
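As an aside, with `staircase` left at its default, the `exp_decay` schedule above follows `learning_rate * decay_rate ** (global_step / decay_steps)`, the formula documented for `tf.train.exponential_decay`. A minimal sketch of that formula in plain Python (the function name here is illustrative):

```python
def exp_decay_value(learning_rate, global_step, decay_steps, decay_rate):
    # continuous (non-staircase) exponential decay, mirroring the formula
    # documented for tf.train.exponential_decay
    return learning_rate * decay_rate ** (global_step / decay_steps)

print(exp_decay_value(0.01, 0, 1000, 0.005))     # 0.01 at step 0
print(exp_decay_value(0.01, 1000, 1000, 0.005))  # 0.01 * 0.005 after 1000 steps
```

With `decay_rate=0.005` the learning rate drops by a factor of 200 every 1000 steps, which is a very aggressive decay.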
You can still add additional components to the loss; you just need to retrieve the weights from `dnn` / `logistic_regression` and add their penalties to the loss:
```python
def regularize_loss(loss, weights, lambda_):
    # lambda is a reserved word in Python, so use lambda_ for the strength
    for weight in weights:
        loss = loss + lambda_ * tf.nn.l2_loss(weight)
    return loss

def deep_psi(X, y):
    layers = skflow.ops.dnn(X, [5, 10, 20, 10, 5], keep_prob=0.5)
    preds, loss = skflow.models.logistic_regression(layers, y)
    weights = []
    for layer in range(5):  # number of layers you passed to dnn
        weights.append(tf.get_variable("dnn/layer%d/linear/Matrix" % layer))
        # biases are also available at dnn/layer%d/linear/Bias
    weights.append(tf.get_variable('logistic_regression/weights'))
    # lambda_ is your regularization strength, defined in the enclosing scope
    return preds, regularize_loss(loss, weights, lambda_)
```
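For intuition, `tf.nn.l2_loss(w)` computes half the sum of squares, `sum(w ** 2) / 2`, so each term added above is `lambda_ * sum(w ** 2) / 2`. A small numpy sketch of the same arithmetic (function names here are illustrative, not part of skflow):

```python
import numpy as np

def l2_loss(w):
    # same quantity tf.nn.l2_loss computes: half the sum of squares
    return 0.5 * float(np.sum(w ** 2))

def regularize_loss(loss, weights, lambda_):
    # add lambda_ * l2_loss(w) for every weight matrix
    return loss + sum(lambda_ * l2_loss(w) for w in weights)

w = np.array([3.0, 4.0])
print(l2_loss(w))                      # 0.5 * (9 + 16) = 12.5
print(regularize_loss(1.0, [w], 0.1))  # 1.0 + 0.1 * 12.5 = 2.25
```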
Note that the paths to the variables can be found here.
Also, we want to add regularizer support to all layers with variables (like fully_connected), so maybe next week's nightly build of TensorFlow will have something like `dnn(.., regularize=tf.contrib.layers.l2_regularizer(lambda))`. I'll update this answer when this happens.
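The `regularize=` argument hinted at above is presumably a function mapping weights to a penalty term, i.e. a closure over the strength. The pattern can be sketched in plain Python (this is a sketch of the idea only; `dnn`'s eventual signature may differ):

```python
def l2_regularizer(scale):
    # returns a function from weights to a penalty, closing over the
    # strength -- the shape of argument a regularize= parameter would take
    def penalty(weights):
        return scale * 0.5 * sum(w * w for w in weights)
    return penalty

reg = l2_regularizer(0.01)
print(reg([3.0, 4.0]))  # 0.01 * 0.5 * 25 = 0.125
```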
My colleagues and this question on Cross Validated say you should transform data to zero mean and unit variance for neural networks. However, my performance was slightly worse with scaling than without.
I tried using:
```python
scaler = preprocessing.StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

steps = 5000

def exp_decay(global_step):
    return tf.train.exponential_decay(learning_rate=0.1,
                                      global_step=global_step,
                                      decay_steps=steps,
                                      decay_rate=0.01)

random.seed(42)  # to sample data the same way

classifier = skflow.TensorFlowDNNClassifier(hidden_units=[150, 150, 150],
                                            n_classes=2,
                                            batch_size=128,
                                            steps=steps,
                                            learning_rate=exp_decay)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
```
Did I do something wrong or is scaling not necessary?
Scaling usually benefits linear models and models without regularization the most. For example, a plain mean squared error loss (as in TensorFlowLinearRegressor) without regularization won't work very well on unscaled data.
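For reference, `StandardScaler`'s transform is just per-column centering and scaling; a minimal numpy equivalent on toy data:

```python
import numpy as np

X_train = np.array([[1.0, 10.0],
                    [3.0, 30.0],
                    [5.0, 50.0]])

# fit: per-feature mean and standard deviation, as StandardScaler does
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)

# transform: each column becomes zero mean, unit variance
X_scaled = (X_train - mean) / std
print(X_scaled.mean(axis=0))  # ~[0. 0.]
print(X_scaled.std(axis=0))   # [1. 1.]
```

The important detail in the question's snippet is that the scaler is fit on the training set only and then applied to both train and test, which avoids leaking test-set statistics.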
In your case you are using a classifier with a softmax output and you are using a DNN, so scaling is not needed. DNNs themselves can model rescaling (via the bias and the weights on each feature in the first layer) if that's a useful thing to do.
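The claim that a DNN can absorb rescaling into its first layer can be checked directly: if the first layer computes `W·x' + b` on standardized inputs `x' = (x - mu) / sigma`, an equivalent layer on the raw inputs uses `W / sigma` and a shifted bias. A numpy sketch (all names and data here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(5.0, 2.0, size=(4, 3))   # raw, unscaled features
mu, sigma = x.mean(axis=0), x.std(axis=0)

W = rng.normal(size=(3, 2))             # first-layer weights
b = rng.normal(size=2)                  # first-layer bias

# layer applied to standardized inputs
out_scaled = ((x - mu) / sigma) @ W + b

# equivalent layer on raw inputs: rescaling folded into weights and bias
W_raw = W / sigma[:, None]
b_raw = b - (mu / sigma) @ W
out_raw = x @ W_raw + b_raw

print(np.allclose(out_scaled, out_raw))  # True
```

So in principle the network can learn the rescaling itself, even if in practice scaling can still affect optimization speed.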