Hot questions for using neural networks in skflow


I recently switched from TensorFlow to skflow. In TensorFlow we would add lambda * tf.nn.l2_loss(weights) to the loss. Now I have the following code in skflow:

def deep_psi(X, y):
    layers = skflow.ops.dnn(X, [5, 10, 20, 10, 5], keep_prob=0.5)
    preds, loss = skflow.models.logistic_regression(layers, y)
    return preds, loss

def exp_decay(global_step):
    return tf.train.exponential_decay(learning_rate=0.01,

deep_cd = skflow.TensorFlowEstimator(model_fn=deep_psi,

How and where do I add a regularizer here? Illia hints at something here, but I couldn't figure it out.


You can still add additional components to the loss; you just need to retrieve the weights from dnn / logistic_regression and add their L2 penalties to the loss:

REG_LAMBDA = 0.01  # regularization strength (example value; tune for your data)

def regularize_loss(loss, weights, reg_lambda):
    # `lambda` is a reserved word in Python, so the parameter is named reg_lambda
    for weight in weights:
        loss = loss + reg_lambda * tf.nn.l2_loss(weight)
    return loss

def deep_psi(X, y):
    layers = skflow.ops.dnn(X, [5, 10, 20, 10, 5], keep_prob=0.5)
    preds, loss = skflow.models.logistic_regression(layers, y)

    weights = []
    for layer in range(5): # n layers you passed to dnn
        weights.append(tf.get_variable("dnn/layer%d/linear/Matrix" % layer))
        # biases are also available at dnn/layer%d/linear/Bias

    return preds, regularize_loss(loss, weights, REG_LAMBDA)


Note, the path to variables can be found here.

Also, we want to add regularizer support to all layers with variables (like dnn, conv2d or fully_connected), so maybe next week's nightly build of TensorFlow will have something like dnn(.., regularize=tf.contrib.layers.l2_regularizer(lambda)). I'll update this answer when this happens.
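To make the penalty concrete, here is a minimal framework-free sketch of what the regularized loss above computes. It relies on the documented behavior of tf.nn.l2_loss (half the sum of squared entries). The pure-Python helpers, the toy weight matrices, the base loss of 0.7 and the value reg_lambda=0.01 are illustrative assumptions, not part of skflow.

```python
def l2_loss(matrix):
    """Half the sum of squared entries, mirroring what tf.nn.l2_loss computes."""
    return sum(value * value for row in matrix for value in row) / 2.0


def regularize_loss(loss, weights, reg_lambda):
    """Add reg_lambda * l2_loss(w) to the loss for every weight matrix."""
    for weight in weights:
        loss = loss + reg_lambda * l2_loss(weight)
    return loss


# Toy values, chosen only for illustration.
weights = [
    [[1.0, -2.0], [0.5, 0.0]],  # hypothetical first-layer weights
    [[3.0], [-1.0]],            # hypothetical second-layer weights
]
base_loss = 0.7  # stands in for the loss returned by the model
total_loss = regularize_loss(base_loss, weights, reg_lambda=0.01)
print(total_loss)
```

A larger reg_lambda pulls the weights harder toward zero; with reg_lambda=0 the loss is unchanged.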


My colleagues and this question on Cross Validated say you should transform data to zero mean and unit variance for neural networks. However, my performance was slightly worse with scaling than without.

I tried using:

scaler = preprocessing.StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

steps = 5000
def exp_decay(global_step):
    return tf.train.exponential_decay(
        learning_rate=0.1, global_step=global_step,
        decay_steps=steps, decay_rate=0.01)

random.seed(42) # to sample data the same way
classifier = skflow.TensorFlowDNNClassifier(
    hidden_units=[150, 150, 150],
    learning_rate=exp_decay)
classifier.fit(X_train, y_train)

y_pred = classifier.predict(X_test)

Did I do something wrong or is scaling not necessary?


Scaling usually helps most with linear models and with models that have no regularization. For example, a simple mean squared error loss (as in TensorFlowLinearRegressor) without regularization won't work very well on unscaled data.

In your case you are using a classifier that runs softmax regularization, and you are using a DNN, so scaling is not needed. DNNs themselves can model rescaling (via the bias and the weight on each feature in the first layer) if that's a useful thing to do.
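The point that a network can represent the rescaling itself can be checked on a single linear unit. The sketch below is only an illustration under simple assumptions (one feature, made-up values, hypothetical names w, b and unit): a unit applied to standardized input gives the same output as a unit applied to the raw input with weight w / sigma and bias b - w * mu / sigma.

```python
import statistics

# Made-up raw feature values, for illustration only.
x_raw = [10.0, 12.0, 15.0, 19.0, 24.0]

mu = statistics.mean(x_raw)
sigma = statistics.pstdev(x_raw)  # population std, which StandardScaler also uses


def unit(x, w, b):
    """A single linear unit: w * x + b."""
    return w * x + b


w, b = 0.8, -0.3  # hypothetical weight and bias acting on standardized input

# Path 1: standardize first, then apply the unit.
x_scaled = [(x - mu) / sigma for x in x_raw]
on_scaled = [unit(x, w, b) for x in x_scaled]

# Path 2: fold the standardization into the unit's own parameters.
w_raw = w / sigma
b_raw = b - w * mu / sigma
on_raw = [unit(x, w_raw, b_raw) for x in x_raw]

# The two paths agree up to floating-point rounding.
max_gap = max(abs(s - r) for s, r in zip(on_scaled, on_raw))
print(max_gap)
```

This only shows that equivalent weights exist; whether the optimizer reaches them as quickly on unscaled data is a separate question.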