Hot questions for using neural networks in DyNet

Question:

Is there a way to update only a subset of the parameters in DyNet? For instance, in the following toy example, first update h1, then h2:

 model = ParameterCollection()
 h1 = model.add_parameters((hidden_units, dims))
 h2 = model.add_parameters((hidden_units, dims))
 ...
 for x in trainset:
    ...
    loss.scalar_value()
    loss.backward()
    trainer.update(h1)
    renew_cg()

 for x in trainset:
    ...
    loss.scalar_value()
    loss.backward()
    trainer.update(h2)
    renew_cg()

I know that the update_subset interface exists for this and works on the given parameter indexes. However, it is not documented anywhere how to obtain those parameter indexes from the DyNet Python API.


Answer:

A solution is to pass the flag update=False when creating expressions for parameters (this also works for lookup parameters):

import dynet as dy
import numpy as np

model = dy.ParameterCollection()
pW = model.add_parameters((2, 4))
pb = model.add_parameters(2)
trainer = dy.SimpleSGDTrainer(model)

def step(update_b):
    dy.renew_cg()
    x = dy.inputTensor(np.ones(4))
    W = pW.expr()
    # create the expression for b, marking it as updatable only if requested
    b = pb.expr(update=update_b)

    loss = dy.pickneglogsoftmax(W * x + b, 0)
    loss.value()  # run the forward pass before backward
    loss.backward()
    trainer.update()

print(pb.as_array())
print(pW.as_array())
step(True)
print(pb.as_array())  # b updated
print(pW.as_array())
step(False)
print(pb.as_array())  # b not updated
print(pW.as_array())
  • For update_subset, I would guess that the indices are the integers suffixed to the parameter names (.name()). According to the doc, we are supposed to use a get_index function to retrieve them.
  • Another option is dy.nobackprop(), which prevents the gradient from propagating beyond a given node in the graph.
  • Yet another option is to zero the gradient of the parameters that should not be updated (scale_gradient(0)).

These methods are equivalent to zeroing the gradient before the update. So the parameter may still be updated if the optimizer carries momentum from previous training steps (MomentumSGDTrainer, AdamTrainer, ...). A minimal sketch of these options follows.
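To make the options above concrete, here is a minimal sketch (dy.nobackprop and dy.scale_gradient are assumed to be the free-function forms of the operations named in the bullets; the commented update_subset call and get_index() follow the guesses in the first bullet, not a confirmed signature):

import dynet as dy
import numpy as np

model = dy.ParameterCollection()
pW = model.add_parameters((2, 4))
pb = model.add_parameters(2)
trainer = dy.SimpleSGDTrainer(model)

dy.renew_cg()
x = dy.inputTensor(np.ones(4))
W = pW.expr()
b = pb.expr()

# Block gradient flow through b, so only W receives a gradient:
h = W * x + dy.nobackprop(b)
# Equivalent here: scale b's gradient by zero instead.
# h = W * x + dy.scale_gradient(b, 0.0)

loss = dy.pickneglogsoftmax(h, 0)
loss.value()      # forward pass
loss.backward()
trainer.update()  # pb received no gradient, so only pW moves

# Index-based alternative (assumed interface, see the caveat above):
# trainer.update_subset([pW.get_index()], [])  # second list: lookup parameters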

Question:

Is there a way to retrieve a parameter by its name in DyNet:

def dosomething(model):
    temp = model.get_parameter("temp") #something like this?
    ...

def create():
    Model = ParameterCollection()
    temp = Model.add_parameters((2,2))
    ...
    dosomething(Model) 

Answer:

You cannot do so directly using the DyNet API.

Each parameter has a name, which you can specify with the keyword argument name. Example:

pW = model.add_parameters((12, 12), name="W")

However (source):

The names are used for identifying the parameters and the collection hierarchy when loading from disk, and in particular when loading only a subset of the objects in a saved file.

...

One can supply an optional informative name when creating the parameter or sub-collection. The supplied names are then appended with running index to avoid name clashes.

So you cannot reliably retrieve a parameter from a ParameterCollection based on its name (well, you could, but I would not advise it; a workaround is sketched below).
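If you really want to, a fragile workaround is to scan the collection and match on .name(). A sketch only (it assumes a parameters_list() accessor on the collection and relies on DyNet's internal naming scheme, which appends running indices to avoid clashes):

import dynet as dy

def get_parameter_by_name(collection, name):
    # Fragile: depends on DyNet's internal name format (the supplied name
    # plus a possible running index), which is not a stable public interface.
    for p in collection.parameters_list():
        if name in p.name():
            return p
    return None

model = dy.ParameterCollection()
pW = model.add_parameters((12, 12), name="W")
print(get_parameter_by_name(model, "W").shape())  # (12, 12)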


So the usual practice, when you need this, is to keep your own dictionary of parameters:

import dynet as dy

def dosomething(model, params):
    pW = params["W"]
    pb = params["b"]
    #...

def create():
    model = dy.ParameterCollection()
    params = {}
    params["W"] = model.add_parameters((12, 12))
    params["b"] = model.add_parameters((12, ))

    #...
    dosomething(model, params)

create()