
Questions wrt. gradient computation #2

Description

@matthiasreisser

I'm not 100% clear on the PyTorch syntax. Should the following two ways of computing the gradients df/dtheta be equivalent? Why are they not? :) I'm also not entirely sure what loss.backward(backward_ones) does. Is this df/d1?

# Reduce to a scalar, then backprop; keep the graph for a second pass
# (retain_variables was later renamed retain_graph).
loss.mean().backward(retain_graph=True)
print(reg_funcs.params.grad.data)
# Zero the accumulated gradient (not the parameter values), otherwise
# the second backward adds on top of the first.
reg_funcs.params.grad.data.zero_()
loss.backward(backward_ones)  # vector-Jacobian product with a ones tensor
print(reg_funcs.params.grad.data)
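
For context: loss.backward(v) does not compute df/d1; it computes the vector-Jacobian product v^T * (d loss / d theta). Passing a tensor of ones therefore yields the gradient of the sum of the loss elements, while loss.mean().backward() yields that same gradient divided by the number of elements. Below is a minimal self-contained sketch of the difference (the name theta is hypothetical, not from the snippet above):

import torch

theta = torch.randn(3, requires_grad=True)
loss = theta ** 2                       # one loss value per parameter

loss.mean().backward(retain_graph=True)
print(theta.grad)                       # 2 * theta / 3

theta.grad.zero_()                      # reset the accumulated gradient

loss.backward(torch.ones_like(loss))    # v^T J with v = ones => gradient of the sum
print(theta.grad)                       # 2 * theta, i.e. the mean version times N

With the gradient (rather than the parameters) zeroed between the two calls, the two printed gradients should differ by exactly a factor of N (here 3).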
