desc.optimize.sgd

desc.optimize.sgd(fun, x0, grad, args=(), method='sgd', ftol=1e-06, xtol=1e-06, gtol=1e-06, verbose=1, maxiter=None, callback=None, options=None)

Minimize a scalar function using stochastic gradient descent with momentum.

The update rule is

v_{k} = beta*v_{k-1} + (1-beta)*grad(x_{k})

x_{k+1} = x_{k} - alpha*v_{k}

where alpha is the step size and beta is the momentum parameter.
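As a concrete illustration, the update rule above can be sketched in plain NumPy. The function name, step size, and quadratic test objective here are illustrative choices, not DESC internals:

```python
import numpy as np

def momentum_step(x, v, grad, alpha=1e-2, beta=0.9):
    """One update: v_k = beta*v_{k-1} + (1-beta)*grad(x_k), then x_{k+1} = x_k - alpha*v_k."""
    v = beta * v + (1 - beta) * grad(x)
    return x - alpha * v, v

# Minimize f(x) = ||x||^2 / 2, whose gradient is simply x.
grad = lambda x: x
x = np.array([1.0, -2.0])
v = np.zeros_like(x)
for _ in range(2000):
    x, v = momentum_step(x, v, grad)
# x is now close to the minimizer at the origin
```

The momentum term averages past gradients, which damps oscillations compared to plain gradient descent at the same step size.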

Parameters:
  • fun (callable) – objective to be minimized. Should have a signature like fun(x, *args) -> float

  • x0 (array-like) – initial guess

  • grad (callable) – function to compute gradient, df/dx. Should take the same arguments as fun

  • args (tuple) – additional arguments passed to fun and grad

  • method (str) – Step size update rule. Currently only the default “sgd” is available. Future updates may include RMSProp, Adam, etc.

  • ftol (float or None, optional) – Tolerance for termination by the change of the cost function. The optimization process is stopped when dF < ftol * F. If None, the termination by this condition is disabled.

  • xtol (float or None, optional) – Tolerance for termination by the change of the independent variables. Optimization is stopped when norm(dx) < xtol * (xtol + norm(x)). If None, the termination by this condition is disabled.

  • gtol (float or None, optional) – Absolute tolerance for termination by the norm of the gradient. Optimizer terminates when max(abs(g)) < gtol. If None, the termination by this condition is disabled.

  • verbose ({0, 1, 2}, optional) –

    • 0 : work silently.

    • 1 (default) : display a termination report.

    • 2 : display progress during iterations

  • maxiter (int, optional) – maximum number of iterations. Defaults to size(x)*100

  • callback (callable, optional) –

    Called after each iteration. Should be a callable with the signature:

    callback(xk, *args) -> bool

    where xk is the current parameter vector and args are the same arguments passed to fun and grad. If callback returns True, the algorithm execution is terminated.

  • options (dict, optional) –

    dictionary of optional keyword arguments to override default solver settings.

    • "alpha" : (float > 0) Step size parameter. Default 1e-2 * norm(x)/norm(grad(x))

    • "beta" : (float > 0) Momentum parameter. Default 0.9

Returns:

res (OptimizeResult) – The optimization result represented as an OptimizeResult object. Important attributes are: x, the solution array, and success, a Boolean flag indicating whether the optimizer exited successfully.
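For reference, the termination rules and return object described above can be sketched as one self-contained loop. `sgd_minimize` is a hypothetical stand-in, not the real solver; the actual implementation may differ, but the default alpha and the three stopping tests follow the parameter descriptions above:

```python
import numpy as np

def sgd_minimize(fun, x0, grad, args=(), ftol=1e-6, xtol=1e-6, gtol=1e-6,
                 maxiter=None, callback=None, alpha=None, beta=0.9):
    """Minimal momentum-SGD loop mirroring the documented stopping rules (illustrative only)."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    if maxiter is None:
        maxiter = x.size * 100  # documented default
    if alpha is None:
        # documented default step size: 1e-2 * norm(x)/norm(grad(x))
        alpha = 1e-2 * np.linalg.norm(x) / np.linalg.norm(grad(x, *args))
    f = fun(x, *args)
    success, message = False, "maxiter reached"
    for _ in range(maxiter):
        g = grad(x, *args)
        if gtol is not None and np.max(np.abs(g)) < gtol:  # gradient criterion
            success, message = True, "gtol satisfied"
            break
        v = beta * v + (1 - beta) * g
        dx = -alpha * v
        x = x + dx
        f_new = fun(x, *args)
        if callback is not None and callback(x, *args):    # user-requested stop
            success, message = True, "callback requested termination"
            break
        if ftol is not None and abs(f - f_new) < ftol * abs(f_new):  # cost criterion
            success, message = True, "ftol satisfied"
            break
        if xtol is not None and np.linalg.norm(dx) < xtol * (xtol + np.linalg.norm(x)):
            success, message = True, "xtol satisfied"      # step-size criterion
            break
        f = f_new
    return {"x": x, "success": success, "message": message}

# Usage on the same quadratic test problem:
res = sgd_minimize(lambda x: 0.5 * np.sum(x**2), [1.0, -2.0], lambda x: x,
                   maxiter=5000)
# res["success"] is True and res["x"] is close to the origin
```

Note that the real function returns an OptimizeResult object whose attributes (res.x, res.success) are accessed as shown in the description above, rather than a plain dict.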