desc.optimize.sgd
- class desc.optimize.sgd(fun, x0, grad, args=(), method='sgd', ftol=1e-06, xtol=1e-06, gtol=1e-06, verbose=1, maxiter=None, callback=None, options=None)
Minimize a scalar function using stochastic gradient descent with momentum.
- Update rule is
x_{k+1} = x_{k} - alpha*v_{k}
v_{k} = beta*v_{k-1} + (1-beta)*grad(x_{k})
where alpha is the step size and beta is the momentum parameter.
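This update rule can be sketched in a few lines of standalone Python (an illustrative example, not the DESC source), minimizing the simple function f(x) = x**2 whose gradient is 2*x:

```python
def sgd_step(x, v, g, alpha, beta):
    """One momentum step: v_k = beta*v_{k-1} + (1-beta)*g, then x_{k+1} = x_k - alpha*v_k."""
    v = beta * v + (1 - beta) * g
    return x - alpha * v, v

# minimize f(x) = x**2 (gradient is 2*x) starting from x = 5
x, v = 5.0, 0.0
for _ in range(500):
    x, v = sgd_step(x, v, 2 * x, alpha=0.1, beta=0.9)
# x is now close to the minimizer at 0
```

With beta = 0, this reduces to plain gradient descent; larger beta averages gradients over more past iterations.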
- Parameters:
fun (callable) – Objective to be minimized. Should have a signature like fun(x, *args) -> float.
x0 (array-like) – Initial guess.
grad (callable) – Function to compute the gradient, df/dx. Should take the same arguments as fun.
args (tuple) – Additional arguments passed to fun and grad.
method (str) – Step size update rule. Currently only the default "sgd" is available. Future updates may include RMSProp, Adam, etc.
ftol (float or None, optional) – Tolerance for termination by the change of the cost function. The optimization process is stopped when dF < ftol * F. If None, termination by this condition is disabled.
xtol (float or None, optional) – Tolerance for termination by the change of the independent variables. Optimization is stopped when norm(dx) < xtol * (xtol + norm(x)). If None, termination by this condition is disabled.
gtol (float or None, optional) – Absolute tolerance for termination by the norm of the gradient. The optimizer terminates when max(abs(g)) < gtol. If None, termination by this condition is disabled.
verbose ({0, 1, 2}, optional) – Level of output:
0 : work silently.
1 (default) : display a termination report.
2 : display progress during iterations.
maxiter (int, optional) – Maximum number of iterations. Defaults to size(x)*100.
callback (callable, optional) – Called after each iteration. Should be a callable with the signature
callback(xk, *args) -> bool
where xk is the current parameter vector and args are the same arguments passed to fun and grad. If callback returns True, the algorithm execution is terminated.
options (dict, optional) – Dictionary of optional keyword arguments to override default solver settings:
"alpha" : (float > 0) Step size parameter. Default 1e-2 * norm(x)/norm(grad(x)).
"beta" : (float > 0) Momentum parameter. Default 0.9.
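To illustrate the callback contract described above, here is a hypothetical callback (the name and threshold are my own, not part of the library) that requests early termination once the parameter vector becomes small:

```python
import numpy as np

def stop_when_small(xk, *args):
    """Hypothetical callback: returning True asks the optimizer to stop early."""
    return bool(np.linalg.norm(xk) < 1e-3)
```

It would be supplied as callback=stop_when_small; any extra args are the same ones passed to fun and grad.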
- Returns:
res (OptimizeResult) – The optimization result represented as an OptimizeResult object. Important attributes are: x, the solution array, and success, a Boolean flag indicating whether the optimizer exited successfully.
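Putting the pieces together, the documented behavior can be sketched end to end in NumPy. This is an illustrative re-implementation under the stated defaults, not DESC's actual code; sgd_sketch and its internals are assumed names for this example:

```python
import numpy as np

def sgd_sketch(fun, x0, grad, maxiter=2000, ftol=1e-6, xtol=1e-6, gtol=1e-6,
               beta=0.9):
    """Sketch of SGD with momentum and the documented termination criteria."""
    x = np.asarray(x0, dtype=float)
    g = np.asarray(grad(x), dtype=float)
    # default step size from the options table: 1e-2 * norm(x)/norm(grad(x))
    alpha = 1e-2 * np.linalg.norm(x) / np.linalg.norm(g)
    v = np.zeros_like(x)
    f = fun(x)
    for _ in range(maxiter):
        v = beta * v + (1 - beta) * g  # v_k = beta*v_{k-1} + (1-beta)*grad(x_k)
        dx = alpha * v
        x = x - dx                     # x_{k+1} = x_k - alpha*v_k
        f_new, g = fun(x), np.asarray(grad(x), dtype=float)
        # termination criteria from the Parameters section; None disables a check
        if gtol is not None and np.max(np.abs(g)) < gtol:
            break
        if xtol is not None and np.linalg.norm(dx) < xtol * (xtol + np.linalg.norm(x)):
            break
        if ftol is not None and abs(f - f_new) < ftol * abs(f_new):
            break
        f = f_new
    return x

# minimize f(x) = sum(x**2) from x0 = [1, -2]
xopt = sgd_sketch(lambda x: float(np.sum(x**2)), [1.0, -2.0], lambda x: 2 * x)
```

Note how the three tolerances act independently: whichever condition is met first ends the loop, and passing None for a tolerance disables that check.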