desc.optimize.sgd
- class desc.optimize.sgd(fun, x0, grad, args=(), method='sgd', ftol=1e-06, xtol=1e-06, gtol=1e-06, verbose=1, maxiter=None, callback=None, options=None)
Minimize a scalar function using stochastic gradient descent with momentum.
- Update rule is
x_{k+1} = x_{k} - alpha*v_{k}
v_{k} = beta*v_{k-1} + (1-beta)*grad(x_{k})
where alpha is the step size and beta is the momentum parameter.
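This update rule can be sketched in a few lines of standalone Python (an illustrative example, not the DESC source), minimizing the simple function f(x) = x**2 whose gradient is 2*x:

```python
def sgd_step(x, v, g, alpha, beta):
    """One momentum step: v_k = beta*v_{k-1} + (1-beta)*g, then x_{k+1} = x_k - alpha*v_k."""
    v = beta * v + (1 - beta) * g
    return x - alpha * v, v

# minimize f(x) = x**2 (gradient is 2*x) starting from x = 5
x, v = 5.0, 0.0
for _ in range(500):
    x, v = sgd_step(x, v, 2 * x, alpha=0.1, beta=0.9)
# x is now close to the minimizer at 0
```

With beta = 0, this reduces to plain gradient descent; larger beta averages gradients over more past iterations.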
- Parameters:
fun (callable) – Objective to be minimized. Should have a signature like fun(x, *args) -> float.
x0 (array-like) – Initial guess.
grad (callable) – Function to compute the gradient, df/dx. Should take the same arguments as fun.
args (tuple) – Additional arguments passed to fun and grad.
method (str) – Step size update rule. Currently only the default "sgd" is available. Future updates may include RMSProp, Adam, etc.
ftol (float or None, optional) – Tolerance for termination by the change of the cost function. The optimization process is stopped when dF < ftol * F. If None, termination by this condition is disabled.
xtol (float or None, optional) – Tolerance for termination by the change of the independent variables. Optimization is stopped when norm(dx) < xtol * (xtol + norm(x)). If None, termination by this condition is disabled.
gtol (float or None, optional) – Absolute tolerance for termination by the norm of the gradient. The optimizer terminates when max(abs(g)) < gtol. If None, termination by this condition is disabled.
verbose ({0, 1, 2}, optional) – Level of output:
0 : work silently.
1 (default) : display a termination report.
2 : display progress during iterations.
maxiter (int, optional) – Maximum number of iterations. Defaults to size(x)*100.
callback (callable, optional) – Called after each iteration. Should be a callable with the signature
callback(xk, *args) -> bool
where xk is the current parameter vector and args are the same arguments passed to fun and grad. If callback returns True, the algorithm execution is terminated.
options (dict, optional) – Dictionary of optional keyword arguments to override default solver settings:
"alpha" : (float > 0) Step size parameter. Default 1e-2 * norm(x)/norm(grad(x)).
"beta" : (float > 0) Momentum parameter. Default 0.9.
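To illustrate the callback contract described above, here is a hypothetical callback (the name and threshold are my own, not part of the library) that requests early termination once the parameter vector becomes small:

```python
import numpy as np

def stop_when_small(xk, *args):
    """Hypothetical callback: returning True asks the optimizer to stop early."""
    return bool(np.linalg.norm(xk) < 1e-3)
```

It would be supplied as callback=stop_when_small; any extra args are the same ones passed to fun and grad.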
- Returns:
res (OptimizeResult) – The optimization result represented as an OptimizeResult object. Important attributes are: x, the solution array, and success, a Boolean flag indicating whether the optimizer exited successfully.
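Putting the pieces together, the documented behavior can be sketched end to end in NumPy. This is an illustrative re-implementation under the stated defaults, not DESC's actual code; sgd_sketch and its internals are assumed names for this example:

```python
import numpy as np

def sgd_sketch(fun, x0, grad, maxiter=2000, ftol=1e-6, xtol=1e-6, gtol=1e-6,
               beta=0.9):
    """Sketch of SGD with momentum and the documented termination criteria."""
    x = np.asarray(x0, dtype=float)
    g = np.asarray(grad(x), dtype=float)
    # default step size from the options table: 1e-2 * norm(x)/norm(grad(x))
    alpha = 1e-2 * np.linalg.norm(x) / np.linalg.norm(g)
    v = np.zeros_like(x)
    f = fun(x)
    for _ in range(maxiter):
        v = beta * v + (1 - beta) * g  # v_k = beta*v_{k-1} + (1-beta)*grad(x_k)
        dx = alpha * v
        x = x - dx                     # x_{k+1} = x_k - alpha*v_k
        f_new, g = fun(x), np.asarray(grad(x), dtype=float)
        # termination criteria from the Parameters section; None disables a check
        if gtol is not None and np.max(np.abs(g)) < gtol:
            break
        if xtol is not None and np.linalg.norm(dx) < xtol * (xtol + np.linalg.norm(x)):
            break
        if ftol is not None and abs(f - f_new) < ftol * abs(f_new):
            break
        f = f_new
    return x

# minimize f(x) = sum(x**2) from x0 = [1, -2]
xopt = sgd_sketch(lambda x: float(np.sum(x**2)), [1.0, -2.0], lambda x: 2 * x)
```

Note how the three tolerances act independently: whichever condition is met first ends the loop, and passing None for a tolerance disables that check.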