Numerical optimization is a hard problem, not an afterthought.

“Maximize the log-likelihood” and “minimize the loss function” sound like much more straightforward operations than they are in practice.

“Minimize loss” is a hard problem unless the function has structure to exploit, such as convexity. Anyone who uses optimization-based estimators in practice needs to think carefully about how exactly the problem will be solved.

The numerical algorithm should exploit the structure of the optimization problem to get the right result. We can't assume that simply running the problem through Nelder-Mead or Newton's method will find the optimum! In all likelihood, it will not.
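To make the warning concrete, here is a minimal sketch of what can go wrong. It assumes a made-up one-dimensional, non-convex loss (a double well, `x**4 - 3*x**2 + x`), which is not taken from any real estimator. It runs a plain Newton iteration with no line search or other safeguards. The names `loss`, `grad`, `hess`, and `newton` are illustrative only.

```python
# Hypothetical double-well loss, chosen only for illustration.
def loss(x):
    return x**4 - 3 * x**2 + x


def grad(x):
    # First derivative of the loss.
    return 4 * x**3 - 6 * x + 1


def hess(x):
    # Second derivative of the loss.
    return 12 * x**2 - 6


def newton(x0, steps=50, tol=1e-12):
    """Plain Newton iteration on grad(x) = 0, with no safeguards."""
    x = x0
    for _ in range(steps):
        step = grad(x) / hess(x)
        x -= step
        if abs(step) < tol:
            break
    return x


# Same loss, same algorithm, three different starting points.
starts = [-2.0, 0.0, 2.0]
results = {x0: newton(x0) for x0 in starts}

for x0, x in results.items():
    kind = "minimum" if hess(x) > 0 else "maximum"
    print(f"start {x0:+.1f} -> x = {x:+.4f}, loss = {loss(x):+.4f}, local {kind}")
```

Under these assumptions, the three starts end at three different stationary points. Only the start at -2.0 reaches the global minimum near x = -1.30. The start at 2.0 stops in the shallower well near x = 1.13, and the start at 0.0 converges to a local maximum near x = 0.17, because Newton's method only looks for points where the gradient vanishes. A convex loss would have none of this ambiguity, which is why knowing the structure of the problem matters.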