This is similar to the problem of finding a global minimum, where it’s easy to get stuck in a local minimum. Consider trying to find the global minimum of the profile below: you place a ball at different points and follow it as it rolls downhill, but depending on where you place it, it may get stuck in a local dip instead of reaching the lowest point.
That is, in complicated situations, you can’t always reach the best solution from every starting point using small optimizing increments. There are two general responses. One is to fluctuate the parameters (i.e., the weights, in this case) more vigorously, usually reducing the size of the fluctuations as the simulation progresses, as in simulated annealing. The other is simply to accept that many of the starting points aren’t going to go anywhere interesting.
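To make the ball-and-hill picture concrete, here is a minimal sketch in Python. Everything in it is an assumption chosen for illustration and not taken from the text: the one-dimensional profile `bumpy` (a shallow dip near x ≈ 1.13 and a deeper one near x ≈ -1.30), the helper names `roll_downhill` and `anneal`, and the geometric cooling schedule. The first function takes only small improving steps, like the rolling ball. The second adds random fluctuations that shrink over time, in the spirit of simulated annealing.

```python
import math
import random


def bumpy(x):
    """Illustrative profile: local dip near x = 1.13, global minimum near x = -1.30."""
    return x**4 - 3 * x**2 + x


def roll_downhill(f, x, step=0.01, max_steps=10_000):
    """Take small steps, moving only while a neighbouring point is strictly lower."""
    for _ in range(max_steps):
        nxt = min(x - step, x + step, key=f)
        if f(nxt) >= f(x):
            break  # neither neighbour is lower: we are sitting in a dip
        x = nxt
    return x


def anneal(f, x0, steps=5000, t_start=2.0, t_end=0.01, seed=0):
    """Random fluctuations whose size shrinks as the 'temperature' t cools."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    for k in range(steps):
        # Geometric cooling from t_start down towards t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (k / steps)
        cand = x + rng.gauss(0.0, math.sqrt(t))  # big jumps early, small ones late
        fc = f(cand)
        # Always accept improvements; sometimes accept uphill moves while t is large.
        if fc <= fx or rng.random() < math.exp((fx - fc) / t):
            x, fx = cand, fc
            if fx < best_f:
                best_x, best_f = x, fx
    return best_x, best_f


if __name__ == "__main__":
    print("small steps from x = 2.0 stop at", roll_downhill(bumpy, 2.0))
    print("small steps from x = -2.0 stop at", roll_downhill(bumpy, -2.0))
    print("annealing from x = 2.0 finds", anneal(bumpy, 2.0))
```

Started at x = 2.0, the small-step search settles in the shallow dip on the right, while the same search started at x = -2.0 reaches the deeper one. The annealing run can cross the hump between the two because it occasionally accepts uphill moves while the fluctuations are still large. Running `roll_downhill` from several starting points and keeping the lowest result is the simplest form of the second approach: some starts just end up in the wrong dip.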