Also move #includes into canonical order where appropriate.
It has no effect, since the code is supposed to operate the same way for any bit depth.