Here is a code fragment that does exactly what you need: https://github.com/DISTORTEC/distortos/blob/master/source/architecture/ARM/ARMv7-M/ARMv7-M-PendSV_Handler.cpp. It works the way you described:
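- exception entry automatically stacks some registers (r0-r3, r12, lr, pc and xPSR),
- you manually stack the remaining registers (r4-r11),
- you switch SP to the stack of the next thread,
- you unstack the "remaining" registers (r4-r11) from the new stack,
- exception return unstacks the rest of the registers.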
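
The five steps above can be tried out on a desktop machine with a small host-side model. This is only a sketch to illustrate the order of operations. It is not the distortos code, and `Cpu`, `Thread`, `initThread`, `exceptionEntry`, `pendSvBody`, `exceptionReturn` and `contextSwitch` are hypothetical names made up for this example. It assumes a full-descending stack and no FPU state, and it ignores the fact that real hardware puts an EXC_RETURN value in lr while the handler runs.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy model of the core registers (hypothetical, for illustration only).
struct Cpu {
    std::array<std::uint32_t, 13> r{};   // r0-r12
    std::uint32_t lr = 0, pc = 0, xpsr = 0;
    std::uint32_t* sp = nullptr;         // full-descending stack pointer
};

// Each thread owns a stack and remembers where its SP was when it was switched out.
struct Thread {
    std::array<std::uint32_t, 64> stack{};
    std::uint32_t* savedSp = nullptr;
};

inline void push(Cpu& cpu, std::uint32_t value) { *--cpu.sp = value; }
inline std::uint32_t pop(Cpu& cpu) { return *cpu.sp++; }

// Build the frame a never-run thread needs so that it can be "returned" into.
void initThread(Thread& t, std::uint32_t entry) {
    std::uint32_t* sp = t.stack.data() + t.stack.size();
    *--sp = 0x01000000u;                     // xPSR with the Thumb bit set
    *--sp = entry;                           // pc
    for (int i = 0; i < 14; ++i) *--sp = 0;  // lr, r12, r3-r0, r11-r4
    t.savedSp = sp;
}

// Step 1: done by hardware on exception entry.
void exceptionEntry(Cpu& cpu) {
    push(cpu, cpu.xpsr);
    push(cpu, cpu.pc);
    push(cpu, cpu.lr);
    push(cpu, cpu.r[12]);
    for (int i = 3; i >= 0; --i) push(cpu, cpu.r[i]);
}

// Steps 2-4: done by the handler in software.
void pendSvBody(Cpu& cpu, Thread& current, Thread& next) {
    assert(next.savedSp != nullptr);                    // next thread must have a frame
    for (int i = 11; i >= 4; --i) push(cpu, cpu.r[i]);  // step 2: stack r4-r11
    current.savedSp = cpu.sp;                           // step 3: switch SP
    cpu.sp = next.savedSp;
    for (int i = 4; i <= 11; ++i) cpu.r[i] = pop(cpu);  // step 4: unstack r4-r11
}

// Step 5: done by hardware on exception return.
void exceptionReturn(Cpu& cpu) {
    for (int i = 0; i <= 3; ++i) cpu.r[i] = pop(cpu);
    cpu.r[12] = pop(cpu);
    cpu.lr = pop(cpu);
    cpu.pc = pop(cpu);
    cpu.xpsr = pop(cpu);
}

// One complete switch from `current` to `next`.
void contextSwitch(Cpu& cpu, Thread& current, Thread& next) {
    exceptionEntry(cpu);
    pendSvBody(cpu, current, next);
    exceptionReturn(cpu);
}
```

In a real handler, steps 2 and 4 are typically a single store-multiple and load-multiple instruction on the process stack pointer, and steps 1 and 5 cost no code at all because the hardware performs them.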