You install a system wide hook. (SetWindowsHookEx) With this done, you get to be loaded into every process.
Now when the hook is called, you look for a loaded d3d9.dll.
If one is loaded, you create a temporary D3D9 object, and walk the vtable to get the address of the EndScene method.
Then you can patch the EndScene call, with your own method. (Replace the first instruction in EndScene by a call to your method.
When you are done, you have to patch the call back, to call the original EndScene method. And then reinstall your patch.
This is the way FRAPS does it. (Link)
You can find a function address from the vtable of an interface.
So you can do the following (Pseudo-Code):
IDirect3DDevice9* pTempDev = ...;
const int EndSceneIndex = 26 (?);
typedef HRESULT (IDirect3DDevice9::* EndSceneFunc)( void );
BYTE* pVtable = reinterpret_cast<void*>( pTempDev );
EndSceneFunc = pVtable + sizeof(void*) * EndSceneIndex;
EndSceneFunc does now contain a pointer to the function itself. We can now either patch all call-sites or we can patch the function itself.
Beware that this all depends on the knowledge of the implementation of COM-Interfaces in Windows. But this works on all windows versions (either 32 or 64, not both at the same time).