To draw silly faces. To build a navigation app. To prototype games based on recognizing objects. To draw information overlays positioned on the real world. I don't know.
What you describe as "grabbing the camera to an image and using the rotation vector commands to work out where the camera is looking to draw the virtual objects" already sounds a bit complex to me. I like simple things. The AppGameKit way would be to have commands like GetCamera(), SetCameraOverLay(), GiveGroundPlane(), PositionOverlayYAtGroundPlane(), PositionSpriteOnWall(), PositionSpriteOnGroundOffset(), FindImageFeatures(), RecognizeObjects(), IsObjectPresent(), PositionSpriteOnFeature(), ...
All of that incorporating the latest and greatest in ML algorithms, CNNs, and whatnot, since that is the hot thing in computer vision these days, of course.
No idea really, just throwing out ideas for people to comment on.