It could work, but I'd imagine it would be highly annoying, and campers would be rewarded with an even greater advantage. As an avid FPS player (mostly online), I know how important fluid movement is, and what you're proposing makes such movement very hard to perform. I don't know what kind of FPS you aim to make, but let's say I was attacking a house and knew someone was inside. If I went to a door and opened it, I'd then have to set my movement destination slightly inside the house, so that once I get there I can kill anyone who might be watching the door, right? But in a game like Counter-Strike: Source, a move like that is a sure way to get yourself killed, assuming the opposition doesn't suck. The ideal method, for me at least (assuming I have no grenades handy), would be to pop my head around the door for as short a time as I can to see if I spot anyone. It's highly unlikely I'd get killed doing that, as the enemy can only see part of my body, and only for a very limited time. Typically I'd then pop back out straight after and kill them, because I now have the advantage: I know their location, how many there are, maybe what weapons they have, and so on. With point-and-click movement this isn't really possible. I don't have time to look, see them, then look around and click to move elsewhere. How will this be fixed?
Similarly, if I'm being overrun by enemy forces, it would be wise to retreat to a defensive position further back, where I'd have some advantage. How would this be done quickly in the game? I'd have to turn around, pick where to move, then look back and hope they didn't see me while I had my back turned. So moving backward really is an issue. In typical point-and-click games you usually have an overhead view of the player, so you can click to go one way and still see anything behind you and attack it.
Also, I'm not exactly sure what you mean by "The simple fact is that amount of constant update required in a FPS will eventually suffer from lag resulting in jerky movement", as there are many ways to split the simulation between the client and the server. For example, games in the Battlefield series, CS:S etc. all send player keystrokes. This can be bandwidth intensive, but it also allows the server to accurately simulate the client (not to be confused with a dumb terminal). Assuming 0 packet loss and a consistent latency, the server should be able to reproduce the exact path of the player. Of course this will be their position in the past, by however much latency there is to them, but you could use dead reckoning/input prediction to estimate where they most likely are right now. If the latency is too high and/or the server is CPU or network bottlenecked then it can become laggy, but having accurate movement like this doesn't cause lag on its own.
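To make the keystroke idea a bit more concrete, here is a rough sketch in Python. It is not how Battlefield or CS:S actually do it; the tick rate, movement speed, key mapping and all function names are made up for illustration. It only shows that if client and server run the same fixed-step simulation on the same inputs they end up at the same position, and that the server can then project that (slightly old) position forward by the latency.

```python
from dataclasses import dataclass

TICK = 1.0 / 60.0   # assumed fixed simulation step, in seconds
SPEED = 5.0         # assumed movement speed, in units per second

# Hypothetical key-to-direction mapping on a flat 2D plane.
KEYS = {"w": (0.0, 1.0), "s": (0.0, -1.0), "a": (-1.0, 0.0), "d": (1.0, 0.0)}


@dataclass
class State:
    x: float = 0.0
    y: float = 0.0
    vx: float = 0.0
    vy: float = 0.0


def step(state, keys):
    """Advance one tick given the keys held during that tick (e.g. "wd")."""
    dx = sum(KEYS[k][0] for k in keys)
    dy = sum(KEYS[k][1] for k in keys)
    # Diagonals are not normalised here, to keep the sketch short.
    vx, vy = dx * SPEED, dy * SPEED
    return State(state.x + vx * TICK, state.y + vy * TICK, vx, vy)


def simulate(inputs, state=None):
    """Replay a list of per-tick key strings; client and server both run this."""
    if state is None:
        state = State()
    for keys in inputs:
        state = step(state, keys)
    return state


def dead_reckon(state, latency):
    """Guess the current position from a past state, assuming constant velocity."""
    return State(state.x + state.vx * latency,
                 state.y + state.vy * latency,
                 state.vx, state.vy)


if __name__ == "__main__":
    inputs = ["w"] * 60                    # one second of holding forward
    server_view = simulate(inputs)         # same result as the client's own run
    print(server_view)
    print(dead_reckon(server_view, 0.1))   # estimate for 100 ms later
```

The guess from dead_reckon is only right while the player keeps doing what they were doing; the moment they stop or turn, the server has to correct it when the next inputs arrive.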
If bandwidth is an issue, then instead of having the server relay client keystrokes to the other clients, you could just send positions at set intervals. Each position would essentially be a waypoint, since the clients receiving it will then have to predict the current position of that player based on the latency to the server. Of course, depending on your positional packet interval, clients will potentially see some strange movement, such as players clipping through objects at corners and such. I just don't see multiple-waypoint-based movement working from an FPS perspective, though it might be nice to see.
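Here is a small Python sketch of what I mean by the strange movement. The snapshot format, the function names and the numbers are all just assumptions for the example. A player runs along a wall from (0, 0) to a corner at (4, 0) and then turns up toward (4, 4), moving at 4 units per second, while the receiving client only gets a position every so often.

```python
def lerp(a, b, t):
    """Linear blend between points a and b, with t in [0, 1]."""
    return (a[0] + (b[0] - a[0]) * t, a[1] + (b[1] - a[1]) * t)


def extrapolate(prev, last, now):
    """Predict the position at time `now` from the two newest snapshots.

    Each snapshot is a hypothetical (time, (x, y)) pair.
    """
    (t0, p0), (t1, p1) = prev, last
    dt = t1 - t0
    vx, vy = (p1[0] - p0[0]) / dt, (p1[1] - p0[1]) / dt
    ahead = now - t1
    return (p1[0] + vx * ahead, p1[1] + vy * ahead)


def interpolate(prev, last, render_time):
    """Blend between two snapshots that bracket `render_time`."""
    (t0, p0), (t1, p1) = prev, last
    return lerp(p0, p1, (render_time - t0) / (t1 - t0))


if __name__ == "__main__":
    # Predicting ahead: both snapshots were taken before the turn at t = 1.0,
    # so the guess for t = 1.5 keeps going straight, past the corner.
    print(extrapolate((0.0, (0.0, 0.0)), (0.5, (2.0, 0.0)), 1.5))   # (6.0, 0.0)

    # Blending instead: snapshots at t = 0.5 and t = 1.5 sit either side of
    # the turn, so at t = 1.0 the player is drawn cutting the corner.
    print(interpolate((0.5, (2.0, 0.0)), (1.5, (4.0, 2.0)), 1.0))   # (3.0, 1.0)
```

In both cases the real player was at or heading around (4, 0), so the other clients see them either run into the wall or slide diagonally through the corner. The longer the interval between positional packets, the worse this gets.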