Quote: "So what are the standard rules for creating a normal map then?"
The basic idea is as follows - bearing in mind that normal maps can be created using different, but similar, conventions.
The easiest to follow, but not the one I used earlier, uses world or object space normals, i.e. XYZ are the usual world or object coords.
A normal vector, (Nx, Ny, Nz) say, has length 1, i.e. Nx^2+Ny^2+Nz^2 = 1, and indicates the direction perpendicular to the surface.
Since the vector can, in general, point in any direction, the components are coded into the colour byte range (0 to 255) as follows:
R = 128+127.0*Nx
G = 128+127.0*Ny
B = 128+127.0*Nz
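Written out as code, that encoding might look like this - a minimal Python sketch (the function name is mine, and the normal is assumed to be unit length already):

def encode_normal(nx, ny, nz):
    # map each component from -1..1 into the byte range 1..255
    return (round(128 + 127.0*nx), round(128 + 127.0*ny), round(128 + 127.0*nz))

print(encode_normal(0.0, 0.0, 1.0))   # (128, 128, 255) - the familiar flat lavender-blue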
Put another way, for a normal map to be valid its RGB values should approximately satisfy
(R-128)^2+(G-128)^2+(B-128)^2 = 127^2 = 16129
(exact equality isn't usually possible because we are working with bytes).
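As a quick sanity test you could check that condition in code; a sketch (the tolerance is an arbitrary choice of mine, to allow for byte rounding):

def looks_like_normal_rgb(r, g, b, tol=600):
    # squared distance from the centre of the colour cube should be near 127^2
    d2 = (r - 128)**2 + (g - 128)**2 + (b - 128)**2
    return abs(d2 - 127**2) <= tol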
All normal maps are coded this way (although some people might use the multiplier 128 instead of 127, in which case the value 256 would have to be reduced to 255 to give a valid byte).
For example, a typical pixel near the centre of the image, at (u, v) = (237, 235), from the normal map I posted earlier has RGB values (57, 34, 178). This gives
(R-128)^2+(G-128)^2+(B-128)^2 = 16377
(which is much closer to 128^2 = 16384 than to 127^2 = 16129, and rather suggests I used the multiplier 128 rather than 127).
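You can verify that arithmetic directly:

r, g, b = 57, 34, 178
print((r - 128)**2 + (g - 128)**2 + (b - 128)**2)   # 16377
print(127**2, 128**2)                               # 16129 16384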
When the normal map is read by a shader, the values are scaled into the range 0 to 1, with 128 corresponding (near enough) to 0.5 - but they need to be scaled again so that each component is in the range -1 to 1. Hence you'll see things like the following in most normal mapping shaders:
float3 normal = 2 * tex2D(sampler1, In.Tex) - 1.0;
The "tex2D" bit is just reading a colour value from the normal map and each RGBA component is in the range 0 to 1 initially. The rest of the calculation is just re-scaling to the range -1 to 1. The "float3" bit just means that we are ignoring the alpha component in this case.
That's the easy part.
There's one more complication I haven't mentioned so far. Most normal mapping shaders use "texture space" coordinates rather than world or object space coordinates. In this system the values Nx, Ny and Nz are coded into colour bytes as described above - but the coordinates are interpreted differently.
In this system, "X" (or "red") corresponds to the "U" texture coordinate direction in the object's polygons (this will almost always correspond to different directions in world or object space for different polys).
The "Y" (or "green") component corresponds to the polygon's "V" coordinate - which is a common source of error because V coords increase as you go "down" an image whereas "Y" in world or object coordinates usually increase in the "up" direction. (My early normal maps got this the wrong way round.

)
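If you inherit a map baked with the opposite green convention, the usual fix is simply to invert the green channel. A sketch using the Pillow imaging library (the file names are made up):

from PIL import Image

img = Image.open("normalmap.png").convert("RGB")
r, g, b = img.split()
g = g.point(lambda v: 255 - v)   # flip the V/"green" convention
Image.merge("RGB", (r, g, b)).save("normalmap_flipped.png")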
The "Z" (or "blue")coordinate corresponds to the direction perpendicular to the polygon, with positive values pointing away from the object. For this reason most normal maps will have blue bytes in the range 128 to 255 - values less than 128 are legitimate but not really sensible as they would make the surface appear dark or black when the lighting calculation is done.
I'll spare you, for now, the details of how shaders convert the world light direction into texture space coords for a given polygon.