The Get Vertexdata Ptr() command appears without fanfare or explanation in the DBPro help files, but it can be used to read and manipulate vertex data at least 2x faster than using the standard vertexdata access commands.
This sort-of-tutorial is mostly pieced together from a fair bit of forum searching and a little trial and error. I haven’t kept tabs on what I got where, but I believe Rudolpho and IanM have supplied the best of it.
Background
DBOData.h in the DBP source code contains the definition of ‘sMesh’ and its supporting structures. The most relevant parts are given here:
struct sMeshFVF
{
// mesh flexible vertex format data
DWORD dwFVFOriginal; // flexible vertex format original
DWORD dwFVF; // flexible vertex format
DWORD dwFVFSize; // size of flexible vertex format
};
struct sMeshDraw
{
// mesh draw information
int iMeshType; // put it to handle terrain scene culling (mesh visible flag)
BYTE* pOriginalVertexData; // pointer to original vertex data
BYTE* pVertexData; // pointer to vertex data
WORD* pIndices; // index array
DWORD dwVertexCount; // number of vertices
DWORD dwIndexCount; // number of indices
int iPrimitiveType; // primitive type
int iDrawVertexCount; // number of vertices to be used when drawing
int iDrawPrimitives; // number of actual primitives to draw
sDrawBuffer* pDrawBuffer; // VB and IB buffers used by the mesh
// reserved members
DWORD dwReservedMD1; // reserved - maintain plugin compat.
DWORD dwReservedMD2; // reserved - maintain plugin compat.
DWORD dwReservedMD3; // reserved - maintain plugin compat.
};
…
struct sMesh : public sMeshFVF,
sMeshDraw,
sMeshShader,
sMeshBones,
sMeshTexture,
sMeshInternalProperties,
sMeshExternalProperties,
sMeshMultipleAnimation
{
// contains all data for a mesh
sCollisionData Collision; // collision information
// reserved members
DWORD dwTempFlagUsedDuringUniverseRayCast; // V119 to fix bugged universe ray cast of large meshes
DWORD dwReservedM2; // reserved - maintain plugin compat.
DWORD dwReservedM3; // reserved - maintain plugin compat.
// constructor and destructor
sMesh ( );
~sMesh ( );
};
(Disclaimer: I don’t actually know or write C++.) The sMesh struct is made by smushing together all those other sMeshXX structs and appending its own bit at the end. So if you look at it in memory you’ll first find the three DWORDs of sMeshFVF, then immediately next to it you’ll find the int, pointers, dwords etc. of sMeshDraw, and so on. Put another way, sMesh inherits from multiple other structures, whose fields are arranged in sMesh’s memory by a basic process of concatenation.
The long and short of it is, all the fields are of a fixed size, so given the address of the first field, you can offset it to get a pointer to any other.
Get Vertexdata Ptr will give you that first-field address.
So, for example, if we want the field sMeshDraw::pVertexData, we have to skip over three DWORDs (sMeshFVF's dwFVFOriginal, dwFVF and dwFVFSize), an int (sMeshDraw's iMeshType) and a pointer (sMeshDraw's pOriginalVertexData). DBPro programs are 32-bit, so all of these datatypes, pointers included, are 4 bytes each, and we offset by (5 fields x 4 bytes) = 20 bytes.
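The offset arithmetic can be checked outside DBPro with a small C++ sketch. This is not the real DBOData.h: MeshFVF32, MeshDraw32, MeshHead32 and byteOffset are made-up names, only the fields up to dwIndexCount are mirrored, and pointers are stored as 32-bit integers so that the numbers match a 32-bit DBPro process even on a 64-bit compiler. It also assumes the compiler lays out base structs in declaration order, which the common ones do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 32-bit pointer (4 bytes, as in a DBPro exe).
using Ptr32 = std::uint32_t;

// Mirror of sMeshFVF: three DWORDs.
struct MeshFVF32 {
    std::uint32_t dwFVFOriginal;
    std::uint32_t dwFVF;
    std::uint32_t dwFVFSize;
};

// Mirror of the first six fields of sMeshDraw.
struct MeshDraw32 {
    std::int32_t  iMeshType;
    Ptr32         pOriginalVertexData;
    Ptr32         pVertexData;
    Ptr32         pIndices;
    std::uint32_t dwVertexCount;
    std::uint32_t dwIndexCount;
};

// Like sMesh, this inherits from several structs; their fields end up
// concatenated in memory, one base after the other.
struct MeshHead32 : MeshFVF32, MeshDraw32 {};

// Distance in bytes from the start of the mesh to one of its fields.
template <typename Field>
std::size_t byteOffset(const MeshHead32& mesh, const Field& field) {
    const unsigned char* start = reinterpret_cast<const unsigned char*>(&mesh);
    const unsigned char* at    = reinterpret_cast<const unsigned char*>(&field);
    return static_cast<std::size_t>(at - start);
}
```

Calling byteOffset(m, m.pVertexData) on a MeshHead32 m measures the same distance that the five-fields-times-four-bytes count above works out by hand; the other fields can be checked the same way.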
Now truth be told, I think this is all slightly hacky, but this is DBPro ¯\_( ツ )_/¯
For a given limb, the relevant mesh data can be found like this:
lock vertexdata for limb idObj, idLimb, lockMode
pMeshData as dword
pVertexData as dword
pIndexData as dword
pMeshData = get vertexdata ptr()
pVertexData = peek dword(pMeshData + 20)
pIndexData = peek dword(pMeshData + 24)
unlock vertexdata
Where lockMode is optional (defaults to 0, but 1 is fastest).
Now mesh properties can be accessed (even after unlocking, if you like) in this way:
fvf = peek dword(pMeshData + 4) // dwFVF
fvfSize = peek dword(pMeshData + 8) // dwFVFSize
meshType = peek integer(pMeshData + 12) // iMeshType
vtxCount = peek dword(pMeshData + 28) - 1 // dwVertexCount - 1, i.e. the index of the last vertex
idxCount = peek dword(pMeshData + 32) - 1 // dwIndexCount - 1, i.e. the position of the last index
The following is an example of reading vertex data. We follow the pVertexData pointer to the first vertex in a list of vertices. What data the vertices carry depends on the FVF, but position and normal data are always present unless deliberately excluded, and always occupy the first few data items, so are straightforward to read:
pRead as dword
pRead = pVertexData
for idVtx = 0 to vtxCount
pos_x# = peek float( pRead )
pos_y# = peek float( pRead + 4 )
pos_z# = peek float( pRead + 8 )
nor_x# = peek float( pRead + 12 )
nor_y# = peek float( pRead + 16 )
nor_z# = peek float( pRead + 20 )
print "# ", idVtx, ": p[", pos_x#, ", ", pos_y#, ", ", pos_z#, "] n[", nor_x#, ", ", nor_y#, ", ", nor_z#, "]"
inc pRead, fvfSize
next idVtx
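For anyone who finds the stride-walking easier to follow in another language, here is the same loop as a C++ sketch over a plain byte buffer. The names (Vec3, VertexPN, readFloat, readVertices) are invented for illustration, and it makes the same assumption as the DBPro loop: position at byte 0 and normal at byte 12 of each vertex. Note that it takes the actual vertex count, not count minus one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct Vec3 { float x, y, z; };
struct VertexPN { Vec3 position; Vec3 normal; };

// Read one float from raw bytes (the equivalent of peek float).
// memcpy sidesteps alignment and aliasing concerns.
float readFloat(const std::uint8_t* p) {
    float value;
    std::memcpy(&value, p, sizeof value);
    return value;
}

// Walk `count` vertices that are `stride` bytes apart (stride = fvfSize).
// Assumes position at offset 0 and normal at offset 12 in every vertex.
std::vector<VertexPN> readVertices(const std::uint8_t* data,
                                   std::size_t count,
                                   std::size_t stride) {
    std::vector<VertexPN> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const std::uint8_t* v = data + i * stride;
        VertexPN vert;
        vert.position = Vec3{readFloat(v),      readFloat(v + 4),  readFloat(v + 8)};
        vert.normal   = Vec3{readFloat(v + 12), readFloat(v + 16), readFloat(v + 20)};
        out.push_back(vert);
    }
    return out;
}
```

Whatever else the FVF packs into a vertex (colours, UVs) is simply stepped over by the stride.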
The easiest way to be sure that data is where it should be is to be strict about what FVF types are allowed in the program, but the layout is quite simple (a one-item-after-the-other affair like the struct) and offsets to each data item can be generated for any given FVF.
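As a rough idea of what "generating offsets for any given FVF" might look like, here is a C++ sketch. The flag values are the standard Direct3D 9 FVF bits, copied in so the code compiles without the DirectX headers; I am assuming dwFVF uses exactly those bits. FvfLayout, describeFvf and ABSENT are made-up names. The sketch only handles plain XYZ positions and two-float texture coordinates, and reports anything else as unsupported rather than guessing.

```cpp
#include <cassert>
#include <cstdint>

// Direct3D 9 FVF bits, repeated here so no SDK header is needed.
constexpr std::uint32_t FVF_XYZ            = 0x002;
constexpr std::uint32_t FVF_NORMAL         = 0x010;
constexpr std::uint32_t FVF_PSIZE          = 0x020;
constexpr std::uint32_t FVF_DIFFUSE        = 0x040;
constexpr std::uint32_t FVF_SPECULAR       = 0x080;
constexpr std::uint32_t FVF_TEXCOUNT_MASK  = 0xF00;
constexpr std::uint32_t FVF_TEXCOUNT_SHIFT = 8;
constexpr std::uint32_t FVF_POSITION_MASK  = 0x400E;

constexpr int ABSENT = -1;  // hypothetical marker: item not in this format

struct FvfLayout {
    bool supported    = true;    // false if the sketch cannot describe it
    int  position     = ABSENT;  // byte offsets within one vertex
    int  normal       = ABSENT;
    int  pointSize    = ABSENT;
    int  diffuse      = ABSENT;
    int  specular     = ABSENT;
    int  texCoord0    = ABSENT;  // first UV set; later sets follow at +8 each
    int  texCoordSets = 0;
    int  size         = 0;       // should agree with dwFVFSize
};

FvfLayout describeFvf(std::uint32_t fvf) {
    FvfLayout layout;
    int cursor = 0;

    // Position: only untransformed XYZ (or none) is handled.
    const std::uint32_t positionBits = fvf & FVF_POSITION_MASK;
    if (positionBits == FVF_XYZ) {
        layout.position = cursor;
        cursor += 12;
    } else if (positionBits != 0) {
        layout.supported = false;  // XYZRHW, blend weights, XYZW
        return layout;
    }
    // Bits 16 and up select non-default texture coordinate sizes.
    if ((fvf >> 16) != 0) {
        layout.supported = false;
        return layout;
    }

    // The remaining items follow in this fixed order.
    if (fvf & FVF_NORMAL)   { layout.normal    = cursor; cursor += 12; }
    if (fvf & FVF_PSIZE)    { layout.pointSize = cursor; cursor += 4;  }
    if (fvf & FVF_DIFFUSE)  { layout.diffuse   = cursor; cursor += 4;  }
    if (fvf & FVF_SPECULAR) { layout.specular  = cursor; cursor += 4;  }

    layout.texCoordSets =
        static_cast<int>((fvf & FVF_TEXCOUNT_MASK) >> FVF_TEXCOUNT_SHIFT);
    if (layout.texCoordSets > 0) {
        layout.texCoord0 = cursor;
        cursor += layout.texCoordSets * 8;  // two floats per set
    }
    layout.size = cursor;
    return layout;
}
```

For example, FVF 274 (0x112, position + normal + one UV set) comes out as 32 bytes per vertex, and 338 (0x152, the same plus a diffuse colour) as 36. Comparing the computed size against fvfSize is a cheap sanity check before trusting the offsets.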
In a similar vein, requiring all objects to be indexed simplifies working with them. Here, index data are read and displayed three at a time (every group of three indices representing a triangle):
pRead as dword
pRead = pIndexData
for idIdx = 0 to idxCount step 3
v0 = peek word( pRead )
v1 = peek word( pRead + 2 )
v2 = peek word( pRead + 4 )
print "tri # ", idIdx/3, ": [", v0, " -> ", v1, " -> ", v2, "]"
inc pRead, 6
next idIdx
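The index walk translates just as directly. A minimal C++ sketch, with invented names (Triangle, readTriangles), treating the index data as an array of 16-bit WORDs as the struct declares it; unlike the DBPro loop it takes the real index count rather than count minus one, and it ignores any leftover indices that do not make up a whole triangle.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Three vertex indices forming one triangle.
struct Triangle { std::uint16_t v0, v1, v2; };

// Group a flat 16-bit index list into triangles, three indices at a time.
std::vector<Triangle> readTriangles(const std::uint16_t* indices,
                                    std::size_t indexCount) {
    std::vector<Triangle> tris;
    tris.reserve(indexCount / 3);
    for (std::size_t i = 0; i + 2 < indexCount; i += 3) {
        tris.push_back(Triangle{indices[i], indices[i + 1], indices[i + 2]});
    }
    return tris;
}
```

Because each index is a WORD, a single mesh cannot address more than 65536 vertices this way.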
As demonstrated, vertex data can be read without first calling lock vertexdata. Modifications, however, will not show on the mesh until the next unlock. You might therefore conclude that you should lock when editing but skip the lock when reading, to avoid a locking penalty. In practice there is little-to-no penalty, so it seems sensible to always lock when accessing vertexdata. Feel free to experiment.
This example snippet modifies an object by randomly nudging vertices along their normals:
pVtx as dword
do
position camera 0, 0, 0
rotate camera 10, hitimer()*0.005, 0
move camera -4
lock vertexdata for limb 1,1,1
for v = 0 to vtxCount
if rnd(10) = 0
pVtx = pVertexData + v * fvfSize
poke float pVtx , peek float( pVtx ) + peek float( pVtx + 12 ) * 0.01
poke float pVtx + 4, peek float( pVtx + 4 ) + peek float( pVtx + 16 ) * 0.01
poke float pVtx + 8, peek float( pVtx + 8 ) + peek float( pVtx + 20 ) * 0.01
endif
next v
unlock vertexdata
sync
nice wait 15
loop
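Stripped of the camera and sync code, the write side of that loop is a read-modify-write on three floats per vertex. Here it is as a C++ sketch on a plain byte buffer; loadFloat, storeFloat and nudgeAlongNormals are illustrative names, and for simplicity it moves every vertex by a fixed amount instead of picking vertices at random. As before, position is assumed at byte 0 and normal at byte 12.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Equivalent of peek float.
float loadFloat(const std::uint8_t* p) {
    float value;
    std::memcpy(&value, p, sizeof value);
    return value;
}

// Equivalent of poke float.
void storeFloat(std::uint8_t* p, float value) {
    std::memcpy(p, &value, sizeof value);
}

// Move each vertex by `amount` along its own normal, in place.
// `stride` is the per-vertex size in bytes (fvfSize).
void nudgeAlongNormals(std::uint8_t* data, std::size_t count,
                       std::size_t stride, float amount) {
    for (std::size_t i = 0; i < count; ++i) {
        std::uint8_t* v = data + i * stride;
        for (std::size_t axis = 0; axis < 3; ++axis) {
            std::uint8_t* pos       = v + axis * 4;       // x, y, z at 0, 4, 8
            const std::uint8_t* nor = v + 12 + axis * 4;  // normal at 12, 16, 20
            storeFloat(pos, loadFloat(pos) + loadFloat(nor) * amount);
        }
    }
}
```

In DBPro the equivalent pokes would sit between lock vertexdata and unlock vertexdata so that the mesh actually picks up the change.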
Compatibility
Meshes: works as expected.
Clones: works as expected.
Instances: remain in sync with source object when source vertexdata changes.
Animated objects: works as expected; if crashing, ensure you have accessed the limb containing the mesh and not the (meshless, transform-only) bone limbs.
Sparkys Collision: works as expected; you must re-setup object after vertexdata changes.
Standard Vertexdata Access: can mix methods without apparent issue.
Notes
Safety: Obviously, like in any instance where you’re poking about under the hood, be vigilant: perform operations in a sensible order along clear dividing lines. Be aware that changes to, say, the vertex count cause DBP to resize (and thus reallocate) the data, and will therefore invalidate your corresponding local pointers; refresh them in the same manner you initially acquired them.
So far, I’ve encountered no unexplained crashes or memory leaks. I’ll be using direct memory access in my own project so I’ll report any peculiarities, specific case fixes or fatal flaws here if/when I encounter them in the field.
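To illustrate the pointer-refresh advice under Safety, here is a C++ sketch that uses a std::vector as a stand-in for the engine-owned vertex data. FakeMesh, MeshView, acquireView and viewIsCurrent are all hypothetical; nothing here is DBPro's actual API. The point is only the pattern: cache the raw pointer and count together, and re-acquire both after anything that may have resized the data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a mesh whose storage the engine owns and may reallocate.
struct FakeMesh {
    std::vector<std::uint8_t> vertexBytes;
    std::size_t stride = 32;

    void setVertexCount(std::size_t n) { vertexBytes.resize(n * stride); }
    std::size_t vertexCount() const { return vertexBytes.size() / stride; }
};

// Locally cached values, like pVertexData / vtxCount / fvfSize.
struct MeshView {
    std::uint8_t* vertexData  = nullptr;
    std::size_t   vertexCount = 0;
    std::size_t   stride      = 0;
};

// Acquire (or re-acquire) the cached values in one place.
MeshView acquireView(FakeMesh& mesh) {
    MeshView view;
    view.vertexData  = mesh.vertexBytes.data();
    view.vertexCount = mesh.vertexCount();
    view.stride      = mesh.stride;
    return view;
}

// Cheap staleness check. The count is compared first so that a pointer
// which is known to be outdated is never even looked at.
bool viewIsCurrent(const MeshView& view, const FakeMesh& mesh) {
    return view.vertexCount == mesh.vertexCount()
        && view.stride == mesh.stride
        && view.vertexData == mesh.vertexBytes.data();
}
```

Routing every acquisition through one function means "refresh them in the same manner you initially acquired them" becomes a single call after each resizing operation.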
Copies: You can copy the vertex data to somewhere else in memory without issue (using copy memory, for example). You can modify that copy as you please and then copy it back, overwriting the original. However, you can’t redirect the mesh pointers to your copy directly: the program will at the very least crash on exit, and will crash immediately if you try to free the original data. The problem may be inconsistent state (other places I don’t know about needing updating), and may be fixable, but it’s not something I’ll be looking into.
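The safe version of the copy workflow (copy out, edit, copy back, never repoint) might look like this as a C++ sketch. snapshotVertices and restoreVertices are invented names, and the "live" buffer in any usage would just be ordinary memory standing in for the mesh's vertex data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy the live vertex bytes into a private buffer you own.
std::vector<std::uint8_t> snapshotVertices(const std::uint8_t* live,
                                           std::size_t count,
                                           std::size_t stride) {
    return std::vector<std::uint8_t>(live, live + count * stride);
}

// Overwrite the live data with an edited snapshot. The live pointer itself
// is left alone; only the bytes it points at change. Returns false (and
// writes nothing) if the snapshot no longer matches the live size, for
// instance because the vertex count changed in between.
bool restoreVertices(std::uint8_t* live, std::size_t count, std::size_t stride,
                     const std::vector<std::uint8_t>& snapshot) {
    if (snapshot.size() != count * stride) {
        return false;
    }
    if (!snapshot.empty()) {
        std::memcpy(live, snapshot.data(), snapshot.size());
    }
    return true;
}
```

The size check is there because a snapshot taken before a resize would otherwise be written over data of a different length.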
Summary
- Use direct memory access to speed up basic reading and updating of vertex data.
- Prefer the standard commands for more complex, infrequent changes.
- Remember to keep pointers up to date.
Benchmark 1 – reading vertexdata for one big object
(The file denseGeosphere32Seg.X is attached - place it in the exe folder.)
set display mode desktop width(), desktop height(), 32, 1 : reload display pointer
sync on : sync rate 60 : sync
backdrop on 0 : color backdrop 0, 0xff3040ab
#constant TESTRUNS 50
set text font "courier new"
set text size 14
load object "denseGeosphere32Seg.X", 1
lock vertexdata for limb 1, 1, 1
pMeshData as dword
pVertexData as dword
pIndexData as dword
pMeshData = get vertexdata ptr()
pVertexData = peek dword( pMeshData + 20 )
pIndexData = peek dword( pMeshData + 24 )
fvf as dword, fvfSize as dword, meshType as integer, vtxCount as dword, idxCount as dword
fvf = peek dword(pMeshData + 4)
fvfSize = peek dword(pMeshData + 8)
meshType = peek integer(pMeshData + 12)
vtxCount = peek dword(pMeshData + 28) - 1
idxCount = peek dword(pMeshData + 32) - 1
unlock vertexdata
posx as float, posy as float, posz as float
norx as float, nory as float, norz as float
// Test A1: Lock + DBP Functions (lock+unlock each run)
print "Begin Test A1..." : sync
testA1_t0 = hitimer(10000)
for run = 1 to TESTRUNS
lock vertexdata for limb 1, 1, 1
for i = 0 to idxCount
v = get indexdata(i)
posx = get vertexdata position x(v)
posy = get vertexdata position y(v)
posz = get vertexdata position z(v)
norx = get vertexdata normals x(v)
nory = get vertexdata normals y(v)
norz = get vertexdata normals z(v)
`print posx, ",", posy, ",", posz, " ", norx, ",", nory, ",", norz
next i
unlock vertexdata
next run
testA1_t1 = hitimer(10000)
// Test A2: Lock + DBP Functions (lock once, unlock once)
print "Begin Test A2..." : sync
testA2_t0 = hitimer(10000)
lock vertexdata for limb 1, 1, 1
for run = 1 to TESTRUNS
for i = 0 to idxCount
v = get indexdata(i)
posx = get vertexdata position x(v)
posy = get vertexdata position y(v)
posz = get vertexdata position z(v)
norx = get vertexdata normals x(v)
nory = get vertexdata normals y(v)
norz = get vertexdata normals z(v)
`print posx, ",", posy, ",", posz, " ", norx, ",", nory, ",", norz
next i
next run
unlock vertexdata
testA2_t1 = hitimer(10000)
// Test B: Direct access
print "Begin Test B..." : sync
testB_t0 = hitimer(10000)
lock vertexdata for limb 1, 1, 1
pIdx as dword, pVtx as dword
for run = 1 to TESTRUNS
pIdx = pIndexData
for i = 0 to idxCount
pVtx = pVertexData + peek word(pIdx) * fvfSize
posx = peek float( pVtx )
posy = peek float( pVtx + 4 )
posz = peek float( pVtx + 8 )
norx = peek float( pVtx + 12 )
nory = peek float( pVtx + 16 )
norz = peek float( pVtx + 20 )
`print i, ": ", posx, ",", posy, ",", posz, " ", norx, ",", nory, ",", norz
inc pIdx, 2
next i
next run
unlock vertexdata
testB_t1 = hitimer(10000)
// RESULTS.
set cursor 0,0
box 0, 0, screen width(), screen height(), 0x58000000, 0x80000000, 0xf8000000, 0x88000000
testA1_t# = (testA1_t1 - testA1_t0)
testA2_t# = (testA2_t1 - testA2_t0)
testB_t# = (testB_t1 - testB_t0)
ratioA2A1$ = str$(testA2_t#/testA1_t#,3)
ratioBA1$ = str$(testB_t#/testA1_t#,3)
ratioA1B$ = str$(testA1_t#/testB_t#,3)
ratioA2B$ = str$(testA2_t#/testB_t#,3)
print " -= Test Results =- "
print "ran each test ", TESTRUNS, " times on object with "
print vtxCount+1, " vertices, ", idxCount+1, " indices, and fvf size ", fvfSize, "."
print
print
print "TEST A-1: Lock & DBPro Functions, Realistic (numtests x [lock->read->unlock]) ------------------------------"
print
print str$( testA1_t# * 0.1, 3), " ms"
print " ~", str$( (testA1_t#*0.1) / TESTRUNS, 6), "ms per run"
print " ~", str$( (testA1_t#*0.1) / TESTRUNS / (idxCount+1), 6), "ms per index"
print
print "TEST A-2: Lock & DBPro Functions, Ideal (lock -> numtests x [read] -> unlock) ------------------------------"
print
print str$( testA2_t# * 0.1, 3), " ms"
print " ~", str$( (testA2_t#*0.1) / TESTRUNS, 6), "ms per run"
print " ~", str$( (testA2_t#*0.1) / TESTRUNS / (idxCount+1), 6), "ms per index"
print
print "TEST B: Direct Access --------------------------------------------------------------------------------------"
print
print str$( testB_t# * 0.1, 3), " ms"
print " ~", str$( (testB_t#*0.1) / TESTRUNS, 6), "ms per run"
print " ~", str$( (testB_t#*0.1) / TESTRUNS / (idxCount+1), 6), "ms per index"
print
print
print
print " A1: A2: B"
print "--------------------------------"
print " 1:", padleft$(ratioA2A1$,10), ":", padleft$(ratioBA1$,10)
print padleft$(ratioA1B$,10), ":", padleft$(ratioA2B$,10), ": 1"
print
print
print "Press key to exit..."
sync
nice wait key
end
The following is a screen of the benchmark results on my desktop:
Benchmark 2 – Several primitives, clones, instances
Somewhat less rigorous than the first: run the code as is, then swap the comment-blocking and run it again.
set display mode desktop width(), desktop height(), 32, 1 : reload display pointer
sync on : sync rate 60 : sync
backdrop on 0 : color backdrop 0, 0xff3040ab `0xff5d7deb
set text font "courier new"
set text size 14
// SETUP 16 x 2 ARRAY OF OBJECTS
type tObj
id
pMeshData as dword
pVertexData as dword
pIndexData as dword
vtxCount as dword
idxCount as dword
fvf as dword
fvfSize as dword
endtype
dim arrObj(15,1) as tObj
for x = 0 to 15
for y = 0 to 1
id = find free object()
if y = 0
select x
case 0 : make object plane id, 1.0, 0.8, 1, 1 : endcase
case 1 : make object plane id, 1.0, 0.8, 40, 32 : endcase
case 2 : make object plane id, 1.0, 0.8, 4, 4, 10, 8 : endcase
case 3 : make object box id, 1.0, 1.0, 1.0 : endcase
case 4 : make object cylinder id, 1.0 : endcase
case 5 : make object sphere id, 1.0, 4, 4 : endcase
case 6 : make object sphere id, 1.0, 148, 148 : endcase
case 7 : make object cone id, 1.0 : endcase
case 8,9,10,11,12,13,14,15
clone object id, arrObj(x-8, 0).id
endcase
endselect
lock vertexdata for limb id, 0, 1
arrObj(x,0).pMeshData = get vertexdata ptr()
arrObj(x,0).pVertexData = peek dword(arrObj(x,0).pMeshData + 20)
arrObj(x,0).pIndexData = peek dword(arrObj(x,0).pMeshData + 24)
arrObj(x,0).vtxCount = peek dword(arrObj(x,0).pMeshData + 28) - 1
arrObj(x,0).idxCount = peek dword(arrObj(x,0).pMeshData + 32) - 1
arrObj(x,0).fvf = peek dword(arrObj(x,0).pMeshData + 4)
arrObj(x,0).fvfSize = peek dword(arrObj(x,0).pMeshData + 8)
unlock vertexdata
else
instance object id, arrObj(x, 0).id
arrObj(x,y) = arrObj(x,0)
endif
arrObj(x,y).id = id
position object id, (x-8)*1.5, 0, (y-2)*2
next y
next x
_fvfSize as dword
_pVertexData as dword
pVtx as dword
position camera 0, 0, -16
mmx# = mousemovex()
mmy# = mousemovey()
// MAIN LOOP
do
mmx# = mousemovex()
mmy# = mousemovey()
if mouseclick() = 4
position camera 0,0,0
rotate camera camera angle x() + mmy#*0.5, camera angle y() + mmx#*0.5, 0
move camera -16
else
idPick = pick object(mousex(), mousey(), arrObj(0,0).id, arrObj(15,1).id)
endif
t0 = hitimer(10000)
_readCount = 0
_writeCount = 0
_totalVtxCount = 0
for x = 0 to 15
idObj = arrObj(x,0).id
lock vertexdata for limb idObj, 0, 1
// < DIRECT ACCESS METHOD >
`remstart
_fvfSize = arrObj(x,0).fvfSize
_vtxCount = arrObj(x,0).vtxCount
_pVertexData = arrObj(x,0).pVertexData
for v = 0 to _vtxCount
pVtx = _pVertexData + v * _fvfSize
f# = rnd(5000)*0.000001*RndSgn()
poke float pVtx , peek float(pVtx ) + peek float(pVtx+12)*f#
poke float pVtx+4, peek float(pVtx+4) + peek float(pVtx+16)*f#
poke float pVtx+8, peek float(pVtx+8) + peek float(pVtx+20)*f#
next v
`remend
// < STANDARD ACCESS METHOD >
remstart
_vtxCount = get vertexdata vertex count() - 1
for v = 0 to _vtxCount
f# = rnd(5000)*0.000001*RndSgn()
set vertexdata position v, get vertexdata position x(v) + get vertexdata normals x(v)*f#, get vertexdata position y(v) + get vertexdata normals y(v)*f#, get vertexdata position z(v) + get vertexdata normals z(v)*f#
next v
remend
unlock vertexdata
inc _totalVtxCount, _vtxCount + 1 // _vtxCount holds the last vertex index, so add 1 for the count
inc _readCount, (_vtxCount + 1) * 6
inc _writeCount, (_vtxCount + 1) * 3
next x
for x = 0 to 15
for y = 0 to 1
idObj = arrObj(x,y).id
position object idObj, object position x(idObj), Lerp(object position y(idObj), 0.25*(idObj=idPick), 0.6), object position z(idObj)
next y
next x
t1 = hitimer(10000)
print str$((t1-t0)*0.1,3), " ms"
print _readCount, " reads / ", _writeCount, " writes - ", _totalVtxCount, " verts total"
sync
nice wait 5
set cursor 0,0
loop
function Lerp(a as float, b as float, t as float)
a = a + (b - a) * t
endfunction a
function rndSgn()
out as float
out = rnd(1)*2 - 1
endfunction out
Results on my desktop:
Direct access: about 29.5ms;
standard access: about 80.5ms;
Ratio: 1:2.7 or 0.37:1.
Comparison with Memblocks
I’ve never used memblocks to access vertexdata, so can’t say. I hear they’re a bit slow.