NOTE: The code for the functions is at the bottom of this post. Examples are in the next post.
Hi all,
Long time no see!
I know this looks long and tedious, but trust me, it's not so bad. If you want to skip the theory just read the bits in bold and then look at the simple example in my next post. You should be ready to brainstorm ideas for your game in no time. In fact, here's a super quick sample demonstrating how little needs to be done/understood:
` This program will tell you what usually comes after "Hello"!
` NB_DATATYPE must be defined before the library code is included.
#Constant NB_DATATYPE string
NB_DefaultSetup()
id = NB_NewClassifier()
` Train on three two-word examples.
DIM Data$(2)
Data$(1) = "Hello"
Data$(2) = "World!"
NB_AddTrainData(id, get arrayptr(Data$()))
Data$(1) = "Hello"
Data$(2) = "World!"
NB_AddTrainData(id, get arrayptr(Data$()))
Data$(1) = "Hello"
Data$(2) = "Moon"
NB_AddTrainData(id, get arrayptr(Data$()))
NB_Train(id)
` We want to predict parameter 2 (the second word).
NB_AddClassifyParam(id, 2)
NB_PrepareClassify(id)
` Given "Hello" as the first word, what is the second word most likely to be?
Data$(1) = "Hello"
Data$(2) = "???"
NB_SetClassifyData(id, get arrayptr(Data$()))
second_word$ = NB_Classify(id, 2)
print "Hello ", second_word$
wait key
end
Before I begin, I'd like to say that everything in this post is based on my own understanding of the topics at hand. If anyone spots a flaw, please do point it out.
I've been reading a bit about machine learning lately and decided to implement a Naive Bayes Classifier in DBP. This algorithm is a Supervised Learning Algorithm, meaning you need to provide it with some data to use as a base to learn from. Later I may implement an Unsupervised Learning Algorithm (Genetic Algorithm), whereby the computer learns via trial and error, which would probably be more useful and fun for gaming because even the game's creator wouldn't know what the computer would do next. But for now, here's a simple and popular supervised algorithm.


What is a Naive Bayes Classifier?

In simplest terms, a Naive Bayes Classifier is something which, when given some features, will produce what it expects another feature to be based on data it has already seen. It does this by seeing how likely each option for the feature in question is given the known features, and then picking the most likely case. This process is entirely based on Bayes' Theorem, a formula describing the probability of some event A given that an event B has occurred. We can also calculate it for A given B, C, D, etc., which is what the code in this post does. Anyway, it's Naive because it assumes that B, C, D, etc. do not depend on each other in any way. That is, B happening has nothing to do with C or D happening. This could be false (ex. if B is "lightning strikes a power line" and C is "power goes out", then these two events are clearly related). But we're naive and assume everything is independent because it makes the math nicer. This assumption can cause some problems (ex. if everything in your dataset depends completely on everything else), but usually it works out well.
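For reference, here's Bayes' Theorem for a single known event B, along with the "naive" multi-feature version that the code in this post effectively computes (A is the value being predicted; B, C, D are the known features, assumed independent):

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

$$P(A \mid B, C, D) = \frac{P(B \mid A)\,P(C \mid A)\,P(D \mid A)\,P(A)}{P(B, C, D)}$$

The classifier evaluates this for every possible value of A and picks the one with the highest result.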
You "train" the Classifier on some data, it "learns" from the data and then interprets any new data you give it based on what the old data said.
But enough technical stuff. Here's a concrete example:
Suppose we want to know what spell the player will use next in a turn-based RPG game. We know that some spells are commonly used together, but we don't know which, and we don't want to sit there and think of all the possible combos the player might have come up with. What to do? Well, here are a few creatively named spell sequences that the player has used on us in the past.

Freeze Feet ---> Hover Boulder Over Head ---> Make Boulder Fall

Hover Boulder Over Head ---> Make Boulder Fall ---> Heal Self

Freeze Feet ---> Shatter Ice ---> Reassemble Ice

Freeze Feet ---> Hover Boulder Over Head ---> Heal Self

Freeze Feet ---> Hover Boulder Over Head ---> Make Boulder Fall

Freeze Feet ---> Shatter Ice ---> Freeze Feet

Freeze Feet ---> Hover Boulder Over Head ---> Make Boulder Fall

Freeze Feet ---> Hover Boulder Over Head ---> Shatter Ice

Freeze Feet ---> Hover Boulder Over Head ---> Make Boulder Fall

Make Boulder Fall ---> Heal Self ---> Heal Self
Hmmm. Our player really likes freezing things.
Pretend our player just used Freeze Feet followed by Hover Boulder Over Head. What will he do next? Well, in the past when he froze our feet and hovered a boulder over our head, he made it fall on us. But he also used Heal Self or Shatter Ice. So which will he do this time? Well, in the 6 cases that he's used Freeze Feet followed by Hover Boulder Over Head, 4/6 of those times he used Make Boulder Fall afterwards. The other two times he used either Heal Self or Shatter Ice. But most of the time he just makes the boulder fall. So the AI assumes that's what he's doing this time and prepares for it by using Deflect Boulder.
Bam.
The boulder goes flying back at the player, he loses health and curses the AI for figuring out his favorite combo.
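To make that concrete, here's a rough, untested sketch of how this scenario could be wired up with the functions at the bottom of this post (spell names are stored as plain strings; the DATA/READ juggling is just one way to feed the sequences in):

` Sketch: train on the ten spell sequences above, then predict the third spell.
#Constant NB_DATATYPE string
NB_DefaultSetup()
id = NB_NewClassifier()
DIM Spells$(3)
` Each DATA triple is one observed sequence (first, second, third spell).
DATA "Freeze Feet", "Hover Boulder Over Head", "Make Boulder Fall"
DATA "Hover Boulder Over Head", "Make Boulder Fall", "Heal Self"
DATA "Freeze Feet", "Shatter Ice", "Reassemble Ice"
DATA "Freeze Feet", "Hover Boulder Over Head", "Heal Self"
DATA "Freeze Feet", "Hover Boulder Over Head", "Make Boulder Fall"
DATA "Freeze Feet", "Shatter Ice", "Freeze Feet"
DATA "Freeze Feet", "Hover Boulder Over Head", "Make Boulder Fall"
DATA "Freeze Feet", "Hover Boulder Over Head", "Shatter Ice"
DATA "Freeze Feet", "Hover Boulder Over Head", "Make Boulder Fall"
DATA "Make Boulder Fall", "Heal Self", "Heal Self"
for seq = 1 to 10
read a$ : read b$ : read c$
Spells$(1) = a$ : Spells$(2) = b$ : Spells$(3) = c$
NB_AddTrainData(id, get arrayptr(Spells$()))
next seq
NB_Train(id)
` We want to predict parameter 3 (the third spell in a sequence).
NB_AddClassifyParam(id, 3)
NB_PrepareClassify(id)
` The player just cast Freeze Feet then Hover Boulder Over Head. What's next?
Spells$(1) = "Freeze Feet"
Spells$(2) = "Hover Boulder Over Head"
Spells$(3) = "???"
NB_SetClassifyData(id, get arrayptr(Spells$()))
next_spell$ = NB_Classify(id, 3)
print "Predicted third spell: ", next_spell$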
But what if the player used Heal Self followed by Hover Boulder Over Head? Well now we're in trouble. We've never seen that combo before. But here's where the probability element comes in. In the past when the second move was Hover Boulder Over Head, the player usually used Make Boulder Fall afterwards. Because of this, the Naive Bayes Classifier will still conclude that the player is using Make Boulder Fall next.
The problem arises in a case where the player uses, say, Freeze Feet ---> Heal Self. What will he do next? We haven't seen that before either. Since Freeze Feet was used first, the AI will still think the player is doing the boulder combo. This goes back to our assumption of independence. The AI doesn't know that using Make Boulder Fall depends on whether or not Hover Boulder Over Head was used, so it doesn't realize that its guess can't be accurate.
To get around this you might do a second classification. Given the player used Freeze Feet ---> ??? ---> Make Boulder Fall, what is ??? most likely to be? It'll probably say Hover Boulder Over Head. Since he didn't do that and used Heal Self instead, you know the combo is not exactly what the computer thinks it is. This is not a tried and true solution though; I just thought of it...
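In code, that second pass might look something like this (continuing from the spell sketch above; completely untested, just the idea):

` The player actually cast Freeze Feet ---> Heal Self, but the classifier
` predicted Make Boulder Fall as the next move. Ask what it *expected*
` the middle spell to be, given the first and (predicted) third spells.
NB_ClearClassifyParams(id)
NB_AddClassifyParam(id, 2)
NB_PrepareClassify(id)
Spells$(1) = "Freeze Feet"
Spells$(2) = "???"
Spells$(3) = "Make Boulder Fall"
NB_SetClassifyData(id, get arrayptr(Spells$()))
expected_middle$ = NB_Classify(id, 2)
` The player really used Heal Self, so if the expectation differs,
` the combo is not the one the classifier thinks it is.
if expected_middle$ <> "Heal Self" then print "Unfamiliar combo!"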
There is also a function for retrieving the actual probability obtained from the calculation, which might come in handy.
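For instance, right after a call to NB_Classify() you could check how sure the classifier actually was (the 0.5 cutoff here is arbitrary, purely for illustration):

next_spell$ = NB_Classify(id, 3)
confidence# = NB_GetLastProbability(id)
if confidence# > 0.5
` Confident enough to act on the prediction.
else
` Too uncertain; play it safe instead.
endif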


What can I use this for?

Examples in gaming that I can think of:

Given that the player's party consists of a Mage and 3 Clerics, what team composition should the AI use against it?

The player has selected a Warrior, added lots of Agility and equipped a sword. Based on things that I did while testing the game, what abilities will the player most likely select?

The player uses a Rocket Launcher as his main weapon, likes to spin around in circles and spends more than half of his time jumping in the air. Should the AI use landmines?

The AI has access to the same spells as the player and wants to use his combos against him. What spell does the player usually start with? What's his favorite combo from there?

How is the player going to defend against the AI's next attack? What does he think the AI is about to do?

The tic-tac-toe board looks just so. What should the AI do next? (I believe you could train an AI to play tic-tac-toe without even knowing the rules this way if you wanted to. I may try this later).
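Incidentally, here's a rough, untested sketch of how that tic-tac-toe idea might be encoded (parameters 1-9 are the board cells, parameter 10 is the move the eventual winner played from that position; the setup numbers are guesses):

` Sketch: learn tic-tac-toe moves from recorded games, no rules required.
#Constant NB_DATATYPE string
NB_Setup(1, 1000, 10, 10, 1)
ttt = NB_NewClassifier()
DIM Board$(10)
` For every snapshot from your recorded games: Board$(1)-Board$(9) hold
` "X", "O" or "-" for each cell, Board$(10) holds the winner's next move
` ("1"-"9"). Then: NB_AddTrainData(ttt, get arrayptr(Board$()))
NB_Train(ttt)
NB_AddClassifyParam(ttt, 10)
NB_PrepareClassify(ttt)
` At play time, copy the current board into Board$(1)-Board$(9) and ask
` what a winner would usually do from here.
NB_SetClassifyData(ttt, get arrayptr(Board$()))
move$ = NB_Classify(ttt, 10)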
Examples outside of gaming that I can think of:

Spam filter. Given an email contains the phrases "FREE!", "BUY NOW!", "check out my page" and a tinyurl link, is it spam?

Determining the gender of a name based on its endings (see the sketch after this list).

Determining whether or not a picture contains a fruit.

Determining the type of bird based on its wingspan, flight speed, and rest period duration.
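For example, the name idea might be encoded like this (a hypothetical sketch: parameter 1 is the name's last letter, parameter 2 its second-to-last letter, parameter 3 the gender):

` Sketch: guess the gender of a name from its ending letters.
#Constant NB_DATATYPE string
NB_DefaultSetup()
names = NB_NewClassifier()
DIM Feat$(3)
` One training point per known name, e.g. "Anna" is female:
name$ = "Anna"
Feat$(1) = right$(name$, 1)
Feat$(2) = mid$(name$, len(name$) - 1)
Feat$(3) = "female"
NB_AddTrainData(names, get arrayptr(Feat$()))
` ...add many more names the same way...
NB_Train(names)
NB_AddClassifyParam(names, 3)
NB_PrepareClassify(names)
` Now guess the gender of a name we haven't seen:
name$ = "Maria"
Feat$(1) = right$(name$, 1)
Feat$(2) = mid$(name$, len(name$) - 1)
NB_SetClassifyData(names, get arrayptr(Feat$()))
gender$ = NB_Classify(names, 3)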

Basically anything that requires the classification of something based on the features it has. That's why it's called a Classifier.
The power here lies in your ability to pick good features! Be more creative than I am!


Actual Code Stuff

TODO

Better documentation

More examples

Clean up the code/comment it better

Fix any issues you guys discover

More error handling? Maybe?
How fast is it for a realtime game? Not sure. Only one way to find out.
Commands (I would go through the heavily commented simple example in my next post for a good overview of what's going on)
NB_Setup(num_classifiers, num_training_points, num_training_params, num_param_values, num_classify_params) -- allocates the arrays; the arguments are the maximum capacities you'll need.
NB_DefaultSetup() -- calls NB_Setup() with some reasonable defaults (2 classifiers, 100 training points, 5 parameters, 20 values per parameter, 5 classify parameters).
id = NB_NewClassifier() -- reserves a classifier and returns its id.
NB_SetPriors(id, m, p) -- optionally sets the m-estimate values used to smooth away zero probabilities (sensible defaults are used otherwise).
NB_AddTrainData(id, ptr) -- adds one training example from an array (pass get arrayptr()).
NB_ClearTrainData(id) -- discards all training examples.
NB_AddClassifyParam(id, param) -- registers a parameter you intend to classify.
NB_ClearClassifyParams(id) -- unregisters all classify parameters.
NB_SetClassifyData(id, ptr) -- sets the known feature values to classify from.
NB_Train(id) -- indexes the possible values of each parameter in the training data.
NB_SaveTrainData(id, filename as string, write_func_name as string) -- saves the training data using your write function.
NB_LoadTrainData(id, filename as string, read_func_name as string) -- loads training data saved with NB_SaveTrainData().
NB_PrepareClassify(id) -- precomputes class counts for the registered classify parameters.
NB_Classify(id, param_to_classify) -- returns the most likely value for the given parameter.
NB_GetLastProbability(id) -- returns the probability estimate behind the last classification.
Code as of August 15, 2013 (Alpha) (Requires IanM's Matrix1 Utils) (Examples in next post). Just include the code with your project to use the functions.
Rem ***** Included Source File *****
remstart
******************************************************
Supervised Machine Learning: Naive Bayes Classifier
-Coded by Sixty Squares, 2013
******************************************************
Uses Bayes' Theorem with naive assumptions of independence to determine the most likely outcome, A, given
parameters/features B, C, D, E, etc.
Define #Constant NB_DATATYPE (e.g. string) in your project before including this file.
remend
#Constant NB_INACTIVE -2
#Constant NB_ACTIVE -1
#Constant NB_MIN_INT -2147483648
Type NB_Classifiers_T
active as integer
m as integer
p as float
train_point as integer ` # of training data points
train_param as integer ` # of train params
classify_param as integer ` # of classify params
last_prob_product as float ` The probability product from the last most likely classification.
Endtype
Type NB_TrainingData_T
value as NB_DATATYPE
Endtype
Type NB_ClassifyParams_T
train_param as integer
Endtype
Type NB_PossibleParamValues_T
value as NB_DATATYPE
times_param_is_class as integer
Endtype
Type NB_ParamData_T
num_values as integer ` Number of possible parameter values
Endtype
function NB_DefaultSetup()
num_classifiers = 2
num_training_points = 100
num_training_params = 5
num_param_values = 20
num_classify_params = 5
NB_Setup(num_classifiers, num_training_points, num_training_params, num_param_values, num_classify_params)
endfunction
function NB_Setup(num_classifiers, num_training_points, num_training_params, num_param_values, num_classify_params)
Global NB_NUM_POSSIBLE_PARAM_VALUES as integer : NB_NUM_POSSIBLE_PARAM_VALUES = num_param_values
Global NB_NUM_CLASSIFIERS as integer : NB_NUM_CLASSIFIERS = num_classifiers
Global NB_NUM_CLASSIFY_PARAMS as integer : NB_NUM_CLASSIFY_PARAMS = num_classify_params
DIM NB_Classifiers(num_classifiers) as NB_Classifiers_T
for c = 1 to num_classifiers
NB_Classifiers(c).active = NB_INACTIVE
next c
DIM NB_TrainingData(num_classifiers, num_training_points, num_training_params) as NB_TrainingData_T
DIM NB_PossibleParamValues(num_classifiers, num_training_params, num_param_values) as NB_PossibleParamValues_T
DIM NB_ClassifyParams(num_classifiers, num_classify_params) as NB_ClassifyParams_T
DIM NB_ParamData(num_classifiers, num_training_params) as NB_ParamData_T
endfunction
function NB_NewClassifier()
id = NB_GetFreeClassifier()
NB_Classifiers(id).active = NB_ACTIVE
NB_Classifiers(id).train_point = 0
NB_Classifiers(id).classify_param = 0
NB_Classifiers(id).m = -1
NB_Classifiers(id).p = -1
endfunction id
function NB_SetPriors(id, m as integer, p as float)
NB_Classifiers(id).m = m
NB_Classifiers(id).p = p
endfunction
function NB_AddTrainData(id, ptr)
inc NB_Classifiers(id).train_point, 1
DIM NB_Arr() as NB_DATATYPE
link array NB_Arr(), ptr
ac = array count(NB_Arr())
NB_Classifiers(id).train_param = ac
for i = 1 to ac
NB_TrainingData(id, NB_Classifiers(id).train_point, i).value = NB_Arr(i)
next i
unlink array NB_Arr()
UNDIM NB_Arr()
endfunction
function NB_ClearTrainData(id)
NB_Classifiers(id).train_point = 0
endfunction
function NB_AddClassifyParam(id, param)
` Add a parameter from the training params to classify with
inc NB_Classifiers(id).classify_param, 1
NB_ClassifyParams(id, NB_Classifiers(id).classify_param).train_param = param
endfunction
function NB_ClearClassifyParams(id)
NB_Classifiers(id).classify_param = 0
Endfunction
function NB_SetClassifyData(id, ptr)
`Classify data is stored in the 0th index of training data
DIM NB_Arr() as NB_DATATYPE
link array NB_Arr(), ptr
ac = array count(NB_Arr())
for i = 1 to ac
NB_TrainingData(id, 0, i).value = NB_Arr(i)
next i
unlink array NB_Arr()
UNDIM NB_Arr()
endfunction
` Indexes all of the possible values for each training parameter.
function NB_Train(id)
local class as NB_DATATYPE
for tpoint = 1 to NB_Classifiers(id).train_point
for param = 1 to NB_Classifiers(id).train_param
` Get param value for this data point
class = NB_TrainingData(id, tpoint, param).value
` Check if we already know about it.
found = 0
max_class = NB_ParamData(id, param).num_values
for idx = 1 to max_class
if NB_PossibleParamValues(id, param, idx).value = class
found = 1
exit
endif
next idx
` If not, add it.
if found = 0
inc max_class, 1
NB_PossibleParamValues(id, param, max_class).value = class
NB_ParamData(id, param).num_values = max_class
endif
next param
next tpoint
endfunction
function NB_SaveTrainData(id, filename as string, write_func_name as string)
local write_ptr as dword
local file as integer
write_ptr = get ptr to function(write_func_name)
file = find free datafile()
if file exist(filename) then delete file filename
open datafile to write file, filename
`Write Metadata
write datafile string file, "TRAIN_POINT"
write datafile integer file, NB_Classifiers(id).train_point
write datafile string file, "TRAIN_PARAM"
write datafile integer file, NB_Classifiers(id).train_param
write datafile string file, "METADATA_END"
`Write actual training data
for tpoint = 1 to NB_Classifiers(id).train_point
for param = 1 to NB_Classifiers(id).train_param
call function ptr write_ptr, file, NB_TrainingData(id, tpoint, param).value
next param
next tpoint
close datafile file
endfunction
function NB_LoadTrainData(id, filename as string, read_func_name as string)
read_ptr = get ptr to function(read_func_name)
file = find free datafile()
open datafile to read file, filename
` Read metadata
repeat
s$ = datafile string$(file)
if s$ = "TRAIN_POINT"
train_point = datafile integer(file)
endif
if s$ = "TRAIN_PARAM"
train_param = datafile integer(file)
endif
until s$ = "METADATA_END"
local dim NB_TEMP_DATA(train_param) as NB_DATATYPE
arr_ptr = get arrayptr(NB_TEMP_DATA())
local temp_dat as NB_DATATYPE
` Read in training data. Store in temp local array for storage.
for tpoint = 1 to train_point
for param = 1 to train_param
temp_dat = call function ptr(read_ptr, file)
NB_TEMP_DATA(param) = temp_dat
next param
NB_AddTrainData(id, arr_ptr)
next tpoint
close datafile file
endfunction
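remstart
The write/read functions are supplied by you, since only your project knows
what NB_DATATYPE is. For NB_DATATYPE = string they might look like this
(hypothetical names; pass the names in as strings):

function MyWriteFunc(file, value as string)
write datafile string file, value
endfunction

function MyReadFunc(file)
value$ = datafile string$(file)
endfunction value$

NB_SaveTrainData(id, "training.dat", "MyWriteFunc")
NB_LoadTrainData(id, "training.dat", "MyReadFunc")
remend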
function NB_PrepareClassify(id)
for c = 1 to NB_Classifiers(id).classify_param
` Get the parameter we are calculating for (i.e. Given our data to classify, what will PARAM be?)
param = NB_ClassifyParams(id, c).train_param
` Iterate over each possible classification for this parameter
local class as NB_DATATYPE
for idx = 1 to NB_ParamData(id, param).num_values
class = NB_PossibleParamValues(id, param, idx).value
` Calculate the number of times the parameter = this class
times_param_is_class = 0
for tpoint = 1 to NB_Classifiers(id).train_point
if NB_TrainingData(id, tpoint, param).value = class
inc times_param_is_class
endif
next tpoint
` Store it for easy access later when classifying
NB_PossibleParamValues(id, param, idx).times_param_is_class = times_param_is_class
next idx
next c
endfunction
function NB_Classify(id, param_to_classify)
local other_param_is_our_values_and as integer
local prob_product as float
local best_prob as float : best_prob = NB_MIN_INT * 1.0
local times_param_is_class as integer
local our_specific_case as integer
local specific_case_product as float
local m# as float : local p# as float
` Get the parameter we are calculating for. We want to know what it will be given our data.
param = param_to_classify
` Get m-estimate values to take care of 0 probabilities.
p# = NB_Classifiers(id).p
m# = NB_Classifiers(id).m
` Default values for m and p if none specified.
if p# < 0 : p# = 1.0 / (NB_ParamData(id, param).num_values*1.0) : endif
if m# < 0 : m# = 2 : endif
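` (About the m-estimate: every raw ratio count/total below is replaced with
` (count + m#*p#) / (total + m#), so a feature/class combination that never
` appeared in the training data gets a small nonzero probability instead of
` zeroing out the entire product.)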
` Iterate over each possible classification for this parameter
local class as NB_DATATYPE
local best_class as NB_DATATYPE
for idx = 1 to NB_ParamData(id, param).num_values
class = NB_PossibleParamValues(id, param, idx).value
` We want to calculate P(param_to_classify_value=class | [classification_data]). We can do this by asking ourselves-- if our data did belong
` to this particular class, how likely is it that our data would look the way it does? The formula looks like this:
` P(dataparam1=our_value | param_to_classify=class) * P(dataparam2=our_value | param_to_classify=class)
` * ... * P(dataparam_n=our_value | param_to_classify_value=class). We then have to multiply by the likelihood that
` this class even shows up at all, so we multiply by P(param_to_classify_value=class) at the end. We can also divide by the probability that our specific case
` shows up exactly to give us a better guess at how correct our answer is (Bayes' Formula includes this as part of its definition, but since it's constant
` for a given set of classification data many people ignore it in their implementation of this algorithm.)
` This process will tell us how likely it is that this class is the class that our [classification data] belongs to.
` To find an individual P(dataparam_n=its_value | param_to_classify=class), we can use the training set like so:
` P(dataparam_n=our_value | param_to_classify=class) =
`     (# of cases in training set where dataparam_n=our_value AND param_to_classify=class)
`     / (# of cases in training set where param_to_classify=class)
` This takes care of the (# of cases in training set where param_to_classify=class).
` We calculated it in PrepareClassify().
times_param_is_class = NB_PossibleParamValues(id, param, idx).times_param_is_class
` Now calculate the number of times parameter = this class AND the other parameters = our values. This is the numerator above.
prob_product = 0
specific_case_product = 0
for other_param = 1 to NB_Classifiers(id).train_param
other_param_is_our_values_and = 0
our_specific_case = 0
` Do not calculate it for the parameter we're looking at.
if other_param <> param
` Iterate over all data points in training set
for tpoint = 1 to NB_Classifiers(id).train_point
` Increase counter if other_parameter = our value for other_parameter AND parameter = this class
if NB_TrainingData(id, tpoint, other_param).value = NB_TrainingData(id, 0, other_param).value
if NB_TrainingData(id, tpoint, param).value = class
inc other_param_is_our_values_and, 1
endif
` Increase counter for probability that our specific case shows up for this parameter
inc our_specific_case, 1
endif
next tpoint
` Continue to build our probability product for this particular class.
`print "Adding to prob product: ", (other_param_is_our_values_and * 1.0 + m# * p#)/(times_param_is_class * 1.0 + m#)
prob_product = prob_product + log((other_param_is_our_values_and * 1.0 + m# * p#)/(times_param_is_class * 1.0 + m#))
` Continue to build the probability that our specific case shows up for all parameters
specific_case_product = specific_case_product + log((our_specific_case * 1.0 + m# * p#) / (NB_Classifiers(id).train_point * 1.0 + m#))
`print "SPECIFIC CASE FOR ", NB_TrainingData(id, 0, other_param).value,": ", our_specific_case
endif
next other_param
` Save the result for this potential classification. Finish off the probability calculation by multiplying by
` P(param_to_classify=class) = times_param_is_class / (total number of data points)
` Use logarithms to avoid integer underflow (#'s might get too tiny if we keep multiplying them).
` This is why we add/subtract rather than multiply/divide.
prob_product = prob_product + log(times_param_is_class * 1.0 / (NB_Classifiers(id).train_point * 1.0))
prob_product = prob_product - specific_case_product
if prob_product >= best_prob
best_prob = prob_product
best_class = class
NB_Classifiers(id).last_prob_product = best_prob
endif
next idx
endfunction best_class
function NB_GetLastProbability(id)
prob# = exp( NB_Classifiers(id).last_prob_product)
endfunction prob#
function NB_GetFreeClassifier()
id = 0
repeat
id = id + 1
until NB_Classifiers(id).active = NB_INACTIVE
endfunction id
Anyway, I hope this code is helpful somehow, and if it isn't, I hope the post was at least informative. Thanks for reading!
Hmm. It's been a while.