3 Module rml/classify

This module provides higher-order functions that run classifiers over data sets. Algorithm-specific modules are expected to supply the classifier functions that these procedures apply.

Examples:

> (require rml/data rml/individual rml/classify)
> (define my-classifier (make-my-classifier 5 95.0))
> (displayln (classify iris-data-set default-partition an-iris my-classifier))
(Iris-versicolor)

In this example we create a classifier with the algorithm-specific function make-my-classifier, then pass it to classify to predict classification values for the individual an-iris.

value
classifier/c : contract?

A contract for the classifier functions consumed by the higher-order functions in this module. An algorithm provider would typically export a factory function of the form (-> args ... classifier/c).
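
The factory pattern described above can be illustrated without rml itself. The following is a minimal sketch under stated assumptions: the contract toy-classifier/c, the factory make-majority-classifier, and the representation of training rows as (cons features label) pairs are all hypothetical names invented for this example, and the real classifier/c may have a different shape.

```racket
#lang racket
;; Hypothetical stand-ins: none of these names come from rml.

;; A toy classifier takes training rows, each a (cons features label) pair,
;; plus one feature list, and returns a list of predicted labels.
(define toy-classifier/c
  (-> (listof pair?) list? list?))

;; Factory: configuration arguments in, classifier function out.
(define/contract (make-majority-classifier fallback)
  (-> any/c toy-classifier/c)
  (lambda (training-rows features)
    (if (null? training-rows)
        ;; Nothing to learn from, so return the configured fallback label.
        (list fallback)
        ;; Ignore the features and predict the most common training label.
        (let* ([labels (map cdr training-rows)]
               [groups (group-by values labels)]
               [largest (argmax length groups)])
          (list (car largest))))))
```

The factory closes over its configuration (here only fallback), so the returned function can be handed to any higher-order procedure that expects the contract, which is the role make-my-classifier plays in the example at the top of this section.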

procedure
(classify dataset against-partition individual classifier) → list?
  dataset : data-set?
  against-partition : exact-positive-integer?
  individual : individual?
  classifier : classifier/c

Returns a list of the classifier values predicted for the provided individual, as determined by the algorithm that classifier implements.

3.1 Partitioned Classification

procedure
(partitioned-test-classify dataset train-percentage classifier) → result-matrix?
  dataset : data-set?
  train-percentage : (real-in 1.0 50.0)
  classifier : classifier/c

This procedure uses partition-for-test to split the data set into two partitions: a training partition and a test partition. It then classifies every individual in the test partition against the training partition and records the outcomes in a result-matrix, which can be inspected to determine the accuracy of the classifier.
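
The hold-out flow described above can be sketched in plain Racket. This is an illustration only, not the rml implementation: lists of (cons features label) pairs stand in for data sets, an immutable hash keyed by (actual . predicted) stands in for result-matrix, and split-for-test, holdout-test, nearest-classifier, and accuracy are hypothetical names. The split is deterministic (no shuffling), and the toy classifier returns a single label rather than a list.

```racket
#lang racket
;; Hypothetical sketch: none of these names come from rml.

;; Split rows into two values: the first train-percentage percent, then the rest.
(define (split-for-test rows train-percentage)
  (define n-train
    (exact-floor (* (length rows) (/ train-percentage 100))))
  (split-at rows n-train))

;; Toy classifier: the label of the training row whose single numeric
;; feature is closest to the individual's. Requires non-empty training.
(define (nearest-classifier training features)
  (define best
    (argmin (lambda (row) (abs (- (car (car row)) (car features))))
            training))
  (cdr best))

;; Classify every test row against the training rows and count
;; (actual . predicted) pairs.
(define (holdout-test rows train-percentage classifier)
  (define-values (training test) (split-for-test rows train-percentage))
  (for/fold ([counts (hash)])
            ([row (in-list test)])
    (define predicted (classifier training (car row)))
    (hash-update counts (cons (cdr row) predicted) add1 0)))

;; Fraction of test rows whose predicted label equals the actual label.
(define (accuracy counts)
  (define total (apply + (hash-values counts)))
  (define correct
    (for/sum ([(key n) (in-hash counts)])
      (if (equal? (car key) (cdr key)) n 0)))
  (if (zero? total) 0 (/ correct total)))

(define sample-rows
  '(((1) . a) ((2) . a) ((9) . b) ((10) . b) ((3) . a) ((8) . b)))
(define sample-counts (holdout-test sample-rows 50 nearest-classifier))
```

Keying the counts by (actual . predicted) keeps misclassifications visible per label pair, which is the kind of inspection the result matrix is meant to support.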

procedure
(cross-classify dataset partition-count classifier) → result-matrix?
  dataset : data-set?
  partition-count : exact-positive-integer?
  classifier : classifier/c

This procedure uses partition-equally to split the data set into partition-count partitions. Each partition is then classified against all the others, and the results are collated into a single result-matrix, which can be inspected to determine the accuracy of the classifier.
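
The cross-classification flow can be sketched the same way. This is an illustration under stated assumptions, not the rml implementation: partition-rows, cross-test, and nearest-classifier are hypothetical names, rows are (cons features label) pairs, and a hash keyed by (actual . predicted) stands in for result-matrix. The sketch reads "against all the others" as the union of the remaining partitions, which is the conventional k-fold reading and an assumption here.

```racket
#lang racket
;; Hypothetical sketch: none of these names come from rml.

;; Deal rows round-robin into k partitions of near-equal size.
(define (partition-rows rows k)
  (for/list ([i (in-range k)])
    (for/list ([row (in-list rows)]
               [j (in-naturals)]
               #:when (= (modulo j k) i))
      row)))

;; Toy classifier: the label of the training row whose single numeric
;; feature is closest to the individual's. Requires non-empty training.
(define (nearest-classifier training features)
  (define best
    (argmin (lambda (row) (abs (- (car (car row)) (car features))))
            training))
  (cdr best))

;; Classify each partition against the union of the others and collate
;; all (actual . predicted) counts into one hash. k must be at least 2,
;; otherwise there is no training data left.
(define (cross-test rows k classifier)
  (define parts (partition-rows rows k))
  (for*/fold ([counts (hash)])
             ([i (in-range k)]
              [row (in-list (list-ref parts i))])
    ;; Training data is every partition except the one under test.
    (define training
      (append* (for/list ([part (in-list parts)]
                          [j (in-naturals)]
                          #:unless (= i j))
                 part)))
    (define predicted (classifier training (car row)))
    (hash-update counts (cons (cdr row) predicted) add1 0)))

(define sample-rows
  '(((1) . a) ((2) . a) ((9) . b) ((10) . b) ((3) . a) ((8) . b)))
(define sample-counts (cross-test sample-rows 3 nearest-classifier))
```

Because every row is tested exactly once, the collated counts sum to the size of the data set.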
