3 Module rml/classify
(require rml/classify) | package: rml-core
This module provides higher-order functions that run classifiers over data sets. Algorithm-specific modules are expected to provide the classifier functions that this module applies to those data sets.
Examples:
> (require rml/data rml/individual rml/classify)
> (define my-classifier (make-my-classifier 5 95.0))
> (displayln (classify iris-data-set default-partition an-iris my-classifier))
(Iris-versicolor)
In this example we create a classifier using the algorithm-specific function make-my-classifier and use it in the call to classify to predict classification values for the individual an-iris.
value
classifier/c : contract?
Supplies a contract for the classifier functions used by the higher-order
functions in this module. An algorithm provider would typically include a
factory function of the form
(-> args ... classifier/c).
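The factory pattern described above can be sketched in plain Racket. This is only an illustration: the source does not spell out the shape of classifier/c, so the sketch assumes a classifier takes a data set, a partition index, and an individual, and returns a list of predicted values. The data model (a vector of partitions holding hash-based individuals) and the name make-majority-classifier are hypothetical stand-ins, not part of rml.

```racket
#lang racket/base
(require racket/list)

;; Stand-in data model (an assumption, not rml's own): a data set is a
;; vector of partitions, a partition is a list of individuals, and an
;; individual is an immutable hash from feature name to value.

;; Hypothetical factory: returns a classifier that ignores the individual
;; and predicts the most frequent value of `label` in the given partition.
(define (make-majority-classifier label)
  (lambda (dataset partition-id individual)
    (define labels
      (for/list ([row (in-list (vector-ref dataset partition-id))])
        (hash-ref row label)))
    ;; group-by collects equal labels; argmax picks the largest group.
    (list (first (argmax length (group-by values labels))))))

(define toy-data
  (vector (list (hash 'petal 1.4 'species "setosa")
                (hash 'petal 4.7 'species "versicolor")
                (hash 'petal 4.5 'species "versicolor"))))

(define majority (make-majority-classifier 'species))
(majority toy-data 0 (hash 'petal 4.6)) ; => '("versicolor")
```

The factory captures its configuration (here, the label to predict) in a closure, so the returned function has the same fixed argument list regardless of how the algorithm is parameterized.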
procedure
(classify dataset against-partition individual classifier) → list?
  dataset : data-set?
  against-partition : exact-positive-integer?
  individual : individual?
  classifier : classifier/c
Returns a list of classifier values predicted for the provided individual,
based on the specific algorithm implemented by classifier.
3.1 Partitioned Classification
procedure
(partitioned-test-classify dataset train-percentage classifier) → result-matrix?
  dataset : data-set?
  train-percentage : (real-in 1.0 50.0)
  classifier : classifier/c
This form of classification uses the partition-for-test procedure to create two
partitions: a training data partition and a test data partition. It then classifies
every individual in the test partition against the training partition and records
the results in a result-matrix. The result matrix can then be inspected to determine
the accuracy of the classifier.
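The train/test procedure described above can be sketched in plain Racket. This is a minimal illustration under stated assumptions, not rml's implementation: rows are (feature . label) pairs, the split is deterministic (whether partition-for-test shuffles is not stated here), a hash of (actual . predicted) counts stands in for result-matrix, and nearest-classifier and hold-out-test are hypothetical names.

```racket
#lang racket/base
(require racket/list)

;; Hypothetical stand-in classifier: 1-nearest-neighbour on a single
;; numeric feature; returns a one-element list of predicted labels.
(define (nearest-classifier training individual)
  (define closest
    (argmin (lambda (row) (abs (- (car row) (car individual)))) training))
  (list (cdr closest)))

;; Hold-out evaluation: use the first `train-percentage` percent of `rows`
;; as training data, classify the remainder, and count each
;; (actual . predicted) pair.
(define (hold-out-test rows train-percentage classifier)
  (define train-count
    (inexact->exact (round (* (length rows) (/ train-percentage 100.0)))))
  (define-values (training testing) (split-at rows train-count))
  (for/fold ([tally (hash)]) ([row (in-list testing)])
    (define predicted (first (classifier training row)))
    (hash-update tally (cons (cdr row) predicted) add1 0)))

;; Each row is (feature . label).
(define rows
  '((1.0 . "a") (5.0 . "b") (1.2 . "a") (4.8 . "b") (3.2 . "a") (0.9 . "a")))

(hold-out-test rows 50.0 nearest-classifier)
```

With this toy data the tally holds one correct "a", one correct "b", and one "a" misclassified as "b"; the entries whose actual and predicted labels match are the ones that count toward accuracy.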
procedure
(cross-classify dataset partition-count classifier) → result-matrix?
  dataset : data-set?
  partition-count : exact-positive-integer?
  classifier : classifier/c
This form of classification uses the partition-equally procedure to create
partition-count partitions. Each partition is then classified against
all the others, and the results are collated into a single result-matrix.
The result matrix can then be inspected to determine the accuracy of the classifier.
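The cross-classification scheme described above can be sketched in plain Racket. As before, this is only an illustration under stated assumptions: rows are (feature . label) pairs, folds are dealt round-robin (how partition-equally assigns individuals is not stated here), a hash of (actual . predicted) counts stands in for result-matrix, and nearest-classifier, split-into-folds, and cross-test are hypothetical names.

```racket
#lang racket/base
(require racket/list)

;; Hypothetical stand-in classifier: 1-nearest-neighbour on one feature.
(define (nearest-classifier training individual)
  (define closest
    (argmin (lambda (row) (abs (- (car row) (car individual)))) training))
  (list (cdr closest)))

;; Deal rows round-robin into k folds of roughly equal size.
(define (split-into-folds rows k)
  (for/list ([i (in-range k)])
    (for/list ([row (in-list rows)] [j (in-naturals)]
               #:when (= i (modulo j k)))
      row)))

;; Classify each fold against the union of all other folds and collate
;; every (actual . predicted) pair into a single tally.
(define (cross-test rows k classifier)
  (define folds (split-into-folds rows k))
  (for/fold ([tally (hash)]) ([i (in-range k)])
    (define testing (list-ref folds i))
    (define training
      (append* (for/list ([fold (in-list folds)] [j (in-naturals)]
                          #:unless (= i j))
                 fold)))
    (for/fold ([tally tally]) ([row (in-list testing)])
      (define predicted (first (classifier training row)))
      (hash-update tally (cons (cdr row) predicted) add1 0))))

(define rows
  '((1.0 . "a") (5.0 . "b") (1.2 . "a") (4.8 . "b") (3.2 . "a") (0.9 . "a")))

(cross-test rows 3 nearest-classifier)
```

Because every row is tested exactly once, the counts in the tally sum to the number of rows, which makes the collated result directly comparable across different values of k.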