RapidMiner

From Robin

Revision as of 11:34, 21 October 2010 by Krisny (Talk | contribs)

Installation

Download RapidMiner from the Rapid-I website.


Using SVM Classifier

The SVM classifier in RapidMiner is based on LIBSVM.


Optimising SVM Parameters

The Python script grid.py in the LIBSVM distribution is handy for finding the best parameters for the classifier. This is how to set it up and use it:

  • Make sure the data is formatted properly. All attributes should be scaled to the range -1 to 1. The data must be in a text file in which each line represents one instance of the classification set. Each line starts with the correct class of the instance, followed by index:value pairs: 1:first-attribute 2:second-attribute, and so forth. The example below shows the proper format for two instances of class 1 and two instances of class 2, each with four attributes.
1 1:1.000 2:-0.543 3:-0.767 4:-0.253
1 1:-0.184 2:0.144 3:-0.647 4:-0.271 
2 1:-0.684 2:-0.542 3:0.723 4:-0.244 
2 1:-0.964 2:-1.000 3:0.111 4:-0.472 
  • Run grid.py with the name of your data file as an argument, and you will get a plot showing the best parameters for the LIBSVM classifier (C and gamma).
  • If the Python script gives an error message saying it cannot find the gnuplot executable, locate gnuplot on your computer and change the corresponding path in the script. Using the instructions on this page, gnuplot has been found to install in the /opt/local/bin directory, and not in /usr/local/bin, which is the default in the grid.py script.
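
The data preparation in the first step can be sketched in Python as follows. This is a minimal illustration, not part of LIBSVM or RapidMiner: the function names scale_columns, to_libsvm_line and write_libsvm_file are hypothetical, and the three-decimal formatting is only chosen to match the example above. It assumes every instance has the same number of numeric attributes. LIBSVM also ships its own svm-scale tool, which is probably the better choice for real data.

```python
def scale_columns(rows, lower=-1.0, upper=1.0):
    """Linearly rescale each attribute column to the range [lower, upper]."""
    scaled_columns = []
    for column in zip(*rows):
        lo, hi = min(column), max(column)
        if hi == lo:
            # Constant attribute: no spread to scale, so use the midpoint.
            scaled_columns.append([(lower + upper) / 2.0] * len(column))
        else:
            scaled_columns.append(
                [lower + (upper - lower) * (v - lo) / (hi - lo) for v in column]
            )
    # Transpose back from columns to one list per instance.
    return [list(row) for row in zip(*scaled_columns)]


def to_libsvm_line(label, attributes):
    """Format one instance as 'class 1:value 2:value ...'."""
    pairs = [f"{i}:{v:.3f}" for i, v in enumerate(attributes, start=1)]
    return " ".join([str(label)] + pairs)


def write_libsvm_file(path, labels, rows):
    """Scale all attributes and write one instance per line."""
    with open(path, "w") as handle:
        for label, attributes in zip(labels, scale_columns(rows)):
            handle.write(to_libsvm_line(label, attributes) + "\n")
```

Calling write_libsvm_file("train.txt", labels, rows), where "train.txt" is a placeholder file name, would produce a file that can then be passed to grid.py as described above.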