Milk: Machine Learning Toolkit for Python

This is the code that I use for my research projects.

Where can I get it?

On GitHub, as usual. Alternatively, the Python Package Index also hosts official releases, the latest of which can be installed with:

easy_install milk

or:

pip install milk

if you use these tools.

Examples

Here is how to test, using cross-validation, how well a set of features and labels can be classified:

import numpy as np
import milk
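features = np.random.rand(100, 10)  # 2D array of features: 100 examples of 10 features each
labels = np.zeros(100)
features[50:] += .5  # shift the second half so that the two classes differ
labels[50:] = 1      # and mark those examples as class 1
confusion_matrix, names = milk.nfoldcrossvalidation(features, labels)
# Accuracy is the fraction of examples on the diagonal of the confusion matrix
print('Accuracy: {}'.format(confusion_matrix.trace() / float(confusion_matrix.sum())))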
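
The accuracy computation above can be tried on its own, without milk. This is a minimal sketch using a made-up 2x2 confusion matrix; the counts are invented for illustration, and the per-class figure assumes that rows index the true label and columns the predicted one, which is worth checking against the API documentation:

```python
import numpy as np

# Hypothetical confusion matrix (counts invented for illustration).
# Assumption: rows are true labels, columns are predicted labels.
confusion_matrix = np.array([[45, 5],
                             [8, 42]])

# Overall accuracy: correctly classified examples (the diagonal) over all examples
accuracy = confusion_matrix.trace() / float(confusion_matrix.sum())

# Per-class recall: diagonal entry divided by the number of examples in that row
per_class_recall = confusion_matrix.diagonal() / confusion_matrix.sum(axis=1).astype(float)

print('Accuracy: {:.2f}'.format(accuracy))
print('Per-class recall: {}'.format(per_class_recall))
```

The overall accuracy does not depend on the row/column convention, since the trace and the total are the same either way; only the per-class breakdown does.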

If you want to use a classifier, instantiate a learner object and call its train() method:

import numpy as np
import milk
features = np.random.rand(100,10)
labels = np.zeros(100)
features[50:] += .5
labels[50:] = 1
learner = milk.defaultclassifier()
model = learner.train(features, labels)

# Now you can use the model on new examples:
example = np.random.rand(10)
print(model.apply(example))
example2 = np.random.rand(10)
example2 += .5  # shifted like the second class in the training data
print(model.apply(example2))

More Documentation

API Documentation: http://packages.python.org/milk/

Mailing list: http://groups.google.com/group/milk-users

Features

  • Pythonic interface to libSVM. I took their code and wrote a wrapper by hand instead of using an auto-generated one.
  • Stepwise Discriminant Analysis for feature selection.
  • K-means clustering. A simple implementation, but it works well with very large datasets.
  • Models can be pickle()d and unpickle()d.
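
To make the k-means item above concrete, here is a rough NumPy sketch of the standard algorithm (Lloyd's iteration). It is not milk's implementation: the function name kmeans_sketch and its parameters are invented for this illustration, and the dense distance computation here would not scale to very large datasets.

```python
import numpy as np

def kmeans_sketch(data, k, n_iter=20, seed=0):
    """Plain Lloyd's algorithm; illustrative only, not milk's code."""
    rng = np.random.RandomState(seed)
    # Start from k distinct data points chosen at random
    centroids = data[rng.choice(len(data), k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared distance from every point to every centroid: shape (n, k)
        dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        # Assign each point to its nearest centroid
        assignments = dists.argmin(axis=1)
        # Move each centroid to the mean of the points assigned to it
        for j in range(k):
            members = data[assignments == j]
            if len(members):  # an empty cluster keeps its previous centroid
                centroids[j] = members.mean(axis=0)
    return assignments, centroids

# Two well-separated blobs of 50 points each
rng = np.random.RandomState(42)
data = np.vstack([rng.rand(50, 2), rng.rand(50, 2) + 5.0])
assignments, centroids = kmeans_sketch(data, k=2)
print(centroids)
```

Which blob receives which cluster id depends on the random initialisation, so only the grouping is meaningful, not the ids themselves.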
