How can I call scikit-learn classifiers from Java?

You cannot use jython as scikit-learn heavily relies on numpy and scipy that have many compiled C and Fortran extensions hence cannot work in jython.

The easiest ways to use scikit-learn in a java environment would be to:

  • expose the classifier as a HTTP / Json service, for instance using a microframework such as flask or bottle or cornice and call it from java using an HTTP client library

  • write a commandline wrapper application in python that reads data on stdin and output predictions on stdout using some format such as CSV or JSON (or some lower level binary representation) and call the python program from java for instance using Apache Commons Exec.

  • make the python program output the raw numerical parameters learnt at fit time (typically as an array of floating point values) and reimplement the predict function in java (this is typically easy for predictive linear models where the prediction is often just a thresholded dot product).

The last approach will be a lot more work if you need to re-implement feature extraction in Java as well.

Finally you can use a Java library such as Weka or Mahout that implement the algorithms you need instead of trying to use scikit-learn from Java.

Leave a Comment