The Problem:
The pipeline assumes that LabelBinarizer's fit_transform method takes three positional arguments:
    def fit_transform(self, x, y):
        ...rest of the code
while it is defined to take only two:
    def fit_transform(self, x):
        ...rest of the code
Possible Solution:
This can be solved by making a custom transformer that can handle 3 positional arguments:
- Import and make a new class:
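    from sklearn.base import TransformerMixin  # gives fit_transform method for free
    from sklearn.preprocessing import LabelBinarizer

    class MyLabelBinarizer(TransformerMixin):
        def __init__(self, *args, **kwargs):
            # Forward any constructor arguments to the wrapped encoder
            self.encoder = LabelBinarizer(*args, **kwargs)

        def fit(self, x, y=None):
            # y is accepted but ignored so the pipeline can pass it
            self.encoder.fit(x)
            return self

        def transform(self, x, y=None):
            return self.encoder.transform(x)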
- Keep your code the same, but use the class we created, MyLabelBinarizer(), instead of LabelBinarizer().
Note: If you want access to the LabelBinarizer attributes (e.g. classes_), add the following line to the fit method, before return self:
    self.classes_, self.y_type_, self.sparse_input_ = self.encoder.classes_, self.encoder.y_type_, self.encoder.sparse_input_
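To see the difference in isolation, here is a minimal sketch that imitates the call a pipeline makes on each intermediate step, fit_transform(X, y), first on a plain LabelBinarizer and then on the wrapper. The sample data and the variable names (categories, targets, plain_call_failed, encoded) are made up for illustration, and the exact behaviour may vary between scikit-learn versions.

```python
from sklearn.base import TransformerMixin
from sklearn.preprocessing import LabelBinarizer


class MyLabelBinarizer(TransformerMixin):
    """Wrapper whose fit/transform accept (and ignore) a y argument."""

    def __init__(self, *args, **kwargs):
        self.encoder = LabelBinarizer(*args, **kwargs)

    def fit(self, x, y=None):
        self.encoder.fit(x)
        return self

    def transform(self, x, y=None):
        return self.encoder.transform(x)


# Hypothetical toy data, not taken from the original question.
categories = ["cat", "dog", "bird", "dog"]
targets = [1, 0, 1, 0]

# A plain LabelBinarizer rejects the extra positional argument.
try:
    LabelBinarizer().fit_transform(categories, targets)
    plain_call_failed = False
except TypeError:
    plain_call_failed = True

# The wrapper accepts the same call and one-hot encodes the column.
encoded = MyLabelBinarizer().fit_transform(categories, targets)
print(plain_call_failed)
print(encoded)
```

The columns of encoded follow the sorted class labels (bird, cat, dog), so each row has a single 1 in the column of its category.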