Better repr / str for elm.pipeline.steps wrapped classes? #224

Open
PeterDSteinberg opened this issue Nov 1, 2017 · 1 comment

PeterDSteinberg commented Nov 1, 2017

What can be done to improve the repr / str of the wrapped elm.pipeline.steps classes?

Currently this is a repr of a Pipeline from PR #221 (run from the elm/tests directory):

$ ipython -i test_xarray_cross_validation.py
Python 3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:14:59)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.2.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: pca_regress
Out[1]:
Pipeline(memory=None,
     steps=[('get_y', GetY(layer='y')), ('pca', PCA(copy=True, iterated_power='auto', n_components=None, random_state=None,
  svd_solver='auto', tol=0.0, whiten=False)), ('estimator', Ridge(alpha=1.0, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=False, random_state=None, solver='auto', tol=0.001))])

This looks like sklearn's Pipeline.__repr__ (see pca_regress._cls, which here is sklearn.pipeline.Pipeline). Some methods, such as __repr__, are delegated by calling the method on pca_regress._cls with self as an argument. Should we add a note to the repr about elm.pipeline.steps / MLDataset?
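One way to add such a note could be sketched as follows. This is a minimal illustration and not elm's actual code: `StandInPCA` and `WrappedPCA` are hypothetical stand-ins, and only the `_cls` attribute and the idea of delegating `__repr__` with `self` as the argument come from the description above.

```python
class StandInPCA:
    """Stand-in for a scikit-learn class; not the real sklearn.decomposition.PCA."""

    def __init__(self, n_components=None):
        self.n_components = n_components

    def __repr__(self):
        return "%s(n_components=%r)" % (self.__class__.__name__, self.n_components)


class WrappedPCA(StandInPCA):
    """Hypothetical wrapper that records the wrapped class in `_cls`."""

    _cls = StandInPCA

    def __repr__(self):
        # Delegate to the wrapped class's __repr__ with self as the argument,
        # then append a note naming the wrapper and the class it wraps.
        inner = self._cls.__repr__(self)
        return "%s  <elm.pipeline.steps wrapper of %s>" % (inner, self._cls.__name__)


step_repr = repr(WrappedPCA(n_components=2))
print(step_repr)
```

The note is appended rather than substituted so the delegated repr stays intact; the wording and placement of the note are assumptions.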

PeterDSteinberg commented

Also note that the delegation pattern mentioned above makes some scikit-learn exception messages less informative than they could be, because they format self.__class__.__name__ rather than self._cls.__name__. This leads to the unclear exception message below: if you are using a Pipeline and get an error about Wrapped (one of its steps), it may be hard to tell which transformer/estimator is having a problem:

self = Ridge(alpha=1.0, copy_X=True, fit_intercept=True, max_iter=None,
   normalize=False, random_state=None, solver='auto', tol=0.001)
params = {'copy_x': False, 'fit_intercept': False, 'normalize': False}
valid_params = {'alpha': 1.0, 'copy_X': True, 'fit_intercept': True, 'max_iter': None, ...}, key = 'copy_x', value = False, split = ['copy_x']

    def set_params(self, **params):
        """Set the parameters of this estimator.

            The method works on simple estimators as well as on nested objects
            (such as pipelines). The latter have parameters of the form
            ``<component>__<parameter>`` so that it's possible to update each
            component of a nested object.

            Returns
            -------
            self
            """
        if not params:
            # Simple optimization to gain speed (inspect is slow)
            return self
        valid_params = self.get_params(deep=True)
        for key, value in six.iteritems(params):
            split = key.split('__', 1)
            if len(split) > 1:
                # nested objects case
                name, sub_name = split
                if name not in valid_params:
                    raise ValueError('Invalid parameter %s for estimator %s. '
                                     'Check the list of available parameters '
                                     'with `estimator.get_params().keys()`.' %
                                     (name, self))
                sub_object = valid_params[name]
                sub_object.set_params(**{sub_name: value})
            else:
                # simple objects case
                if key not in valid_params:
                    raise ValueError('Invalid parameter %s for estimator %s. '
                                     'Check the list of available parameters '
                                     'with `estimator.get_params().keys()`.' %
>                                    (key, self.__class__.__name__))
E                   ValueError: Invalid parameter copy_x for estimator Wrapped. Check the list of available parameters with `estimator.get_params().keys()`.
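A possible way to get the wrapped class's name into such messages, sketched under assumptions: the `wrap` factory and `StandInRidge` below are hypothetical and do not reflect how elm builds its wrapped classes. The idea is that if the wrapper subclass is created with the same name as the class it wraps, code that formats `self.__class__.__name__` reports that name instead of `Wrapped`.

```python
class StandInRidge:
    """Stand-in estimator; mimics the error format in the traceback above."""

    def __init__(self, alpha=1.0, copy_X=True):
        self.alpha = alpha
        self.copy_X = copy_X

    def get_params(self):
        return {"alpha": self.alpha, "copy_X": self.copy_X}

    def set_params(self, **params):
        valid_params = self.get_params()
        for key, value in params.items():
            if key not in valid_params:
                raise ValueError("Invalid parameter %s for estimator %s."
                                 % (key, self.__class__.__name__))
            setattr(self, key, value)
        return self


def wrap(cls):
    """Hypothetical factory: subclass `cls`, reuse its name, record it in `_cls`."""
    return type(cls.__name__, (cls,), {"_cls": cls})


Wrapped = wrap(StandInRidge)

try:
    Wrapped().set_params(copy_x=False)  # misspelled on purpose: copy_x vs copy_X
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```

Reusing the wrapped class's name keeps third-party error messages readable without touching scikit-learn, though two distinct classes sharing one name can be confusing elsewhere (for example when debugging or pickling), so this is a trade-off rather than a clear fix.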
