Having said that, you can query sklearn.preprocessing.StandardScaler
for the fit parameters:
scale_ : ndarray, shape (n_features,)
    Per feature relative scaling of the data. New in version 0.17: scale_ is recommended instead of deprecated std_.
mean_ : array of floats with shape [n_features]
    The mean value for each feature in the training set.
The following short snippet illustrates this:
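from sklearn import preprocessing
import numpy as np

s = preprocessing.StandardScaler()
# Fit on a single feature with four samples (column vector of shape (4, 1)).
s.fit(np.array([[1., 2, 3, 4]]).T)
# Inspect the fitted per-feature mean and scale.
print(s.mean_, s.scale_)
# [2.5] [1.11803399]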
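To make the meaning of the two attributes concrete, here is a minimal NumPy-only sketch that recomputes them by hand for the same data. It relies on the fact that StandardScaler uses the population standard deviation (equivalent to ddof=0); the variable names are illustrative and not part of the original answer.

```python
import numpy as np

# Same data as in the snippet above: one feature, four samples.
X = np.array([[1., 2, 3, 4]]).T

# Per-feature mean, corresponding to s.mean_.
mean = X.mean(axis=0)

# Per-feature population standard deviation (ddof=0), corresponding to s.scale_.
scale = X.std(axis=0, ddof=0)

print(mean, scale)
```

One known difference: for a constant feature, StandardScaler sets scale_ to 1 rather than 0, so this equivalence should only be assumed for features with nonzero variance.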
Reference: https://stackoverflow.com/questions/35944783/how-to-store-scaling-parameters-for-later-use
Solution:
>>> from sklearn import preprocessing
>>> import numpy as np
>>>
>>> s = preprocessing.StandardScaler()
>>> s.fit(np.array([[1., 2, 3, 4]]).T)
StandardScaler(copy=True, with_mean=True, with_std=True)
>>> s.mean_, s.scale_
(array([2.5]), array([1.11803399]))
>>> s.transform(np.array([[1., 2, 3, 4]]).T)
array([[-1.34164079],
       [-0.4472136 ],
       [ 0.4472136 ],
       [ 1.34164079]])
>>> (1-s.mean_)/s.scale_
array([-1.34164079])
>>> a=np.array([1,2,3])
>>> b=np.array([1,2,3])
>>> a==b
array([ True,  True,  True])
>>> (np.array([1., 2, 3, 4])-s.mean_)/s.scale_
array([-1.34164079, -0.4472136 ,  0.4472136 ,  1.34164079])
This gives the same result as transform.
As you can see, to use StandardScaler offline, you only need the two key parameters s.mean_ and s.scale_!
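As a rough sketch of what storing the parameters for later use could look like, the example below serializes the two arrays to JSON and reapplies them with NumPy alone. The JSON layout, the key names "mean" and "scale", and the helper standardize are assumptions made for illustration; the numeric values are the ones printed by the fitted scaler above.

```python
import json

import numpy as np

# Values reported by the fitted scaler above. With a live scaler `s` these
# would come from s.mean_.tolist() and s.scale_.tolist().
params = {"mean": [2.5], "scale": [1.11803399]}

# Serialize to JSON text; this string could equally be written to a file.
payload = json.dumps(params)


def standardize(x, payload):
    """Apply (x - mean) / scale using parameters loaded from JSON text."""
    p = json.loads(payload)
    x = np.asarray(x, dtype=float)
    return (x - np.array(p["mean"])) / np.array(p["scale"])


# Offline side: only NumPy and the stored parameters are needed.
z = standardize([[1.], [2.], [3.], [4.]], payload)
print(z.ravel())
```

JSON is chosen here only because it is readable and language-neutral; any format that preserves the two arrays would serve the same purpose.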