MinMaxScaler
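I. Summary
One-sentence summary:
MinMaxScaler performs min-max normalization. Call fit first to learn each feature's minimum and maximum, then call transform to scale the data; the two steps can also be combined into a single fit_transform call.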
>>> from sklearn.preprocessing import MinMaxScaler
>>> data = [[-1, 2], [-0.5, 6], [0, 10], [1, 18]]
>>> scaler = MinMaxScaler()
>>> print(scaler.fit(data))
MinMaxScaler()
>>> print(scaler.data_max_)
[ 1. 18.]
>>> print(scaler.transform(data))
[[0.   0.  ]
 [0.25 0.25]
 [0.5  0.5 ]
 [1.   1.  ]]
>>> print(scaler.transform([[2, 2]]))
[[1.5 0. ]]
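The output above can be checked by hand. With the default feature_range=(0, 1), each column is scaled as (X - min) / (max - min), where min and max are taken per column from the data passed to fit. The following is a minimal sketch of that check; the names col_min, col_max, and manual are illustrative and not part of scikit-learn.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

data = np.array([[-1, 2], [-0.5, 6], [0, 10], [1, 18]], dtype=float)

# Per-column minimum and maximum, the statistics that fit() learns
col_min = data.min(axis=0)
col_max = data.max(axis=0)

# Min-max formula for the default feature_range=(0, 1)
manual = (data - col_min) / (col_max - col_min)

# Same computation done by the scaler
from_sklearn = MinMaxScaler().fit_transform(data)

print(np.allclose(manual, from_sklearn))
```

For a different feature_range=(a, b), the scaled value is additionally multiplied by (b - a) and shifted by a.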
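1. Is the training set normalized with scaler.fit_transform, and the validation and test sets with scaler.transform?
(1) training_set_scaled = sc.fit_transform(training_set) # compute the training set's own statistics (its maximum and minimum) and normalize the training set with them
(2) test_set = sc.transform(test_set) # normalize the test set using the statistics learned from the training set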
# Normalization
sc = MinMaxScaler(feature_range=(0, 1))  # define the scaler: scale values into the range (0, 1)
print(sc)
MinMaxScaler(copy=True, feature_range=(0, 1))
-------------
training_set_scaled = sc.fit_transform(training_set)  # compute the training set's own statistics (its maximum and minimum) and normalize the training set with them
test_set = sc.transform(test_set)  # normalize the test set using the statistics learned from the training set
print(training_set_scaled[:5,])
print(test_set[:5,])
[[0.011711  ]
 [0.00980951]
 [0.00540518]
 [0.00590914]
 [0.00489135]]
[[0.84288404]
 [0.85345726]
 [0.84641315]
 [0.87046756]
 [0.86758781]]
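One consequence of fitting on the training set only is that test values are not guaranteed to land inside feature_range. The sketch below uses small made-up arrays (train and test are hypothetical, not the data from this post) to show that the scaler keeps the training minimum and maximum and that a test value above the training maximum scales to more than 1.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical data: one test value exceeds the training range
train = np.array([[10.0], [20.0], [30.0]])
test = np.array([[25.0], [40.0]])

sc = MinMaxScaler(feature_range=(0, 1))
train_scaled = sc.fit_transform(train)  # learns min=10 and max=30 from train only
test_scaled = sc.transform(test)        # reuses the training min and max

print(sc.data_min_, sc.data_max_)  # statistics come from the training set
print(test_scaled.ravel())         # 40 is above the training max, so it maps above 1
```

Calling fit_transform on the test set instead would rescale it with its own minimum and maximum, so the two sets would no longer be on the same scale.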
II. MinMaxScaler
training_set_scaled = sc.fit_transform(training_set) # compute the training set's own statistics (its maximum and minimum) and normalize the training set with them
Signature: sc.fit_transform(X, y=None, **fit_params)
Docstring:
Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params
and returns a transformed version of X.

Parameters
----------
X : numpy array of shape [n_samples, n_features]
    Training set.

y : numpy array of shape [n_samples]
    Target values.

**fit_params : dict
    Additional fit parameters.

Returns
-------
X_new : numpy array of shape [n_samples, n_features_new]
    Transformed array.
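As the docstring says, fit_transform fits and then transforms the same data. A minimal sketch of that equivalence, using a small illustrative array X that is not taken from this post:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[1.0], [3.0], [5.0]])  # illustrative data

# One call: learn min and max from X and scale X
one_step = MinMaxScaler().fit_transform(X)

# Two calls: fit() returns the fitted scaler, so transform() can be chained
two_step = MinMaxScaler().fit(X).transform(X)

print(np.allclose(one_step, two_step))
```

MinMaxScaler is unsupervised, so the y argument in the signature is accepted for API consistency and is not used for scaling.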
=================================================================================
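training_set = maotai.iloc[0:2426 - 300, 2:3].values # opening prices of the first 2426 - 300 = 2126 days as the training set; rows are counted from 0, and 2:3 selects the half-open column range [2, 3), i.e. column C, the opening price
test_set = maotai.iloc[2426 - 300:, 2:3].values # opening prices of the last 300 days as the test set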
print(training_set.shape)
print(test_set.shape)
# Normalization
sc = MinMaxScaler(feature_range=(0, 1)) # define the scaler: scale values into the range (0, 1)
print(sc)
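training_set_scaled = sc.fit_transform(training_set) # compute the training set's own statistics (its maximum and minimum) and normalize the training set with them
test_set = sc.transform(test_set) # normalize the test set using the statistics learned from the training set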
print(training_set_scaled[:5,])
print(test_set[:5,])
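The column slice 2:3 matters here because MinMaxScaler expects a 2-D array of shape [n_samples, n_features]. The sketch below replays the split and scaling on a small made-up DataFrame; df, its column names and order, and n_test are all hypothetical stand-ins for the maotai data, which is not shown in this post.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for the maotai DataFrame: 10 rows, opening price at column index 2
df = pd.DataFrame({
    "date": pd.date_range("2020-01-01", periods=10),
    "close": np.linspace(101.0, 110.0, 10),
    "open": np.linspace(100.0, 109.0, 10),
})

n_test = 3  # hypothetical test size; the post uses 300
training_set = df.iloc[0:len(df) - n_test, 2:3].values  # first 7 rows, kept 2-D
test_set = df.iloc[len(df) - n_test:, 2:3].values       # last 3 rows, kept 2-D

# The slice 2:3 keeps the column axis; a bare index 2 would return a 1-D array
print(training_set.shape, df.iloc[:, 2].values.shape)

sc = MinMaxScaler(feature_range=(0, 1))
training_set_scaled = sc.fit_transform(training_set)  # fit on the training rows only
test_set_scaled = sc.transform(test_set)              # reuse the training min and max
print(training_set_scaled.min(), training_set_scaled.max())
```

The split is chronological rather than shuffled, so in this rising toy series the later test rows scale to values above 1.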