We often run into the situation where training takes a long time, yet what we actually need afterwards is just the trained weights and bias. So how do we separate training from testing?
TensorFlow provides operations for saving and loading models. The examples found online only demonstrate them very briefly, so here is a small neural-network program to test them properly.
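At its core, the save/restore mechanism is just tf.train.Saver. Below is a minimal sketch of the round trip (assuming TensorFlow 1.x; the variable v and the ckpt_dir path are placeholder names, not part of the program in this post):

import tensorflow as tf

# a toy graph with one variable; the name 'v' is what the checkpoint stores it under
v = tf.Variable(tf.zeros([2]), name='v')
saver = tf.train.Saver()

# save: write the current variable values to a checkpoint
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, 'ckpt_dir/model.ckpt')   # the ckpt_dir folder must already exist

# restore: reuse the same graph, then load values instead of initializing
with tf.Session() as sess:
    saver.restore(sess, 'ckpt_dir/model.ckpt')
    print(sess.run(v))                        # the restored values

The key point is that Saver matches variables by name ('v', 'weights', 'bias', ...), so the graph built at restore time must define variables with the same names and shapes as the graph that was saved.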
This post works with the Titanic dataset:
Train.py
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split

################################
# Preparing Data
################################

# read data from file
data = pd.read_csv('data/train.csv')

# fill nan values with 0
data = data.fillna(0)
# convert ['male', 'female'] values of Sex to [1, 0]
data['Sex'] = data['Sex'].apply(lambda s: 1 if s == 'male' else 0)
# 'Survived' is the label of one class,
# add 'Deceased' as the other class
data['Deceased'] = data['Survived'].apply(lambda s: 1 - s)

# select features and labels for training
dataset_X = data[['Sex', 'Age', 'Pclass', 'SibSp', 'Parch', 'Fare']].as_matrix()
dataset_Y = data[['Deceased', 'Survived']].as_matrix()

# split training data and validation set data
X_train, X_val, y_train, y_val = train_test_split(dataset_X, dataset_Y,
                                                  test_size=0.2,
                                                  random_state=42)

################################
# Constructing Dataflow Graph
################################

# create symbolic variables
X = tf.placeholder(tf.float32, shape=[None, 6])
y = tf.placeholder(tf.float32, shape=[None, 2])

# weights and bias are the variables to be trained
weights = tf.Variable(tf.random_normal([6, 2]), name='weights')
bias = tf.Variable(tf.zeros([2]), name='bias')
y_pred = tf.nn.softmax(tf.matmul(X, weights) + bias)

# Minimise cost using cross entropy
# NOTE: add an epsilon (1e-10) when calculating log(y_pred),
# otherwise the result will be -inf
cross_entropy = - tf.reduce_sum(y * tf.log(y_pred + 1e-10),
                                reduction_indices=1)
cost = tf.reduce_mean(cross_entropy)

# use gradient descent optimizer to minimize cost
train_op = tf.train.GradientDescentOptimizer(0.001).minimize(cost)

# calculate accuracy
correct_pred = tf.equal(tf.argmax(y, 1), tf.argmax(y_pred, 1))
acc_op = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

################################
# Training and Evaluating the model
################################
saver = tf.train.Saver()
# use session to run the calculation
with tf.Session() as sess:
    # variables have to be initialized at the first place
    tf.global_variables_initializer().run()

    # training loop
    for epoch in range(10):
        total_loss = 0.
        for i in range(len(X_train)):
            # prepare feed data and run
            feed_dict = {X: [X_train[i]], y: [y_train[i]]}
            _, loss = sess.run([train_op, cost], feed_dict=feed_dict)
            total_loss += loss
        # display loss per epoch
        print('Epoch: %04d, total loss=%.9f' % (epoch + 1, total_loss))

    # save the trained variables to a checkpoint
    saver_path = saver.save(sess, "wjy_data/model.ckpt")

    # Accuracy calculated by TensorFlow
    accuracy = sess.run(acc_op, feed_dict={X: X_val, y: y_val})
    print("Accuracy on validation set: %.9f" % accuracy)

    # Accuracy calculated by NumPy
    pred = sess.run(y_pred, feed_dict={X: X_val})
    correct = np.equal(np.argmax(pred, 1), np.argmax(y_val, 1))
    numpy_accuracy = np.mean(correct.astype(np.float32))
    print("Accuracy on validation set (numpy): %.9f" % numpy_accuracy)

    # predict on test data
    testdata = pd.read_csv('data/test.csv')
    testdata = testdata.fillna(0)
    # convert ['male', 'female'] values of Sex to [1, 0]
    testdata['Sex'] = testdata['Sex'].apply(lambda s: 1 if s == 'male' else 0)
    X_test = testdata[['Sex', 'Age', 'Pclass', 'SibSp', 'Parch', 'Fare']]
    predictions = np.argmax(sess.run(y_pred, feed_dict={X: X_test}), 1)
    submission = pd.DataFrame({
        "PassengerId": testdata["PassengerId"],
        "Survived": predictions
    })

    submission.to_csv("titanic-submission.csv", index=False)
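Running Train.py typically leaves several files under wjy_data/: a checkpoint index file plus model.ckpt.index, model.ckpt.meta, and model.ckpt.data-00000-of-00001. As a quick sanity check, the checkpoint can be inspected without rebuilding any graph; a small sketch (assuming TensorFlow 1.x and the path used above):

import tensorflow as tf

# open the checkpoint written by Train.py and list the variables it stores
reader = tf.train.NewCheckpointReader('wjy_data/model.ckpt')
for name, shape in reader.get_variable_to_shape_map().items():
    print(name, shape)                 # expect 'weights' [6, 2] and 'bias' [2]

print(reader.get_tensor('weights'))    # the trained weight matrix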
Note:
saver_path = saver.save(sess, "wjy_data/model.ckpt")
The wjy_data folder must already exist under the project directory, otherwise this call will raise an error!
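If you would rather not create the folder by hand, the script can create it right before saving. A drop-in replacement for the saver.save(...) line in Train.py (standard library only; sess and saver as defined above):

import os

ckpt_dir = "wjy_data"
if not os.path.isdir(ckpt_dir):
    os.makedirs(ckpt_dir)              # create the folder if it is missing
saver_path = saver.save(sess, os.path.join(ckpt_dir, "model.ckpt"))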
Test.py
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split

# create symbolic variables
X = tf.placeholder(tf.float32, shape=[None, 6])
y = tf.placeholder(tf.float32, shape=[None, 2])

# weights and bias are the variables to be trained
weights = tf.Variable(tf.random_normal([6, 2]), name='weights')
bias = tf.Variable(tf.zeros([2]), name='bias')
y_pred = tf.nn.softmax(tf.matmul(X, weights) + bias)

# predict on test data
testdata = pd.read_csv('data/test.csv')
testdata = testdata.fillna(0)
# convert ['male', 'female'] values of Sex to [1, 0]
testdata['Sex'] = testdata['Sex'].apply(lambda s: 1 if s == 'male' else 0)
X_test = testdata[['Sex', 'Age', 'Pclass', 'SibSp', 'Parch', 'Fare']]

################################
# Training and Evaluating the model
################################
saver = tf.train.Saver()
# use session to run the calculation
with tf.Session() as sess:
    # variables have to be initialized at the first place
    tf.global_variables_initializer().run()
    # save_path = saver.save(sess, "Saved_model/model.ckpt")
    saver.restore(sess, "wjy_data/model.ckpt")  # load the saved model
    predictions = np.argmax(sess.run(y_pred, feed_dict={X: X_test}), 1)
    submission = pd.DataFrame({
        "PassengerId": testdata["PassengerId"],
        "Survived": predictions
    })
    # saver = tf.train.Saver()
    submission.to_csv("titanic-submission.csv", index=False)
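If you do not want to hard-code the model.ckpt prefix in Test.py, tf.train.latest_checkpoint can look up the most recent checkpoint recorded in the checkpoint file of a directory. A sketch of the restore step only (assuming the same graph, saver, and sess as in Test.py above):

ckpt = tf.train.latest_checkpoint("wjy_data")   # e.g. 'wjy_data/model.ckpt', or None
if ckpt is None:
    raise IOError("no checkpoint found under wjy_data/")
saver.restore(sess, ckpt)                       # load weights and bias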
Using a saved model this way makes it very convenient to split training and testing; otherwise, what else could you do~~
References:
《深度学习原理与TensorFlow实战》
https://blog.csdn.net/lujiandong1/article/details/53301994