T4 - Classification learning

Classification learning

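To quote Morvan (莫烦):

Put simply, a quantitative output is regression, that is, predicting a continuous variable; a qualitative output is classification, that is, predicting a discrete variable.

There are ten digits, 0 through 9, so treated as a classification problem there are ten classes.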

Data preparation

We use the handwritten digits provided by MNIST, which look roughly like this:
[Image: MNIST]

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

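If the MNIST_data folder does not contain the data yet, it is downloaded automatically; if it is already there, it is loaded directly. The data set contains 55,000 training images, each 28×28 pixels. Each input x is therefore a flattened vector of 28×28 = 784 values.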

xs = tf.placeholder(tf.float32, [None, 28*28])
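To make the 28×28 = 784 arithmetic concrete, here is a small sketch in plain Python (no TensorFlow needed). It assumes the image is flattened row by row; the `flatten` helper and the fake image are made up for illustration and are not part of the MNIST loader.

```python
# Illustrative only: a fake 28x28 "image" whose pixel at (row, col) holds row * 28 + col.
HEIGHT, WIDTH = 28, 28


def flatten(image):
    """Concatenate the rows of a 2-D list into one flat list (row-major order)."""
    return [pixel for row in image for pixel in row]


image = [[row * WIDTH + col for col in range(WIDTH)] for row in range(HEIGHT)]
flat = flatten(image)

print(len(flat))                            # 784
print(flat[3 * WIDTH + 5] == image[3][5])   # True: pixel (3, 5) sits at index 3*28+5
```

The `None` in the placeholder shape plays the role of "any number of such 784-value rows", so a batch is simply a list of these flat vectors.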

Each image represents a single digit, so the output is one of the digits 0 to 9, 10 classes in total.

ys = tf.placeholder(tf.float32, [None, 10])
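Because the data is loaded with one_hot=True, each label is a vector of length 10 with a single 1 at the position of the digit. The sketch below shows that encoding in plain Python; `one_hot` and `decode` are hypothetical helper names used only for this illustration.

```python
NUM_CLASSES = 10  # digits 0 through 9


def one_hot(digit, num_classes=NUM_CLASSES):
    """Return a list of zeros with a single 1.0 at index `digit`."""
    if not 0 <= digit < num_classes:
        raise ValueError("digit must be in [0, %d)" % num_classes)
    vec = [0.0] * num_classes
    vec[digit] = 1.0
    return vec


def decode(vec):
    """Recover the digit: the index of the largest entry (an argmax)."""
    return max(range(len(vec)), key=vec.__getitem__)


print(one_hot(3))          # [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(decode(one_hot(7)))  # 7
```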

We call the add_layer() function from the previous section to build the simplest possible network: just an input layer and an output layer.

prediction = add_layer(xs, 784, 10, activation_function=tf.nn.softmax)
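What that single layer computes can be sketched without TensorFlow: a weighted sum plus bias for each class, followed by softmax so the outputs are positive and sum to 1. The tiny 2-input, 3-class weights below are invented for illustration; the real layer has 784 inputs and 10 outputs. Subtracting the maximum before exponentiating is a common stability trick, not something the original code does explicitly.

```python
import math


def dense(x, weights, biases):
    """Compute x . W + b for one input vector (weights[i][j]: input i -> output j)."""
    return [
        sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
        for j in range(len(biases))
    ]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


x = [1.0, 2.0]                                  # toy input with 2 features
weights = [[0.5, -0.5, 0.0], [0.25, 0.25, 0.0]]  # toy 2x3 weight matrix
biases = [0.1, 0.1, 0.1]

probs = softmax(dense(x, weights, biases))
print(probs)       # three probabilities, the first one is the largest
print(sum(probs))  # very close to 1.0
```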

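Loss function: we use cross-entropy. Cross-entropy measures how close the prediction is to the true value; with one-hot labels, it equals zero when the prediction matches the label exactly.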

cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction), reduction_indices=[1]))
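A worked example may help here. The sketch below evaluates the same formula, the negative sum of label times log(prediction), averaged over the batch, in plain Python. The small `EPS` clamp is an assumption added for this sketch so that log(0) never occurs; the TensorFlow line above does not clamp.

```python
import math

EPS = 1e-12  # assumed lower bound on probabilities, only to keep log() defined


def cross_entropy(label, prediction):
    """-sum(y * log(p)) for one sample; label is one-hot, prediction sums to 1."""
    return -sum(y * math.log(max(p, EPS)) for y, p in zip(label, prediction))


def mean_cross_entropy(labels, predictions):
    """Average the per-sample loss over a batch (the reduce_mean step)."""
    losses = [cross_entropy(y, p) for y, p in zip(labels, predictions)]
    return sum(losses) / len(losses)


label = [0.0, 0.0, 1.0]    # true class is index 2
perfect = [0.0, 0.0, 1.0]  # prediction identical to the label
unsure = [0.2, 0.3, 0.5]   # right class, low confidence
wrong = [0.7, 0.2, 0.1]    # most of the mass on the wrong class

print(abs(cross_entropy(label, perfect)))  # 0.0
print(cross_entropy(label, unsure))        # about 0.693, i.e. -log(0.5)
print(cross_entropy(label, wrong))         # about 2.303, i.e. -log(0.1)
```

The loss grows as the probability assigned to the true class shrinks, which is what gradient descent then pushes against.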

Training: each step uses a batch of only 100 images, so that we never have to process the whole data set at once.

batch_xs, batch_ys = mnist.train.next_batch(100)
# train_step: the training op; the optimization algorithm is gradient descent
sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys})
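To show the idea behind taking 100 images at a time, here is a rough sketch of a batch sampler in plain Python. `BatchSampler` is a hypothetical class written for this post; it is not how mnist.train.next_batch is implemented (for one thing, this sketch simply reshuffles and drops the leftover samples at the end of each pass).

```python
import random


class BatchSampler:
    """Hand out shuffled (samples, labels) mini-batches, reshuffling each pass."""

    def __init__(self, samples, labels, seed=0):
        self._data = list(zip(samples, labels))  # keep each sample with its label
        self._rng = random.Random(seed)
        self._rng.shuffle(self._data)
        self._cursor = 0

    def next_batch(self, batch_size):
        if batch_size > len(self._data):
            raise ValueError("batch_size is larger than the data set")
        # Not enough samples left in this pass: reshuffle and start over.
        if self._cursor + batch_size > len(self._data):
            self._rng.shuffle(self._data)
            self._cursor = 0
        chunk = self._data[self._cursor:self._cursor + batch_size]
        self._cursor += batch_size
        xs, ys = zip(*chunk)
        return list(xs), list(ys)


# Toy data: sample i has label 2 * i, so alignment is easy to check.
sampler = BatchSampler(list(range(10)), [2 * i for i in range(10)])
batch_xs, batch_ys = sampler.next_batch(4)
print(len(batch_xs), len(batch_ys))  # 4 4
```

Each training step then only touches one such batch, which keeps memory use and per-step cost small.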

Complete code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# 1. Load the MNIST image data (downloaded on first use)
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

# Helper that adds one fully connected layer
def add_layer(inputs, in_size, out_size, activation_function=None):
	Weight = tf.Variable(tf.random_normal([in_size, out_size]))
	biases = tf.Variable(tf.zeros([1, out_size]) + 0.1) # a small non-zero bias works better
	Wx_plus_b = tf.matmul(inputs, Weight) + biases
	# apply the activation function only if one was given
	if activation_function is None:
		outputs = Wx_plus_b
	else:
		outputs = activation_function(Wx_plus_b)
	return outputs

# 2. Define placeholders for the inputs to the network
xs = tf.placeholder(tf.float32, [None, 28*28]) # input (each image is 784 values)
ys = tf.placeholder(tf.float32, [None, 10])    # output (one-hot label)

# 3. Add output layer
prediction = add_layer(xs, 784, 10, activation_function=tf.nn.softmax)

# 4. The error between prediction and real data
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys*tf.log(prediction),
				reduction_indices=[1]))  # loss (cross-entropy)
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

sess = tf.Session()
sess.run(tf.global_variables_initializer())

# Compute the accuracy
def compute_accuracy(v_xs, v_ys):
	global prediction # not strictly required: prediction is only read here
	# generate the predictions
	y_pre = sess.run(prediction, feed_dict={xs: v_xs})
	# compare the predicted class with the true class
	# (note: these ops are re-created on every call, so the graph grows a little each time)
	correct_prediction = tf.equal(tf.argmax(y_pre, 1), tf.argmax(v_ys, 1))
	# fraction of correct predictions
	accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
	# evaluate it (a value between 0 and 1, not a percentage)
	result = sess.run(accuracy, feed_dict={xs: v_xs, ys: v_ys})
	return result

# Train for 1000 steps
for i in range(1000):
	batch_xs, batch_ys = mnist.train.next_batch(100) # take only 100 images per step
	sess.run(train_step, feed_dict={xs: batch_xs, ys: batch_ys})
	if 0 == i % 50:
		# print the accuracy on the test set
		print(compute_accuracy(mnist.test.images, mnist.test.labels))

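The printed accuracy climbs from roughly 0.2 to about 0.8 or 0.9 as training proceeds. I am writing this on Windows without TensorFlow installed, so no actual output is shown here.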
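For readers who also cannot run TensorFlow, the accuracy calculation itself is easy to reproduce by hand. The sketch below mirrors the argmax, equal and mean steps of compute_accuracy in plain Python; the four predictions and labels are invented toy values with 3 classes instead of 10.

```python
def argmax(values):
    """Index of the largest entry, i.e. the predicted (or true) class."""
    return max(range(len(values)), key=values.__getitem__)


def accuracy(predictions, labels):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(argmax(p) == argmax(y) for p, y in zip(predictions, labels))
    return correct / len(labels)


predictions = [
    [0.1, 0.8, 0.1],  # predicts class 1
    [0.6, 0.3, 0.1],  # predicts class 0
    [0.2, 0.2, 0.6],  # predicts class 2
    [0.5, 0.4, 0.1],  # predicts class 0 (wrong, the label is class 1)
]
labels = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
]

print(accuracy(predictions, labels))  # 0.75
```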

Original post: https://www.cnblogs.com/TaylorBoy/p/6793320.html