I. Introduction
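A perceptron can represent AND, OR, and NOT.
This can be understood through the linear programming problems taught in junior high school: the decision boundary of a single perceptron is a straight line that splits the plane into two half-planes.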
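As a minimal sketch of the statement above, the three gates can each be written as one threshold unit. The helper name `step_perceptron`, the gate function names, and the specific weights and thresholds are illustrative assumptions; many other values work equally well.

```python
def step_perceptron(weights, threshold, inputs):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    strength = sum(w * x for w, x in zip(weights, inputs))
    return int(strength >= threshold)


# Hypothetical weight/threshold choices, picked only for illustration.
def and_gate(a, b):
    return step_perceptron([1, 1], 2, [a, b])   # fires only when both inputs are 1


def or_gate(a, b):
    return step_perceptron([1, 1], 1, [a, b])   # fires when at least one input is 1


def not_gate(a):
    return step_perceptron([-1], 0, [a])        # fires only when the input is 0


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "AND:", and_gate(a, b), "OR:", or_gate(a, b))
    print("NOT 0:", not_gate(0), "NOT 1:", not_gate(1))
```

Each gate corresponds to one line in the input plane: the points on or above the line are mapped to 1, the rest to 0.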
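The milestone significance of XOR
To understand a subject thoroughly, study its history first!
Reportedly, in the early days of artificial neural networks (ANNs), the inability to train multi-layer networks (including networks that realize XOR logic) led to an ANN crisis. Only when the backpropagation (BP) algorithm appeared did it become possible to train multi-layer networks with hidden layers. Implementing XOR is therefore a milestone in the history of ANNs. XOR matters because, unlike other logical relations such as AND and OR, it is not linearly separable, as shown in the figure below: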
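Solving problems that are not linearly separable requires multiple layers of functional neurons. For example, the simple two-layer perceptron in the figure below can solve the XOR problem. In the figure, the layer of neurons between the output layer and the input layer is called the hidden layer. The neurons of both the hidden layer and the output layer are functional neurons with activation functions.
A two-layer perceptron that can solve the XOR problem
Reference: Zhou Zhihua's "watermelon book" (Machine Learning)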
II. Python Implementation
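The two classes of XOR cannot be separated by a single straight line, so a single-layer network cannot implement XOR, but a two-layer network (with one hidden layer) can.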
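The impossibility can be checked with a short derivation. It assumes the same threshold convention as the code in this section (output 1 when the weighted sum is at least the threshold); the symbols w_1, w_2 and \theta are introduced here only for illustration. A single perceptron computing XOR would have to satisfy all four rows of the truth table:

```latex
\begin{aligned}
(0,0)\mapsto 0 &: \quad 0 < \theta \\
(1,0)\mapsto 1 &: \quad w_1 \ge \theta \\
(0,1)\mapsto 1 &: \quad w_2 \ge \theta \\
(1,1)\mapsto 0 &: \quad w_1 + w_2 < \theta \\
\text{rows 2 and 3} &\Rightarrow \quad w_1 + w_2 \ge 2\theta > \theta
  \quad (\text{since } \theta > 0 \text{ by row 1})
\end{aligned}
```

The last line contradicts row 4, so no choice of w_1, w_2 and \theta works for a single threshold unit.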
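In practice, the XOR gate (exclusive-OR gate) is the logic gate that implements exclusive or in digital logic. This function performs addition modulo 2, so XOR gates can be used to build binary addition in computers: XOR produces the sum bit, while the carry bit needs an additional AND.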
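The link between XOR and binary addition can be sketched in a few lines. The function names `half_adder` and `add_binary` and the default `width` of 8 bits are assumptions made for this illustration. The sketch uses Python's bitwise operators rather than perceptrons.

```python
def half_adder(a, b):
    """Add two single bits and return (sum_bit, carry_bit)."""
    return a ^ b, a & b   # XOR is addition modulo 2; AND detects the carry


def add_binary(x, y, width=8):
    """Add two non-negative integers bit by bit using only XOR, AND and OR."""
    carry = 0
    result = 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        s1, c1 = half_adder(a, b)       # add the two input bits
        s, c2 = half_adder(s1, carry)   # then add the incoming carry
        carry = c1 | c2                 # at most one of c1, c2 can be 1
        result |= s << i
    return result                       # any carry beyond `width` bits is discarded


if __name__ == "__main__":
    print(add_binary(5, 3))
```

Chaining two half adders per bit position, as done here, is the usual full-adder construction.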
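XOR can be implemented in several ways. The scheme used in this code is illustrated below.
Redrawing the figure above as neural network layers makes it easier to understand: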
# ----------
#
# In this exercise, you will create a network of perceptrons that can represent
# the XOR function, using a network structure like those shown in the previous
# quizzes.
#
# You will need to do two things:
# First, create a network of perceptrons with the correct weights
# Second, define a procedure EvalNetwork() which takes in a list of inputs and
# outputs the value of this network.
#
# ----------
import numpy as np


class Perceptron:
    """
    This class models an artificial neuron with a step activation function.
    """

    def __init__(self, weights=np.array([1]), threshold=0):
        """
        Initialize weights and threshold based on input arguments. Note that no
        type-checking is being performed here for simplicity.
        """
        self.weights = weights
        self.threshold = threshold

    def activate(self, values):
        """
        Takes in @param values, a list of numbers equal in length to weights.
        @return the output of a threshold perceptron with given inputs based on
        perceptron weights and threshold.
        """
        # First calculate the strength with which the perceptron fires
        strength = np.dot(values, self.weights)
        # Then return 0 or 1 depending on strength compared to threshold
        return int(strength >= self.threshold)  # this line was changed by me


# Part 1: Set up the perceptron network
Network = [
    # input layer, declare input layer perceptrons here
    # (for binary inputs these compute x1, x1 AND x2, and x2)
    [Perceptron([1, 0], 1), Perceptron([1, 1], 2), Perceptron([0, 1], 1)],
    # output node, declare output layer perceptron here
    [Perceptron([1, -2, 1], 1)],
]


# Part 2: Define a procedure to compute the output of the network, given inputs
def EvalNetwork(inputValues, Network):
    """
    Takes in @param inputValues, a list of input values, and @param Network
    that specifies a perceptron network. @return the output of the Network for
    the given set of inputs.
    """
    # MY MAIN CODE HERE
    # Be sure your output value is a single number

    # Method 1:
    # p is an instance of Perceptron.
    # The inner list comprehension evaluates the first layer, Network[0].
    # Network[1][0] --> Perceptron([1, -2, 1], 1), the only element of that layer.
    return Network[1][0].activate([p.activate(inputValues) for p in Network[0]])

    # Method 2:
    # OutputValue = inputValues
    # for layer in Network:
    #     OutputValue = map(lambda p: p.activate(OutputValue), layer)
    # return OutputValue
    ## Warning: in Python 2 this returns a list, not a single number;
    ## in Python 3, map() returns a lazy iterator instead of a list.
    ## TODO: review the relevant Python syntax.


def test():
    """
    A few tests to make sure that the perceptron class performs as expected.
    """
    print("0 XOR 0 = 0?:", EvalNetwork(np.array([0, 0]), Network))
    print("0 XOR 1 = 1?:", EvalNetwork(np.array([0, 1]), Network))
    print("1 XOR 0 = 1?:", EvalNetwork(np.array([1, 0]), Network))
    print("1 XOR 1 = 0?:", EvalNetwork(np.array([1, 1]), Network))


if __name__ == "__main__":
    test()

OUTPUT:
Running test()...
0 XOR 0 = 0?: 0
0 XOR 1 = 1?: 1
1 XOR 0 = 1?: 1
1 XOR 1 = 0?: 0
All done!
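The idea behind Method 2 in the listing, feeding each layer's outputs into the next, can be made to return a single number. The following is a sketch under assumptions: `eval_layers` is a hypothetical name, and the `Perceptron` class is re-declared without NumPy only so that the block runs on its own. The weights are the ones from the listing.

```python
class Perceptron:
    """Threshold unit, re-declared here so the example is self-contained."""

    def __init__(self, weights, threshold):
        self.weights = weights
        self.threshold = threshold

    def activate(self, values):
        strength = sum(w * v for w, v in zip(self.weights, values))
        return int(strength >= self.threshold)


def eval_layers(input_values, network):
    """Propagate the inputs through every layer and return the single output."""
    values = list(input_values)
    for layer in network:
        # A list comprehension is evaluated immediately, unlike Python 3's map().
        values = [p.activate(values) for p in layer]
    return values[0]  # assumes the last layer holds exactly one perceptron


# Same weights and thresholds as the Network in the listing.
network = [
    [Perceptron([1, 0], 1), Perceptron([1, 1], 2), Perceptron([0, 1], 1)],
    [Perceptron([1, -2, 1], 1)],
]

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, "XOR", b, "=", eval_layers([a, b], network))
```

Unlike Method 1, this loop does not hard-code the number of layers, so it should also work for deeper networks of the same kind.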