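• tflearn and Keras official GAN demo code. In essence, a GAN first trains a discriminator to recognize samples generated from noise, then the generator produces data from noise with the goal of making the discriminator err. Training a GAN ultimately means learning the generator's parameters.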

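    In ordinary supervised classification, samples are converted to features and the model is told which samples are black (malicious) and which are white (benign). After training, the model understands the difference between the two, and when given a test sample it judges black or white from past experience. A GAN works differently. It has two mutually opposing models, a Generator and a Discriminator. The Generator produces samples from random noise. The Discriminator is trained with supervision, with real samples labeled Real and Generator output labeled Fake. The Generator then produces new samples, labels them Real in an attempt to fool the Discriminator, and receives the Discriminator's verdict as feedback. With that feedback the Generator produces more realistic samples, until it can fool the Discriminator completely. (In practice this means the generated data ends up nearly indistinguishable from real data; the only difference is that one is produced by a program and the other comes from the real world. For images, this implies a GAN could be used to forge photos. For DNS tunnel or DGA detection, a GAN could generate DGA domains directly and slip past a detection model, because they look almost the same as benign samples.)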

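    A GAN can generate realistic samples largely because of the adversarial model. The adversarial model is simply the Generator and Discriminator stacked together: the output of the former is the input of the latter, and it is trained with deliberately deceptive labels in an attempt to fool the Discriminator.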

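    DCGAN training has two steps. Step one: draw noise of shape (BATCH_SIZE, 100) from a uniform distribution on [-1, 1], use the Generator to produce image samples, merge them with the same number of real MNIST images, label the generated images 0 and the real images 1, and train the Discriminator. During this step the Discriminator's trainable flag is True, so training updates its parameters.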

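    Step two: draw noise of shape (BATCH_SIZE, 100) from a uniform distribution on [-1, 1], use the Generator to produce image samples, label them 1 to fool the Discriminator, and train the adversarial model on them.

    During this step the Discriminator's trainable flag is False, so training does not update its parameters. Once the step is done, the Discriminator's trainable flag is set back to True.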
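
    The labeling used in the two steps above can be sketched without any deep learning framework. This is a minimal illustration using only the standard library; the helper names (`sample_noise`, `discriminator_batch`, `generator_targets`) and the tiny batch size are assumptions made for the example, and the image lists are stand-ins for real Generator output and MNIST data.

```python
import random

BATCH_SIZE = 4      # illustrative only; real runs use larger batches
NOISE_DIM = 100     # matches the (BATCH_SIZE, 100) noise described above


def sample_noise(batch_size, noise_dim, rng):
    """Draw a batch of noise vectors, uniform on [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(noise_dim)]
            for _ in range(batch_size)]


def discriminator_batch(real_images, fake_images):
    """Step 1: merge generated and real samples, labeled 0 and 1."""
    samples = fake_images + real_images
    labels = [0] * len(fake_images) + [1] * len(real_images)
    return samples, labels


def generator_targets(batch_size):
    """Step 2: every generated sample is labeled 1 to fool the Discriminator."""
    return [1] * batch_size


rng = random.Random(0)
noise = sample_noise(BATCH_SIZE, NOISE_DIM, rng)

# Hypothetical stand-ins for Generator output and real MNIST images.
fake_images = [[0.0] * 784 for _ in range(BATCH_SIZE)]
real_images = [[1.0] * 784 for _ in range(BATCH_SIZE)]

samples, labels = discriminator_batch(real_images, fake_images)
trick_labels = generator_targets(BATCH_SIZE)
```

    In a real training loop, `samples` and `labels` would feed the Discriminator update (trainable), and `noise` with `trick_labels` would feed the stacked adversarial model (Discriminator frozen).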

     tflearn GAN:

    # -*- coding: utf-8 -*-
    """ GAN Example
    Use a generative adversarial network (GAN) to generate digit images from a
    noise distribution.
        - Generative adversarial nets. I Goodfellow, J Pouget-Abadie, M Mirza,
        B Xu, D Warde-Farley, S Ozair, Y. Bengio. Advances in neural information
        processing systems, 2672-2680.
        - [GAN Paper](https://arxiv.org/pdf/1406.2661.pdf).
    """
    from __future__ import division, print_function, absolute_import
    import matplotlib.pyplot as plt
    import numpy as np
    import tensorflow as tf
    import tflearn
    # Data loading and preprocessing
    import tflearn.datasets.mnist as mnist
    X, Y, testX, testY = mnist.load_data()
    image_dim = 784 # 28*28 pixels
    z_dim = 200 # Noise data points
    total_samples = len(X)
    # Generator
    def generator(x, reuse=False):
        with tf.variable_scope('Generator', reuse=reuse):
            x = tflearn.fully_connected(x, 256, activation='relu')
            x = tflearn.fully_connected(x, image_dim, activation='sigmoid')
            return x
    # Discriminator
    def discriminator(x, reuse=False):
        with tf.variable_scope('Discriminator', reuse=reuse):
            x = tflearn.fully_connected(x, 256, activation='relu')
            x = tflearn.fully_connected(x, 1, activation='sigmoid')
            return x
    # Build Networks
    gen_input = tflearn.input_data(shape=[None, z_dim], name='input_noise')
    disc_input = tflearn.input_data(shape=[None, 784], name='disc_input')
    gen_sample = generator(gen_input)
    disc_real = discriminator(disc_input)
    disc_fake = discriminator(gen_sample, reuse=True)
    # Define Loss
    disc_loss = -tf.reduce_mean(tf.log(disc_real) + tf.log(1. - disc_fake))
    gen_loss = -tf.reduce_mean(tf.log(disc_fake))
    # Build Training Ops for both Generator and Discriminator.
    # Each network optimization should only update its own variable, thus we need
    # to retrieve each network variables (with get_layer_variables_by_scope) and set
    # 'placeholder=None' because we do not need to feed any target.
    gen_vars = tflearn.get_layer_variables_by_scope('Generator')
    gen_model = tflearn.regression(gen_sample, placeholder=None, optimizer='adam',
                                   loss=gen_loss, trainable_vars=gen_vars,
                                   batch_size=64, name='target_gen', op_name='GEN')
    disc_vars = tflearn.get_layer_variables_by_scope('Discriminator')
    disc_model = tflearn.regression(disc_real, placeholder=None, optimizer='adam',
                                    loss=disc_loss, trainable_vars=disc_vars,
                                    batch_size=64, name='target_disc', op_name='DISC')
    # Define GAN model, that output the generated images.
    gan = tflearn.DNN(gen_model)
    # Training
    # Generate noise to feed to the generator
    z = np.random.uniform(-1., 1., size=[total_samples, z_dim])
    # Start training, feed both noise and real images.
    gan.fit(X_inputs={gen_input: z, disc_input: X},
            Y_targets=None,
            n_epoch=100)
    # Generate images from noise, using the generator network.
    f, a = plt.subplots(2, 10, figsize=(10, 4))
    for i in range(10):
        for j in range(2):
            # Noise input.
            z = np.random.uniform(-1., 1., size=[1, z_dim])
            # Generate image from noise. Extend to 3 channels for matplot figure.
            temp = [[ii, ii, ii] for ii in list(gan.predict([z])[0])]
            a[j][i].imshow(np.reshape(temp, (28, 28, 3)))
    # Display the grid of generated digits.
    f.show()
    plt.draw()
    plt.waitforbuttonpress()

     tflearn DCGAN:

    # -*- coding: utf-8 -*-
    """ DCGAN Example
    Use a deep convolutional generative adversarial network (DCGAN) to generate
    digit images from a noise distribution.
        - Unsupervised representation learning with deep convolutional generative
        adversarial networks. A Radford, L Metz, S Chintala. arXiv:1511.06434.
        - [DCGAN Paper](https://arxiv.org/abs/1511.06434).
    """
    from __future__ import division, print_function, absolute_import
    import matplotlib.pyplot as plt
    import numpy as np
    import tensorflow as tf
    import tflearn
    # Data loading and preprocessing
    import tflearn.datasets.mnist as mnist
    X, Y, testX, testY = mnist.load_data()
    X = np.reshape(X, newshape=[-1, 28, 28, 1])
    z_dim = 200 # Noise data points
    total_samples = len(X)
    # Generator
    def generator(x, reuse=False):
        with tf.variable_scope('Generator', reuse=reuse):
            x = tflearn.fully_connected(x, n_units=7 * 7 * 128)
            x = tflearn.batch_normalization(x)
            x = tf.nn.tanh(x)
            x = tf.reshape(x, shape=[-1, 7, 7, 128])
            x = tflearn.upsample_2d(x, 2)
            x = tflearn.conv_2d(x, 64, 5, activation='tanh')
            x = tflearn.upsample_2d(x, 2)
            x = tflearn.conv_2d(x, 1, 5, activation='sigmoid')
            return x
    # Discriminator
    def discriminator(x, reuse=False):
        with tf.variable_scope('Discriminator', reuse=reuse):
            x = tflearn.conv_2d(x, 64, 5, activation='tanh')
            x = tflearn.avg_pool_2d(x, 2)
            x = tflearn.conv_2d(x, 128, 5, activation='tanh')
            x = tflearn.avg_pool_2d(x, 2)
            x = tflearn.fully_connected(x, 1024, activation='tanh')
            x = tflearn.fully_connected(x, 2)
            x = tf.nn.softmax(x)
            return x
    # Input Data
    gen_input = tflearn.input_data(shape=[None, z_dim], name='input_gen_noise')
    input_disc_noise = tflearn.input_data(shape=[None, z_dim], name='input_disc_noise')
    input_disc_real = tflearn.input_data(shape=[None, 28, 28, 1], name='input_disc_real')
    # Build Discriminator
    disc_fake = discriminator(generator(input_disc_noise))
    disc_real = discriminator(input_disc_real, reuse=True)
    disc_net = tf.concat([disc_fake, disc_real], axis=0)
    # Build Stacked Generator/Discriminator
    gen_net = generator(gen_input, reuse=True)
    stacked_gan_net = discriminator(gen_net, reuse=True)
    # Build Training Ops for both Generator and Discriminator.
    # Each network optimization should only update its own variable, thus we need
    # to retrieve each network variables (with get_layer_variables_by_scope).
    disc_vars = tflearn.get_layer_variables_by_scope('Discriminator')
    # We need 2 target placeholders, for both the real and fake image target.
    disc_target = tflearn.multi_target_data(['target_disc_fake', 'target_disc_real'],
                                            shape=[None, 2])
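    disc_model = tflearn.regression(disc_net, optimizer='adam',
                                    placeholder=disc_target,
                                    loss='categorical_crossentropy',
                                    trainable_vars=disc_vars,
                                    batch_size=64, name='target_disc',
                                    op_name='DISC')
    gen_vars = tflearn.get_layer_variables_by_scope('Generator')
    gan_model = tflearn.regression(stacked_gan_net, optimizer='adam',
                                   loss='categorical_crossentropy',
                                   trainable_vars=gen_vars,
                                   batch_size=64, name='target_gen',
                                   op_name='GEN')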
    # Define GAN model, that output the generated images.
    gan = tflearn.DNN(gan_model)
    # Training
    # Prepare input data to feed to the discriminator
    disc_noise = np.random.uniform(-1., 1., size=[total_samples, z_dim])
    # Prepare target data to feed to the discriminator (0: fake image, 1: real image)
    y_disc_fake = np.zeros(shape=[total_samples])
    y_disc_real = np.ones(shape=[total_samples])
    y_disc_fake = tflearn.data_utils.to_categorical(y_disc_fake, 2)
    y_disc_real = tflearn.data_utils.to_categorical(y_disc_real, 2)
    # Prepare input data to feed to the stacked generator/discriminator
    gen_noise = np.random.uniform(-1., 1., size=[total_samples, z_dim])
    # Prepare target data to feed to the discriminator
    # Generator tries to fool the discriminator, thus target is 1 (e.g. real images)
    y_gen = np.ones(shape=[total_samples])
    y_gen = tflearn.data_utils.to_categorical(y_gen, 2)
    # Start training, feed both noise and real images.
    gan.fit(X_inputs={'input_gen_noise': gen_noise,
                      'input_disc_noise': disc_noise,
                      'input_disc_real': X},
            Y_targets={'target_gen': y_gen,
                       'target_disc_fake': y_disc_fake,
                       'target_disc_real': y_disc_real},
            n_epoch=10)
    # Create another model from the generator graph to generate some samples
    # for testing (re-using same session to re-use the weights learnt).
    gen = tflearn.DNN(gen_net, session=gan.session)
    f, a = plt.subplots(4, 10, figsize=(10, 4))
    for i in range(10):
        # Noise input.
        z = np.random.uniform(-1., 1., size=[4, z_dim])
        g = np.array(gen.predict({'input_gen_noise': z}))
        for j in range(4):
            # Generate image from noise. Extend to 3 channels for matplot figure.
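            img = np.reshape(np.repeat(g[j][:, :, np.newaxis], 3, axis=2),
                             newshape=(28, 28, 3))
            a[j][i].imshow(img)
    # Display the grid of generated digits.
    f.show()
    plt.draw()
    plt.waitforbuttonpress()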

     keras ACGAN:

    # -*- coding: utf-8 -*-
    """
    Train an Auxiliary Classifier Generative Adversarial Network (ACGAN) on the
    MNIST dataset. See https://arxiv.org/abs/1610.09585 for more details.
    You should start to see reasonable images after ~5 epochs, and good images
    by ~15 epochs. You should use a GPU, as the convolution-heavy operations are
    very slow on the CPU. Prefer the TensorFlow backend if you plan on iterating,
    as the compilation time can be a blocker using Theano.
    Hardware           | Backend | Time / Epoch
    -------------------|---------|-------------
     CPU               | TF      | 3 hrs
     Titan X (maxwell) | TF      | 4 min
     Titan X (maxwell) | TH      | 7 min
    Consult https://github.com/lukedeo/keras-acgan for more information and
    example output
    """
    from __future__ import print_function
    from collections import defaultdict
    try:
        import cPickle as pickle
    except ImportError:
        import pickle
    from PIL import Image
    from six.moves import range
    from keras.datasets import mnist
    from keras import layers
    from keras.layers import Input, Dense, Reshape, Flatten, Embedding, Dropout
    from keras.layers import BatchNormalization
    from keras.layers.advanced_activations import LeakyReLU
    from keras.layers.convolutional import Conv2DTranspose, Conv2D
    from keras.models import Sequential, Model
    from keras.optimizers import Adam
    from keras.utils.generic_utils import Progbar
    import numpy as np
    num_classes = 10
    def build_generator(latent_size):
        # we will map a pair of (z, L), where z is a latent vector and L is a
        # label drawn from P_c, to image space (..., 28, 28, 1)
        cnn = Sequential()
        cnn.add(Dense(3 * 3 * 384, input_dim=latent_size, activation='relu'))
        cnn.add(Reshape((3, 3, 384)))
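        # upsample to (7, 7, ...)
        cnn.add(Conv2DTranspose(192, 5, strides=1, padding='valid',
                                activation='relu',
                                kernel_initializer='glorot_normal'))
        cnn.add(BatchNormalization())
        # upsample to (14, 14, ...)
        cnn.add(Conv2DTranspose(96, 5, strides=2, padding='same',
                                activation='relu',
                                kernel_initializer='glorot_normal'))
        cnn.add(BatchNormalization())
        # upsample to (28, 28, ...); tanh matches the [-1, 1] image range
        cnn.add(Conv2DTranspose(1, 5, strides=2, padding='same',
                                activation='tanh',
                                kernel_initializer='glorot_normal'))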
        # this is the z space commonly referred to in GAN papers
        latent = Input(shape=(latent_size, ))
        # this will be our label
        image_class = Input(shape=(1,), dtype='int32')
        cls = Flatten()(Embedding(num_classes, latent_size,
                                  embeddings_initializer='glorot_normal')(image_class))
        # hadamard product between z-space and a class conditional embedding
        h = layers.multiply([latent, cls])
        fake_image = cnn(h)
        return Model([latent, image_class], fake_image)
    def build_discriminator():
        # build a relatively standard conv net, with LeakyReLUs as suggested in
        # the reference paper
        cnn = Sequential()
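        cnn.add(Conv2D(32, 3, padding='same', strides=2,
                       input_shape=(28, 28, 1)))
        cnn.add(LeakyReLU(0.2))
        cnn.add(Dropout(0.3))
        cnn.add(Conv2D(64, 3, padding='same', strides=1))
        cnn.add(LeakyReLU(0.2))
        cnn.add(Dropout(0.3))
        cnn.add(Conv2D(128, 3, padding='same', strides=2))
        cnn.add(LeakyReLU(0.2))
        cnn.add(Dropout(0.3))
        cnn.add(Conv2D(256, 3, padding='same', strides=1))
        cnn.add(LeakyReLU(0.2))
        cnn.add(Dropout(0.3))
        # flatten the feature maps so the Dense heads output one row per image
        cnn.add(Flatten())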
        image = Input(shape=(28, 28, 1))
        features = cnn(image)
        # first output (name=generation) is whether or not the discriminator
        # thinks the image that is being shown is fake, and the second output
        # (name=auxiliary) is the class that the discriminator thinks the image
        # belongs to.
        fake = Dense(1, activation='sigmoid', name='generation')(features)
        aux = Dense(num_classes, activation='softmax', name='auxiliary')(features)
        return Model(image, [fake, aux])
    if __name__ == '__main__':
        # batch and latent size taken from the paper
        epochs = 100
        batch_size = 100
        latent_size = 100
        # Adam parameters suggested in https://arxiv.org/abs/1511.06434
        adam_lr = 0.0002
        adam_beta_1 = 0.5
        # build the discriminator
        print('Discriminator model:')
        discriminator = build_discriminator()
        discriminator.compile(
            optimizer=Adam(lr=adam_lr, beta_1=adam_beta_1),
            loss=['binary_crossentropy', 'sparse_categorical_crossentropy']
        )
        discriminator.summary()
        # build the generator
        generator = build_generator(latent_size)
        latent = Input(shape=(latent_size, ))
        image_class = Input(shape=(1,), dtype='int32')
        # get a fake image
        fake = generator([latent, image_class])
        # we only want to be able to train generation for the combined model
        discriminator.trainable = False
        fake, aux = discriminator(fake)
        combined = Model([latent, image_class], [fake, aux])
        print('Combined model:')
        combined.compile(
            optimizer=Adam(lr=adam_lr, beta_1=adam_beta_1),
            loss=['binary_crossentropy', 'sparse_categorical_crossentropy']
        )
        combined.summary()
        # get our mnist data, and force it to be of shape (..., 28, 28, 1) with
        # range [-1, 1]
        (x_train, y_train), (x_test, y_test) = mnist.load_data()
        x_train = (x_train.astype(np.float32) - 127.5) / 127.5
        x_train = np.expand_dims(x_train, axis=-1)
        x_test = (x_test.astype(np.float32) - 127.5) / 127.5
        x_test = np.expand_dims(x_test, axis=-1)
        num_train, num_test = x_train.shape[0], x_test.shape[0]
        train_history = defaultdict(list)
        test_history = defaultdict(list)
        for epoch in range(1, epochs + 1):
            print('Epoch {}/{}'.format(epoch, epochs))
            num_batches = int(x_train.shape[0] / batch_size)
            progress_bar = Progbar(target=num_batches)
            # we don't want the discriminator to also maximize the classification
            # accuracy of the auxiliary classifier on generated images, so we
            # don't train discriminator to produce class labels for generated
            # images (see https://openreview.net/forum?id=rJXTf9Bxg).
            # To preserve sum of sample weights for the auxiliary classifier,
            # we assign sample weight of 2 to the real images.
            disc_sample_weight = [np.ones(2 * batch_size),
                                  np.concatenate((np.ones(batch_size) * 2,
                                                  np.zeros(batch_size)))]
            epoch_gen_loss = []
            epoch_disc_loss = []
            for index in range(num_batches):
                # generate a new batch of noise
                noise = np.random.uniform(-1, 1, (batch_size, latent_size))
                # get a batch of real images
                image_batch = x_train[index * batch_size:(index + 1) * batch_size]
                label_batch = y_train[index * batch_size:(index + 1) * batch_size]
                # sample some labels from p_c
                sampled_labels = np.random.randint(0, num_classes, batch_size)
                # generate a batch of fake images, using the generated labels as a
                # conditioner. We reshape the sampled labels to be
                # (batch_size, 1) so that we can feed them into the embedding
                # layer as a length one sequence
                generated_images = generator.predict(
                    [noise, sampled_labels.reshape((-1, 1))], verbose=0)
                x = np.concatenate((image_batch, generated_images))
                # use one-sided soft real/fake labels
                # Salimans et al., 2016
                # https://arxiv.org/pdf/1606.03498.pdf (Section 3.4)
                soft_zero, soft_one = 0, 0.95
                y = np.array([soft_one] * batch_size + [soft_zero] * batch_size)
                aux_y = np.concatenate((label_batch, sampled_labels), axis=0)
                # see if the discriminator can figure itself out...
                epoch_disc_loss.append(discriminator.train_on_batch(
                    x, [y, aux_y], sample_weight=disc_sample_weight))
                # make new noise. we generate 2 * batch size here such that we have
                # the generator optimize over an identical number of images as the
                # discriminator
                noise = np.random.uniform(-1, 1, (2 * batch_size, latent_size))
                sampled_labels = np.random.randint(0, num_classes, 2 * batch_size)
                # we want to train the generator to trick the discriminator
                # For the generator, we want all the {fake, not-fake} labels to say
                # not-fake
                trick = np.ones(2 * batch_size) * soft_one
                epoch_gen_loss.append(combined.train_on_batch(
                    [noise, sampled_labels.reshape((-1, 1))],
                    [trick, sampled_labels]))
                progress_bar.update(index + 1)
            print('Testing for epoch {}:'.format(epoch))
            # evaluate the testing loss here
            # generate a new batch of noise
            noise = np.random.uniform(-1, 1, (num_test, latent_size))
            # sample some labels from p_c and generate images from them
            sampled_labels = np.random.randint(0, num_classes, num_test)
            generated_images = generator.predict(
                [noise, sampled_labels.reshape((-1, 1))], verbose=False)
            x = np.concatenate((x_test, generated_images))
            y = np.array([1] * num_test + [0] * num_test)
            aux_y = np.concatenate((y_test, sampled_labels), axis=0)
            # see if the discriminator can figure itself out...
            discriminator_test_loss = discriminator.evaluate(
                x, [y, aux_y], verbose=False)
            discriminator_train_loss = np.mean(np.array(epoch_disc_loss), axis=0)
            # make new noise
            noise = np.random.uniform(-1, 1, (2 * num_test, latent_size))
            sampled_labels = np.random.randint(0, num_classes, 2 * num_test)
            trick = np.ones(2 * num_test)
            generator_test_loss = combined.evaluate(
                [noise, sampled_labels.reshape((-1, 1))],
                [trick, sampled_labels], verbose=False)
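            generator_train_loss = np.mean(np.array(epoch_gen_loss), axis=0)
            # generate an epoch report on performance
            train_history['generator'].append(generator_train_loss)
            train_history['discriminator'].append(discriminator_train_loss)
            test_history['generator'].append(generator_test_loss)
            test_history['discriminator'].append(discriminator_test_loss)
            print('{0:<22s} | {1:4s} | {2:15s} | {3:5s}'.format(
                'component', *discriminator.metrics_names))
            print('-' * 65)
            ROW_FMT = '{0:<22s} | {1:<4.2f} | {2:<15.4f} | {3:<5.4f}'
            print(ROW_FMT.format('generator (train)',
                                 *train_history['generator'][-1]))
            print(ROW_FMT.format('generator (test)',
                                 *test_history['generator'][-1]))
            print(ROW_FMT.format('discriminator (train)',
                                 *train_history['discriminator'][-1]))
            print(ROW_FMT.format('discriminator (test)',
                                 *test_history['discriminator'][-1]))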
            # save weights every epoch
            generator.save_weights(
                'params_generator_epoch_{0:03d}.hdf5'.format(epoch), True)
            discriminator.save_weights(
                'params_discriminator_epoch_{0:03d}.hdf5'.format(epoch), True)
            # generate some digits to display
            num_rows = 40
            noise = np.tile(np.random.uniform(-1, 1, (num_rows, latent_size)),
                            (num_classes, 1))
            sampled_labels = np.array([
                [i] * num_rows for i in range(num_classes)
            ]).reshape(-1, 1)
            # get a batch to display
            generated_images = generator.predict(
                [noise, sampled_labels], verbose=0)
            # prepare real images sorted by class label
            real_labels = y_train[(epoch - 1) * num_rows * num_classes:
                                  epoch * num_rows * num_classes]
            indices = np.argsort(real_labels, axis=0)
            real_images = x_train[(epoch - 1) * num_rows * num_classes:
                                  epoch * num_rows * num_classes][indices]
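            # display generated images, white separator, real images
            img = np.concatenate(
                (generated_images,
                 np.repeat(np.ones_like(x_train[:1]), num_rows, axis=0),
                 real_images))
            # arrange them into a grid
            img = (np.concatenate([r.reshape(-1, 28)
                                   for r in np.split(img, 2 * num_classes + 1)
                                   ], axis=-1) * 127.5 + 127.5).astype(np.uint8)
            # write the grid to disk, one image per epoch
            Image.fromarray(img).save(
                'plot_epoch_{0:03d}_generated.png'.format(epoch))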
        with open('acgan-history.pkl', 'wb') as f:
            pickle.dump({'train': train_history, 'test': test_history}, f)


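    GAN (generative adversarial nets) is a generative model and also an unsupervised learning model. Its defining feature is that it gives deep networks an adversarial way of training, which helps with problems that ordinary training handles poorly. Yann LeCun has called GANs the most interesting idea in machine learning in the last ten years, "the coolest thing since sliced bread", and is said to have wished he had invented them himself.


    1. The story behind the birth of GAN:

    The story circulating in academia is that Ian Goodfellow, the creator of GAN, was discussing research with colleagues in a bar while slightly drunk when the initial idea for GAN came to him. His colleagues were not convinced. When he got home his girlfriend was already asleep, so he stayed up writing the code and found that it actually worked. After further research GAN was born, a founding paper for the field. A photo of him is included below.


    Ian Goodfellow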


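    2. How GAN works:

    GAN's main inspiration is the zero-sum game from game theory. Applied to deep neural networks, a generator network G (Generator) and a discriminator network D (Discriminator) play against each other continually, so that G learns the data distribution. For image generation, once training is complete G can produce realistic images from a vector of random numbers. The main roles of G and D are:

    ●  G is a generative network. It receives random noise z (random numbers) and generates an image from it.

    ●  D is a discriminative network that judges whether an image is "real". Its input x is an image, and its output D(x) is the probability that x is a real image: 1 means the image is certainly real, and 0 means it cannot be real.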
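
    The game between G and D described above is usually written as the minimax objective from the original GAN paper, where p_data is the real data distribution and p_z is the noise distribution:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

    D maximizes V by pushing D(x) toward 1 on real images and D(G(z)) toward 0 on generated ones, while G minimizes V by pushing D(G(z)) toward 1.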


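    3. Characteristics of GAN:

    ●  Unlike traditional models, it has two different networks rather than a single one, and it is trained adversarially.

    ●  The gradient updates for G come from the discriminator D, not directly from the data samples.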
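
    The second point can be made concrete with the two losses used in the tflearn GAN listing (`disc_loss` and `gen_loss`). The sketch below re-expresses them in plain Python; the function names are chosen for this example, and the inputs are assumed to be discriminator outputs strictly between 0 and 1. Note that the generator loss takes only D's verdicts on generated samples, never the real data.

```python
import math


def discriminator_loss(d_real, d_fake):
    """-mean(log D(x) + log(1 - D(G(z)))), as in the tflearn listing.

    d_real and d_fake are equal-length lists of probabilities in (0, 1).
    """
    total = sum(math.log(r) + math.log(1.0 - f)
                for r, f in zip(d_real, d_fake))
    return -total / len(d_real)


def generator_loss(d_fake):
    """-mean(log D(G(z))): depends only on D's output for generated samples."""
    return -sum(math.log(f) for f in d_fake) / len(d_fake)


# D is confident here: real samples score high, generated samples score low.
confident_d = discriminator_loss([0.9, 0.8], [0.1, 0.2])
# The generator's loss drops as D is fooled more often.
fooled_less = generator_loss([0.1, 0.2])
fooled_more = generator_loss([0.8, 0.9])
```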

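    4. Advantages of GAN:

    (The following is partly taken from Ian Goodfellow's answers on Quora.)

    ●  GAN is a generative model that, unlike other generative models (Boltzmann machines and GSNs), uses only backpropagation and needs no complicated Markov chains.

    ●  GAN is generally regarded as producing sharper, more realistic samples than other generative models.

    ●  GAN is trained in an unsupervised manner and can be applied widely in unsupervised and semi-supervised learning.

    ●  Compared with variational autoencoders, GANs introduce no deterministic bias. Variational methods do, because they optimize a lower bound on the log-likelihood rather than the likelihood itself, which appears to be why samples from VAEs are blurrier than those from GANs.

    ●  Compared with VAEs, GANs have no variational lower bound. If the discriminator is trained well, the generator can learn the training distribution perfectly. In other words, GANs are asymptotically consistent, whereas VAEs are biased.

    ●  For applications such as image style transfer, super-resolution, image inpainting and denoising, GAN sidesteps the difficulty of designing a loss function: as long as there is a reference, add a discriminator and leave the rest to adversarial training.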
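
    For reference, the "lower bound on the log-likelihood" mentioned in the VAE comparison above is the standard evidence lower bound (ELBO). In the usual notation, with encoder q_phi(z|x), decoder p_theta(x|z) and prior p(z):

```latex
\log p_\theta(x) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p(z)\big)
```

    A VAE maximizes the right-hand side, so any gap between the bound and the true log-likelihood remains as bias; a GAN has no such bound in its objective.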

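    5. Disadvantages of GAN:

    ●  Training a GAN requires reaching a Nash equilibrium. Gradient descent sometimes gets there and sometimes does not. We do not yet have a good method for reaching Nash equilibrium, so training a GAN is unstable compared with a VAE or PixelRNN, though I think in practice it is still much more stable than training a Boltzmann machine.

    ●  GAN is not well suited to discrete data such as text.

    ●  GAN suffers from unstable training, vanishing gradients and mode collapse (largely addressed by later work).

    Causes of mode collapse



    For the vanishing-gradient problem, see Zheng Huabin's article 令人拍案叫绝的Wasserstein GAN ("The Amazing Wasserstein GAN"), which explains it in detail; it is not repeated here.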





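    Why SGD is rarely used as the optimizer for GANs:

    1. SGD tends to oscillate, which easily makes GAN training unstable.

    2. GAN aims to find a Nash equilibrium in a high-dimensional, non-convex parameter space. That equilibrium is a saddle point, but SGD only finds local minima, because SGD solves a minimization problem while GAN is a game.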
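
    A standard toy example illustrates the saddle-point issue. Take f(x, y) = x * y, where one player minimizes over x and the other maximizes over y; the equilibrium is the saddle point (0, 0). The sketch below applies plain simultaneous gradient steps (the function name and step size are chosen for this example) and shows that the iterates move away from the equilibrium instead of converging to it.

```python
import math


def simultaneous_gda(x, y, lr, steps):
    """x descends on f(x, y) = x * y while y ascends, both updated at once."""
    for _ in range(steps):
        grad_x, grad_y = y, x              # df/dx = y, df/dy = x
        x, y = x - lr * grad_x, y + lr * grad_y
    return x, y


start = (1.0, 1.0)
end = simultaneous_gda(*start, lr=0.1, steps=100)

# Distance from the saddle point (0, 0) before and after training.
start_dist = math.hypot(*start)
end_dist = math.hypot(*end)
```

    Each update multiplies the distance from the origin by sqrt(1 + lr^2), so the trajectory spirals outward however small the learning rate is. Real GAN losses are not this simple, but the example shows why a method built for minimization can fail on a game.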


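    Why GAN is not well suited to text data:

    1. Text is discrete compared with image data. A word is usually mapped to a high-dimensional vector and the final prediction is a one-hot vector. If the softmax output is (0.2, 0.3, 0.1, 0.2, 0.15, 0.05), the one-hot vector is (0, 1, 0, 0, 0, 0); if the softmax output is (0.2, 0.25, 0.2, 0.1, 0.15, 0.1), the one-hot vector is still (0, 1, 0, 0, 0, 0). G has produced different outputs but D gives the same verdict, so gradient information is not passed back to G effectively and D's final verdict carries no meaning.

    2. In addition, the GAN loss corresponds to the JS divergence, which is unsuitable for measuring the distance between non-overlapping distributions.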
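
    Both points can be checked numerically. The sketch below, with helper names chosen for this example, first converts the two softmax outputs quoted above to one-hot vectors, then computes the JS divergence between distributions with disjoint support, once with the mass close together and once far apart.

```python
import math


def one_hot_argmax(probs):
    """Return the one-hot vector for the most likely entry."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return [1 if i == best else 0 for i in range(len(probs))]


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# The two softmax outputs from the text collapse to the same one-hot vector.
out_a = one_hot_argmax([0.2, 0.3, 0.1, 0.2, 0.15, 0.05])
out_b = one_hot_argmax([0.2, 0.25, 0.2, 0.1, 0.15, 0.1])

# Disjoint distributions: the mass sits in different bins.
p = [1.0, 0.0, 0.0, 0.0]
q_near = [0.0, 1.0, 0.0, 0.0]
q_far = [0.0, 0.0, 0.0, 1.0]
js_near = js_divergence(p, q_near)
js_far = js_divergence(p, q_far)
```

    `js_near` and `js_far` both equal log 2: once the supports do not overlap, the JS divergence is constant, so it gives the generator no signal about whether it is getting closer.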



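    Some tricks for training GANs:

    1. Normalize inputs to (-1, 1) and use tanh as the activation of the generator's last layer (BEGAN excepted).

    2. Use the Wasserstein GAN loss.

    3. If labeled data is available, use the labels. Some have also reported that flipping labels works well. Label smoothing, either one-sided or two-sided, also helps.

    4. Use mini-batch norm; if batch norm is not used, instance norm or weight norm can be used instead.

    5. Avoid ReLU and pooling layers to reduce the chance of sparse gradients; LeakyReLU can be used as the activation instead.

    6. Prefer the Adam optimizer. Do not set the learning rate too high; 1e-4 is a reasonable starting point, and it can be decayed as training progresses.

    7. Add Gaussian noise to D's layers, which acts as a form of regularization.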
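
    A few of these tricks are simple enough to show directly. The sketch below uses plain Python with helper names chosen for this example: the pixel scaling mirrors the `(x - 127.5) / 127.5` step in the Keras listing, the 0.95 soft label mirrors its `soft_one` value, and the 0.2 slope for LeakyReLU is an assumed default.

```python
def scale_pixels(pixels):
    """Tip 1: map 8-bit pixel values from [0, 255] to [-1, 1]."""
    return [(p - 127.5) / 127.5 for p in pixels]


def one_sided_smooth(labels, soft_one=0.95):
    """Tip 3: soften only the real (1) labels; fake (0) labels stay at 0."""
    return [soft_one if y == 1 else 0.0 for y in labels]


def leaky_relu(x, alpha=0.2):
    """Tip 5: keep a small slope for negative inputs instead of zeroing them."""
    return x if x > 0 else alpha * x


scaled = scale_pixels([0, 127.5, 255])
smoothed = one_sided_smooth([1, 0, 1])
activations = [leaky_relu(v) for v in (-1.0, 2.0)]
```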


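    Since GAN appeared it has been studied extensively, and hundreds of GAN papers have followed. Someone abroad has compiled a "GAN zoo" that interested readers can consult.


    Because there are far too many GAN variants, only a few commonly used ones are briefly covered here, including DCGAN, WGAN, improved WGAN and BEGAN, with GitHub links to detailed code.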


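    1. GAN is itself a generative model, so data generation is its most common use, image generation above all. DCGAN, WGAN and BEGAN are commonly used; in my experience BEGAN gives the best results and is also the simplest.

    2. GAN is also a prime example of unsupervised learning, so it is widely applied in unsupervised and semi-supervised learning. Good papers include:

    Improved Techniques for Training GANs

    Bayesian GAN (most recent)

    Good Semi-supervised Learning

    3. GAN also has a place beyond generation, in classification. Put simply, the discriminator is replaced with a classifier that performs multi-class classification, while the generator still generates and thereby assists the classifier's training.

    4. GAN can be combined with reinforcement learning; a good example so far is SeqGAN.

    5. The most interesting current applications are image style transfer, image denoising and restoration, and image super-resolution, all with good results; see pix2pix GAN and CycleGAN. GAN is still not very good at video generation and prediction.

    6. Some researchers also use GAN for adversarial attacks, specifically training a GAN to generate adversarial text that fools classifiers or detection systems in a targeted or untargeted way, but I have not yet seen a definitive paper.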








  • Original post: https://www.cnblogs.com/bonelee/p/9186445.html