tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches



    I. Summary

    In one sentence:

    Make sure that batch_size (set in the image-augmentation generator) * steps_per_epoch (set in fit) is less than or equal to the number of training samples.
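
    For example, with 2,000 training images and batch_size=20, steps_per_epoch can be at most 2000 / 20 = 100; with batch_size=32, 100 steps would need 3,200 images and the generator runs out of data.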


    train_generator = train_datagen.flow_from_directory(
        train_dir,               # target directory
        target_size=(150, 150),  # resize all images to 150×150
        batch_size=20,
        class_mode='binary')     # binary labels, because binary_crossentropy loss is used

    history = model.fit(       
        train_generator,
        steps_per_epoch=100,
        epochs=150,
        validation_data=validation_generator,
        validation_steps=50)

    # case 1
    # If train_generator above uses batch_size=32 and steps_per_epoch=100 is used here, an error is raised:
    """
    tensorflow:Your input ran out of data; interrupting training.
    Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 50 batches).
    You may need to use the repeat() function when building your dataset.
    """
    # because the training set has 2000 samples (1000 cats, 1000 dogs), which is less than 100*32.
    # case 2
    # If train_generator above uses batch_size=20 and steps_per_epoch=100 is used here, no error is raised,
    # because 20*100 exactly matches the 2000 samples.
    # case 3
    # If train_generator above uses batch_size=32 and steps_per_epoch=int(1000/32) is used here,
    # no error is raised, though there is still a warning because the division is not exact.
    # No error because int(1000/32)*32 < 2000.
    # case 4
    # If train_generator above uses batch_size=40 and steps_per_epoch=100 is used here, the error is raised again,
    # because 40*100 > 2000.
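
    Rather than hard-coding these numbers, the safe values can be derived from the generators themselves. A minimal sketch, assuming the same train_generator and validation_generator as above (the DirectoryIterator returned by flow_from_directory exposes .samples and .batch_size):

    # Derive safe step counts from the generators instead of hard-coding them.
    steps_per_epoch = train_generator.samples // train_generator.batch_size
    validation_steps = validation_generator.samples // validation_generator.batch_size

    history = model.fit(
        train_generator,
        steps_per_epoch=steps_per_epoch,      # e.g. 2000 // 20 = 100
        epochs=150,
        validation_data=validation_generator,
        validation_steps=validation_steps)    # e.g. 1000 // 20 = 50, assuming 1000 validation images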

    II. tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches

    Adapted from / see: https://stackoverflow.com/questions/60509425/how-to-use-repeat-function-when-building-data-in-keras

    1. The error

    WARNING:tensorflow:Your input ran out of data;
    interrupting training. Make sure that your dataset or generator can generate at least
    steps_per_epoch * epochs batches (in this case, 5000 batches).
    You may need to use the repeat() function when building your dataset.

    2. The problem

    I am training a binary classifier on a dataset of cats and dogs:
    Total Dataset: 10000 images
    Training Dataset: 8000 images
    Validation/Test Dataset: 2000 images

    The Jupyter notebook code:

    # Part 2 - Fitting the CNN to the images
    train_datagen = ImageDataGenerator(rescale = 1./255,
                                       shear_range = 0.2,
                                       zoom_range = 0.2,
                                       horizontal_flip = True)
    
    test_datagen = ImageDataGenerator(rescale = 1./255)
    
    training_set = train_datagen.flow_from_directory('dataset/training_set',
                                                     target_size = (64, 64),
                                                     batch_size = 32,
                                                     class_mode = 'binary')
    
    test_set = test_datagen.flow_from_directory('dataset/test_set',
                                                target_size = (64, 64),
                                                batch_size = 32,
                                                class_mode = 'binary')
    
    history = model.fit_generator(training_set,
                                  steps_per_epoch=8000,
                                  epochs=25,
                                  validation_data=test_set,
                                  validation_steps=2000)
    

    I trained it on a CPU without a problem, but when I run it on a GPU it throws this error:

    Found 8000 images belonging to 2 classes.
    Found 2000 images belonging to 2 classes.
    WARNING:tensorflow:From <ipython-input-8-140743827a71>:23: Model.fit_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
    Instructions for updating:
    Please use Model.fit, which supports generators.
    WARNING:tensorflow:sample_weight modes were coerced from
      ...
        to  
      ['...']
    WARNING:tensorflow:sample_weight modes were coerced from
      ...
        to  
      ['...']
    Train for 8000 steps, validate for 2000 steps
    Epoch 1/25
     250/8000 [..............................] - ETA: 21:50 - loss: 7.6246 - accuracy: 0.5000
    WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 200000 batches). You may need to use the repeat() function when building your dataset.
     250/8000 [..............................] - ETA: 21:52 - loss: 7.6246 - accuracy: 0.5000
    

    I would like to know how to use the repeat() function in Keras with TensorFlow 2.0.

    3. Solution

    Your problem stems from the fact that the parameters steps_per_epoch and validation_steps need to equal the total number of data points divided by the batch_size.

    Your code would work in Keras 1.X, prior to August 2017.

    Change your model.fit_generator call to:

    batch_size = 32  # the same batch_size used in flow_from_directory above
    history = model.fit_generator(training_set,
                                  steps_per_epoch=int(8000/batch_size),
                                  epochs=25,
                                  validation_data=test_set,
                                  validation_steps=int(2000/batch_size))
    

    As of TensorFlow 2.1, fit_generator is deprecated; the .fit() method can also be used with generators.

    TensorFlow >= 2.1 code:

    history = model.fit(training_set.repeat(),
                        steps_per_epoch=int(8000/batch_size),
                        epochs=25,
                        validation_data=test_set.repeat(),
                        validation_steps=int(2000/batch_size))
    

    Notice that int(8000/batch_size) is equivalent to 8000 // batch_size (integer division); with batch_size = 32 this gives 250 steps per epoch, which matches the point (250/8000) at which the log above stopped.
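
    One caveat: repeat() is a method of tf.data.Dataset, not of the DirectoryIterator returned by ImageDataGenerator.flow_from_directory, so with those generators the practical fix is simply keeping steps_per_epoch at samples // batch_size as above. Below is a minimal sketch of where repeat() does apply, assuming a tf.data pipeline built with tf.keras.preprocessing.image_dataset_from_directory (available in TF 2.3+), the directory layout from the question, and the already-compiled model:

    import tensorflow as tf

    batch_size = 32

    # Build tf.data pipelines from the same directories used in the question.
    train_ds = tf.keras.preprocessing.image_dataset_from_directory(
        'dataset/training_set',
        image_size=(64, 64),
        batch_size=batch_size,
        label_mode='binary')
    test_ds = tf.keras.preprocessing.image_dataset_from_directory(
        'dataset/test_set',
        image_size=(64, 64),
        batch_size=batch_size,
        label_mode='binary')

    # repeat() makes a tf.data.Dataset yield batches indefinitely, so fit() can
    # never run out of data; steps_per_epoch then defines how long one epoch is.
    history = model.fit(
        train_ds.repeat(),
        steps_per_epoch=8000 // batch_size,
        epochs=25,
        validation_data=test_ds.repeat(),
        validation_steps=2000 // batch_size)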

    ============================================================================

    In other words, steps_per_epoch = int(8000/batch_size), where 8000 is the number of training samples.

    4. Example

    The training set contains 1,000 images per class (2,000 images in total).

    train_datagen = ImageDataGenerator(     
        rescale=1./255,     
        rotation_range=40,     
        width_shift_range=0.2,     
        height_shift_range=0.2, 
        shear_range=0.2,     
        zoom_range=0.2,     
        horizontal_flip=True,) 
    # Note: the validation data must not be augmented
    test_datagen = ImageDataGenerator(rescale=1./255) 
    
    
    # batch_size here must not be 32, otherwise the following error is raised:
    '''
    WARNING:tensorflow:Your input ran out of data; 
    interrupting training. Make sure that your dataset or generator can generate at least 
    steps_per_epoch * epochs batches (in this case, 5000 batches). 
    You may need to use the repeat() function when building your dataset.
    
    '''
    # because 32 * steps_per_epoch (100) = 3200 exceeds the 2000 training samples (see the cases in 4.1 below)
    
    
    
    train_generator = train_datagen.flow_from_directory(
        train_dir,               # target directory
        target_size=(150, 150),  # resize all images to 150×150
        batch_size=20,
        class_mode='binary')     # binary labels, because binary_crossentropy loss is used
    
    validation_generator = test_datagen.flow_from_directory(         
        validation_dir,         
        target_size=(150, 150),         
        batch_size=20,         
        class_mode='binary') 
     

    history = model.fit(       
        train_generator,
        steps_per_epoch=100,
        epochs=150,
        validation_data=validation_generator,
        validation_steps=50)

    If batch_size above is 32, then steps_per_epoch=100 here raises the error.

    What the steps_per_epoch argument does: after drawing steps_per_epoch batches from the generator (that is, after running steps_per_epoch gradient-descent steps), the fitting process moves on to the next epoch.

    4.1 Test results

    # case 1
    # If train_generator above uses batch_size=32 and steps_per_epoch=100 is used here, an error is raised:
    """
    tensorflow:Your input ran out of data; interrupting training.
    Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 50 batches).
    You may need to use the repeat() function when building your dataset.
    """
    # because the training set has 2000 samples (1000 cats, 1000 dogs), which is less than 100*32.
    # case 2
    # If train_generator above uses batch_size=20 and steps_per_epoch=100 is used here, no error is raised,
    # because 20*100 exactly matches the 2000 samples.
    # case 3
    # If train_generator above uses batch_size=32 and steps_per_epoch=int(1000/32) is used here,
    # no error is raised, though there is still a warning because the division is not exact.
    # No error because int(1000/32)*32 < 2000.
    # case 4
    # If train_generator above uses batch_size=40 and steps_per_epoch=100 is used here, the error is raised again,
    # because 40*100 > 2000.
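
    All four cases reduce to comparing batch_size * steps_per_epoch against the 2,000 available training samples. A small sketch, using only the numbers quoted above, that checks each combination:

    # Check the four cases above, assuming 2000 training samples (1000 cats + 1000 dogs).
    train_samples = 2000

    cases = [
        ("case 1", 32, 100),             # 32 * 100 = 3200 > 2000 -> runs out of data
        ("case 2", 20, 100),             # 20 * 100 = 2000       -> exactly enough
        ("case 3", 32, int(1000 / 32)),  # 32 * 31  = 992 < 2000  -> enough (warning only)
        ("case 4", 40, 100),             # 40 * 100 = 4000 > 2000 -> runs out of data
    ]

    for name, batch_size, steps_per_epoch in cases:
        needed = batch_size * steps_per_epoch
        enough = needed <= train_samples
        print(f"{name}: {batch_size} * {steps_per_epoch} = {needed} -> "
              f"{'OK' if enough else 'ran out of data'}")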

    5. Full code

     

    Original post: https://www.cnblogs.com/Renyi-Fan/p/13795558.html