• Google Colab V100 + TensorFlow 1.15.2 Performance Test


    To compare against the NVIDIA A100 in Didi Cloud's closed beta, I ran the TensorFlow benchmarks on a Google Colab V100. The results are recorded below.

    Environment

    Platform: Google Colab

    OS: Ubuntu 18.04

    GPU: V100-SXM2-16GB

    Python version: 3.6

    TensorFlow version: 1.15.2

    GPU details:

     

    Test method

     

    The tests use the tf_cnn_benchmarks.py script from the TensorFlow benchmarks repository:

    https://github.com/tensorflow/benchmarks
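
The runs in this post all call the same script with different flags. As a rough sketch, the invocations could be generated from a small table of configurations. `CONFIGS`, `build_command`, and `run_all` are names invented here for illustration, and only the flags that appear in this post (`--num_gpus`, `--batch_size`, `--model`, `--use_fp16`) are used.

```python
import subprocess
import sys

# (model, batch_size, use_fp16) for each run recorded in this post.
CONFIGS = [
    ("resnet50_v1.5", 64, False),
    ("resnet50", 64, False),
    ("resnet50", 64, True),
    ("alexnet", 512, False),
    ("inception3", 64, False),
    ("vgg16", 64, False),
    ("googlenet", 128, False),
    ("resnet152", 32, False),
]


def build_command(model, batch_size, use_fp16=False, num_gpus=1):
    """Return the argv list for one tf_cnn_benchmarks.py run."""
    cmd = [
        sys.executable,
        "tf_cnn_benchmarks.py",
        f"--num_gpus={num_gpus}",
        f"--batch_size={batch_size}",
        f"--model={model}",
    ]
    if use_fp16:
        cmd.append("--use_fp16")
    return cmd


def run_all(configs=CONFIGS):
    """Run every configuration in turn.

    Assumes the current directory is the one that contains
    tf_cnn_benchmarks.py; stops at the first run that fails.
    """
    for model, batch_size, use_fp16 in configs:
        subprocess.run(build_command(model, batch_size, use_fp16), check=True)
```

Nothing runs until `run_all()` is called, so the command list can be inspected first, for example with `print(build_command("resnet50", 64, use_fp16=True))`.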

    ResNet50_v1.5 BS64

    !python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=resnet50_v1.5
    Step	Img/sec	total_loss
    1 images/sec: 349.6 +/- 0.0 (jitter = 0.0) 7.848
    10 images/sec: 349.9 +/- 0.2 (jitter = 0.4) 8.053
    20 images/sec: 349.9 +/- 0.1 (jitter = 0.6) 8.103
    30 images/sec: 350.2 +/- 0.1 (jitter = 0.6) 8.118
    40 images/sec: 350.2 +/- 0.1 (jitter = 0.8) 7.894
    50 images/sec: 350.3 +/- 0.1 (jitter = 0.8) 7.918
    60 images/sec: 350.1 +/- 0.1 (jitter = 0.7) 8.103
    70 images/sec: 350.0 +/- 0.1 (jitter = 0.8) 7.986
    80 images/sec: 350.0 +/- 0.1 (jitter = 0.8) 7.808
    90 images/sec: 350.0 +/- 0.1 (jitter = 0.8) 7.972
    100 images/sec: 350.0 +/- 0.1 (jitter = 0.9) 7.649
    ----------------------------------------------------------------
    total images/sec: 349.78
    ----------------------------------------------------------------
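
Each run prints progress lines in the same layout, ending with a `total images/sec` line. A minimal sketch of pulling the numbers out of such a log follows; the regular expressions are written against the output shown in this post only, and `parse_log` is a hypothetical helper, not part of the benchmark suite.

```python
import re

# Matches progress lines such as
# "10 images/sec: 349.9 +/- 0.2 (jitter = 0.4) 8.053"
STEP_RE = re.compile(
    r"^\s*(\d+)\s+images/sec:\s+([\d.]+)\s+\+/-\s+([\d.]+)\s+"
    r"\(jitter = ([\d.]+)\)\s+(\S+)\s*$"
)
TOTAL_RE = re.compile(r"^\s*total images/sec:\s+([\d.]+)\s*$")


def parse_log(text):
    """Return (steps, total) parsed from tf_cnn_benchmarks output.

    steps is a list of (step, images_per_sec, loss) tuples; total is the
    final "total images/sec" value, or None if that line is missing.
    """
    steps, total = [], None
    for line in text.splitlines():
        m = STEP_RE.match(line)
        if m:
            # float("nan") is valid, so a loss printed as "nan" parses too.
            steps.append((int(m.group(1)), float(m.group(2)), float(m.group(5))))
            continue
        m = TOTAL_RE.match(line)
        if m:
            total = float(m.group(1))
    return steps, total


sample = """\
Step\tImg/sec\ttotal_loss
1 images/sec: 349.6 +/- 0.0 (jitter = 0.0) 7.848
10 images/sec: 349.9 +/- 0.2 (jitter = 0.4) 8.053
----------------------------------------------------------------
total images/sec: 349.78
----------------------------------------------------------------
"""
steps, total = parse_log(sample)
print(len(steps), total)  # prints: 2 349.78
```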

    ResNet50 BS64

    !python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=resnet50
    Step	Img/sec	total_loss
    1 images/sec: 386.2 +/- 0.0 (jitter = 0.0) 8.220
    10 images/sec: 384.8 +/- 0.4 (jitter = 0.7) 7.880
    20 images/sec: 385.5 +/- 0.5 (jitter = 2.2) 7.910
    30 images/sec: 385.7 +/- 0.4 (jitter = 2.6) 7.821
    40 images/sec: 386.0 +/- 0.4 (jitter = 2.3) 8.004
    50 images/sec: 386.2 +/- 0.3 (jitter = 2.4) 7.768
    60 images/sec: 386.3 +/- 0.3 (jitter = 2.4) 8.118
    70 images/sec: 386.1 +/- 0.3 (jitter = 2.5) 7.816
    80 images/sec: 386.3 +/- 0.2 (jitter = 2.4) 7.977
    90 images/sec: 386.2 +/- 0.2 (jitter = 2.5) 8.098
    100 images/sec: 386.3 +/- 0.2 (jitter = 2.4) 8.045
    ----------------------------------------------------------------
    total images/sec: 386.06
    ----------------------------------------------------------------

    ResNet50 BS64 --use_fp16

    !python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=resnet50 --use_fp16
    Step	Img/sec	total_loss
    1 images/sec: 911.0 +/- 0.0 (jitter = 0.0) 8.103
    10 images/sec: 918.1 +/- 1.2 (jitter = 3.1) 7.756
    20 images/sec: 914.3 +/- 2.3 (jitter = 4.3) 7.915
    30 images/sec: 914.2 +/- 2.2 (jitter = 4.2) 7.769
    40 images/sec: 912.8 +/- 1.7 (jitter = 6.5) 7.915
    50 images/sec: 911.7 +/- 1.5 (jitter = 7.3) 7.888
    60 images/sec: 912.9 +/- 1.3 (jitter = 7.0) 7.707
    70 images/sec: 911.8 +/- 1.2 (jitter = 7.6) 8.011
    80 images/sec: 912.3 +/- 1.1 (jitter = 7.3) 7.779
    90 images/sec: 912.9 +/- 1.0 (jitter = 6.9) 7.805
    100 images/sec: 913.1 +/- 0.9 (jitter = 6.8) 8.034
    ----------------------------------------------------------------
    total images/sec: 912.08
    ----------------------------------------------------------------
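
The gain from `--use_fp16` can be read off the two ResNet50 BS64 totals. The arithmetic, as a small sketch (`speedup` is just an illustrative helper name):

```python
def speedup(fast_images_per_sec, base_images_per_sec):
    """Throughput ratio of the faster run over the baseline run."""
    return fast_images_per_sec / base_images_per_sec


# Totals copied from the two ResNet50 BS64 runs above.
fp32_total = 386.06
fp16_total = 912.08

print(f"--use_fp16 speedup: {speedup(fp16_total, fp32_total):.2f}x")  # 2.36x
```

This covers one model at one batch size only; the post does not record FP16 runs for the other networks.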

    AlexNet BS512

    !python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=512 --model=alexnet
    Step	Img/sec	total_loss
    1 images/sec: 4824.0 +/- 0.0 (jitter = 0.0) nan
    10 images/sec: 4804.0 +/- 5.9 (jitter = 23.3) nan
    20 images/sec: 4802.3 +/- 4.3 (jitter = 24.4) nan
    30 images/sec: 4801.7 +/- 4.4 (jitter = 24.0) nan
    40 images/sec: 4804.5 +/- 3.9 (jitter = 23.0) nan
    50 images/sec: 4805.4 +/- 4.0 (jitter = 24.4) nan
    60 images/sec: 4806.7 +/- 3.5 (jitter = 24.8) nan
    70 images/sec: 4810.1 +/- 3.4 (jitter = 24.4) nan
    80 images/sec: 4810.0 +/- 3.1 (jitter = 25.7) nan
    90 images/sec: 4810.9 +/- 2.8 (jitter = 23.4) nan
    100 images/sec: 4811.5 +/- 2.7 (jitter = 23.4) nan
    ----------------------------------------------------------------
    total images/sec: 4808.18
    ----------------------------------------------------------------

    Inception v3 BS64

    !python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=inception3
    Step	Img/sec	total_loss
    1 images/sec: 255.3 +/- 0.0 (jitter = 0.0) 7.277
    10 images/sec: 254.3 +/- 0.5 (jitter = 2.2) 7.304
    20 images/sec: 254.4 +/- 0.3 (jitter = 2.4) 7.292
    30 images/sec: 254.3 +/- 0.3 (jitter = 2.3) 7.402
    40 images/sec: 254.2 +/- 0.3 (jitter = 2.3) 7.314
    50 images/sec: 254.3 +/- 0.2 (jitter = 2.3) 7.283
    60 images/sec: 254.3 +/- 0.2 (jitter = 2.2) 7.363
    70 images/sec: 254.3 +/- 0.2 (jitter = 2.1) 7.350
    80 images/sec: 254.3 +/- 0.2 (jitter = 2.2) 7.384
    90 images/sec: 254.3 +/- 0.2 (jitter = 1.9) 7.318
    100 images/sec: 254.3 +/- 0.1 (jitter = 1.9) 7.376
    ----------------------------------------------------------------
    total images/sec: 254.19
    ----------------------------------------------------------------

    VGG16 BS64

    !python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=vgg16
    Step	Img/sec	total_loss
    1 images/sec: 250.0 +/- 0.0 (jitter = 0.0) 7.319
    10 images/sec: 250.2 +/- 0.2 (jitter = 0.2) 7.297
    20 images/sec: 250.4 +/- 0.1 (jitter = 0.5) 7.284
    30 images/sec: 250.4 +/- 0.1 (jitter = 0.6) 7.274
    40 images/sec: 250.4 +/- 0.1 (jitter = 0.6) 7.288
    50 images/sec: 250.4 +/- 0.1 (jitter = 0.6) 7.278
    60 images/sec: 250.3 +/- 0.1 (jitter = 0.6) 7.278
    70 images/sec: 250.3 +/- 0.1 (jitter = 0.6) 7.266
    80 images/sec: 250.3 +/- 0.1 (jitter = 0.6) 7.288
    90 images/sec: 250.2 +/- 0.1 (jitter = 0.6) 7.269
    100 images/sec: 250.3 +/- 0.1 (jitter = 0.6) 7.270
    ----------------------------------------------------------------
    total images/sec: 250.19
    ----------------------------------------------------------------

    GoogLeNet BS128

    !python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=128 --model=googlenet
    Step	Img/sec	total_loss
    1 images/sec: 1034.6 +/- 0.0 (jitter = 0.0) 7.105
    10 images/sec: 1034.2 +/- 0.9 (jitter = 1.8) 7.105
    20 images/sec: 1030.9 +/- 1.8 (jitter = 2.9) 7.094
    30 images/sec: 1031.0 +/- 1.3 (jitter = 4.2) 7.086
    40 images/sec: 1031.6 +/- 1.0 (jitter = 3.9) 7.067
    50 images/sec: 1030.6 +/- 0.9 (jitter = 5.4) 7.093
    60 images/sec: 1030.4 +/- 0.8 (jitter = 5.4) 7.050
    70 images/sec: 1030.6 +/- 0.8 (jitter = 5.7) 7.073
    80 images/sec: 1030.3 +/- 0.7 (jitter = 5.9) 7.078
    90 images/sec: 1030.3 +/- 0.6 (jitter = 5.6) 7.078
    100 images/sec: 1030.0 +/- 0.6 (jitter = 5.5) 7.069
    ----------------------------------------------------------------
    total images/sec: 1029.42
    ----------------------------------------------------------------

    ResNet152 BS32

    !python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=32 --model=resnet152
    Step	Img/sec	total_loss
    1 images/sec: 137.0 +/- 0.0 (jitter = 0.0) 9.023
    10 images/sec: 138.0 +/- 0.4 (jitter = 1.4) 8.574
    20 images/sec: 138.5 +/- 0.3 (jitter = 1.6) 8.600
    30 images/sec: 138.5 +/- 0.2 (jitter = 1.6) 8.755
    40 images/sec: 138.6 +/- 0.2 (jitter = 1.6) 8.624
    50 images/sec: 138.5 +/- 0.2 (jitter = 1.6) 8.801
    60 images/sec: 138.4 +/- 0.1 (jitter = 1.7) 8.679
    70 images/sec: 138.4 +/- 0.1 (jitter = 1.8) 9.112
    80 images/sec: 138.4 +/- 0.1 (jitter = 1.7) 8.872
    90 images/sec: 138.4 +/- 0.1 (jitter = 1.7) 9.025
    100 images/sec: 138.4 +/- 0.1 (jitter = 1.7) 8.847
    ----------------------------------------------------------------
    total images/sec: 138.39
    ----------------------------------------------------------------
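
To see the eight runs side by side, the totals can be collected into one table. The sketch below also derives an average time per training step as batch_size / images_per_sec, which assumes a single GPU (as in every run here) so that one step processes exactly one batch. `RESULTS`, `step_time_ms`, and `summary_rows` are illustrative names.

```python
# (label, batch_size, total images/sec) copied from the runs above.
RESULTS = [
    ("resnet50_v1.5", 64, 349.78),
    ("resnet50", 64, 386.06),
    ("resnet50 --use_fp16", 64, 912.08),
    ("alexnet", 512, 4808.18),
    ("inception3", 64, 254.19),
    ("vgg16", 64, 250.19),
    ("googlenet", 128, 1029.42),
    ("resnet152", 32, 138.39),
]


def step_time_ms(batch_size, images_per_sec):
    """Average wall time of one training step, in milliseconds."""
    return batch_size / images_per_sec * 1000.0


def summary_rows(results=RESULTS):
    """Rows of (label, batch, images/sec, ms/step), fastest first."""
    rows = [(label, bs, ips, step_time_ms(bs, ips)) for label, bs, ips in results]
    return sorted(rows, key=lambda r: r[2], reverse=True)


print(f"{'model':<22}{'batch':>5}{'img/sec':>10}{'ms/step':>9}")
for label, bs, ips, ms in summary_rows():
    print(f"{label:<22}{bs:>5}{ips:>10.2f}{ms:>9.1f}")
```

Throughput is not directly comparable across rows with different batch sizes, which is why the batch size is kept as its own column.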

    Performance comparison

    A100 vs. V100 vs. 2080 Ti performance comparison:

    https://www.tonyisstark.com/383.html

  • Original post: https://www.cnblogs.com/wangpg/p/13689583.html