• On a recent problem with CUDA atomic operations


    Avoid atomic operations wherever you possibly can, because their impact on performance is dramatic: in my case, throughput dropped from 800 MBps to 110 MBps.

    While reading a forum I came across a passage that someone had relayed; I am recording it below:

    honestly, if you're trying to do this you're probably going down the wrong path, but general rules of thumb are

    - don't have multiple threads within a warp contending for a lock, that leads to all sorts of confusing issues for most people because inter-warp branches are not the same as intra-warp branches
    - avoid global memory contention as much as possible (e.g., if you need to have a critical section among all warps in all CTAs, do per-CTA shared memory locks then a global lock)
    - traditional threading primitives implemented with atomics are a pretty terrible idea, if you can avoid atomics as much as possible (or entirely) you can get a big perf win (and there are very interesting ways you can do this, and when I say big perf win, I mean on the order of 5-10x)

    ("well," you think, "it sounds like tim is speaking from experience!" oh yes, I am)
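
    To make the second and third rules of thumb concrete, here is a minimal sketch of my own (it is not from the quoted post). It sums an array two ways: a baseline where every thread issues an atomic on one global address, and a hierarchical version that reduces inside each warp with shuffles, combines the per-warp results through shared memory, and issues only one global atomic per block. Note that this swaps in a reduction rather than the per-CTA lock the quote mentions; the shared idea is only "resolve contention locally before touching global memory". The names `sum_naive`, `sum_hierarchical`, `sum_on_gpu` and `kBlockSize` are hypothetical, and error checking is omitted for brevity.

```cuda
#include <vector>
#include <cuda_runtime.h>

constexpr int kBlockSize = 256;  // assumed block size; must be a multiple of 32

// Baseline: every thread contends on the same global address.
__global__ void sum_naive(const int* in, int n, int* out) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) atomicAdd(out, in[i]);
}

// Hierarchical: warp shuffle -> shared memory -> one global atomic per block.
// Assumes the kernel is launched with blockDim.x == kBlockSize.
__global__ void sum_hierarchical(const int* in, int n, int* out) {
    __shared__ int warp_sums[kBlockSize / 32];

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int v = (i < n) ? in[i] : 0;  // out-of-range threads contribute 0

    // Reduce within the warp using registers only, no atomics.
    for (int offset = 16; offset > 0; offset >>= 1)
        v += __shfl_down_sync(0xffffffffu, v, offset);

    int lane = threadIdx.x & 31;
    int warp = threadIdx.x >> 5;
    if (lane == 0) warp_sums[warp] = v;  // lane 0 holds the warp total
    __syncthreads();

    // The first warp combines the per-warp partial sums.
    if (warp == 0) {
        v = (lane < kBlockSize / 32) ? warp_sums[lane] : 0;
        for (int offset = 16; offset > 0; offset >>= 1)
            v += __shfl_down_sync(0xffffffffu, v, offset);
        if (lane == 0) atomicAdd(out, v);  // one global atomic per block
    }
}

// Host helper (hypothetical): copies data to the GPU, runs one of the two
// kernels, and returns the sum.
int sum_on_gpu(const std::vector<int>& host, bool hierarchical) {
    int n = static_cast<int>(host.size());
    if (n == 0) return 0;

    int* d_in = nullptr;
    int* d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, sizeof(int));
    cudaMemcpy(d_in, host.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(d_out, 0, sizeof(int));

    int blocks = (n + kBlockSize - 1) / kBlockSize;
    if (hierarchical)
        sum_hierarchical<<<blocks, kBlockSize>>>(d_in, n, d_out);
    else
        sum_naive<<<blocks, kBlockSize>>>(d_in, n, d_out);

    int result = 0;
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(&result, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}
```

    Both paths should return the same sum; the difference is that the hierarchical kernel issues one global atomic per block of 256 threads instead of one per element. How much that helps depends on the GPU and the access pattern, so it is worth timing both on your own hardware.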

  • Original post: https://www.cnblogs.com/superniaoren/p/2121837.html