[LeetCode] LFU Cache 最近最不常用页面置换缓存器

[LeetCode] LFU Cache 最近最不常用页面置换缓存器
Design and implement a data structure for Least Frequently Used (LFU) cache. It should support the following operations: get and put.

get(key) - Get the value (will always be positive) of the key if the key exists in the cache, otherwise return -1.
put(key, value) - Set or insert the value if the key is not already present. When the cache reaches its capacity, it should invalidate the least frequently used item before inserting a new item. For the purpose of this problem, when there is a tie (i.e., two or more keys that have the same frequency), the least recently used key would be evicted.

Follow up:
Could you do both operations in O(1) time complexity?

Example:
```
LFUCache cache = new LFUCache( 2 /* capacity */ );

cache.put(1, 1);
cache.put(2, 2);
cache.get(1);       // returns 1
cache.put(3, 3);    // evicts key 2
cache.get(2);       // returns -1 (not found)
cache.get(3);       // returns 3.
cache.put(4, 4);    // evicts key 1.
cache.get(1);       // returns -1 (not found)
cache.get(3);       // returns 3
cache.get(4);       // returns 4
```
这道题是让我们实现最近不常用页面置换算法LFU (Least Frequently Used), 之前我们做过一道类似的题LRU Cache，让我们求最近最少使用页面置换算法LRU (Least Recnetly Used)。两种算法虽然名字看起来很相似，但是其实是不同的。顾名思义，LRU算法是首先淘汰最长时间未被使用的页面，而LFU是先淘汰一定时间内被访问次数最少的页面。光说无凭，举个例子来看看，比如说我们的cache的大小为3，然后我们按顺序存入 5，4，5，4，5，7，这时候cache刚好被装满了，因为put进去之前存在的数不会占用额外地方。那么此时我们想再put进去一个8，如果使用LRU算法，应该将4删除，因为4最久未被使用，而如果使用LFU算法，则应该删除7，因为7被使用的次数最少，只使用了一次。相信这个简单的例子可以大概说明二者的区别。

这道题比之前那道LRU的题目还要麻烦一些，因为那道题只要用个list把数字按时间顺序存入，链表底部的位置总是最久未被使用的，每次删除底部的值即可。而这道题不一样，由于需要删除最少次数的数字，那么我们必须要统计每一个key出现的次数，所以我们用一个哈希表m来记录当前数据{key, value}和其出现次数之间的映射，这样还不够，为了方便操作，我们需要把相同频率的key都放到一个list中，那么需要另一个哈希表freq来建立频率和一个里面所有key都是当前频率的list之间的映射。由于题目中要我们在O(1)的时间内完成操作了，为了快速的定位freq中key的位置，我们再用一个哈希表iter来建立key和freq中key的位置之间的映射。最后当然我们还需要两个变量cap和minFreq，分别来保存cache的大小，和当前最小的频率。

为了更好的讲解思路，我们还是用例子来说明吧，我们假设cache的大小为2，假设我们已经按顺序put进去5，4，那么来看一下内部的数据是怎么保存的，由于value的值并不是很重要，为了不影响key和frequence，我们采用value#来标记：

m:

5 -> {value5, 1}

4 -> {value4, 1}

freq:

1 -> {5，4}

iter:

4 -> list.begin() + 1

5 -> list.begin()

这应该不是很难理解，m中5对应的频率为1，4对应的频率为1，然后freq中频率为1的有4和5。iter中是key所在freq中对应链表中的位置的iterator。然后我们的下一步操作是get(5)，下面是get需要做的步骤：

1. 如果m中不存在5，那么返回-1

2. 从freq中频率为1的list中将5删除

3. 将m中5对应的frequence值自增1

4. 将5保存到freq中频率为2的list的末尾

5. 在iter中保存5在freq中频率为2的list中的位置

6. 如果freq中频率为minFreq的list为空，minFreq自增1

7. 返回m中5对应的value值

经过这些步骤后，我们再来看下此时内部数据的值：

m:

5 -> {value5, 2}

4 -> {value4, 1}

freq:

1 -> {4}

2 -> {5}

iter:

4 -> list.begin()

5 -> list.begin()

这应该不是很难理解，m中5对应的频率为2，4对应的频率为1，然后freq中频率为1的只有4，频率为2的只有5。iter中是key所在freq中对应链表中的位置的iterator。然后我们下一步操作是要put进去一个7，下面是put需要做的步骤：

1. 如果调用get(7)返回的结果不是-1，那么在将m中7对应的value更新为当前value，并返回

2. 如果此时m的大小大于了cap，即超过了cache的容量，则：

　　a）在m中移除minFreq对应的list的首元素的纪录，即移除4 -> {value4, 1}

　　b）在iter中清除4对应的纪录，即移除4 -> list.begin()

　　c）在freq中移除minFreq对应的list的首元素，即移除4

3. 在m中建立7的映射，即 7 -> {value7, 1}

4. 在freq中频率为1的list末尾加上7

5. 在iter中保存7在freq中频率为1的list中的位置

6. minFreq重置为1

经过这些步骤后，我们再来看下此时内部数据的值：

m:

5 -> {value5, 2}

7 -> {value7, 1}

freq:

1 -> {7}

2 -> {5}

iter:

7 -> list.begin()

5 -> list.begin()

参见代码如下：
```
class LFUCache {
public:
    LFUCache(int capacity) {
        cap = capacity;
    }
    
    int get(int key) {
        if (m.count(key) == 0) return -1;
        freq[m[key].second].erase(iter[key]);
        ++m[key].second;
        freq[m[key].second].push_back(key);
        iter[key] = --freq[m[key].second].end();
        if (freq[minFreq].size() == 0) ++minFreq;
        return m[key].first;
    }
    
    void put(int key, int value) {
        if (cap <= 0) return;
        if (get(key) != -1) {
            m[key].first = value;
            return;
        }
        if (m.size() >= cap) {
            m.erase(freq[minFreq].front());
            iter.erase(freq[minFreq].front());
            freq[minFreq].pop_front();
        }
        m[key] = {value, 1};
        freq[1].push_back(key);
        iter[key] = --freq[1].end();
        minFreq = 1;
    }

private:
    int cap, minFreq;
    unordered_map<int, pair<int, int>> m;
    unordered_map<int, list<int>> freq;
    unordered_map<int, list<int>::iterator> iter;
};
```
类似题目：

LRU Cache

参考资料：

https://leetcode.com/problems/lfu-cache/

https://discuss.leetcode.com/topic/69436/concise-c-o-1-solution-using-3-hash-maps-with-explanation

LeetCode All in One 题目讲解汇总(持续更新中...)
相关阅读:
分布式存储
 存储知识学习
 洛谷 P1003 铺地毯 (C/C++, JAVA)
多线程面试题系列3_生产者消费者模式的两种实现方法
 多线程面试题系列2_监视线程的简单实现
 多线程面试题系列1_数组多线程分解
 《深度学习》阅读笔记1
素数在两种常见情况下的标准最优算法
 dfs与dp算法之关系与经典入门例题
 百度之星资格赛2018B题-子串查询
原文地址：https://www.cnblogs.com/grandyang/p/6258459.html