Given an array of citations (each citation is a non-negative integer) of a researcher, write a function to compute the researcher's h-index.
According to the definition of h-index on Wikipedia: "A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each."
Example:
Input: citations = [3,0,6,1,5]
Output: 3
Explanation: [3,0,6,1,5] means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, her h-index is 3.
Note: If there are several possible values for h, the maximum one is taken as the h-index.
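As a sanity check on the definition and the note about taking the maximum, here is a minimal brute-force sketch. The function name h_index_by_definition is made up for illustration. It relies on the standard reading that the h-index is the largest h for which at least h papers have h or more citations, and it runs in O(n^2), so it is only meant for checking the faster solutions.

```python
from typing import List


def h_index_by_definition(citations: List[int]) -> int:
    """Return the largest h such that at least h papers have >= h citations."""
    n = len(citations)
    # Try candidates from largest to smallest so the first hit is the maximum.
    for h in range(n, 0, -1):
        papers_with_at_least_h = sum(1 for c in citations if c >= h)
        if papers_with_at_least_h >= h:
            return h
    # No positive h qualifies (empty input or all zeros).
    return 0
```

For the example input [3,0,6,1,5], the candidates 5 and 4 fail (only 2 papers have at least 4 citations) and 3 succeeds.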
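The h-index is a hybrid quantitative metric that can be used to assess both the quantity and the quality of a researcher's academic output.
A person's h-index can be determined as follows:
Sort all of their published SCI papers by citation count from highest to lowest;
Scan the sorted list from the front until a paper's rank (its 1-based position) is greater than its citation count. That rank minus one is the h-index. If no such paper exists, the h-index equals the total number of papers.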
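The two steps above can be traced on the example input. This is a small sketch, with the hypothetical helper h_index_with_trace returning both the h-index and the (rank, citations) pairs it inspected before stopping.

```python
from typing import List, Tuple


def h_index_with_trace(citations: List[int]) -> Tuple[int, List[Tuple[int, int]]]:
    """Return (h, rows) where rows are the (rank, citations) pairs inspected."""
    ranked = sorted(citations, reverse=True)
    rows = []
    for rank, cited in enumerate(ranked, start=1):
        rows.append((rank, cited))
        if rank > cited:
            # First rank that exceeds its citation count: h is rank - 1.
            return rank - 1, rows
    # No rank exceeded its citation count, so every paper counts.
    return len(ranked), rows


h, rows = h_index_with_trace([3, 0, 6, 1, 5])
print(h)     # 3
print(rows)  # [(1, 6), (2, 5), (3, 3), (4, 1)]
```

The scan stops at rank 4, where the paper has only 1 citation, so the h-index is 4 - 1 = 3.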
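Solution 1: Sort the array first, T: O(nlogn), S: O(1). After sorting in ascending order, citations.length - i papers have at least citations[i] citations, so for each position take the smaller of the paper count and the citation count, i.e. Math.min(citations.length - i, citations[i]), and update level with the maximum. Once the array is sorted, binary search can replace the linear scan, which is faster; see: 275. H-Index II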
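The binary-search variant mentioned above is not shown in the solutions below, so here is a minimal sketch of one way it could look. The name h_index_binary_search is hypothetical. Note that the sort still costs O(n log n); the O(log n) search only pays off when the input is already sorted, as in 275. H-Index II.

```python
from typing import List


def h_index_binary_search(citations: List[int]) -> int:
    """Sort ascending, then binary-search the first index i with ordered[i] >= n - i."""
    ordered = sorted(citations)
    n = len(ordered)
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        # n - mid papers sit at or after mid; they all have >= ordered[mid] citations.
        if ordered[mid] >= n - mid:
            hi = mid       # mid qualifies, look for an earlier index
        else:
            lo = mid + 1   # mid fails, the answer is to the right
    # lo is the first qualifying index (or n if none), so n - lo papers qualify.
    return n - lo
```

The search is valid because ordered[i] never decreases while n - i strictly decreases, so the condition flips from false to true at most once.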
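Solution 2: Counting sort, T: O(n), S: O(n). Use an array count of size n+1 to tally the citation counts: count[i] is the number of papers with exactly i citations, and every paper with n or more citations goes into count[n]. Traverse the array from back to front, accumulating as you go so that count[i] becomes the number of papers with at least i citations. The first i that satisfies count[i] >= i is the h-index, so return it; if no i > 0 qualifies, return 0.
Why traverse from the back? Why return as soon as count[i] >= i?
On one hand, the number of papers with at least i-1 citations is the sum of count[i-1] and everything after it, so the accumulation has to run from back to front. On the other hand, the h-index should be as large as possible, and the largest candidates are at the back; the candidates only get smaller toward the front, so return at the first i with count[i] >= i. Reference: Code_Granker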
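To make the back-to-front accumulation concrete, the following sketch builds both arrays for the example input. The helper names bucket_and_suffix_counts and h_index_from_buckets are made up for illustration, and at_least is a separate array here rather than an in-place update of count.

```python
from typing import List, Tuple


def bucket_and_suffix_counts(citations: List[int]) -> Tuple[List[int], List[int]]:
    """Return (count, at_least): exact-bucket tallies and their suffix sums."""
    n = len(citations)
    count = [0] * (n + 1)
    for c in citations:
        count[min(c, n)] += 1  # anything >= n shares the last bucket
    at_least = count[:]
    for i in range(n - 1, -1, -1):
        at_least[i] += at_least[i + 1]  # papers with at least i citations
    return count, at_least


def h_index_from_buckets(citations: List[int]) -> int:
    """Largest i with at_least[i] >= i; i = 0 always qualifies."""
    _, at_least = bucket_and_suffix_counts(citations)
    return max(i for i in range(len(at_least)) if at_least[i] >= i)


count, at_least = bucket_and_suffix_counts([3, 0, 6, 1, 5])
print(count)     # [1, 1, 0, 1, 0, 2]
print(at_least)  # [5, 4, 3, 3, 2, 2]
```

Reading at_least from the back, indices 5 and 4 fail (2 papers each) and index 3 succeeds (3 papers), which matches the expected h-index of 3.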
Java:
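import java.util.Arrays;

public class Solution {
    public int hIndex(int[] citations) {
        // Sort ascending so that citations.length - i papers have at least citations[i] citations.
        Arrays.sort(citations);
        int level = 0;
        for (int i = 0; i < citations.length; i++) {
            // Candidate h at this position is capped by both the paper count and the citation count.
            level = Math.max(level, Math.min(citations.length - i, citations[i]));
        }
        return level;
    }
}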
Java:
public class Solution {
    public int hIndex(int[] citations) {
        int n = citations.length;
        int[] count = new int[n + 1];
        for (int c : citations) {
            if (c >= n) {
                count[n]++; // citation counts >= n are all tallied in count[n]
            } else {
                count[c]++;
            }
        }
        for (int i = n; i > 0; i--) { // traverse from the back
            if (count[i] >= i) {
                return i;
            }
            count[i - 1] += count[i]; // papers with at least i-1 citations: count[i-1] plus everything after it
        }
        return 0;
    }
}
Python: Counting sort.
class Solution(object):
    def hIndex(self, citations):
        """
        :type citations: List[int]
        :rtype: int
        """
        n = len(citations)
        count = [0] * (n + 1)
        for x in citations:
            # Put all x >= n in the same bucket.
            if x >= n:
                count[n] += 1
            else:
                count[x] += 1

        h = 0  # running number of papers with at least i citations
        for i in reversed(range(0, n + 1)):
            h += count[i]
            if h >= i:
                return i
        return h
Python: T: O(nlogn) S: O(1)
class Solution2(object):
    def hIndex(self, citations):
        """
        :type citations: List[int]
        :rtype: int
        """
        citations.sort(reverse=True)
        h = 0
        for x in citations:
            if x >= h + 1:
                h += 1
            else:
                break
        return h
Python: T: O(nlogn) S: O(n)
class Solution3(object):
    def hIndex(self, citations):
        """
        :type citations: List[int]
        :rtype: int
        """
        return sum(x >= i + 1 for i, x in enumerate(sorted(citations, reverse=True)))
Python:
class Solution(object):
    def hIndex(self, citations):
        """
        :type citations: List[int]
        :rtype: int
        """
        if not citations:
            return 0
        return max([min(i + 1, c) for i, c in enumerate(sorted(citations, reverse=True))])
C++:
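#include <algorithm>
#include <functional>
#include <vector>
using namespace std;

class Solution {
public:
    int hIndex(vector<int>& citations) {
        // Sort descending so that index i is the number of papers ranked before citations[i].
        sort(citations.begin(), citations.end(), greater<int>());
        const int n = citations.size();
        for (int i = 0; i < n; ++i) {
            // First paper whose 0-based index reaches its citation count ends the run.
            if (i >= citations[i]) return i;
        }
        return n;
    }
};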
Similar problems:
[LeetCode] 275. H-Index II