H-Index

这题看起来很妙啊,因为解法实在很多,刚看第一眼觉得肯定是二分答案的做法,结果LC给的答案里面都没有二分的这种方法而是排序和哈希,首先是自己的二分答案的算法,这里的下界是0,上界就是引用率最大值

class Solution(object):
    def hIndex(self, citations):
        """
        :type citations: List[int]
        :rtype: int
        """
        start, end = 0, max(citations)
        n = len(citations)
        result = None

        while start <= end:
            mid = start + (end - start) / 2
            if self.check(citations, mid):
                start = mid + 1
                result = mid
            else:
                end = mid - 1

        return result

    def check(self, citations, h):
        return sum((1 for paper in citations if paper >= h)) >= h

还可以注意到,这里实际上找的H index就是倒序排序之后数组能围成的最大长方形的边长,再加上这道题输入的范围限制在了正整数和0,counting sort自然是一个好方法了,尤其在这两道题里,上界不是数组的最大值,而是数组最大值和数组长度的较小值(因为h index的定义决定的),所以我们这里对一切引用次数超过数组长度的项直接切到数组长度来处理,再加上找的是满足条件的解,所以桶建立了之后我们必须倒序查找,总之这两道题的关键在于认清上界不是最大值

class Solution(object):
    def hIndex(self, citations):
        """
        :type citations: List[int]
        :rtype: int
        """
        if not citations:
            return 0

        n = len(citations)
        tmp = [0] * (n + 1)
        for citation in citations:
            tmp[min(citation, n)] += 1

        result = 0
        for i in xrange(len(tmp) - 1, -1, -1):
            if i != len(tmp) - 1:
                tmp[i] += tmp[i + 1]
            if tmp[i] >= i:
                return i

        return 0

H-Index II

这里先用了二分答案之后再二分搜索的方法,时间复杂度是O(lgn^2)但是还是超时了

class Solution(object):
    def hIndex(self, citations):
        """
        :type citations: List[int]
        :rtype: int
        """
        if not citations:
            return 0

        start, end = 0, max(citations)
        n = len(citations)
        result = 0

        while start <= end:
            mid = start + (end - start) / 2
            index = bisect.bisect_left(citations, mid)
            if n - index >= mid:
                result = mid
                start = mid + 1
            else:
                end = mid - 1

        return result

但是这里其实可以发现一个问题,如果是一个排序了的数组,我们这里实际上还是二分的下标而不是元素,因为我们能到达的上界还是n - 1的下标,所以我们这里可以直接用普通二分法,但这里二分的时候只能以citations比较,如果以n - mid这样来做比较值答案就不对了,不是很懂为什么非要保持这个顺序才行

class Solution(object):
    def hIndex(self, citations):
        """
        :type citations: List[int]
        :rtype: int
        """
        n = len(citations)
        start, end = 0, n - 1

        while start <= end:
            mid = start + (end - start) / 2
            if citations[mid] >= n - mid:
                end = mid - 1
            else:
                start = mid + 1
        return n - start

上面这个版本对于两值相等的情况很不好理解,因为相等照道理我们应该把下标往上移才对,而这里选择了往下,下面的版本更加清楚,当两值相等时我们不用再二分可以直接返回答案了(思考一下为什么)

class Solution(object):
    def hIndex(self, citations):
        """
        :type citations: List[int]
        :rtype: int
        """
        n = len(citations)
        start, end = 0, n - 1

        while start <= end:
            mid = start + (end - start) / 2
            if citations[mid] == n - mid:
                return n - mid
            elif citations[mid] > n - mid:
                end = mid - 1
            else:
                start = mid + 1
        return n - start

results matching ""

    No results matching ""