标签：index hash key self 哈希 table 数据结构

数据结构-哈希表

1. 定义：

哈希表（也称为散列表）是一种使用哈希函数组织数据，以支持快速插入和搜索的数据结构。它通过将键映射到表中一个位置来访问记录，从而加快访问速度。

# 创建一个空字典
hash_table = {}

# 向哈希表中添加键-值对
hash_table['apple'] = 10
hash_table['banana'] = 20
hash_table['orange'] = 30

# 访问哈希表中的值
print(hash_table['apple'])  # 输出：10
print(hash_table['banana'])  # 输出：20
print(hash_table['orange'])  # 输出：30

2. 哈希函数：

哈希表的核心是哈希函数，它将输入（或'键'）转换为表中的索引。理想的哈希函数应该将键均匀分布在表中，减少碰撞（两个键映射到同一位置）的概率。

# 使用 hash() 函数计算哈希值
print(hash('hello'))  # 输出：-1740827030787550043
print(hash(42))  # 输出：42
print(hash((1, 2, 3)))  # 输出：2528502973977326415

# 字典中的键会被哈希
my_dict = {'hello': 1, 'world': 2}
print(hash('hello') == hash('hello'))  # 输出：True
print(hash('hello') == hash('world'))  # 输出：False

# 由于哈希碰撞的可能性，不同的对象可能会有相同的哈希值
print(hash(42) == hash('42'))  # 输出：True

python中哈希模块的使用

import hashlib

# 使用 SHA-256 哈希算法
hash_object = hashlib.sha256(b'hello')
hex_dig = hash_object.hexdigest()
print(hex_dig)  # 输出：2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824

3. 碰撞解决：

当两个不同的键映射到同一个位置时，会发生碰撞。解决碰撞的常见方法包括链地址法（在同一位置存储一个元素的列表）和开放寻址法（寻找另一个空槽位）。

链地址法：链地址法是一种解决哈希碰撞的方法，它在哈希表的每个槽中维护一个链表（或其他数据结构），将哈希冲突的元素都放入同一个槽中的链表中。当发生碰撞时，只需将元素插入到对应槽中的链表中即可。

class HashTable:
    def __init__(self, size):
        self.size = size
        self.table = [[] for _ in range(size)]

    def _hash_function(self, key):
        return hash(key) % self.size

    def insert(self, key, value):
        index = self._hash_function(key)
        self.table[index].append((key, value))

    def search(self, key):
        index = self._hash_function(key)
        for k, v in self.table[index]:
            if k == key:
                return v
        return None

# 创建一个大小为 10 的哈希表
hash_table = HashTable(10)

# 插入键-值对
hash_table.insert('apple', 10)
hash_table.insert('banana', 20)
hash_table.insert('orange', 30)

# 查找键对应的值
print(hash_table.search('apple'))  # 输出：10
print(hash_table.search('banana'))  # 输出：20
print(hash_table.search('orange'))  # 输出：30

开放寻址法：开放寻址法是一种解决哈希碰撞的方法，它在哈希冲突发生时不是简单地将元素放在哈希桶的下一个位置，而是通过一定的探测规则（如线性探测、二次探测等）来寻找下一个可用的位置。
常见的探测规则包括：

- 线性探测（Linear Probing）：当发生哈希冲突时，线性探测会依次检查哈希表中的下一个槽，直到找到一个空槽或者遍历完整个哈希表。

class HashTableLinearProbing:
    def __init__(self, size):
        self.size = size
        self.keys = [None] * size
        self.values = [None] * size

    def _hash_function(self, key):
        return hash(key) % self.size

    def _linear_probe(self, index):
        return (index + 1) % self.size

    def insert(self, key, value):
        index = self._hash_function(key)
        while self.keys[index] is not None:
            index = self._linear_probe(index)
        self.keys[index] = key
        self.values[index] = value

    def search(self, key):
        index = self._hash_function(key)
        while self.keys[index] is not None:
            if self.keys[index] == key:
                return self.values[index]
            index = self._linear_probe(index)
        return None

# 创建一个大小为 10 的哈希表
hash_table = HashTableLinearProbing(10)

# 插入键-值对
hash_table.insert('apple', 10)
hash_table.insert('banana', 20)
hash_table.insert('orange', 30)

# 查找键对应的值
print(hash_table.search('apple'))  # 输出：10
print(hash_table.search('banana'))  # 输出：20
print(hash_table.search('orange'))  # 输出：30

- 二次探测（Quadratic Probing）：二次探测通过一个二次方程来确定下一个探测的位置，可以避免线性探测中的“聚集”现象。

class HashTableQuadraticProbing:
    def __init__(self, size):
        self.size = size
        self.keys = [None] * size
        self.values = [None] * size

    def _hash_function(self, key):
        return hash(key) % self.size

    def _quadratic_probe(self, index, attempt):
        return (index + attempt ** 2) % self.size

    def insert(self, key, value):
        index = self._hash_function(key)
        attempt = 0
        while self.keys[index] is not None:
            attempt += 1
            index = self._quadratic_probe(index, attempt)
        self.keys[index] = key
        self.values[index] = value

    def search(self, key):
        index = self._hash_function(key)
        attempt = 0
        while self.keys[index] is not None:
            if self.keys[index] == key:
                return self.values[index]
            attempt += 1
            index = self._quadratic_probe(index, attempt)
        return None

# 创建一个大小为 10 的哈希表
hash_table = HashTableQuadraticProbing(10)

# 插入键-值对
hash_table.insert('apple', 10)
hash_table.insert('banana', 20)
hash_table.insert('orange', 30)

# 查找键对应的值
print(hash_table.search('apple'))  # 输出：10
print(hash_table.search('banana'))  # 输出：20
print(hash_table.search('orange'))  # 输出：30

- 双重散列（Double Hashing）：双重散列使用两个哈希函数来计算下一个探测的位置，可以更好地分散冲突元素。

class HashTableDoubleHashing:
    def __init__(self, size):
        self.size = size
        self.keys = [None] * size
        self.values = [None] * size

    def _hash_function1(self, key):
        return hash(key) % self.size

    def _hash_function2(self, key):
        # 使用简单的散列函数，例如取余或者乘法散列
        return 7 - (hash(key) % 7)

    def _double_hash_probe(self, index, attempt, hash2):
        return (index + attempt * hash2) % self.size

    def insert(self, key, value):
        index = self._hash_function1(key)
        hash2 = self._hash_function2(key)
        attempt = 0
        while self.keys[index] is not None:
            attempt += 1
            index = self._double_hash_probe(index, attempt, hash2)
        self.keys[index] = key
        self.values[index] = value

    def search(self, key):
        index = self._hash_function1(key)
        hash2 = self._hash_function2(key)
        attempt = 0
        while self.keys[index] is not None:
            if self.keys[index] == key:
                return self.values[index]
            attempt += 1
            index = self._double_hash_probe(index, attempt, hash2)
        return None

# 创建一个大小为 10 的哈希表
hash_table = HashTableDoubleHashing(10)

# 插入键-值对
hash_table.insert('apple', 10)
hash_table.insert('banana', 20)
hash_table.insert('orange', 30)

# 查找键对应的值
print(hash_table.search('apple'))  # 输出：10
print(hash_table.search('banana'))  # 输出：20
print(hash_table.search('orange'))  # 输出：30

使用更好的哈希函数：选择一个更好的哈希函数可以减少哈希碰撞的概率。在 Python 中，可以使用标准库中的 hashlib 模块来使用不同的哈希算法，如 SHA-256。这些哈希算法具有更好的分散性和安全性，减少了碰撞的可能性。

4. 性能：

哈希表的主要优点是数据的快速访问。理想情况下，哈希表的查找、插入和删除操作的时间复杂度都是常数时间（O(1)）。但在最坏情况下，这些操作的时间复杂度可能会上升到线性时间（O(n)），尤其是在碰撞处理得不够好时。

5. 应用场景：

哈希表广泛应用于各种场景，特别是需要快速数据访问的场合。例如，在编程语言中实现关联数组、数据库索引、缓存、集合等。

标签：index,hash,key,self,哈希,table,数据结构
From： https://www.cnblogs.com/zx-demo/p/18132868

数据结构-哈希表

数据结构-哈希表

1. 定义：

2. 哈希函数：

python中哈希模块的使用

3. 碰撞解决：

4. 性能：

5. 应用场景：

相关文章

赞助商

阅读排行