首页 > 数据库 >003-Redis 中的 BitMaps

003-Redis 中的 BitMaps

时间:2022-10-03 23:55:31浏览次数:93  
标签:end string Redis BitMaps 003 start key 字符串 bit

1. BitMap

1.1 bitcount

1.1.1 基本信息

BITCOUNT key [start end]

summary: Count set bits in a string

since: 2.6.0

Count the number of set bits (population counting) in a string. 计算字符串中的位数。

By default all the bytes contained in the string are examined. 默认情况下,将检查字符串中包含的所有字节。

It is possible to specify the counting operation only in an interval passing the additional arguments start and end. 在传递附加参数 start 和 end 时,计算指定间隔内的位数。

Like for the GETRANGE command start and end can contain negative values in order to index bytes starting from the end of the string, where -1 is the last byte, -2 is the penultimate, and so forth. 与 GETRANGE 命令相似,start 和 end 可以包含负值,以便从字符串的末尾开始索引字节,其中 -1是最后一个字节,-2是倒数第二个字节,依此类推。

Non-existent keys are treated as empty strings, so the command will return zero. 不存在的键被视为空字符串,因此该命令将返回零。

By default, the additional arguments start and end specify a byte index. We can use an additional argument BIT to specify a bit index. So 0 is the first bit, 1 is the second bit, and so forth. For negative values, -1 is the last bit, -2 is the penultimate, and so forth.

默认情况下,附加参数 start 和 end 指定一个字节索引。我们可以使用一个额外的参数 BIT 来指定位索引。0是第一位,1是第二位,依此类推。对于负值,-1是最后一位,-2是倒数第二位,依此类推。

1.1.2 返回

  • bits 集合的位数

1.1.3 练习

127.0.0.1:6379> set key1 "test"
OK
127.0.0.1:6379> bitcount key1
(integer) 17
127.0.0.1:6379> bitcount key1 1 1
(integer) 4
127.0.0.1:6379> bitcount key1 0 0
(integer) 4

1.1.4 经典使用-real-time metrics using bitmaps

Bitmaps are a very space-efficient representation of certain kinds of information. 位图是某些类型信息的一种非常节省空间的表示形式。

One example is a Web application that needs the history of user visits, so that for instance it is possible to determine what users are good targets of beta features.

如需要用户访问历史的 Web 应用程序,可以确定哪些用户是 beta 特性的好目标。

Using the SETBIT command this is trivial to accomplish, identifying every day with a small progressive integer. For instance day 0 is the first day the application was put online, day 1 the next day, and so forth.

使用 SETBIT 命令很容易完成,每天用一个小的渐进整数标识。例如,第0天是应用程序上线的第一天,第二天是第1天,以此类推。

Every time a user performs a page view, the application can register that in the current day the user visited the web site using the SETBIT command setting the bit corresponding to the current day.

每次用户访问页面时,应用程序可以使用 SETBIT 命令设置与当前日期相对应的位来标识用户在当前日期访问的网站。

Later it will be trivial to know the number of single days the user visited the web site simply calling the BITCOUNT command against the bitmap.

稍后,只要简单地对位图调用 BITCOUNT 命令,就可以知道用户访问网站的天数。

Performance considerations

In the above example of counting days, even after 10 years the application is online we still have just 365*10 bits of data per user, that is just 456 bytes per user. With this amount of data BITCOUNT is still as fast as any other O(1) Redis command like GET or INCR.

在上面计算天数的例子中,即使在应用程序上线10年后,每个用户仍然只有365 * 10位数据,也就是每个用户只有456字节。有了这样的数据量,BITCOUNT 仍然和其他任何 O (1) 的 Redis 命令(如 GET 或 INCR)一样快。

When the bitmap is big, there are two alternatives:

当位图很大时,有两种选择:

  • Taking a separated key that is incremented every time the bitmap is modified. This can be very efficient and atomic using a small Redis Lua script.

    • 获取每次修改位图时递增的分隔键。使用一个小的 RedisLua 脚本可以实现非常高效的原子化操作。
  • Running the bitmap incrementally using the BITCOUNT start and end optional parameters, accumulating the results client-side, and optionally caching the result into a key.

    • 使用 BITCOUNT 开始和结束可选参数递增地运行位图,在客户端累积结果,并可选地将结果缓存到一个键中。

1.2 bitop

1.2.1 基本信息

BITOP operation destkey key [key ...]

summary: Perform bitwise operations between strings

since: 2.6.0

Perform a bitwise operation between multiple keys (containing string values) and store the result in the destination key. 在多个键之间执行位操作(包含字符串值)并将结果存储在目标键中。

The BITOP command supports four bitwise operations: AND, OR, XOR and NOT, thus the valid forms to call the command are:

BITOP 命令支持四个按位操作: AND、 OR、 XOR 和 NOT,因此调用该命令的有效形式是:

  • BITOP AND destkey srckey1 srckey2 srckey3 ... srckeyN
  • BITOP OR destkey srckey1 srckey2 srckey3 ... srckeyN
  • BITOP XOR destkey srckey1 srckey2 srckey3 ... srckeyN
  • BITOP NOT destkey srckey

As you can see NOT is special as it only takes an input key, because it performs inversion of bits so it only makes sense as a unary operator.

正如你所看到的,NOT 是特殊的,因为它只接受一个输入键,因为它执行位的倒置,所以它只有作为一元运算符才有意义。

The result of the operation is always stored at destkey.

操作的结果总是存储在 destkey 中。

When an operation is performed between strings having different lengths, all the strings shorter than the longest string in the set are treated as if they were zero-padded up to the length of the longest string.

当在具有不同长度的字符串之间执行操作时,集合中所有短于最长字符串的字符串都被视为零填充到最长字符串的长度

The same holds true for non-existent keys, that are considered as a stream of zero bytes up to the length of the longest string.

对于不存在的键也是如此,它们被认为是零字节流,直到最长字符串的长度为止。

1.2.2 返回

  • 存储在目标键中的字符串的大小,等于最长输入字符串的大小

1.2.3 练习

127.0.0.1:6379> bitop and key3 key1 key2
(integer) 5
127.0.0.1:6379> bitop or key4 key1 key2
(integer) 5
127.0.0.1:6379> bitop xor key5 key1 key2
(integer) 5

1.2.4 经典使用-real time metrics using bitmaps

BITOP is a good complement to the pattern documented in the BITCOUNT command documentation. Different bitmaps can be combined in order to obtain a target bitmap where the population counting operation is performed.

BITOP 是对 BITCOUNT 命令文档中记录的模式的良好补充。可以组合不同的位图,以获得执行计数操作的目标位图。

Performance considerations

BITOP is a potentially slow command as it runs in O(N) time. Care should be taken when running it against long input strings.

BITOP 是一个潜在的慢命令,因为它在 O (N)时间内运行。在针对长输入字符串运行它时应该小心。

For real-time metrics and statistics involving large inputs a good approach is to use a replica (with read-only option disabled) where the bit-wise operations are performed to avoid blocking the master instance.

对于涉及大量输入的实时度量和统计信息,一种好的方法是使用一个副本(禁用只读选项) ,其中执行位操作以避免阻塞主实例。

1.3 bitpos

1.3.1 基本信息

BITPOS key bit [start] [end]

summary: Find first bit set or clear in a string

since: 2.8.7

Return the position of the first bit set to 1 or 0 in a string. 返回字符串中第一个1或0的位置。

The position is returned, thinking of the string as an array of bits from left to right, where the first byte's most significant bit is at position 0, the second byte's most significant bit is at position 8, and so forth.

返回位置,将字符串想象成一个从左到右的位数组,其中第一个字节的最高有效位位于位置0,第二个字节的最高有效位位于位置8,依此类推。

The same bit position convention is followed by GETBIT and SETBIT.

GETBIT 和 SETBIT 遵循相同的位位置约定。

By default, all the bytes contained in the string are examined. It is possible to look for bits only in a specified interval passing the additional arguments start and end (it is possible to just pass start, the operation will assume that the end is the last byte of the string. However there are semantic differences as explained later). By default, the range is interpreted as a range of bytes and not a range of bits, so start=0 and end=2 means to look at the first three bytes.

默认情况下,将检查字符串中包含的所有字节。传递附加参数 start 和 end (可以只传递 start,操作将假定 end 是字符串的最后一个字节时,在指定范围内查找。然而,正如后面解释的那样,存在语义上的差异)。默认情况下,该范围被解释为一个字节范围,而不是一个位范围,因此 start = 0和 end = 2表示查看前三个字节。

You can use the optional BIT modifier to specify that the range should be interpreted as a range of bits. So start=0 and end=2 means to look at the first three bits.

您可以使用可选的 BIT 修饰符来指定应该将范围解释为一个位范围。Start = 0 and end = 2表示查看前三位。

Note that bit positions are returned always as absolute values starting from bit zero even when start and end are used to specify a range.

请注意,即使在开始和结束用于指定范围时,位位置始终作为从位零开始的绝对值返回。

Like for the GETRANGE command start and end can contain negative values in order to index bytes starting from the end of the string, where -1 is the last byte, -2 is the penultimate, and so forth. When BIT is specified, -1 is the last bit, -2 is the penultimate, and so forth.

与 GETRANGE 命令相似,start 和 end 可以包含负值,以便从字符串的末尾开始索引字节,其中 -1是最后一个字节,-2是倒数第二个字节,依此类推。如果指定了 BIT,则 -1为最后一位,-2为倒数第二位,依此类推。

Non-existent keys are treated as empty strings.

不存在的键被视为空字符串。

1.3.2 返回

  • 根据请求返回设置为1或0的第一个位的位置。
  • 如果查找位(位参数为1) ,并且字符串为空或仅由零个字节组成,则返回 -1。
  • 如果查找位(位参数为0) ,并且字符串只包含设置为1的位,那么函数将返回第一个位,而不是右边字符串的一部分。因此,如果字符串设置为值0xff 的三个字节,那么命令 BITPOS key 0将返回24,因为到位23为止,所有位都是1。
  • 如果查找位并指定没有范围或只指定开始参数,那么该函数将字符串的右边用零填充。
  • 如果在指定的范围内没有找到位,则当用户指定了明确的范围并且该范围内没有0位时,函数返回 -1。

1.3.3 练习

127.0.0.1:6379> bitpos key2 1
(integer) 1
127.0.0.1:6379> bitpos key2 0
(integer) 0
127.0.0.1:6379> bitpos key2 0 3 4
(integer) 24

1.4 getbit

1.4.1 基本信息

GETBIT key offset

summary: Returns the bit value at offset in the string value stored at key

since: 2.2.0

Returns the bit value at offset in the string value stored at key.

返回字符串中偏移量处的位值。

When offset is beyond the string length, the string is assumed to be a contiguous space with 0 bits. When key does not exist it is assumed to be an empty string, so offset is always out of range and the value is also assumed to be a contiguous space with 0 bits.

当偏移量超过字符串长度时,假定字符串是一个0位的连续空间。当键不存在时,它被假定为一个空字符串,因此偏移量总是超出范围,并且值也被假定为一个0位的连续空间。

1.4.2 返回

  • 存储在偏移量处的位值

1.4.3 练习

127.0.0.1:6379> getbit key2 2
(integer) 1
127.0.0.1:6379> getbit key2 100
(integer) 0

1.5 setbit

1.5.1 基本信息

SETBIT key offset value

summary: Sets or clears the bit at offset in the string value stored at key

since: 2.2.0

Sets or clears the bit at offset in the string value stored at key. 设置或清除存储在字符串偏移量处的位。

The bit is either set or cleared depending on value, which can be either 0 or 1.

根据 bit (可以是0或1)设置或清除位。

When key does not exist, a new string value is created. The string is grown to make sure it can hold a bit at offset. The offset argument is required to be greater than or equal to 0, and smaller than 2^32 (this limits bitmaps to 512MB). When the string at key is grown, added bits are set to 0.

当键不存在时,将创建一个新的字符串值。增大字符串以确保它可以保持偏移位。偏移量参数必须大于或等于0,小于2 ^ 32(这将位映射限制为512MB)。当键处的字符串增长时,添加的位被设置为0。

1.5.2 返回

  • 存储在偏移量处的原始位值

1.5.3 练习

127.0.0.1:6379> setbit bitkey 0 1
(integer) 0
127.0.0.1:6379> getbit bitkey 0
(integer) 1

1.5.4 经典使用-accessing the entire bitmap

There are cases when you need to set all the bits of single bitmap at once, for example when initializing it to a default non-zero value. It is possible to do this with multiple calls to the SETBIT command, one for each bit that needs to be set. However, so as an optimization you can use a single SET command to set the entire bitmap.

在某些情况下,您需要一次设置单个位图的所有位,例如,将其初始化为默认的非零值时。可以通过对 SETBIT 命令的多次调用来实现这一点,每个调用对应一个需要设置的位。但是,作为一种优化,您可以使用一个 SET 命令来设置整个位图。

Bitmaps are not an actual data type, but a set of bit-oriented operations defined on the String type (for more information refer to the Bitmaps section of the Data Types Introduction page). This means that bitmaps can be used with string commands, and most importantly with SET and GET.

位图不是实际的数据类型,而是在 String 类型上定义的一组面向位的操作。这意味着位图可以与字符串命令一起使用,最重要的是与 SET 和 GET 一起使用。

Because Redis' strings are binary-safe, a bitmap is trivially encoded as a bytes stream. The first byte of the string corresponds to offsets 0..7 of the bitmap, the second byte to the 8..15 range, and so forth.

因为 Redis 的字符串是二进制安全的,所以位图被简单地编码为字节流。字符串的第一个字节对应于偏移量0..7的位图,第二个字节的8..15范围,等等。

1.5.5 经典使用-setting multiple bits

SETBIT excels at setting single bits, and can be called several times when multiple bits need to be set. To optimize this operation you can replace multiple SETBIT calls with a single call to the variadic BITFIELD command and the use of fields of type u1.

SETBIT 擅长设置单个位,在需要设置多个位时可以多次调用。要优化这个操作,您可以用一个对可变 BITFIELD 命令的调用替换多个 SETBIT 调用,并使用 u1类型的字段。

127.0.0.1:6379> BITFIELD bitsinabitmap SET u1 2 1 SET u1 3 1 SET u1 5 1 SET u1 10 1 SET u1 11 1 SET u1 14 1
1) (integer) 0
2) (integer) 0
3) (integer) 0
4) (integer) 0
5) (integer) 0
6) (integer) 0

1.5.6 经典使用-accessing bitmap ranges

It is also possible to use the GETRANGE and SETRANGE string commands to efficiently access a range of bit offsets in a bitmap. Below is a sample implementation in idiomatic Redis Lua scripting that can be run with the EVAL command:

还可以使用 GETRANGE 和 SETRANGE 字符串命令有效地访问位图中的位偏移量范围。下面是惯用的 Redis Lua 脚本的一个示例实现,可以使用 EVAL 命令运行它:

--[[
Sets a bitmap range

Bitmaps are stored as Strings in Redis. A range spans one or more bytes,
so we can call [`SETRANGE`](/commands/setrange) when entire bytes need to be set instead of flipping
individual bits. Also, to avoid multiple internal memory allocations in
Redis, we traverse in reverse.
Expected input:
  KEYS[1] - bitfield key
  ARGV[1] - start offset (0-based, inclusive)
  ARGV[2] - end offset (same, should be bigger than start, no error checking)
  ARGV[3] - value (should be 0 or 1, no error checking)
]]--

-- A helper function to stringify a binary string to semi-binary format
local function tobits(str)
  local r = ''
  for i = 1, string.len(str) do
    local c = string.byte(str, i)
    local b = ' '
    for j = 0, 7 do
      b = tostring(bit.band(c, 1)) .. b
      c = bit.rshift(c, 1)
    end
    r = r .. b
  end
  return r
end

-- Main
local k = KEYS[1]
local s, e, v = tonumber(ARGV[1]), tonumber(ARGV[2]), tonumber(ARGV[3])

-- First treat the dangling bits in the last byte
local ms, me = s % 8, (e + 1) % 8
if me > 0 then
  local t = math.max(e - me + 1, s)
  for i = e, t, -1 do
    redis.call('SETBIT', k, i, v)
  end
  e = t
end

-- Then the danglings in the first byte
if ms > 0 then
  local t = math.min(s - ms + 7, e)
  for i = s, t, 1 do
    redis.call('SETBIT', k, i, v)
  end
  s = t + 1
end

-- Set a range accordingly, if at all
local rs, re = s / 8, (e + 1) / 8
local rl = re - rs
if rl > 0 then
  local b = '\255'
  if 0 == v then
    b = '\0'
  end
  redis.call('SETRANGE', k, rs, string.rep(b, rl))
end

2. REDIS BITMAPS – FAST, EASY, REALTIME METRICS

以下内容阅读自Redis BitMap

BitMap 是由0和1组成的数组。位图中的位可以设置为0或1,数组中的每个位置都被称为偏移量。位图可以进行 AND、OR、XOR 或其他位运算操作。

2.1 Population Count

位图可以使用设置为1的位数来进行统计计数。

img

2.2 Redis 中的 BitMaps

Redis 允许设置二进制键和二进制值。位图只是二进制值。Setbit (键、偏移量、值)操作需要 O (1)时间,将给定键的指定偏移量处的位的值设置为0或1。

2.2.1 日活用户

假设计算每天登录的去重后用户,使用位图来构建数据结构,其中用偏移量标识每个用户,当用户访问一个页面或执行某个操作时,将代表用户ID的偏移量处的位设置为1。位图的键是用户执行的操作和时间戳。

img

在这个例子中,每次用户登录时,都要执行一个 redis.setbit (daily _ active _ users,user _ id,1)。这将 daily _ active _ users 位图中的适当偏移量翻转为1。这是一个 O (1)运算。键是 20220818,值为01011111。

由于日常活跃用户每天都会发生变化,需要每天创建一个新的位图。通过简单地将日期追加到位图键来实现这一点。

  • 如果想要计算在给定的一天内在一个音乐应用程序中至少播放了1首歌曲的每日独立用户,可以设置要播放的键名: yyyy-mm-dd。
  • 如果想要计算每小时播放一首歌曲的唯一用户数,可以命名播放的键名: yyyy-mm-dd-hh。

要计算每周或每月的指标,可以简单地计算所有每日位图在一周或一个月的联合,然后计算得到的位图的计数。

2.2.2 优化

  • 在 Redis 缓存计算出来的每日、每周、每月的数字,从而优化每周和每月的计算

  • 缓存的另一个好处是,它允许快速的队列分析

    • 例如每周唯一用户,也是移动用户ーー移动用户位图与每周活跃用户位图的交集。
    • 如果想计算过去 n 天内滚动的唯一用户数,那么缓存每日唯一计数就很容易,只需从缓存中获取过去 n-1天的数据,并将其与实时每日计数结合起来

2.2.3 样例代码

// 计算给定用户操作和日期的去重后用户。
import redis.clients.jedis.Jedis;
import java.util.BitSet;
...
  Jedis redis = new Jedis("localhost");
...
  public int uniqueCount(String action, String date) {
    String key = action + ":" + date;
    BitSet users = BitSet.valueOf(redis.get(key.getBytes()));
    return users.cardinality();
  }
// 计算给定用户操作的去重用户和日期列表
import redis.clients.jedis.Jedis;
import java.util.BitSet;
...
  Jedis redis = new Jedis("localhost");
...
  public int uniqueCount(String action, String... dates) {
    BitSet all = new BitSet();
    for (String date : dates) {
      String key = action + ":" + date;
      BitSet users = BitSet.valueOf(redis.get(key.getBytes()));
      all.or(users);
    }
    return all.cardinality();
  }

标签:end,string,Redis,BitMaps,003,start,key,字符串,bit
From: https://www.cnblogs.com/autumncat/p/16751607.html

相关文章

  • Redis学习目录
     目录持续更新...​​Redis简介​​​​Redis安装及基本配置​​​​Redis持久化​​​​Redis开发及管理实战​​​​Redis高可用及集群​​​​Redis多API开发​​ ......
  • Docker 启动 Redis 就停止解决方案(2022-3)
    启动命令如下:dockerrun-itd\-p6379:6379\--namemyredis\-v/home/redis/redis.conf:/etc/redis/redis.conf\-v/home/redis/data:/data\redis:latest\re......
  • 002-Redis的 String 使用
    Redis命令十分丰富,包括的命令组有Cluster、Connection、Geo、Hashes、HyperLogLog、Keys、Lists、Pub/Sub、Scripting、Server、Sets、SortedSets、Strings、Transactions......
  • Ipyton使用笔记[1003]
    第一次使用:字符串操作   In[1]:lst=[11,12,13,7,1,3,2,2,5,6,10,7]In[2]:lstOut[2]:[11,12,13,7,1,3,2,2,5,6,10,7]In[3]:lst1=[11,12,13,......
  • 总结1003
    ##用户交互交互的本质就是输入、输出关键字inputprint或者是output##格式化输出关键字占位符%s%d特殊方法\n\a等不需要使特殊符号起作用是前面加r##算术......
  • redis 允许远程访问
    1.取消绑定本地地址找到redis配置文件,redis.conf,注释掉指定的bind,当不指定时表示允许所有访问。   2.关闭保护模式在redis服务器上使用redis-cli,执行命令C......
  • Redis高性能怎么做到的?
    Redis的高性能怎么做到的?Redis这个NOSQL数据库在计算机界可谓是无人不知,无人不晓。只要涉及到数据那么就需要数据库,数据库类型很多,但是NOSQL的kv内存数据库也很多,redis作为......
  • Redis入门(四):springboot整合redis
    案例一​​demo​​​为​​chnx/springboot/redis01​​创建springboot项目,导入redis依赖<dependency><groupId>org.springframework.boot</groupId><artifactId>s......
  • Redis6
    Redis数据结构redis-cli-hhost-pport-apasswordhost:远程redis服务器hostport:远程redis服务端口password:远程redis服务密码(无密码的的话就不需要-a参数了)......
  • redis 缓存的模式
    一:读1:缓存边缘化(cacheaside)应用程序先读取缓存,如果缓存没有,再去读数据库,然后更新缓存   2:通读(Read-through)在上面的基础上抽象一层缓存层,让缓存层去读缓存数据......