
A Go library implementing an FST (finite state transducer) (bookmarked)

Posted: 2023-07-04 18:32:01

https://github.com/couchbaselabs/vellum

Building an FST

To build an FST, create a new builder using the New() method. This method takes an io.Writer as an argument. As the FST is being built, data will be streamed to the writer as soon as possible. With this builder you MUST insert keys in lexicographic order. Inserting keys out of order will result in an error. After inserting the last key into the builder, you MUST call Close() on the builder. This will flush all remaining data to the underlying writer.

In memory:

var buf bytes.Buffer
builder, err := vellum.New(&buf, nil)
if err != nil {
  log.Fatal(err)
}

To disk:

f, err := os.Create("/tmp/vellum.fst")
if err != nil {
  log.Fatal(err)
}
builder, err := vellum.New(f, nil)
if err != nil {
  log.Fatal(err)
}

MUST insert keys in lexicographic order:

err = builder.Insert([]byte("cat"), 1)
if err != nil {
  log.Fatal(err)
}

err = builder.Insert([]byte("dog"), 2)
if err != nil {
  log.Fatal(err)
}

err = builder.Insert([]byte("fish"), 3)
if err != nil {
  log.Fatal(err)
}

err = builder.Close()
if err != nil {
  log.Fatal(err)
}
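
Because the builder rejects out-of-order keys, input that arrives unsorted has to be sorted before the first Insert() call. The following is a minimal sketch of that preparation step using only the standard library. The entry type and the sortEntries helper are hypothetical names introduced for illustration and are not part of vellum. The sketch assumes that "lexicographic order" means byte-wise order, which is what bytes.Compare implements.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// entry is a hypothetical key/value pair waiting to be inserted into a builder.
type entry struct {
	key []byte
	val uint64
}

// sortEntries orders entries by the raw bytes of their keys (bytes.Compare),
// so they can then be fed to an order-sensitive builder one by one.
func sortEntries(entries []entry) {
	sort.Slice(entries, func(i, j int) bool {
		return bytes.Compare(entries[i].key, entries[j].key) < 0
	})
}

func main() {
	entries := []entry{
		{key: []byte("fish"), val: 3},
		{key: []byte("cat"), val: 1},
		{key: []byte("dog"), val: 2},
	}
	sortEntries(entries)
	for _, e := range entries {
		fmt.Printf("%s=%d\n", e.key, e.val)
	}
	// Byte-wise order differs from dictionary order: uppercase ASCII letters
	// sort before lowercase ones, so "Zebra" comes before "apple".
	fmt.Println(bytes.Compare([]byte("Zebra"), []byte("apple")) < 0)
}
```

After sorting, each entry can be passed to builder.Insert() in slice order, exactly as in the cat/dog/fish sequence above.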

Using an FST

After closing the builder, the data can be used to instantiate an FST. If the data was written to disk, you can use the Open() method to mmap the file. If the data is already in memory, or you wish to load/mmap the data yourself, you can instantiate the FST with the Load() method.

Load in memory:

fst, err := vellum.Load(buf.Bytes())
if err != nil {
  log.Fatal(err)
}

Open from disk:

fst, err := vellum.Open("/tmp/vellum.fst")
if err != nil {
  log.Fatal(err)
}

Get key/value:

// val and exists are new variables here, so use := rather than =.
val, exists, err := fst.Get([]byte("dog"))
if err != nil {
  log.Fatal(err)
}
if exists {
  fmt.Printf("contains dog with val: %d\n", val)
} else {
  fmt.Printf("does not contain dog\n")
}

Iterate key/values:

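itr, err := fst.Iterator(startKeyInclusive, endKeyExclusive)
for err == nil {
  key, val := itr.Current()
  fmt.Printf("contains key: %s val: %d\n", key, val)
  err = itr.Next()
}
// The loop ends with vellum.ErrIteratorDone once the range is exhausted;
// only any other error is a real failure.
if err != vellum.ErrIteratorDone {
  log.Fatal(err)
}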
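
The parameter names in the iterator call indicate a half-open range: the start key is included and the end key is excluded. The sketch below illustrates those semantics on a plain sorted slice with the standard library. It does not use vellum, and keysInRange is a hypothetical helper written only for this illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// keysInRange returns the keys k from a sorted slice with start <= k < end,
// mirroring an inclusive start bound and an exclusive end bound.
func keysInRange(sorted []string, start, end string) []string {
	lo := sort.SearchStrings(sorted, start) // first index with key >= start
	hi := sort.SearchStrings(sorted, end)   // first index with key >= end
	if hi < lo {
		return nil // end sorts before start: empty range
	}
	return sorted[lo:hi]
}

func main() {
	keys := []string{"cat", "dog", "fish"}
	fmt.Println(keysInRange(keys, "cat", "fish")) // "cat" included, "fish" excluded
	fmt.Println(keysInRange(keys, "d", "z"))      // bounds need not be existing keys
}
```

With the three keys from the earlier example, the range from "cat" to "fish" covers "cat" and "dog" only.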

How does the FST get built?

A full example of the implementation is beyond the scope of this README, but let's consider a small example where we want to insert 3 key/value pairs.

First we insert "are" with the value 4.

Next, we insert "ate" with the value 2.

Notice how the values associated with the transitions were adjusted so that by summing them while traversing we still get the expected value.
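
The adjustment of transition values can be sketched with a toy structure. The code below is not vellum's implementation: it is a plain trie (no state sharing) whose transitions carry outputs, written to show one way the "sum along the path" invariant can be maintained. All type and function names are hypothetical. When a new key shares a prefix with an existing one, each shared transition keeps only the smaller of its current output and the new value, and the excess is pushed down to the transitions that leave the next state.

```go
package main

import "fmt"

// node is one state of a toy transducer. It is a plain trie node, so unlike
// a real FST it never shares states between keys.
type node struct {
	next     map[byte]*node
	out      map[byte]uint64 // output attached to each outgoing transition
	final    bool
	finalOut uint64 // extra output added when a key ends at this state
}

func newNode() *node {
	return &node{next: map[byte]*node{}, out: map[byte]uint64{}}
}

func minU64(a, b uint64) uint64 {
	if a < b {
		return a
	}
	return b
}

// insert adds key with value val while keeping the invariant that the
// outputs along a key's path sum to that key's value.
func (root *node) insert(key string, val uint64) {
	n, i := root, 0
	for ; i < len(key); i++ { // walk the prefix shared with earlier keys
		c := key[i]
		child, ok := n.next[c]
		if !ok {
			break
		}
		common := minU64(n.out[c], val)
		excess := n.out[c] - common
		n.out[c] = common
		val -= common
		if excess > 0 { // push what no longer fits here one level down
			for b := range child.next {
				child.out[b] += excess
			}
			if child.final {
				child.finalOut += excess
			}
		}
		n = child
	}
	for ; i < len(key); i++ { // create the remaining suffix
		child := newNode()
		n.next[key[i]] = child
		n.out[key[i]] = val // the remaining value rides on the first new transition
		val = 0
		n = child
	}
	n.final = true
	n.finalOut = val
}

// get sums the outputs along key's path.
func (root *node) get(key string) (uint64, bool) {
	n := root
	var sum uint64
	for i := 0; i < len(key); i++ {
		child, ok := n.next[key[i]]
		if !ok {
			return 0, false
		}
		sum += n.out[key[i]]
		n = child
	}
	if !n.final {
		return 0, false
	}
	return sum + n.finalOut, true
}

func main() {
	root := newNode()
	root.insert("are", 4)
	root.insert("ate", 2)
	are, _ := root.get("are")
	ate, _ := root.get("ate")
	fmt.Println("output on 'a':", root.out['a'], "are:", are, "ate:", ate)
}
```

In this sketch, inserting "ate" lowers the output on the shared 'a' transition from 4 to 2 and moves the difference onto the 'r' transition, so "are" still sums to 4 while "ate" sums to 2.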

At this point, we see that state 5 looks like state 3, and state 4 looks like state 2. But, we cannot yet combine them because future inserts could change this.

Now, we insert "see" with value 3. Once it has been added, we know that states 5 and 4 can no longer change. Since they are identical to 3 and 2, we replace them.

Again, we see that states 7 and 8 appear to be identical to 2 and 3.

Having inserted our last key, we call Close() on the builder.

Now, states 7 and 8 can safely be replaced with 2 and 3.
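
The replacement of identical states can be illustrated with a simplified sketch. It differs from the incremental process described above in two ways, both assumptions made for brevity: it builds a complete trie first and merges states in a single bottom-up pass afterwards, and it ignores transition outputs (in a transducer, outputs would also have to match for two states to be considered identical). The names trieNode, minimize, and registry are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// trieNode is a state of an unminimized trie over the inserted keys.
type trieNode struct {
	next  map[byte]*trieNode
	final bool
}

func newTrieNode() *trieNode { return &trieNode{next: map[byte]*trieNode{}} }

func (root *trieNode) insert(key string) {
	n := root
	for i := 0; i < len(key); i++ {
		child, ok := n.next[key[i]]
		if !ok {
			child = newTrieNode()
			n.next[key[i]] = child
		}
		n = child
	}
	n.final = true
}

// countNodes returns the number of states before any merging.
func countNodes(n *trieNode) int {
	total := 1
	for _, child := range n.next {
		total += countNodes(child)
	}
	return total
}

// minimize assigns every state an ID, children first. Two states get the same
// ID when they agree on finality and on every (label, target ID) pair, which
// is the sense in which one state can replace another. registry maps each
// distinct state signature to its ID.
func minimize(n *trieNode, registry map[string]int) int {
	labels := make([]int, 0, len(n.next))
	for b := range n.next {
		labels = append(labels, int(b))
	}
	sort.Ints(labels) // fixed order so equal states produce equal signatures
	var sig strings.Builder
	fmt.Fprintf(&sig, "final=%t", n.final)
	for _, b := range labels {
		fmt.Fprintf(&sig, ";%d->%d", b, minimize(n.next[byte(b)], registry))
	}
	if id, ok := registry[sig.String()]; ok {
		return id // an identical state already exists: reuse it
	}
	id := len(registry)
	registry[sig.String()] = id
	return id
}

func main() {
	root := newTrieNode()
	for _, k := range []string{"are", "ate", "see"} {
		root.insert(k)
	}
	registry := map[string]int{}
	minimize(root, registry)
	fmt.Printf("trie states: %d, after merging: %d\n", countNodes(root), len(registry))
}
```

For the keys "are", "ate", and "see", the trie has 9 states and merging leaves 5, because the three keys share their final state and the state that reaches it on 'e'.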

For additional information, see the references listed in the vellum README in the repository linked above.

From: https://blog.51cto.com/u_11908275/6624117
