
coca flex (variable length) queries

Date: 2024-10-12 17:00:55
Tags: searches flex NOUN money away PUT length words variable

 

LIST display: flex (variable length) queries

You can now do searches where there are a variable number of "slots". For example, the search:

PUT (NOUN){3} away

would find strings with PUT at the beginning and away at the end, with zero to three words between. If there are any words in between, at least one of them has to be a NOUN. In other words, it would do the following seven searches, one right after another, and would then display the results for all of the searches on one page.

     Searches (done one right after another)    Matching strings
  1  PUT away                                   put away (no words in between)
  2  PUT NOUN away                              put toys away
  3  PUT * NOUN away                            put the toys away
  4  PUT NOUN * away                            put toys far away
  5  PUT * * NOUN away                          put the fun toys away
  6  PUT * NOUN * away                          put the toys far away
  7  PUT NOUN * * away                          put toys and crayons away
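
The expansion of one flex slot into separate searches can be sketched in a few lines of Python. This is only an illustration of the table above, not the corpus site's actual implementation. The function name expand_flex and its arguments are made up for this example, and the handling of a wildcard slot such as (*){3} (one pattern per length) is an assumption.

```python
def expand_flex(before, tag, n, after):
    """List the concrete searches that a flex slot (tag){n} stands for.

    before / after: lists of obligatory slots around the flex slot.
    tag: the term inside the parentheses, e.g. "NOUN" or "*".
    n: maximum length of the flex slot (1, 2, or 3).
    """
    if n not in (1, 2, 3):
        raise ValueError("n must be 1, 2, or 3")
    searches = [before + after]  # zero words in the flex slot
    for length in range(1, n + 1):
        # Put the tag in each position, rightmost first, and fill the
        # remaining positions with the wildcard *.
        for pos in range(length - 1, -1, -1):
            slot = ["*"] * length
            slot[pos] = tag
            searches.append(before + slot + after)
    # Assumption: when tag is "*", identical patterns collapse into one.
    return list(dict.fromkeys(" ".join(s) for s in searches))


for i, search in enumerate(expand_flex(["PUT"], "NOUN", 3, ["away"]), 1):
    print(i, search)
```

For PUT (NOUN){3} away this prints the seven searches in the same order as the table.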

In terms of search syntax, note that:

1. {n} indicates the number of words (0 to n) that can be in this "variable length" string. Valid numbers are 1, 2, or 3 (in other words, the longest variable length string is three words).

2. If you don't indicate {n}, for example (NOUN), then the string is at most one word long, meaning that it will be either that one word or nothing.

3. Any "slot" without parentheses around it is obligatory. For example, put * away would not match put away, since * doesn't have parentheses around it.

4. You can't include multiple "flex" operators in a search. For example, they (VERB+){2} notice (NOUN){3} would not be possible.
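
The four syntax rules above can be checked mechanically. The following Python sketch is a rough approximation and not the parser used by the corpus site. The names parse_query and is_valid_query are invented for this example, and it assumes that slots are separated by spaces and that any slot in parentheses counts as a "flex" operator.

```python
import re

# An optional slot looks like "(TERM)" or "(TERM){n}".
FLEX_SLOT = re.compile(r"^\(([^()\s]+)\)(?:\{(\d+)\})?$")


def parse_query(query):
    """Split a query into slots, each with a minimum and maximum length."""
    slots = []
    for token in query.split():
        m = FLEX_SLOT.match(token)
        if m is None:
            if any(ch in token for ch in "(){}"):
                raise ValueError(f"malformed slot: {token!r}")
            # Rule 3: a slot without parentheses is obligatory.
            slots.append({"term": token, "min": 1, "max": 1})
            continue
        # Rule 2: without {n}, the slot is one word or nothing.
        n = int(m.group(2)) if m.group(2) else 1
        # Rule 1: valid numbers are 1, 2, or 3.
        if n not in (1, 2, 3):
            raise ValueError(f"{{n}} must be 1, 2, or 3, got {n}")
        slots.append({"term": m.group(1), "min": 0, "max": n})
    # Rule 4: at most one flex operator per search.
    if sum(1 for s in slots if s["min"] == 0) > 1:
        raise ValueError("only one flex operator is allowed per search")
    return slots


def is_valid_query(query):
    """Return True if the query passes all four rules."""
    try:
        parse_query(query)
    except ValueError:
        return False
    return True


print(parse_query("PUT (NOUN){3} away"))
print(is_valid_query("they (VERB+){2} notice (NOUN){3}"))
```

The second call reports the query from rule 4 as invalid, because it contains two flex operators.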

The following are some additional searches. They produce interesting results in the one billion word COCA corpus, but the results in other corpora may not be as good. In each case, we show a few sample matching strings, and some strings that would not be generated by the search (and why not).

Sample search: might (*) know
  What WOULD be matched:
    might know
    might never know
  What would NOT be matched:
    might never really know (without {}, matches at most one word)

Sample search: was (really) interesting
  What WOULD be matched:
    was interesting (really is optional)
    was really interesting
  What would NOT be matched:
    was very interesting (not really)
    was not really interesting (too many words)

Sample search: BE (NEG) worried
  What WOULD be matched:
    is worried (NEG is optional)
    are n't worried
  What would NOT be matched:
    is really worried (not NEG)
    is n't so worried (two words, search is max of 1)

Sample search: made (*){3} money
  What WOULD be matched:
    made more money ({3} means 0-3 words)
    made a lot more money (max of 3 words)
  What would NOT be matched:
    made quite a bit of money (4 words; max of 3)

Sample search: take * (NOUN){2} away
  What WOULD be matched:
    take it away (it from *, which is not optional; no other words from {2}, since 0-2 words)
    take the money away (the from *, money (one slot) from {2})
    take even more money away (even from *, more money (two slots) from {2})
  What would NOT be matched:
    take away (* forces at least one word)
    take it quickly away (no NOUN)
    take even more easy money away (more easy money = 3 words)

Sample search: (VERB+){3} NOTICE_v
  What WOULD be matched:
    was noticing
    had never even noticed (VERB+ matches any verb, including do, be, have; VERB is only lexical verbs)
  What would NOT be matched:
    sometimes notice (no VERB+)
    had never even ever noticed (4 words; max of 3)
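
The matched and not-matched examples above can be reproduced with a small matcher. The Python sketch below is a toy and does not reflect how the corpus actually evaluates queries. The POS and LEMMA dictionaries are a hypothetical mini-lexicon made up for this example (a real corpus has lemma and part-of-speech annotation for every word), and flex_match and word_matches are invented names.

```python
# Hypothetical mini-lexicon, for illustration only.
POS = {"NOUN": {"money", "toys"}, "NEG": {"n't", "not"}}
LEMMA = {"BE": {"am", "is", "are", "was", "were", "be", "been", "being"}}


def word_matches(term, word):
    """Check one word against one search term."""
    if term == "*":
        return True                       # wildcard: any word
    if term in POS:
        return word in POS[term]          # part-of-speech tag
    if term in LEMMA:
        return word in LEMMA[term]        # all forms of a lemma
    return term.lower() == word.lower()   # literal word form


def flex_match(before, flex_term, n, after, words):
    """True if words = before + (0 to n words, one matching flex_term) + after."""
    k = len(words) - len(before) - len(after)  # words left for the flex slot
    if not 0 <= k <= n:
        return False                      # too few or too many words
    split = len(before)
    head, middle, tail = words[:split], words[split:split + k], words[split + k:]
    fixed_ok = all(word_matches(t, w) for t, w in zip(before + after, head + tail))
    # An empty flex slot is fine; otherwise one word must match flex_term.
    flex_ok = k == 0 or any(word_matches(flex_term, w) for w in middle)
    return fixed_ok and flex_ok


# take * (NOUN){2} away
for text in ["take it away", "take even more money away", "take it quickly away"]:
    print(text, flex_match(["take", "*"], "NOUN", 2, ["away"], text.split()))
```

With this toy lexicon, the first two strings match and take it quickly away does not, as in the examples for take * (NOUN){2} away.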

 Some additional notes:

1. Because a "flex search" can involve up to seven different searches (see above), there are some limits on the number of flex searches in a given 24 hour period. For those who do not have a premium or academic license, there is a limit of five flex searches in 24 hours. Those who do have a license can do up to 50 flex searches in a 24 hour period.

2. Again, because of the number of searches that are done in a flex search, these searches take a long time if all of the "slots" are high frequency. This can be a real limitation in very large corpora like NOW (19+ billion words) or iWeb (14 billion words). So a search like HAVE (ADJ){3} time probably won't work in those corpora, because HAVE and time are both too frequent. In a case like this, you will probably need to do these as a series of separate searches: HAVE time, HAVE * time, HAVE * ADJ time, etc. But again, this should not be a problem with a small corpus like the BNC.

 

From: https://www.cnblogs.com/hhdom/p/18460885
