extractWithoutOrder
- Select the best match in a list or dictionary of choices.
- Finds the best matches in a list or dictionary of choices and returns a generator of tuples, each containing the match and its score.
- If a dictionary is used, also returns the key for each match.
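The generator behaviour described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: `ratio_score` is a hypothetical stand-in scorer built on the standard library's `difflib`, and `extract_without_order_sketch` is an illustrative name.

```python
from difflib import SequenceMatcher


def ratio_score(a, b):
    """Stand-in scorer (an assumption): similarity scaled to 0-100."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def extract_without_order_sketch(query, choices, scorer=ratio_score):
    """Lazily yield (choice, score), or (choice, score, key) for a dictionary."""
    if isinstance(choices, dict):
        for key, choice in choices.items():
            yield (choice, scorer(query, choice), key)
    else:
        for choice in choices:
            yield (choice, scorer(query, choice))


teams = {"nyg": "new york giants", "dal": "dallas cowboys"}
# Results come back in the order of the choices, not sorted by score.
results = list(extract_without_order_sketch("new york giants", teams))
print(results[0])  # ('new york giants', 100, 'nyg')
```

Because nothing is scored until the generator is consumed, a caller can stop early or filter results without building the full list.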
extract
- Select the best match in a list or dictionary of choices.
- Finds the best matches in a list or dictionary of choices and returns a list of tuples, each containing the match and its score.
- If a dictionary is used, also returns the key for each match.
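A rough sketch of the list-returning variant follows. It assumes the results are ordered best-first and that an optional `limit` caps how many are returned. Both `limit` and the `difflib`-based `ratio_score` scorer are assumptions made for illustration, not details confirmed by the text above.

```python
from difflib import SequenceMatcher


def ratio_score(a, b):
    """Stand-in scorer (an assumption): similarity scaled to 0-100."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def extract_sketch(query, choices, scorer=ratio_score, limit=None):
    """Return scored choices as a list, highest score first."""
    if isinstance(choices, dict):
        scored = [(c, scorer(query, c), k) for k, c in choices.items()]
    else:
        scored = [(c, scorer(query, c)) for c in choices]
    # Stable sort: equal scores keep their original relative order.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored if limit is None else scored[:limit]


matches = extract_sketch("apple", ["orange", "apply", "apple"], limit=2)
print(matches[0])  # ('apple', 100)
```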
extractBests
- Get a list of the best matches to a collection of choices.
- Convenience function for getting the choices with the best scores.
extractOne
- Parameters: (query, choices, processor=default_processor, scorer=default_scorer, score_cutoff=0)
- Find the single best match above a score in a list of choices.
- This is a convenience method which returns the single best choice. See extract() for the full arguments list.
- query: A string to match against.
- choices: A list or dictionary of choices, suitable for use with extract().
- processor: Optional function for transforming choices before matching. See extract().
- scorer: Scoring function for extract().
- score_cutoff: Optional score threshold. If a best match is found but its score is not greater than this number, None is returned anyway ("not a good enough match"). Defaults to 0.
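The effect of score_cutoff can be illustrated with a small sketch. It follows the wording above literally (the best score must be strictly greater than the cutoff), so the library's behaviour at exact equality should be checked separately. The `ratio_score` scorer is a hypothetical `difflib`-based stand-in, and only list input is handled to keep the example short.

```python
from difflib import SequenceMatcher


def ratio_score(a, b):
    """Stand-in scorer (an assumption): similarity scaled to 0-100."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def extract_one_sketch(query, choices, scorer=ratio_score, score_cutoff=0):
    """Return the single best (choice, score), or None if it is not good enough."""
    best = None
    for choice in choices:
        score = scorer(query, choice)
        if best is None or score > best[1]:
            best = (choice, score)
    # No choices at all, or the best score does not exceed the cutoff.
    if best is None or best[1] <= score_cutoff:
        return None
    return best


teams = ["dallas cowboys", "new york jets"]
found = extract_one_sketch("cowboys", teams)
rejected = extract_one_sketch("cowboys", teams, score_cutoff=99)
print(found[0])   # dallas cowboys
print(rejected)   # None
```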
dedupe
- This convenience function takes a list of strings containing duplicates and uses fuzzy matching to identify and remove duplicates.
- Specifically, it uses process.extract to identify duplicates that score greater than a user-defined threshold.
- It then returns the longest item in each group of duplicates, on the assumption that the longest item contains the most entity information.
- Ties in string length are broken by alphabetical order.
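The steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the `difflib`-based `ratio_score` scorer, the default `threshold=70`, and the function name `dedupe_sketch` are all hypothetical, and the real function may use a different scorer and return type.

```python
from difflib import SequenceMatcher


def ratio_score(a, b):
    """Stand-in scorer (an assumption): similarity scaled to 0-100."""
    return int(round(100 * SequenceMatcher(None, a, b).ratio()))


def dedupe_sketch(items, threshold=70, scorer=ratio_score):
    """Keep one representative per group of near-duplicate strings."""
    kept = []
    for item in items:
        # Everything scoring strictly above the threshold counts as a duplicate.
        group = [other for other in items if scorer(item, other) > threshold]
        group = group or [item]  # guard for thresholds of 100 or more
        # Longest string wins; alphabetical order breaks length ties.
        representative = min(group, key=lambda s: (-len(s), s))
        if representative not in kept:
            kept.append(representative)
    return kept


names = ["Frodo Baggins", "Frodo Baggin", "Gandalf"]
print(dedupe_sketch(names))  # ['Frodo Baggins', 'Gandalf']
```

Each item is compared against every other item, so the cost grows quadratically with the list size; that is acceptable for short lists but worth keeping in mind for large ones.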