The FuzzySearch library includes the following procedures:
· FindStringExact - a fast implementation of exact substring search. It works like the standard Delphi Pos function and can also search for a substring without regard to case (case-insensitive).
· CompareStringFuzzy - fuzzy comparison of two words. It returns their similarity coefficient (%).
· CompareTextFuzzy - fuzzy comparison of two text fragments.
· FindStringFuzzy - searches a natural-language text for a fuzzy occurrence of a word.
· FindQueryFuzzy - searches a natural-language text for a fuzzy occurrence of a phrase.

Each procedure is available for both ANSI strings and WideStrings. There are also variants that work with null-terminated ANSI/Unicode strings. The alphabetic suffix at the end of a procedure name indicates the type of arguments it accepts:
· A - ANSI string
· W - WideString
· AP - null-terminated ANSI string
· WP - null-terminated Unicode string
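
The suffix convention can be illustrated with a minimal sketch that calls all four variants of CompareStringFuzzy. The unit name FuzzySearch in the uses clause is an assumption, as is a Delphi version in which string is an ANSI string and PChar points to ANSI characters (as the signatures in this document suggest).

```pascal
program SuffixDemo;
{$APPTYPE CONSOLE}
uses
  FuzzySearch; // assumed unit name; use the unit shipped with the library

var
  A: string;      // ANSI string in the Delphi versions assumed here
  W: WideString;
begin
  A := 'brookhaven';
  W := 'brrokhaven';
  // A suffix: ANSI strings
  WriteLn(CompareStringFuzzyA(A, 'brrokhaven', False));
  // W suffix: WideString
  WriteLn(CompareStringFuzzyW('brookhaven', W, False));
  // AP suffix: null-terminated ANSI strings
  WriteLn(CompareStringFuzzyAP(PChar(A), 'brrokhaven', False));
  // WP suffix: null-terminated Unicode strings
  WriteLn(CompareStringFuzzyWP('brookhaven', PWideChar(W), False));
end.
```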

Function CompareStringFuzzyA(const pattern, str: string; PrefixPattern: Boolean): integer;

The function calculates the similarity of two strings, each of which is assumed to be a word of a natural language. The comparison is case-insensitive. The result is a number from 0 to 100 that expresses the similarity of the strings in %.
If PrefixPattern=False, the arguments pattern and str are interchangeable, so the following equality holds for any strings S1 and S2:
CompareStringFuzzyA(S1,S2,False) = CompareStringFuzzyA(S2,S1,False)
The table in the appendix lets you evaluate the function's results.
If PrefixPattern=True, the string pattern is treated as a prefix during comparison, and the parameters pattern and str are no longer interchangeable.
When calculating string proximity, the function uses heuristics that suit most languages (except those written in hieroglyphic scripts):
· substitution, insertion, deletion, and transposition of characters are taken into account
· an error at the beginning of a word is more significant than one at the end
A similarity level of 80-90% is usually a suitable threshold for deciding that two strings are similar.
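
The threshold rule above could be wrapped in a small helper. This is only a sketch: IsSimilar and its default of 85 are hypothetical (85 is simply a value inside the recommended 80-90% range), and the unit name FuzzySearch is an assumption.

```pascal
program ThresholdDemo;
{$APPTYPE CONSOLE}
uses
  FuzzySearch; // assumed unit name

// Hypothetical helper: treats two words as similar when their
// similarity reaches the threshold.
function IsSimilar(const A, B: string; Threshold: Integer = 85): Boolean;
begin
  // PrefixPattern=False, so the argument order does not matter.
  Result := CompareStringFuzzyA(A, B, False) >= Threshold;
end;

begin
  WriteLn(IsSimilar('shackleford', 'shackelford'));
  WriteLn(IsSimilar('shackleford', 'brookhaven', 90));
end.
```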

Function CompareStringFuzzyW(const pattern, str: WideString; PrefixPattern: Boolean): integer;

Unicode version of CompareStringFuzzyA.

Function CompareStringFuzzyAP(pattern, str: PChar; PrefixPattern: Boolean): integer;

Null-terminated version of CompareStringFuzzyA.

Function CompareStringFuzzyWP(pattern, str: PWideChar; PrefixPattern: Boolean): integer;

Null-terminated Unicode version of CompareStringFuzzyA.

Function FindStringExactA(const pattern, text: string; CaseSensitive: Boolean; StartPos: integer = 1): integer;

The function works like the standard Delphi Pos function: it searches text for an occurrence of the substring pattern, starting from position StartPos. Depending on the CaseSensitive flag, the search is case-sensitive or case-insensitive. Case conversion uses the default Windows code page.
If no occurrence is found, the function returns 0; otherwise it returns the position index of the occurrence.
The function uses a fast algorithm that is not inferior in speed to Boyer-Moore search, but has a very fast initialization stage and needs no additional memory for auxiliary search structures.
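
As an illustration of the StartPos parameter, the following sketch lists every case-insensitive occurrence of a substring by restarting the search just past each hit. The unit name FuzzySearch and the sample strings are assumptions made for the example.

```pascal
program FindAllDemo;
{$APPTYPE CONSOLE}
uses
  FuzzySearch; // assumed unit name

const
  Pattern = 'proc';
  Source = 'Procedure calls: a process is a collection of procedures.';
var
  P: Integer;
begin
  // CaseSensitive=False; the first search starts at position 1.
  P := FindStringExactA(Pattern, Source, False, 1);
  while P > 0 do
  begin
    WriteLn('occurrence at position ', P);
    // Continue one character past the previous hit; 0 ends the loop.
    P := FindStringExactA(Pattern, Source, False, P + 1);
  end;
end.
```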

Function FindStringExactW(const pattern, text: WideString; CaseSensitive: Boolean; StartPos: integer = 1): integer;

Unicode version of FindStringExactA.

Function FindStringExactAP(pattern, text: PChar; CaseSensitive: Boolean): PChar;

Null-terminated version of FindStringExactA. A successful search returns a pointer to the occurrence; an unsuccessful search returns nil.
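
Because the null-terminated variant returns a pointer rather than an index, the caller has to check for nil and, if needed, derive the offset. A minimal sketch, assuming the unit name FuzzySearch and a Delphi version in which PChar points to ANSI characters:

```pascal
program ExactPointerDemo;
{$APPTYPE CONSOLE}
uses
  FuzzySearch; // assumed unit name

var
  Source, Hit: PChar;
begin
  Source := 'remotely callable subroutines';
  Hit := FindStringExactAP('CALLABLE', Source, False); // case-insensitive
  if Hit = nil then
    WriteLn('not found')
  else
    // The pointer difference is the zero-based offset of the occurrence;
    // writing Hit prints the text from the occurrence to the end.
    WriteLn('found at offset ', Hit - Source, ': ', Hit);
end.
```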

Function FindStringExactWP(pattern, text: PWideChar; CaseSensitive: Boolean): PWideChar;

Null-terminated Unicode version of FindStringExactA. A successful search returns a pointer to the occurrence; an unsuccessful search returns nil.

Function FindStringFuzzyA(const pattern, text: string; MinSimilarity: integer; PrefixPattern, UseBestMatch: Boolean; StartPos: integer = 1; FoundString: PString = nil; FoundSimilarity: PInteger = nil): integer;

The function FindStringFuzzy searches text for an occurrence of the string pattern. It assumes that text contains natural-language text, that is, words and separators. Any character other than a letter or digit acts as a word separator.
During the search, the function compares pattern with every word of text using the fuzzy comparison CompareStringFuzzy. If UseBestMatch=False, the search starts at position StartPos and continues until the first match whose similarity exceeds MinSimilarity. If UseBestMatch=True, the search runs over the whole text and the result is the most similar word.
The function returns the position of the match in text. If one or both of the parameters FoundString and FoundSimilarity are passed, the additional search results (the word found in text and its similarity to pattern) are returned through them.
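
A call that uses both optional output parameters might look like the sketch below. The unit name FuzzySearch is an assumption, and so is the convention that a result of 0 means "nothing found" (the document states this only for FindStringExactA).

```pascal
program FuzzyFindDemo;
{$APPTYPE CONSOLE}
uses
  FuzzySearch; // assumed unit name

var
  Source, Found: string;
  Position, Similarity: Integer;
begin
  Source := 'The laboratory is located in Brrokhaven, New York.';
  // UseBestMatch=True: scan the whole text and keep the most similar word.
  Position := FindStringFuzzyA('brookhaven', Source, 80, False, True, 1,
    @Found, @Similarity);
  if Position > 0 then // assumption: 0 signals that no word qualified
    WriteLn(Found, ' at position ', Position, ', similarity ', Similarity, '%')
  else
    WriteLn('no sufficiently similar word found');
end.
```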

Function FindStringFuzzyW(const pattern, text: WideString; MinSimilarity: integer; PrefixPattern, UseBestMatch: Boolean; StartPos: integer = 1; FoundString: PWideString = nil; FoundSimilarity: PInteger = nil): integer;

Unicode version of FindStringFuzzyA.

Function FindStringFuzzyAP(pattern, text: PChar; MinSimilarity: integer; PrefixPattern, UseBestMatch: Boolean; FoundLen: PInteger = nil; FoundSimilarity: PInteger = nil): PChar;

Null-terminated version of FindStringFuzzyA. A successful search returns a pointer to the occurrence; an unsuccessful search returns nil.
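
The following sketch shows one way the pointer result could be turned back into a Delphi string. It assumes that FoundLen receives the length, in characters, of the matched word; the document does not describe this parameter, so treat that reading as a guess. The unit name FuzzySearch is also an assumption.

```pascal
program FuzzyPointerDemo;
{$APPTYPE CONSOLE}
uses
  FuzzySearch; // assumed unit name

var
  Source, Hit: PChar;
  Len, Similarity: Integer;
  Matched: string;
begin
  Source := 'The laboratory is located in Brrokhaven, New York.';
  // UseBestMatch=False: stop at the first word that passes the threshold.
  Hit := FindStringFuzzyAP('brookhaven', Source, 80, False, False,
    @Len, @Similarity);
  if Hit <> nil then
  begin
    // Assumption: Len is the length of the matched word.
    SetString(Matched, Hit, Len);
    WriteLn(Matched, ' (', Similarity, '%)');
  end
  else
    WriteLn('not found');
end.
```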

Function FindStringFuzzyWP(pattern, text: PWideChar; MinSimilarity: integer; PrefixPattern, UseBestMatch: Boolean; FoundLen: PInteger = nil; FoundSimilarity: PInteger = nil): PWideChar;

Null-terminated Unicode version of FindStringFuzzyA. A successful search returns a pointer to the occurrence; an unsuccessful search returns nil.

Function FindQueryFuzzyA(const pattern, text: string; MinSimilarity: integer): integer;

The function FindQueryFuzzy treats both pattern and text as text fragments, that is, sequences of words and separators. It checks whether text contains (approximately) the words from pattern and returns a coefficient that expresses how well text matches the query pattern.
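
A query of several words, one of them misspelled, could be checked as in the sketch below. The unit name FuzzySearch is an assumption, as is the interpretation of MinSimilarity as the threshold applied to each word of the query; the document does not state the scale of the returned coefficient, so the example only prints it.

```pascal
program QueryDemo;
{$APPTYPE CONSOLE}
uses
  FuzzySearch; // assumed unit name

var
  Source: string;
  Score: Integer;
begin
  Source := 'The procedure call model views a process as a collection ' +
    'of remotely callable subroutines.';
  // 'modle' is misspelled on purpose; 80 is assumed to be a per-word threshold.
  Score := FindQueryFuzzyA('procedure call modle', Source, 80);
  WriteLn('match coefficient: ', Score);
end.
```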

Function FindQueryFuzzyW(const pattern, text: WideString; MinSimilarity: integer): integer;

Unicode version of FindQueryFuzzyA.

Function FindQueryFuzzyAP(pattern, text: PChar; MinSimilarity: integer): integer;

Null-terminated version of FindQueryFuzzyA.

Function FindQueryFuzzyWP(pattern, text: PWideChar; MinSimilarity: integer): integer;

Null-terminated Unicode version of FindQueryFuzzyA.

Function CompareTextFuzzyA(const Text1,Text2: string): integer;

The function calculates the similarity of two text fragments. The comparison is case-insensitive. The result is a number from 0 to 100 that expresses the similarity of the compared fragments in %.
For example, comparing the two text fragments below gives a similarity coefficient of 76%:
1. The procedure call model (hereafter termed the Model) views a process as a collection of remotely callable subroutines or "procedures."
2. Described procedure call model views a process as a collection of remotely callable subroutines (or procedures).
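
The comparison of the two sample fragments can be reproduced with a short program such as the sketch below; the documentation reports 76% for this pair. The unit name FuzzySearch is an assumption.

```pascal
program CompareTextDemo;
{$APPTYPE CONSOLE}
uses
  FuzzySearch; // assumed unit name

const
  // The two sample fragments from the description of CompareTextFuzzyA.
  Fragment1 = 'The procedure call model (hereafter termed the Model) ' +
    'views a process as a collection of remotely callable subroutines ' +
    'or "procedures."';
  Fragment2 = 'Described procedure call model views a process as a ' +
    'collection of remotely callable subroutines (or procedures).';
begin
  WriteLn('similarity: ', CompareTextFuzzyA(Fragment1, Fragment2), '%');
end.
```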

Function CompareTextFuzzyW(const Text1,Text2: WideString): integer;

Unicode version of CompareTextFuzzyA.

Function CompareTextFuzzyAP(Text1,Text2: PChar): integer;

Null-terminated version of CompareTextFuzzyA.

Function CompareTextFuzzyWP(Text1,Text2: PWideChar): integer;

Null-terminated Unicode version of CompareTextFuzzyA.

Test results of CompareStringFuzzy (PrefixPattern = False)

shackleford     shackelford     93%
brookhaven      brrokhaven      91%
brook hallow    brook hllw      95%
fitzrureiter    fitzenreiter    87%
gondiwindi      gondiwindiro    97%
fuzzysearch     fuzySerrch      93%
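
To re-run the comparisons from the table, a small harness along the following lines could be used. It is a sketch only: the unit name FuzzySearch is an assumption, and the program prints whatever the installed library returns instead of asserting the percentages listed above.

```pascal
program TableDemo;
{$APPTYPE CONSOLE}
uses
  FuzzySearch; // assumed unit name

const
  // Word pairs taken from the test results table.
  Pairs: array[0..5, 0..1] of string = (
    ('shackleford', 'shackelford'),
    ('brookhaven', 'brrokhaven'),
    ('brook hallow', 'brook hllw'),
    ('fitzrureiter', 'fitzenreiter'),
    ('gondiwindi', 'gondiwindiro'),
    ('fuzzysearch', 'fuzySerrch'));
var
  I: Integer;
begin
  for I := Low(Pairs) to High(Pairs) do
    WriteLn(Pairs[I, 0], ' / ', Pairs[I, 1], ': ',
      CompareStringFuzzyA(Pairs[I, 0], Pairs[I, 1], False), '%');
end.
```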