clear solution for fuzzy tasks

The FuzzySearch library includes the following procedures:

· FindStringExact - a fast implementation of exact matching. It works the same way as the standard Pos function from the Delphi VCL, and can also search for a substring case-insensitively.
· CompareStringFuzzy - fuzzy comparison of two words. It returns their similarity coefficient (%).
· CompareTextFuzzy - fuzzy comparison of two text fragments.
· FindStringFuzzy - fuzzy search for a word in natural-language text.
· FindQueryFuzzy - fuzzy search for a phrase in natural-language text.

Every procedure works with both ANSI strings and WideStrings. There are also variants for working with null-terminated ANSI/Unicode strings. The alphabetic suffix at the end of the procedure name shows the type of the accepted arguments:
· A ANSI string
· W Unicode string
· AP null-terminated ANSI string
· WP null-terminated Unicode string
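
As an illustration of the suffix convention, the sketch below calls all four variants of CompareStringFuzzy on the same pair of words. The unit name FuzzySearch in the uses clause is an assumption (this text does not name the unit), and the code assumes a pre-Unicode Delphi compiler in which string is an ANSI string and PChar is a PAnsiChar, as the declarations in this document suggest.

```pascal
program SuffixDemo;
{$APPTYPE CONSOLE}

uses
  FuzzySearch; // assumed unit name; use the unit that ships with the library

var
  A1, A2: string;      // ANSI strings (pre-Unicode Delphi assumed)
  W1, W2: WideString;  // Unicode strings
begin
  A1 := 'massey';
  A2 := 'massie';
  W1 := A1;  // implicit ANSI -> WideString conversion
  W2 := A2;

  // A: ANSI string arguments
  WriteLn(CompareStringFuzzyA(A1, A2, False));
  // W: WideString arguments
  WriteLn(CompareStringFuzzyW(W1, W2, False));
  // AP: null-terminated ANSI strings
  WriteLn(CompareStringFuzzyAP(PChar(A1), PChar(A2), False));
  // WP: null-terminated Unicode strings
  WriteLn(CompareStringFuzzyWP(PWideChar(W1), PWideChar(W2), False));
end.
```

For plain ASCII input like this one would expect the four calls to report the same value; the variants differ in the argument types they accept.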

Function CompareStringFuzzyA(const pattern, str: string; PrefixPattern: Boolean): integer;

The function calculates the similarity of two strings. Each string is assumed to be a word of a natural language. The comparison is case-insensitive. The result is a number from 0 to 100 that expresses the similarity of the strings in %.
If PrefixPattern=False, there is no difference between the arguments pattern and str, so the following equality holds for any strings S1 and S2:

CompareStringFuzzyA(S1,S2,False) = CompareStringFuzzyA(S2,S1,False)

The table in the appendix shows sample results of the function.

If PrefixPattern=True, the string pattern is treated as a prefix in the comparison, and the parameters pattern and str are no longer interchangeable:

pattern       str           similarity
--------------------------------------
jon           jonnatan      100%
jonnatan      jon           78%
john          jonnatan      90%
jonnatan      john          70%
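
A minimal sketch of the prefix mode follows, reusing the first two rows of the table above. The unit name FuzzySearch is an assumption, as is a pre-Unicode Delphi compiler where string is an ANSI string.

```pascal
program PrefixDemo;
{$APPTYPE CONSOLE}

uses
  FuzzySearch; // assumed unit name; use the unit that ships with the library

begin
  // pattern is treated as a prefix of str; the table above lists 100% for this pair
  WriteLn(CompareStringFuzzyA('jon', 'jonnatan', True));

  // Same words with the roles swapped; the table above lists 78%
  WriteLn(CompareStringFuzzyA('jonnatan', 'jon', True));
end.
```

Prefix mode is a natural fit for matching against partially typed input, where the pattern is the fragment typed so far and str is a candidate word.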

When calculating how close two strings are, the function uses heuristics that suit most languages (except those written in hieroglyphic scripts):
· substitution, insertion, deletion and transposition of symbols are taken into account
· an error at the beginning of a word is more significant than one at the end

A threshold of 80-90% is usually used to decide whether two strings are similar.
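
The threshold rule can be wrapped in a small helper, sketched below. IsSimilarWord and DefaultThreshold are hypothetical names introduced only for this illustration, the value 85 is just one point in the suggested range, and the unit name FuzzySearch is an assumption.

```pascal
program ThresholdDemo;
{$APPTYPE CONSOLE}

uses
  FuzzySearch; // assumed unit name; use the unit that ships with the library

const
  DefaultThreshold = 85; // hypothetical choice inside the 80-90% range

// Hypothetical helper: True when the similarity reaches the threshold.
function IsSimilarWord(const A, B: string;
  Threshold: Integer = DefaultThreshold): Boolean;
begin
  Result := CompareStringFuzzyA(A, B, False) >= Threshold;
end;

begin
  // The appendix table lists 96% for this pair, which is above the threshold
  WriteLn(IsSimilarWord('cunningham', 'cunnigham'));

  // The appendix table lists 57% for this pair, which is below the threshold
  WriteLn(IsSimilarWord('dickson', 'dixon'));
end.
```

A lower threshold accepts more misspellings at the cost of more false matches, so the value is worth tuning on real data.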

Function CompareStringFuzzyW(const pattern, str: WideString; PrefixPattern: Boolean): integer;

Unicode version of CompareStringFuzzyA

Function CompareStringFuzzyAP(pattern, str: PChar; PrefixPattern: Boolean): integer;

Null-terminated version of CompareStringFuzzyA

Function CompareStringFuzzyWP(pattern, str: PWideChar; PrefixPattern: Boolean): integer;

Null-terminated Unicode version of CompareStringFuzzyA

Function FindStringExactA(const pattern, text: string; CaseSensitive: Boolean; StartPos: integer = 1): integer;

It works the same way as the standard Pos function from the Delphi VCL. It searches for an occurrence of pattern in text, starting from StartPos. Depending on the CaseSensitive flag, the search is or is not case-sensitive. Case conversion uses the default Windows code page.

If no occurrence is found, the function returns 0; otherwise it returns the position index.

The function uses a fast algorithm that is not slower than Boyer-Moore search, but has a very fast initialisation stage and does not require additional memory for auxiliary search structures.
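
The StartPos parameter makes it possible to enumerate every occurrence, as in the sketch below. It assumes that the returned index is counted from the start of text (as with Pos), not from StartPos; the unit name FuzzySearch is also an assumption.

```pascal
program FindAllDemo;
{$APPTYPE CONSOLE}

uses
  FuzzySearch; // assumed unit name; use the unit that ships with the library

var
  Source: string;
  Idx: Integer;
begin
  Source := 'Fuzzy search, FUZZY match, fuzzy logic';

  // Case-insensitive search; StartPos defaults to 1
  Idx := FindStringExactA('fuzzy', Source, False);
  while Idx > 0 do
  begin
    WriteLn('found at ', Idx);
    // Stop before asking for a start position past the end of the text
    if Idx >= Length(Source) then
      Break;
    // Resume one character after the previous hit (assumes absolute indexes)
    Idx := FindStringExactA('fuzzy', Source, False, Idx + 1);
  end;
end.
```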

Function FindStringExactW(const pattern, text: WideString; CaseSensitive: Boolean; StartPos: integer = 1): integer;

Unicode version of FindStringExactA

Function FindStringExactAP(pattern, text: PChar; CaseSensitive: Boolean): PChar;


Null-terminated version of FindStringExactA. A successful search returns a pointer to the occurrence; an unsuccessful search returns nil.
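
Because the null-terminated variant returns a pointer instead of an index, the offset has to be derived by pointer subtraction. The sketch below shows one way to do that; the unit name FuzzySearch is an assumption, and PChar is assumed to be an ANSI character pointer (pre-Unicode Delphi).

```pascal
program PointerDemo;
{$APPTYPE CONSOLE}

uses
  FuzzySearch; // assumed unit name; use the unit that ships with the library

var
  Source: string;
  Start, Hit: PChar;
begin
  Source := 'The quick brown fox';
  Start := PChar(Source);

  // Case-insensitive search over a null-terminated buffer
  Hit := FindStringExactAP('BROWN', Start, False);

  if Hit <> nil then
    // Hit - Start is the zero-based offset; WriteLn(Hit) prints the tail of the buffer
    WriteLn('offset ', Hit - Start, ': ', Hit)
  else
    WriteLn('not found');
end.
```

The pointer variants have no StartPos parameter, so a repeated search would pass Hit + 1 as the new text argument.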


Function FindStringExactWP(pattern, text: PWideChar; CaseSensitive: Boolean): PWideChar;


Null-terminated Unicode version of FindStringExactA. A successful search returns a pointer to the occurrence; an unsuccessful search returns nil.


Function FindStringFuzzyA(const pattern, text: string; MinSimilarity: integer; PrefixPattern, UseBestMatch: Boolean; StartPos: integer = 1; FoundString: PString = nil; FoundSimilarity: PInteger = nil): integer;


The function FindStringFuzzy searches for an occurrence of the string pattern in text. The text is assumed to be written in a natural language, that is, to consist of words and separators. Any symbol other than a letter or a digit can be a word separator.
During the search, the function compares pattern with every word of text using the fuzzy comparison CompareStringFuzzy. The search starts at position StartPos. If UseBestMatch=False, it stops at the first word whose similarity is greater than MinSimilarity. If UseBestMatch=True, the whole text is searched and the most similar word is returned.

The function returns the position of the match in text. If FoundString and/or FoundSimilarity are passed, the additional search results (the word found in text and its similarity to pattern) are returned through them.
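
A minimal usage sketch follows. It passes both optional output pointers and asks for the best match in the whole text. The unit name FuzzySearch is an assumption, and so is the treatment of a result of 0 as "not found" (by analogy with FindStringExactA; this text does not state it for the fuzzy variant).

```pascal
program FuzzyFindDemo;
{$APPTYPE CONSOLE}

uses
  FuzzySearch; // assumed unit name; use the unit that ships with the library

var
  Source, Found: string;
  Idx, Similarity: Integer;
begin
  Source := 'Please contact Jonathon Shackelford for details';
  Found := '';
  Similarity := 0;

  // MinSimilarity = 80, PrefixPattern = False, UseBestMatch = True, StartPos = 1
  Idx := FindStringFuzzyA('jonathan', Source, 80, False, True, 1,
    @Found, @Similarity);

  if Idx > 0 then  // assumption: 0 means no word passed the threshold
    WriteLn(Found, ' at position ', Idx, ' (', Similarity, '%)')
  else
    WriteLn('no match');
end.
```

With UseBestMatch=False the call would stop at the first acceptable word, which is cheaper on long texts but may miss a closer match later on.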


Function FindStringFuzzyW(const pattern, text: WideString; MinSimilarity: integer; PrefixPattern, UseBestMatch: Boolean; StartPos: integer = 1; FoundString: PWideString = nil; FoundSimilarity: PInteger = nil): integer;


Unicode version of FindStringFuzzyA


Function FindStringFuzzyAP(pattern, text: PChar; MinSimilarity: integer; PrefixPattern, UseBestMatch: Boolean; FoundLen: PInteger = nil; FoundSimilarity: PInteger = nil): PChar;


Null-terminated version of FindStringFuzzyA. A successful search returns a pointer to the occurrence; an unsuccessful search returns nil.


Function FindStringFuzzyWP(pattern, text: PWideChar; MinSimilarity: integer; PrefixPattern, UseBestMatch: Boolean; FoundLen: PInteger = nil; FoundSimilarity: PInteger = nil): PWideChar;


Null-terminated Unicode version of FindStringFuzzyA. A successful search returns a pointer to the occurrence; an unsuccessful search returns nil.


Function FindQueryFuzzyA(const pattern, text: string; MinSimilarity: integer): integer;


The function FindQueryFuzzy treats both pattern and text as text fragments (that is, sequences of words and separators). It checks whether text contains (approximately) the words from pattern, and returns a coefficient that expresses how well text matches the query pattern.
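
One possible use is scoring a list of records against a misspelled query, as sketched below. The sample records are made up for the illustration, the unit name FuzzySearch is an assumption, and the exact role of MinSimilarity (presumably a per-word threshold) and the scale of the returned coefficient are not spelled out in this text, so the value 80 is only a guess.

```pascal
program QueryDemo;
{$APPTYPE CONSOLE}

uses
  FuzzySearch; // assumed unit name; use the unit that ships with the library

const
  // Made-up sample records for the illustration
  Documents: array[0..2] of string = (
    'Michelle Cunningham, Decatur office',
    'Jonathan Galloway, Brookhaven office',
    'Frederick Campbell, Iowa office');

var
  I, Score: Integer;
begin
  for I := Low(Documents) to High(Documents) do
  begin
    // Misspelled two-word query; 80 is an assumed MinSimilarity value
    Score := FindQueryFuzzyA('michele cunnigham', Documents[I], 80);
    WriteLn(Score:4, '  ', Documents[I]);
  end;
end.
```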


Function FindQueryFuzzyW(const pattern, text: WideString; MinSimilarity: integer): integer;

Unicode version of FindQueryFuzzyA

Function FindQueryFuzzyAP(pattern, text: PChar; MinSimilarity: integer): integer;


Null-terminated version of FindQueryFuzzyA


Function FindQueryFuzzyWP(pattern, text: PWideChar; MinSimilarity: integer): integer;


Null-terminated Unicode version of FindQueryFuzzyA


Function CompareTextFuzzyA(const Text1,Text2: string): integer;

The function calculates the similarity of two text fragments. The comparison is case-insensitive. The result is a number from 0 to 100 that expresses the similarity of the compared fragments in %.

For example, comparing the two text fragments below gives a similarity coefficient of 76%:
1. The procedure call model (hereafter termed the Model) views a process as a collection of remotely callable subroutines or "procedures."
2. Described procedure call model views a process as a collection of remotely callable subroutines (or procedures).
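
The comparison of the two fragments above could be reproduced with a call like the one sketched here. The unit name FuzzySearch is an assumption, as is a pre-Unicode Delphi compiler where string is an ANSI string.

```pascal
program TextCompareDemo;
{$APPTYPE CONSOLE}

uses
  FuzzySearch; // assumed unit name; use the unit that ships with the library

const
  Fragment1 =
    'The procedure call model (hereafter termed the Model) views a process ' +
    'as a collection of remotely callable subroutines or "procedures."';
  Fragment2 =
    'Described procedure call model views a process as a collection of ' +
    'remotely callable subroutines (or procedures).';

begin
  // The text above reports a similarity of 76% for these two fragments
  WriteLn(CompareTextFuzzyA(Fragment1, Fragment2), '%');
end.
```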

Function CompareTextFuzzyW(const Text1,Text2: WideString): integer;

Unicode version of CompareTextFuzzyA

Function CompareTextFuzzyAP(Text1,Text2: PChar): integer;

Null-terminated version of CompareTextFuzzyA

Function CompareTextFuzzyWP(Text1,Text2: PWideChar): integer;

Null-terminated Unicode version of CompareTextFuzzyA

Test results of CompareStringFuzzy (PrefixPattern = False)

pattern         str             similarity
------------------------------------------
shackleford     shackelford     93%
dunningham      cunnigham       86%
nichleson       nichulson       92%
jones           johnson         85%
massey          massie          95%
abroms          abrams          89%
hardin          martinez        70%
itman           smith           29%
jeraldine       geraldine       90%
marhta          martha          88%
michelle        michael         91%
julies          julius          95%
tanya           tonya           86%
dwayne          duane           79%
sean            susan           74%
jon             john            90%
jon             jan             78%
brookhaven      brrokhaven      91%
brook hallow    brook hllw      95%
decatur         decatir         96%
fitzrureiter    fitzenreiter    87%
higbee          highee          89%
higbee          higvee          89%
lacura          locura          85%
iowa            iona            83%
1st             ist             75%
peter           peter           100%
abcde           fghij           0%
yz              abcdef          0%
cunningham      cunnigham       96%
campell         campbell        95%
galloway        calloway        88%
frederick       fredrick        95%
michele         michelle        98%
jesse           jessie          97%
jonathon        jonathan        96%
julies          juluis          92%
yvette          yevett          72%
dickson         dixon           57%
dixon           dickson         57%
peter           ole             32%
gondiwindi      gondiwindiro    97%
abcd            1234            0%
network         new             78%
fuzzysearch     fuzySerrch      93%