Line prefix function and position in it is the length largest substring prefix , which is also the suffix of this substring. That is, at the beginning of a substring long need to find such a prefix of maximum length which would be the suffix of the given substring .
Designated ; Where - line; Is the length of the substring in S. It is believed that .
Often, the prefix function is defined in vector form:
Line prefix function there is a vector , each whose element is equal to .
For example, for the string 'abcdabscabcdabia' the prefix function would be: π (abcdabscabcdabia) = '0000120012345601' .
This function is used, for example, in the Knut-Morris-Pratt algorithm . [one]
Computation Algorithm
- Line characters are numbered from 1.
Let be . Let's try to calculate the prefix function for
.
If a then naturally
. If not, try smaller suffixes. Iterate over all suffixes by linear search. You can use the already calculated values of the prefix function. You may notice that
will also be the suffix of the string
, because
- length of the prefix-suffix at this point. For anyone
line
the suffix will not be. Thus, the algorithm is obtained:
- At
- put
.
- Otherwise when
- put
.
- Otherwise, install
and go to step 1.
For the string 'abcdabscabcdabia' calculation will be like this:
'a'! = 'b' => π = 0; 'a'! = 'c' => π = 0; 'a'! = 'd' => π = 0; 'a' == 'a' => π = π + 1 = 1; 'b' == 'b' => π = π + 1 = 2; 'c'! = 's' => π = 0; 'a'! = 'c' => π = 0; 'a' == 'a' => π = π + 1 = 1; 'b' == 'b' => π = π + 1 = 2; 'c' == 'c' => π = π + 1 = 3; 'd' == 'd' => π = π + 1 = 4; 'a' == 'a' => π = π + 1 = 5; 'b' == 'b' => π = π + 1 = 6; 's'! = 'i' => π = 0; 'a' == 'a' => π = π + 1 = 1;
A slightly more interesting example is for the string 'abcdabcabcdabcdab' calculation would be like this:
1 S [1] = 'a', k = π = 0; 2 S [2] = 'b'! = S [k + 1] => k = π = 0; 3 S [3] = 'c'! = S [1] => k = π = 0; 4 S [4] = 'd'! = S [1] => k = π = 0; 5 S [5] = 'a' == S [1] => k = π = 1; 6 S [6] = 'b' == S [2] => k = π = 2; 7 S [7] = 'c' == S [3] => k = π = 3; 8 S [8] = 'a'! = S [4] => k: = π (S, 3) = 0, S [8] == S [1] => k = π = 1; 9 S [9] = 'b' == S [2] => k = π = 2; 10 S [10] = 'c' == S [3] => k = π = 3; 11 S [11] = 'd' == S [4] => k = π = 4; 12 S [12] = 'a' == S [5] => k = π = 5; 13 S [13] = 'b' == S [6] => k = π = 6; 14 S [14] = 'c' == S [7] => k = π = 7; 15 S [15] = 'd'! = S [8] => k: = π (S, 7) = 3, S [15] == S [4] => k = π = 4; 16 S [16] = 'a' == S [5] => k = π = 5; 17 S [17] = 'b' == S [6] => k = π = 6;
And the result is: '00001231234567456' .
Speed
Despite the fact that paragraph 3 is an inner loop, the calculation time of the prefix function is estimated as . Let us prove it.
Everything are divided into:
- Magnifying per unit. The loop goes through one iteration.
- Non-zero . The loop also goes through one iteration. Cases 1 and 2 in total no more pieces.
- Not changing or decreasing positive . Since inside the loop the value can only decrease, and increase only possible by one, then the total value cannot decrease more than times, which limits the number of operations of the internal cycle.
Total algorithm requires no more iterations, which proves the order of speed . The worst case for an algorithm is the case of processing a string of the form 'aa…ab' .
C ++ implementation example
vector < int > compute_prefix_function ( const string & s )
{
int len = s . length ();
vector < int > p ( len ); // prefix function values
// index of the vector corresponds to the number of the last character of the argument
p [ 0 ] = 0 ; // for a prefix of zero and one character, the function is zero
int k = 0 ;
for ( int i = 1 ; i < len ; ++ i ) {
while (( k > 0 ) && ( s [ k ] ! = s [ i ]))
k = p [ k - 1 ];
if ( s [ k ] == s [ i ])
++ k ;
p [ i ] = k ;
}
return p ;
}
Delphi implementation example
type Arr = array of LongInt ;
function compute_prefix_function ( s : string ) : Arr ;
var k , i , l : LongInt ;
begin
l : = Length ( s ) ;
SetLength ( Result , 1 + l ) ;
Result [ 0 ] : = 0 ;
Result [ 1 ] : = 0 ;
k : = 0 ;
for i : = 2 to l do
begin
while ( k > 0 ) and ( s [ k + 1 ] <> s [ i ]) do
k : = Result [ k ] ;
if s [ k + 1 ] = s [ i ] then
Inc ( k ) ;
Result [ i ] : = k ;
end ;
end ;
Ruby implementation example
def prefix s
n = s . length - 1
p = Array . new ( n , 0 )
n times do | i |
i + = 1
j = p [ i - 1 ]
while j > 0 && s [ i ] ! = s [ j ]
j = p [ j - 1 ]
end
j + = 1 if s [ i ] == s [ j ]
p [ i ] = j
end
return p
end
Java implementation example
public static List < Integer > compute_prefix_function ( String s ) {
int len = s . length ();
List < Integer > p = new ArrayList <> (); // prefix function values
for ( int i = 0 ; i < len ; ++ i ) { // fill the sheet with zeros default
p . add ( 0 );
}
// sheet index corresponds to the number of the last character of the argument
int k = 0 ;
for ( int i = 1 ; i < len ; ++ i ) {
while (( k > 0 ) && ( s . charAt ( k ) ! = s . charAt ( i )))
k = p . get ( k - 1 );
if ( s . charAt ( k ) == s . charAt ( i ))
++ k ;
p . set ( i , k );
}
return p ;
}
Python implementation example
def prefix ( s ):
v = [ 0 ] * len ( s )
for i in range ( 1 , len ( s )):
k = v [ i - 1 ]
while k > 0 and s [ k ] ! = s [ i ]:
k = v [ k - 1 ]
if s [ k ] == s [ i ]:
k = k + 1
v [ i ] = k
return v
Haskell implementation example
prefix :: String -> [ Int ]
prefix [] = []
prefix str @ ( _ : t ) =
go 0 0 str t []
where
go vlen _ [] _ v = reverse v
go vlen k s @ ( si : ssi ) ( sk : ssk ) v
| si == sk = go ( vlen + 1 ) ( k + 1 ) ssi ssk (( k + 1 ) : v )
| k == 0 = go ( vlen + 1 ) 0 ssi str ( 0 : v )
| otherwise =
let
k ' = v !! ( vlen - k )
in
go vlen k ' s ( drop k' str ) v