Clever Geek Handbook

Prefix function

The prefix function of a string S at position i is the length k of the longest proper prefix of the substring S[1..i] that is also a suffix of this substring. That is, among the prefixes of the substring S[1..i] of length i, we need to find the one of maximum length k < i that is also a suffix of the given substring: S[1..k] == S[(i-k+1)..i].

It is denoted π(S, i), where S ∈ Σ+ is a string and 1 ≤ i ≤ |S| is the length of the prefix of S under consideration. By convention, π(S, 1) = 0.

Often, the prefix function is defined in vector form:

The prefix function of a string S ∈ Σ+ is the vector π(S) ∈ Z^|S| whose i-th element is equal to π(S, i).

For example, for the string 'abcdabscabcdabia' the prefix function is: π(abcdabscabcdabia) = '0000120012345601'.
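The definition can be checked directly with a brute-force sketch that tries every candidate length k for every prefix. This is only an illustration of the definition (it takes cubic time in the worst case), and the name naive_prefix_function is a hypothetical helper, not part of the source article.

```python
def naive_prefix_function(s: str) -> list[int]:
    """Compute the prefix function straight from the definition (slow)."""
    n = len(s)
    pi = [0] * n
    for i in range(1, n + 1):            # i is the 1-based prefix length
        prefix = s[:i]
        # Try candidate lengths k = i-1, i-2, ..., 1 and keep the first match.
        for k in range(i - 1, 0, -1):
            if prefix[:k] == prefix[i - k:]:
                pi[i - 1] = k
                break
    return pi


print("".join(map(str, naive_prefix_function("abcdabscabcdabia"))))
# prints 0000120012345601
```

The list index is 0-based, so pi[i - 1] holds π(S, i).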

This function is used, for example, in the Knuth-Morris-Pratt algorithm. [1]
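As a rough illustration of how the prefix function supports substring search, the sketch below computes it for the pattern, a separator, and the text joined together. This concatenation approach is one common variant in the spirit of Knuth-Morris-Pratt, not necessarily the formulation the cited source uses. The names find_occurrences and sep are hypothetical, and the code assumes that the separator character occurs in neither string.

```python
def prefix_function(s: str) -> list[int]:
    """Linear-time prefix function; pi[i] refers to the prefix s[0..i]."""
    pi = [0] * len(s)
    for i in range(1, len(s)):
        k = pi[i - 1]
        while k > 0 and s[k] != s[i]:
            k = pi[k - 1]
        if s[k] == s[i]:
            k += 1
        pi[i] = k
    return pi


def find_occurrences(pattern: str, text: str, sep: str = "\x00") -> list[int]:
    """Return 0-based start indices of pattern in text.

    Assumption: sep occurs in neither pattern nor text.
    """
    if not pattern:
        return []
    m = len(pattern)
    pi = prefix_function(pattern + sep + text)
    # A value of m at index i means a full copy of the pattern ends there.
    return [i - 2 * m for i in range(m + 1, len(pi)) if pi[i] == m]


print(find_occurrences("aba", "abacababa"))  # prints [0, 4, 6]
```

Because of the separator, no value in the combined vector can exceed the pattern length, so a value equal to that length marks a complete match.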

Computation Algorithm

String characters are numbered from 1.

Let π(S, i) = k. Let us try to compute the prefix function for i + 1.

If S[i+1] = S[k+1], then naturally π(S, i+1) = k + 1. If not, we try shorter suffixes. Instead of iterating over all suffixes by linear search, we can use the already computed values of the prefix function. Note that S[1..π(S, k)] is also a suffix of the string S[1..i], because k is the length of the longest prefix-suffix at this position. For any j ∈ (k, i), the string S[1..j] is not a suffix of S[1..i], since k is maximal. In the same way, any prefix-suffix of S[1..i] shorter than k is also a prefix-suffix of S[1..k], so the next candidate length after k is π(S, k). Thus we obtain the algorithm:

  1. If S[i+1] = S[k+1], set π(S, i+1) = k + 1.
  2. Otherwise, if k = 0, set π(S, i+1) = 0.
  3. Otherwise, set k := π(S, k) and go to step 1.
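The three steps above can be transcribed almost literally. The sketch below keeps the 1-based numbering of the text by padding the string and the result with an unused element at index 0. The function name and the padding trick are illustrative choices, not taken from the source.

```python
def prefix_function_1_based(s: str) -> list[int]:
    """Return pi with pi[i] = π(S, i) for i = 1..|S|; pi[0] is unused padding."""
    n = len(s)
    S = " " + s                  # shift so that S[1] is the first character
    pi = [0] * (n + 1)           # π(S, 1) = 0 by convention
    for i in range(1, n):        # compute π(S, i + 1) from π(S, i)
        k = pi[i]
        while True:
            if S[i + 1] == S[k + 1]:     # step 1
                pi[i + 1] = k + 1
                break
            if k == 0:                   # step 2
                pi[i + 1] = 0
                break
            k = pi[k]                    # step 3, then back to step 1
    return pi


pi = prefix_function_1_based("abcdabcabcdabcdab")
print("".join(map(str, pi[1:])))  # prints 00001231234567456
```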

For the string 'abcdabscabcdabia', the computation proceeds as follows:

    'a' != 'b' => π = 0;
    'a' != 'c' => π = 0;
    'a' != 'd' => π = 0;
    'a' == 'a' => π = π + 1 = 1;
    'b' == 'b' => π = π + 1 = 2;
    'c' != 's' => π = 0;
    'a' != 'c' => π = 0;
    'a' == 'a' => π = π + 1 = 1;
    'b' == 'b' => π = π + 1 = 2;
    'c' == 'c' => π = π + 1 = 3;
    'd' == 'd' => π = π + 1 = 4;
    'a' == 'a' => π = π + 1 = 5;
    'b' == 'b' => π = π + 1 = 6;
    's' != 'i' => π = 0;
    'a' == 'a' => π = π + 1 = 1;

A slightly more interesting example is the string 'abcdabcabcdabcdab', for which the computation proceeds as follows:

     1  S[1] = 'a', k = π = 0;
     2  S[2] = 'b' != S[1] => k = π = 0;
     3  S[3] = 'c' != S[1] => k = π = 0;
     4  S[4] = 'd' != S[1] => k = π = 0;
     5  S[5] = 'a' == S[1] => k = π = 1;
     6  S[6] = 'b' == S[2] => k = π = 2;
     7  S[7] = 'c' == S[3] => k = π = 3;
     8  S[8] = 'a' != S[4] => k := π(S, 3) = 0, S[8] == S[1] => k = π = 1;
     9  S[9] = 'b' == S[2] => k = π = 2;
    10  S[10] = 'c' == S[3] => k = π = 3;
    11  S[11] = 'd' == S[4] => k = π = 4;
    12  S[12] = 'a' == S[5] => k = π = 5;
    13  S[13] = 'b' == S[6] => k = π = 6;
    14  S[14] = 'c' == S[7] => k = π = 7;
    15  S[15] = 'd' != S[8] => k := π(S, 7) = 3, S[15] == S[4] => k = π = 4;
    16  S[16] = 'a' == S[5] => k = π = 5;
    17  S[17] = 'b' == S[6] => k = π = 6;

And the result is: '00001231234567456'.

Running time

Although step 3 forms an inner loop, the running time of the prefix function computation is O(|S|). Let us prove it.

All positions i fall into three groups:

  1. Those that increase k by one. The loop goes through one iteration.
  2. Those that leave a zero k unchanged. The loop also goes through one iteration. Cases 1 and 2 together occur at most |S| - 1 times.
  3. Those that leave a positive k unchanged or decrease it. Since k can only decrease inside the loop, and k can increase by at most one per position, k cannot decrease more than |S| - 2 times in total, which bounds the number of inner-loop iterations.

In total, the algorithm requires no more than 2|S| iterations, which proves the O(|S|) running time. The worst case for the algorithm is a string of the form 'aa…ab'.
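The bound can be checked empirically with a small sketch that counts how often the fallback k := π(S, k) is executed. The counter and the function name count_inner_iterations are illustrative additions, and the test strings follow the worst-case shape mentioned above.

```python
def count_inner_iterations(s: str) -> tuple[list[int], int]:
    """Return the prefix function and the number of fallback steps k := π(S, k)."""
    pi = [0] * len(s)
    fallbacks = 0
    k = 0
    for i in range(1, len(s)):
        while k > 0 and s[k] != s[i]:
            k = pi[k - 1]
            fallbacks += 1       # one execution of step 3
        if s[k] == s[i]:
            k += 1
        pi[i] = k
    return pi, fallbacks


for n in (5, 50, 500):
    s = "a" * (n - 1) + "b"      # a run of 'a' followed by a single 'b'
    _, fallbacks = count_inner_iterations(s)
    outer = len(s) - 1           # iterations of the outer loop
    assert outer + fallbacks <= 2 * len(s)
    print(n, fallbacks)
```

For these strings every fallback happens at the last character, where k drops from |S| - 2 to 0 one step at a time.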

C++ implementation example

    #include <string>
    #include <vector>
    using namespace std;

    vector<int> compute_prefix_function(const string& s)
    {
        int len = s.length();
        // Prefix function values, zero-initialized; the vector index
        // corresponds to the position of the last character of the prefix,
        // so p[0] = 0 covers the one-character prefix.
        vector<int> p(len);

        int k = 0;
        for (int i = 1; i < len; ++i) {
            // Fall back to shorter prefix-suffixes until s[i] extends one.
            while ((k > 0) && (s[k] != s[i]))
                k = p[k - 1];
            if (s[k] == s[i])
                ++k;
            p[i] = k;
        }
        return p;
    }

Delphi implementation example

    type Arr = array of LongInt;

    function compute_prefix_function(s: string): Arr;
    var k, i, l: LongInt;
    begin
      l := Length(s);
      SetLength(Result, 1 + l); // Result[i] holds π(S, i); index 0 is unused
      Result[0] := 0;
      if l > 0 then
        Result[1] := 0;
      k := 0;
      for i := 2 to l do
      begin
        while (k > 0) and (s[k + 1] <> s[i]) do
          k := Result[k];
        if s[k + 1] = s[i] then
          Inc(k);
        Result[i] := k;
      end;
    end;

Ruby implementation example

    def prefix(s)
      # p[i] is the prefix function of s[0..i]
      p = Array.new(s.length, 0)
      (1...s.length).each do |i|
        j = p[i - 1]
        while j > 0 && s[i] != s[j]
          j = p[j - 1]
        end
        j += 1 if s[i] == s[j]
        p[i] = j
      end
      p
    end

Java implementation example

    import java.util.ArrayList;
    import java.util.List;

    // Place this method inside a class.
    public static List<Integer> compute_prefix_function(String s) {
        int len = s.length();
        List<Integer> p = new ArrayList<>(); // prefix function values
        for (int i = 0; i < len; ++i) { // fill the list with zeros by default
            p.add(0);
        }
        // The list index corresponds to the position of the last character of the prefix.
        int k = 0;
        for (int i = 1; i < len; ++i) {
            while ((k > 0) && (s.charAt(k) != s.charAt(i)))
                k = p.get(k - 1);
            if (s.charAt(k) == s.charAt(i))
                ++k;
            p.set(i, k);
        }
        return p;
    }

Python implementation example

    def prefix(s):
        v = [0] * len(s)  # v[i] is the prefix function of s[0..i]
        for i in range(1, len(s)):
            k = v[i - 1]
            # Fall back to shorter prefix-suffixes until s[i] extends one.
            while k > 0 and s[k] != s[i]:
                k = v[k - 1]
            if s[k] == s[i]:
                k = k + 1
            v[i] = k
        return v

Haskell implementation example

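    prefix :: String -> [Int]
    prefix [] = []
    prefix str@(_ : t) =
        -- Start with π(S, 1) = 0 already recorded, then scan the tail.
        go 1 0 t str [0]
      where
        -- vlen: number of values computed so far; k: current prefix-suffix length;
        -- s: characters still to process; second list: drop k str;
        -- v: computed values in reverse order.
        go _ _ [] _ v = reverse v
        go vlen k s@(si : ssi) (sk : ssk) v
            | si == sk  = go (vlen + 1) (k + 1) ssi ssk ((k + 1) : v)
            | k == 0    = go (vlen + 1) 0 ssi str (0 : v)
            | otherwise =
                let
                    k' = v !! (vlen - k) -- π(S, k)
                in
                    go vlen k' s (drop k' str) v

Source - https://ru.wikipedia.org/w/index.php?title=Prefix-function&oldid=97492216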


Clever Geek | 2019