PHP Cross Reference of DokuWiki

/inc/ -> indexer.php (summary)

Common DokuWiki functions

Author: Andreas Gohr
License: GPL 2 (http://www.gnu.org/licenses/gpl.html)
File Size: 680 lines (21 kb)
Included or required: 4 times
Referenced: 0 times
Includes or requires: 3 files
 inc/utf8.php
 inc/parserutils.php
 inc/io.php

Defines 17 functions

  wordlen()
  idx_saveIndex()
  idx_getIndex()
  idx_touchIndex()
  _freadline()
  idx_saveIndexLine()
  idx_getIndexLine()
  idx_getPageWords()
  idx_addPage()
  idx_writeIndexLine()
  idx_updateIndexLine()
  idx_indexLengths()
  idx_getIndexWordsSorted()
  idx_lookup()
  idx_parseIndexLine()
  idx_tokenizer()
  idx_upgradePageWords()

Functions
Functions that are not part of a class:

wordlen($w)   X-Ref
Measure the length of a string.
Differs from strlen() in how it handles Asian characters.

author: Tom N Harris <tnharris@whoopdedo.org>

idx_saveIndex($pre, $wlen, &$idx)   X-Ref
Write a list of strings to an index file.

author: Tom N Harris <tnharris@whoopdedo.org>

idx_getIndex($pre, $wlen)   X-Ref
Read the list of words in an index (if it exists).

author: Tom N Harris <tnharris@whoopdedo.org>

idx_touchIndex($pre, $wlen)   X-Ref
Create an empty index file if it doesn't exist yet.

FIXME: This function isn't currently used. It will probably be removed soon.

author: Tom N Harris <tnharris@whoopdedo.org>

_freadline($fh)   X-Ref
Read a line ending with \n.
Returns false on EOF.

author: Tom N Harris <tnharris@whoopdedo.org>
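
The read-until-newline behaviour described above can be sketched as follows. This is a minimal illustration, not the DokuWiki source: the name sketch_read_line() and the 4096-byte chunk size are assumptions, as is the choice to append "\n" to a final line that lacks one.

```php
<?php
/**
 * Sketch only: read one complete line from an open file handle.
 * Hypothetical helper, not the DokuWiki implementation.
 *
 * @param resource $fh an open, readable file handle
 * @return string|false the line including its trailing "\n", false on EOF
 */
function sketch_read_line($fh)
{
    $line = '';
    // fgets() can stop at the chunk limit before reaching the newline,
    // so keep appending chunks until one ends with "\n" or input runs out.
    while (($chunk = fgets($fh, 4096)) !== false) {
        $line .= $chunk;
        if (substr($chunk, -1) === "\n") {
            break;
        }
    }
    if ($line === '') {
        return false; // nothing left to read
    }
    // Assumption: a last line without a newline is normalized to end in "\n".
    if (substr($line, -1) !== "\n") {
        $line .= "\n";
    }
    return $line;
}
```

Looping over fgets() rather than calling it once means a line longer than the chunk size still comes back in one piece.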

idx_saveIndexLine($pre, $wlen, $idx, $line)   X-Ref
Write a line to an index file.

author: Tom N Harris <tnharris@whoopdedo.org>

idx_getIndexLine($pre, $wlen, $idx)   X-Ref
Read a single line from an index (if it exists).

author: Tom N Harris <tnharris@whoopdedo.org>

idx_getPageWords($page)   X-Ref
Split a page into words

Returns an array of word counts, false if an error occurred.
Array is keyed on the word length, then the word index.

author: Andreas Gohr <andi@splitbrain.org>
author: Christopher Smith <chris@jalakai.co.uk>
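
The "keyed on the word length, then the word index" layout can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function name, the in-memory $wordLists structure (length => list of words, where a word's position is its index), and the use of strlen() where the real file defines wordlen() for measuring words.

```php
<?php
/**
 * Sketch only: count tokens into a [length][word index] array.
 * Hypothetical helper, not the DokuWiki implementation.
 *
 * @param array $tokens    words already split out of a page
 * @param array $wordLists length => list of known words (extended in place)
 * @return array           length => array(word index => count)
 */
function sketch_group_word_counts(array $tokens, array &$wordLists): array
{
    $counts = [];
    foreach ($tokens as $token) {
        $len = strlen($token); // assumption: plain byte length
        if (!isset($wordLists[$len])) {
            $wordLists[$len] = [];
        }
        // The word index is the word's position in the list for its length.
        $wid = array_search($token, $wordLists[$len], true);
        if ($wid === false) {
            $wordLists[$len][] = $token;
            $wid = count($wordLists[$len]) - 1;
        }
        if (!isset($counts[$len][$wid])) {
            $counts[$len][$wid] = 0;
        }
        $counts[$len][$wid]++;
    }
    return $counts;
}
```

Grouping by length first keeps each per-length word list small, so looking up a word index only has to scan words of the same length.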

idx_addPage($page)   X-Ref
Adds or updates the search index entry for the given page

This is the core function of the indexer which does most
of the work. This function needs to be called with proper
locking!

author: Andreas Gohr <andi@splitbrain.org>
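
The locking requirement can be met in several ways, and this summary does not say which mechanism DokuWiki itself uses. The sketch below uses an advisory flock() lock as a stand-in technique. The wrapper name and the lock file path are hypothetical, and the callback stands for whatever indexing call needs protection.

```php
<?php
/**
 * Sketch only: run a callback while holding an exclusive advisory lock.
 * Hypothetical helper; flock() is a stand-in, not DokuWiki's own locking.
 *
 * @param string   $lockFile path of a lock file (created if missing)
 * @param callable $work     the work to protect, e.g. an indexing call
 * @return mixed             the callback's result, false if locking failed
 */
function sketch_with_index_lock(string $lockFile, callable $work)
{
    $fh = fopen($lockFile, 'c'); // 'c': create if missing, never truncate
    if ($fh === false) {
        return false;
    }
    try {
        if (!flock($fh, LOCK_EX)) { // blocks until the lock is free
            return false;
        }
        return $work();
    } finally {
        // Always release the lock, even if the callback throws.
        flock($fh, LOCK_UN);
        fclose($fh);
    }
}
```

A caller would wrap the indexing step, for example sketch_with_index_lock($lockFile, function () use ($page) { return idx_addPage($page); }), so that two processes never rewrite the index files at the same time.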

idx_writeIndexLine($fh,$line,$pid,$count)   X-Ref
Write a new index line to the filehandle

This function writes a line of the index file to the
given filehandle. It removes the given document from
the given line and re-adds it when $count is > 0.

author: Andreas Gohr <andi@splitbrain.org>

idx_updateIndexLine($line,$pid,$count)   X-Ref
Modify an index line with new information

This returns the updated index line. It removes the
given document from the line and re-adds it if
$count is > 0.

author: Tom N Harris <tnharris@whoopdedo.org>
author: Andreas Gohr <andi@splitbrain.org>
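
The remove-then-re-add step can be sketched as below. The line format is an assumption: entries of the form "pid*count" joined by ":" and terminated by a newline. The function name is hypothetical, and this is an illustration rather than the DokuWiki code.

```php
<?php
/**
 * Sketch only: drop a document from an index line and re-add it
 * with a new count. Assumes a "pid*count:pid*count" line format.
 *
 * @param string $line  the existing index line (may be empty)
 * @param string $pid   the document id to update
 * @param int    $count the new count; 0 removes the document
 * @return string       the updated line, terminated by "\n"
 */
function sketch_update_index_line(string $line, string $pid, int $count): string
{
    $kept = [];
    foreach (explode(':', trim($line)) as $entry) {
        if ($entry === '') {
            continue; // empty line or stray separator
        }
        $parts = explode('*', $entry, 2);
        if ($parts[0] !== $pid) {
            $kept[] = $entry; // other documents are left untouched
        }
    }
    if ($count > 0) {
        $kept[] = $pid . '*' . $count; // re-add with the fresh count
    }
    return implode(':', $kept) . "\n";
}
```

Under these assumptions, sketch_update_index_line("3*2:7*1\n", '7', 4) yields "3*2:7*4\n", and passing a count of 0 drops the document from the line.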

idx_indexLengths(&$filter)   X-Ref
Get the word lengths that have been indexed.

Reads the index directory and returns an array of the word
lengths for which index files exist.

author: Tom N Harris <tnharris@whoopdedo.org>

idx_getIndexWordsSorted($words,&$result)   X-Ref
Find the index number of each search term.

This groups together words that appear in the same index, so each index
only has to be opened once, which should perform better.
In the author's experience the gain is small, probably because of the disk cache.
The sorted function also does more work, making it slightly slower in some cases.

author: Tom N Harris <tnharris@whoopdedo.org>
param: array    $words   The query terms. Words should only contain valid characters,
param: arrayref $result  Set to word => array("length*id" ...), use this to merge the
return: array            Set to length => array(id ...)
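
The grouping idea can be sketched as follows. This is a simplified illustration only: the function name is hypothetical, strlen() stands in for the file's own length measure, and wildcard handling is left out entirely.

```php
<?php
/**
 * Sketch only: bucket search terms by length so that each
 * per-length index needs to be opened just once.
 *
 * @param array $terms the query terms
 * @return array       length => list of terms of that length
 */
function sketch_group_terms_by_length(array $terms): array
{
    $byLength = [];
    foreach ($terms as $term) {
        $byLength[strlen($term)][] = $term; // assumption: byte length
    }
    ksort($byLength); // visit the indexes in a stable order
    return $byLength;
}
```

A caller could then loop over the buckets, open the index for each length once, and resolve every term in that bucket before moving on.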

idx_lookup($words)   X-Ref
Lookup words in index

Takes an array of words and returns a list of matching
documents for each one.

Important: No ACL checking is done here! All results are
returned, regardless of permissions.

author: Andreas Gohr <andi@splitbrain.org>
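
Because no permission check happens inside the lookup, callers have to filter the results themselves. The sketch below shows one way to do that. The result shape (word => array(page => count)), the function name, and the $canRead callback are all assumptions made for illustration.

```php
<?php
/**
 * Sketch only: drop pages the current user may not read.
 * Assumes results shaped as word => array(page => count).
 *
 * @param array    $matches lookup results
 * @param callable $canRead hypothetical permission check: page => bool
 * @return array            the same structure without unreadable pages
 */
function sketch_filter_readable(array $matches, callable $canRead): array
{
    $filtered = [];
    foreach ($matches as $word => $pages) {
        $filtered[$word] = [];
        foreach ($pages as $page => $count) {
            if ($canRead($page)) {
                $filtered[$word][$page] = $count;
            }
        }
    }
    return $filtered;
}
```

Keeping the check outside the lookup lets the index stay permission-agnostic, but it also means every consumer of the raw results must apply a filter like this before showing anything to a user.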

idx_parseIndexLine(&$page_idx,$line)   X-Ref
Returns a list of documents and counts from an index line

It omits docs with a count of 0 and pages that no longer
exist.

author: Andreas Gohr <andi@splitbrain.org>
param: array  $page_idx The list of known pages
param: string $line     A line from the main index
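
The filtering described above can be sketched as below. The line format ("pid*count" entries joined by ":"), the function name, and the $exists callback that stands in for a page-existence check are assumptions; this is not the DokuWiki code.

```php
<?php
/**
 * Sketch only: turn an index line into page => count pairs,
 * skipping zero counts and pages that no longer exist.
 *
 * @param array    $pageIdx list of known pages, indexed by document id
 * @param string   $line    a line assumed to look like "pid*count:pid*count"
 * @param callable $exists  hypothetical existence check: page => bool
 * @return array            page name => count
 */
function sketch_parse_index_line(array $pageIdx, string $line, callable $exists): array
{
    $result = [];
    foreach (explode(':', trim($line)) as $entry) {
        if ($entry === '') {
            continue; // empty line or stray separator
        }
        $parts = explode('*', $entry, 2);
        $pid   = (int) $parts[0];
        $count = isset($parts[1]) ? (int) $parts[1] : 0;
        if ($count <= 0 || !isset($pageIdx[$pid])) {
            continue; // zero count or unknown document id
        }
        $page = trim($pageIdx[$pid]);
        if ($page === '' || !$exists($page)) {
            continue; // page was deleted since it was indexed
        }
        $result[$page] = $count;
    }
    return $result;
}
```

Resolving document ids through the page list at read time means stale entries can stay in the index files without ever surfacing in results.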

idx_tokenizer($string,&$stopwords,$wc=false)   X-Ref
Tokenizes a string into an array of search words

Uses the same algorithm as idx_getPageWords()

param: string   $string     the query as given by the user
param: arrayref $stopwords  array of stopwords
param: boolean  $wc         are wildcards allowed?
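
How the three parameters interact can be sketched with a deliberately simple tokenizer. This is not the algorithm the indexer uses: the function name is hypothetical, the splitting rule (anything that is not a letter or digit separates words) is an assumption, and lowercasing is ASCII-only here.

```php
<?php
/**
 * Sketch only: split a query into lowercase search words.
 * Simplified stand-in, not the DokuWiki tokenizer.
 *
 * @param string $string    the query as given by the user
 * @param array  $stopwords words to discard
 * @param bool   $wc        keep '*' inside words when true
 * @return array            list of search words
 */
function sketch_tokenize(string $string, array $stopwords, bool $wc = false): array
{
    // Assumption: letters and digits form words; '*' survives only
    // when wildcards are allowed.
    $pattern = $wc ? '/[^\p{L}\p{N}*]+/u' : '/[^\p{L}\p{N}]+/u';
    $parts = preg_split($pattern, strtolower($string), -1, PREG_SPLIT_NO_EMPTY);
    if ($parts === false) {
        return []; // e.g. the input was not valid UTF-8
    }
    $words = [];
    foreach ($parts as $part) {
        if (in_array($part, $stopwords, true)) {
            continue; // stopwords never reach the index lookup
        }
        $words[] = $part;
    }
    return $words;
}
```

With wildcards enabled, a query such as 'The Quick wiki*' and the stopword list array('the') gives the words quick and wiki*; with wildcards disabled the asterisk is treated as a separator and dropped.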

idx_upgradePageWords()   X-Ref
Create a pagewords index from the existing index.

author: Tom N Harris <tnharris@whoopdedo.org>



Generated: Tue Dec 2 01:30:01 2008 Cross-referenced by PHPXref 0.7