Patent Abstract
This site is an implementation of "A Method and System for Compression Indexing and Efficient
Proximity Search of Text Data." That is the title of a patent application that Marpex Inc.
submitted to the United States Patent and Trademark Office on March 8, 2004. Here is the
abstract of that application:
This invention relates to a method and system of compression indexing and efficient proximity
search of text data sets. Compression indexing makes text redundant once it has been indexed, since the
text may be reconstituted from the index (hereinafter termed "compression index ebook"). The method
further does away with the many disk seeks associated with checking closeness of words in records found
through traditional techniques of proximity search. The method also enables efficient relevance ranking
of search results according to closeness of desired terms within each portion of text found.
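
The abstract does not spell out the data structures involved, so the following is only a minimal
sketch of the general idea, assuming a plain positional index (each word mapped to the positions
where it occurs). The function names build_positional_index, reconstitute and within_distance are
hypothetical and are not taken from the patent application. The sketch shows how text might be
rebuilt from such an index, and how closeness of two words might be checked from the index alone
without going back to the stored text.

```python
from collections import defaultdict


def build_positional_index(text):
    """Map each word to the ascending list of positions where it occurs."""
    index = defaultdict(list)
    for position, word in enumerate(text.split()):
        index[word].append(position)
    return dict(index)


def reconstitute(index):
    """Rebuild the word sequence using only the index (assumed layout)."""
    length = sum(len(positions) for positions in index.values())
    words = [None] * length
    for word, positions in index.items():
        for position in positions:
            words[position] = word
    return " ".join(words)


def within_distance(index, first, second, max_gap):
    """True if some occurrence of `first` is within `max_gap` words of `second`."""
    return any(
        abs(p - q) <= max_gap
        for p in index.get(first, [])
        for q in index.get(second, [])
    )


text = "the quick brown fox jumps over the lazy dog"
index = build_positional_index(text)
rebuilt = reconstitute(index)
```

In this sketch the rebuilt text matches the original only up to whitespace, because splitting on
whitespace discards the exact spacing. A real compression index would also have to record
punctuation, case and layout in order to make the stored text fully redundant.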
The field of computational linguistics is very broad; so too is its subsidiary discipline of text search.
It is a well-known characteristic of most natural languages that the adjacency of words bears
directly on the meaning of the combined words. The focus here is therefore more precisely on
efficient techniques for computing the "closeness of fit" of desired terms, with a view to
helping the searcher arrive at results that conform to the intended meaning.
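
One simple way to picture "closeness of fit" is to score each record by the smallest number of
words separating two desired terms, and to list the records with the smallest gap first. The
sketch below does that under stated assumptions: the scoring rule, the function names closest_gap
and rank_by_closeness, and the sample records are all illustrative and are not drawn from the
patent application.

```python
def closest_gap(text, first, second):
    """Smallest distance in words between the two terms, or None if either is absent."""
    words = text.lower().split()
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    gaps = [abs(p - q) for p in first_positions for q in second_positions]
    return min(gaps) if gaps else None


def rank_by_closeness(records, first, second):
    """Return the records containing both terms, closest occurrences first."""
    scored = []
    for record in records:
        gap = closest_gap(record, first, second)
        if gap is not None:
            scored.append((gap, record))
    scored.sort(key=lambda pair: pair[0])  # smaller gap means a closer fit
    return [record for _, record in scored]


# Hypothetical sample records, made up for this illustration.
records = [
    "blood was tested and the pressure gauge failed",
    "high blood pressure is common",
    "pressure builds when blood flow is blocked",
    "no relevant terms here",
]
ranked = rank_by_closeness(records, "blood", "pressure")
```

Here the record in which "blood" and "pressure" are adjacent ranks first, which reflects the
point that adjacent words are more likely to carry the combined meaning the searcher intends.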