echo: nthelp
to: Bill Lucy
from: Antti Kurenniemi
date: 2004-02-25 07:47:20
subject: Re: Word Compression

From: "Antti Kurenniemi" 

"Bill Lucy"  wrote in message
news:MPG.1aa5a3fd2b192c8b98bf9e{at}news.barkto.com...
> This is from the help file on searches:
>
> Searching by text content
>
> SimpleSearch can index the text content of PaperPort Image (.max), PDF, TIFF, and
> DCX files. It can also index the content of text items, including Word, Notepad,
> WordPad, Excel, and HTML files.
> To index the text content, SimpleSearch uses PaperPort's OCR software to extract
> and copy textual content from the items, and creates a database of the words or
> phrases in those items, much like the index of a book.
> You can then find scanned items by searching on words contained in those items. For
> example, if you have scanned items from different investment companies, you might
> search for words such as bonds, gold, or mutual funds, to find items that contain
> those words.
> ---
>
> So, yes, you can. With 15,000 pages, it might take a while to index your
> PDFs, but it will do it.

Thank you, that looks promising. Luckily our contracts all have a very uniform
layout, with no images, tables, or anything like that, so it should be quite
reliable. I'll get a copy and start fiddling - even if it takes some time,
it'll still be faster than doing it by hand.
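
For what it's worth, the "database of the words ... much like the index of a
book" that the help text describes is commonly called an inverted index. Below
is a minimal Python sketch of that general idea, not PaperPort's actual
implementation. It assumes the OCR step has already produced plain text, and
the file names, sample text, and the build_index/search helpers are all made
up for illustration.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of documents that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    # Start from the documents matching the first word, then narrow down.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical documents standing in for OCR output of scanned contracts.
docs = {
    "contract_a.txt": "Investment in bonds and gold.",
    "contract_b.txt": "Mutual funds and bonds.",
}
index = build_index(docs)
print(sorted(search(index, "bonds")))
print(sorted(search(index, "mutual funds")))
```

Note that this sketch only checks that all query words occur somewhere in a
document; it does not check that they appear next to each other as a phrase.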


Antti Kurenniemi

--- BBBS/NT v4.01 Flag-5
* Origin: Barktopia BBS Site http://HarborWebs.com:8081 (1:379/45)

SOURCE: echomail via fidonet.ozzmosis.com
