TIP: Click on subject to list as thread! ANSI
echo: power_bas
to: ALL
from: TIM HUTZLER
date: 1996-07-27 10:23:00
subject: Improved compresser/deco

Hi y'all;
Awhile back someone posted a text compression/decompression routine
that was simple and easy to implement. I have made some improvements
and cleaned it up so that you can better understand how it works.
The origional claim was that it packs/unpacks strings to about 52% of
their original size. This is the theoritical maximum comprression and
in actuallity will not occure. I got a typical compression figure of
more like 60%. Still, close enough to a 2:1 ratio to possible make it
worth while.
I'm not going to use this routine, myself. I am developing something
that should provide much better compression, but still lends itself to
file scanning. It will be optimized for text files, and may be
slightly lossy (like JPEGs) to compensate for mispellings and odd
punctuation.
Wish me luck....
Anyway, here is my contribution to the PowerBASIC echo for now.
=======  cut this and name it COMPRESS.BAS - it's ready to run =========
'the first part is the main body for testing.
'The second part is the actual sub.
$DIM ALL DECLARE SUB PackTxt (text AS STRING)
'Notes:
' 1) The SUB adds CHR$(255) to a packed string to indicate that
'    it is packed. (For automatic packing/unpacking)
' 2) The SUB checks if the values of the characters in the string are
'    between 32 and 127 to reduce errors while packing.
'    It ignores control codes and hi-ASCII.
'Usage is very simple. Take a string (A$, for example) and call PackTxt:
' A$ = "This is a test string for the text packer."
' PackTxt A$
'The string will be packed and CHR$(255) added to the beginning to indicate
'that it is packed. To unpack a string just call PackTxt again:
' PackTxt A$
DIM TLine AS STRING
DIM Test AS STRING
TLine = "PackTxt SUB begins here"+_
    "This sub packs strings to about 52% of their original size. The sub"+_
    " automatically checks if the string is packed or not and 
npacks/packs"+_
    " it accordingly. (neat, huh?)"+_
    " Don't play around with the dictionary or you can lower the packing"+_
    " matio (when I got it there was a single space missing and that"+_
    " reduced the ratio by about 10%!)."+_
    " NOTE: text$ may only contain ASCII characters with the values 
2-127!"+_
    " Usage is very simple. Take a string (A$, for example) and call 
PackTxt:"+_
    " A$ = This is a test string for the text packer."+_
    " PackTxt A$"+_
    " The string will be packed and CHR$(255) added to the beginning to 
indicate"+_
    " that it is packed. To unpack a string just call PackTxt again:"
'This is the test section of this program.
?:? LEN(TLine);"{"+TLine+"}"
Test = TLine
PackTxt TLine
?:?:? (1 - LEN(TLine) / LEN(Test))*100;" percent reduction."
?:? LEN(TLine);"{"+TLine+"}"
PackTxt TLine
?:? LEN(TLine);"{"+TLine+"}"
IF TLine  Test THEN BEEP:? "Strings not the same!!!"
?:? "Done...."
'PackTxt SUB begins here
'This sub packs strings to about 52% of their original size. The sub
' automatically checks if the string is packed or not and unpacks/packs
' it accordingly. (neat, huh?)
' Don't play around with the dictionary or you can lower the packing
' ratio (when I got it there was a single space missing and that
' reduced the ratio by about 10%!).
' NOTE: text$ may only contain ASCII characters with the values 32-127!
SUB PackTxt (text AS STRING)
DIM Pair AS LOCAL STRING
DIM temp AS LOCAL STRING
DIM char AS STRING * 2
DIM v AS WORD
DIM x AS WORD
DIM cnt AS WORD
'The dictionary... it may look like junk, but it is pretty much the heart
' of the packer (so don't mess it up :) )
'old="  e  as  tinthouerhet anreesr d onn or o i y wo tontyo. neisarte"+_
'    "ed,  ctiy  bat snd fal pensestvengitu talehaurllcousa  mf dfoof "+_
'    "siril  hmeg om Icehironsasiossbedepe rli Tetel nicho lilprcactut"+_
'    "Thpaeceachh wige ebuaisursulmawaotowtsmploI solyee Cunm rtieno S"+_
'    "diwhs.rafincademe.irplk  ury Pwoacos gams,duayavucColamowe Aoopu"
'This is a new string that renders slightly better compression using
'my mail reader packet database.
Pair="  e  ts thinhe at d n eronanatre otey or. , en indr esti salto w"+_
     "o neouedarstngis--itntvehaf l  cof brimeio pliroealeg seco fasom"+_
     " hicllde meta  drahicacens rtaelchbeh : ussiilla Im di eurec Sfo"+_
     "utpeun lpoma gotacaiwarthonactolrstr Ass19lyiggenots Cid "+CHR$(34)+"pr 
nie"+_
     "ncsoulniirad Temwee.wheyayeeThevloia ys."+CHR$(34)+" amowyoI vi 
Pewplcimi M"
    IF LEN(Pair)  320 THEN 'just to check...
        PRINT "PACKING ERROR: Dictionary has the wrong size"
        BEEP
        EXIT SUB
    END IF
'check if the string is packed
    IF LEFT$(text, 1) = CHR$(255) THEN 'unpack this string
        temp= RIGHT$(text, LEN(text) - 1) 'remove CHR$(255)
        text = ""                            're-init text$
'prepair the string
        FOR x = 1 TO LEN(temp)
            v = ASC(MID$(temp, x, 1))
            IF v > 95 THEN 'we have an index to a 'pair'
                text = text + MID$(Pair, (v - 96) * 2 + 1, 2)
            ELSE 'all we have is the character-32
                text = text + CHR$(ASC(MID$(temp, x, 1)) + 32)
            END IF
        NEXT
    ELSE 'the string isn't packed so let's pack it...
        IF LEN(text) < 4 THEN EXIT SUB 'no use with strings less than 4 
hars.
'check if there are character values out of range
        FOR x = 1 TO LEN(text) 
            IF (ASC(MID$(text,x,1)) XOR &h80) < &hA0 THEN EXIT SUB 'these 
CANNOT be packed
        NEXT
        cnt = 1
        DO
            char = MID$(text, cnt,2)
            x = 1 'read pointer in Dict$
'if a pair is found, it must be dividable by 2
'(it needs to be divided so it can be packed)
ReDo:
            x = INSTR(x, Pair, char)
            IF x AND ISFALSE(x AND 1) THEN
                INCR x
                GOTO ReDo
            END IF
            IF x THEN 'the 2 characters from text$ are in Dict$
'append the instr of the characters in Dict$ to temp$
'(chars shouldn't be compressed twice it's stored so that it's more than 95)
                temp = temp + CHR$((x\2)+96)
                INCR cnt
            ELSE 'the chars aren't found, store the first char
         '(note that it's less than 95 and that cnt is only moved up 1)
                temp = temp + CHR$(ASC(MID$(text, cnt, 1)) - 32)
            END IF
            INCR cnt 'pointer in text$
        LOOP WHILE cnt <= LEN(text)
'copy temp$ into text$ and add CHR$(255) to indicate a packed string
        text = CHR$(255) + temp 
    END IF
END SUB
 
--- Maximus/2 3.01
---------------
* Origin: Madman BBS * Chico, California * 916-893-8079 * (1:119/88)

SOURCE: echomail via exec-pc

Email questions or comments to sysop@ipingthereforeiam.com
All parts of this website painstakingly hand-crafted in the U.S.A.!
IPTIA BBS/MUD/Terminal/Game Server List, © 2025 IPTIA Consulting™.