echo: perl
to: mark lewis
from: Maurice Kinal
date: 2005-02-20 15:05:50
subject: talking to myself

Hey mark!

Feb 20 16:42 05, mark lewis wrote to Maurice Kinal:

 ml> MSGID isn't the magic bullet that some seem to want to think it is... 
 ml> some of your comments appear to be saying that it should/could be and 
 ml> that it isn't and thus should be thrown in the bitbucket...

Yes and no.  What I am trying to say, or what I think I am saying, is that
without rhyme or reason behind it, whatever MSGID is good for won't make any
difference whether it is thrown into the bitbucket or not.  Without any
meaningful logic it is a complete waste of bytes and of the processing it
causes.  With logic it has potential as a viable accounting
flag/tag/whatever.  I still have doubts about its dupechecking abilities but
at least it has some potential.  Currently I have doubts about any real
usefulness to man or machine.

 ml> while i do /tend/ to agree, i also tend more to not agree... 

Sounds reasonable.

 ml> even then, you then have the problem of crossposted messages... is it 
 ml> a dupe because it is exactly the same message in more than one area? 
 ml> i don't think so...

Nor do I.  As long as it is accountable in the area it shows up in then I
can't see a problem with it, even if it shows up in other areas.  However,
having said that, I'd think there may be a better way to archive
crossposted messages, where one stored copy carries more than one area that
the message is "posted" to.  A single message could fly more than one area
tag.  Thus some redundancy could effectively be eliminated.  No?
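
A rough sketch of that single-copy idea, in Python purely for illustration.
The ArchivedMessage and Archive names, the field layout, and the area names
are all made up for the example; none of it comes from a Fido spec or an
existing tosser.

```python
from dataclasses import dataclass, field


@dataclass
class ArchivedMessage:
    """One stored copy of a message, tagged with every area it was posted to."""
    to: str
    sender: str
    date: str
    body: str
    areas: set = field(default_factory=set)  # hypothetical: set of area tags


class Archive:
    def __init__(self):
        self._by_key = {}

    def store(self, area, to, sender, date, body):
        # Same header and body means the same message; only the tag set grows.
        key = (to, sender, date, body)
        msg = self._by_key.get(key)
        if msg is None:
            msg = ArchivedMessage(to, sender, date, body)
            self._by_key[key] = msg
        msg.areas.add(area)
        return msg

    def in_area(self, area):
        # The message stays accountable in each area it was posted to.
        return [m for m in self._by_key.values() if area in m.areas]

    def __len__(self):
        return len(self._by_key)


archive = Archive()
archive.store("PERL", "All", "Maurice Kinal", "2005-02-20 15:05:50", "crossposted text")
archive.store("LINUX", "All", "Maurice Kinal", "2005-02-20 15:05:50", "crossposted text")
print(len(archive))                              # 1
print(sorted(archive.in_area("PERL")[0].areas))  # ['LINUX', 'PERL']
```

The crosspost is stored once but still turns up when either area is listed,
which is where the redundancy would be saved.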

 ml> detecting duplicates in fidonet is a tricky science,

I would agree with that assessment.

 ml> messaging... wildcat, pcboard and wwiv systems are the first three 
 ml> that come to mind as having shoehorned retrofits for participation in 
 ml> fidonet... quite simply, their message bases were not designed with 
 ml> fidonet in mind... actually, not just fidonet but more without any 
 ml> sort of thought to control lines within messages...

Right.  Having a trimmed down archiving system where all stored messages
only contain what is absolutely needed to successfully be deemed a
"message" - say "To", "From", "Date" - and then tack on whatever else is
required depending on the target, would greatly reduce the amount of
information any archived base or area needs to know.  For instance a
dynamic cgi script could take this information and "convert" it to html for
display to the end user without affecting the archive in any meaningful
way, and that exact same archive could be employed to construct outbound
Fido compliant pkts.
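
As a minimal sketch of the one-archive, many-targets idea: the record
layout and the two function names below are assumptions made for the
example.  The second function only builds echomail message text (area
line, body, origin line); packing that into an actual .pkt file is left
out.

```python
import html

# Hypothetical minimal archived record: only what makes it a "message".
record = {
    "to": "mark lewis",
    "from": "Maurice Kinal",
    "date": "2005-02-20 15:05:50",
    "body": "Hey mark!\nLife is good,",
}


def to_html(rec):
    """Render the record for a web reader; the archive itself is not touched."""
    head = "".join(
        "<b>{}:</b> {}<br>\n".format(label, html.escape(rec[key]))
        for label, key in (("To", "to"), ("From", "from"), ("Date", "date"))
    )
    return head + "<pre>" + html.escape(rec["body"]) + "</pre>"


def to_echomail_text(rec, area, origin_line):
    """Tack on what an outbound echomail body needs for a given target."""
    body = rec["body"].replace("\n", "\r")  # Fido message text ends lines with CR
    return "AREA:{}\r{}\r * Origin: {}\r".format(area, body, origin_line)


print(to_html(record))
print(repr(to_echomail_text(record, "PERL", "Coffin Point (1:153/401.1)")))
```

Both outputs are derived on demand from the same stored record, so
whatever is target-specific never has to live in the archive.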

 ml> it is long past the time when this stuff can truely be fixed and 
 ml> enforced...

Probably but that doesn't mean we can't discuss, and/or employ, any of this
"stuff" to our advantage.  Chances are by doing that we may all
find ourselves complying out of choice as opposed to enforcement ... or so
the theory goes.

 ml> all we can do now is to play the game and hope for the 
 ml> best...

That is one way.

 ml> that takes us to the question of how to build a dataset of messages 
 ml> and what to use as the duplicate trigger...

Right.

 ml> things are done in binary in fidonet because of limited storage space 
 ml> as well as for speed of processing, we have to ask what method would 
 ml> ultimately be the best for quick processing, small storage, and 
 ml> generating truely unique IDs for the local duplicate detection 
 ml> system?

That is a toughy for sure.  Again I would think a standard method of
generation of MSGID would be of great assistance to all.  It isn't
foolproof (is anything?) but it would help.
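
For illustration only, one way a predictable generation scheme could look.
It assumes the usual "origaddr serialno" shape with an eight digit hex
serial; seeding the serial from the clock and counting upward is just one
possible choice, and make_msgid_generator is a made-up name.

```python
import itertools
import time


def make_msgid_generator(origin_address, start=None):
    """Return a function producing 'origaddr serialno' strings.

    The serial starts at the current Unix time (or at `start`, if given) and
    increases by one per message, wrapped to 32 bits and shown as 8 hex digits.
    """
    first = int(time.time()) if start is None else start
    counter = itertools.count(first)

    def next_msgid():
        serial = next(counter) & 0xFFFFFFFF
        return "{} {:08x}".format(origin_address, serial)

    return next_msgid


gen = make_msgid_generator("1:153/401.1", start=0x4218A3CE)
print(gen())  # 1:153/401.1 4218a3ce
print(gen())  # 1:153/401.1 4218a3cf
```

A scheme like this can still collide if a system restarts after writing
more messages than seconds have elapsed, so it helps but isn't foolproof.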

 ml> i can see possibly a two fold method involving recording the actual 
 ml> header data as well as running it thru md5 or some such and recording 
 ml> the MSGID if it exists...

Possibly.  It sounds like it has potential.
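
A small sketch of that two fold idea, assuming the header fields that
matter are To, From, Date and Subject, and that a message only counts as a
dupe when both the header digest and the MSGID (if any) have been seen
together.  The class name, the field choice and the sample MSGID are
assumptions for the example.

```python
import hashlib


def header_digest(to, sender, date, subject):
    """MD5 over the header fields, joined with NUL so fields can't run together."""
    joined = "\0".join((to, sender, date, subject))
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


class DupeChecker:
    def __init__(self):
        self._seen = set()  # (header digest, MSGID or None) pairs

    def is_dupe(self, to, sender, date, subject, msgid=None):
        key = (header_digest(to, sender, date, subject), msgid)
        if key in self._seen:
            return True
        self._seen.add(key)
        return False


checker = DupeChecker()
msgid = "1:153/401.1 4218a3ce"  # made-up MSGID value
hdr = ("mark lewis", "Maurice Kinal", "2005-02-20 15:05:50")
print(checker.is_dupe(*hdr, "talking to myself", msgid))  # False
print(checker.is_dupe(*hdr, "talking to myself", msgid))  # True
print(checker.is_dupe(*hdr, "another subject", msgid))    # False
```

The last call is the MSGID dupe case: the MSGID repeats but the header
differs, so the message is kept rather than dropped.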

 ml> speed... how much time are you willing to spend rummaging thru a 
 ml> duplicate dataset looking for a match before deciding if a message is 
 ml> a duplicate or not?

Heh, heh.  It depends on how big a problem dupes really are.  If there
aren't many REAL dupes then I would say zero "rummaging", but if I were
Rusty and seeing hundreds of REAL dupes then I'd really wish my uplink was
doing better quality control.  But then that of course brings up the
question of whether the uplink is filtering out messages that aren't really
dupes but are instead only MSGID dupes.  I've seen those and have seriously
wondered if the few I do manage to see are representative of a far greater
and unseen problem regarding the whole MSGID situation as it stands today.

 ml> considering your high desire for speed, i can see 
 ml> small datasets (one per message area al la squish?) to ease the 
 ml> search time...

Possibly.  I have been pondering what I wish to do locally for myself all
the way around, not just Fido.

 ml> interesting problem, this is... i'm already visualising multiple dupe 
 ml> dataset files based on the AREA line, locally carried areas 
 ml> notwithstanding due to the processing of passthru areas, or one large 
 ml> or even multiple large datafiles containing AREA grouped datasets of 
 ml> header and MSGID data...

Interesting to ponder.
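
One way the per-area datasets could be sketched: a small, capped set of
keys for each AREA tag, so a lookup only ever searches one area's history.
AreaDupeSets, the cap, and the string keys are assumptions for the
example; a key could be a header digest, a MSGID, or a pair of the two.

```python
from collections import OrderedDict, defaultdict


class AreaDupeSets:
    """One small dupe dataset per AREA tag, each capped at max_entries."""

    def __init__(self, max_entries=1000):
        self.max_entries = max_entries
        self._areas = defaultdict(OrderedDict)

    def check_and_record(self, area, key):
        """Return True if key was already seen in this area, else record it."""
        seen = self._areas[area]
        if key in seen:
            return True
        seen[key] = None
        if len(seen) > self.max_entries:
            seen.popitem(last=False)  # drop the oldest entry for this area
        return False


dupes = AreaDupeSets(max_entries=2)
print(dupes.check_and_record("PERL", "a"))   # False: first sighting
print(dupes.check_and_record("PERL", "a"))   # True: real dupe in the same area
print(dupes.check_and_record("LINUX", "a"))  # False: crosspost, different area
```

Keeping the sets per area also means a crossposted message is never
flagged just because it already showed up somewhere else.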

Life is good,
Maurice

--- Msged/LNX 6.1.2
* Origin: Coffin Point - Ladysmith, BC Canada (1:153/401.1)
SEEN-BY: 633/267 270
@PATH: 153/401 307 140/1 106/2000 633/267

SOURCE: echomail via fidonet.ozzmosis.com
