Great Circle Associates Majordomo-Workers
(July 1997)
 


Subject: Re: Non-ASCII data files in 2.0
From: Jason L Tibbitts III <tibbs @ hpc . uh . edu>
Date: 31 Jul 1997 13:29:31 -0500
To: majordomo-workers @ greatcircle . com
In-reply-to: bill@ecology.bio.dfo.ca's message of Thu, 31 Jul 1997 14:30:15 -0300 (ADT)
References: <199707311730.OAA16799@ecology.bio.dfo.ca>

>>>>> "BS" == Bill Silvert <bill@ecology.bio.dfo.ca> writes:

BS> According to something that was posted here a while back (by Jason I
BS> think), the lists will be kept in binary format for faster processing
BS> speed.

That would depend on your local decisions.  Right now, everything's kept in
a flat text database that looks like this:

XYX:sina:2.0-lists/hpc.uh.edu/test-list> cat -T _subscribers 
tibbs@hpc.uh.edu^Itibbs@sina.hpc.uh.edu^Itibbs@sina.hpc.uh.edu^I868240998^I869893830^Ieach^I^IASh^Ihurl
nobody@hpc.uh.edu^Inobody@hpc.uh.edu^Inobody@hpc.uh.edu^I869893883^I869893883^Ieach^Ihurl^IAS^Ihurl
nobody@nordland.no^Inobody@nordland.no^Inobody@nordland.no^I869893927^I869893927^Ieach^Ihurl^IAS^Ihurl
[...]

I haven't gotten around to writing the other backends yet.
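
As a rough illustration of reading one of these records, here is a minimal
Python sketch.  It assumes only what the listing shows: fields are separated
by tabs (the ^I in the cat -T output) and empty fields are significant.  The
meaning of each column is not stated in this message, so the sketch leaves
the fields unnamed; treating the all-digit fields as Unix timestamps is an
assumption.

```python
from datetime import datetime, timezone

# One record copied from the listing above, with ^I written as a real tab.
SAMPLE = (
    "tibbs@hpc.uh.edu\ttibbs@sina.hpc.uh.edu\ttibbs@sina.hpc.uh.edu\t"
    "868240998\t869893830\teach\t\tASh\thurl"
)

def parse_record(line):
    """Split one tab-separated record, keeping empty fields in place."""
    fields = line.rstrip("\n").split("\t")
    decoded = []
    for field in fields:
        # Assumption: purely numeric fields are seconds since the Unix epoch.
        if field.isdigit():
            decoded.append(datetime.fromtimestamp(int(field), tz=timezone.utc))
        else:
            decoded.append(field)
    return decoded

record = parse_record(SAMPLE)
for position, value in enumerate(record):
    print(position, repr(value))
```

Splitting on a literal tab rather than on arbitrary whitespace matters here,
because the record contains an empty field that would otherwise collapse.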

BS> The problem with this is that I find myself using grep and similar
BS> tools a lot to track down bad addresses, and if I lose that capability
BS> it may cost me a lot more time than the use of binary files will save
BS> me.

'which' takes a Perl regexp; I could easily make 'who' do the same.  You
can call these from the command line; you could even set up a 'listgrep'
alias and save typing in the long run.
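
To make the 'listgrep' idea concrete, here is a hedged Python sketch of a
regexp search over flat-file records.  The function name listgrep is
borrowed from the alias suggested above, but this is not Majordomo code: it
uses Python's re module rather than Perl regexps, and it assumes the first
tab-separated field is the address worth matching.

```python
import re

def listgrep(pattern, lines):
    """Return the records whose first tab-separated field matches pattern."""
    rx = re.compile(pattern)
    # Assumption: the first field holds the address to search.
    return [ln for ln in lines if rx.search(ln.split("\t", 1)[0])]

# Shortened records modelled on the listing earlier in this message.
RECORDS = [
    "tibbs@hpc.uh.edu\ttibbs@sina.hpc.uh.edu",
    "nobody@hpc.uh.edu\tnobody@hpc.uh.edu",
    "nobody@nordland.no\tnobody@nordland.no",
]

matches = listgrep(r"@hpc\.uh\.edu$", RECORDS)
for line in matches:
    print(line)
```

Restricting the match to one field avoids false hits on the other address
columns, which plain grep over the whole line would not do.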

The problem with flat files is that they have bad O(n) properties, while
B+Trees (arranged through the use of Berkeley DB) have nice O(log n)
properties _and_ come out sorted by whatever criteria I desire.  MySQL or
mSQL or Oracle or Informix or whatever (all accessed transparently via DBI)
would let me do quick extractions of things like subscriber classes _and_
have nice O(log n) or better properties.  Small lists don't care about this
stuff, but large lists do.  Unfortunately, all of the extra fields do slow
down the flat-file case to worse than the 1.9x releases' performance, but
that's life.
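
The O(n) versus O(log n) difference can be sketched in a few lines of
Python.  This is a stand-in, not a B+Tree and not Berkeley DB: it uses
binary search over a sorted in-memory list (the standard bisect module) only
to show how the number of comparisons grows.  The addresses are made up for
the example.

```python
from bisect import bisect_left

def linear_find(addresses, target):
    """Scan from the top: O(n) comparisons in the worst case."""
    steps = 0
    for i, addr in enumerate(addresses):
        steps += 1
        if addr == target:
            return i, steps
    return -1, steps

def sorted_find(sorted_addresses, target):
    """Binary search over sorted keys: O(log n) comparisons."""
    i = bisect_left(sorted_addresses, target)
    if i < len(sorted_addresses) and sorted_addresses[i] == target:
        return i
    return -1

# Hypothetical subscriber list of 100,000 sorted addresses.
addresses = sorted(f"user{n:06d}@example.com" for n in range(100_000))
target = "user099999@example.com"

idx, steps = linear_find(addresses, target)
print("linear scan comparisons:", steps)
print("binary search index:", sorted_find(addresses, target))
```

With 100,000 entries the scan touches every record to reach the last one,
while a binary search needs on the order of 17 comparisons.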

All of this is abstracted anyway; there will be lots of hacking potential.

BS> (1) Will the ASCII format still be supported? I think that this was
BS> mentioned, but I didn't make a careful note when the message passed by.

I've always said that you'll have the option.

BS> (2) Will something equivalent to grep work on the binary files?

Of course all of the majordomo commands will work on the databases _and_
have the benefit of not requiring you to have access to the server machine.
'which' is the majordomo equivalent of grep.  You could, of course, always
do a 'who' to a file and grep that, even if I didn't support all of the
other methods and you weren't local.

In the end, I suspect that folks (including me) use grep and vi to remove
bad addresses because the other interfaces suck.  I (and I'm sure a whole
pile of other people) hope to change that.

 - J<


