Great Circle Associates List-Managers
(December 1996)
 


Subject: Re: Re[4]: Lyris
From: Eric Thomas <ERIC @ VM . SE . LSOFT . COM>
Date: Sun, 1 Dec 1996 10:55:09 +0100
To: brian @ ilinx . ilinx . com, rogerk @ QueerNet . ORG, Roger Fajman <RAF @ CU . NIH . GOV>
Cc: list-managers @ GreatCircle . COM
In-reply-to: Message of Sat, 30 Nov 1996 23:08:51 EST from list-managers-owner@GreatCircle.COM

On Sat, 30 Nov 1996  23:08:51 EST Roger Fajman <RAF@CU.NIH.GOV> said:

>Scaling up isn't always simply a matter of purchasing more powerful
>hardware.  Sometimes improved algorithms are needed.

To illustrate  this point,  here are performance  figures from  the three
versions  of  LISTSERV  (Lite,  Classic and  High  Performance)  for  the
so-called  "biglist"  test  suite.  This  creates  a  list  with  100,000
subscribers and a request from its list owner to add 1,000 subscribers to
it. The timing runs from the first to the last line of code in the ADD
command, i.e. it includes command parsing, privilege/password checks,
RFC822 parsing (for the individual name/address pairs, which can be
specified in any valid RFC822 format), time to perform alias hostname
checks, and so forth. However, the time to process the incoming mail
message and decide that it is an ADD request from JOE@FOO.COM is not
counted. I ran the test on my PC (P90 with 32M). Since this is not for a
press release or the like, I didn't bother to close other applications
and so on. Similarly, I extrapolated the Classic and Lite figures from
smaller runs because I don't have all day.

+---------+--------------+--------------+-------------+--------------+
|         | ELAPSED time | ELAPSED time | Users added | Ratio of CPU |
| Version | (1000 users) |  (per user)  |  per second | to elapsed   |
+---------+--------------+--------------+-------------+--------------+
| Lite    |    2h 4m 23s |     7.47 sec |        0.13 |        31.6% |
+---------+--------------+--------------+-------------+--------------+
| Classic |      27m 13s |     1.63 sec |        0.61 |        43.5% |
+---------+--------------+--------------+-------------+--------------+
| HPO     |     0.44 sec |  0.00044 sec |     2264.15 |        79.1% |
+---------+--------------+--------------+-------------+--------------+
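
The per-user and per-second columns follow from the elapsed times by
simple division. As a rough check, here is a minimal Python sketch that
redoes the arithmetic from the elapsed times as printed in the table; the
names (ELAPSED_SECONDS, per_user_seconds, users_per_second) are
illustrative and not part of LISTSERV. Small differences from the table
(e.g. about 2272 rather than 2264.15 users per second for HPO) are
presumably because the table was computed from unrounded timings.

```python
# Elapsed times for adding 1,000 users, copied from the table (seconds).
ELAPSED_SECONDS = {
    "Lite": 2 * 3600 + 4 * 60 + 23,   # 2h 4m 23s
    "Classic": 27 * 60 + 13,          # 27m 13s
    "HPO": 0.44,
}
USERS_ADDED = 1000


def per_user_seconds(elapsed, users=USERS_ADDED):
    """Average elapsed time per added user."""
    return elapsed / users


def users_per_second(elapsed, users=USERS_ADDED):
    """Average number of users added per second of elapsed time."""
    return users / elapsed


for version, elapsed in ELAPSED_SECONDS.items():
    print(f"{version:8s} {per_user_seconds(elapsed):10.5f} s/user "
          f"{users_per_second(elapsed):10.2f} users/s")
```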

The Lite  version uses  the same  data format as  the other  versions (to
facilitate migration)  but treats  the files as  flat, plain  text files.
It's a bit  like reading a database sequentially until  you've found what
you're looking for,  as opposed to searching for it  directly. At 7.5 sec
for an  ADD to a  list of  100k subscribers, it  is probably in  the same
league as  the compiled freebies,  possibly somewhat faster. You  can buy
the Lite version and  run a list of that size with it,  and we don't mind
at all. But you're  going to have to put very serious  money on the table
to get a machine big enough to give you decent numbers here. Even
upgrading to an infinitely fast CPU would only cut the elapsed time by
about 31%, because this algorithm is I/O bound. It takes about 1-2
seconds of development time (plus implementation of course). Note that
the 7.5 sec result already includes
a large read cache, provided automatically by NT (any free RAM is used as
a file cache until a better use is found).
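
To make the "reading sequentially" versus "searching directly"
distinction concrete, here is a minimal Python sketch. It is not
LISTSERV's actual file format or algorithm: the one-record-per-line
layout ("address name"), the in-memory dictionary index, and the function
names are all assumptions chosen for illustration only.

```python
def find_sequential(lines, address):
    """Flat-file style: read records one by one until the address matches."""
    wanted = address.lower()
    for position, line in enumerate(lines):
        if line.split(maxsplit=1)[0].lower() == wanted:
            return position
    return None


def build_index(lines):
    """One pass over the records to build an address -> position map."""
    return {line.split(maxsplit=1)[0].lower(): position
            for position, line in enumerate(lines)}


def find_indexed(index, address):
    """Direct lookup: one dictionary probe, whatever the list size."""
    return index.get(address.lower())


# Hypothetical 100,000-subscriber list in the assumed "address name" layout.
subscribers = [f"user{n}@example.com User {n}" for n in range(100_000)]
index = build_index(subscribers)

# Both approaches find the same record; the sequential one has to walk
# through every earlier record first.
assert find_sequential(subscribers, "user99999@example.com") == 99_999
assert find_indexed(index, "user99999@example.com") == 99_999
```

With the sequential approach each duplicate check for an ADD costs time
proportional to the list size, so the per-user cost grows with the list;
with an index it stays roughly constant.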

The  Classic   version  uses  the  traditional   LISTSERV  algorithms  to
manipulate the data  files. At 1.6 sec it should  outperform all freebies
and most commercial  products. The difference is actually  much bigger on
systems  which don't  implement  such a  huge file  cache.  That is,  the
difference may not  be enormous on NT,  but it used to be  much bigger at
the time  these algorithms were  designed. Quite  a lot of  thinking went
into  them,  and  the  actual  implementation  was  also  optimized.  The
algorithm is only 43% CPU bound because the goal was to reduce CPU usage,
not I/O, since on mainframes people are traditionally billed based on the
amount  of CPU  time they  have used.  Saving I/O  reduces the  number of
system calls  and thus  saves CPU  cycles, but this  was not  the primary
goal.

Again you are welcome to buy a Classic license and a big machine to run a
list of this size.  1.6 sec per ADD may very well  be sufficient for your
needs, present  and projected. But quite  a number of customers  needed a
faster system, so we developed one. It  was a LOT of work, but it allowed
people to do what they needed to do using standard PC hardware. Even on a
P90 with  32M you  can add  2200 users  per second,  and this  process is
mostly CPU bound (95% CPU bound if I include the cycles the system spends
performing I/O on the application's behalf). That is not really
surprising, since every individual address needs to be parsed, and the
figures are for the total elapsed command processing time, not just for
the act of adding an address to the database or whatever you want to call
it.
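
The "Ratio of CPU to elapsed" column puts a ceiling on what a faster
processor alone could buy for each version. A minimal Python sketch of
that bound follows, under the simplifying assumption that the non-CPU
part of the elapsed time (mostly I/O wait) stays exactly the same; the
function name is illustrative.

```python
# CPU share of elapsed time, copied from the table.
CPU_SHARE = {"Lite": 0.316, "Classic": 0.435, "HPO": 0.791}


def best_case_with_infinite_cpu(cpu_share):
    """Return (fraction of elapsed time saved, throughput multiplier)
    if the CPU portion took zero time and everything else was unchanged."""
    remaining = 1.0 - cpu_share
    return cpu_share, 1.0 / remaining


for version, share in CPU_SHARE.items():
    saved, multiplier = best_case_with_infinite_cpu(share)
    print(f"{version:8s} at most {saved:.1%} less elapsed time, "
          f"{multiplier:.2f}x throughput")
```

Under that assumption, the Lite version could gain at most about 1.5x
from CPU power alone, which is why a better algorithm matters more there
than bigger hardware.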

Why  should you  pay  more for  the High  Performance  version, which  is
otherwise  identical   to  the   Classic  version?  Simply   because  the
overwhelming  majority  of customers  couldn't  care  less how  well  the
product performs with  lists of that size. We would  never have developed
and tested these  algorithms if we had had to  sell the resulting product
at the normal price. In the  end the interested parties would have bought
$200k+ in  hardware to  get acceptable results  with Classic,  or someone
else  would have  made a  high performance  list manager  for this  niche
market, and would have charged according to the number of customers, i.e.
pretty  much what  we're charging  for the  High Performance  version. We
decided that it  would be stupid not  to grab that market as  well, so we
developed  the High  Performance  version,  but we  really  see  it as  a
separate product addressing very different needs.

  Eric
