
    Raghu,

    I include my analysis, opinions, and suggestions on aggregation
    within Coral, to allow for their examination prior to our meeting
    Thursday.

    As usual, it's a bit long.


    Aggregation:

    Desires:

    1) "Easy" aggregation should be much faster.

    2) It should be possible to easily add new aggregates
       (stddev, etc.), which need not be particularly fast, though
       hopefully faster than the current implementation.


    After examining the code and discussing it with Joseph, I have
    determined (with high probability) the main efficiency problems
    of the current implementation:

    1) The tuples used by a relation with an aggregate are deleted
       using lazy-deletion, even though the arglist (containing the
       aggregate) is moved to a new tuple (which is then inserted).

       These tuples will not be deleted until the next scan of the
       relation, and perhaps not even then (the current lazy-deletion
       implementation will not delete during a scan unless there is
       only a single iterator open on the relation at that time).

       The newly created tuple, with the arglist borrowed from the
       now "deleted" tuple, is inserted into the relation.

       So, the per-fact memory cost of inserting this new tuple is:
        sizeof(Tuple) + (number of indexes) * sizeof(HashNode)

       The sizes of the classes of interest:
        Tuple:    16 bytes
        HashNode: 12 bytes

       For example, for a relation with 3 indexes and 100K facts,
       performing a count(<>) over a single argument will, at the
       end of the scan, result in approximately 5.2 Meg of unneeded
       tuples/hashnodes that are marked as deleted.  A few Meg here,
       a few there, and after a while we're talking real memory. (sorry)


       Ideally, a new tuple should not be created, but the current
       one simply reused.  This would not only save all this memory
       from being allocated, but avoid rehashing the columns that
       do not contain the aggregate, as they already have the correct
       entry in their respective index.  

       What we need is a function to move the HashNode for the aggregate
       column from its current to its new position in the index for
       it (after rehashing).  In general, a delete-from-index followed
       by an insert-to-index would do the job, but this moving
       operation should probably be special-cased.
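       As a rough sketch of what that special-cased move might look
       like, ignoring for the moment the open-iterator problem
       discussed next (the names Index and moveNode are hypothetical,
       not Coral's actual classes):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch of the special-cased move: instead of a full
// delete-then-insert, unlink the aggregate column's node from its old
// chain and relink it into the chain for its new hash value.  No
// allocation is needed, and the other columns are never rehashed.
struct HashNode {
    HashNode* next;
    void*     tuple;   // payload
};

struct Index {
    HashNode**  buckets;
    std::size_t nbuckets;

    // Move `node` from bucket `oldB` to bucket `newB` (the new bucket
    // having been computed by rehashing the updated aggregate value).
    void moveNode(HashNode* node, std::size_t oldB, std::size_t newB) {
        // Unlink from the old chain.
        HashNode** p = &buckets[oldB];
        while (*p != node)
            p = &(*p)->next;
        *p = node->next;
        // Push onto the front of the new chain.
        node->next    = buckets[newB];
        buckets[newB] = node;
    }
};
```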

       The problem with this (and probably the reason it's not
       currently done this way) is that a scan over the relation
       only keeps track of which bucket in the hash it's on and a
       pointer to the next HashNode to process.  Deleting or moving
       that hash node while it's being pointed to by another iterator
       would confuse that iterator on its next attempt to obtain a
       tuple; this is why lazy-deletes are currently performed only
       when there is a single scan open on the relation, so we are
       certain this cannot occur.

       Care must be used here, as any change to the efficiency of this
       data structure will affect every scan, not just those involving
       aggregates.

       One possibility is for the hash table, instead of being an
       array of HashNode*, to be an array of HashNodeStack*; where
       a HashNodeStack would be a dynamic array of HashNode*.
       At first glance, this would appear to consume a sizeable amount
       of memory, but the HashNode itself would no longer require a
       HashNode* for a next pointer, which would compensate for the
       pointers within the stack itself.  

       So, each iterator, rather than keeping a bucket number and a
       pointer to the next HashNode, would store the bucket number
       and the current index into the HashNodeStack.  When an item
       is removed, the stack could be compressed, or the entry
       simply set to NULL.  (BTW, Coral's existing IntStack and
       CArgStack classes already have methods for all of these operations.)

       So, this method would require an extra indirection each time
       a scan obtained a bucket, another to index into the Stack,
       and a test of the index when moving to the next item.
       (The test against NULL is already being performed by the current
       implementation to watch for the end of the chain.)

       In my opinion the above cost is well worth the gain in flexibility
       and memory savings this approach offers.

       Other approaches to adding the required indirection could also
       be considered; the above method struck me as fairly clean, with
       a minimal addition of overhead.
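       A minimal sketch of the bucket-of-stacks idea (all names here
       are hypothetical, and I'm using std::vector where Coral would
       presumably use its own stack classes):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch of the bucket-of-stacks layout.  Each bucket is
// a dynamic array of HashNode*, so an iterator holds a (bucket, index)
// pair instead of a raw next-pointer.  A node can then be removed (or
// moved) without invalidating other iterators: its slot is set to NULL
// and skipped on the next advance.
struct HashNode { int key; };

struct HashTable {
    std::vector<std::vector<HashNode*>> buckets;
    explicit HashTable(std::size_t n) : buckets(n) {}
};

struct Iterator {
    HashTable&  table;
    std::size_t bucket;
    std::size_t index;

    // Return the next live node, skipping NULLed slots and exhausted
    // buckets; nullptr at end of table.
    HashNode* next() {
        while (bucket < table.buckets.size()) {
            std::vector<HashNode*>& stack = table.buckets[bucket];
            while (index < stack.size()) {
                HashNode* n = stack[index++];
                if (n) return n;   // slot may have been NULLed by a remove
            }
            ++bucket;
            index = 0;
        }
        return nullptr;
    }
};
```

       Removal under an open scan then reduces to NULLing the slot;
       the extra costs are exactly those noted above: one more
       indirection per bucket, an index into the stack, and a bounds
       test per advance.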


    2) Changing aggregates to be more extensible.  The aggregates
       we currently have that do not require a test for distinctness
       can all (in theory) be implemented efficiently.  Adding a
       new aggregate is, in general, rather difficult though.
       Adding a set operator, on the other hand, is trivial.  Each of
       the currently implemented aggregates has an identical set operator
       which can be used in the body of a rule, e.g.:

       ? count( {a,b,c,d}, X).
       X=4. 

    Currently, we have the following:


    General:

         <X>
         distinct(<X>)
         min(<X>)
         max(<X>)
         any(<X>)

    Numerical:  

         MultiSet                Set
        ----------            ---------
         count(<X>)         count_distinct(<X>)
         sum(<X>)           sum_distinct(<X>)
         prod(<X>)          prod_distinct(<X>)
         avg(<X>)           avg_distinct(<X>)


    From a user-interface point of view, the <X> can be seen as
    "make a multiset", while distinct() can be seen as
    "make a set (from a multiset)".  As such, I think the "distinct"
    aggregates should be represented as:

         count( distinct(<X>) )
         sum( distinct(<X>) )
         prod( distinct(<X>) )
         avg( distinct(<X>) )

    Note that the representation does not necessarily have anything to
    do with the internal representation, or how these aggregates are
    solved.  Of course, if we were to make this change, the old forms
    would be retained for a while for backward compatibility.

    With this slight modification, and the addition of a set operator
    such as:

        stddev( {90,97,94,92}, X)

    (which we already have), we could stipulate that if no implemented
    aggregate is found, Coral checks for a set operator of the same
    name and applies it when the result of this "aggregation" is needed.

    This method would supply both "stddev(<>)" as well as
    "stddev( distinct(<>) )" naturally.

    It would also extend naturally to the aggregates for which there
    is no advantage to special-casing, e.g.:

    count( distinct(<>) )

    eliminating a portion of code within Coral that replicates abilities
    that exist elsewhere.
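    The fallback rule could be sketched as follows (purely
    hypothetical dispatch code, not Coral's internals; set operators
    here are modeled as functions over a materialized multiset):

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

using Multiset = std::vector<double>;
using SetOp    = std::function<double(const Multiset&)>;

// Set operators usable in rule bodies; stddev etc. would register here.
std::map<std::string, SetOp> set_ops = {
    {"count", [](const Multiset& s) { return (double)s.size(); }},
    {"sum",   [](const Multiset& s) {
        double t = 0;
        for (double v : s) t += v;
        return t;
    }},
};

// Aggregates with specialized, incremental implementations.
std::map<std::string, SetOp> fast_aggs;  // e.g. sum, min, max, ...

// Resolve an aggregation by name: prefer the specialized aggregate;
// otherwise fall back to the set operator of the same name, applied
// when the result of the "aggregation" is needed.
double apply_aggregate(const std::string& name, const Multiset& s) {
    auto it = fast_aggs.find(name);
    if (it != fast_aggs.end())
        return it->second(s);
    return set_ops.at(name)(s);
}
```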

    With a touch of trickery, this method could be extended to allow an
    aggregation function to be specified as a Coral module as well:

    module myagg.
    export myagg[bf].

    myagg( Set, Ans ) :- blah, blah, blah.
    ....
    end_module.
  
 
    Shaun 
-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Shaun Flisakowski 

  Associate Researcher / Coral DB Programmer   flisakow@cs.wisc.edu
  Computer Science Dept, UW-Madison

  Don't you guys find it tedious typing the same thing at the end
  of each message every time you mail something?
  I know I do, but when in Rome...

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
