Re: The Practical Benefits of the Relational Model

From: Lauri Pietarinen <lauri.pietarinen_at_atbusiness.com>
Date: 30 Oct 2002 11:38:45 -0800
Message-ID: <e9d83568.0210301138.2e5099e6_at_posting.google.com>


Sticking to the name of this thread posting (Practical benefits of the Relational Model) I would like to give an example of
a query that is easy to express in
a language that conforms to
"The Third Manifesto".

(see www.thethirdmanifesto.com
 for more details)

In this example I am using the language
D4 of Dataphor, since it is (to my knowledge) the only "live" language that conforms.

Problem: given the standard (slightly
simplified) "Customers and Orders" database make a query that gives all customers
with the customers two most popular
products (in terms of total purchases)
listed in one column.

So an example result would be like this:

//CustomerId CustomerName MostPopularProducts
//---------- ------------ -------------------
//1 Customer 1 Screw, Nail
//2 Customer 2 Screw, Drill

(if you want to skip the details go to
the end of the posting)

Here is the schema:

create table Customer
 { CustomerId : Integer,
   CustomerName : String,
   key { CustomerId }
 };    

create table Product
  { ProductId : Integer,
    ProductName : String,
    ProductPrice : Integer,
    key { ProductId },
    key { ProductName }
  };   

create table Order
  { OrderId : Integer,
    CustomerId : Integer,
    OrderQty : Integer,
    ProductId : Integer,
    key { OrderId },
    reference Order_Customer {CustomerId}  references Customer {CustomerId},

    reference Order_Product {ProductId}  references Product {ProductId}
  };   

Here is the sample data:

Customer :=

 table {
   row { 1 CustomerId, 'Customer 1' CustomerName },
   row { 2 , 'Customer 2' }

 };  

Product :=

  table {
    row { 1 ProductId, 'Hammer' ProductName, 10 ProductPrice },
    row { 2 , 'Nail' , 1 },
    row { 3 , 'Screw' , 2 },
    row { 4 , 'Saw' , 15 },
    row { 5 , 'Drill' , 30 }    

  };   

Order :=

   table {

      row { 1 OrderId, 1 CustomerId, 1 OrderQty, 1 ProductId },
      row { 2, 1, 9, 2 },
      row { 3, 1, 2, 2 },
      row { 4, 1, 6, 3 },
      row { 5, 2, 1, 5 },
      row { 6, 2, 29, 2 },
      row { 7, 2, 14, 3 },
      row { 8, 2, 2, 3 }

   };

Here is a summary of the contents of the tables:    

select
  Customer join
  Order join
  Product
  add { OrderQty*ProductPrice OrderTotal }   over { CustomerName, OrderId, ProductName, OrderTotal };

//CustomerName OrderId ProductName OrderTotal
//------------ ------- ----------- ----------
//Customer 1 1 Hammer 10
//Customer 1 2 Nail 9
//Customer 1 3 Nail 2
//Customer 1 4 Screw 12
//Customer 2 5 Drill 30
//Customer 2 6 Nail 29
//Customer 2 7 Screw 28
//Customer 2 8 Screw 4
   

Now we can start constructing the query:      

  1. create aggregate operator for concatenating strings

create aggregate operator ListString(AValue : String) : String   initialization
   begin
    var LString : String := "";
    var comma : String := "";
   end
  aggregation
   begin
    LString := LString + comma +AValue;
    comma := ", ";
   end
  finalization
   begin
    result := LString;
   end;

Here is an example of invokation:    

 select Product
  group
  add { ListString( ProductName ) ProductNames };   

//ProductNames
//-------------------------------
//Hammer, Nail, Screw, Saw, Drill
  

2) Get total price for each customer/product combination

select
  Customer join
  Order join
  Product
  add { OrderQty*ProductPrice OrderTotal }   group by { CustomerId, ProductName }
  add { Sum( OrderTotal ) CustProductTotal };   

CustomerId ProductName CustProductTotal

---------- ----------- ---------------- 
1          Hammer      10               
1          Nail        11               
1          Screw       12               
2          Drill       30               
2          Nail        29               
2          Screw       32                 
  

3) Get top two products for Customer 1

select
  Customer join
  Order join
  Product
  where CustomerId = 1
  add { OrderQty*ProductPrice OrderTotal }   group by { CustomerId, ProductName }
  add { Sum( OrderTotal ) CustProductTotal }   return 2 by { CustProductTotal desc };   

CustomerId ProductName CustProductTotal

---------- ----------- ---------------- 
1          Screw       12               
1          Nail        11               

4) Wrap it all up    

select
 Customer
   add
    {

      ListString
        (
          ProductName from
            (
              Order join Product
                rename { CustomerId OrderCustomerId }
                add { OrderQty*ProductPrice OrderTotal }
                where CustomerId = OrderCustomerId
                group by {ProductId, ProductName}
                add { Sum(OrderTotal) CustProductTotal }
                return 2 by {CustProductTotal desc}
            )
          ) MostPopularProducts

    };

And this is the result we wanted...

//CustomerId CustomerName MostPopularProducts
//---------- ------------ -------------------
//1 Customer 1 Screw, Nail
//2 Customer 2 Screw, Drill

Points worth noting:

  1. The ListString aggregate operator is part of infrastructure, i.e. it is only created once by perhaps some in-house support group or maybe bought as a component from some vendor.
  2. The final expression is very compact consisting of only 18 rows. The corresponding query done with SQL and a programming language (maybe a stored procedure) would be much longer.
  3. The expression is easy to create one step at a time simplifying debugging
  4. Because the expression only specifies what we want (and not how to get it) a good optimizer could find a clever access path using indexes, statistical information etc... In a hand coded version the access path would always be fixed.
  5. The '2' in 'return 2' can be replaced by any expression that returns an integer, so it could be parametrized, even by customer.

It is interesting to compare this to
APL, which is a 'vector processing engine'. The ultimate RDBMS is a 'relation processing engine'!

'KA' writes in http://www.pgro.uk7.net/logical.htm that 80% of the relational potential has been realised with SQL, and the remaining 20% would be hard to achieve.

In my opinion we have barely dipped our toes in a vast ocean! There is _so_ much more we could get.

It is so hard for me to understand why so few seem share my excitement.

Lauri Pietarinen Received on Wed Oct 30 2002 - 20:38:45 CET

Original text of this message