Re: The Practical Benefits of the Relational Model
Date: 30 Oct 2002 11:38:45 -0800
Message-ID: <e9d83568.0210301138.2e5099e6_at_posting.google.com>
Sticking to the name of this thread posting
(Practical benefits of the Relational Model)
I would like to give an example of
a query that is easy to express in
a language that conforms to
"The Third Manifesto".
(see www.thethirdmanifesto.com
for more details)
In this example I am using the language
D4 of Dataphor, since it is (to my knowledge)
the only "live" language that conforms.
Problem: given the standard (slightly
simplified) "Customers and Orders" database
make a query that gives all customers
with the customers two most popular
products (in terms of total purchases)
listed in one column.
So an example result would be like this:
//CustomerId CustomerName MostPopularProducts
//---------- ------------ -------------------
//1 Customer 1 Screw, Nail
//2 Customer 2 Screw, Drill
(if you want to skip the details go to
the end of the posting)
Here is the schema:
create table Customer
{ CustomerId : Integer,
CustomerName : String,
key { CustomerId }
};
create table Product
{ ProductId : Integer,
ProductName : String,
ProductPrice : Integer,
key { ProductId },
key { ProductName }
};
create table Order
{ OrderId : Integer,
CustomerId : Integer,
OrderQty : Integer,
ProductId : Integer,
key { OrderId },
reference Order_Customer {CustomerId}
references Customer {CustomerId},
reference Order_Product {ProductId}
references Product {ProductId}
};
Here is the sample data:
Customer :=
table { row { 1 CustomerId, 'Customer 1' CustomerName }, row { 2 , 'Customer 2' }
};
Product :=
table { row { 1 ProductId, 'Hammer' ProductName, 10 ProductPrice }, row { 2 , 'Nail' , 1 }, row { 3 , 'Screw' , 2 }, row { 4 , 'Saw' , 15 }, row { 5 , 'Drill' , 30 }
};
Order :=
table {
row { 1 OrderId, 1 CustomerId, 1 OrderQty, 1 ProductId }, row { 2, 1, 9, 2 }, row { 3, 1, 2, 2 }, row { 4, 1, 6, 3 }, row { 5, 2, 1, 5 }, row { 6, 2, 29, 2 }, row { 7, 2, 14, 3 }, row { 8, 2, 2, 3 }
};
Here is a summary of the contents of the tables:
select
Customer join
Order join
Product
add { OrderQty*ProductPrice OrderTotal }
over { CustomerName, OrderId, ProductName, OrderTotal };
//CustomerName OrderId ProductName OrderTotal
//------------ ------- ----------- ----------
//Customer 1 1 Hammer 10
//Customer 1 2 Nail 9
//Customer 1 3 Nail 2
//Customer 1 4 Screw 12
//Customer 2 5 Drill 30
//Customer 2 6 Nail 29
//Customer 2 7 Screw 28
//Customer 2 8 Screw 4
Now we can start constructing the query:
- create aggregate operator for concatenating strings
create aggregate operator ListString(AValue : String) : String
initialization
begin
var LString : String := "";
var comma : String := "";
end
aggregation
begin
LString := LString + comma +AValue;
comma := ", ";
end
finalization
begin
result := LString;
end;
Here is an example of invokation:
select Product
group
add { ListString( ProductName ) ProductNames };
//ProductNames
//-------------------------------
//Hammer, Nail, Screw, Saw, Drill
2) Get total price for each customer/product combination
select
Customer join
Order join
Product
add { OrderQty*ProductPrice OrderTotal }
group by { CustomerId, ProductName }
add { Sum( OrderTotal ) CustProductTotal };
CustomerId ProductName CustProductTotal
---------- ----------- ---------------- 1 Hammer 10 1 Nail 11 1 Screw 12 2 Drill 30 2 Nail 29 2 Screw 32
3) Get top two products for Customer 1
select
Customer join
Order join
Product
where CustomerId = 1
add { OrderQty*ProductPrice OrderTotal }
group by { CustomerId, ProductName }
add { Sum( OrderTotal ) CustProductTotal }
return 2 by { CustProductTotal desc };
CustomerId ProductName CustProductTotal
---------- ----------- ---------------- 1 Screw 12 1 Nail 11
4) Wrap it all up
select
Customer
add
{
ListString ( ProductName from ( Order join Product rename { CustomerId OrderCustomerId } add { OrderQty*ProductPrice OrderTotal } where CustomerId = OrderCustomerId group by {ProductId, ProductName} add { Sum(OrderTotal) CustProductTotal } return 2 by {CustProductTotal desc} ) ) MostPopularProducts
};
And this is the result we wanted...
//CustomerId CustomerName MostPopularProducts
//---------- ------------ -------------------
//1 Customer 1 Screw, Nail
//2 Customer 2 Screw, Drill
Points worth noting:
- The ListString aggregate operator is part of infrastructure, i.e. it is only created once by perhaps some in-house support group or maybe bought as a component from some vendor.
- The final expression is very compact consisting of only 18 rows. The corresponding query done with SQL and a programming language (maybe a stored procedure) would be much longer.
- The expression is easy to create one step at a time simplifying debugging
- Because the expression only specifies what we want (and not how to get it) a good optimizer could find a clever access path using indexes, statistical information etc... In a hand coded version the access path would always be fixed.
- The '2' in 'return 2' can be replaced by any expression that returns an integer, so it could be parametrized, even by customer.
It is interesting to compare this to
APL, which is a 'vector processing engine'.
The ultimate RDBMS is a 'relation processing
engine'!
'KA' writes in http://www.pgro.uk7.net/logical.htm that 80% of the relational potential has been realised with SQL, and the remaining 20% would be hard to achieve.
In my opinion we have barely dipped our toes in a vast ocean! There is _so_ much more we could get.
It is so hard for me to understand why so few seem share my excitement.
Lauri Pietarinen Received on Wed Oct 30 2002 - 20:38:45 CET