Query Operators in Cω
Cω adds two broad classes of query operators to the C# language: XPath-based operators for querying the member variables of an object by name or by type and SQL-based operators for performing sophisticated queries involving projection, grouping, and joining of data from one or more objects.
XPath-Based Operators
With the existence of streams and anonymous structs that can have multiple members with the same name, even ordinary direct member access with the .
operator in Cω can be considered a query operation. For example, the operation books.Book.title
from the previous section returns the titles of all the Book
objects contained within the Books
class. This is equivalent to the XPath query /Books/Book/title
which returns all the titles of Book
elements contained within the Books
element.
The wildcard member access operator .*
is used to retrieve all the fields of a type. This operator is equivalent to the XPath query child::*
which returns all the child nodes of the principal node type of the current node. For example, the operation books.Book.*
returns a stream container of all the members of all the Book
objects contained within the Books
class. This is equivalent to the XPath query /Books/Book/*
which returns all the child elements of the Book
elements contained within the Books
element.
Cω also supports transitive member access using the ...
operator which is analogous to the descendent-or-self
axis or ..
abbreviated path in XPath. The operation books...title
returns a stream containing all title
member fields that are contained within the Books
class or are member fields of any of its contents in a recursive manner. This is equivalent to the XPath query /Books//title
which returns all title
elements that are descendants of the Books
element. The transitive member access operator can also be used to match nodes restricted to a particular type using syntax of the form ...typename::*
. For example, the operation books...string::*
returns a stream containing all the member fields of type System.String
that are contained within the Books
class or are member fields of any of its contents in a recursive manner. This is analogous to the XPath 2.0 query /Books//element(*, xs:string)
which matches any descendent of the Books
element of type xs:string
.
Filter operations can be applied to the results of a transitive member access in the same manner that predicates can be used to filter XPath queries. Just as in XPath, a Cω filter is applied to a query operation by using the [expression]
operator placed after the query. As is the case with apply-to-all-expressions, the filter may contain the special variable it
which is bound to each successive element of the iterated stream. Below is an example, which queries all the fields of type System.Int32
in an anonymous struct then filters the results to those whose value is greater than 8.
struct {int a; int b; int c;} z = new {a=5, b=10, c= 15};
int* values = z...int::*[it > 8];
foreach(int i in values){
Console.WriteLine(i + " is greater than 8");
}
SQL-Based Operators
Cω includes a number of constructs from the SQL language as keywords. Operators for performing selection with projection, filtering, ordering, grouping, and joins are all built into Cω. The SQL operators can be applied to both in-memory objects and relational stores that can be accessed via ADO.NET. When applied to a relational database, the Cω query operators are converted to SQL queries over the underlying store. The primary advantage of using the SQL operators from the Cω language is that the query syntax and results can be checked at compile time instead of at runtime, as is the case with embedding SQL expressions in strings using traditional relational APIs.
To connect to a SQL database in Cω, it must be exposed as a managed assembly (i.e. a .NET library file), which is then referenced by the application. A relational database can be exposed to a Cω as a managed assembly either by using the sql2comega.exe
command line tool or the Add Database Schema...
dialog from within Visual Studio. Database
objects are used by Cω to represent the relational database hosted by the server. A Database
object has a public property for each table or view and a method for each table-valued function found in the database. To query a relational database, a table, view, or table-valued function must be specified as input to one or more of the SQL-based operators.
The following sample program and output shows some of the capabilities of using the SQL-based operators to query a relational database in Cω. The database used in this example is the sample Northwind database that comes with Microsoft SQL Server. The name DB
used in the example refers to a global instance of a Database object in the Northwind
namespace of the Northwind.dll
assembly generated using sql2comega.exe
.
using System;
using System.Data.SqlTypes;
using Northwind;
class Test {
static void Main() {
// The foreach statement can infer the type of the
iteration variable 'row'
// statically evaluating the result type of the select expression
foreach( row in select ContactName from DB.Customers ) {
Console.WriteLine("{0}", row.ContactName);
}
}
}
The following sample program and output shows some of the capabilities of using the SQL-based operators to query in-memory objects in Cω:
Code Sample
using Microsoft.Comega;
using System;
public class Test{
enum CDStyle {Alt, Classic, HipHop}
static struct{ string Title; string Artist; CDStyle Style; int Year;}* CDs =
new{new{ Title="Lucky Frog", Artist="Holly Holt", Style=CDStyle.Alt, Year=2001},
new{ Title="Kamikaze", Artist="Twista", Style=CDStyle.HipHop, Year=2004},
new{ Title="Stop Light Green", Artist="Robert O'Hara",
Style=CDStyle.Alt, Year=1981},
new{ Title="Noctures", Artist="Chopin", Style=CDStyle.Classic, Year=1892},
new{ Title="Mimosa!", Artist="Brian Groth", Style=CDStyle.Alt, Year=1980},
new {Title="Beg For Mercy", Artist="G-Unit", Style=CDStyle.HipHop, Year=2003}
};
public static void Main(){
struct { string Title; string Artist;}* results;
Console.WriteLine("QUERY #1: select Title, Artist from CDs where Style ==
CDStyle.HipHop");
results = select Title, Artist from CDs where Style == CDStyle.HipHop;
results.{ Console.WriteLine("Title = {0}, Artist = {1}",
it.Title, it.Artist); };
Console.WriteLine();
struct { string Title; string Artist; int Year;}* results2;
Console.WriteLine("QUERY #2: select Title, Artist, Year from CDs order by Year");
results2 = select Title, Artist, Year from CDs order by Year;
results2.{ Console.WriteLine("Title = {0}, Artist = {1}, Year = {2}", it.Title,
it.Artist, it.Year); };
}
}
Output
QUERY #1: select Title, Artist from CDs where Style == CDStyle.HipHop
Title = Kamikaze, Artist = Twista
Title = Beg For Mercy, Artist = G-Unit
QUERY #2: select Title, Artist, Year from CDs order by Year
Title = Noctures, Artist = Chopin, Year = 1892
Title = Mimosa!, Artist = Brian Groth, Year = 1980
Title = Stop Light Green, Artist = Robert O'Hara, Year = 1981
Title = Lucky Frog, Artist = Holly Holt, Year = 2001
Title = Beg For Mercy, Artist = G-Unit, Year = 2003
Title = Kamikaze, Artist = Twista, Year = 2004
A number of operations that require tedious nested loops can be processed in a straightforward manner using the declarative SQL-like operators in Cω. Provided below is a brief description of the major classes of SQL operators included in Cω.
Projection
The projection of the select expression is the list of expressions following the select
keyword. The projection is executed once for each row specified by the from
clause. The job of the projection is to shape the resulting rows of data into rows containing only the columns required. The simplest form of the select command consists of the select
keyword followed by a projection list of one or more expressions identifying columns from the source, followed by the from
keyword and then an expression identifying the source of the query. For example,
rows = select ContactName, Phone from DB.Customers;
foreach( row in rows ) {
Console.WriteLine("{0}", row.ContactName);
}
In the above example, the type designator for the results of the select query are not specified. The Cω compiler automatically infers the correct type. The actual type of an individual result row is a Cω tuple type. One can specify the result type directly using a tuple type (i.e. an anonymous struct) and the asterisk (*) to designate a stream of results. For example,
struct{SqlString ContactName;}* rows =
select ContactName from DB.Customers;
struct{SqlString ContactName; SqlString Phone;}* rows =
select ContactName, Phone from DB.Customers;
Filtering
The results of a select expression can be filtered using one of three keywords; distinct
, top
and where
. The distinct
keyword is used to restrict the resulting rows to only unique values. The top
keyword is used to restrict the total number of rows produced by the query. The keyword top
is followed by a constant numeric expression which specifies the number of rows to return. One can also create a distinct top
selection which restricts the total number of unique rows returned by the query. The where
clause is used to specify a Boolean expression for filtering the rows returned by the query source. Rows where the expression evaluates to true are retained, while the rest are discarded. The example below shows all three filter operators in action:
select distinct top 10 ContactName from DB.Customers where City == "London";
Ordering
The resulting rows from a select expression can be sorted by using the order by
clause. The order by clause is always the last clause of the select expression, if it is specified at all. The order by
clause consists of the two keywords order by
followed immediately by a comma separated list of expressions that define the values that will determine the order. The first expression in the list defines the ordering criteria with the most precedence. It is also possible to specify whether each expression should be considered in ascending or descending order. The default for all expressions is to be considered in ascending order. The example below shows the order by
clause in action.
rows = select ContactName, Phone from DB.Customers
order by ContactName desc, Phone asc;
Grouping
Values can be aggregated across multiple rows using the group by
clause and the built-in aggregate functions. The group by
clause enables one to specify how different rows are actually related so they can be grouped together. Aggregate functions can then be applied to the columns to compute values over the group. An aggregate function is a function that computes a single value from a series of inputs, such as computing the sum or average of a series of numbers. There are six aggregate functions built into Cω. They are Count, Min, Max, Avg, Sum and Stddev. To use these functions in a query, one must first import the System.Query
namespace. The following example shows how to use the group by
clause and the built-in aggregate functions.
rows = select Country, Count(Country) from DB.Customers
group by Country;
This example uses an aggregate to produce the set of all countries and the count of customers within each country. The Count()
aggregate tallies the number of items in the group.
Aggregates can be used in all clauses evaluated after the group by
clause. The projection list is evaluated after the group by
clause, even though it is specified earlier in the query. A consequence of this is that aggregate functions can not be applied in the where
clause. However, it is still possible to filter grouped rows by using the having
clause. The having
clause acts just like the where
clause, except that it is evaluated after the group by
clause. The following example shows how the having
clause is used.
rows = select Country, Count(Country) from DB.Customers
group by Country
having Count(Country) > 1;
Joins
Select queries can be used to combine results from multiple tables. A SQL joining is a Cartesian product of one or more tables where each row from one table is paired up with each row from another table. The full Cartesian product consists of all such pairings. To select multiple sources whose data should be joined to perform a query, the from
clause can actually contain a comma-separated list of source expressions, each with its own iteration alias. The following example pairs up all Customer rows with their corresponding Order rows, and produces a table listing the customer's name and the shipping date for the order.
rows = select c.ContactName, o.ShippedDate
from c in DB.Customers, o in DB.Orders
where c.CustomerID == o.CustomerID;
Cω also supports more sophisticated table-joining semantics from the SQL world including inner join
, left join
, right join
, and outer join
using the corresponding keywords. Explanations of the semantics of the various kinds of joins is available at the W3Schools tutorial on SQL JOIN. The following example shows a select expression that uses the inner join
keyword:
rows = select c.ContactName, o.ShippedDate
from c in DB.Customers
inner join o in DB.Orders
on c.CustomerID == o.CustomerID;
Data Modification
The relational data access capabilities of Cω are not limited to querying data. One can also insert new rows into a table using the insert
command, modify existing rows within a table using the update
command, or delete rows from a table using the delete
command.
The insert
command is an expression that evaluates to the number of successful inserts that have been made to the table as a result of executing the command. The following example inserts a new customer in the Customers table:
int n = insert CustomerID = "ABCDE", ContactName="Frank", CompanyName="Acme"
into DB.Customers;
The same effect can be obtained by using an anonymous struct as opposed to setting each field directly as shown below:
row = new{CustomerID = "ABCDE", ContactName="Frank", CompanyName="Acme"};
int n = insert row into DB.Customers;
The update
command is an expression that evaluates to the number of rows that were successfully modified as a result of executing the command. The following example shows how to do a global replace of all misspelled references to the city "London":
int n = update DB.Customers
set City = "London"
where Country == "UK" && City == "Lundon";
It is also possible to modify all rows in the table by omitting the where
clause. The delete
command is an expression that evaluates to the number of rows that were successfully deleted as a result of executing the command. The following example deletes all the orders for customers in London:
int n = delete o from c in DB.Customers
inner join o in DB.Orders
on c.CustomerID == o.CustomerID
where c.City == "London";
Most applications that use insert, update, and delete expressions will also use transactions to guarantee ACIDity (atomicity, consistency, isolation, and durability) of one or more changes to the database. The Cω language includes a transact
statement that promotes the notion of initiating and exiting transactions in a programming language feature. The transact
statement is a statement bounding a block of code associated with a database transaction. If the code executes successfully and attempts to exit the block, the transaction will automatically be committed. If the block is exited due to a thrown exception, the transaction will automatically be aborted. The developer can also specify handlers that execute some user-defined code once the transact block is exited using the commit
and rollback
handlers. The following example shows transactions in action:
transact(DB) {
delete from DB.Customers where CustomerID == "ALFKI";
}
commit {
Console.WriteLine("commited");
}
rollback {
Console.WriteLine("aborted");
}
Using XML Literals in Cω
In Cω one can construct object instances using XML syntax. This feature is modelled after the ability to construct elements in languages like XQuery and XSLT. Like XQuery and XSLT, the XML literals can contain embedded code for constructing values. However, since Cω is statically typed, the names of members and types must be known at compile time and thus can not be constructed dynamically.
One can control the shape of the XML using a number of constructs. One can specify that fields should be treated as attributes in the XML literal with the use of the attribute
modifier on the member declaration. Similarly, choice types and anonymous structs are treated as the children of the XML element that maps to the content class. The following example shows how to initialize a content class from an XML literal:
using Microsoft.Comega;
using System;
public class NewsItem{
attribute string title;
attribute string author;
struct { DateTime date; string body; }
public static void Main(){
NewsItem news = <NewsItem title="Hello World" author="Dare Obasanjo">
<date>{DateTime.Now}</date>
<body>I am the first post of the New Year.</body>
</NewsItem>;
Console.WriteLine(news.title + " by " + news.author + " on " + news.date);
}
}
XML literals are intended to make the process of constructing strongly typed XML much simpler. Consider the following XQuery example taken from the W3C XQuery Use cases document. It iterates over a bibliogrpahy that contains a number of books. For each book in the bibliography, it lists the title and authors, grouped inside a result
element.
for $b in $bs/book
return
<result>
{$b/title}
{$b/author}
</result>
Performing the equivalent task in Cω looks very similar.
foreach (b in bs.book){
yield return <result>
{b.title}
{b.author}
</result>
}
Conclusion
The Cω language is an interesting attempt to bridge the impedence mismatches involved in typical enterprise development efforts when crossing the boundaries of the relational, object oriented, and XML worlds. A number of the ideas in the language have taken hold in academia and within Microsoft as is evidenced by comments by Anders Hejlsberg about the direction of C# 3.0. Until then, developers can explore what it means to program in a more data-oriented programming language by downloading the Cω compiler or Visual Studio.NET plugin
Acknowledgements
I'd like to thank Don Box, Mike Champion, Denise Draper, and Erik Meijer for their ideas and feedback while writing this article.
本人马上就准备翻译,今天太晚了,很困