How PLINQ changes your SQL queries

Posted: (EET/GMT+2)

 

If you have used LINQ queries previously, you are probably happy to use the forthcoming Parallel LINQ or PLINQ that is part of Visual Studio 2010. It's great to have an "automatically" parallel version of your queries, so that you can maximize your use of the resources your multi-core desktop processor has. However, PLINQ isn't an automatic salvation to all your performance problems, simply because using PLINQ can transfer the work from the database server to your client.

Of course, this all depends on what you are after. But let's have a concrete example. Assume you are using SQL Server as your database, to which you then execute the following LINQ query:

NorthwindDataClassesDataContext cntx =
  new NorthwindDataClassesDataContext();
var cust = from c in cntx.Customers
           where c.Country == "USA"
           select c;

As you can guess, this query fetches all the customer records that have the country value of "USA". Now, behind the scenes, LINQ to SQL executes the following SQL statement to get the records (easy to see using the SQL Server tracing utility):

exec sp_executesql N'SELECT [t0].[CustomerID], [t0].[CompanyName],
[t0].[ContactName], [t0].[ContactTitle], [t0].[Address],
[t0].[City], [t0].[Region], [t0].[PostalCode], [t0].[Country],
[t0].[Phone], [t0].[Fax]
FROM [dbo].[Customers] AS [t0]
WHERE [t0].[Country] = @p0',
N'@p0 nvarchar(3)',@p0=N'USA'

As you can see, the filtering for the customers occurs on the database server.

Next, let's change the query to use PLINQ by simply adding the ".AsParallel()" method call after the customer table object in Visual Studio 2010:

Using PLINQ in Visual Studio 2010

This is the new C# code for the PLINQ query:

var cust = from c in cntx.Customers.AsParallel()
           where c.Country == "USA"
           select c;

But how would you guess PLINQ would execute the SQL query if the aim is to enable the client computer to use multiple threads to process the results? Well, to solution is to do the country filtering on the client. In deed, this is reflected in the SQL statement, which now becomes:

SELECT [t0].[CustomerID], [t0].[CompanyName], [t0].[ContactName],
[t0].[ContactTitle], [t0].[Address], [t0].[City], [t0].[Region],
[t0].[PostalCode], [t0].[Country], [t0].[Phone], [t0].[Fax]
FROM [dbo].[Customers] AS [t0]

As you can see, the SQL query changes quite a bit. The results are naturally the same, but the processing (filtering) changes from the server to the client. If you have 100 records in the customers table, you won't probably see much difference in processing time, network utilization or memory usage. But if you had 1,000,000 records, things would change.

This is not necessarily a bad thing, but you have to understand the consequences of using PLINQ. Using it should be decision you make after judging the effects.