LINQ to SQL and IQueryable

Famous sayings and phrases

"The easiest way to understand IQueryable is to think of it as a query that will produce a sequence of results when executed." - Jon Skeet

What is the difference between LINQ to Object and LINQ to SQL?

LINQ to SQL converts query expressions into SQL statements and executes them in the database. Instead of LINQ to Object, it is a series of methods that convert query expressions directly into Enumerables and ultimately execute inside C#. The data source of LINQ to Object always implements IEnumerable<T> (so it is not as good as LINQ to IEnumerable<T>).In contrast, LINQ to SQL data sources always implement IQueryable <T> and use Queryable's extension method.

Converting query expressions to SQL statements does not guarantee success.

IQueryable

The easiest way to understand IQueryable is to think of it as a query that, when executed, generates a sequence of results.

IQueryable is another interface that inherits the IEnumerable interface.

Queryable is a static type that combines many extension methods targeted at IQueryable and IEnumerable. It makes IQueryable, like IEnumerable, powerful for querying.

The AsQueryable method converts IEnumerable <T> to IQueryable <T>.

var seq = Enumerable.Range(0, 9).ToList();
IEnumerable<int> seq2 = seq.Where(o => o > 5);
IQueryable<int> seq3 = seq.Where(o => o > 4).AsQueryable();

Simulate interface to implement a simple LINQ to SQL

The following is an attempt to implement a very simple query provider (LINQ to xxx) that can convert simple where lambda expressions to SQL with very limited functionality. In LINQ to SQL lambda expressions are first converted to an expression tree and then to SQL statements.

We are trying to implement a query provider that can translate where this lambda expression into an SQL statement.

Dead work

First, create a database locally, then a simple table. Then, insert some test data. The entities used for testing are:

public class Staff
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Sex { get; set; }
}

Since the VS version is anti-heavenly 2010 and there is no EF, I used the more original method, which is to create a local database in mdf format. You can use EF or other methods.

public class DbHelper : IDisposable
    {
        private SqlConnection _conn;

        public bool Connect()
        {
            _conn = new SqlConnection
            {
                ConnectionString = "Data Source=.\\SQLEXPRESS;" +
                                   "AttachDbFilename=Your DB Path" +
                                   "Integrated Security=True;Connect Timeout=30;User Instance=True"
            };
            _conn.Open();
            return true;
        }

        public void ExecuteSql(string sql)
        {
            SqlCommand cmd = new SqlCommand(sql, _conn);
            cmd.ExecuteNonQuery();
        }

        public List<Staff> GetEmployees(string sql)
        {
            List<Staff> employees = new List<Staff>();
            SqlCommand cmd = new SqlCommand(sql, _conn);
            SqlDataReader sdr = cmd.ExecuteReader();
            while (sdr.Read())
            {
                employees.Add(new Staff{
                    Id = sdr.GetInt32(0),
                    Name = sdr.GetString(1),
                    Sex = sdr.GetString(2)
                });
            }
            return employees;
        }

        public void Dispose()
        {
            _conn.Close();
            _conn = null;
        }
    }

This very simple DbHelper has the ability to connect to a database, simply execute SQL statements (no return value is required for DDL or delete statements), and return the functionality of several entities (for select statements) by executing Sql statements.

public static List<Staff> Employees;

static void Main(string[] args)
{
    using (DbHelper db = new DbHelper())
    {
        db.Connect();
        //db.ExecuteSql("CREATE TABLE Staff ( Id int, Name nvarchar(10), Sex nvarchar(1))");
        db.ExecuteSql("DELETE FROM Staff");
        db.ExecuteSql("INSERT INTO Staff VALUES (1, 'Frank','M')");
        db.ExecuteSql("INSERT INTO Staff VALUES (2, 'Mary','F')");
        Employees = db.GetEmployees("SELECT * FROM Staff");             
    }

    Console.ReadKey();
}

In the main function we perform table building (only the first time)The goal is to parse the where expression, convert it to SQL, then call the ExecuteSql method to return the data, and compare it with calling where directly.

Implement IQueryable<T>

First, we build our own category FrankQueryable and inherit IQueryable<T>. Because IQueryable<T> inherits IEnumerable<T>, we also implement the GetEnumerator method. The GetEnumerator method is called only when the expression needs to be evaluated (for example, pure Select does not). In addition, IQueryable<T> has three properties:

  1. Expression: This is a very understandable expression to process
  2. Type
  3. IQueryProvider: Your own IQueryProvider. In the constructor, you need to pass in your own IQueryProvider to implement your own logic.
public class FrankQueryable<T> : IQueryable<T>
    {
        public IEnumerator<T> GetEnumerator()
        {
            throw new NotImplementedException();
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }

        public Expression Expression { get; private set; }
        public Type ElementType { get; private set; }
        public IQueryProvider Provider { get; private set; }

        public FrankQueryable()
        {
            
        }
    }

We need to implement the constructor and the GetEnumerator method.

Implement IQueryProvider

Building your own query provider requires inheriting IQueryable<T>. Query providers will do the following:

  1. Call CreateQuery to create a query, but do not compute. Calculate only when needed.
  2. If you need to perform a calculation of the expression (such as a call to ToList), call GetEnumerator at this time to trigger the execution of Execute to evaluate the expression. We need to write our logic in the Execute method. We need to call it in GetEnumerator.

We want to write a simple query provider ourselves, so we'll write an IQueryProvider and pass it in in the constructor. We'll create a new type again, inherit the IQueryProvider, and then we need to implement four more methods. Two of the non-generic versions can be ignored for now.

public class FrankQueryProvider : IQueryProvider
{
    public IQueryable CreateQuery(Expression expression)
    {
        throw new NotImplementedException();
    }

    public IQueryable<TElement> CreateQuery<TElement>(Expression expression)
    {
        throw new NotImplementedException();
    }

    public object Execute(Expression expression)
    {
        throw new NotImplementedException();
    }

    public TResult Execute<TResult>(Expression expression)
    {
        throw new NotImplementedException();
    }
}

In this case, the constructor of the FrankQueryable type can assign attributes appropriate values, and it becomes like this:

    public FrankQueryable(Expression expression, FrankQueryProvider provider)
    {
        Expression = expression;
        ElementType = typeof(T);
        Provider = provider;
    }

The implementation of the CreateQuery method is simple.

public IQueryable<TElement> CreateQuery<TElement>(Expression expression)
{
    Console.WriteLine("Going to CreateQuery");
    return new FrankQueryable<TElement>(this, expression);
}

Then we can implement FrankQueryable's GetEnumerator method, which is designed to call the Execute method in its companion provider so that our own logic can be executed (we've passed in our provider in the constructor):

public IEnumerator<T> GetEnumerator()
{
    Console.WriteLine("Begin to iterate.");
    var result = Provider.Execute<List<T>>(Expression);
    foreach (var item in result)
    {
        Console.WriteLine(item);
        yield return item;
    }
}

For convenience, we add a parameterless constructor that calls the parameterized constructor first, then executes itself, setting the expression to a default value:

public FrankQueryable() : this(new FrankQueryProvider(), null)
{
    //this is T
    Expression = Expression.Constant(this);
}

Finally, the Execute method of FrankQueryProvider, whose implementation requires us to parse the expression manually, allows us to create an ExpressionTreeToSql class and call it in the Execute method.

public TResult Execute<TResult>(Expression expression)
{
    string sql = ""; 
  
    //Get sql in some way
    //ExpressionTreeToSql
    Console.WriteLine(sql);

    using (DbHelper db = new DbHelper())
    {
        db.Connect();
        dynamic ret = db.GetEmployees(sql);
        return (TResult) ret;
    }
}

Assuming that we get the correct SQL statement, the next thing to do is to connect to the database and get the result. This is already done, and of course the last and most critical step is to parse the expression to get the SQL statement.

Note that CreateQuery generates new expression objects each time, regardless of whether the same expression already exists, which constitutes the motivation for caching expressions.

Test IQueryable's Running Flow

Assuming we write the SQL statement to death before parsing, we will get the correct output:

public TResult Execute<TResult>(Expression expression)
{
    string sql = "select * from staff where Name = 'Frank'";           
    Console.WriteLine(sql);

    using (DbHelper db = new DbHelper())
    {
        db.Connect();
        dynamic ret = db.GetEmployees(sql);
        return (TResult) ret;
    }
}

Main program:

static void Main(string[] args)
{
    using (DbHelper db = new DbHelper())
    {
        db.Connect();
        //db.ExecuteSql("CREATE TABLE Staff ( Id int, Name nvarchar(10), Sex nvarchar(1))");
        db.ExecuteSql("DELETE FROM Staff");
        db.ExecuteSql("INSERT INTO Staff VALUES (1, 'Frank','M')");
        db.ExecuteSql("INSERT INTO Staff VALUES (2, 'Mary','F')");
        Employees = db.GetEmployees("SELECT * FROM Staff");             
    }

    var aa = new FrankQueryable<Staff>();

    //Try to translate lambda expression (where)
    var bb = aa.Where(t => t.Name == "Frank");
        Console.WriteLine("Going to compute the expression.");
    var cc = bb.ToList();

    Console.WriteLine("cc has {0} members.", cc.Count);
    Console.WriteLine("Id is {0}, and sex is {1}", cc[0].Id, cc[0].Sex);
    Console.ReadKey();
}

At this point we find that the program will behave according to our query provider, not the default IQueryable. (The default provider will not print anything.) Our printout is:

Going to CreateQuery
Going to compute the expression.
Begin to iterate.
select * from staff where Name = 'Frank'
FrankORM.Staff
cc has 1 members.
Id is 1, and sex is M

When the program runs to

var bb = aa.Where(t => t.Name == "Frank");

The generic CreateQuery method is invoked here first (since the aa object is of type FrankQueryable <T> we enter our own query provider, and Where is an extension of Queryable so FrankQueryable automatically owns it)And then outputs Going to CreateQuery. Then, since the expression is not evaluated at this time, it does not immediately enter the Execute method. Then the main program continues to run, printing Going to compute the expression.

Later, on the next line of the main program, since we call the ToList method and have to evaluate the expression at this point, the program starts iterating, calls the GetEnumerator method, prints the Begin to iterate, and then calls the Execute method, still using our own query provider's logic, executes SQL, and outputs the correct values.

Through this test, we learned about the entire IQueryable workflow. Thanks to Queryable's extensive methods, we can easily get powerful query capabilities. Now, of course, it's time to parse out the SQL and fill in the last puzzle of the entire process.

We place the parsing method in the ExpressionTreeToSql class and name it VisitExpression. This class is essential for writing an ORM yourself and is sometimes referred to as the ExpressionVisitor class as well.

Parsing Where lambda expressions: Step 1

Our input is a lambda expression, which is long like this:

var bb = aa.Where(t => t.Name == "Frank");

Our goal is to:

Select * from Staff where Name = 'Frank'

Staff, Name and Frank are what we need from the outside world, and the rest are fixed syntax matches. So we need a way to parse the expression, which takes an expression as input and outputs a string. We can get the values Name and Frank through the expression. And we also need to know the type name of the target entity class, Staff, so ITheir parsing method also needs to accept a generic T.

Also, since our parsing method is likely to be recursive (because we are parsing the expression tree), our output needs to be modified with ref. So the signature of this parsing method is:

public static void VisitExpression<T>(T enumerable, Expression expression, ref string sql)

It is easy to get Select * from Staff:

public static string GenerateSelectHeader<T>(T type)
{
    var typeName = type.GetType().Name.Replace("\"", "");
    return string.Format("select * from {0} ", typeName);
}

The first thing we need to do is add:

    public static void VisitExpression<T>(T enumerable, Expression expression, ref string sql)
    {
        if (sql == String.Empty)
            sql = GenerateSelectHeader(enumerable);
   }

Of course, the default is to select all columns of an entity. If you select a part, you also need to parse the select expression.

Back to the Execute method, the puzzle code now comes to the surface, and it's just:

ExpressionTreeToSql.VisitExpression(new Staff(), expression, ref sql);

Parsing Where lambda expressions: Step 2

The second step in parsing is where the expression is. First, we need to know its NodeType (that is, Type, which is the Type of value the expression ultimately evaluates to). By setting the breakpoint, we see that the Type is Call Type, so we need to convert the expression to MethodCallExpression(Otherwise, we won't get any details, which is the same for all types of expressions).

Now we have the method name where.

switch (expression.NodeType)
{
    case ExpressionType.Call:
        MethodCallExpression method = expression as MethodCallExpression;
        if (method != null)
        {
            sql += method.Method.Name;
        }
        break;
    default:
        throw new NotSupportedException(string.Format("This kind of expression is not supported, {0}", expression.NodeType));
}

Now we can run the program, of course, the result sql is wrong, our parsing is not over yet. By checking the variables of the expression by setting a breakpoint, we find that Argument[1] is the expression itself, so we continue to parse the expression recursively:

We can tell what our next expression is based on each exception thrown. By exceptions, the next expression is a Quote-type expression. Its corresponding expression type is Unary. The only useful thing in a unary expression is Operand, so we continue to parse:

case ExpressionType.Quote:
    UnaryExpression expUnary = expression as UnaryExpression;
    if (expUnary != null)
    {
        VisitExpression(enumerable, expUnary.Operand, ref sql);
    }
    break;

The next expression: t=>t.Name=="Frank", is obviously a lambda expression. What's useful about it is its Body (t.Name="Frank"):

   case ExpressionType.Lambda:
        LambdaExpression expLambda = expression as LambdaExpression;
        if (expLambda != null)
        {
            VisitExpression(enumerable, expLambda.Body, ref sql);
        }
        break;

Finally, we have reached the end point. This time it's an Equal type expression, with t.Name on the left and Frank on the right, which are the values we need:

   case ExpressionType.Equal:
        BinaryExpression expBinary = expression as BinaryExpression;
        if (expBinary != null)
        {
            var left = expBinary.Left;
            var right = expBinary.Right;
            sql += " " + left.ToString().Split('.')[1] + " = '" + right.ToString().Replace("\"", "") + "'";
        }
        break;

Put these case s together and a simple LINQ to SQL interpreter is ready. At this point, we remove the dead SQL and the program should get the correct output:

public TResult Execute<TResult>(Expression expression)
{
            string sql = "";           

            ExpressionTreeToSql.VisitExpression(new Staff(), expression, ref sql);

            Console.WriteLine(sql);

            using (DbHelper db = new DbHelper())
            {
                db.Connect();
                dynamic ret = db.GetEmployees(sql);
                return (TResult) ret;
            }
}

As you can see, where lambda expressions are converted into a complex expression tree, and by parsing the expression tree manually, we can embed our own logic to achieve functions that LINQ to SQL cannot do.

Of course, an example is the most basic case, and if the expression tree becomes complex, the sql generated is probably wrong.

Simple extensions

Let's look at the following scenario. Let's add a where expression:

using (DbHelper db = new DbHelper())
{
    db.Connect();
    //db.ExecuteSql("CREATE TABLE Staff ( Id int, Name nvarchar(10), Sex nvarchar(1))");
    db.ExecuteSql("DELETE FROM Staff");
    db.ExecuteSql("INSERT INTO Staff VALUES (1, 'Frank','M')");
    db.ExecuteSql("INSERT INTO Staff VALUES (2, 'Mary','F')");
    db.ExecuteSql("INSERT INTO Staff VALUES (1, 'Roy','M')");
    Employees = db.GetEmployees("SELECT * FROM Staff");             
}

var test = Employees.Where(t => t.Sex == "M").Where(t => t.Name == "Frank");

var aa = new FrankQueryable<Staff>();

//Try to translate lambda expression (where)
var bb = aa.Where(t => t.Sex == "M")
    .Where(t => t.Name == "Frank");

At this point, we can get the correct result with IQueryable <T> (test has only one output), but with our own query provider, we get the wrong SQL (the first Sex = M is missing). We found that the problem is where we parsed the MethodCallExpression.

When there is only one where expression, the expression tree looks like this:

So when we parsed MethodCallExpression, we skipped argument[0] (which is actually a constant expression), and now it seems impossible for us to skip it because in the expression tree now, argument[0] is: {value (FrankORM.FrankQueryable1[FrankORM.Staff]). Where (t => (t.Sex == "M")}`

It contains useful information, so we can't skip it. We need to parse all argument s and connect using and:

   case ExpressionType.Call:
        MethodCallExpression exp = expression as MethodCallExpression;
        if (exp != null)
        {
            if(!sql.Contains(exp.Method.Name))
                sql += exp.Method.Name;
            foreach (var arg in exp.Arguments)
            {
                VisitExpression(enumerable, arg, ref sql);
            }
            sql += " and ";
        }
         break;

Run the program again at this time, and an exception occurs. The system prompts us that we have no parsing of the constant expression. For the constant expression, we don't need to do anything.

case ExpressionType.Constant:
    break;

Using the code above and parsing it again, we get a SQL that looks more correct:

select * from Staff Where Sex = 'M' and  Name = 'Frank' Sex = 'M' and  Name = 'Frank' and

The end and occurs more than once because we add and at the end of each parse. Simply remove and, the program will output the correct result.

This time the expression tree looks like this:

Of course, the code quality of this extension is already very poor, and there are a lot of hashes. But let me just take this example here to explain how to expand and add more functionality to expression tree parsing so that it can handle more types of expressions.

What are the similarities and differences between IQueryable and IEnumerable?

First, IQueryable <T> parses a tree, while IEnumerable <T> uses delegates, which are explained above (in the most basic case) by manual implementation, whereas you can implement the latter entirely by generic delegates.

IQueryable<T>inherits from IEnumerable<T>, so they are not different for data traversal. Both have the effect of delaying execution. However, the advantage of IQueryable is that it has an expression tree, all filtering, sorting operations for IQueryable<T> are cached in the expression tree first, and the expression tree is only cached by IQueryProvider when traversal really occurs.Perform the Get Data operation.

With IEnumerable <T>, all filtering and sorting operations for IEnumerable <T> occur in memory, that is, data has been retrieved from the database into memory for filtering and sorting operations in memory.

When the data source is not local, because the IEnumerable <T>query must be executed locally, we must load all the data locally before executing the query. Most of the time, there is a lot of data loaded that we don't need to be invalid, but we have to transfer more data and do more useless work. IQueryable <T>is always available onlyThe amount of data you need greatly reduces the amount of data you transfer.

IQueryable Summary

  1. The easiest way to understand IQueryable is to think of it as a query that, when executed, generates a sequence of results.
  2. Inheriting IQueryable<T> means you get powerful query capabilities because you automatically get a bunch of extension methods for Queryable.
  3. When an IQueryable <T>query is parsed, the QueryProvider for IQueryable <T>is accessed first, then the CreateQuery <T>method is accessed, and the input query expression is passed in to construct the query.
  4. A query executes by starting the iteration of the IQueryable, which calls the Execute method and passes an expression tree.
  5. Not all expression trees can be translated into SQL. Not for example, ToUpper.
  6. Writing an ORM yourself means writing a QueryProvider and customizing the Execute method to parse the expression tree. So you must have a class that parses the expression tree, which is often called the ExpressionVisitor.
  7. An expression tree is usually parsed recursively because any node of the expression tree, including leaf nodes, is an expression tree.
  8. CreateQuery generates new expression objects each time, regardless of whether the same expression already exists, which constitutes the motivation for caching expressions.

COMPARISON BETWEEN ORM AND CLASSICAL DATATATABLE

Benefits:

  1. Providing object-oriented and strong types, programmers who are familiar with the OO language will get started quickly.
  2. Hiding data access details makes it possible to kill the entire DAL. In a three-tier architecture, BL calls DAL to get data, but now BL can get data directly through lambda expressions and so on, without requiring DAL.
  3. Releasing programmers from splicing SQL statements, especially inserts, is both error prone and difficult to discover. Inserted objects are now strongly typed, just like inserting a List.
  4. Operate different databases in the same syntax (e.g. oracle, SQL server, etc.)
  5. Compared with the classic DataReader, DataReader errors occur when the data type of a column in a data table changes (the traditional way is to use the DataReader.Read method to read data one row at a time, then obtain data for each column by GetString, GetInt32, and so on).The ORM will make errors at compile time, and you only need to change the type of object property to have no problems.

Disadvantages:

  1. Some complex SQL or SQL built-in methods cannot be translated through ORM.
  2. Automatically generated SQL statements sometimes perform poorly, depending on the mechanism behind them. Programmers who are unfamiliar with ORM may result in poorer program performance.
  3. It is difficult to replace the Store procedure.

The core of ORM is DbContext. It can be viewed as a copy of a database, and we can implement CRUD on the database only by accessing it.

Keywords: SQL Interview linq

Added by larryg on Fri, 01 Oct 2021 20:26:43 +0300