75. ORM Supplement of django

This navigation:

 

QuerySet

1. Sliceable

Use Python's slicing syntax to limit the number of query set records. It is equivalent to the LIMIT and OFFSET clauses of SQL.

>>> Entry.objects.all()[:5]      # (LIMIT 5)
>>> Entry.objects.all()[5:10]    # (OFFSET 5 LIMIT 5)

Negative indexes are not supported (e.g. Entry.objects.all()[-1]). Typically, a slice of the query set returns a new query set -- it does not execute the query.

2. Iterative

articleList=models.Article.objects.all()

for article in articleList:
    print(article.title)

3. Inert Query

Query sets are lazy -- creating query sets does not bring any database access. You can keep the filter for a whole day until the query set needs to be evaluated before Django actually runs the query.

queryResult=models.Article.objects.all() # not hits database
 
print(queryResult) # hits database
 
for article in queryResult:
    print(article.title)    # hits database 

Generally speaking, only the results of the "request" query set are retrieved from the database. When you do need results, the query set is evaluated by accessing the database. For the exact time at which evaluation takes place, see When to Compute Query Sets.

4. Caching mechanism

Each query set contains a cache to minimize access to the database. Understanding how it works will allow you to write the most efficient code.

In a newly created query set, the cache is empty. Evaluating the query set for the first time -- concurrent database queries -- Django stores the results of the query into the cache of the query set and returns the results of the explicit request (for example, if the query set is being iterated, the next result is returned). Next, the evaluation of the query set will reuse the cached results.  

Keep this caching behavior in mind, because if you use the query set incorrectly, it will crater you. For example, the following statement creates two query sets, evaluates them, and then throws them away:

print([a.title for a in models.Article.objects.all()])
print([a.create_time for a in models.Article.objects.all()])

This means that the same database query will be executed twice, obviously doubling your database load. At the same time, it is also possible that the two result lists do not contain the same database record, because there may be Article s added or deleted during the two requests. To avoid this problem, just save the query set and reuse it:

queryResult=models.Article.objects.all()
print([a.title for a in queryResult])
print([a.create_time for a in queryResult])

5. When the query set will not be cached

Query sets do not always cache their results. When only part of the query set is evaluated, the cache is checked. If this part is not in the cache, then the records returned by the next query will not be cached. So, this means that using slices or indexes to restrict the query set will not fill the cache.

For example, repeated retrieval of a specific index in a query set object will query the database each time:

>>> queryset = Entry.objects.all()
>>> print queryset[5] # Queries the database
>>> print queryset[5] # Queries the database again

However, if all query sets have been evaluated, the cache is checked:

>>> queryset = Entry.objects.all()
>>> [entry for entry in queryset] # Queries the database
>>> print queryset[5] # Uses cache
>>> print queryset[5] # Uses cache

Here are some other examples that will allow all query sets to be evaluated and filled into the cache:

>>> [entry for entry in queryset]
>>> bool(queryset)
>>> entry in queryset
>>> list(queryset)

Note: Simply printing query sets does not fill the cache.

queryResult=models.Article.objects.all()
print(queryResult) #  hits database
print(queryResult) #  hits database

6. exists() and iterator() methods

1)exists:

Simply judging by using the if statement will also fully execute the entire queryset and put the data into the cache, although you don't need it! To avoid this, the exists() method can be used to check whether there is data:

 if queryResult.exists():
    #SELECT (1) AS "a" FROM "blog_article" LIMIT 1; args=()
        print("exists...")

2)iterator:

When queryset is very large, cache becomes a problem.

When processing thousands of records, it is wasteful to load them into memory at a time. Even worse, a huge queryset can lock up the system process and put your program on the verge of collapse. To avoid generating queryset cache while traversing data, you can use iterator() method to get data and discard it after processing data.

objs = Book.objects.all().iterator()
# iterator()Only a small amount of data can be retrieved from the database at a time, which saves memory.
for obj in objs:
    print(obj.title)
#BUT,Traversing again without printing,Because the iterator has traversed the last time(next)It's the last time.,No more traversing
for obj in objs:
    print(obj.title)

Of course, using iterator() to prevent cache generation means that queries are repeated when traversing the same queryset. So be careful when using iterator() to ensure that your code does not repeat queries when operating a large queryset.

Queryset's cache is used to reduce the query of the program to the database, which is guaranteed to query the database only when needed under normal usage. The existing () and iterator() methods can be used to optimize program memory usage. However, since they do not generate queryset cache, additional database queries may occur.

 

II. Mediation Model

When dealing with simple many-to-many relationships such as pizza and topping, use the standard ManyToManyField. However, sometimes you may need to correlate data to the relationship between the two models.

For example, there is an application that records the music groups of musicians. We can use a ManyToManyField to represent many-to-many relationships between groups and members. However, sometimes you may want to know more details about membership, such as when members joined the group.

In these cases, Django allows you to specify a mediation model to define many-to-many relationships. You can put other fields in the mediation model. The ManyToManyField field field of the source model points to the mediation model using the through parameter. For the example of the music group above, the code is as follows:

from django.db import models
 
class Person(models.Model):
    name = models.CharField(max_length=128)
 
    def __str__(self):              # __unicode__ on Python 2
        return self.name
 
class Group(models.Model):
    name = models.CharField(max_length=128)
    members = models.ManyToManyField(Person, through='Membership')
 
    def __str__(self):              # __unicode__ on Python 2
        return self.name
 
class Membership(models.Model):
    person = models.ForeignKey(Person)
    group = models.ForeignKey(Group)
    date_joined = models.DateField()
    invite_reason = models.CharField(max_length=64)
View Code

Now that you have set up ManyToManyField to use the mediation model (Membership in this case), you need to start creating many-to-many relationships. What you need to do is create an instance of the mediation model:

>>> ringo = Person.objects.create(name="Ringo Starr")
>>> paul = Person.objects.create(name="Paul McCartney")
>>> beatles = Group.objects.create(name="The Beatles")
>>> m1 = Membership(person=ringo, group=beatles,
...     date_joined=date(1962, 8, 16),
...     invite_reason="Needed a new drummer.")
>>> m1.save()
>>> beatles.members.all()
[<Person: Ringo Starr>]
>>> ringo.group_set.all()
[<Group: The Beatles>]
>>> m2 = Membership.objects.create(person=paul, group=beatles,
...     date_joined=date(1960, 8, 1),
...     invite_reason="Wanted to form a band.")
>>> beatles.members.all()
[<Person: Ringo Starr>, <Person: Paul McCartney>]
View Code

Unlike ordinary many-to-many fields, you can't create relationships using add, create, and assignment statements (for example, beatles.members = [...]):

# THIS WILL NOT WORK
>>> beatles.members.add(john)
# NEITHER WILL THIS
>>> beatles.members.create(name="George Harrison")
# AND NEITHER WILL THIS
>>> beatles.members = [john, paul, ringo, george]

Why not? This is because you can't just create the relationship between Person and Group, but you also specify all the information you need in the Membership model; simple add, create, and assignment statements can't do that. So they cannot be used in many-to-many relationships using mediation models. At this point, the only way is to create an instance of the mediation model.

The remove() method was banned for the same reason. But the clear() method is available. It can empty all the many-to-many relationships of an instance:

>>> # Beatles have broken up
>>> beatles.members.clear()
>>> # Note that this deletes the intermediate model instances
>>> Membership.objects.all()
[]

 

3. Query optimization

1. Table data

class UserInfo(AbstractUser):
    """
    //User information
    """
    nid = models.BigAutoField(primary_key=True)
    nickname = models.CharField(verbose_name='Nickname?', max_length=32)
    telephone = models.CharField(max_length=11, blank=True, null=True, unique=True, verbose_name='Phone number')
    avatar = models.FileField(verbose_name='Head portrait',upload_to = 'avatar/',default="/avatar/default.png")
    create_time = models.DateTimeField(verbose_name='Creation time', auto_now_add=True)
 
    fans = models.ManyToManyField(verbose_name='Fans',
                                  to='UserInfo',
                                  through='UserFans',
                                  related_name='f',
                                  through_fields=('user', 'follower'))
 
    def __str__(self):
        return self.username
 
class UserFans(models.Model):
    """
    //Mutual Powder Relation Table
    """
    nid = models.AutoField(primary_key=True)
    user = models.ForeignKey(verbose_name='Blogger', to='UserInfo', to_field='nid', related_name='users')
    follower = models.ForeignKey(verbose_name='Fans', to='UserInfo', to_field='nid', related_name='followers')
 
class Blog(models.Model):
 
    """
    //Blog information
    """
    nid = models.BigAutoField(primary_key=True)
    title = models.CharField(verbose_name='Personal Blog Title', max_length=64)
    site = models.CharField(verbose_name='Personal Blog Suffix', max_length=32, unique=True)
    theme = models.CharField(verbose_name='Blog theme', max_length=32)
    user = models.OneToOneField(to='UserInfo', to_field='nid')
    def __str__(self):
        return self.title
 
class Category(models.Model):
    """
    //Blogger's Personal Articles Category
    """
    nid = models.AutoField(primary_key=True)
    title = models.CharField(verbose_name='Classified headings', max_length=32)
 
    blog = models.ForeignKey(verbose_name='Blog', to='Blog', to_field='nid')
 
class Article(models.Model):
 
    nid = models.BigAutoField(primary_key=True)
    title = models.CharField(max_length=50, verbose_name='Article title')
    desc = models.CharField(max_length=255, verbose_name='Article description')
    read_count = models.IntegerField(default=0)
    comment_count= models.IntegerField(default=0)
    up_count = models.IntegerField(default=0)
    down_count = models.IntegerField(default=0)
    category = models.ForeignKey(verbose_name='Article type', to='Category', to_field='nid', null=True)
    create_time = models.DateField(verbose_name='Creation time')
    blog = models.ForeignKey(verbose_name='Blog', to='Blog', to_field='nid')
    tags = models.ManyToManyField(
        to="Tag",
        through='Article2Tag',
        through_fields=('article', 'tag'),
)
 
 
class ArticleDetail(models.Model):
    """
    //Detailed Table of Articles
    """
    nid = models.AutoField(primary_key=True)
    content = models.TextField(verbose_name='Article content', )
 
    article = models.OneToOneField(verbose_name='Subordinate articles', to='Article', to_field='nid')
 
 
class Comment(models.Model):
    """
    //Comment table
    """
    nid = models.BigAutoField(primary_key=True)
    article = models.ForeignKey(verbose_name='Commentary', to='Article', to_field='nid')
    content = models.CharField(verbose_name='Comment content', max_length=255)
    create_time = models.DateTimeField(verbose_name='Creation time', auto_now_add=True)
 
    parent_comment = models.ForeignKey('self', blank=True, null=True, verbose_name='Parent comment')
    user = models.ForeignKey(verbose_name='Commentator', to='UserInfo', to_field='nid')
 
    up_count = models.IntegerField(default=0)
 
    def __str__(self):
        return self.content
 
class ArticleUpDown(models.Model):
    """
    //Point praise table
    """
    nid = models.AutoField(primary_key=True)
    user = models.ForeignKey('UserInfo', null=True)
    article = models.ForeignKey("Article", null=True)
    models.BooleanField(verbose_name='Whether or not')
 
class CommentUp(models.Model):
    """
    //Point praise table
    """
    nid = models.AutoField(primary_key=True)
    user = models.ForeignKey('UserInfo', null=True)
    comment = models.ForeignKey("Comment", null=True)
 
 
class Tag(models.Model):
    nid = models.AutoField(primary_key=True)
    title = models.CharField(verbose_name='Label name', max_length=32)
    blog = models.ForeignKey(verbose_name='Blog', to='Blog', to_field='nid')
 
 
 
class Article2Tag(models.Model):
    nid = models.AutoField(primary_key=True)
    article = models.ForeignKey(verbose_name='Article', to="Article", to_field='nid')
    tag = models.ForeignKey(verbose_name='Label', to="Tag", to_field='nid')
View Code

2,select_related

1) Simple use

For one-to-one field and foreign key fields, you can use select_related to optimize QuerySet.

select_related returns a QuerySet that queries the data of the associated object along the foreign key relationship when it executes its query. It generates a complex query and results in performance degradation, but database queries will not be required in future use of foreign key relationships.

Simply put, after using the select_related() function for QuerySet, Django will get the corresponding object for the foreign key, so that it will not need to query the database when needed later.

The following example illustrates the difference between a normal query and a select_related() query.

Query id=2 for the category name of the article. Here is a standard query:

# Hits the database.
article=models.Article.objects.get(nid=2)
 
# Hits the database again to get the related Blog object.
print(article.category.title)
'''
 
SELECT
    "blog_article"."nid",
    "blog_article"."title",
    "blog_article"."desc",
    "blog_article"."read_count",
    "blog_article"."comment_count",
    "blog_article"."up_count",
    "blog_article"."down_count",
    "blog_article"."category_id",
    "blog_article"."create_time",
     "blog_article"."blog_id",
     "blog_article"."article_type_id"
             FROM "blog_article"
             WHERE "blog_article"."nid" = 2; args=(2,)
 
SELECT
     "blog_category"."nid",
     "blog_category"."title",
     "blog_category"."blog_id"
              FROM "blog_category"
              WHERE "blog_category"."nid" = 4; args=(4,)
 
 
'''

If we use the select_related() function:

articleList=models.Article.objects.select_related("category").all()
    
for article_obj in articleList:
    #  Doesn't hit the database, because article_obj.category
    #  has been prepopulated in the previous query.
    print(article_obj.category.title)
SELECT
     "blog_article"."nid",
     "blog_article"."title",
     "blog_article"."desc",
     "blog_article"."read_count",
     "blog_article"."comment_count",
     "blog_article"."up_count",
     "blog_article"."down_count",
     "blog_article"."category_id",
     "blog_article"."create_time",
     "blog_article"."blog_id",
     "blog_article"."article_type_id",
 
     "blog_category"."nid",
     "blog_category"."title",
     "blog_category"."blog_id"
 
FROM "blog_article"
LEFT OUTER JOIN "blog_category" ON ("blog_article"."category_id" = "blog_category"."nid");
View Code

2) Multiple Foreign Key Query

This is a foreign key query for category, another foreign key:

article=models.Article.objects.select_related("category").get(nid=1)
print(article.articledetail)

Observing the logging results, we found that we still need to query twice, so we need to change it to:

article=models.Article.objects.select_related("category","articledetail").get(nid=1)
print(article.articledetail)

Or:

article=models.Article.objects.select_related("category").select_related("articledetail").get(nid=1)  # django 1.7 Support chain operation
print(article.articledetail)
SELECT
 
    "blog_article"."nid",
    "blog_article"."title",
    ......
 
    "blog_category"."nid",
    "blog_category"."title",
    "blog_category"."blog_id",
 
    "blog_articledetail"."nid",
    "blog_articledetail"."content",
    "blog_articledetail"."article_id"
 
   FROM "blog_article"
   LEFT OUTER JOIN "blog_category" ON ("blog_article"."category_id" = "blog_category"."nid")
   LEFT OUTER JOIN "blog_articledetail" ON ("blog_article"."nid" = "blog_articledetail"."article_id")
   WHERE "blog_article"."nid" = 1; args=(1,)
View Code

3) Deep Search

# query id=1 User Name of Articles
 
    article=models.Article.objects.select_related("blog").get(nid=1)
    print(article.blog.user.username)

Two queries are still needed:

SELECT
    "blog_article"."nid",
    "blog_article"."title",
    ......
 
     "blog_blog"."nid",
     "blog_blog"."title",
 
   FROM "blog_article" INNER JOIN "blog_blog" ON ("blog_article"."blog_id" = "blog_blog"."nid")
   WHERE "blog_article"."nid" = 1;
 
 
 
 
SELECT
    "blog_userinfo"."password",
    "blog_userinfo"."last_login",
    ......
 
FROM "blog_userinfo"
WHERE "blog_userinfo"."nid" = 1;
View Code

This is because the first query did not query to the userInfo table, so modify it as follows:

article=models.Article.objects.select_related("blog__user").get(nid=1)
print(article.blog.user.username)
SELECT
 
"blog_article"."nid", "blog_article"."title",
......
 
 "blog_blog"."nid", "blog_blog"."title",
......
 
 "blog_userinfo"."password", "blog_userinfo"."last_login",
......
 
FROM "blog_article"
 
INNER JOIN "blog_blog" ON ("blog_article"."blog_id" = "blog_blog"."nid")
 
INNER JOIN "blog_userinfo" ON ("blog_blog"."user_id" = "blog_userinfo"."nid")
WHERE "blog_article"."nid" = 1;

4) summary

select_related optimizes the one-to-one and many-to-one relationships of the main pins.

select_related uses JOIN statements of SQL to optimize and improve performance by reducing the number of SQL queries.

You can specify the field name that requires select_related by using variable length parameters. The specified recursive query can also be implemented by using a double underscore "" to connect the field name.

No specified fields will not be cached, no specified depth will not be cached, and Django will do the SQL query again if it wants to access.

You can also specify the recursive depth by using the depth parameter, and Django automatically caches all fields within the specified depth. If you want to access fields outside the specified depth, Django will do the SQL query again.

It also accepts calls without parameters, and Django will recursively query all fields as deep as possible. But note the limitations of Django recursion and the waste of performance.

Django >= 1.7, the select_related of the chain call is equivalent to using variable length parameters. Django < 1.7. Chain calls invalidate the select_related in front, leaving only the last one.

3,prefetch_related()

For ManyToManyField and one-to-many fields, prefetch_related() can be used to optimize.

Preetch_related() and select_related() are designed to reduce the number of SQL queries, but they are implemented in different ways. The latter is to solve the problem in the SQL query through JOIN statement. However, for many-to-many relationships, it is not wise to use SQL statements to solve them, because the tables obtained by JOIN will be very long, which will lead to the increase of running time and memory consumption of SQL statements. If there are n objects, the result tables of (n)Mi rows will be generated if there are many pairs of fields corresponding to Mi bars for each object.

The solution to prefetch_related() is to query each table separately and then use Python to process their relationships.

# Query all tags associated with all articles
    article_obj=models.Article.objects.all()
    for i in article_obj:
 
        print(i.tags.all())  #4 Article: hits database 5

Change to prefetch_related:

# Query all tags associated with all articles
    article_obj=models.Article.objects.prefetch_related("tags").all()
    for i in article_obj:
 
        print(i.tags.all())  #4 Article: hits database 2
SELECT "blog_article"."nid",
               "blog_article"."title",
               ......
 
FROM "blog_article";
 
 
 
SELECT
  ("blog_article2tag"."article_id") AS "_prefetch_related_val_article_id",
  "blog_tag"."nid",
  "blog_tag"."title",
  "blog_tag"."blog_id"
   FROM "blog_tag"
  INNER JOIN "blog_article2tag" ON ("blog_tag"."nid" = "blog_article2tag"."tag_id")
  WHERE "blog_article2tag"."article_id" IN (1, 2, 3, 4);
View Code

 

Four, extra

extra(select=None, where=None, params=None, tables=None, order_by=None, select_params=None)

In some cases, Django's query syntax is difficult to simply express complex WHERE clauses. In this case, Django provides an extra() QuerySet modification mechanism, which can inject new clauses into the SQL clauses generated by QuerySet.

Ext can specify one or more parameters, such as select, where or tables. These parameters are not necessary, but you need to use at least one! Note that these additional methods may have portability problems for different database engines. (Because you are explicitly writing SQL statements), unless you have to, try to avoid doing so.

1. Selection of parameters

The select parameter allows you to add other field information to the SELECT clause. It should be a dictionary that stores the mapping of attribute names to SQL clauses.

queryResult=models.Article.objects.extra(select={'is_recent': "create_time > '2017-09-05'"})

Each Entry object in the result set has an additional attribute is_recent, which is a Boolean value indicating whether the create_time of the Article object is later than 2017-09-05.

Practice:

# in sqlite:
article_obj=models.Article.objects.filter(nid=1).extra(select={"standard_time":"strftime('%%Y-%%m-%%d',create_time)"}).values("standard_time","nid","title")
print(article_obj)
# <QuerySet [{'title': 'MongoDb Introductory course', 'standard_time': '2017-09-03', 'nid': 1}]>

2. where / tables of parameters

You can use where to define explicit SQL WHERE clauses - perhaps to perform non-explicit connections. You can manually add tables to the SQL FROM clause using tables.

Where and tables accept a list of strings. All where parameters are AND or any other search conditions.

For example:

queryResult=models.Article.objects.extra(where=['nid in (1,3) OR title like "py%" ','nid>2'])

Keywords: Python Database Django SQL

Added by eightFX on Sun, 19 May 2019 22:00:31 +0300