Python web crawler (Chapter 6: asynchronous programming)

1. Event loop

An event loop can be understood as an endless loop that checks which tasks are ready and runs them.
Principle:

# Asynchronous programming
# Pseudocode
task_list = [task 1, task 2, task 3, ...]

while True:
    ready_tasks, completed_tasks = check every task in task_list and return the 'ready' and 'completed' ones

    for ready_task in ready_tasks:
        run ready_task
    for completed_task in completed_tasks:
        remove completed_task from task_list

    if every task in task_list has completed, terminate the loop

Actual code:

import asyncio

loop = asyncio.get_event_loop()
loop.run_until_complete(task)  # task: the coroutine object or Task to run to completion
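
To make the principle more concrete, here is a minimal sketch of a toy event loop. It is not how asyncio is implemented: plain generators stand in for tasks, and the names countdown and run_all are made up for this illustration.

```python
from collections import deque

def countdown(name, n, log):
    # A generator stands in for a task; each `yield` hands control back to the loop
    while n > 0:
        log.append(f"{name}:{n}")
        n -= 1
        yield

def run_all(tasks):
    # Toy event loop: keep cycling through the tasks until all of them have finished
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task up to its next pause point
        except StopIteration:
            continue            # the task has completed: drop it from the list
        ready.append(task)      # not finished yet: schedule it again

log = []
run_all([countdown("a", 2, log), countdown("b", 1, log)])
print(log)  # ['a:2', 'b:1', 'a:1']
```

The two tasks take turns: each one runs until it pauses, and the loop ends once the task list is empty.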

2. Coroutines

1. Coroutine function: a function defined with async def function_name.
2. Coroutine object: the object obtained by calling a coroutine function, i.e. coroutine_function().

async def func():
	pass

result = func()

Above, async def func() is a coroutine function, and result is the coroutine object obtained by calling func().

Note: calling a coroutine function only creates a coroutine object; the code inside the function is not executed.
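
A small sketch to check this behavior. The calls list and the return value 42 are only there for the demonstration; the body runs only once the coroutine object is handed to the event loop.

```python
import asyncio
import inspect

calls = []

async def func():
    calls.append("body ran")
    return 42

coro = func()                              # creates a coroutine object only
is_coroutine = inspect.iscoroutine(coro)   # True
calls_before = list(calls)                 # still empty: the body has not run

value = asyncio.run(coro)                  # the event loop actually runs the body
print(is_coroutine, calls_before, calls, value)
```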

2. Running a coroutine function

To run the code inside a coroutine function, you must hand the coroutine object over to the event loop.

import asyncio
async def func():
	pass

result = func()

# loop = asyncio.get_event_loop()
# loop.run_until_complete(result)   # same as loop.run_until_complete(func())
asyncio.run(result)   # Python 3.7 and above: a simpler call that replaces the two lines above

import asyncio
async def func():
    print("Print")

result = func()
asyncio.run(result)
Output: Print

3. await

await is followed by an awaitable object (a coroutine object, a Future, or a Task), typically something that waits on IO.

import asyncio

async def others():
    print("start")
    await asyncio.sleep(2)
    print("end")
    return 'Return value'

async def func():
    print("Running the code inside the coroutine function")

    # When an IO operation is reached, the current coroutine (task) is suspended and resumes once the IO operation completes.
    # While the current coroutine is suspended, the event loop can run other coroutines (tasks).
    response = await others()

    print("IO The request ended with:",response)

asyncio.run(func())

await waits until the awaited object has produced its result, and only then continues with the following code.
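
A minimal sketch of this, assuming a hypothetical fetch coroutine that simulates IO with a short sleep. Each await hands back the return value, and the second call does not start until the first has finished.

```python
import asyncio

async def fetch(label, delay, log):
    # Hypothetical stand-in for an IO request
    log.append(f"{label} start")
    await asyncio.sleep(delay)
    log.append(f"{label} end")
    return label.upper()

async def main():
    log = []
    first = await fetch("a", 0.01, log)    # waits here until fetch("a") returns
    second = await fetch("b", 0.01, log)   # starts only after the first has finished
    return first, second, log

first, second, log = asyncio.run(main())
print(first, second, log)
```

Awaiting coroutines one after another like this is sequential; running them concurrently needs Task objects.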

4. Task object

Tasks are used to schedule coroutines concurrently. Creating a Task object with asyncio.create_task(coroutine object) adds the coroutine to the event loop, where it waits to be scheduled.
The asyncio.create_task() function was added in Python 3.7; before Python 3.7 you can use the lower-level asyncio.ensure_future() function instead.

Example 1:

import asyncio
async def func():
    print(1)
    await asyncio.sleep(2)
    print(2)
    return "Return value"

async def main():

    print("main start")

    # Create a Task object; this schedules func() to run on the event loop
    task1 = asyncio.create_task(func())

    # Create a second Task object, which schedules another run of func()
    task2 = asyncio.create_task(func())

    print("main end")

    # When a coroutine reaches an IO operation, the event loop automatically switches to other tasks
    # Each await below waits for that task to finish and returns its result

    ret1 = await task1
    ret2 = await task2
    print(ret1,ret2)

asyncio.run(main())

Output:
main start
main end
1
1
2
2
Return value Return value

# main() runs first and prints "main start" and "main end"; creating the tasks only schedules them.
# When main() reaches await task1, control goes back to the event loop, which runs task1: it prints 1 and is suspended at await asyncio.sleep(2).
# The loop then switches to task2, which also prints 1 and is suspended.
# After about 2 seconds task1 resumes and prints 2, then task2 prints 2, and finally main() prints both return values.
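
The switching order can be checked with a small sketch. The worker coroutine, the delays and the log list are assumptions made for this illustration; the task with the shorter sleep finishes first even though it was created second.

```python
import asyncio

async def worker(label, delay, log):
    log.append(f"{label} start")
    await asyncio.sleep(delay)      # suspended here; the loop runs other tasks
    log.append(f"{label} end")
    return label

async def main():
    log = []
    slow = asyncio.create_task(worker("slow", 0.05, log))
    fast = asyncio.create_task(worker("fast", 0.01, log))
    log.append("main waiting")      # the tasks have not started running yet
    results = [await slow, await fast]
    return log, results

log, results = asyncio.run(main())
print(log)
print(results)
```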

Example 2:

import asyncio

async def func():
    print(1)
    await asyncio.sleep(2)
    print(2)
    return "Return value"

async def main():

    print("main start")

    task_list = [
        asyncio.create_task(func(),name='n1'),
        asyncio.create_task(func(),name='n2')
    ]

    print("main end")

    # await can only be followed by a coroutine object, a Task or a Future, so "await task_list" is not allowed.
    # With timeout=None, asyncio.wait() waits until every task has finished.
    # done is the set of finished tasks; pending is the set of tasks still unfinished (e.g. after a timeout).
    done, pending = await asyncio.wait(task_list, timeout=None)
    print(done)
asyncio.run(main())
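
Since done is a set of Task objects, the return values have to be read from each task. A minimal sketch, assuming Python 3.8+ (for the name argument and get_name()) and a made-up func that doubles its argument:

```python
import asyncio

async def func(value):
    await asyncio.sleep(0.01)
    return value * 2

async def main():
    tasks = [asyncio.create_task(func(i), name=f"n{i}") for i in (1, 2)]
    done, pending = await asyncio.wait(tasks)
    # done is an unordered set, so key the results by task name
    return {t.get_name(): t.result() for t in done}, len(pending)

results, n_pending = asyncio.run(main())
print(results, n_pending)
```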

Example 3:

import asyncio

async def func():
    print(1)
    await asyncio.sleep(2)
    print(2)
    return "Return value"

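# No event loop is running at this point, so asyncio.create_task() cannot be used here;
# the list holds plain coroutine objects instead.
task_list = [
    func(),
    func()
]

# Note: passing coroutine objects directly to asyncio.wait() was deprecated in Python 3.8
# and is rejected since Python 3.11. On newer versions, create the tasks inside a coroutine as in Example 2.
asyncio.run(asyncio.wait(task_list))

Examples 2 and 3 do the same thing.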
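
As an alternative that also works on current Python versions, asyncio.gather() accepts coroutine objects directly and returns the results in the order they were passed in. This is a sketch of a swapped-in technique, not the approach used in the examples above; func here takes a made-up argument n.

```python
import asyncio

async def func(n):
    await asyncio.sleep(0.01)
    return f"Return value {n}"

async def main():
    # gather() wraps each coroutine in a task and runs them concurrently
    return await asyncio.gather(func(1), func(2))

results = asyncio.run(main())
print(results)  # ['Return value 1', 'Return value 2']
```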

5. Future object

Task inherits from Future, and the way a Task object handles the result of await is built on the Future object.

#===================Future object========================
import asyncio

async def main():
    #Get current event loop
    loop = asyncio.get_running_loop()

    #Create a task (future object) that does nothing
    fut = loop.create_future()

    #Wait for the final result of the task (future object). If there is no result, it will wait forever.
    await fut

asyncio.run(main())

# ==================Future example 2=====================
import asyncio

async def set_after(fut):
    await asyncio.sleep(2)
    fut.set_result("666")

async def main():
    #Get current event loop
    loop = asyncio.get_running_loop()

    # Create a Future object with no behavior bound to it; on its own it never knows when to finish.
    fut = loop.create_future()

    # Create a Task that runs set_after(); that function sets a result on fut after 2 seconds.
    # In other words, the final result of the future is set manually, and then fut can finish.
    await loop.create_task( set_after(fut))

    #Wait for the future object to get the final result, otherwise wait forever
    data = await fut
    print(data)

asyncio.run(main())
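
A Future's result does not have to be set by another coroutine. As a minimal sketch, the variant below uses loop.call_later() to schedule a plain callback that sets the result; the 0.01 second delay is an arbitrary choice for the illustration.

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()

    # Schedule fut.set_result("666") to be called by the event loop after 0.01 s
    loop.call_later(0.01, fut.set_result, "666")

    # Suspends until the callback has set the result
    return await fut

data = asyncio.run(main())
print(data)  # 666
```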

6. Asynchronous iterator

Asynchronous iterator:
An object that implements the __aiter__() and __anext__() methods. __anext__() must return an awaitable object. async for resolves the awaitables returned by the asynchronous iterator's __anext__() method until it raises a StopAsyncIteration exception. Introduced by PEP 492.
Asynchronous iterable:
An object that can be used in an async for statement. Its __aiter__() method must return an asynchronous iterator. Introduced by PEP 492.
Example 1: asynchronous iterator

import asyncio

class Reader(object):

    #Custom asynchronous iterators (also asynchronous iteratable objects)

    def __init__(self):
        self.count = 0

    async def readline(self):
        #await asyncio.sleep(1)
        self.count += 1
        if self.count == 100:
            return None
        return self.count

    def __aiter__(self):
        return self

    async def __anext__(self):
        val = await self.readline()
        if val is None:
            raise StopAsyncIteration
        return val

async def func():
    obj = Reader()
    async for item in obj:
        print(item)

asyncio.run(func())
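
To see what async for does with these two methods, the sketch below drives an asynchronous iterator by hand and compares the result with an async comprehension. The Countdown class is a made-up example, not part of the code above.

```python
import asyncio

class Countdown:
    # Hypothetical asynchronous iterator that yields start, start-1, ..., 1
    def __init__(self, start):
        self.current = start

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.current <= 0:
            raise StopAsyncIteration
        await asyncio.sleep(0)      # stand-in for real asynchronous work
        self.current -= 1
        return self.current + 1

async def main():
    # Roughly what async for does: await __anext__() until StopAsyncIteration
    it = Countdown(3).__aiter__()
    by_hand = []
    while True:
        try:
            item = await it.__anext__()
        except StopAsyncIteration:
            break
        by_hand.append(item)

    with_for = [x async for x in Countdown(3)]
    return by_hand, with_for

by_hand, with_for = asyncio.run(main())
print(by_hand, with_for)  # [3, 2, 1] [3, 2, 1]
```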

Example 2: asynchronous context manager
Such an object defines the __aenter__() and __aexit__() methods to control the environment inside an async with statement. Introduced by PEP 492.

#=============Asynchronous context manager========
import asyncio

class AsyncContextManager:
    def __init__(self):
        self.conn = None

    async def do_something(self):
        # Placeholder for an asynchronous database operation; 666 is an arbitrary value
        return 666

    async def __aenter__(self):
        # Connect to the database asynchronously (simulated with sleep)
        self.conn = await asyncio.sleep(1)
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        # Close the database connection asynchronously (simulated with sleep)
        await asyncio.sleep(1)

async def func():
    async with AsyncContextManager() as f:
        result = await f.do_something()
        print(result)

asyncio.run(func())
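
The order in which the two methods run can be checked with a small sketch. The Tracked class and its log list are assumptions made for this illustration; it also shows that __aexit__() runs when the body raises an exception.

```python
import asyncio

class Tracked:
    # Hypothetical context manager that records when it is entered and exited
    def __init__(self, log):
        self.log = log

    async def __aenter__(self):
        await asyncio.sleep(0)
        self.log.append("enter")
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        await asyncio.sleep(0)
        self.log.append(f"exit:{exc_type.__name__ if exc_type else None}")
        return False    # do not swallow the exception

async def main():
    log = []
    async with Tracked(log):
        log.append("body")
    try:
        async with Tracked(log):
            raise ValueError("boom")
    except ValueError:
        log.append("caught")
    return log

log = asyncio.run(main())
print(log)
```

Because __aexit__() returns False here, the exception still propagates out of the async with block after the cleanup has run.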

Keywords: Python crawler

Added by Grayda on Sat, 01 Jan 2022 17:00:52 +0200