Control the concurrency of Promise.all through Iterator

Background

Asynchrony is one of the most important features of JavaScript. Often, though, we want more than to run a series of tasks in parallel: we also want to limit how many of them run at the same time. This matters most for asynchronous tasks that use limited resources, such as file handles or network ports.

Consider the following example.

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

// simulate an async task that takes 1s to finish
async function execute(id) {
  console.log(`start work ${id}`);
  await sleep(1000);
  console.log(`work ${id} done`);
}

Promise.all([1, 2, 3, 4, 5, 6, 7, 8, 9].map(execute));

Output results:

"start work 1"
"start work 2"
"start work 3"
"start work 4"
"start work 5"
"start work 6"
"start work 7"
"start work 8"
"start work 9"
"work 1 done"
"work 2 done"
"work 3 done"
"work 4 done"
"work 5 done"
"work 6 done"
"work 7 done"
"work 8 done"
"work 9 done"

As you can see, all the tasks start executing at the same time.

Now, what if we want only two of these tasks to run at a time, with further tasks starting only as earlier ones complete? In other words, what if we want a concurrency of two?

Solution

Controlling Promise creation is the key

Promise.all does not trigger the execution of the Promises it receives. What actually starts the work is creating the Promise itself. In other words, a Promise's work has already begun by the time the Promise exists. So if we want to control the concurrency of Promises, we need to control when they are created.
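The point above can be checked with a minimal sketch. The names eager, lazy and log are made up for this illustration. The executor passed to new Promise runs synchronously at construction time, while wrapping the construction in a function defers the work until that function is called.

```javascript
// Records the order in which things happen.
const log = [];

// The executor runs synchronously, the moment the Promise is constructed.
const eager = new Promise(resolve => {
  log.push('eager executor ran');
  resolve('eager');
});
log.push('after creating eager promise');

// Wrapping the construction in a function defers the work until the call.
const lazy = () => new Promise(resolve => {
  log.push('lazy executor ran');
  resolve('lazy');
});
log.push('after defining lazy factory');

lazy();
log.push('after calling lazy factory');

console.log(log);
```

The log shows the eager executor running before the line that follows its creation, while the lazy one runs only once lazy() is called. This is why every concurrency limiter works with functions that return Promises rather than with Promises that already exist.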

Control concurrency through Iterator

The common solution is a function that takes three parameters: the array of tasks, the function to run on each task, and the concurrency limit. It watches for Promises to complete and creates new ones only when there is room under the limit, which is how it controls Promise creation.
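For comparison, here is one possible sketch of that common approach. It is not taken from any particular library; the name runWithLimit and its parameter names are assumptions made for this illustration. It keeps a set of in-flight Promises and waits on Promise.race whenever the set is full.

```javascript
// Hypothetical helper: runs `func` over `items` with at most
// `concurrency` calls in flight. Results keep the input order.
async function runWithLimit(items, func, concurrency) {
  const results = new Array(items.length);
  const inFlight = new Set();

  for (const [index, item] of items.entries()) {
    // Calling func here is what starts the task.
    const p = Promise.resolve(func(item)).then(value => {
      results[index] = value;
      inFlight.delete(p);
    });
    inFlight.add(p);

    // When the pool is full, wait for any one task to finish
    // before creating the next Promise.
    if (inFlight.size >= concurrency) {
      await Promise.race(inFlight);
    }
  }

  // Wait for the tasks that are still running.
  await Promise.all(inFlight);
  return results;
}
```

With the execute function from the first example, runWithLimit([1, 2, 3, 4, 5, 6, 7, 8, 9], execute, 2) would keep at most two tasks running. In this sketch a rejected task makes the returned Promise reject, but tasks that have already started keep running.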

Now, let's try another way to control concurrency through Iterator.

What happens when you traverse the same Iterator at the same time?

Let's start with a simplified example.

// Array.prototype.values() returns an Array Iterator
const iterator = [1, 2, 3].values();

for (const x of iterator) {
  console.log(`loop x: ${x}`);

  for (const y of iterator) {
    console.log(`loop y: ${y}`);
  }
}

Output results:

"loop x: 1"
"loop y: 2"
"loop y: 3"

Did you notice? The y loop picks up where the x loop left off, and both loops end once all elements have been traversed. An iterator is stateful: every consumer of the same iterator object advances the same position. That's what we're going to take advantage of.

Readers unfamiliar with Iterator can refer to the MDN article: https://developer.mozilla.org...
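The same behavior can be seen by driving the iterator by hand. This is a small sketch; the variable names first, second and rest are chosen only for the illustration.

```javascript
const iterator = ['a', 'b', 'c'].values();

// An array iterator is its own iterable, so for...of receives
// the very same object instead of a fresh iterator.
console.log(iterator[Symbol.iterator]() === iterator); // true

// Each next() call advances the shared position.
const first = iterator.next();  // { value: 'a', done: false }
const second = iterator.next(); // { value: 'b', done: false }

// A for...of loop continues from wherever the previous consumer stopped.
const rest = [];
for (const item of iterator) {
  rest.push(item);
}
console.log(first.value, second.value, rest); // a b [ 'c' ]

// Once exhausted, the iterator stays exhausted.
console.log(iterator.next().done); // true
```

Because every consumer pulls from one shared position, no element is handed out twice. That property is what lets several workers split one list of tasks between them.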

Rewriting the work example with Iterator

Let's use this property of Iterator to rewrite the initial work example.

// generate workers according to concurrency number
// each worker takes the same iterator
const limit = concurrency => iterator => {
  const workers = new Array(concurrency);
  return workers.fill(iterator);
};

// run tasks in an iterator one by one
const run = func => async iterator => {
  for (const item of iterator) {
    await func(item);
  }
};

// wrap limit and run together
function asyncTasks(array, func, concurrency = 1) {
  return limit(concurrency)(array.values()).map(run(func));
}

const tasks = [1, 2, 3, 4, 5, 6, 7, 8, 9];

Promise.all(asyncTasks(tasks, execute, 2));

Output results:

"start work 1"
"start work 2"
"work 1 done"
"start work 3"
"work 2 done"
"start work 4"
"work 3 done"
"start work 5"
"work 4 done"
"start work 6"
"work 5 done"
"start work 7"
"work 6 done"
"start work 8"
"work 7 done"
"start work 9"
"work 8 done"
"work 9 done"

As we expected, only two asynchronous tasks are executed at the same time until all tasks are finished.

However, the program is not perfect. The main problem is that if one worker fails during execution, the other workers do not stop. In the example above, if worker 1 stops on an error, worker 2 will carry on and execute all the remaining tasks alone. So if you want to keep two tasks running at all times, the easiest way is to handle errors inside each execute call, for example by adding a catch.

Imperfect as it is, using an Iterator to control Promise creation is a simple and effective way to limit asynchronous concurrency.

Of course, in a real project we should avoid reinventing the wheel: p-limit, async-pool, and even bluebird are all simple, easy-to-use solutions.
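One way to do that is sketched below, as a variant of the code above rather than the definitive fix. The settle wrapper and the flaky task are hypothetical names introduced for this illustration. The wrapper turns a rejection into a recorded outcome, so a failing task no longer ends its worker's loop.

```javascript
// Hypothetical wrapper: converts a rejection into a recorded outcome.
const settle = func => async item => {
  try {
    return { status: 'fulfilled', value: await func(item) };
  } catch (reason) {
    return { status: 'rejected', reason };
  }
};

// Same shape as run above, but each worker also collects its outcomes.
const run = func => async iterator => {
  const outcomes = [];
  for (const item of iterator) {
    outcomes.push(await func(item));
  }
  return outcomes;
};

// Every worker shares one iterator; func is wrapped so that errors
// are captured instead of stopping the worker.
function asyncTasks(array, func, concurrency = 1) {
  const iterator = array.values();
  return new Array(concurrency).fill(iterator).map(run(settle(func)));
}

// Made-up task for the demonstration: item 2 always fails.
const flaky = async id => {
  if (id === 2) throw new Error(`work ${id} failed`);
  return id;
};

Promise.all(asyncTasks([1, 2, 3, 4], flaky, 2))
  .then(perWorker => console.log(perWorker.flat()));
```

All four tasks are attempted and both workers stay alive until the iterator is exhausted. The outcomes are grouped by worker, so they are not guaranteed to come back in input order.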


Added by godfrank on Wed, 13 Nov 2019 12:07:06 +0200