Data oriented programming [23]

5.4 implementation of mediation algorithm

You: what changes need to be made to the code in order to implement the mediation algorithm?

Joe: it just needs to modify systemdata The code of commit () is shown in listing 5.1.

Listing 5.1 system data class

class SystemData {
    systemData;

    get() {
        return this.systemData;
    }

    set(_systemData) {
        this.systemData = _systemData;
    }

    commit(previous, next) {
        if(!SystemValidity.validate(previous, next)) {
            throw "The system data to be committed is not valid!";
        };
        this.systemData = SystemConsistency.reconcile(this.systemData,
                                                      previous,
                                                      next);
    }
}

You: so how does SystemConsistency adjust?

Joe: the SystemConsistency class exposed in listing 5.2 starts the reconciliation process by comparing the previous one with the current one. If they are the same, then we'll fast forward and go back to the next one.

Listing 5.2 mediation process in operation

class SystemConsistency {
    static threeWayMerge(current, previous, next) {
        var previousToCurrent = DataDiff.diff(previous, current);
        var previousToNext = DataDiff.diff(previous, next);
        if(DataDiff.isEmptyIntersection(previousToCurrent, previousToNext)) {
            return DataDiff.patch(current, previousToNext);
        }
        throw "Conflicting concurrent mutations.";
    }
    static reconcile(current, previous, next) {
        if(current == previous) {
            return next;
        }
        return SystemConsistency.threeWayMerge(current,
                                               previous,
                                               next);
    }
}

TIP on the one hand, the code of SystemConsistency class is exquisite. On the other hand, the code is stateless.

You: wait a minute! Why do you compare previous and current by reference? You should compare them with values. If you want to compare all the leaves of two nested hash graphs, the amount of calculation will be very large.

Joe: This is another advantage of immutable data: when the data is not mutated, it is safe to compare references. If they are the same, we can determine that the data is the same.

TIP when the data is immutable, it is safe to compare by reference. This is super fast: when the references are the same, it means that the data is the same.

You: how about the implementation of the three-way merging algorithm?

Joe: when the previous is different from the current, it means that the simultaneous mutation has run. To determine whether there is a conflict, we calculate two differences: previousToCurrent and previousToNext. If the intersection of the two differences is empty, it means that there is no conflict. We can safely incorporate the changes caused by the transition from the previous to the next. In terms of code, we merge previousToNext into current.

take a closer look at the code of the SystemConsistency class in listing 5.2. You try to find out whether these codes are thread safe.

You: I don't think the code of the SystemConsistency class in listing 5.2 is thread safe. If there is a context switch between checking whether the system has changed in the SystemConsistency class and updating the state in the SystemData class, a mutation may overwrite the previous mutation.

Joe: you're absolutely right! Code works well in a single threaded environment like JavaScript, where concurrency is handled through event loops. However, in a multithreaded environment, the code needs to be improved and thread safe using atoms. I'll tell you how to do it in the second part.

Warning SystemConsistency class is not thread safe. We will make it thread safe in the second part.

5.5 semantic differences

NOTE this section deals with the implementation of the DataDiff class used by SystemConsistency. If you don't want to challenge your mind with refined recursive details now, feel free to skip this section. This does not prevent you from enjoying the rest of the discussion with Joe throughout the book. You can look at this section later.

TODO: it is acknowledged that this is a simplified version of semantic differences and does not involve deletion.

TODO: admit it's not new.

TODO: renamed structural difference

You: how do you calculate the difference between the system states of different versions?

Joe: This is the most challenging part of the mediation algorithm. The SystemConsistency class uses the DataDiff class for variance calculation. The code of DataDiff class implements a semantic difference algorithm: This is definitely not a trivial matter. But in the end, it only handles data operations.

Listing 5.3 data diff class

class DataDiff {
    static diffObjects(data1, data2) {
        var emptyObject = _.isArray(data1) ? [] : {};
        if(data1 == data2) {
            return emptyObject;
        }
        var keys = _.union(_.keys(data1), _.keys(data2));
        return _.reduce(keys,
                        function (acc, k) {
                            var res = DataDiff.diff(_.get(data1, k),
                                                    _.get(data2, k));
                            if((_.isObject(res) && _.isEmpty(res)) ||
                               (res == "data-diff:no-diff")) {
                                return acc;
                            }
                            return _.set(acc, [k], res);
                        },
                        emptyObject);
    }

    static diff(data1, data2) {
        if(_.isObject(data1) && _.isObject(data2)) {
            return DataDiff.diffObjects(data1, data2);
        }
        if(data1 !== data2) {
            return data2;
        }
        return "data-diff:no-diff";
    }

    static patch(data, diff) {
        return _.merge(data, diff);
    }

    static leaves(obj, prefix = '') {
        return _.reduce(obj,
                        function(acc, v, k) {
                            if (_.isObject(v)) {
                                return _.concat(acc,
                                                DataDiff.leaves(v,
                                                                prefix + k + "."));
                            }
                            return _.concat(acc, [prefix + k]);
                        },
                        []);
    }
    static isEmptyIntersection(delta1, delta2) {
        return _.isEmpty(_.intersection(DataDiff.leaves(delta1),
                                        DataDiff.leaves(delta2)));
    }
}

You: I want to say: refined data processing. It involves a Recursion in reduce()!

Joe: on the one hand, it's tricky, but on the other hand, it's very easy to write unit test code. After writing a few use cases, you can be sure that your code is error free!

You: datadiff What will the unit test of diff () look like?

Joe: similar to the code in listing 5.4. I'll tell you more about writing unit tests for data manipulation functions in the second part.

Listing 5.4 datadiff A basic unit test of diff()

var data1 = {
    g: {
        c: 3
    },
    x: 2,
    y: {
        z: 1
    },
    w: [5]
}

var data2 = {
    g: {
        c:3
    },
    x: 2,
    y: {
        z: 2
    },
    w: [4]
}

_.isEqual(DataDiff.diff(data1, data2),
          {
              "w":  [
                  4
              ],
              "y":  {
                  "z": 2
              }
          })

TIP for a piece of code that only deals with data operations, it is easy to write unit tests!

You: I think people can also easily write unit tests for system consistency.

Joe: it must be!

You: how about the performance of DataDiff? Do we have to check all the leaves of the two pieces of data?

Joe: in general, yes. However, in the case of operating system data in the way of structure sharing, the efficiency of code is much higher.

You: what do you mean?

JOE: in the case of structural sharing, most nested objects are shared between two versions of system state. So most of the time, when the code enters datadiff Diffobjects(), it will return immediately because data1 and data2 are the same.

TIP is effective in calculating the difference between the states of the two versions, because two hash graphs created from the same hashgraph through structural sharing have most of the common nodes.

Added by hellouthere on Fri, 28 Jan 2022 15:06:32 +0200

Programming VIP

Data oriented programming [23]

5.4 implementation of mediation algorithm

5.5 semantic differences

Popular Keywords