Scheduler 16 - core_ctl

Linux-5.10 based on MTK

1, Related file interfaces

1. parameters file interface

/sys/module/mtk_core_ctl/parameters # ls -l
-rw------- 1 root   root   debug_enable //controls core_ctl_debug() printing in core_ctl.c, TAG is "core_ctl"
-rw-rw---- 1 system system policy_enable

(1) debug_enable

The default value is false. It controls the printing of core_ctl_debug() in core_ctl.c; the TAG is "core_ctl".

(2) policy_enable

The default is false. In demand_eval(), if policy_enable is not set, need_cpus directly takes cluster->max_cpus. In that case, isolating or unisolating cpu cores is controlled only by user space, through the file nodes under core_ctl and whether boost is set.

 

2. core_ctl file interface

/sys/devices/system/cpu/cpu0/core_ctl # ls -l
-rw------- 1 root root core_ctl_boost
-rw------- 1 root root enable
-r-------- 1 root root global_state
-rw-rw-r-- 1 root root max_cpus
-rw-rw-r-- 1 root root min_cpus
-rw------- 1 root root not_preferred
-rw------- 1 root root offline_throttle_ms
-r-------- 1 root root ppm_state //shows a table
-r-------- 1 root root thermal_up_thres
-rw------- 1 root root up_thres

(1) core_ctl_boost

Corresponds to the cluster_data::boost member, false by default; setting it to 1 boosts all clusters. In demand_eval(), if the cluster is in the boost state, need_cpus directly takes cluster->max_cpus, that is, no actual isolation is performed any more, and any already-isolated cpu must be unisolated.

(2) enable

Corresponds to the cluster_data::enable member, true by default. It is checked in demand_eval(): if enable is not set, need_cpus directly takes cluster->max_cpus, that is, no actual isolation is performed any more, and any already-isolated cpu must be unisolated.
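The gating effect that policy_enable, boost, and enable have on demand_eval() can be condensed into a small userspace sketch. The struct and function below are simplified stand-ins for illustration, not the actual MTK kernel code:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's struct cluster_data (illustration only). */
struct cluster_data {
    unsigned int max_cpus;
    unsigned int new_need_cpus;
    bool enable;
    bool boost;
};

static bool policy_enable = false; /* mirrors the module parameter's default */

/* Sketch of the gating described above: if the cluster is boosted, disabled,
 * or the policy is off, the demand falls back to max_cpus, so no isolation
 * is performed. */
static unsigned int eval_need_cpus(const struct cluster_data *c)
{
    if (c->boost || !c->enable || !policy_enable)
        return c->max_cpus;
    return c->new_need_cpus;
}
```

Only when all three gates are open does the algorithm's new_need_cpus estimate actually take effect.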

(3) global_state

Prints the number of active CPUs, needed CPUs, and paused CPUs in the cluster, plus the online, pause, busy, and not_preferred status of each cpu in the cluster, readable with cat.

(4) max_cpus

Corresponds to the cluster_data::max_cpus member. In do_core_ctl(), run by the "core_ctl_v2/X" thread, apply_limits(cluster, cluster->need_cpus) is executed before the actual isolate/unisolate and clamps need_cpus between cluster->min_cpus and cluster->max_cpus; that is, the default logic respects the user-space limit on the number of cpu cores. The user-space limit has the highest priority, higher than the core count estimated by the core_ctl_tick() logic. The eas_ioctl file interface, however, does not respect the user-space cpu limit.
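The clamping that apply_limits() performs can be sketched in userspace (the function below is a simplified stand-in, not the kernel implementation):

```c
#include <assert.h>

/* Sketch of apply_limits(): clamp the evaluated demand between the
 * user-space min_cpus/max_cpus limits. */
static unsigned int apply_limits(unsigned int need,
                                 unsigned int min_cpus,
                                 unsigned int max_cpus)
{
    if (need < min_cpus)
        return min_cpus;
    if (need > max_cpus)
        return max_cpus;
    return need;
}
```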

(5) min_cpus

Corresponds to the cluster_data::min_cpus member. The per-cluster defaults are default_min_cpus[MAX_CLUSTERS] = {4, 2, 0}. After it is set, the "core_ctl_v2/X" thread is woken immediately to perform the isolate/unisolate operation.

(6) not_preferred

Corresponds to the cluster_data::not_preferred member. If some CPUs are marked not_preferred, then when try_to_pause() isolates CPUs it gives priority to these not_preferred CPUs; if the not_preferred CPUs have all been isolated and active_cpus == need has still not been reached, it continues and isolates CPUs that are not marked not_preferred.
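This selection priority can be modeled with a small userspace sketch, a simplified stand-in for try_to_pause() that keeps only the two-pass not_preferred logic (the busy and force_paused checks are omitted):

```c
#include <assert.h>
#include <stdbool.h>

#define NCPUS 4

struct cpu_slot {
    bool active;
    bool not_preferred;
};

/* Pause CPUs until only `need` remain active; not_preferred CPUs go first,
 * then (second pass) any CPU, scanning from the highest cpu number down as
 * the kernel loop does. Returns the number of CPUs paused. */
static unsigned int pause_down_to(struct cpu_slot cpus[NCPUS], unsigned int need)
{
    bool check_not_prefer = true;
    unsigned int active = 0, paused = 0;
    int i;

    for (i = 0; i < NCPUS; i++)
        active += cpus[i].active;
again:
    for (i = NCPUS - 1; i >= 0; i--) {
        if (active == need)
            return paused;
        if (!cpus[i].active)
            continue;
        if (check_not_prefer && !cpus[i].not_preferred)
            continue;
        cpus[i].active = false;
        active--;
        paused++;
    }
    if (check_not_prefer && active != need) {
        check_not_prefer = false;  /* second pass: any CPU may be paused */
        goto again;
    }
    return paused;
}
```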

 

2, core_ctl set path

1. The scheduler_tick cycle updates the required cpu core count and triggers isolate/unisolate

(1) Call path

scheduler_tick //core.c
    core_ctl_tick //core_ctl.c trace_android_vh_scheduler_tick(rq); converts the per-cpu 4 ms windows into a global 4 ms window, so it actually runs once every 4 ms
        if (enable_policy)
            core_ctl_main_algo(); //update cluster->new_need_cpus via a certain algorithm
        apply_demand //core_ctl.c called for every cluster
            for_each_cluster(cluster, index)
                apply_demand(cluster) //core_ctl.c
                    if (demand_eval(cluster))
                        wake_up_core_ctl_thread(cluster); //wake the per-cluster kernel thread "core_ctl_v2/X"
                            try_core_ctl //core_ctl.c per-cluster kernel thread "core_ctl_v2/X", a priority-0 RT thread; it usually sleeps and is woken when core control is needed
                                do_core_ctl

(2) Correlation function

static void __ref do_core_ctl(struct cluster_data *cluster) //core_ctl.c
{
    ...
    //returns cluster->need_cpus clamped between cluster->min_cpus and cluster->max_cpus
    need = apply_limits(cluster, cluster->need_cpus);
    //true when need < cluster->active_cpus, or need > cluster->active_cpus and cluster->nr_paused_cpus != 0
    if (adjustment_possible(cluster, need)) {
        if (cluster->active_cpus > need)
            try_to_pause(cluster, need);
        else if (cluster->active_cpus < need)
            try_to_resume(cluster, need);
    }
    ...
}            
                        
try_to_pause //core_ctl.c keeps pausing until cluster->active_cpus equals the need parameter, updating cluster->active_cpus and cluster->nr_paused_cpus as it goes
    sched_pause_cpu //core_pause.c pause one cpu
        pause_cpus //kernel/cpu.c

try_to_resume //core_ctl.c keeps resuming until cluster->active_cpus equals the need parameter, updating cluster->active_cpus and cluster->nr_paused_cpus as it goes
    sched_resume_cpu //core_pause.c resume one cpu
        resume_cpus //kernel/cpu.c


static void try_to_pause(struct cluster_data *cluster, int need)
{
    int cpu;
    unsigned int nr_paused = 0;
    unsigned int num_cpus = cluster->num_cpus;
    //whether this cluster has any cpu marked not_preferred
    bool check_not_prefer = cluster->nr_not_preferred_cpus;
    bool check_busy = true;

again:
    for (cpu = nr_cpu_ids - 1; cpu >= 0; cpu--) {
        struct cpu_data *c;

        if (!cpumask_test_cpu(cpu, &cluster->cpu_mask))
            continue;

        if (!num_cpus--)
            break;

        c = &per_cpu(cpu_state, cpu);
        if (!is_active(c))
            continue;

        //a cpu counts as busy if its used-capacity percentage c->cpu_util_pct is not below cluster->cpu_busy_up_thres
        if (check_busy && c->is_busy)
            continue;

        //cpu force-isolated via perf_ioctl
        if (c->force_paused)
            continue;

        //stop pausing once active == need; otherwise keep trying to pause
        if (cluster->active_cpus == need)
            break;

        //pause only not_preferred CPUs; if no CPU is marked not_preferred, all CPUs qualify for isolation
        if (check_not_prefer && !c->not_preferred)
            continue;

        //perform the actual isolate operation
        if (!sched_pause_cpu(c->cpu)) {
            if (cpu_online(c->cpu))
                //record that it was isolated by core_ctl
                c->paused_by_cc = true;
            nr_paused++;
        }
        cluster->active_cpus = get_active_cpu_count(cluster);
    }

    cluster->nr_paused_cpus += nr_paused;

    if (check_busy || (check_not_prefer && cluster->active_cpus != need)) {
        num_cpus = cluster->num_cpus;
        check_not_prefer = false; //retry with the restrictions relaxed
        check_busy = false;
        goto again;
    }
}

sched_pause_cpu --> pause_cpus

//the parameter is the mask of the CPUs to pause
int pause_cpus(struct cpumask *cpus) //kernel/cpu.c
{
    ...
    if (cpu_hotplug_disabled) { //pausing is only possible when cpu hotplug is not disabled
        err = -EBUSY;
        goto err_cpu_maps_update;
    }

    //only active CPUs are paused
    cpumask_and(cpus, cpus, cpu_active_mask);

    for_each_cpu(cpu, cpus) {
        //a cpu cannot be paused if it is offline, its dl task bandwidth is insufficient, or offlining is disabled for its device
        if (!cpu_online(cpu) || dl_cpu_busy(cpu) || get_cpu_device(cpu)->offline_disabled == true) {
            err = -EBUSY;
            goto err_cpu_maps_update;
        }
    }

    //cannot pause all of the active CPUs
    if (cpumask_weight(cpus) >= num_active_cpus()) {
        err = -EBUSY;
        goto err_cpu_maps_update;
    }

    //set the CPUs to pause to non-active, i.e. remove them from cpu_active_mask
    for_each_cpu(cpu, cpus)
        set_cpu_active(cpu, false); //an isolated cpu no longer appears in cpu_active_mask

    //perform the pause
    err = __pause_drain_rq(cpus);

    trace_cpuhp_pause(cpus, start_time, 1);

    return err;
}
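The sanity checks at the top of pause_cpus() can be condensed into a userspace sketch, with cpumasks reduced to plain bitmasks (bit i = cpu i). The helper below is hypothetical, not the kernel function; -1 stands in for -EBUSY:

```c
#include <assert.h>

/* Sketch of the pause_cpus() validation: refuse when hotplug is disabled,
 * consider only active CPUs, and never allow pausing every active CPU. */
static int can_pause(unsigned int request_mask, unsigned int active_mask,
                     int hotplug_disabled)
{
    if (hotplug_disabled)
        return -1;                       /* -EBUSY in the kernel */
    request_mask &= active_mask;         /* only active CPUs can be paused */
    if (__builtin_popcount(request_mask) >= __builtin_popcount(active_mask))
        return -1;                       /* must leave at least one active CPU */
    return 0;
}
```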

 

2. Forcing core_ctl through the perf_ioctl interface

/proc/perfmgr/eas_ioctl can force core_ctl from here

static long eas_ioctl_impl(struct file *filp, unsigned int cmd, unsigned long arg, void *pKM) //perf_ioctl.c
{
    struct _CORE_CTL_PACKAGE msgKM = {0};
    ...
    switch (cmd) {
    case CORE_CTL_FORCE_PAUSE_CPU: //force cpu isolation
        if (perfctl_copy_from_user(&msgKM, ubuf, sizeof(struct _CORE_CTL_PACKAGE)))
            return -1;

        bval = !!msgKM.is_pause;
        ret = core_ctl_force_pause_cpu(msgKM.cpu, bval);
        break;
    ...
    }
    ...
}

//is_pause: 1 pause, 0 resume
int core_ctl_force_pause_cpu(unsigned int cpu, bool is_pause)
{
    int ret;
    struct cpu_data *c;
    struct cluster_data *cluster;
    ...

    if (!cpu_online(cpu))
        return -EBUSY;

    c = &per_cpu(cpu_state, cpu);
    cluster = c->cluster;

    //Perform actual pause and resume
    if (is_pause)
        ret = sched_pause_cpu(cpu);
    else
        ret = sched_resume_cpu(cpu);

    //mark that it was paused by the force interface
    c->force_paused = is_pause;
    if (c->paused_by_cc) {
        c->paused_by_cc = false;
        cluster->nr_paused_cpus--;
    }
    cluster->active_cpus = get_active_cpu_count(cluster);

    return ret;
}

A CPU force-isolated through the perf_ioctl interface has cpu_data::force_paused set to 1, and sched_pause_cpu()/sched_resume_cpu() are called directly for isolation and unisolation. The isolate/unisolate logic in the original "core_ctl_v2/X" thread path skips any CPU whose c->force_paused flag is set, i.e. a force-isolated CPU can only be unisolated through the force interface!
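The interplay between the force interface and the core_ctl thread can be sketched as follows. These are simplified userspace stand-ins; the real code lives in perf_ioctl.c and core_ctl.c:

```c
#include <assert.h>
#include <stdbool.h>

struct cpu_ctl {
    bool active;
    bool force_paused;
    bool paused_by_cc;
};

/* perf_ioctl force path: pause/resume directly and record the force flag. */
static void force_pause_cpu(struct cpu_ctl *c, bool is_pause)
{
    c->active = !is_pause;        /* sched_pause_cpu()/sched_resume_cpu() */
    c->force_paused = is_pause;
    c->paused_by_cc = false;      /* no longer accounted to core_ctl */
}

/* core_ctl "core_ctl_v2/X" path: skips force-paused CPUs entirely. */
static bool core_ctl_try_resume(struct cpu_ctl *c)
{
    if (c->force_paused)
        return false;             /* only the force interface may resume it */
    c->active = true;
    return true;
}
```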

 

3. Setting through the max_cpus/min_cpus file interface

By setting the max_cpus and min_cpus file interfaces under /sys/devices/system/cpu/cpuX/core_ctl:

static void set_min_cpus(struct cluster_data *cluster, unsigned int val)
{
    ...
    cluster->min_cpus = min(val, cluster->max_cpus);
    ...
    //awaken"core_ctl_v2/X"thread 
    wake_up_core_ctl_thread(cluster);
}

static void set_max_cpus(struct cluster_data *cluster, unsigned int val) //core_ctl.c
{
    ...
    val = min(val, cluster->num_cpus);
    cluster->max_cpus = val;
    //this is how cores are limited: just echo a value into the max_cpus file
    cluster->min_cpus = min(cluster->min_cpus, cluster->max_cpus);
    ...
    //awaken"core_ctl_v2/X"thread 
    wake_up_core_ctl_thread(cluster);
}

/sys/devices/system/cpu/cpu0/core_ctl # cat min_cpus
4
/sys/devices/system/cpu/cpu0/core_ctl # echo 1 > max_cpus
/sys/devices/system/cpu/cpu0/core_ctl # cat max_cpus
1
/sys/devices/system/cpu/cpu0/core_ctl # cat min_cpus
1
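The echo example above follows directly from the ordering inside set_max_cpus(): lowering max_cpus also drags min_cpus down with it. A userspace sketch of the two setters (simplified, without the locking and bookkeeping in core_ctl.c):

```c
#include <assert.h>

struct limits {
    unsigned int min_cpus;
    unsigned int max_cpus;
    unsigned int num_cpus;
};

/* Sketch of set_max_cpus(): cap by the cluster size, then re-clamp min. */
static void set_max_cpus(struct limits *l, unsigned int val)
{
    if (val > l->num_cpus)
        val = l->num_cpus;
    l->max_cpus = val;
    if (l->min_cpus > l->max_cpus)
        l->min_cpus = l->max_cpus;
}

/* Sketch of set_min_cpus(): min can never exceed max. */
static void set_min_cpus(struct limits *l, unsigned int val)
{
    l->min_cpus = val < l->max_cpus ? val : l->max_cpus;
}
```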

Summary: both the core_ctl_tick path and the max_cpus/min_cpus setting path perform core isolation and unisolation by waking the priority-0 RT thread "core_ctl_v2/X"; the former updates the core requirement new_need_cpus, while the latter adjusts the core count limits. The force path calls the pause interface directly for isolation and unisolation, and the CPUs it operates on are not affected by the "core_ctl_v2/X" thread; resume_cpus is the opposite operation.

 

3, Debug log

1. Relevant trace

(1) trace_core_ctl_demand_eval

//Call parameters:
demand_eval
    trace_core_ctl_demand_eval(cluster->cluster_id, old_need, new_need, cluster->active_cpus,
        cluster->min_cpus, cluster->max_cpus, cluster->boost, cluster->enable, ret && need_flag);

//trace Print:
        <idle>-0       [006] d.h3  2007.792026: core_ctl_demand_eval: cid=0, old=2, new=4, act=2 min=0 max=4 bst=0 enbl=1 update=1
core_ctl_v2/0-463      [006] d.h3  2007.796037: core_ctl_demand_eval: cid=0, old=4, new=4, act=3 min=0 max=4 bst=0 enbl=1 update=1

The fields are printed in the order of the call parameters. Only when update=1 is the core_ctl thread woken to perform further isolate/unisolate operations.

(2) trace_core_ctl_algo_info

//Call parameters:
core_ctl_main_algo
    trace_core_ctl_algo_info(big_cpu_ts, heaviest_thres, max_util, cpumask_bits(cpu_active_mask)[0], orig_need_cpu);

//trace Print:
sh-18178   [004] d.h2 18903.565478: core_ctl_algo_info: big_cpu_ts=67692 heaviest_thres=770 max_util=786 active_cpus=f1 orig_need_cpus=4|9|6

big_cpu_ts: the temperature of the big core cpu7, here 67.692 degrees
heaviest_thres: the util threshold for deciding whether the big core needs to be brought up; below 65 degrees it is the mid cluster's up_thres/100 * max_capacity, above 65 degrees it is thermal_up_thres/100 * max_capacity
max_util: the util of the heaviest task across all CPUs, updated in sched_max_util_task_tracking(), which runs every 8 ms
active_cpus: prints cpu_active_mask, from which you can see which CPUs are isolated or set offline; testing shows both isolated and offline CPUs are reflected in cpu_active_mask
orig_need_cpus: an array that prints each cluster's cluster->new_need_cpus in turn, i.e. the evaluated number of cpu cores each cluster needs

Note: MTK's new_need_cpus evaluation algorithm is evidently not great; in an airplane-mode, screen-off scenario it still estimates that the clusters need 4|9|6 cores.
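The heaviest_thres rule described above can be written out as a small sketch. The 65-degree split and the percent-of-max_capacity formula come from the text; the temperature unit (0.001 degree, matching big_cpu_ts=67692 being 67.692 degrees) and the concrete threshold values in the test are assumptions for illustration:

```c
#include <assert.h>

/* Sketch of the heaviest_thres selection: below 65 C use the mid cluster's
 * up_thres, above it thermal_up_thres, both as a percent of max_capacity. */
static unsigned int calc_heaviest_thres(unsigned int big_cpu_ts,       /* 0.001 C */
                                        unsigned int up_thres,          /* percent */
                                        unsigned int thermal_up_thres,  /* percent */
                                        unsigned int max_capacity)
{
    unsigned int pct = (big_cpu_ts < 65000) ? up_thres : thermal_up_thres;
    return pct * max_capacity / 100;
}
```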

(3) trace_core_ctl_update_nr_over_thres

//Call parameters:
scheduler_tick //core.c
    core_ctl_tick //core_ctl.c
        core_ctl_main_algo
            get_nr_running_big_task
                trace_core_ctl_update_nr_over_thres(nr_up, nr_down, max_nr)

//trace Print:
sh-18174   [006] dNh2 23927.901480: core_ctl_update_nr_over_thres: nr_up=1|0|0 nr_down=0|5|0 max_nr=2|4|4

The nr_up, nr_down, and max_nr members of each cluster's cluster_data are printed separately; core_ctl_main_algo() uses them to evaluate cluster->new_need_cpus.

From "dNh2", it can be seen that this function is executed in the context of hard interrupt. At this time, the preemption count is 2.


2. Open the debug log

If something goes wrong, you can echo 1 > /sys/module/mtk_core_ctl/parameters/debug_enable to turn on the debug log and follow the code execution flow.


3. Summary: the debug log for force_paused is missing.

 

4, CPU online/offline process

Execute: echo 0/1 > /sys/devices/system/cpu/cpuX/online

Related functions:

struct bus_type cpu_subsys = { //driver/base/cpu.c
    .name = "cpu",
    .dev_name = "cpu",
    .match = cpu_subsys_match,
#ifdef CONFIG_HOTPLUG_CPU
    .online = cpu_subsys_online,
    .offline = cpu_subsys_offline,
#endif
};

static ssize_t online_store(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) //driver/base/core.c
{
    ...
    ret = strtobool(buf, &val);
    ret = val ? device_online(dev) : device_offline(dev); 
    ...
}

Call path:

device_online
    dev->bus->online(dev) //that is cpu_subsys.online
        cpu_device_up(dev)
            cpu_up(dev->id, CPUHP_ONLINE) 
    kobject_uevent(&dev->kobj, KOBJ_ONLINE);
    dev->offline = false;

device_offline
    dev->bus->offline(dev); //that is cpu_subsys.offline
        cpu_device_down(dev);
            cpu_down(dev->id, CPUHP_OFFLINE)
    kobject_uevent(&dev->kobj, KOBJ_OFFLINE);
    dev->offline = true;

The struct device structure has only an offline member, no online member. The offline call path checks that the last active cpu cannot be taken offline. Testing shows that taking a cpu offline also clears it from cpu_active_mask, although the code that does this has not yet been tracked down.

 

Added by davemwohio on Tue, 07 Dec 2021 02:04:53 +0200