FATE details

FATE introduction

FATE (Federated AI Technology Enabler) is an open-source project initiated by the AI department of WeBank, which aims to provide a secure computing framework to support the federated AI ecosystem. It implements secure computation protocols based on homomorphic encryption and secure multi-party computation (MPC). It supports federated learning architectures and the secure computation of various machine learning algorithms, including logistic regression, tree-based algorithms, deep learning and transfer learning.
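
The additively homomorphic property these protocols rely on can be illustrated on its own. Below is a minimal sketch using the third-party python-paillier (phe) package; this is not FATE's internal implementation, only a demonstration of the principle that ciphertexts can be aggregated without decryption:

from phe import paillier  # pip install phe

# generate a keypair; in FATE the arbiter plays a similar key-holder role
public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

# two parties encrypt their local values (e.g. gradient components)
enc_a = public_key.encrypt(0.25)
enc_b = public_key.encrypt(-1.75)

# an aggregator adds the ciphertexts without seeing the plaintexts
enc_sum = enc_a + enc_b

# only the private-key holder can decrypt the aggregate
print(private_key.decrypt(enc_sum))  # -1.5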

FATE technical framework

FederatedML

Algorithm functional components, including the federated implementation of common machine learning algorithms. All modules are developed in a decoupled, modular way to enhance scalability.

FATE-Flow

FATE-Flow is the job scheduling system of the federated learning framework FATE. It manages the complete lifecycle of federated learning jobs, including data input, training job scheduling, metric tracking, the model center and other functions.

FATE-Board

The visualization tool for federated learning modeling. It visualizes and measures the whole model training process for end users, supporting tracking, statistics and monitoring throughout training, and provides rich visual presentations of model operation status, model output and logs, helping users explore and understand models simply and efficiently.

FATE-Serving

A high-performance, scalable online model-serving service for federated learning.

Roles

Guest

Guest is the data application party. In vertical algorithms, guest is usually the party holding the label y. The modeling process is generally initiated by guest.

Host

Host is the data provider.

Arbiter

Arbiter assists multiple parties in joint modeling. Its main function is to aggregate gradients or models. For example, in hetero LR, each party sends half of its gradient to arbiter, which then performs joint optimization. Arbiter also distributes the public and private keys used for encryption and decryption services.

FATE environment deployment guide

Standalone deployment

Reference: https://fate.readthedocs.io/en/latest/_build_temp/standalone-deploy/README.html#install-fate-using-docker-recommended, covering Using Docker to install FATE (recommended) and Installing FATE on the host

Cluster deployment

Reference: https://fate.readthedocs.io/en/latest/_build_temp/cluster-deploy/README.html

Entering the FATE docker container on the host

Host IP: 192.168.1.75

The FATE version is 1.6.0; execute the following commands on the host:

CONTAINER_ID=`docker ps -aqf "name=fate"`
docker exec -t -i ${CONTAINER_ID} bash

General guidelines for quick-starting FATE

Reference: https://fate.readthedocs.io/en/latest/_build_temp/examples/pipeline/README.html

Since FATE has already been started, the following commands do not need to be executed:

  1. (optional) create a virtual environment

    python -m venv venv
    source venv/bin/activate
    pip install -U pip
    
  2. Install the FATE client

    pip install fate_client
    pipeline init --help
    
  3. Provide the IP/port information of the deployed FATE-Flow server

    # The default ip: port is 127.0.0.1:8080
    pipeline init --ip 127.0.0.1 --port 9380
    # Optional, set the Pipeline directory
    pipeline init --ip 127.0.0.1 --port 9380 --log-directory {desired log path}
    
  4. Upload data with FATE Pipeline

    Before starting a modeling task, the data to be used should be uploaded. Typically, a party is a cluster containing multiple nodes, so when data is uploaded it is partitioned across those nodes.

    Refer to test case 1 in this document.

Preliminary knowledge for using FATE

Upload data guide

Reference: https://fate.readthedocs.io/en/latest/_build_temp/doc/upload_data_guide_zh.html

Accepted data types

dense, svm-light, tag, tag:value
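
Hypothetical example rows for each format (illustrative values, not taken from the FATE repository):

# dense: comma-separated feature values (an id/label column may be present)
0.254879,-1.046633,0.209656
# svm-light: label followed by sparse index:value pairs
1 3:0.25 7:-1.04 12:0.21
# tag: space-separated tags owned by the sample
tag3 tag8 tag17
# tag:value: tags with attached values
x3:0.25 x7:-1.04 x12:0.21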

Define the upload data config

{
  "file": "examples/data/breast_hetero_guest.csv",
  "table_name": "hetero_breast_guest",
  "namespace": "experiment",
  "head": 1,
  "partition": 8,
  "work_mode": 0,
  "backend": 0
}

Field Description:

  1. file: file path

  2. table_name & namespace: identifiers of the storage data table

  3. head: specifies whether the data file contains a header

  4. partition: specifies the number of partitions used to store the data

  5. work_mode: specifies the working mode; 0 for the standalone version, 1 for the cluster version

  6. backend: specifies the backend; 0 for EGGROLL, 1 for SPARK plus RabbitMQ, and 2 for SPARK plus Pulsar

Upload command

This step is required for each cluster that provides data, i.e. both guest and host.

1. Upload data using FATE-Flow:

flow data upload -c dsl_test/upload_data.json

upload_data.json content:

{
    "file": "examples/data/breast_hetero_guest.csv",
    "head": 1,
    "partition": 16,
    "work_mode": 0,
    "table_name": "breast_hetero_guest",
    "namespace": "experiment"
}

2. Upload data using the legacy Python script:

python /fate/python/fate_flow/fate_flow_client.py -f upload -c /fate/examples/dsl/v1/upload_data.json

3. Upload data using a Python script: refer to test case 1

DSL & task submit runtime conf setting V2

To make task model construction more flexible, FATE uses a self-defined domain-specific language (DSL) to describe tasks. In the DSL, modules (such as data read/write, data_io, feature engineering, regression, classification) are organized into a directed acyclic graph (DAG), so users can flexibly combine algorithm modules according to their own needs.

In addition, each module has parameters to configure, and different parties may use different parameter values for the same module. To simplify this, FATE saves the parameters of all parties for every module into the same Submit Runtime Conf, which all parties share. This guide shows how to create a DSL configuration file.

V2 configuration reference: https://fate.readthedocs.io/en/latest/_build_temp/doc/dsl_conf_v2_setting_guide_zh.html

DSL configuration description

1. General

The DSL configuration file uses JSON format; in fact, the whole configuration file is one JSON object (dict).

2. Components

The first level of the dict is "components", which declares each module that will be used in the task. Each independent module is defined under "components". All data needs to be fetched from the data store through a Reader module; note that the Reader module only has outputs.

3. module

Used to specify which module to use. The module name must match a file name under /fate/python/federatedml/conf/setting_conf (excluding the .json suffix); see the FATE ML algorithm list (https://fate.readthedocs.io/en/latest/_build_temp/python/federatedml/README_zh.html).

4. input

There are two input types: Data and Model.

Data input

There are four input types:

  1. Data: generally used for data_io module, feature_engineering module or evaluation module
  2. train_data: commonly used for homo_lr, hetero_lr and secure_boost modules; if the train_data field appears, the task is recognized as a fit task
  3. validate_data: if the train_data field exists, this field is optional; if kept, the data it points to is used as the validation set
  4. test_data: used as prediction data; if provided, model input must also be provided
Model input

There are two input types:

  1. model: used for model input of components of the same type.

  2. isometric_model: used to specify model input inherited from an upstream component.

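As a hedged sketch of the two model-input styles above (the component names are illustrative, following common FATE examples): a prediction-stage DataIO component can reuse the model fit by its training-stage counterpart via model, while a feature-selection component can take statistics from an upstream binning component via isometric_model.

"dataio_1": {
    "module": "DataIO",
    "input": {
        "data": { "data": ["reader_1.data"] },
        "model": ["dataio_0.model"]
    }
},
"hetero_feature_selection_0": {
    "module": "HeteroFeatureSelection",
    "input": {
        "data": { "data": ["intersection_0.data"] },
        "isometric_model": ["hetero_feature_binning_0.model"]
    }
}
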
5. output
Data output

There are four output types:

  1. Data: general module data output
  2. train_data: Data Split only
  3. validate_data: Data Split only
  4. test_data: Data Split only
Model output

Only the model type is used.

DSL configuration example

In training mode, users can replace HeteroSecureBoost with other algorithm modules; note that the module name hetero_secureboost_0 must then be changed accordingly.

 {
    "components": {
        "reader_0": {
            "module": "Reader",
            "output": {
                "data": [
                    "data"
                ]
            }
        },
        "dataio_0": {
            "module": "DataIO",
            "input": {
                "data": {
                    "data": [
                        "reader_0.data"
                    ]
                }
            },
            "output": {
                "data": [
                    "data"
                ],
                "model": [
                    "model"
                ]
            }
        },
        "intersection_0": {
            "module": "Intersection",
            "input": {
                "data": {
                    "data": [
                        "dataio_0.data"
                    ]
                }
            },
            "output": {
                "data": [
                    "data"
                ]
            }
        },
        "hetero_secureboost_0": {
            "module": "HeteroSecureBoost",
            "input": {
                "data": {
                    "train_data": [
                        "intersection_0.data"
                    ]
                }
            },
            "output": {
                "data": [
                    "data"
                ],
                "model": [
                    "model"
                ]
            }
        },
        "evaluation_0": {
            "module": "Evaluation",
            "input": {
                "data": {
                    "data": [
                        "hetero_secureboost_0.data"
                    ]
                }
            },
            "output": {
                "data": [
                    "data"
                ]
            }
        }
    }
}

Create the Submit Runtime Conf

In the new format of version 1.5.x, the Job Runtime Conf sets the information of each participant, the parameters of the job and the parameters of each component. The contents include the following.

1. DSL version

Configuration version: defaults to 1 if not set; 2 is recommended.

"dsl_version": "2"
2. Job Participants

The user needs to define the initiator.

1. Initiator: contains the role and party_id of the task initiator, for example:

"initiator": {
    "role": "guest",
    "party_id": 9999
}

2. All participants: contains the information of each participant. In the role field, each element represents a role and the parties undertaking it. The party_id of each role is given as a list, since a task may involve multiple parties playing the same role. For example:

"role": {
    "guest": [9999],
    "host": [10000],
    "arbiter": [10000]
}
3. job_parameters (system operating parameters)

Configure the main system parameters for the job run.

Parameter scope policies
  • Apply to all parties: use the common scope identifier

  • Apply to only one party: use the role scope identifier and locate the party with role:party_index; directly specified parameters take precedence over common parameters

    "common": {
    }
    
    "role": {
      "guest": {
        "0": {
        }
      }
    }
    

Parameters under common apply to all participants; parameters under the role-guest-0 configuration apply to the guest participant at index 0. Note that the system operating parameters of the current version have not been strictly tested for per-party configuration, so it is recommended to prefer common.

Supported system parameters

  • job_type (default: train; values: train, predict): the task type
  • work_mode (default: 0; values: 0, 1): 0 is the single-party standalone version, 1 the multi-party distributed version
  • backend (default: 0; values: 0, 1, 2): 0 is EGGROLL, 1 is SPARK plus RabbitMQ, 2 is SPARK plus Pulsar
  • model_id (no default): model id; must be filled in for prediction tasks
  • model_version (no default): model version; must be filled in for prediction tasks
  • task_cores (default: 4; positive integer): total CPU cores requested by the job
  • task_parallelism (default: 1; positive integer): task parallelism
  • computing_partitions (default: number of CPU cores allocated to the task; positive integer): number of partitions of the data table during computation
  • eggroll_run (no default; e.g. processors_per_node): EGGROLL computing-engine parameters; usually not configured, being computed automatically from task_cores; if configured, task_cores does not take effect
  • spark_run (no default; e.g. num-executors, executor-cores): Spark computing-engine parameters; usually not configured, being computed automatically from task_cores; if configured, task_cores does not take effect
  • rabbitmq_run (no default; e.g. queue, exchange): parameters RabbitMQ uses when creating queues and exchanges; usually left at system defaults
  • pulsar_run (no default; e.g. producer, consumer): parameters Pulsar uses when creating producers and consumers; usually not configured
  • federated_status_collect_type (default: PUSH; values: PUSH, PULL): multi-party job status collection mode; PUSH means each participant actively reports to the initiator, PULL means the initiator periodically pulls from each participant
  • timeout (default: 259200, i.e. 3 days; positive integer): task timeout in seconds
  1. The engines have certain support dependencies; for example, the Spark computing engine currently only supports HDFS as the intermediate data storage engine
  2. work_mode + backend automatically generate the corresponding computing, storage and federation engine configurations according to these dependencies
  3. Developers can implement adapted engines themselves and configure them in the runtime conf
Unexposed parameters

Reference: https://fate.readthedocs.io/en/latest/_build_temp/doc/dsl_conf_v2_setting_guide_zh.html#id17

Reference configuration

There are four variants:

  1. Use EGGROLL as the backend with the default CPU allocation strategy (a sketch follows this list)

  2. Use EGGROLL as the backend, directly specifying parameters such as CPU (see the sketch after this list)

  3. Use Spark with RabbitMQ as the backend, directly specifying parameters such as CPU:

    "job_parameters": {
      "common": {
        "job_type": "train",
        "work_mode": 1,
        "backend": 1,
        "spark_run": {
          "num-executors": 1,
          "executor-cores": 2
        },
        "task_parallelism": 2,
        "computing_partitions": 8,
        "timeout": 36000,
        "rabbitmq_run": {
          "queue": {
            "durable": true
          },
          "connection": {
            "heartbeat": 10000
          }
        }
      }
    }
    
  4. Use Spark with Pulsar as the backend

For details, refer to https://fate.readthedocs.io/en/latest/_build_temp/doc/dsl_conf_v2_setting_guide_zh.html
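
For the two EGGROLL variants (items 1 and 2 above), a minimal sketch consistent with the parameter table earlier in this section (the values are illustrative):

"job_parameters": {
  "common": {
    "job_type": "train",
    "work_mode": 1,
    "backend": 0,
    "task_cores": 6,
    "task_parallelism": 2,
    "computing_partitions": 8,
    "timeout": 36000
  }
}

When specifying the CPU parameters directly, replace task_cores with an eggroll_run block (task_cores then no longer takes effect):

"job_parameters": {
  "common": {
    "job_type": "train",
    "work_mode": 1,
    "backend": 0,
    "eggroll_run": {
      "eggroll.session.processors.per.node": 2
    },
    "task_parallelism": 2,
    "computing_partitions": 8,
    "timeout": 36000
  }
}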

Resource management details

Since version 1.5.0, to manage resources more precisely, FATE-Flow uses a finer-grained CPU-core management policy, removing the earlier policy of limiting the number of jobs running at the same time.

It covers total resource allocation, job resource calculation and resource scheduling. For details, refer to: https://fate.readthedocs.io/en/latest/_build_temp/doc/dsl_conf_v2_setting_guide_zh.html#id19

4. Component operating parameters

For details, refer to: https://fate.readthedocs.io/en/latest/_build_temp/doc/dsl_conf_v2_setting_guide_zh.html#id23

Parameter scope policies
  • Apply to all parties: use the common scope identifier
  • Apply to only one party: use the role scope identifier and locate the party with role:party_index; directly specified parameters take precedence over common parameters
"commom": {
}

"role": {
  "guest": {
    "0": {
    }
  }
}

Parameters under the common configuration apply to all participants, and parameters under the role-guest-0 configuration apply to the guest participant at index 0. Note that the component operating parameters of the current version support both scope policies.

Reference configuration
  • The operating parameters of intersection_0 and hetero_lr_0 are placed in the common scope and apply to all participants
  • The operating parameters of reader_0 and dataio_0 are configured per participant, because the input parameters of different participants are usually inconsistent; these two components are therefore generally set per participant

The above component names are defined in the DSL configuration file

"component_parameters": {
  "common": {
    "intersection_0": {
      "intersect_method": "raw",
      "sync_intersect_ids": true,
      "only_output_key": false
    },
    "hetero_lr_0": {
      "penalty": "L2",
      "optimizer": "rmsprop",
      "alpha": 0.01,
      "max_iter": 3,
      "batch_size": 320,
      "learning_rate": 0.15,
      "init_param": {
        "init_method": "random_uniform"
      }
    }
  },
  "role": {
    "guest": {
      "0": {
        "reader_0": {
          "table": {"name": "breast_hetero_guest", "namespace": "experiment"}
        },
        "dataio_0":{
          "with_label": true,
          "label_name": "y",
          "label_type": "int",
          "output_format": "dense"
        }
      }
    },
    "host": {
      "0": {
        "reader_0": {
          "table": {"name": "breast_hetero_host", "namespace": "experiment"}
        },
        "dataio_0":{
          "with_label": false,
          "output_format": "dense"
        }
      }
    }
  }
}
Multi-host configuration

For multi-host tasks, list all host information under role, and list each host's specific configuration under the corresponding module. For details, refer to: https://fate.readthedocs.io/en/latest/_build_temp/doc/dsl_conf_v2_setting_guide_zh.html#host

Prediction task configuration

DSL V2 does not automatically generate a prediction DSL for training tasks. Users need to first deploy the modules of the required model using Flow Client. For a detailed command description, refer to the deploy section of the FATE-Flow client document (/fate/python/fate_client/flow_client/README_zh.rst).

The command is as follows:

# --model-id and --model-version are required; --cpn-list (component list) is optional
flow model deploy --model-id $model_id --model-version $model_version
5. Basic principles of FATE-Flow job operation
  1. After a job is submitted, FATE-Flow obtains the job DSL and job config and stores them in the corresponding fields of the t_job table in the database and under the /fate/jobs/$jobid/ directory
  2. It parses the job DSL and job config and generates fine-grained parameters based on the merged parameters (generating the three engine configurations corresponding to backend & work_mode mentioned above), filling in default parameter values
  3. It distributes and stores the common configuration to all participants, generating a job_runtime_on_party_conf according to each participant's actual information
  4. When each participant receives a task, it executes it based on the job_runtime_on_party_conf

The $jobid directory includes the following files:

job_dsl.json  job_runtime_conf.json  local  pipeline_dsl.json  train_runtime_conf.json
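
To observe this lifecycle, the status of a submitted job can be queried with the flow CLI (the job id is returned at submission time):

flow job query -j $job_id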

Submit Runtime Conf configuration example

Example conf for training and validating a model

{
    "dsl_version": 2,
    "initiator": {
        "role": "guest",
        "party_id": 9999
    },
    "role": {
        "host": [
            10000
        ],
        "guest": [
            9999
        ]
    },
    "job_parameters": {
        "job_type": "train",
        "work_mode": 0,
        "backend": 0,
        "computing_engine": "STANDALONE",
        "federation_engine": "STANDALONE",
        "storage_engine": "STANDALONE",
        "engines_address": {
            "computing": {
                "nodes": 1,
                "cores_per_node": 20
            },
            "federation": {
                "nodes": 1,
                "cores_per_node": 20
            },
            "storage": {
                "nodes": 1,
                "cores_per_node": 20
            }
        },
        "federated_mode": "SINGLE",
        "task_parallelism": 1,
        "computing_partitions": 4,
        "federated_status_collect_type": "PULL",
        "model_id": "guest-9999#host-10000#model",
        "model_version": "202108310831349550536",
        "eggroll_run": {
            "eggroll.session.processors.per.node": 4
        },
        "spark_run": {},
        "rabbitmq_run": {},
        "pulsar_run": {},
        "adaptation_parameters": {
            "task_nodes": 1,
            "task_cores_per_node": 4,
            "task_memory_per_node": 0,
            "request_task_cores": 4,
            "if_initiator_baseline": false
        }
    },
    "component_parameters": {
        "role": {
            "guest": {
                "0": {
                    "reader_0": {
                        "table": {
                            "name": "breast_hetero_guest",
                            "namespace": "experiment"
                        }
                    }
                }
            },
            "host": {
                "0": {
                    "reader_0": {
                        "table": {
                            "name": "breast_hetero_host",
                            "namespace": "experiment"
                        }
                    },
                    "dataio_0": {
                        "with_label": false
                    }
                }
            }
        },
        "common": {
            "dataio_0": {
                "with_label": true
            },
            "hetero_secureboost_0": {
                "task_type": "classification",
                "objective_param": {
                    "objective": "cross_entropy"
                },
                "num_trees": 5,
                "bin_num": 16,
                "encrypt_param": {
                    "method": "iterativeAffine"
                },
                "tree_param": {
                    "max_depth": 3
                }
            },
            "evaluation_0": {
                "eval_type": "binary"
            }
        }
    }
}

Algorithms

FATE ML

FATE currently supports three types of federated learning algorithms: horizontal federated learning, vertical federated learning and federated transfer learning.

  • Horizontal federated learning: when the features of the two datasets overlap heavily but the users overlap little

  • Vertical federated learning: when the users of the two datasets overlap heavily but the features overlap little

  • Federated transfer learning: when both the users and the features of the two datasets overlap little; the data is not segmented, and transfer learning is used to overcome the lack of data or labels

Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/federatedml/README_zh.html

FederatedML implements many common machine learning algorithms in a federated setting. All modules are developed in a decoupled, modular way to enhance scalability. FederatedML provides:

  1. Federated statistics: PSI, Union, Pearson correlation, etc.
  2. Federated feature engineering: including federated sampling, federated feature binning, federated feature selection, etc.
  3. Federated machine learning algorithms: including horizontal and vertical federated LR, GBDT, DNN, transfer learning, etc.
  4. Model evaluation: binary classification, multi-class classification, regression and clustering evaluation, plus federated vs. single-party comparative evaluation.
  5. Security protocols: a variety of security protocols for more secure multi-party interactive computing.

Federated machine learning framework reference: https://fate.readthedocs.io/en/latest/_build_temp/python/federatedml/README_zh.html#id4

Algorithm list

List of algorithms used in the test example:

  • Reader (module: Reader). When the storage engine of the input data is not supported by the current computing engine, the data is automatically transferred to a FATE-cluster storage engine suited to the computing engine; when the storage format is not a FATE-supported format, it is automatically converted. Data input: the user's original stored data. Data output: the converted raw data.
  • DataIO (module: DataIO). Converts raw data into Instance objects (gradually deprecated from FATE v1.7 in favor of DataTransform). Data input: a Table whose values are raw data. Data output: the converted Table, whose values are instances of Data Instance as defined in federatedml/feature/instance.py. Model output: DataIO model.
  • DataTransform (module: DataTransform). Converts raw data into Instance objects. Data input: a Table whose values are raw data. Data output: the converted Table of Data Instance values. Model output: DataTransform model.
  • Intersect (module: Intersection). Computes the intersection of the two parties' datasets without disclosing any information about the non-intersecting data; mainly used in vertical tasks. Data input: Table. Data output: the intersection of the two Tables. Model output: Intersect model.
  • Hetero Secure Boosting (module: HeteroSecureBoost). Builds a vertical SecureBoost model across multiple parties. Data input: Table with Instance values. Model output: SecureBoost model, consisting of model meta and model parameters.

The detailed algorithm list is shown in: https://fate.readthedocs.io/en/latest/_build_temp/python/federatedml/README_zh.html#id2

Security protocols

Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/federatedml/README_zh.html#id3

Parameters

Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/federatedml/README_zh.html#module-federatedml.param

Examples

For details, see: https://fate.readthedocs.io/en/latest/_build_temp/examples/README.html

Example usage guide

See test cases 2 and 3 in this document.

FATE-Flow

Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_flow/README_zh.html

FATE-Flow is the job scheduling system of the federated learning framework FATE, managing the complete lifecycle of federated learning jobs, including data input, training job scheduling, metric tracking, the model center and so on.

FATE-Flow key features

  • Uses a DAG to define a Pipeline;
  • Describes the DAG with FATE-DSL in JSON format, supporting automatic system interconnection;
  • Advanced scheduling framework, based on global state and optimistic-lock scheduling, single-DAG scheduling and multi-party coordinated scheduling, with support for multiple schedulers;
  • Flexible scheduling strategies, supporting start / stop / rerun, etc.;
  • Fine-grained resource scheduling, supporting core-count, memory and worker-node strategies for different computing engines;
  • Real-time tracker, tracking data, parameters, models and metrics during the run;
  • Federated model center: model management, federated consistency, import/export, and inter-cluster migration;
  • Provides CLI, HTTP API and Python SDK.

Framework

Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_flow/README_zh.html#id3

Deployment

Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_flow/README_zh.html#id4

Usage

FATE-Flow client command line

Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_client/flow_client/README_zh.html

In the new FATE-Flow command-line console, commands are divided into multiple classes, including job, data, model, component and so on. All commands share a common entry point:

[IN]
flow

[OUT]
Usage: flow [OPTIONS] COMMAND [ARGS]...

  Fate Flow Client

Options:
  -h, --help  Show this message and exit.

Commands:
  component   Component Operations
  data        Data Operations
  job         Job Operations
  model       Model Operations
  queue       Queue Operations
  table       Table Operations
  task        Task Operations

FATE-Flow command line interface reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_flow/doc/fate_flow_cli.html

python fate_flow_client.py -f $command
# for example:
# cd /fate/python/fate_flow/
# python fate_flow_client.py -f submit_job -c /fate/python/fate_flow/examples/test_hetero_lr_job_conf.json -d /fate/python/fate_flow/examples/test_hetero_lr_job_dsl.json

FATE-Flow client SDK guide

For details, refer to: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_client/flow_sdk/README_zh.html

Includes:

  • Job operation

  • Component operation

  • Data operation

  • Task operation

  • Model operation

  • Tag operation

  • Table operation

  • Queue operation

The usage is similar:

from flow_sdk.client import FlowClient
# use real ip address to initialize SDK
client = FlowClient('127.0.0.1', 9000, 'v1')
 
#client.job.submit(conf_path, dsl_path)

FATE-Flow REST API

reference resources: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_flow/doc/fate_flow_http_api.html

FATE model publishing and online federated inference guide

Reference: https://fate.readthedocs.io/en/latest/_build_temp/doc/model_publish_with_serving_guide_zh.html
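
As a hedged sketch of the flow described in that guide: the trained model is first loaded into FATE-Serving and then bound to a service id via the flow CLI. The JSON files carry the model id/version and serving addresses; the paths below are illustrative:

# load the trained model into FATE-Serving
flow model load -c examples/publish_load_model.json
# bind the loaded model to a service id for online inference
flow model bind -c examples/publish_bind_model.json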

FATE Pipeline usage

Pipeline introduction reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_client/pipeline/README.html

Pipeline is a high-level Python API that allows users to design, start and query FATE jobs sequentially. FATE Pipeline is designed to be user-friendly and consistent with the behavior of the FATE command-line tools. Users can customize a job workflow by adding components to the pipeline and then start the job with a single call. In addition, Pipeline can run prediction and query information after a pipeline has been fit.

A FATE job is a directed acyclic graph

A FATE job is a DAG composed of algorithm component nodes. FATE Pipeline provides easy-to-use tools to configure the order and settings of tasks.

FATE is written in a modular style. Modules are designed to have input and output data and models; two modules are connected when the output of one is set as the input of another. By tracking how a dataset is processed through FATE modules, we can see that a FATE job is in fact a sequence of subtasks.

Pipeline interface

Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_client/pipeline/README.html#interface-of-pipeline

Component

FATE modules are wrapped as components in the Pipeline API. Each component can take in and output Data and Model. Component parameters can be set conveniently at initialization; unspecified parameters take default values. All components have a name, which can be set arbitrarily. A component's name is its identifier, so it must be unique within a pipeline. We recommend including a numeric suffix in each component name for easy tracking.

Each component may have input and/or output Data and/or Model. For details on how to use components, see this guide.

Example of initializing a component with a specified parameter value:

hetero_lr_0 = HeteroLR(name="hetero_lr_0", early_stop="weight_diff", max_iter=10,
                       early_stopping_rounds=2, validation_freqs=2)
Input

Input encapsulates all inputs of a component, including Data and Model input. To access the input of a component, reference its input attribute:

input_all = dataio_0.input
Output

Output encapsulates all output results of a component, including Data and Model output. To access the output of a component, reference its output attribute:

output_all = dataio_0.output
Data

Data wraps all data-type inputs and outputs of components. FATE Pipeline includes five types of data, each used for a different scenario. For more information, see the Pipeline documentation.

Model

Model defines the model inputs and outputs of components. Like Data, these two types of models are used for different purposes. For more information, see here.

Build a pipeline

After initializing a pipeline, specify the job participants and the initiator. Below is an example of the initial setup of a pipeline:

pipeline = PipeLine()
pipeline.set_initiator(role='guest', party_id=9999)
pipeline.set_roles(guest=9999, host=10000, arbiter=10000)

A Reader is needed to read in the data source so that other components can process the data. Define a Reader component:

reader_0 = Reader(name="reader_0")

In most cases, DataIO follows Reader and converts data into DataInstance format, which can then be used for data engineering and model training. Some components (such as Union and Intersection) can run directly on non-DataInstance tables.

All pipeline components can be configured individually per party by obtaining role-specific party instances. For example, the DataIO component can be configured specifically for guest like this:

dataio_0 = DataIO(name="dataio_0")
guest_component_instance = dataio_0.get_party_instance(role='guest', party_id=9999)
guest_component_instance.component_param(with_label=True, output_format="dense")

To include a component in a pipeline, use add_component. To add the DataIO component to the previously created pipeline, try this:

pipeline.add_component(dataio_0, data=Data(data=reader_0.output.data))
Build a FATE NN model in Keras style

Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_client/pipeline/README.html#build-fate-nn-model-in-keras-style

Initialize runtime job parameters

In order to fit or predict, the user needs to initialize the runtime environment, such as backend and work_mode:

from pipeline.runtime.entity import JobParameters
job_parameters = JobParameters(backend=Backend.EGGROLL, work_mode=WorkMode.STANDALONE)

Run pipeline

After adding all components, the user needs to compile the pipeline before running the designed job. After compilation, the pipeline can be fit (i.e. run a training job) with the appropriate Backend and WorkMode:

pipeline.compile()
pipeline.fit(job_parameters)

Task query

FATE Pipeline provides APIs to query component information, including data, model and summary. All query APIs keep a data format consistent with FlowPy; Pipeline retrieves the query result and returns it directly to the user.

summary = pipeline.get_component("hetero_lr_0").get_summary()

Deploy components

Once pipeline fitting completes, prediction can be run on new datasets. Before predicting, the necessary components need to be deployed first; this step marks the selected components to be used by the prediction pipeline.

# deploy select components
pipeline.deploy_component([dataio_0, hetero_lr_0])
# deploy all components
# note that Reader component cannot be deployed. Always deploy pipeline with Reader by specified component list.
pipeline.deploy_component()

Prediction using pipeline

First, initiate a new pipeline, then specify the data source used for prediction:

predict_pipeline = PipeLine()
predict_pipeline.add_component(reader_0)
predict_pipeline.add_component(pipeline,
                               data=Data(predict_input={pipeline.dataio_0.input.data: reader_0.output.data}))

You can then start the prediction on the new pipeline.

predict_pipeline.predict(job_parameters)

In addition, because the pipeline is modular, users can add new components to the original pipeline before running the prediction.

predict_pipeline.add_component(evaluation_0, data=Data(data=pipeline.hetero_lr_0.output.data))
predict_pipeline.predict(job_parameters)

Save and restore pipeline

To save the pipeline, simply use the dump interface.

pipeline.dump("pipeline_saved.pkl")

To restore a pipeline, use the load_model_from_file interface:

from pipeline.backend.pipeline import PipeLine
PipeLine.load_model_from_file("pipeline_saved.pkl")

Pipeline summary information

To get the details of a pipeline, use the describe interface, which prints the creation time, the fit or prediction status, and the built DSL, if any.

pipeline.describe()

Pipeline vs. CLI

In previous versions, users interacted with FATE through the command-line interface, usually with manually configured conf and DSL JSON files. Manual configuration is tedious and error-prone. FATE Pipeline automatically forms the task configuration files at compile time, allowing quick experimentation with task designs.

Logs

FATE-Flow service logs: /fate/logs/fate_flow/

Task logs: /fate/logs/$job_id/
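
For example, to follow scheduler activity while a job runs (the exact log file names may vary by version; treat this as an assumption):

tail -f /fate/logs/fate_flow/fate_flow_schedule.log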

Common problems

Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_flow/README_zh.html#id10

Test examples

FATE-Board task viewing interface: http://192.168.1.75:8080/#/history

Reference: https://fate.readthedocs.io/en/latest/_build_temp/examples/experiment_template/user_usage/dsl_v2_predict_tutorial.html

Test case 1 - upload files

python /fate/examples/pipeline/demo/pipeline-upload.py --base  /fate

The content of pipeline-upload.py is as follows:

Change examples/data/breast_hetero_guest.csv to the file you want to upload.

import os
import argparse

from pipeline.backend.config import Backend, WorkMode
from pipeline.backend.pipeline import PipeLine

# path to data
# default fate installation path
DATA_BASE = "/data/projects/fate"

# site-package ver
# import site
# DATA_BASE = site.getsitepackages()[0]


def main(data_base=DATA_BASE):
    # parties config
    guest = 9999
    # 0 for eggroll, 1 for spark
    backend = Backend.EGGROLL
    # 0 for standalone, 1 for cluster
    work_mode = WorkMode.STANDALONE
    # use the work mode below for cluster deployment
    # work_mode = WorkMode.CLUSTER

    # partition for data storage
    partition = 4

    # table name and namespace, used in FATE job configuration
    dense_data = {"name": "breast_hetero_guest", "namespace": f"experiment"}
    tag_data = {"name": "breast_hetero_host", "namespace": f"experiment"}

    pipeline_upload = PipeLine().set_initiator(role="guest", party_id=guest).set_roles(guest=guest)
    # add upload data info
    # path to csv file(s) to be uploaded, modify to upload designated data
    pipeline_upload.add_upload_data(file=os.path.join(data_base, "examples/data/breast_hetero_guest.csv"),
                                    table_name=dense_data["name"],             # table name
                                    namespace=dense_data["namespace"],         # namespace
                                    head=1, partition=partition)               # data info

    pipeline_upload.add_upload_data(file=os.path.join(data_base, "examples/data/breast_hetero_host.csv"),
                                    table_name=tag_data["name"],
                                    namespace=tag_data["namespace"],
                                    head=1, partition=partition)

    # upload data
    pipeline_upload.upload(work_mode=work_mode, backend=backend, drop=1)


if __name__ == "__main__":
    parser = argparse.ArgumentParser("PIPELINE DEMO")
    parser.add_argument("--base", "-b", type=str,
                        help="data base, path to directory that contains examples/data")

    args = parser.parse_args()
    if args.base is not None:
        main(args.base)
    else:
        main()

Test case 2 - train and evaluate a model

Pipeline construction:

python /fate/examples/pipeline/demo/pipeline-quick-demo.py

pipeline-quick-demo.py is as follows:

import json
from pipeline.backend.config import Backend, WorkMode
from pipeline.backend.pipeline import PipeLine
from pipeline.component import Reader, DataIO, Intersection, HeteroSecureBoost, Evaluation
from pipeline.interface import Data
from pipeline.runtime.entity import JobParameters

# table name & namespace in data storage
# data should be uploaded before running modeling task
guest_train_data = {"name": "breast_hetero_guest", "namespace": "experiment"}
host_train_data = {"name": "breast_hetero_host", "namespace": "experiment"}

# initialize pipeline
pipeline = PipeLine().set_initiator(role="guest", party_id=9999).set_roles(guest=9999, host=10000)

# define components
reader_0 = Reader(name="reader_0")
reader_0.get_party_instance(role="guest", party_id=9999).component_param(table=guest_train_data)
reader_0.get_party_instance(role="host", party_id=10000).component_param(table=host_train_data)
dataio_0 = DataIO(name="dataio_0", with_label=True)
dataio_0.get_party_instance(role="host", party_id=10000).component_param(with_label=False)
intersect_0 = Intersection(name="intersection_0")
hetero_secureboost_0 = HeteroSecureBoost(name="hetero_secureboost_0",
                                         num_trees=5,
                                         bin_num=16,
                                         task_type="classification",
                                         objective_param={"objective": "cross_entropy"},
                                         encrypt_param={"method": "iterativeAffine"},
                                         tree_param={"max_depth": 3})
evaluation_0 = Evaluation(name="evaluation_0", eval_type="binary")

# add components to pipeline, in order of task execution
pipeline.add_component(reader_0)\
    .add_component(dataio_0, data=Data(data=reader_0.output.data))\
    .add_component(intersect_0, data=Data(data=dataio_0.output.data))\
    .add_component(hetero_secureboost_0, data=Data(train_data=intersect_0.output.data))\
    .add_component(evaluation_0, data=Data(data=hetero_secureboost_0.output.data))

# compile & fit pipeline
pipeline.compile().fit(JobParameters(backend=Backend.EGGROLL, work_mode=WorkMode.STANDALONE))
# to run this task with cluster deployment, use the following setting instead
# may change data engine backend according to actual environment
# pipeline.compile().fit(JobParameters(backend=Backend.EGGROLL, work_mode=WorkMode.CLUSTER))


# query component summary
print(f"Evaluation summary:\n{json.dumps(pipeline.get_component('evaluation_0').get_summary(), indent=4)}")

FATE-Flow submission:

(Users can go to /fate/examples/dsl/v2/ to find an appropriate algorithm and replace the configuration files accordingly.)

flow job submit -c /fate/examples/dsl/v2/hetero_secureboost/test_secureboost_train_complete_secure_conf.json -d /fate/examples/dsl/v2/hetero_secureboost/test_secureboost_train_dsl.json

Test case 3 - model training, deployment and prediction

Build a HeteroSecureBoost model and use it for prediction.

Pipeline construction

Reference: https://fate.readthedocs.io/en/latest/_build_temp/examples/experiment_template/user_usage/pipeline_predict_tutorial.html

1. Upload files (already uploaded in test case 1)

2. Create the file in the host environment:

cd /home/docker_standalone_fate_1.6.0/fate_job/
vim fit_Hetero_SecureBoost_model.py

The contents of the file are as follows (you can refer to the corresponding algorithm code under /fate/examples/pipeline/ together with the algorithm introductions at https://fate.readthedocs.io/en/latest/_build_temp/python/federatedml/README.html#algorithm-list; combining the two helps understanding):

from pipeline.backend.config import Backend, WorkMode # configs
from pipeline.backend.pipeline import PipeLine # Pipeline 
from pipeline.component import Reader, DataIO, Intersection, HeteroSecureBoost # fate components
from pipeline.interface import Data  # data flow
from pipeline.runtime.entity import JobParameters # parameter class

# define dataset name and namespace
guest_train_data = {"name": "breast_hetero_guest", "namespace": "experiment"}
host_train_data = {"name": "breast_hetero_host", "namespace": "experiment"}

# initialize pipeline, set guest as initiator and set guest/host party id
pipeline = PipeLine().set_initiator(role="guest", party_id=9999).set_roles(guest=9999, host=10000)

# define components
# reader read raw data 
reader_0 = Reader(name="reader_0")
reader_0.get_party_instance(role="guest", party_id=9999).component_param(table=guest_train_data)
reader_0.get_party_instance(role="host", party_id=10000).component_param(table=host_train_data)
# data_io transform data
dataio_0 = DataIO(name="dataio_0", with_label=True)
dataio_0.get_party_instance(role="host", party_id=10000).component_param(with_label=False)
# find sample intersection using Intersection components
intersect_0 = Intersection(name="intersection_0")
# hetero secureboost components, setting algorithm parameters
hetero_secureboost_0 = HeteroSecureBoost(name="hetero_secureboost_0",
				     num_trees=5,
				     bin_num=16,
				     task_type="classification",
				     objective_param={"objective": "cross_entropy"},
				     encrypt_param={"method": "iterativeAffine"},
				     tree_param={"max_depth": 3})

# add components to pipeline, in the order of task execution
pipeline.add_component(reader_0)\
.add_component(dataio_0, data=Data(data=reader_0.output.data))\
.add_component(intersect_0, data=Data(data=dataio_0.output.data))\
.add_component(hetero_secureboost_0, data=Data(train_data=intersect_0.output.data))

# compile & fit pipeline
pipeline.compile().fit(JobParameters(backend=Backend.EGGROLL, work_mode=WorkMode.STANDALONE))

# save train pipeline
pipeline.dump("pipeline_saved.pkl")

Execute in the docker environment:

cd  /fate/fate_job
python fit_Hetero_SecureBoost_model.py

3. Create the file in the host environment:

vim predict_instances_by_Hetero_SecureBoost_model.py

The contents of the file are as follows (change breast_hetero_guest to your uploaded table name):

from pipeline.backend.pipeline import PipeLine
from pipeline.component.reader import Reader
from pipeline.interface.data import Data
from pipeline.backend.config import Backend, WorkMode # configs
from pipeline.runtime.entity import JobParameters # parameter class

# load train pipeline
pipeline = PipeLine.load_model_from_file('pipeline_saved.pkl')
# deploy components in training step
pipeline.deploy_component([pipeline.dataio_0, pipeline.intersection_0, pipeline.
                           hetero_secureboost_0])
# set new instances to predict
# new dataset
guest_train_data = {"name": "breast_hetero_guest", "namespace": "experiment"}
host_train_data = {"name": "breast_hetero_host", "namespace": "experiment"}

# set new reader
reader_0 = Reader(name="reader_0")
reader_0.get_party_instance(role="guest", party_id=9999).component_param(table=guest_train_data)
reader_0.get_party_instance(role="host", party_id=10000).component_param(table=host_train_data)

# new predict pipeline
predict_pipeline = PipeLine()
# update reader
predict_pipeline.add_component(reader_0)
# add selected components from train pipeline onto predict pipeline
predict_pipeline.add_component(pipeline,data=Data(predict_input={pipeline.dataio_0.input.data: reader_0.output.data}))
# run predict model
predict_pipeline.predict(JobParameters(backend=Backend.EGGROLL, work_mode=WorkMode.STANDALONE))

Execute in the docker environment, and view the results in FATE-Board:

python predict_instances_by_Hetero_SecureBoost_model.py

Build with the flow command

Reference: https://fate.readthedocs.io/en/latest/_build_temp/examples/experiment_template/user_usage/dsl_v2_predict_tutorial.html

(1) Train the model

cd /fate/examples/
flow job submit -c dsl/v2/hetero_secureboost/test_secureboost_train_binary_conf.json -d dsl/v2/hetero_secureboost/test_secureboost_train_dsl.json

(2) Use flow_client to deploy the components required by the prediction task

Remember to modify the model id and model version.

flow model deploy --model-id guest-9999#host-9998#model --model-version 2021090109322084026031 --cpn-list "reader_0, dataio_0, intersection_0, hetero_secure_boost_0"

If execution returns:

{
    "retcode": 100,
    "retmsg": "'Pipeline'"
}

This occurs because the model has not finished building yet; check its progress in FATE-Board and deploy after it completes.

(3) Prediction configuration

Replace the model_id, model_version and dataset names in /fate/examples/dsl/v2/hetero_secureboost/test_predict_conf.json with the model_id and model_version returned by the deploy command, and save the modified content to /fate/fate_job/new_test_predict_conf.json (a new file).
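
A hedged sketch of what the modified prediction conf might look like (party ids, table names, model_id and model_version must match your own deployment output):

{
    "dsl_version": 2,
    "initiator": {
        "role": "guest",
        "party_id": 9999
    },
    "role": {
        "guest": [9999],
        "host": [10000]
    },
    "job_parameters": {
        "common": {
            "job_type": "predict",
            "work_mode": 0,
            "backend": 0,
            "model_id": "guest-9999#host-10000#model",
            "model_version": "2021090109322084026031"
        }
    },
    "component_parameters": {
        "role": {
            "guest": {
                "0": {
                    "reader_0": {
                        "table": {"name": "breast_hetero_guest", "namespace": "experiment"}
                    }
                }
            },
            "host": {
                "0": {
                    "reader_0": {
                        "table": {"name": "breast_hetero_host", "namespace": "experiment"}
                    }
                }
            }
        }
    }
}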

Execute:

flow job submit -c /fate/fate_job/lxy/new_test_predict_conf.json

If the following is returned, it is because the host/guest party_id is inconsistent with the party_id of the uploaded files:

{
	'data':'No such file or directory',
	'retcode':100,
	'retmsg':"2"
}

If the submitted task returns the following content, it is because the deployment operation has not been performed:

{
    "retcode": 100,
    "retmsg": "Model arbiter-10000#guest-9999#host-10000#model 20210908033432743389158 has not been deployed yet."
}

Test case 4 - build a model independently using Pipeline

1. Select an appropriate algorithm and dataset

Download the FATE source code:

git clone https://github.com/FederatedAI/FATE.git

Open the examples/data/README.md file to view the test dataset information. We use the linear regression model HeteroLinR and the student_hetero dataset, whose label is a continuous value.

2. Upload the dataset

Refer to test case 1

3. Write the training and validation pipeline code

Based on the pipeline code of test case 3:

The details of the HeteroLinR algorithm can be understood by combining https://fate.readthedocs.io/en/latest/_build_temp/python/federatedml/linear_model/linear_regression/README.html with the example source file FATE/examples/pipeline/hetero_linear_regression/pipeline-hetero-linr.py.

Compared with building the HeteroSecureBoost model in test case 3, building the HeteroLinR pipeline additionally requires an arbiter to be configured. Other required changes include modifying the dataset names. The parameters of the HeteroLinR model can follow the model parameters in pipeline-hetero-linr.py; here only the name is set.

Create the linr_model_train_and_evaluation.py file.

The code is as follows:

from pipeline.backend.config import Backend, WorkMode # configs
from pipeline.backend.pipeline import PipeLine # Pipeline
from pipeline.component import Reader, DataIO, Intersection, HeteroSecureBoost, HeteroLinR ,Evaluation# fate components
from pipeline.interface import Data  # data flow
from pipeline.runtime.entity import JobParameters # parameter class

# define dataset name and namespace
guest_train_data = {"name": "student_hetero_guest", "namespace": "experiment"}
host_train_data = {"name": "student_hetero_host", "namespace": "experiment"}

# initialize pipeline, set guest as initiator and set guest/host party id
pipeline = PipeLine().set_initiator(role="guest", party_id=9999).set_roles(guest=9999, host=10000, arbiter=10000)
reader_0 = Reader(name="reader_0")
reader_0.get_party_instance(role='guest', party_id=9999).component_param(table=guest_train_data)
reader_0.get_party_instance(role='host', party_id=10000).component_param(table=host_train_data)

dataio_0 = DataIO(name="dataio_0")
dataio_0.get_party_instance(role='guest', party_id=9999).component_param(with_label=True, label_name="y",
																		 label_type="int", output_format="dense")
dataio_0.get_party_instance(role='host', party_id=10000).component_param(with_label=False)

intersection_0 = Intersection(name="intersection_0")
hetero_linr_0 = HeteroLinR(name="hetero_linr_0")

evaluation_0 = Evaluation(name="evaluation_0", eval_type="regression", pos_label=1)
 
pipeline.add_component(reader_0)
pipeline.add_component(dataio_0, data=Data(data=reader_0.output.data))
pipeline.add_component(intersection_0, data=Data(data=dataio_0.output.data))
pipeline.add_component(hetero_linr_0, data=Data(train_data=intersection_0.output.data))
pipeline.add_component(evaluation_0, data=Data(data=hetero_linr_0.output.data))

pipeline.compile()

job_parameters = JobParameters(backend=Backend.EGGROLL, work_mode=WorkMode.STANDALONE)
pipeline.fit(job_parameters)

pipeline.dump("/fate/fate_job/lxy/hetero_linr_pipeline_saved.pkl")#Save model

Execute the file in docker:

cd /fate/fate_job/lxy
python linr_model_train_and_evaluation.py

4. Write the prediction pipeline code

Create the linr_model_predict.py file, modified from the prediction pipeline code of test case 3; only the table names and the model file name need to be changed.

from pipeline.backend.pipeline import PipeLine
from pipeline.component.reader import Reader
from pipeline.interface.data import Data
from pipeline.backend.config import Backend, WorkMode # configs
from pipeline.runtime.entity import JobParameters # parameter class

# load train pipeline
pipeline = PipeLine.load_model_from_file('/fate/fate_job/lxy/hetero_linr_pipeline_saved.pkl')
# deploy components in training step
pipeline.deploy_component([pipeline.dataio_0, pipeline.intersection_0, pipeline.
                           hetero_linr_0])
# set new instances to predict
# new dataset
guest_train_data = {"name": "student_hetero_guest", "namespace": "experiment"}
host_train_data = {"name": "student_hetero_host", "namespace": "experiment"}

# set new reader
reader_0 = Reader(name="reader_0")
reader_0.get_party_instance(role="guest", party_id=9999).component_param(table=guest_train_data)
reader_0.get_party_instance(role="host", party_id=10000).component_param(table=host_train_data)

# new predict pipeline
predict_pipeline = PipeLine()
# update reader
predict_pipeline.add_component(reader_0)
# add selected components from train pipeline onto predict pipeline
predict_pipeline.add_component(pipeline,data=Data(predict_input={pipeline.dataio_0.input.data: reader_0.output.data}))
# run predict model
predict_pipeline.predict(JobParameters(backend=Backend.EGGROLL, work_mode=WorkMode.STANDALONE))

FATE TEST

A collection of useful tools for running FATE tests.

Quick start

Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_test/README.html#quick-start

Edit default fat_test_config.yaml

Edit the /usr/local/lib/python3.6/site-packages/fate_test/fate_test_config.yaml file, changing

data_base_dir: path(FATE)

to

data_base_dir: /fate/

Run a fate_test suite

Suite: used to run test suites and collect FATE jobs. A testsuite runs a group of jobs in sequence. The data used by the jobs can be uploaded before the jobs are submitted and cleaned up after they complete. This tool is very useful for FATE release testing.

#fate_test suite -i <path contains *testsuite.json>
fate_test suite -i /fate/examples/dsl/v1/homo_nn/testsuite.json
#fate_test suite -i /fate/examples/dsl/v1/hetero_pearson/testsuite.json

Run a fate_test benchmark

benchmark-quality: used to compare modeling quality between FATE and other machine learning systems.

#fate_test benchmark-quality -i <path contains *benchmark.json>
fate_test benchmark-quality -i /fate/examples/benchmark_quality/hetero_linear_regression/hetero_linr_benchmark.json

Developer's guide

Developing an algorithm module; for details, refer to: https://fate.readthedocs.io/en/latest/_build_temp/doc/develop_guide_zh.html

To develop a module, perform the following steps:

  1. Define the Python parameter object that will be used in this module.

  2. Define the Setting conf JSON configuration file of the module.

  3. If the module needs federation, define the transfer-variable configuration file.

  4. Your algorithm module needs to inherit the model_base class and implement several specified functions (a sketch follows this list).

  5. Define the protobuf file required to save the model.

  6. If you want to start the component directly through a Python script, define a Pipeline component in fate_client.
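
A minimal sketch of step 4, assuming the ModelBase/BaseParam interfaces described in the develop guide (the parameter class and method bodies are illustrative placeholders):

# sketch of an algorithm module inheriting ModelBase
from federatedml.model_base import ModelBase
from federatedml.param.base_param import BaseParam


class MyModuleParam(BaseParam):
    def __init__(self, penalty=1.0):
        super().__init__()
        self.penalty = penalty

    def check(self):
        # validate user-supplied parameters
        if self.penalty <= 0:
            raise ValueError("penalty must be positive")


class MyModule(ModelBase):
    def __init__(self):
        super().__init__()
        self.model_param = MyModuleParam()

    def _init_model(self, param):
        # receive the checked parameters
        self.penalty = param.penalty

    def fit(self, data_instances):
        # training logic over the input Table goes here
        pass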

API

Computing API

Initialize a computing session. Refer to: https://fate.readthedocs.io/en/latest/_build_temp/doc/api/computing.html
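
A short sketch following the pattern in that document (the session id is arbitrary; treat the exact calls as assumptions to verify against your FATE version):

# initialize a computing session and create a Table from local data
from fate_arch.session import computing_session

computing_session.init(session_id="demo_session")
table = computing_session.parallelize(range(10), partition=2, include_key=False)
print(table.count())  # 10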

Federation API

Includes the low-level API FederationABC and user-level transfer variables such as secure_add_example_transfer_variable. Reference: https://fate.readthedocs.io/en/latest/_build_temp/doc/api/federation.html

Parameters

For details of class parameters, refer to: https://fate.readthedocs.io/en/latest/_build_temp/doc/api/params.html

Errors encountered and solutions

Federated schedule error, please check rollSite and fateflow

Error message: Federated schedule error, please check rollSite and fateflow network connectivity. RPC request error: <_InactiveRpcError of RPC that terminated...>

Cause: cluster communication problem.

Solution: check whether the work_mode setting in the configuration file corresponds to the FATE deployment mode (standalone or cluster).
