FATE introduction
FATE (Federated AI Technology Enabler) is an open-source project initiated by the AI department of WeBank, which aims to provide a secure computing framework to support the federated AI ecosystem. It implements secure computation protocols based on homomorphic encryption and multi-party computation (MPC). It supports federated learning architectures and the secure computation of various machine learning algorithms, including logistic regression, tree-based algorithms, deep learning and transfer learning.
FATE technical framework
FederatedML
Algorithm functional components, including federated implementations of common machine learning algorithms. All modules are developed in a modular, decoupled way to enhance scalability.
FATE_Flow
FATE-Flow is the job scheduling system of the federated learning framework FATE. It manages the complete lifecycle of federated learning jobs, including data input, training job scheduling, metric tracking, the model center and other functions.
FATE-Board
The visualization tool for federated learning modeling. It visualizes and measures the whole model training process for end users, supporting tracking, statistics and monitoring throughout training, and provides rich visual presentations of job status, model output and logs, helping users explore and understand models simply and efficiently.
FATE-Serving
A high-performance, scalable online model-serving service for federated learning.
Roles
Guest
Guest is the data application party. In vertical algorithms, Guest is usually the party holding the label y. The modeling process is generally initiated by Guest.
Host
Host is the data provider.
Arbiter
Arbiter assists multiple parties in joint modeling. Its main function is to aggregate gradients or models: for example, in vertical LR, each party sends its partial gradient to the Arbiter, which then performs joint optimization. The Arbiter also takes part in generating and distributing the public and private keys used for encryption and decryption services.
FATE environment deployment guide
Standalone deployment
Reference: https://fate.readthedocs.io/en/latest/_build_temp/standalone-deploy/README.html#install-fate-using-docker-recommended, covering both "Install FATE using Docker (recommended)" and "Install FATE on the host".
Cluster deployment
Reference: https://fate.readthedocs.io/en/latest/_build_temp/cluster-deploy/README.html
Entering the FATE Docker container from the host
Host IP: 192.168.1.75
The FATE version is 1.6.0. Execute the following commands on the host:
```bash
CONTAINER_ID=`docker ps -aqf "name=fate"`
docker exec -t -i ${CONTAINER_ID} bash
```
General guidelines for quick-starting FATE
Reference: https://fate.readthedocs.io/en/latest/_build_temp/examples/pipeline/README.html
Since FATE is already running, the following commands do not need to be executed.
- (optional) Create a virtual environment:
```bash
python -m venv venv
source venv/bin/activate
pip install -U pip
```
- Install the FATE client:
```bash
pip install fate_client
pipeline init --help
```
- Provide the IP/port information of the deployed FATE-Flow server:
```bash
# the default ip:port is 127.0.0.1:8080
pipeline init --ip 127.0.0.1 --port 9380
# optional: set the pipeline log directory
pipeline init --ip 127.0.0.1 --port 9380 --log-directory {desired log path}
```
- Upload data using the FATE pipeline. Before starting a modeling task, the data to be used should be uploaded. Typically, a party is a cluster containing multiple nodes, so uploaded data is distributed across those nodes.
Refer to test case 1 in this document.
Preliminaries for using FATE
Upload data guide
Reference: https://fate.readthedocs.io/en/latest/_build_temp/doc/upload_data_guide_zh.html
Accepted data types
dense, svm-light, tag, tag:value
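Illustrative sample rows for each format (the values and delimiters below are hypothetical assumptions; actual delimiters are configurable, see the upload data guide):
```
# dense: id followed by label and features, comma-separated
1,0,1.88,-1.36,0.54
# svm-light: label followed by index:value pairs
0 1:0.31 4:1.26 7:0.82
# tag: id followed by a list of tags
1 office trade finance
# tag:value: id followed by tag:value pairs
1 age:30 income:7500 city:sz
```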
Define the upload data configuration
```json
{
    "file": "examples/data/breast_hetero_guest.csv",
    "table_name": "hetero_breast_guest",
    "namespace": "experiment",
    "head": 1,
    "partition": 8,
    "work_mode": 0,
    "backend": 0
}
```
Field description:
- file: file path
- table_name & namespace: identifiers of the stored data table
- head: specifies whether the data file contains a header
- partition: specifies the number of partitions used to store the data
- work_mode: specifies the working mode; 0 for the standalone version, 1 for the cluster version
- backend: specifies the backend; 0 for EGGROLL, 1 for Spark with RabbitMQ, 2 for Spark with Pulsar
Upload command
This step is required for each cluster that provides data, i.e., both guest and host.
1. Upload data using FATE-Flow:
flow data upload -c dsl_test/upload_data.json
upload_data.json content:
{ "file": "examples/data/breast_hetero_guest.csv", "head": 1, "partition": 16, "work_mode": 0, "table_name": "breast_hetero_guest", "namespace": "experiment" }
2. Upload data using the legacy Python script:
python /fate/python/fate_flow/fate_flow_client.py -f upload -c /fate/examples/dsl/v1/upload_data.json
3. Upload data using a Python script: refer to test case 1.
DSL & task submit runtime conf setting V2
To make the construction of task models more flexible, FATE uses a self-defined domain-specific language (DSL) to describe tasks. In the DSL, various modules (such as data read/write data_io, feature engineering, regression, classification) can be organized into a directed acyclic graph (DAG), so users can flexibly combine algorithm modules according to their own needs.
In addition, each module has different parameters to configure, and different parties may use different parameter values for the same module. To simplify this, FATE saves the parameters of all parties for each module in the same Submit Runtime Conf, which all parties share. This guide shows how to create a DSL configuration file.
V2 configuration reference: https://fate.readthedocs.io/en/latest/_build_temp/doc/dsl_conf_v2_setting_guide_zh.html
DSL configuration description
1. General
The DSL configuration file uses JSON format; in fact, the whole configuration file is one JSON object (dict).
2.Components
The first level of this dict is "components", which declares every module used in the task. Each independent module is defined under "components". All data needs to be fetched from the data store through a Reader module; note that the Reader module only has output.
3.module
Specifies which module to use. Module names are listed in the FATE ML algorithm list (https://fate.readthedocs.io/en/latest/_build_temp/python/federatedml/README_zh.html) and are consistent with the file names under /fate/python/federatedml/conf/setting_conf (excluding the .json suffix); for example, the module "HeteroSecureBoost" corresponds to the setting file HeteroSecureBoost.json.
4.input
There are two input types: Data and Model.
Data input
There are four types of data input:
- data: generally used for the data_io module, feature-engineering modules or the evaluation module
- train_data: commonly used for the homo_lr, hetero_lr and secure_boost modules. If the train_data field appears, the task is recognized as a fit task
- validate_data: if the train_data field exists, this field is optional. If kept, the data it points to is used as the validation set
- test_data: used as prediction data; if provided, model input must also be provided
Model input
There are two types of model input:
1. model: model input from an upstream component of the same type.
2. isometric_model: specifies model input inherited from an upstream component.
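As an illustrative sketch (the component names here are hypothetical), a component's input section combining data input and model input might look like this:
```json
"hetero_lr_1": {
    "module": "HeteroLR",
    "input": {
        "data": {
            "test_data": ["reader_1.data"]
        },
        "model": ["hetero_lr_0.model"]
    },
    "output": {
        "data": ["data"]
    }
}
```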
5.output
Data output
There are four output types:
- Data: general module data output
- train_data: Data Split only
- validate_data: Data Split only
- test_data: Data Split only
Model output
There is only one type: model.
DSL configuration example
In training mode, users can replace HeteroSecureBoost with other algorithm modules; note that the component key hetero_secureboost_0 must be changed accordingly.
{ "components": { "reader_0": { "module": "Reader", "output": { "data": [ "data" ] } }, "dataio_0": { "module": "DataIO", "input": { "data": { "data": [ "reader_0.data" ] } }, "output": { "data": [ "data" ], "model": [ "model" ] } }, "intersection_0": { "module": "Intersection", "input": { "data": { "data": [ "dataio_0.data" ] } }, "output": { "data": [ "data" ] } }, "hetero_secureboost_0": { "module": "HeteroSecureBoost", "input": { "data": { "train_data": [ "intersection_0.data" ] } }, "output": { "data": [ "data" ], "model": [ "model" ] } }, "evaluation_0": { "module": "Evaluation", "input": { "data": { "data": [ "hetero_secureboost_0.data" ] } }, "output": { "data": [ "data" ] } } } }
Create the Submit Runtime Conf
Starting with version 1.5.x, the Job Runtime Conf (new format) is used to set the information of each participant, the parameters of the job and the parameters of each component. Its contents include the following.
1. DSL version
Configuration version: defaults to 1 if not configured; 2 is recommended.
"dsl_version": "2"
2. Job Participants
The user needs to define the initiator.
1. Initiator: the role and party_id of the task initiator, for example:
```json
"initiator": {
    "role": "guest",
    "party_id": 9999
}
```
2. All participants: contains the information of each participant. In the role field, each key is a role and its value is the party_id(s) undertaking that role. The party_id of each role is given as a list, because a task may involve multiple parties playing the same role. For example:
```json
"role": {
    "guest": [9999],
    "host": [10000],
    "arbiter": [10000]
}
```
3. job_parameters (system operating parameters)
Configures the main system parameters for running the job.
Parameter scope settings
- Parameters that apply to all parties are placed under the common scope identifier.
- Parameters that apply to a single party are placed under the role scope identifier, and role:party_index locates the specified party. Directly specified parameters take precedence over common parameters.
```json
"common": {
}
"role": {
    "guest": {
        "0": {
        }
    }
}
```
Parameters under common apply to all participants; parameters under role → guest → 0 apply only to the participant at index 0 of the guest role. Note that in the current version, system operating parameters set for a single participant have not been strictly tested, so common is recommended.
Supported system parameters
Configuration item | Default value | Supported values | Description |
---|---|---|---|
job_type | train | train, predict | Task type |
work_mode | 0 | 0, 1 | 0 for the single-party standalone version, 1 for the multi-party distributed version |
backend | 0 | 0, 1, 2 | 0 for EGGROLL, 1 for Spark with RabbitMQ, 2 for Spark with Pulsar |
model_id | - | - | Model id; must be filled in for prediction tasks |
model_version | - | - | Model version; must be filled in for prediction tasks |
task_cores | 4 | positive integer | Total CPU cores requested by the job |
task_parallelism | 1 | positive integer | Task parallelism |
computing_partitions | number of CPU cores allocated to the task | positive integer | Number of partitions of the data table during computation |
eggroll_run | none | processors_per_node, etc. | Configuration parameters of the EGGROLL computing engine; usually need not be set, as they are computed automatically from task_cores. If set, task_cores does not take effect |
spark_run | none | num-executors, executor-cores, etc. | Configuration parameters of the Spark computing engine; usually need not be set, as they are computed automatically from task_cores. If set, task_cores does not take effect |
rabbitmq_run | none | queue, exchange, etc. | Configuration parameters used by RabbitMQ to create queues and exchanges; usually not needed, the system defaults apply |
pulsar_run | none | producer, consumer, etc. | Configuration parameters used when creating Pulsar producers and consumers; usually not needed |
federated_status_collect_type | PUSH | PUSH, PULL | How multi-party job status is collected: PUSH means each participant actively reports to the initiator; PULL means the initiator periodically pulls from each participant |
timeout | 259200 (3 days) | positive integer | Task timeout in seconds |
- The three types of engines have certain support dependencies; for example, the Spark computing engine currently only supports HDFS as the intermediate data-storage engine
- work_mode + backend automatically generate the corresponding configurations of the three engines (computing, storage, federation) according to these dependencies
- Developers can implement their own adapted engines and configure them in the runtime config
Parameters not open for configuration
Reference: https://fate.readthedocs.io/en/latest/_build_temp/doc/dsl_conf_v2_setting_guide_zh.html#id17
Reference configuration
There are four kinds:
- Configuration using EGGROLL as the backend with the default CPU allocation strategy (a sketch follows):
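A minimal sketch of such a conf (the values here are illustrative assumptions, not mandated defaults):
```json
"job_parameters": {
    "common": {
        "job_type": "train",
        "work_mode": 1,
        "backend": 0,
        "task_cores": 6,
        "timeout": 36000
    }
}
```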
- Configuration using EGGROLL as the backend with directly specified CPU and related parameters (a sketch follows):
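A minimal sketch with EGGROLL parameters specified directly (the eggroll_run key follows the example conf later in this document; the values are illustrative):
```json
"job_parameters": {
    "common": {
        "job_type": "train",
        "work_mode": 1,
        "backend": 0,
        "eggroll_run": {
            "eggroll.session.processors.per.node": 2
        },
        "task_parallelism": 2,
        "computing_partitions": 8,
        "timeout": 36000
    }
}
```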
- Configuration using Spark with RabbitMQ as the backend, with directly specified CPU and related parameters:
```json
"job_parameters": {
    "common": {
        "job_type": "train",
        "work_mode": 1,
        "backend": 1,
        "spark_run": {
            "num-executors": 1,
            "executor-cores": 2
        },
        "task_parallelism": 2,
        "computing_partitions": 8,
        "timeout": 36000,
        "rabbitmq_run": {
            "queue": {
                "durable": true
            },
            "connection": {
                "heartbeat": 10000
            }
        }
    }
}
```
- Using Spark with Pulsar as the backend
For details, refer to https://fate.readthedocs.io/en/latest/_build_temp/doc/dsl_conf_v2_setting_guide_zh.html
Resource management details
Since version 1.5.0, in order to manage resources more precisely, FATE-Flow uses a finer-grained CPU-core management policy, removing the previous policy that limited the number of concurrently running jobs.
It covers total resource allocation, job resource calculation and resource scheduling. For details, refer to: https://fate.readthedocs.io/en/latest/_build_temp/doc/dsl_conf_v2_setting_guide_zh.html#id19
4. Component operating parameters
For details, refer to: https://fate.readthedocs.io/en/latest/_build_temp/doc/dsl_conf_v2_setting_guide_zh.html#id23
Parameter scope settings
- Parameters that apply to all parties are placed under the common scope identifier.
- Parameters that apply to a single party are placed under the role scope identifier, and role:party_index locates the specified party. Directly specified parameters take precedence over common parameters.
```json
"common": {
}
"role": {
    "guest": {
        "0": {
        }
    }
}
```
Parameters under common apply to all participants; parameters under role → guest → 0 apply to the guest participant at index 0. The component operating parameters of the current version support both scope policies.
Reference configuration
- The operating parameters of the intersection_0 and hetero_lr_0 components are placed in the common scope and applied to all participants
- The operating parameters of the reader_0 and dataio_0 components are configured per participant, because the input parameters of different participants are usually inconsistent; these two components are therefore generally set per participant
The component names above are those defined in the DSL configuration file.
"component_parameters": { "common": { "intersection_0": { "intersect_method": "raw", "sync_intersect_ids": true, "only_output_key": false }, "hetero_lr_0": { "penalty": "L2", "optimizer": "rmsprop", "alpha": 0.01, "max_iter": 3, "batch_size": 320, "learning_rate": 0.15, "init_param": { "init_method": "random_uniform" } } }, "role": { "guest": { "0": { "reader_0": { "table": {"name": "breast_hetero_guest", "namespace": "experiment"} }, "dataio_0":{ "with_label": true, "label_name": "y", "label_type": "int", "output_format": "dense" } } }, "host": { "0": { "reader_0": { "table": {"name": "breast_hetero_host", "namespace": "experiment"} }, "dataio_0":{ "with_label": false, "output_format": "dense" } } } } }
Multi-host configuration
For multi-host tasks, all host party ids should be listed under the role field, and each host's individual configuration should be listed under its corresponding index in the component parameters (a sketch follows). For details, refer to: https://fate.readthedocs.io/en/latest/_build_temp/doc/dsl_conf_v2_setting_guide_zh.html#host
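A minimal sketch, assuming two hosts with party ids 10000 and 10001 and hypothetical table names:
```json
"role": {
    "guest": [9999],
    "host": [10000, 10001]
},
"component_parameters": {
    "role": {
        "host": {
            "0": {
                "reader_0": {
                    "table": {"name": "hetero_breast_host_0", "namespace": "experiment"}
                }
            },
            "1": {
                "reader_0": {
                    "table": {"name": "hetero_breast_host_1", "namespace": "experiment"}
                }
            }
        }
    }
}
```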
Prediction task configuration
DSL V2 does not automatically generate a prediction DSL for training tasks. Users first need to deploy the modules of the required model using Flow Client. For a detailed command description, refer to the deploy section of the FATE-Flow document /fate/python/fate_client/flow_client/README_zh.rst.
The command is as follows:
```bash
flow model deploy --model-id $model_id --model-version $model_version
# --model-id: required
# --model-version: required
# --cpn-list: component list, optional
```
5. How FATE-Flow runs a job
- After a job is submitted, FATE-Flow obtains the job DSL and job config and stores them in the corresponding fields of the t_job table in the database and under the /fate/jobs/$jobid/ directory
- It parses the job DSL and job config, generates fine-grained parameters based on the merged parameters (including the three engine configurations corresponding to backend & work_mode mentioned above), and fills in parameter default values
- It distributes the common configuration to all participants and generates a job_runtime_on_party_conf according to each participant's actual information
- When each participant receives a task, it executes it based on its job_runtime_on_party_conf
The $jobid directory includes the following files:
```
job_dsl.json
job_runtime_conf.json
local
pipeline_dsl.json
train_runtime_conf.json
```
Submit Runtime Conf configuration example
Example runtime conf for training and validating a model
{ "dsl_version": 2, "initiator": { "role": "guest", "party_id": 9999 }, "role": { "host": [ 10000 ], "guest": [ 9999 ] }, "job_parameters": { "job_type": "train", "work_mode": 0, "backend": 0, "computing_engine": "STANDALONE", "federation_engine": "STANDALONE", "storage_engine": "STANDALONE", "engines_address": { "computing": { "nodes": 1, "cores_per_node": 20 }, "federation": { "nodes": 1, "cores_per_node": 20 }, "storage": { "nodes": 1, "cores_per_node": 20 } }, "federated_mode": "SINGLE", "task_parallelism": 1, "computing_partitions": 4, "federated_status_collect_type": "PULL", "model_id": "guest-9999#host-10000#model", "model_version": "202108310831349550536", "eggroll_run": { "eggroll.session.processors.per.node": 4 }, "spark_run": {}, "rabbitmq_run": {}, "pulsar_run": {}, "adaptation_parameters": { "task_nodes": 1, "task_cores_per_node": 4, "task_memory_per_node": 0, "request_task_cores": 4, "if_initiator_baseline": false } }, "component_parameters": { "role": { "guest": { "0": { "reader_0": { "table": { "name": "breast_hetero_guest", "namespace": "experiment" } } } }, "host": { "0": { "reader_0": { "table": { "name": "breast_hetero_host", "namespace": "experiment" } }, "dataio_0": { "with_label": false } } } }, "common": { "dataio_0": { "with_label": true }, "hetero_secureboost_0": { "task_type": "classification", "objective_param": { "objective": "cross_entropy" }, "num_trees": 5, "bin_num": 16, "encrypt_param": { "method": "iterativeAffine" }, "tree_param": { "max_depth": 3 } }, "evaluation_0": { "eval_type": "binary" } } } }
Algorithms
FATE ML
FATE currently supports three types of federated learning algorithms: horizontal federated learning, vertical federated learning and federated transfer learning.
- Horizontal federated learning: when the features of the two datasets overlap substantially but the users overlap little
- Vertical federated learning: when the users of the two datasets overlap substantially but the features overlap little
- Federated transfer learning: when both the users and the features of the two datasets overlap little; the data is not segmented, and transfer learning is used to overcome the lack of data or labels
Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/federatedml/README_zh.html
FederatedML implements many common machine learning algorithms in a federated setting. All modules are developed in a decoupled, modular way to enhance scalability. FederatedML provides:
- Federated statistics: PSI, Union, Pearson correlation, etc.
- Federated feature engineering: federated sampling, federated feature binning, federated feature selection, etc.
- Federated machine learning algorithms: horizontal and vertical federated LR, GBDT, DNN, transfer learning, etc.
- Model evaluation: binary-classification, multi-class classification, regression and clustering evaluation, plus federated vs. single-party comparative evaluation
- Security protocols: a variety of security protocols for more secure multi-party interactive computing
Federated machine learning framework: see https://fate.readthedocs.io/en/latest/_build_temp/python/federatedml/README_zh.html#id4
Algorithm list
List of algorithms used in the test example:
Algorithm | Module name | Description | Data input | Data output | Model input | Model output |
---|---|---|---|---|---|---|
Reader | Reader | When the storage engine of the input data is not supported by the computing engine currently in use, the data is automatically transferred to a FATE-cluster storage engine compatible with the computing engine; when the storage format of the input data is not a FATE-supported format, the format is automatically converted and the data is stored in the component output storage engine of the FATE cluster | User's originally stored data | Converted raw data | | |
DataIO | DataIO | Converts the original data into Instance objects (will be gradually deprecated from FATE v1.7 in favor of DataTransform) | Table whose values are the original data | Converted data table whose values are instances of the Data Instance class defined in federatedml/feature/instance.py | | DataIO model |
DataTransform | DataTransform | Converts the original data into Instance objects | Table whose values are the original data | Converted data table whose values are instances of the Data Instance class defined in federatedml/feature/instance.py | | DataTransform model |
Intersect | Intersection | Computes the intersection of both parties' datasets without disclosing any information about the non-intersecting data; mainly used in vertical tasks | Table | Intersection of the two tables | | Intersect model |
Hetero Secure Boosting | HeteroSecureBoost | Builds a vertical SecureBoost module across multiple parties | Table whose values are Instances | | | SecureBoost model, consisting of the model itself and the model parameters |
The detailed algorithm list is shown in: https://fate.readthedocs.io/en/latest/_build_temp/python/federatedml/README_zh.html#id2
Security protocol
Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/federatedml/README_zh.html#id3
Parameters
Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/federatedml/README_zh.html#module-federatedml.param
Example
For details, see: https://fate.readthedocs.io/en/latest/_build_temp/examples/README.html
Example usage guide
See test cases 2 and 3 in this document.
FATE-FLOW
Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_flow/README_zh.html
FATE-Flow is the job scheduling system of the federated learning framework FATE. It manages the complete lifecycle of federated learning jobs, including data input, training job scheduling, metric tracking and the model center.
FATE-Flow key features
- Uses a DAG to define a pipeline;
- FATE-DSL in JSON format describes the DAG and supports automated system integration;
- An advanced scheduling framework based on global state and optimistic-lock scheduling, single-DAG scheduling and multi-party coordinated scheduling, with support for multiple schedulers;
- Flexible scheduling policies: start / stop / rerun, etc.;
- Fine-grained resource scheduling, supporting core-count, memory and worker-node policies for different computing engines;
- A real-time tracker that tracks data, parameters, models and metrics during a run;
- A federated model center offering model management, federated consistency, import/export and inter-cluster migration;
- CLI, HTTP API and Python SDK
Framework
Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_flow/README_zh.html#id3
Deployment
Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_flow/README_zh.html#id4
Usage
FATE-Flow client command line
Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_client/flow_client/README_zh.html
In the new FATE-Flow command-line console, commands are divided into multiple classes, including job, data, model, component, etc. All commands share a common entry point:
```
[IN]
flow

[OUT]
Usage: flow [OPTIONS] COMMAND [ARGS]...

  Fate Flow Client

Options:
  -h, --help  Show this message and exit.

Commands:
  component  Component Operations
  data       Data Operations
  job        Job Operations
  model      Model Operations
  queue      Queue Operations
  table      Table Operations
  task       Task Operations
```
FATE-Flow command-line interface reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_flow/doc/fate_flow_cli.html
```bash
python fate_flow_client.py -f $command
# for example:
# cd /fate/python/fate_flow/
# python fate_flow_client.py -f submit_job -c /fate/python/fate_flow/examples/test_hetero_lr_job_conf.json -d /fate/python/fate_flow/examples/test_hetero_lr_job_dsl.json
```
FATE-Flow client SDK guide
For details, refer to: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_client/flow_sdk/README_zh.html
It covers:
- Job operations
- Component operations
- Data operations
- Task operations
- Model operations
- Tag operations
- Table operations
- Queue operations
The usage is similar:
```python
from flow_sdk.client import FlowClient

# use the real ip address to initialize the SDK
client = FlowClient('127.0.0.1', 9000, 'v1')
# client.job.submit(conf_path, dsl_path)
```
FATE-Flow REST API
Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_flow/doc/fate_flow_http_api.html
FATE model publishing and online federated inference guide
Reference: https://fate.readthedocs.io/en/latest/_build_temp/doc/model_publish_with_serving_guide_zh.html
Using FATE Pipeline
Pipeline introduction reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_client/pipeline/README.html
Pipeline is a high-level Python API that allows users to design, start and query FATE jobs in a sequential manner. FATE Pipeline is designed to be user-friendly and consistent with the behavior of the FATE command-line tools. Users can customize the job workflow by adding components to a pipeline and then start the job with a single call. In addition, Pipeline provides functionality to run prediction and to query information after a pipeline has been fit.
A FATE job is a directed acyclic graph
A FATE job is a DAG composed of algorithm component nodes. FATE Pipeline provides easy-to-use tools for configuring the order and settings of tasks.
FATE is written in a modular style. Modules are designed to have input and output data and models, so two modules are connected whenever the output of one module is set as the input of another. By tracing how a dataset is processed through FATE modules, we can see that a FATE job is in fact formed by a sequence of subtasks.
Pipeline interface
Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_client/pipeline/README.html#interface-of-pipeline
Component
FATE modules are wrapped as components in the Pipeline API. Each component can take in and output Data and Model. Component parameters can be set conveniently at initialization, and unspecified parameters take default values. Every component has a name, which can be set arbitrarily; a component's name is its identifier, so it must be unique within a pipeline. We suggest that each component name include a numeric suffix for easy tracking.
Each component may have input and / or output Data and / or Model. For more information on how to use components, see this guide.
Example of initializing a component with a specified parameter value:
```python
hetero_lr_0 = HeteroLR(name="hetero_lr_0", early_stop="weight_diff", max_iter=10,
                       early_stopping_rounds=2, validation_freqs=2)
```
Input
Input encapsulates all inputs of a component, including Data and Model input. To access the input of a component, reference its input attribute:
input_all = dataio_0.input
Output
Output encapsulates all output results of a component, including Data and Model output. To access the output of a component, reference its output attribute:
output_all = dataio_0.output
Data
Data wraps all data-type inputs and outputs of components. FATE Pipeline includes five types of data, each used for a different scenario. For more information, see here.
Model
Model defines the model inputs and outputs of components. Like Data, the two types of model are used for different purposes. For more information, see here.
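As an illustrative sketch (the component names hetero_lr_1 and intersection_1 are hypothetical, following the HeteroLR example above), a fitted component's model can be passed to another component of the same type through the Model interface:
```python
from pipeline.interface import Data, Model

# hetero_lr_1 reuses the model fitted by hetero_lr_0 to predict on another dataset
pipeline.add_component(hetero_lr_1,
                       data=Data(test_data=intersection_1.output.data),
                       model=Model(model=hetero_lr_0.output.model))
```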
Build pipeline
After initializing a pipeline, specify the job initiator and participants. Below is an example of the initial setup of a pipeline:
```python
pipeline = PipeLine()
pipeline.set_initiator(role='guest', party_id=9999)
pipeline.set_roles(guest=9999, host=10000, arbiter=10000)
```
The Reader needs to read in the data source so that other components can process the data. Define a Reader component:
reader_0 = Reader(name="reader_0")
In most cases, DataIO follows Reader to convert the data into DataInstance format, which can then be used for data engineering and model training. Some components (such as Union and Intersection) can run directly on non-DataInstance tables.
Each pipeline component can be configured individually per party by obtaining its party instance via get_party_instance. For example, DataIO can be configured specifically for the guest like this:
```python
dataio_0 = DataIO(name="dataio_0")
guest_component_instance = dataio_0.get_party_instance(role='guest', party_id=9999)
guest_component_instance.component_param(with_label=True, output_format="dense")
```
To include a component in a pipeline, use add_component. To add the DataIO component to the previously created pipeline, try this:
pipeline.add_component(dataio_0, data=Data(data=reader_0.output.data))
Build a FATE NN model in Keras style
Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_client/pipeline/README.html#build-fate-nn-model-in-keras-style
Initialize runtime job parameters
To fit or predict, the user needs to initialize the runtime environment, such as backend and work_mode:
```python
from pipeline.runtime.entity import JobParameters
job_parameters = JobParameters(backend=Backend.EGGROLL, work_mode=WorkMode.STANDALONE)
```
Run pipeline
After adding all components, the user needs to compile the pipeline before running the designed job. After compilation, the pipeline can be fit (i.e., the training job can be run) with the appropriate Backend and WorkMode:
```python
pipeline.compile()
pipeline.fit(job_parameters)
```
Task query
FATE Pipeline provides APIs to query component information, including data, model and summary. All query APIs are consistent with those of FlowPy; Pipeline retrieves the query results and returns them directly to the user.
summary = pipeline.get_component("hetero_lr_0").get_summary()
Deploy components
Once pipeline fitting is complete, prediction can be run on new datasets. Before predicting, the necessary components need to be deployed first; this step marks the selected components for use by the prediction pipeline.
```python
# deploy selected components
pipeline.deploy_component([dataio_0, hetero_lr_0])

# deploy all components
# note that the Reader component cannot be deployed; to deploy a pipeline that contains a Reader, always pass an explicit component list
pipeline.deploy_component()
```
Prediction using pipeline
First, start a new pipeline, and then specify the data source for prediction.
```python
predict_pipeline = PipeLine()
predict_pipeline.add_component(reader_0)
predict_pipeline.add_component(pipeline,
                               data=Data(predict_input={pipeline.dataio_0.input.data: reader_0.output.data}))
```
You can then start the prediction on the new pipeline.
predict_pipeline.predict(job_parameters)
In addition, because the pipeline is modular, users can add new components to the original pipeline before running the prediction.
```python
predict_pipeline.add_component(evaluation_0, data=Data(data=pipeline.hetero_lr_0.output.data))
predict_pipeline.predict(job_parameters)
```
Save and restore pipeline
To save the pipeline, simply use the dump interface.
pipeline.dump("pipeline_saved.pkl")
To restore a pipeline, use the load_model_from_file interface:
```python
from pipeline.backend.pipeline import PipeLine
PipeLine.load_model_from_file("pipeline_saved.pkl")
```
Pipeline summary information
To get details of a pipeline, use the describe interface, which prints the creation time, the fit or prediction status and the built DSL, if any.
pipeline.describe()
Pipeline vs. CLI
In previous versions, users interacted with FATE through the command-line interface, usually with manually configured conf and DSL JSON files. Manual configuration is tedious and error-prone. FATE Pipeline automatically generates the task configuration files at compile time, allowing quick experimentation with task designs.
Logs
FATE-Flow service logs: /fate/logs/fate_flow/
Task logs: /fate/logs/$job_id/
Common problems
Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_flow/README_zh.html#id10
Test cases
FATE-Board task viewing interface: http://192.168.1.75:8080/#/history
Reference: https://fate.readthedocs.io/en/latest/_build_temp/examples/experiment_template/user_usage/dsl_v2_predict_tutorial.html
Test case 1 - upload file
python /fate/examples/pipeline/demo/pipeline-upload.py --base /fate
The content of pipeline-upload.py is as follows:
Change examples/data/breast_hetero_guest.csv to the file you want to upload.
```python
import os
import argparse

from pipeline.backend.config import Backend, WorkMode
from pipeline.backend.pipeline import PipeLine

# path to data
# default fate installation path
DATA_BASE = "/data/projects/fate"

# site-package ver
# import site
# DATA_BASE = site.getsitepackages()[0]


def main(data_base=DATA_BASE):
    # parties config
    guest = 9999

    # 0 for eggroll, 1 for spark
    backend = Backend.EGGROLL

    # 0 for standalone, 1 for cluster
    work_mode = WorkMode.STANDALONE
    # use the work mode below for cluster deployment
    # work_mode = WorkMode.CLUSTER

    # partition for data storage
    partition = 4

    # table name and namespace, used in FATE job configuration
    dense_data = {"name": "breast_hetero_guest", "namespace": f"experiment"}
    tag_data = {"name": "breast_hetero_host", "namespace": f"experiment"}

    pipeline_upload = PipeLine().set_initiator(role="guest", party_id=guest).set_roles(guest=guest)
    # add upload data info
    # path to csv file(s) to be uploaded, modify to upload designated data
    pipeline_upload.add_upload_data(file=os.path.join(data_base, "examples/data/breast_hetero_guest.csv"),
                                    table_name=dense_data["name"],        # table name
                                    namespace=dense_data["namespace"],    # namespace
                                    head=1, partition=partition)          # data info
    pipeline_upload.add_upload_data(file=os.path.join(data_base, "examples/data/breast_hetero_host.csv"),
                                    table_name=tag_data["name"],
                                    namespace=tag_data["namespace"],
                                    head=1, partition=partition)

    # upload data
    pipeline_upload.upload(work_mode=work_mode, backend=backend, drop=1)


if __name__ == "__main__":
    parser = argparse.ArgumentParser("PIPELINE DEMO")
    parser.add_argument("--base", "-b", type=str,
                        help="data base, path to directory that contains examples/data")
    args = parser.parse_args()
    if args.base is not None:
        main(args.base)
    else:
        main()
```
Test case 2 - training and evaluation model
pipeline construction:
python /fate/examples/pipeline/demo/pipeline-quick-demo.py
pipeline-quick-demo.py is as follows:
```python
import json

from pipeline.backend.config import Backend, WorkMode
from pipeline.backend.pipeline import PipeLine
from pipeline.component import Reader, DataIO, Intersection, HeteroSecureBoost, Evaluation
from pipeline.interface import Data
from pipeline.runtime.entity import JobParameters

# table name & namespace in data storage
# data should be uploaded before running modeling task
guest_train_data = {"name": "breast_hetero_guest", "namespace": "experiment"}
host_train_data = {"name": "breast_hetero_host", "namespace": "experiment"}

# initialize pipeline
pipeline = PipeLine().set_initiator(role="guest", party_id=9999).set_roles(guest=9999, host=10000)

# define components
reader_0 = Reader(name="reader_0")
reader_0.get_party_instance(role="guest", party_id=9999).component_param(table=guest_train_data)
reader_0.get_party_instance(role="host", party_id=10000).component_param(table=host_train_data)

dataio_0 = DataIO(name="dataio_0", with_label=True)
dataio_0.get_party_instance(role="host", party_id=10000).component_param(with_label=False)

intersect_0 = Intersection(name="intersection_0")
hetero_secureboost_0 = HeteroSecureBoost(name="hetero_secureboost_0",
                                         num_trees=5,
                                         bin_num=16,
                                         task_type="classification",
                                         objective_param={"objective": "cross_entropy"},
                                         encrypt_param={"method": "iterativeAffine"},
                                         tree_param={"max_depth": 3})
evaluation_0 = Evaluation(name="evaluation_0", eval_type="binary")

# add components to pipeline, in order of task execution
pipeline.add_component(reader_0)\
    .add_component(dataio_0, data=Data(data=reader_0.output.data))\
    .add_component(intersect_0, data=Data(data=dataio_0.output.data))\
    .add_component(hetero_secureboost_0, data=Data(train_data=intersect_0.output.data))\
    .add_component(evaluation_0, data=Data(data=hetero_secureboost_0.output.data))

# compile & fit pipeline
pipeline.compile().fit(JobParameters(backend=Backend.EGGROLL, work_mode=WorkMode.STANDALONE))

# to run this task with cluster deployment, use the following setting instead
# may change data engine backend according to actual environment
# pipeline.compile().fit(JobParameters(backend=Backend.EGGROLL, work_mode=WorkMode.CLUSTER))

# query component summary
print(f"Evaluation summary:\n{json.dumps(pipeline.get_component('evaluation_0').get_summary(), indent=4)}")
```
FATE-Flow submission:
(users can go to /fate/examples/dsl/v2/ to find an appropriate algorithm and configuration file to substitute)
flow job submit -c /fate/examples/dsl/v2/hetero_secureboost/test_secureboost_train_complete_secure_conf.json -d /fate/examples/dsl/v2/hetero_secureboost/test_secureboost_train_dsl.json
Test case 3 - model training, deployment and prediction
Build a HeteroSecureBoost model and use the model to predict.
Pipeline construction
Reference: https://fate.readthedocs.io/en/latest/_build_temp/examples/experiment_template/user_usage/pipeline_predict_tutorial.html
1. Upload files (already uploaded in test case 1)
2. Create files in the host environment:
```bash
cd /home/docker_standalone_fate_1.6.0/fate_job/
vim fit_Hetero_SecureBoost_model.py
```
The contents of the file are as follows (you can refer to the corresponding algorithm code under /fate/examples/pipeline/ together with the corresponding algorithm introductions at https://fate.readthedocs.io/en/latest/_build_temp/python/federatedml/README.html#algorithm-list; reading them together helps understanding):
```python
from pipeline.backend.config import Backend, WorkMode                            # configs
from pipeline.backend.pipeline import PipeLine                                   # Pipeline
from pipeline.component import Reader, DataIO, Intersection, HeteroSecureBoost   # fate components
from pipeline.interface import Data                                              # data flow
from pipeline.runtime.entity import JobParameters                                # parameter class

# define dataset name and namespace
guest_train_data = {"name": "breast_hetero_guest", "namespace": "experiment"}
host_train_data = {"name": "breast_hetero_host", "namespace": "experiment"}

# initialize pipeline, set guest as initiator and set guest/host party id
pipeline = PipeLine().set_initiator(role="guest", party_id=9999).set_roles(guest=9999, host=10000)

# define components
# reader reads raw data
reader_0 = Reader(name="reader_0")
reader_0.get_party_instance(role="guest", party_id=9999).component_param(table=guest_train_data)
reader_0.get_party_instance(role="host", party_id=10000).component_param(table=host_train_data)

# data_io transforms data
dataio_0 = DataIO(name="dataio_0", with_label=True)
dataio_0.get_party_instance(role="host", party_id=10000).component_param(with_label=False)

# find sample intersection using Intersection component
intersect_0 = Intersection(name="intersection_0")

# hetero secureboost component, setting algorithm parameters
hetero_secureboost_0 = HeteroSecureBoost(name="hetero_secureboost_0",
                                         num_trees=5,
                                         bin_num=16,
                                         task_type="classification",
                                         objective_param={"objective": "cross_entropy"},
                                         encrypt_param={"method": "iterativeAffine"},
                                         tree_param={"max_depth": 3})

# add components to pipeline, in the order of task execution
pipeline.add_component(reader_0)\
    .add_component(dataio_0, data=Data(data=reader_0.output.data))\
    .add_component(intersect_0, data=Data(data=dataio_0.output.data))\
    .add_component(hetero_secureboost_0, data=Data(train_data=intersect_0.output.data))

# compile & fit pipeline
pipeline.compile().fit(JobParameters(backend=Backend.EGGROLL, work_mode=WorkMode.STANDALONE))

# save train pipeline
pipeline.dump("pipeline_saved.pkl")
```
Execute in docker environment:
```bash
cd /fate/fate_job
python fit_Hetero_SecureBoost_model.py
```
3. Create files in the host environment
vim predict_instances_by_Hetero_SecureBoost_model.py
The contents of the file are as follows (change breast_hetero_guest to the table name you uploaded):
```python
from pipeline.backend.pipeline import PipeLine
from pipeline.component.reader import Reader
from pipeline.interface.data import Data
from pipeline.backend.config import Backend, WorkMode      # configs
from pipeline.runtime.entity import JobParameters          # parameter class

# load train pipeline
pipeline = PipeLine.load_model_from_file('pipeline_saved.pkl')

# deploy components in training step
pipeline.deploy_component([pipeline.dataio_0, pipeline.intersection_0, pipeline.hetero_secureboost_0])

# set new instances to predict
# new dataset
guest_train_data = {"name": "breast_hetero_guest", "namespace": "experiment"}
host_train_data = {"name": "breast_hetero_host", "namespace": "experiment"}

# set new reader
reader_0 = Reader(name="reader_0")
reader_0.get_party_instance(role="guest", party_id=9999).component_param(table=guest_train_data)
reader_0.get_party_instance(role="host", party_id=10000).component_param(table=host_train_data)

# new predict pipeline
predict_pipeline = PipeLine()
# update reader
predict_pipeline.add_component(reader_0)
# add selected components from train pipeline onto predict pipeline
predict_pipeline.add_component(pipeline,
                               data=Data(predict_input={pipeline.dataio_0.input.data: reader_0.output.data}))

# run predict model
predict_pipeline.predict(JobParameters(backend=Backend.EGGROLL, work_mode=WorkMode.STANDALONE))
```
Execute in the docker environment, and view the results in FATE-Board:
python predict_instances_by_Hetero_SecureBoost_model.py
Build with flow command
Reference: https://fate.readthedocs.io/en/latest/_build_temp/examples/experiment_template/user_usage/dsl_v2_predict_tutorial.html
(1) Train the model
```bash
cd /fate/examples/
flow job submit -c dsl/v2/hetero_secureboost/test_secureboost_train_binary_conf.json -d dsl/v2/hetero_secureboost/test_secureboost_train_dsl.json
```
(2) Use flow_client to deploy the components required by the prediction task.
Remember to modify the model id and model version.
flow model deploy --model-id guest-9999#host-9998#model --model-version 2021090109322084026031 --cpn-list "reader_0, dataio_0, intersection_0, hetero_secure_boost_0"
If execution returns:
```json
{
    "retcode": 100,
    "retmsg": "'Pipeline'"
}
```
the model has not finished building yet; check its status in FATE-Board and deploy it after it completes.
(3) Prediction configuration
Replace the model_id and model_version in /fate/examples/dsl/v2/hetero_secureboost/test_predict_conf.json with the model_id and model_version returned by the deployment, replace the dataset names, and save the modified content to /fate/fate_job/new_test_predict_conf.json (a new file). A sketch follows.
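A minimal sketch of what the modified prediction conf might contain (the model_id, model_version and table names below are placeholders; substitute the values returned by your own deployment):
```json
{
    "dsl_version": "2",
    "initiator": {
        "role": "guest",
        "party_id": 9999
    },
    "role": {
        "guest": [9999],
        "host": [10000]
    },
    "job_parameters": {
        "common": {
            "work_mode": 0,
            "backend": 0,
            "job_type": "predict",
            "model_id": "guest-9999#host-10000#model",
            "model_version": "202108310831349550536"
        }
    },
    "component_parameters": {
        "role": {
            "guest": {
                "0": {
                    "reader_0": {
                        "table": {"name": "breast_hetero_guest", "namespace": "experiment"}
                    }
                }
            },
            "host": {
                "0": {
                    "reader_0": {
                        "table": {"name": "breast_hetero_host", "namespace": "experiment"}
                    }
                }
            }
        }
    }
}
```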
Execute:
flow job submit -c /fate/fate_job/lxy/new_test_predict_conf.json
If the following content is returned, it is because the host and guest party_id are inconsistent with the party_id used when uploading the files:
```json
{
    "data": "No such file or directory",
    "retcode": 100,
    "retmsg": "2"
}
```
If the submitted task returns the following, it is because the deploy operation has not been performed:
```json
{
    "retcode": 100,
    "retmsg": "Model arbiter-10000#guest-9999#host-10000#model 20210908033432743389158 has not been deployed yet."
}
```
Test case 4 - build the model independently using pipeline
1. Select the appropriate algorithm and data set
Download the FATE source code:
git clone https://github.com/FederatedAI/FATE.git
Open the data/README.md file in the examples directory to view the test dataset information. Use the linear regression model HeteroLinR and the student_hetero dataset, whose label is a continuous value.
2. Upload dataset
Refer to test case 1
3. Write training and verification pipeline code
Based on the pipeline code of test case 3.
The HeteroLinR algorithm can be understood by combining https://fate.readthedocs.io/en/latest/_build_temp/python/federatedml/linear_model/linear_regression/README.html with the example source file FATE/examples/pipeline/hetero_linear_regression/pipeline-hetero-linr.py.
Compared with the HeteroSecureBoost model of test case 3, building the HeteroLinR pipeline additionally requires an arbiter to be configured. Other required changes include modifying the dataset names. For the parameters of the HeteroLinR model, refer to the model parameters in pipeline-hetero-linr.py; only the name is set here.
Create the linr_model_train_and_evaluation.py file.
The code is as follows:
```python
from pipeline.backend.config import Backend, WorkMode                        # configs
from pipeline.backend.pipeline import PipeLine                               # Pipeline
from pipeline.component import Reader, DataIO, Intersection, HeteroLinR, Evaluation  # fate components
from pipeline.interface import Data                                          # data flow
from pipeline.runtime.entity import JobParameters                            # parameter class

# define dataset name and namespace
guest_train_data = {"name": "student_hetero_guest", "namespace": "experiment"}
host_train_data = {"name": "student_hetero_host", "namespace": "experiment"}

# initialize pipeline, set guest as initiator and set guest/host/arbiter party id
pipeline = PipeLine().set_initiator(role="guest", party_id=9999).set_roles(guest=9999, host=10000, arbiter=10000)

reader_0 = Reader(name="reader_0")
reader_0.get_party_instance(role='guest', party_id=9999).component_param(table=guest_train_data)
reader_0.get_party_instance(role='host', party_id=10000).component_param(table=host_train_data)

dataio_0 = DataIO(name="dataio_0")
dataio_0.get_party_instance(role='guest', party_id=9999).component_param(with_label=True, label_name="y",
                                                                         label_type="int", output_format="dense")
dataio_0.get_party_instance(role='host', party_id=10000).component_param(with_label=False)

intersection_0 = Intersection(name="intersection_0")
hetero_linr_0 = HeteroLinR(name="hetero_linr_0")
evaluation_0 = Evaluation(name="evaluation_0", eval_type="regression", pos_label=1)

pipeline.add_component(reader_0)
pipeline.add_component(dataio_0, data=Data(data=reader_0.output.data))
pipeline.add_component(intersection_0, data=Data(data=dataio_0.output.data))
pipeline.add_component(hetero_linr_0, data=Data(train_data=intersection_0.output.data))
pipeline.add_component(evaluation_0, data=Data(data=hetero_linr_0.output.data))

pipeline.compile()
job_parameters = JobParameters(backend=Backend.EGGROLL, work_mode=WorkMode.STANDALONE)
pipeline.fit(job_parameters)

# save the model
pipeline.dump("/fate/fate_job/lxy/hetero_linr_pipeline_saved.pkl")
```
Execute the file in docker:
```bash
cd /fate/fate_job/lxy
python linr_model_train_and_evaluation.py
```
4. Write prediction pipeline code
Create the linr_model_predict.py file, modified from the pipeline code of test case 3; only the table names and the model file name need to be changed.
```python
from pipeline.backend.pipeline import PipeLine
from pipeline.component.reader import Reader
from pipeline.interface.data import Data
from pipeline.backend.config import Backend, WorkMode      # configs
from pipeline.runtime.entity import JobParameters          # parameter class

# load train pipeline
pipeline = PipeLine.load_model_from_file('/fate/fate_job/lxy/hetero_linr_pipeline_saved.pkl')

# deploy components in training step
pipeline.deploy_component([pipeline.dataio_0, pipeline.intersection_0, pipeline.hetero_linr_0])

# set new instances to predict
# new dataset
guest_train_data = {"name": "student_hetero_guest", "namespace": "experiment"}
host_train_data = {"name": "student_hetero_host", "namespace": "experiment"}

# set new reader
reader_0 = Reader(name="reader_0")
reader_0.get_party_instance(role="guest", party_id=9999).component_param(table=guest_train_data)
reader_0.get_party_instance(role="host", party_id=10000).component_param(table=host_train_data)

# new predict pipeline
predict_pipeline = PipeLine()
# update reader
predict_pipeline.add_component(reader_0)
# add selected components from train pipeline onto predict pipeline
predict_pipeline.add_component(pipeline,
                               data=Data(predict_input={pipeline.dataio_0.input.data: reader_0.output.data}))

# run predict model
predict_pipeline.predict(JobParameters(backend=Backend.EGGROLL, work_mode=WorkMode.STANDALONE))
```
FATE TEST
A collection of useful tools for running FATE tests.
Quick start
Reference: https://fate.readthedocs.io/en/latest/_build_temp/python/fate_test/README.html#quick-start
Edit the default fate_test_config.yaml
In the /usr/local/lib/python3.6/site-packages/fate_test/fate_test_config.yaml file, change
data_base_dir: path(FATE)
to
data_base_dir: /fate/
Run the fate_test suite
suite: runs test suites and collects FATE jobs. A testsuite runs a group of jobs in sequence; the data used by the jobs can be uploaded before the jobs are submitted and cleaned up after they complete. This tool is very useful for FATE release testing.
```bash
# fate_test suite -i <path contains *testsuite.json>
fate_test suite -i /fate/examples/dsl/v1/homo_nn/testsuite.json
# fate_test suite -i /fate/examples/dsl/v1/hetero_pearson/testsuite.json
```
Run a fate_test benchmark
benchmark-quality: compares modeling quality between FATE and other machine learning systems.
```bash
# fate_test benchmark-quality -i <path contains *benchmark.json>
fate_test benchmark-quality -i /fate/examples/benchmark_quality/hetero_linear_regression/hetero_linr_benchmark.json
```
Developer's guide
To develop an algorithm module, refer to: https://fate.readthedocs.io/en/latest/_build_temp/doc/develop_guide_zh.html
To develop a module, the following steps are required.
- Define the Python parameter object that will be used in the module (a sketch of this step follows the list).
- Define the module's Setting conf JSON configuration file.
- If the module requires federation, define the transfer-variable configuration file.
- Make your algorithm module inherit the model_base class and implement several required functions.
- Define the protobuf file required to save the model.
- If you want to start a component directly from a Python script, define a Pipeline component in fate_client.
API
Computing API
Initialize a computing session. Refer to: https://fate.readthedocs.io/en/latest/_build_temp/doc/api/computing.html
Federation API
Includes the low-level API FederationABC() and the user interface secure_add_example_transfer_variable. Reference: https://fate.readthedocs.io/en/latest/_build_temp/doc/api/federation.html
Parameters
For details of class parameters, refer to: https://fate.readthedocs.io/en/latest/_build_temp/doc/api/params.html
Errors encountered and their solutions
Federated schedule error, Please check rollSite and fateflow
Error message: Federated schedule error, please check rollSite and fateflow network connectivity. RPC request error: <_InactiveRpcError of RPC that terminated…
Cause: cluster communication problem
Solution: check whether the work_mode setting in the configuration file matches the FATE deployment mode (standalone or cluster).
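For example, for a standalone deployment the job conf should carry the matching mode; a minimal sketch (values taken from the supported system parameters table above):
```json
"job_parameters": {
    "common": {
        "work_mode": 0,
        "backend": 0
    }
}
```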