In Kubeflow, TensorFlow Serving is used as the official TensorFlow model serving component. TensorFlow Serving is an open-source serving system from Google designed for deploying machine learning models. It is flexible, high-performance, and suitable for production environments. TensorFlow Serving makes it easy to deploy new algorithms and experiments while keeping the same server architecture and APIs.
TensorFlow Serving can load a model directly and expose it as a service interface, but the only model format it supports is SavedModel. Therefore, SavedModel is introduced first.
SavedModel
SavedModel stores a TensorFlow model's topology and weights. With a SavedModel there is no need to run the original model-building code, which makes it convenient to share or deploy the model. Therefore, SavedModel is generally used for model deployment.
- Topology: a file that describes the structure of the model (for example, which operations it uses). It contains references to the model weights, which are stored externally.
- Weights: binary files that store the model's weights in an efficient format. They are usually stored in the same folder as the topology.
A SavedModel directory looks like this:
```
assets  saved_model.pb  variables
```
View the MetaGraphDefs and SignatureDefs of a SavedModel:
```
saved_model_cli show --dir <SavedModel path> --all
```
To serve a model we need its MetaGraphDefs and SignatureDefs. A MetaGraphDef is the familiar meta graph, which contains four main types of information:
- MetaInfoDef: stores meta information, such as the version and other user information;
- GraphDef: the serialized graph structure, stored as a Protocol Buffer;
- SaverDef: the Saver information for the graph, such as the maximum number of checkpoints kept at the same time and the names of the Tensors to be saved; it does not store the actual contents of the Tensors;
- CollectionDef: any Python objects that need special attention are annotated here so they can be retrieved after the meta graph is imported, for example "prediction".
SignatureDefs are the signature definitions of the model; they define the inputs and outputs of the functions it exposes.
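Besides saved_model_cli, the same information can be inspected programmatically. The following is a minimal sketch (assuming TF 1.x and a hypothetical export directory ./object/1) that loads a SavedModel and prints its SignatureDefs:

```python
import tensorflow as tf

# Hypothetical export directory; replace with your own SavedModel path.
export_dir = "./object/1"

with tf.Session(graph=tf.Graph()) as sess:
    # Load the SavedModel; this returns the MetaGraphDef tagged "serve".
    meta_graph_def = tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], export_dir)

    # signature_def is a map from signature name to SignatureDef.
    for name, signature in meta_graph_def.signature_def.items():
        print("signature:", name)
        print("  inputs: ", {k: v.name for k, v in signature.inputs.items()})
        print("  outputs:", {k: v.name for k, v in signature.outputs.items()})
        print("  method_name:", signature.method_name)
```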
SignatureDefs
A SignatureDef defines the signature of a computation supported by a TensorFlow graph, i.e. its inputs and outputs. The SignatureDef structure contains:
- inputs, a map of string to TensorInfo;
- outputs, a map of string to TensorInfo;
- method_name (which corresponds to a supported method name in the loading tool/system).
Example of Classification SignatureDef
A classification signature must have one input Tensor, inputs, and two output Tensors: classes and scores.
signature_def: { key : "my_classification_signature" value: { inputs: { key : "inputs" value: { name: "tf_example:0" dtype: DT_STRING tensor_shape: ... } } outputs: { key : "classes" value: { name: "index_to_string:0" dtype: DT_STRING tensor_shape: ... } } outputs: { key : "scores" value: { name: "TopKV2:0" dtype: DT_FLOAT tensor_shape: ... } } method_name: "tensorflow/serving/classify" } }
Example of Predict SignatureDef
signature_def: { key : "my_prediction_signature" value: { inputs: { key : "images" value: { name: "x:0" dtype: ... tensor_shape: ... } } outputs: { key : "scores" value: { name: "y:0" dtype: ... tensor_shape: ... } } method_name: "tensorflow/serving/predict" } }
Example of Regression SignatureDef
signature_def: { key : "my_regression_signature" value: { inputs: { key : "inputs" value: { name: "x_input_examples_tensor_0" dtype: ... tensor_shape: ... } } outputs: { key : "outputs" value: { name: "y_outputs_0" dtype: DT_FLOAT tensor_shape: ... } } method_name: "tensorflow/serving/regress" } }
Generate a SavedModel file
There are several ways to generate a SavedModel file:
- (1) tf.saved_model: the most direct and simple way (a sketch is shown after this list);
- (2) Estimator.export_savedmodel: export from the high-level Estimator API, for example:
```python
classifier = tf.estimator.Estimator(
    model_fn=conv_model,
    model_dir=args.tf_model_dir,
    config=training_config,
    params=model_params)

# serving_fn is a serving_input_receiver_fn describing the inference-time inputs
classifier.export_savedmodel(args.tf_export_dir,
                             serving_input_receiver_fn=serving_fn)
```
- (3) keras.Model.save(output_path)
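For route (1), tf.saved_model.simple_save is the most compact TF 1.x API. The sketch below is a minimal, hypothetical example; the graph with x and y and the export path ./object/1 are assumptions for illustration, not taken from the model in this article:

```python
import tensorflow as tf

# Hypothetical graph: one input placeholder and one output tensor.
x = tf.placeholder(tf.float32, shape=[None, 784], name="x")
y = tf.layers.dense(x, 10, name="y")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Writes saved_model.pb and variables/ under ./object/1, with a default
    # "serving_default" predict signature built from the inputs/outputs dicts.
    tf.saved_model.simple_save(
        sess,
        export_dir="./object/1",
        inputs={"images": x},
        outputs={"scores": y})
```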
Convert a checkpoint file to a SavedModel file
```python
import sys, os, io
import tensorflow as tf

model_version = "1"
model_name = "object"

def restore_and_save(input_checkpoint, export_path_base):
    checkpoint_file = tf.train.latest_checkpoint(input_checkpoint)
    graph = tf.Graph()
    with graph.as_default():
        session_conf = tf.ConfigProto(allow_soft_placement=True,
                                      log_device_placement=False)
        sess = tf.Session(config=session_conf)
        with sess.as_default():
            # Load the saved meta graph, restore the variables in the graph,
            # and save the deployable model through SavedModelBuilder.
            saver = tf.train.import_meta_graph("{}.meta".format(checkpoint_file))
            saver.restore(sess, checkpoint_file)
            print("name scope: ", graph.get_name_scope())

            export_path = os.path.join(
                tf.compat.as_bytes(export_path_base),
                tf.compat.as_bytes(model_name + "/" + model_version))
            print('Exporting trained model to', export_path)
            builder = tf.saved_model.builder.SavedModelBuilder(export_path)

            # The model's operators can be listed through graph.get_operations().
            # "inputs" is the input-layer operator.
            inputs = tf.saved_model.utils.build_tensor_info(
                graph.get_operation_by_name("Placeholder").outputs[0])
            print(inputs)
            # "outputs" is the output-layer operator; here the output layer is a Softmax.
            outputs = tf.saved_model.utils.build_tensor_info(
                graph.get_operation_by_name("final_result").outputs[0])
            print(outputs)

            # signature_constants: signature constants for SavedModel save and
            # restore operations. For this prediction signature the method_name
            # is "tensorflow/serving/predict".
            # Define the inputs and outputs of the model, mapping the serving
            # interface names to the graph tensors.
            labeling_signature = (
                tf.saved_model.signature_def_utils.build_signature_def(
                    inputs={"Placeholder": inputs},
                    outputs={"final_result": outputs},
                    method_name="tensorflow/serving/predict"))

            # tf.group: creates an operation that groups multiple operations;
            # running it runs all of its inputs.
            legacy_init_op = tf.group(tf.tables_initializer(), name='legacy_init_op')

            # add_meta_graph_and_variables: sets up a Saver to save the variables
            # in the session and writes the corresponding graph definition. It
            # assumes the saved variables have already been initialized. For a
            # SavedModelBuilder this must be called exactly once to save the meta
            # graph; further graphs can be added later with add_meta_graph().
            # Map the signature name to the model signature.
            builder.add_meta_graph_and_variables(
                sess, [tf.saved_model.tag_constants.SERVING],
                signature_def_map={
                    tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
                        labeling_signature},
                legacy_init_op=legacy_init_op)
            builder.save()
            print("Build Done")
```
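A usage sketch for the converter above; the checkpoint directory and export base path are hypothetical placeholders:

```python
# Reads the latest checkpoint under ./checkpoints and writes the
# SavedModel to ./export/object/1
restore_and_save(input_checkpoint="./checkpoints", export_path_base="./export")
```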
Run server
After the SavedModel file has been generated, you can run TensorFlow Serving directly to serve the model:
(1) Run with Docker:
```
docker run --rm -it -p 8500:8500 \
  --mount type=bind,source=/root/inception/models,target=/models \
  -e MODEL_NAME=1 tensorflow/serving
```
The mounted directory must follow a two-level layout: ./<model name>/<version number>/saved_model.pb, and the version number must be a number.
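For example, assuming a model named object (a hypothetical name) with version 1, the mounted models directory would look like this:

```
models/
└── object/
    └── 1/
        ├── saved_model.pb
        └── variables/
```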
(2) Or run it as a Kubernetes Deployment (this is what Kubeflow does). Note that the volumeMounts entry below references a volume named local-storage; a matching volumes entry (for example a PersistentVolumeClaim) must also be defined in the Pod spec so that /mnt/export actually contains the exported model:
```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: inception
  name: inception-service-local
  namespace: kubeflow
spec:
  template:
    metadata:
      labels:
        app: inception
        version: v1
    spec:
      containers:
      - args:
        - --port=9000
        - --rest_api_port=8500
        - --model_name=1
        - --model_base_path=/mnt/export
        command:
        - /usr/bin/tensorflow_model_server
        env:
        - name: modelBasePath
          value: /mnt/export
        image: tensorflow/serving:1.11.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          initialDelaySeconds: 30
          periodSeconds: 30
          tcpSocket:
            port: 9000
        name: mnist
        ports:
        - containerPort: 9000
        - containerPort: 8500
        volumeMounts:
        - mountPath: /mnt
          name: local-storage
```
Build a test request
Test the model's REST interface:
```python
import json
import requests
from PIL import Image
import numpy as np

filename = "./CES/astra_mini/1576210854440.png"  # test picture
img = Image.open(filename)
img_arr = np.array(img, dtype=np.uint8)
print(img_arr.shape)  # (299, 299, 3)

data = json.dumps({"instances": [img_arr.tolist()]})
headers = {"content-type": "application/json"}
json_response = requests.post('http://127.0.0.1:8501/v1/models/object:predict',
                              data=data, headers=headers)
print(json_response.text)
```
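Before sending prediction requests, it can help to confirm that the model is loaded. TensorFlow Serving's REST API also exposes a model status endpoint; the snippet below is a small sketch that assumes the same server address and a model named object:

```python
import requests

# GET /v1/models/<model_name> reports the load state of each model version.
status = requests.get("http://127.0.0.1:8501/v1/models/object")
print(status.text)  # the model is ready when its state is "AVAILABLE"
```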
Reference
https://www.tensorflow.org/tfx/tutorials/serving/rest_simple