How to parse the output received by a gRPC stub client from a TensorFlow Serving server?

tensorflow serving grpc
tensorflow serving client

I have exported a DNNClassifier model and am running it on a TensorFlow Serving server using Docker. After that I wrote a Python client to interact with TensorFlow Serving for new predictions.

I have written the following code to get the response from the TensorFlow Serving server.

host, port = FLAGS.server.split(':')
channel = implementations.insecure_channel(host, int(port))
stub = prediction_service_pb2.beta_create_PredictionService_stub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = FLAGS.model
request.model_spec.signature_name = 'serving_default'

feature_dict = {'a': _float_feature(value=400),
                'b': _float_feature(value=5),
                'c': _float_feature(value=200),
                'd': _float_feature(value=30),
                'e': _float_feature(value=60),
                'f': _float_feature(value=5),
                'g': _float_feature(value=7500),
                'h': _int_feature(value=1),
                'i': _int_feature(value=1234),
                'j': _int_feature(value=1),
                'k': _int_feature(value=4),
                'l': _int_feature(value=1),
                'm': _int_feature(value=0)}
example = tf.train.Example(features=tf.train.Features(feature=feature_dict))
serialized = example.SerializeToString()

request.inputs['inputs'].CopyFrom(
    tf.contrib.util.make_tensor_proto(serialized, shape=[1]))

result_future = stub.Predict.future(request, 5.0)

You can do the following:

result = stub.Predict(request, 5.0)
float_val = result.outputs['outputs'].float_val

Note that this method calls stub.Predict instead of stub.Predict.future.
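The difference between the two call styles can be shown without a real server. In the sketch below, the Fake* classes are hypothetical stand-ins for the gRPC stub and response (they are not part of grpc); only the calling pattern matches the real API:

```python
# Hypothetical mocks of the gRPC stub and response, only to illustrate
# the blocking vs. future-based Predict call styles.
class FakeTensor:
    def __init__(self, float_val):
        self.float_val = float_val

class FakeResponse:
    def __init__(self, float_val):
        self.outputs = {'outputs': FakeTensor(float_val)}

class FakeFuture:
    def __init__(self, response):
        self._response = response

    def result(self):
        # In real gRPC this blocks until the server's reply arrives.
        return self._response

class FakePredict:
    def __init__(self, response):
        self._response = response

    def __call__(self, request, timeout):
        # Blocking call: returns the response directly.
        return self._response

    def future(self, request, timeout):
        # Asynchronous call: returns a future immediately.
        return FakeFuture(self._response)

class FakeStub:
    def __init__(self, response):
        self.Predict = FakePredict(response)

stub = FakeStub(FakeResponse([0.1, 0.9]))

# Blocking style: the result is available immediately.
result = stub.Predict(None, 5.0)
float_val = result.outputs['outputs'].float_val

# Future style: call .result() to block until the response is ready.
result2 = stub.Predict.future(None, 5.0).result()

print(float_val)  # [0.1, 0.9]
```

With a real stub, the only change is that stub comes from beta_create_PredictionService_stub(channel) and the request is a PredictRequest.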


In case you have more than one output, you can do something like the following, which builds a dictionary whose keys are the output names and whose values are lists of whatever the model returns.

results = dict()
for output in output_names:
    results[output] = response.outputs[output].float_val
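The loop above can be wrapped in a small helper. The sketch below uses a plain dict and a hypothetical FakeTensorProto class in place of the real protobuf response map, but the access pattern (indexing by output name, reading float_val) is the same:

```python
# Hypothetical sketch: collect every output tensor's float values
# into a plain dict, keyed by output name.
def collect_outputs(response_outputs, output_names):
    results = {}
    for output in output_names:
        # Each entry behaves like a TensorProto with a float_val field.
        results[output] = list(response_outputs[output].float_val)
    return results

# Stand-in for the protobuf response.outputs map:
class FakeTensorProto:
    def __init__(self, float_val):
        self.float_val = float_val

outputs = {'scores': FakeTensorProto([0.2, 0.8]),
           'classes': FakeTensorProto([1.0])}

collected = collect_outputs(outputs, ['scores', 'classes'])
print(collected)  # {'scores': [0.2, 0.8], 'classes': [1.0]}
```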


This is in addition to the answer given by @Maxime De Bruyn,

In the Predict API with multiple prediction outputs using a MobileNet/Inception model, the following code segment didn't work for me.

result = stub.Predict(request, 5.0)

float_val = result.outputs['outputs'].float_val

print("Output: ", float_val)

Output: []

Instead, I had to use the "prediction" key in the output.

result = stub.Predict(request, 5.0)
predictions = result.outputs['prediction'].float_val
print("Output: ", predictions)

Output: [0.016111543402075768, 0.2446805089712143, 0.06016387417912483, 0.12880375981330872, 0.035926613956689835, 0.026000071316957474, 0.04009509086608887, 0.35264086723327637, 0.0762331634759903, 0.019344471395015717]
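Since this float_val list is a probability distribution over classes, picking the predicted class is just an argmax over the list. A plain-Python sketch using the output above (the mapping from index to class label depends on your model and is not shown here):

```python
# Pick the most likely class from the float_val probability list.
predictions = [0.016111543402075768, 0.2446805089712143, 0.06016387417912483,
               0.12880375981330872, 0.035926613956689835, 0.026000071316957474,
               0.04009509086608887, 0.35264086723327637, 0.0762331634759903,
               0.019344471395015717]

best_index = max(range(len(predictions)), key=lambda i: predictions[i])
print(best_index, predictions[best_index])  # 7 0.35264086723327637
```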


What you are looking for is probably tf.make_ndarray, which creates a NumPy array from a TensorProto (i.e. it is the inverse of tf.make_tensor_proto). This way your output recovers the shape it is supposed to have, so, building on Jasmine's answer, you can store multiple outputs in a dictionary with:

response = prediction_service.Predict(request, 5.0)

results = {}
for output in response.outputs.keys():
    results[output] = tf.make_ndarray(response.outputs[output])
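If TensorFlow is not installed on the client, the reshaping that tf.make_ndarray performs for a float TensorProto can be approximated by hand: read the dimension sizes from tensor_shape and nest the flat float_val list accordingly. A pure-Python sketch (the dims list here stands in for the sizes found in response.outputs[name].tensor_shape.dim):

```python
def reshape_flat(values, dims):
    """Reshape a flat list into nested lists according to dims,
    roughly what tf.make_ndarray does for a float TensorProto."""
    if len(dims) <= 1:
        return list(values)
    size = len(values) // dims[0]
    return [reshape_flat(values[i * size:(i + 1) * size], dims[1:])
            for i in range(dims[0])]

flat = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
reshaped = reshape_flat(flat, [2, 3])
print(reshaped)  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```

This is only a fallback; when TensorFlow is available, tf.make_ndarray also handles dtypes other than float and returns a proper NumPy array.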


  • Did it solve your question ? If so, could you accept the answer ?
  • What if you have multiple values?
  • Can you please clarify, what is the variable "output_names" on which you are iterating.
  • Yes, output_names is a list of your model's outputs. For example, in the screenshot provided in the question it would be something like output_names=['outputs']. In case your model has more than one output it would be something like output_names=['output_1', 'output_2']. Personally, I use the metadata endpoint provided by TensorFlow Serving to retrieve the metadata (including output names), store the output names in a list, and then iterate over that list.