
I have already asked this question on Google AI for Devs but have not received an answer, so I will try my luck here. I am currently testing the inference speeds of different object detection models, and my code works for RT-DETRv2, YOLO11, and YOLOv8. Sadly, YOLOX does not. I have recreated the inference pipeline of a working Python inference script in my Android project. Although the values in the Android input buffer match the raveled NumPy array from the Python version (confirmed visually and by checking positions in the array), the output is vastly different: the model detects nothing on the test image in Android, while the Python version produces correct bounding boxes. The full code with environment specifications is available on GitHub; below is an abridged version.

Python:

if len(img.shape) == 3:
    padded_img = np.ones((input_size[0], input_size[1], 3), dtype=np.uint8) * 114
else:
    padded_img = np.ones(input_size, dtype=np.uint8) * 114

r = min(input_size[0] / img.shape[0], input_size[1] / img.shape[1])
resized_img = cv2.resize(
    img,
    (int(img.shape[1] * r), int(img.shape[0] * r)),
    interpolation=cv2.INTER_LINEAR,
).astype(np.uint8)

padded_img[: int(img.shape[0] * r), : int(img.shape[1] * r)] = resized_img
padded_img = np.ascontiguousarray(padded_img, dtype=np.float32)

interpreter = tf.lite.Interpreter(model_path=MODEL_PATH)
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
interpreter.set_tensor(input_details[0]['index'], padded_img[None, :, :, :])
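
For reference on what "raveled" means here: a NumPy array in HWC layout flattens pixel by pixel with the channels interleaved. The indexing rule can be verified in isolation with a toy array (synthetic values, not the real model input):

```python
import numpy as np

# Toy 2x3 "image" with 3 channels; each element holds its own linear index.
h, w, c = 2, 3, 3
img = np.arange(h * w * c, dtype=np.float32).reshape(h, w, c)

flat = img.ravel()  # C-order: channel varies fastest, then column, then row
for y in range(h):
    for x in range(w):
        for ch in range(c):
            # Flat position of element (y, x, ch) in an HWC buffer:
            assert flat[(y * w + x) * c + ch] == img[y, x, ch]
```

So the flat buffer is R,G,B for pixel (0,0), then R,G,B for pixel (0,1), and so on, which is the order the Android-side float buffer would also need to follow.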

Android (Kotlin):

val imgmat = Mat()
Utils.bitmapToMat(decode, imgmat)
val imgmat3 = Mat()
Imgproc.cvtColor(imgmat, imgmat3, Imgproc.COLOR_RGBA2BGR)
val resizedmat = Mat()
val paddedmat = Mat()
val argbmat = Mat()

val size = Size((1920F * ratio).toDouble(), (1080F * ratio).toDouble())
val scalar = Scalar(114.0, 114.0, 114.0)
Imgproc.resize(imgmat3, resizedmat, size, 0.0, 0.0, INTER_LINEAR)
Core.copyMakeBorder(resizedmat, paddedmat, 0, (imsize - (1080 * ratio)).toInt(), 0, 0, Core.BORDER_CONSTANT, scalar)
val bitmap = createBitmap(paddedmat.cols(), paddedmat.rows(), Bitmap.Config.ARGB_8888)
Imgproc.cvtColor(paddedmat, argbmat, Imgproc.COLOR_RGB2RGBA)
Utils.matToBitmap(argbmat, bitmap)

val image = TensorImage(DataType.UINT8)
image.load(bitmap)
val tensorproc = ImageProcessor.Builder().add(CastOp(INPUT_IMAGE_TYPE)).build()
val proctensor = tensorproc.process(image)
val imageBuffer = proctensor.buffer
val output = TensorBuffer.createFixedSize(intArrayOf(numChannel, numElements), OUTPUT_IMAGE_TYPE)
interpreter.run(imageBuffer, output.buffer)
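
To localize the divergence, one option is to dump the Android-side float buffer to a file (e.g. via a `FileOutputStream`) and compare it element-wise against the Python array. A sketch of the Python side of such a comparison (the dump file name and the helper are hypothetical, not part of the original scripts):

```python
import numpy as np

def compare_buffers(python_input: np.ndarray, dump_path: str) -> int:
    """Return the number of elements that differ between the Python
    input array and a raw float32 buffer dumped from the device."""
    # TFLite buffers on Android are little-endian float32.
    android = np.fromfile(dump_path, dtype="<f4")
    ref = python_input.ravel().astype(np.float32)
    assert android.size == ref.size, "buffer sizes differ"
    mismatches = np.flatnonzero(~np.isclose(android, ref))
    return mismatches.size  # 0 means the two inputs really are identical
```

If this returns 0 and the outputs still differ, the problem is past the input buffer (model file, layout expectation, or output parsing) rather than in the preprocessing.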
   

Most of the related topics concern the endianness of the input buffer; that does not apply here, since the TensorImage datatype handles endianness automatically. My current suspicion is that TensorFlow does not consume the input values in the same order as they appear when the array is raveled, but I do not know how to check this easily. I would appreciate any insight on this topic.
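
One concrete ordering mismatch worth ruling out is NHWC vs. NCHW: YOLOX is exported from PyTorch (NCHW), and if the converted model kept that layout while the buffer is written in HWC order, nearly every value lands on the wrong input element. A small sketch with synthetic data illustrates how different the two flattenings are:

```python
import numpy as np

# Toy (H, W, C) array with distinguishable values, standing in for the
# padded input; the shapes and values here are synthetic.
h, w, c = 4, 4, 3
img_hwc = np.arange(h * w * c, dtype=np.float32).reshape(h, w, c)

as_nhwc = img_hwc.ravel()                     # HWC: R,G,B interleaved per pixel
as_nchw = img_hwc.transpose(2, 0, 1).ravel()  # CHW: full R plane, then G, then B

print(np.array_equal(as_nhwc, as_nchw))  # prints False
```

The layout the TFLite model actually expects can be read from `interpreter.get_input_details()[0]['shape']`: `[1, H, W, 3]` means NHWC, `[1, 3, H, W]` means NCHW, and the Android buffer has to be filled accordingly.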

Thanks in advance.
