How to simply route requests with the same request id to the same model instance? #7861
Unanswered
fighterhit asked this question in Q&A
Replies: 0 comments
How can I simply route all inference requests with the same request id to the same model instance, and still run inference with dynamic_batching? It sounds like this can be achieved with a stateful model by mapping the request id to a sequence id, but that feels too complicated and requires additional control inputs. All I need is routing to the same instance plus dynamic batching. Is there an easier solution? Thanks!
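To make the request id to sequence id idea concrete, here is a minimal sketch of the mapping only. It does not call any Triton API. The function names `request_id_to_sequence_id` and `pick_instance` are hypothetical, and the sketch assumes that a 64-bit unsigned integer is an acceptable sequence id and that 0 is reserved to mean "no sequence".

```python
import hashlib


def request_id_to_sequence_id(request_id: str) -> int:
    """Map a request id string to a stable, non-zero 64-bit integer.

    Hypothetical helper: the same request id always yields the same value,
    so it could serve as a sequence/correlation id on the client side.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big")
    # Assumption: 0 is reserved for "not part of a sequence", so avoid it.
    return value or 1


def pick_instance(request_id: str, instance_count: int) -> int:
    """Choose an instance index deterministically from the request id.

    Hypothetical helper for client-side routing across separately
    addressable instances; not something the server does for you.
    """
    if instance_count <= 0:
        raise ValueError("instance_count must be positive")
    return request_id_to_sequence_id(request_id) % instance_count


if __name__ == "__main__":
    for rid in ["user-17", "user-42", "user-17"]:
        print(rid, request_id_to_sequence_id(rid), pick_instance(rid, 4))
```

Because the hash is deterministic, repeated requests with the same id always map to the same value. Whether that value can be used without the stateful model's control inputs is exactly the open question here.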