This is cool, but it's only the first part of extracting an ML model for use. The second part is reverse engineering the tokenizer and input transformations needed before passing data to the model, and turning the output into a human-readable format.
It would be interesting if someone detailed an approach for decoding the pre- and post-processing steps around the model, and for finding the correct input encoding.
Boils down to "use Frida to find the arguments to the TensorFlow call beyond the model file"
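Something like this, assuming the app goes through the TFLite C API (the function names below are from that API; the module name and hooking input tensor 0 are assumptions about the target):

    // Frida sketch: dump the exact bytes the app feeds the model.
    const lib = "libtensorflowlite_c.so";
    const getInputTensor = new NativeFunction(
      Module.getExportByName(lib, "TfLiteInterpreterGetInputTensor"),
      "pointer", ["pointer", "int32"]);
    const tensorData = new NativeFunction(
      Module.getExportByName(lib, "TfLiteTensorData"),
      "pointer", ["pointer"]);
    const tensorByteSize = new NativeFunction(
      Module.getExportByName(lib, "TfLiteTensorByteSize"),
      "size_t", ["pointer"]);

    Interceptor.attach(Module.getExportByName(lib, "TfLiteInterpreterInvoke"), {
      onEnter(args) {
        const input = getInputTensor(args[0], 0);
        const size = tensorByteSize(input).toNumber(); // 64-bit process assumed
        // The pre-processing (normalization, tokenization) is visible right here.
        console.log(hexdump(tensorData(input), { length: Math.min(size, 256) }));
      },
    });

Float32 pixels, int32 token IDs, whatever it is: once you can see the buffer, the pre-processing stops being a mystery.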
The key here is that a binary model is just a bag-of-floats with primitively typed inputs and outputs.
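Concretely, for a hypothetical image classifier (the shapes are illustrative, not from the article), the entire contract is something like:

    // Everything crossing the model boundary is a primitively typed tensor.
    type Input = Float32Array;  // e.g. shape [1, 224, 224, 3]: pre-processed RGB pixels
    type Output = Float32Array; // e.g. shape [1, 1000]: one score per class
    // Netron shows you exactly this much: dtypes and shapes. How the pixels
    // got normalized and how scores map back to labels lives in app code.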
It's ~impossible to write up more than what's here because either:
A) you understand reverse engineering and model basics, so the current content already makes it clear you'd use Frida to figure out how the arguments are passed to TensorFlow,
or
B) you don't understand that this is a binary reverse engineering problem, even when shown Frida. Any additional content would read as specific to one particular app, which it would have to be. You'd also need a hand-held walkthrough of batching, tokenization, and so on: too much for a write-up, and too confusing to follow for any other model.
TL;DR: a request for more content is asking a reverse engineering article to give you a full education on model inference.
> It's ~impossible to write up more than what's here
Except you just did - or at least you wrote an outline for it, which is 80% of the value already.
The more impolite version of this basically says "If you can't figure out that you're supposed to also use Frida to check the other arguments, you have no business trying." I agree, but wrote the more polite version.
> TL;DR: a request for more content is asking a reverse engineering article to give you a full education on model inference.
I don't understand what you mean: I have no clue about anything related to reverse engineering, but I ported the Mistral tokenizer to Rust and also wrote a basic CPU Llama training and inference implementation in Rust, so I definitely wouldn't need an intro to model inference…
You're also not the person I was replying to, nor do you appear anywhere in this comment chain, so I've definitely not implied that you need an intro to inference. I'm even more confused than you :)
I share the sentiment of the person you're responding to, and I didn't understand your response, that's it.
This is a good comment, but only in the sense that it documents that a model file doesn't run itself.
An analogous situation: seeing a blog post that purports to "show you code", where the code returns an object, and commenting "This is cool, but it doesn't show you how to turn a function return value into a human-readable format." More noise than signal.
The techniques in the article are trivially understood to also apply to discovering the input tokenization format, and Netron shows you the types of inputs and outputs.
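For instance, if the hook upthread shows int32 IDs instead of floats, a vocab file usually ships next to the model in the package, and checking a candidate takes a few lines (the file name, format, and sample IDs here are assumptions, following the common WordPiece/BERT convention):

    import { readFileSync } from "fs";

    // Line number == token ID, one token per line (BERT-style vocab.txt).
    const vocab = readFileSync("vocab.txt", "utf8").split("\n");

    function decode(ids: number[]): string {
      // "##" marks a word continuation in WordPiece; strip it when joining.
      return ids
        .map((id) => vocab[id])
        .map((tok, i) => tok.startsWith("##") ? tok.slice(2) : (i ? " " : "") + tok)
        .join("");
    }

    // If the dumped IDs decode to readable text framed by [CLS]/[SEP],
    // you've identified the tokenizer family.
    console.log(decode([101, 7592, 2088, 102])); // e.g. "[CLS] hello world [SEP]"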
Thanks for the article, OP, really fascinating.
Just having the shapes of the input and output is not sufficient: the image (in this example) needs to be normalized. It's presumably not difficult to find the exact numbers, but it is a source of errors when reverse engineering an ML model.
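FWIW you can pin those numbers down empirically rather than by reading code: feed the app an image with known pixel values, dump the float input buffer (e.g. with the Frida hook upthread), and solve for the affine transform. A sketch, assuming the common form normalized = (pixel/255 - mean) / std (the function and its names are mine, not from the article):

    // Recover normalization constants from two known pixels.
    // p0, p1: raw 0-255 values you placed in a test image;
    // n0, n1: floats observed at the same positions in the dumped buffer.
    function solveNormalization(p0: number, n0: number, p1: number, n1: number) {
      const std = (p0 / 255 - p1 / 255) / (n0 - n1);
      const mean = p0 / 255 - n0 * std;
      return { mean, std }; // ImageNet-trained models often land near 0.485 / 0.229 per channel
    }

Two probe pixels per channel are enough, and a third makes a good sanity check.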
Right, you get it: it's a Frida problem.
If you can't fix this with a little help from ChatGPT or Google, you frankly shouldn't be building models, let alone mucking with other people's...