FastVLM: Efficient vision encoding for vision language models

(github.com)

367 points | by nhod 7 months ago

79 comments