MiniCPM-V 2.6
https://github.com/OpenBMB/MiniCPM-V
MiniCPM-V is a series of end-side (on-device) multimodal LLMs (MLLMs) designed for vision-language understanding. The models accept images, video, and text as input and produce high-quality text output. Since February 2024, we have released five versions of the model, each aiming for strong performance and efficient deployment. The most notable models in the series currently include: