New open Vision Language Model by @Google: PaliGemma
- Comes in 3B; pretrained, mix, and fine-tuned models at 224, 448, and 896 resolution
- Combination of the Gemma 2B LLM and the SigLIP image encoder
- Supported in 🤗 transformers

PaliGemma can do:
- Image segmentation and detection!
- Detailed document understanding and reasoning
- Visual question answering, captioning, and any other VLM task!

Read our blog: hf.co/blog/paligemma
Try the demo: hf.co/spaces/google/paligemma
Check out the Spaces and the models in the collection: google/paligemma-release-6643a9ffbf57de2ae0448dda
Collection of fine-tuned PaliGemma models: google/paligemma-ft-models-6643b03efb769dad650d2dda
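Here is a minimal sketch of captioning an image with PaliGemma in 🤗 transformers. The checkpoint id (google/paligemma-3b-mix-224) and the image URL are illustrative choices, not prescriptions; see the collection and model cards above for the full list of checkpoints and prompt prefixes.

```python
# Minimal PaliGemma captioning sketch (assumes a recent transformers release with PaliGemma support).
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration
from PIL import Image
import requests

model_id = "google/paligemma-3b-mix-224"  # example mix checkpoint at 224 resolution
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

# Placeholder image URL, any RGB image works here.
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Mix checkpoints use task-prefix prompts such as "caption en" or "detect car".
prompt = "caption en"
inputs = processor(text=prompt, images=image, return_tensors="pt")

# Keep only the newly generated tokens when decoding.
input_len = inputs["input_ids"].shape[-1]
output = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(processor.decode(output[0][input_len:], skip_special_tokens=True))
```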
Two new VLM benchmarks!
- BLINK (BLINK-Benchmark/BLINK): evaluates tasks that humans can solve within a blink
- SEED-2-Plus (AILab-CVC/SEED-Bench-2-plus): multiple-choice questions on charts, maps, and webs
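A quick sketch of pulling one of these benchmarks with the datasets library, assuming the repos are loadable directly from the Hub; the config and split names are listed on the dataset cards, so we query them rather than hard-coding any.

```python
# Browse the BLINK benchmark subtasks and load one of them (names come from the Hub, not hard-coded).
from datasets import get_dataset_config_names, load_dataset

configs = get_dataset_config_names("BLINK-Benchmark/BLINK")
print(configs)  # list of subtask configs

ds = load_dataset("BLINK-Benchmark/BLINK", configs[0])
print(ds)       # DatasetDict with the splits defined on the dataset card
```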
Fun Spaces
- AI Comic Factory (Running on CPU Upgrade, 6.56k): Create your own AI comic with a single prompt
- LoRA the Explorer (Running on A100, 851)
LLM Playgrounds
- Falcon-180B Demo (Running, 934)
- Co Write With Llama2 (Sleeping, 161)
- LLMs As Chatbot (Runtime error, 80)
- Explore Llamav2 With TGI (Running on CPU Upgrade, 1.2k)
BLIP2 with transformers (Running on Zero, 33): BLIP-2 (cutting-edge image captioning) in 🤗 transformers
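For reference, a minimal BLIP-2 captioning sketch with 🤗 transformers. The checkpoint (Salesforce/blip2-opt-2.7b) and the image URL are assumptions for illustration; the Space above may use a different checkpoint.

```python
# Minimal BLIP-2 image captioning sketch with transformers.
from transformers import Blip2Processor, Blip2ForConditionalGeneration
from PIL import Image
import requests

model_id = "Salesforce/blip2-opt-2.7b"  # example BLIP-2 checkpoint
processor = Blip2Processor.from_pretrained(model_id)
model = Blip2ForConditionalGeneration.from_pretrained(model_id)

# Placeholder image URL for illustration only.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# No text prompt: plain image captioning.
inputs = processor(images=image, return_tensors="pt")
generated_ids = model.generate(**inputs, max_new_tokens=30)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip())
```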