One doc tagged with "vision"

8 Multi-modal Prompting

Vision-text prompting, document understanding, video analysis, and cross-modal reasoning with GPT-4V, Claude, and Gemini