About
This skill gives Claude access to the Google Gemini API suite for image reasoning, long-form audio transcription, video scene analysis, and media generation. It extends Claude's built-in vision capabilities with specialized models such as Imagen 4 for text-to-image and Veo 3 for text-to-video generation. Typical uses include extracting structured data from complex PDFs, generating marketing assets from text prompts, and performing temporal analysis on hours of video footage. The skill provides scripts and reference patterns for working with context windows of up to 2 million tokens, with built-in API key rotation and media optimization.
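The description mentions API key rotation without showing how it works. One common approach, sketched here as an assumption rather than as the skill's actual implementation, is a round-robin rotator that moves to the next key when a request hits a quota error. The names `KeyRotator`, `QuotaExceeded`, and `call_with_rotation` are hypothetical and do not come from the skill or from the Gemini SDK.

```python
class QuotaExceeded(Exception):
    """Hypothetical stand-in for whatever quota error the real client raises."""


class KeyRotator:
    """Round-robin over API keys, skipping keys already marked exhausted."""

    def __init__(self, keys):
        if not keys:
            raise ValueError("at least one API key is required")
        self._keys = list(keys)
        self._exhausted = set()
        self._index = 0

    def current(self):
        """Return the key that should be used for the next request."""
        return self._keys[self._index]

    def rotate(self):
        """Mark the current key exhausted and advance to the next usable key."""
        self._exhausted.add(self._keys[self._index])
        for _ in range(len(self._keys)):
            self._index = (self._index + 1) % len(self._keys)
            if self._keys[self._index] not in self._exhausted:
                return self._keys[self._index]
        raise RuntimeError("all API keys are exhausted")


def call_with_rotation(rotator, request_fn):
    """Call request_fn(key), rotating keys whenever QuotaExceeded is raised.

    request_fn is a caller-supplied function (an assumption of this sketch)
    that performs one API request using the given key.
    """
    while True:
        try:
            return request_fn(rotator.current())
        except QuotaExceeded:
            # Raises RuntimeError once every key has been exhausted,
            # which ends the loop.
            rotator.rotate()
```

In use, `request_fn` would wrap a single API call and translate the client's quota error into `QuotaExceeded`. A production version would likely also reset exhausted keys after the quota window elapses, which this sketch omits.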