Beyond seeing: ChatGPT now reasons visually
Yes, earlier ChatGPT models could already see and read images, but this is different. This week, OpenAI introduced o3 and o4‑mini, new ChatGPT models that pause to reason more deeply and can even “see” and work with images on their own.
o3 and o4‑mini don’t just “see” images; they think with them, transforming raw visuals into multi‑step insights such as converting a hand‑drawn flowchart into executable pseudocode, extracting key metrics from a sales chart to build forecasts, or turning a sketched floor plan into step‑by‑step navigation instructions.
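To make the flowchart example concrete, here is the kind of runnable code such a conversion might yield. This is a hypothetical illustration, not actual model output; the order‑processing flowchart and every name in the sketch are invented:

```python
# Hypothetical output for a sketched flowchart that reads:
# start -> next order -> "in stock?" -> yes: ship / no: backorder -> repeat
def process_orders(orders, inventory):
    """Walk each order through the drawn decision flow."""
    for order in orders:                                    # loop-back arrow
        if inventory.get(order["sku"], 0) >= order["qty"]:  # "in stock?" diamond
            inventory[order["sku"]] -= order["qty"]         # "ship" box
            order["status"] = "shipped"
        else:
            order["status"] = "backordered"                 # "backorder" box
    return orders

# Example: one order that ships, one that cannot.
print(process_orders(
    [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}],
    {"A1": 5},
))
```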
- Deeper thinking across tasks: o3 tackles complex coding, math, science, and real‑world problems by reasoning longer before replying. In expert tests, it makes about 20 percent fewer major mistakes than earlier models on tough challenges.
- Think with images: Upload a photo of a whiteboard, diagram, or sketch, and the model factors that visual into its reasoning. It can rotate, zoom, or enhance blurry or reversed images to understand exactly what’s shown.
- Automatic tool use: Without extra instructions, these models decide when to search the web, run Python code, analyze files, or generate images, chaining all those steps into one seamless answer in under a minute (see the API sketch after this list).
- Speed and affordability: o4‑mini is a leaner model tuned for faster replies and lower cost. It excels at math, coding, and visual tasks, scoring 99.5 percent on the 2025 AIME when allowed to run Python, and it supports higher usage limits.
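For developers, that same autonomy is reachable outside the chat UI. The sketch below uses the OpenAI Python SDK’s Responses API to hand the model a web‑search tool that it may invoke at its own discretion; the model id and tool type reflect the SDK at the time of writing, so treat them as assumptions rather than a guaranteed interface:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Give the model a web-search tool; it decides on its own whether,
# when, and how often to call it while reasoning.
response = client.responses.create(
    model="o3",  # assumption: model id as exposed through the API
    tools=[{"type": "web_search_preview"}],
    input=(
        "What changed in the latest stable Python release, "
        "and does it affect type hints? Cite your sources."
    ),
)

print(response.output_text)  # SDK convenience accessor for the final text
```

Because tool choice is the model’s call, the same request may or may not trigger a live search, which is exactly the “decide when” behavior described above.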
These capabilities have already sparked a new trend online. By giving ChatGPT true “vision” and autonomy, o3 and o4‑mini can turn almost any photo into a reverse location hunt: they crop, rotate, or zoom in on subtle visual hints such as building façades, street signs, or skyline profiles, and then seamlessly trigger web searches to match those clues against online maps and landmark databases.
Prompt to try in ChatGPT with o3:
“Here’s a photo of our Q1 sales dashboard from the whiteboard. Can you extract the key trends, highlight any anomalies, and suggest 2 actions we should consider for Q2?”
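To run that same prompt programmatically rather than in the chat UI, a minimal sketch, again assuming the OpenAI Python SDK’s Responses API and a hypothetical local file name, might look like this:

```python
import base64

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical local snapshot of the whiteboard dashboard.
with open("q1_dashboard.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.responses.create(
    model="o3",  # assumption: model id as exposed through the API
    input=[{
        "role": "user",
        "content": [
            {
                "type": "input_text",
                "text": (
                    "Here's a photo of our Q1 sales dashboard from the whiteboard. "
                    "Can you extract the key trends, highlight any anomalies, and "
                    "suggest 2 actions we should consider for Q2?"
                ),
            },
            {
                "type": "input_image",
                "image_url": f"data:image/jpeg;base64,{image_b64}",  # inline image payload
            },
        ],
    }],
)

print(response.output_text)
```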