Generative AI is Having its Multimodal Moment – So is Search

March 14, 2023

Team Objective

GPT-4 came out today and set a new milestone in Large Language Model (LLM) performance across the board. It's more creative and reliable, and it can understand more nuanced instructions, but GPT-4 is especially notable for two things:

Multimodality. It can understand different types of input: in this case, text and images.
Greater reasoning ability.

With these capabilities, GPT-4 achieves human-level performance on various professional and academic benchmarks, according to OpenAI.

What Multimodality Enables
GPT-4 can take a combination of text and image inputs and generate text outputs. It can understand the content in images, diagrams, illustrations, and other visual types, and infer things about them.

Let’s look at an example. The prompt below pairs an image with text asking GPT-4 what's “unusual” about the image, and GPT-4 responds with the salient point.

You can also ask it to explain a meme step by step. It has even achieved human-level performance on various academic tests involving visual inputs (diagrams, equations, etc.).

Multimodal Search Has Arrived
While GPT-4 is opening a new world of possibilities in generative AI, multimodal LLMs also open new possibilities in search. In the same way that GPT-4 can see images and understand complex text, search systems can now read text and images to prepare the perfect results.
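To make the idea concrete, here is a minimal sketch of one common way to search across text and images: map both into a shared vector space and rank by similarity. This is an illustration under stated assumptions, not a description of how GPT-4 or Objective's product works. The item ids, the tiny hand-written vectors, and the `search` function are all hypothetical; a real system would get its vectors from a multimodal encoder.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical index: hand-written 3-d vectors stand in for the output of
# a multimodal encoder. Images and text share the same vector space.
INDEX = [
    {"id": "photo-cathedral", "kind": "image", "vector": [0.9, 0.1, 0.0]},
    {"id": "article-domes", "kind": "text", "vector": [0.8, 0.2, 0.1]},
    {"id": "photo-beach", "kind": "image", "vector": [0.0, 0.1, 0.9]},
]


def search(query_vector, index, top_k=2):
    """Return the ids of the top_k items most similar to the query."""
    scored = [(cosine(query_vector, item["vector"]), item["id"]) for item in index]
    scored.sort(reverse=True)  # highest similarity first
    return [item_id for _, item_id in scored[:top_k]]


# Assumption: this vector is what an encoder might produce for the
# query "renaissance architecture".
results = search([1.0, 0.1, 0.0], INDEX)
print(results)
```

Because the image and the article sit in the same space, a single query can surface both without separate keyword and image pipelines.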

At Objective, we’re building multimodal search to produce astonishing results by matching the way you think and speak. In a world that is moving increasingly to non-text media, the possibilities are endless.

We’re reinventing core search experiences for media companies, e-commerce, marketplaces, and much more.

Want to see what multimodal search can do for you? Get in touch.

References:
https://openai.com/research/gpt-4

https://www.objective.inc/demos/comparisons/stock-photo?query=renaissance+architecture
