If Ogden Nash had rewritten Hamlet's famous soliloquy, he might have come up with this:
Semantic Search
Wednesday, February 15, 2023
ChatGPT Teaches Ogden Nash How to Do Shakespeare
The Beatles Go to Prison -- Folsom Prison -- and Rewrite "The Fool on the Hill" (with a Little Help from ChatGPT)
Here's an example of how well ChatGPT can understand things: I told it to rewrite the lyrics to "The Fool on the Hill" to match the rhythm and meter of Johnny Cash's famous hit "Folsom Prison Blues."
Here's what it came up with:
Well, I hear there's a fool, up on that hill, Watchin' the world, with a heart so still, He ain't like the rest, he's got his own pace, And he sees the world, in a different way.
He don't care 'bout the money, or the fame, He's got his own way, and he won't play the game, He's just happy with his own simple ways, And he'll keep on smilin', through all his days.
The fool on the hill, he's got his own view, And he don't care 'bout the things that we do, He'll keep on smilin', through thick and thin, And he'll be happy, with the life he's in.
So don't you judge him, for his different way, He's got his own rhythm, and he'll dance all day, And if you listen, you might just hear, The beat of his heart, and the rhythm of his cheer.
The fool on the hill, he's a lesson to learn, That life ain't always, what it may seem, So take a step back, and hear his tune, And you might just find, your own new groove.
Oh, the fool on the hill, he'll keep on goin', Through all of life's twists and turns, he'll keep on knowin', That the world may change, but he'll stay the same, And he'll keep on smilin', through sunshine and rain.
The Dawn of Semantic Search with the Advent of GPT-3 and ChatGPT
This blog has lain fallow for a couple of years, but it has now reawakened with the help of OpenAI's large language models. ChatGPT and GPT-3 are amazing tools. What amazes me is not only ChatGPT's impressive ability to answer questions but, even more, its ability to carry on a dialogue with humans and understand our questions.
Here's an example: Years ago I struggled to find a Prolog predicate that would efficiently generate all the subsets of a given set. I looked on stacktrace and at SWI-Prolog's web site. I downloaded code that claimed to work, but to no avail.
Today, in a matter of minutes, ChatGPT was able to generate this very simple pair of predicates that can generate -- very efficiently, as far as I can tell -- all the subsets of a given set:
The subsets/2 predicate (which I would probably relabel subset/2) is so simple
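The predicates themselves are not reproduced on this page. For readers who want to see the general idea, here is a minimal sketch of the usual recursive approach, written in Python rather than Prolog: each element of the set is either kept or left out, and the subsets of the remaining elements are built recursively. This is my own illustration under that assumption, not the code ChatGPT produced, and the function name subsets is chosen only for the example.

```python
def subsets(items):
    """Yield every subset of `items` as a list, preserving element order.

    Illustrative sketch only: for each element there are two choices,
    keep it or drop it, so a list of n items yields 2**n subsets.
    """
    if not items:
        # The empty set has exactly one subset: itself.
        yield []
        return
    head, tail = items[0], items[1:]
    for rest in subsets(tail):
        yield [head] + rest  # subsets that include the first element
        yield rest           # subsets that leave it out


if __name__ == "__main__":
    for s in subsets(["a", "b", "c"]):
        print(s)
```

Because the function is a generator, the subsets are produced one at a time, much as a Prolog predicate would produce them one at a time on backtracking.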
Sunday, July 6, 2014
Welcome to Semantic Search
The first subject is the little predicate 'is about'. We might say that a web page is about Babe Ruth. What does that mean? It may mean that the string 'Babe Ruth' occurs in the web page. Or it may mean that other semantically related strings like 'The Babe', 'Ruth', or 'George Herman Ruth' occur within the page.
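The naive, string-matching reading of 'is about' can be sketched in a few lines of Python. This is only an illustration of the idea in the paragraph above: the function names and the alias list are my own assumptions, and a real search engine does far more than this.

```python
import re


def mentions(page_text, alias):
    """Return True if `alias` occurs in `page_text` as a whole word or phrase."""
    # Word boundaries keep 'Ruth' from matching inside 'truth'.
    pattern = r"\b" + re.escape(alias) + r"\b"
    return re.search(pattern, page_text, flags=re.IGNORECASE) is not None


def is_about(page_text, aliases):
    """Naive 'is about': the page mentions at least one of the aliases."""
    return any(mentions(page_text, alias) for alias in aliases)


# Hypothetical alias list for the example in the text.
BABE_RUTH_ALIASES = ["Babe Ruth", "The Babe", "Ruth", "George Herman Ruth"]

if __name__ == "__main__":
    page = "George Herman Ruth was born in Baltimore in 1895."
    print(is_about(page, BABE_RUTH_ALIASES))
```

Note that this test says nothing about whether the page is honest or authoritative; it only checks that one of the strings appears.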
But there is a problem with 'is about'. How do we know that a page is REALLY about Babe Ruth and doesn't just pretend to be? Maybe it is really a porno page that is trying to lure kids who are fascinated by Babe Ruth. And even if the page is truly about Babe Ruth, how do we know that it is authoritative? Is it a blog expressing some guy's opinion about Ruth, or is it a carefully researched, well-documented, and well-cited study of the life and deeds of Babe Ruth?
Furthermore, there are many adverbial qualifiers that could be added to 'is about', such as 'is only about', 'is mostly about', 'is also about'. The first of these, 'is only about', might suggest that the web page contains information only about Babe Ruth and not, say, about Lou Gehrig, except perhaps incidentally. And what does 'is mostly about' mean anyway?
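One crude way to give 'is mostly about' a meaning is to count how often each subject is mentioned and compare the shares. The sketch below does exactly that. It is an assumption on my part, not an established definition: the function names, the alias lists, and the idea that mention counts measure aboutness are all hypothetical, and the aliases for a subject should not overlap or a phrase will be counted twice.

```python
import re


def count_mentions(page_text, aliases):
    """Count whole-word, case-insensitive occurrences of any alias in the text."""
    total = 0
    for alias in aliases:
        pattern = r"\b" + re.escape(alias) + r"\b"
        total += len(re.findall(pattern, page_text, flags=re.IGNORECASE))
    return total


def aboutness_share(page_text, subjects):
    """Map each subject name to its share of all subject mentions on the page.

    `subjects` maps a subject name to a list of non-overlapping aliases.
    """
    counts = {name: count_mentions(page_text, aliases)
              for name, aliases in subjects.items()}
    total = sum(counts.values())
    if total == 0:
        # No subject is mentioned at all.
        return {name: 0.0 for name in counts}
    return {name: count / total for name, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical subjects and page text for illustration.
    subjects = {"Babe Ruth": ["Ruth", "The Babe"], "Lou Gehrig": ["Gehrig"]}
    page = ("Ruth hit 60 home runs in 1927. The Babe was a legend. "
            "Ruth and Gehrig batted back to back.")
    print(aboutness_share(page, subjects))
```

Under this reading, a page would be 'mostly about' a subject when its share passes some threshold, and 'only about' it when the share is close to 1. Where to put the threshold is exactly the kind of question that makes semantics hard.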
So, you see, the subject of semantics is infinitely more nuanced and more difficult than the subject of syntax. The latter, after all, seems to be mostly limited to parsing string patterns of various kinds. But teasing the meaning out of a word is highly dependent on many subtle, contextual factors like the mood and intent of the speaker, the sentence in which it is embedded, the paragraph in which the sentence is embedded, etc. There is also something that the hearer (or reader, in the case of a web page) brings to communication. We've all heard the expression 'said apples, but heard oranges,' meaning, of course, that the hearer simply heard something completely different from what the speaker intended.
All of this makes web search based on the semantic intent of the searcher and the semantic content of web pages very difficult to accomplish accurately. What do search engines do to associate search keywords with indexed web pages? That is the subject of the next blog post.