What is the current stage of AI? We have a strong feeling that something is coming that will disrupt the way we work and live. The question is, how far do we still have to go?
TLDR: distance-wise, very far; but we are approaching it at an accelerating speed.
Let's look at the history of the search engine.
Long before we had "Google", and long before most people had any concept of a search engine, there was a subject called Information Retrieval. It is the academic name behind search engines, and the earliest records of computerised IR date back to the 1950s. Only a handful of people in the world cared about the subject back then.
Google's PageRank algorithm popularised the importance of quality as a factor independent of relevance (the match between query and document). Its successors all have relevance × quality as the foundational model, and they differ only in how the two terms are broken down into more detail.
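To make the relevance × quality idea concrete, here is a minimal Python sketch. Everything in it (the `Doc` class, the `rank` function, the titles and the scores) is made up for illustration; it is not how Google or any real engine computes its ranking, only a toy showing why a query-independent quality term changes the order.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    title: str
    relevance: float  # hypothetical query-document match score in [0, 1]
    quality: float    # hypothetical query-independent score in [0, 1]


def rank(docs):
    """Order documents by relevance * quality, highest first."""
    return sorted(docs, key=lambda d: d.relevance * d.quality, reverse=True)


# Invented example data: the spam page matches the query best,
# but its low quality score pushes it to the bottom.
docs = [
    Doc("keyword-stuffed spam page", relevance=0.9, quality=0.1),
    Doc("well-regarded parents forum", relevance=0.7, quality=0.8),
    Doc("reputable but off-topic page", relevance=0.2, quality=0.9),
]

ranked_titles = [d.title for d in rank(docs)]
print(ranked_titles)
```

With relevance alone, the spam page would come first. Multiplying by quality moves the parents forum to the top, which is roughly the effect described above.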
Furthermore, personalisation became a standard component of search engine algorithms. Its major source was click data, later joined by more implicit feedback such as scrolls and views.
Technically, we can see a rough timeline:
- Information Retrieval -- when the technology is only of academic interest.
- Search engine -- when the technology is widely commercialized and used by businesses and corporations. (There was a time when the search engine was not a standalone product, but a business solution that powered "portals".)
- Quality advancements (e.g. PageRank) -- when the technology becomes roughly useful for laypeople.
- Personalization -- when the technology becomes ambient and users no longer have to think (now we're used to picking the first result from Google directly, but 20 years ago we had to scroll down and check many results to find the best one).
The tipping point was when "Google it" became a synonym for "search it". That was the moment when most people had a chance to incorporate search engines into their daily lives and use them as naturally as breathing.
If we compare this to GenAI, the distance seems pretty far. We hear that many people are "using ChatGPT" to do something today, but only because we live in our own technologist echo chamber. If you ask a layperson to explain "ChatGPT", I doubt many could spell out "G-P-T" and explain what it means, let alone how the "chat" part relates to GPT.
Think of a question: any ideas for a birthday gift for a 6-year-old girl? -- You would find this answer pretty dumb: "Let's use an Information Retrieval system to find the top parents' forum and check its sub-forum about birthday gifts to gather some ideas." -- A more natural answer for our moms and dads is: "Um, let's Google it."
So the key here is: as long as GenAI/ LLM/ ChatGPT is still being used consciously, the technology has not yet exploded.
The closest I have come to a GenAI tipping point was yesterday, when I heard someone say "I poe-d it". Maybe we'll hear more "Let's Bing it" and "Let's Bard it" later, but those do not sound very natural to me (as a non-native English speaker). It will be interesting to see which noun first becomes a verb in the GenAI world; when that happens, we'll know the technology is sophisticated and generally available to everyone.
Although the distance looks very far, we are approaching the "Google it" moment very fast, and the pace is still accelerating.
We already see many SMEs using GenAI to build solutions that aggregate into bigger products and business value. This is very much like the era of portals, when the search engine was not an end-user product but a key component in solving some business problems.
Also, the Microsoft + ChatGPT experiment was a quite successful first step in driving quality advancements, i.e. launching billion-scale alignment with real user feedback. This is a very important step, because an LLM is a huge black box: even its inventors do not know what is inside until query time, and it is hard to enumerate the entire query space and hard to check the results even if the space could be enumerated. This problem is not new to LLMs. In the early years of search engines, nobody knew exactly what content the crawlers had collected. Only when many real users were onboarded did the search engine operators discover malicious uses such as searching for porn/ drugs/ arms/ personal information/ ...
If we map the above two observations onto the 4 stages of the search engine, we can already see good progress on stage 2 and stage 3. Mark Zuckerberg, in a recent interview with Lex Fridman, emphasized personalization of GenAI and shared his vision of the creator economy multiple times. This area maps to stage 4 in the list above, and it has seen only limited trials so far. However, when we review the search engine timeline, the "Google it" moment happened around stage 2 and stage 3, long before stage 4 was mature. That is why I think we are approaching the GenAI "Google it" moment fast, and at an accelerating pace.
Now, imagine a time machine could take you back to stage 2/ stage 3 of the search engine. What product or business would you create? What would be its analogy in the GenAI world?
Well, I may revisit that topic many times in the coming months, and I'm sure a lot of my predictions could go awfully wrong.
There is only one thing I'm certain of: if I could go back to 1997-1998, I would see the Internet Bubble coming... The Internet companies of the time that were building the infrastructure (routers, switches, base stations, fibres, etc.) would run into serious financial crises because of the arms race. The infrastructure left behind after the bubble would become cheap and would suffice for years to come. The developers working on "just an application using the Internet" became the Internet companies later on.