Ged Carroll

NORA

Last week I heard the acronym NORA mentioned with regard to the kind of problems that Microsoft’s algorithms could solve. NORA stands for ‘no one real answer’. Search is already pretty good at answering questions like ‘what time is it in Osaka’ or ‘what is the capital of Kazakhstan’.

In the mid-2000s, NORA would have been called ‘knowledge search‘ by the people at Google, Yahoo! and Microsoft – the main search engine companies of the time. So it’s not a new idea in search, despite what one might believe from the hype around chatbot-enabled search engines. ChatGPT and other related generative AI tools have been touted as possible routes to knowledge search.

Knowledge search

Back when I worked at Yahoo!, the idea of knowledge search internally was about trying to carve out a space that was useful and differentiated from Google’s approach, as defined by their mission:

To organise the world’s information and make it universally accessible and useful

Our approach to search – Google

Google was rolling out services that went beyond searching the web. It also covered maps and the content of books, including rare library collections and academic journals. It was organising the key news stories and curating which publications were surfaced in relation to each story. It could tell you the time elsewhere in the world and convert measures from imperial to metric.

Google’s Gmail set the standard in organising our personal information, making the email inbox more accessible and searchable than it had been previously. We take an indexed, searchable hard drive for granted now, but at one time Google Desktop put search of the files on your computer together with online services in one small search box.

Google Desktop Mac

Being as good as Google was just table stakes. So when I was at Yahoo! we had our own version of Google Desktop. We bought Konfabulator, which put real-time data widgets on your desktop, and we were thinking about how to do the same on the smartphone OS of the time, Nokia’s Symbian S60. Konfabulator’s developer Arlo Rose went on to work on Yahoo!’s mobile experiences and Yahoo! Connected TV – a proto-smart TV system that predated the modern Apple TV apps. Tim Mayer led a project to build out an index of the web for Yahoo! as large as, if not larger than, Google’s at the time. And all of these developments were just hygiene factors.

My colleagues at Yahoo! were interested in opinions, or NORA, which is where the idea of knowledge search came in. Knowledge search had a number of different angles to it.

ChatGPT-type services in search are positioned as an alternative to human-powered services: they create NORA answers through machine-generated content based on large data sets trawled from the web.

Energy consumption

A conventional Google internet search was claimed to consume 0.3 watt-hours of energy, according to Google sources who responded to the New York Times back in 2011. This was back when Google claimed it was processing about one billion (1,000,000,000) searches per day. Search accounted for just over 12 million of the 260,000,000 watts of power that Google’s global data centres drew on a continuous basis. The rest came from app downloads, maps and YouTube videos.
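
As a sanity check on those figures, here is a rough back-of-the-envelope sketch in Python; the numbers are simply the 2011 ones quoted above, not new data.

```python
# Back-of-the-envelope check of the 2011 figures quoted above.
ENERGY_PER_SEARCH_WH = 0.3          # Google's claimed energy per search, in watt-hours
SEARCHES_PER_DAY = 1_000_000_000    # roughly one billion searches per day in 2011
TOTAL_POWER_W = 260_000_000         # Google's disclosed continuous data centre draw, in watts

daily_search_energy_wh = ENERGY_PER_SEARCH_WH * SEARCHES_PER_DAY  # ~300 million Wh per day
average_search_power_w = daily_search_energy_wh / 24               # ~12.5 million watts

share_of_total = average_search_power_w / TOTAL_POWER_W            # roughly 5 percent

print(f"Search draws roughly {average_search_power_w / 1e6:.1f} MW, "
      f"about {share_of_total:.0%} of the 260 MW total")
```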

But we also know that the number of Google searches has ramped up considerably since those publicly disclosed 2011 numbers.

Google global search volume

The driver for this increase was mobile search, including more energy-intensive Google Lens and voice-activated searches, thanks to Android.

Large language models (LLMs) are computationally intensive, and their use in search will bring a corresponding rise in energy consumption. That has implications for business profit margins as well as ESG-related considerations.

Legal liabilities

With NORA content being created by machine learning services, the legal position might be different to that of the previous generation of knowledge search services. Those services were platforms; machine learning services arguably become publishers.

This becomes important for a few reasons:

“no provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider”

Communications Decency Act of 1996, Section 230

Section 230 has meant that Facebook, Google et al have been regulated with a light touch because they haven’t been ‘publishers’; with ChatGPT this may change. The question of whether an algorithm is a creator has some precedent: financial reporting has used machine learning to generate news stories on company results for a number of years. Combine that with the general political antipathy towards Meta and Alphabet from both of the main US political parties, and things could get interesting very fast.

It is interesting that OpenAI is putting a lot of thought into ethics in LLMs, which will shape future services and, they probably hope, stave off regulation.

Regulated industries and liability

Given an LLM’s ability to make things up, it can expose businesses in regulated industries to real liability.

This is just the tip of the iceberg of issues that NORA content powered by LLMs will face.

Business model disruption

Search advertising as we know it has stayed much the same for the past two decades. The disruption to the look and feel of search results through Bing’s chat responses threatens Google’s advertising model, with its search ads along the top and down the right-hand side of the search engine results page. Instead, you end up with the ‘correct’ answer and no reason to click on the search adverts.

Currently, if a non-relevant site shows up in Google results, the lack of relevance is blamed on the site rather than the search engine. An error in a machine-generated NORA response, however, will see the search engine itself blamed.

Which is pretty much what happened when Google demonstrated its efforts in the area. Inaccuracies in a demonstration held in Paris caused Alphabet’s share price to decline by 7 percent in one day. Technology news site TechCrunch even went as far as to say that Google was losing control.

Microsoft probably doesn’t have a lot to lose with Bing, so integrating ChatGPT’s LLM might gain it a few percentage points of search market share. Microsoft thinks that each percentage point gained would be worth 2 billion dollars in extra revenue.

The 2 billion figure is an estimate, and we don’t know how NORA results generated by LLMs will affect bidding on search keywords. That 2 billion might turn out to be a lot less.
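
To make that sensitivity concrete, here is a rough sketch: the 2-billion-dollars-per-share-point figure is Microsoft’s own estimate quoted above, while the ‘ad click retention’ factors are purely hypothetical placeholders for the unknown effect of NORA answers on keyword bidding.

```python
# Rough sensitivity sketch of the Bing upside discussed above.
# The $2bn-per-share-point figure is Microsoft's public estimate;
# the ad-click retention factors are illustrative assumptions only.
VALUE_PER_SHARE_POINT_USD = 2_000_000_000

for share_points_gained in (1, 3, 5):
    for ad_click_retention in (1.0, 0.75, 0.5):  # assumed fraction of ad clicks that survive chat answers
        upside = share_points_gained * VALUE_PER_SHARE_POINT_USD * ad_click_retention
        print(f"{share_points_gained} pt(s) gained, {ad_click_retention:.0%} ad clicks retained: "
              f"${upside / 1e9:.1f}bn of extra revenue")
```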

Is NORA the user problem that Google’s and Bing’s use of LLMs is fixing?

Around the time that Google enjoyed a massive uptake in search, it also changed search to fit a mobile paradigm. Research-type searches done by everyone from brand planners to recruiters and students have declined in quality to the extent that some have openly asked: is Google dead?

Google search box

Boolean search no longer works; Danny Sullivan at Google admitted as much here. While Google hasn’t trumpeted the decline of Boolean search, ‘power’ users have noticed and they aren’t happy. That narrative and the botched demo the other week reinforced each other.

Unfortunately, given the large number of searches that don’t require Boolean strings, Google wasn’t going to go back. Instead, chat-based interfaces done right might offer an alternative for more tailored searches, one accessible to power users and n00bs alike.
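
For readers who never leaned on them, below is a hypothetical example of the sort of Boolean string a researcher might once have fed into Google, contrasted with the plain-language request a chat interface expects. The operators (quoted phrases, OR, the minus sign, site:) are real Google syntax; the queries themselves are made up for illustration.

```python
# Hypothetical examples of the Boolean-style queries 'power users' relied on.
legacy_queries = [
    '"brand planning" OR "account planning" -jobs -courses',
    '"media mix modelling" site:ac.uk filetype:pdf',
]

# The chat-style alternative is simply a plain-language request.
chat_style_query = "Find recent UK academic papers on media mix modelling"

for query in legacy_queries:
    print("Boolean:", query)
print("Chat:", chat_style_query)
```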

Technology paradigm shift?

At first, the biggest shock for myself and others reading the initial reports was how Google and Microsoft could have been left in the dust by OpenAI. Building these models requires a large amount of computing power to train and run them.

Microsoft had already been doing interesting things in machine learning with Cortana on Azure cloud services, and Google had been doing things with TensorFlow. Amazon Web Services provides a set of machine learning tools and the infrastructure to run them on.

Alphabet subsidiary DeepMind had already explored LLMs and highlighted 21 risks associated with the technology, which is probably why Google hadn’t been rushing to put a ChatGPT-type front end on search. The risks highlighted covered a wide range of areas.

Stories that have appeared about ChatGPT and Bing’s implementation of it seem to validate the DeepMind discussion paper on LLMs.

The question of why Microsoft partnered with OpenAI rather than rolling out its own product is more interesting. Stephen Wolfram’s in-depth explanation of how ChatGPT works is worth a read (and a couple of re-reads to actually understand it). Microsoft’s efforts in probabilistic machine learning look very similar in nature to ChatGPT. As far back as 1996, then-CEO Bill Gates was publicly talking about Microsoft’s expertise in Bayesian networks as a competitive advantage against rivals. Microsoft relied on research into the Bayesian network model put forward by Judea Pearl, which he describes in his book Probabilistic Reasoning in Intelligent Systems.
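
As a reminder of what Bayesian reasoning involves, here is a toy two-variable example using Bayes’ rule; the numbers are purely illustrative and have nothing to do with Microsoft’s or OpenAI’s actual models.

```python
# Toy Bayesian inference: does a message containing the word "offer" look like spam?
# All probabilities below are made-up illustrative values.
P_SPAM = 0.4              # prior probability that a message is spam
P_WORD_GIVEN_SPAM = 0.7   # P("offer" appears | spam)
P_WORD_GIVEN_HAM = 0.1    # P("offer" appears | not spam)

# Bayes' rule: P(spam | word) = P(word | spam) * P(spam) / P(word)
p_word = P_WORD_GIVEN_SPAM * P_SPAM + P_WORD_GIVEN_HAM * (1 - P_SPAM)
p_spam_given_word = P_WORD_GIVEN_SPAM * P_SPAM / p_word

print(f"P(spam | 'offer' seen) = {p_spam_given_word:.2f}")  # about 0.82
```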

Given the resources and head start Microsoft had, why was it not further along, and why did it face being disrupted by OpenAI instead? Having worked with Microsoft as a client in the past, I know they won’t buy in anything that they can build more cheaply themselves. That raises bigger questions about Microsoft’s operations over the past quarter of a century and its wider innovation story to date.

Flash in the pan

At times the technology sector looks more like a fashion industry, driven by fads more than anything else. A case in point is last year’s focus on the metaverse. The subsequent hike in interest rates has seen investment in the field drop, and businesses like Microsoft and Meta have shut down or scaled back a lot of their efforts. It is analogous to the numerous ‘AI winters‘ that have happened over the past 50 years.

Bing’s implementation of an LLM is already garnering criticism from the likes of the New York Times. This new form of search may end up being a flash in the pan like Clubhouse. The latent demand for NORA in search will still be there, but LLMs might not be the panacea that solves it. Consumers may continue to rely on Reddit and question-and-answer platforms like Quora as an imperfect solution in the meantime.

In summary….