China bans tech companies from buying Nvidia’s AI chips | FT – the Nvidia ban is an interesting move by the Chinese government. I don’t think it’s just about putting pressure on their semiconductor companies and foundries. I think it also steers the software industry and its approach to AI towards more computationally efficient models. China can achieve comparable compute by using a larger number of lower-spec chips and more power.
At the moment the leading-edge models in the west are taking a hardware-led approach, rather like putting a larger-capacity engine in a car, à la an old-school hot rod. China is forcing its technology sector to take a more holistic approach.
Having Nvidia lobbying the US for permission to sell Blackwell in China is a secondary benefit and not the hard block people think it is. Compute jobs are already done abroad to get around the ban anyway. It’s easy to move SSDs from China to Malaysia and run the jobs in local data centres.
Can China really make its consumers spend? | Jing Daily – After decades of export success, country’s bet on domestic consumption to propel growth bumps up against beliefs about money and security. – They’ve got more chance of increasing the number of children born, the beliefs are that engrained
I love some of the apparently random things that Toyota under Akio Toyoda do. From the GR Yaris to this documentary on a vintage Komatsu steel press that was instrumental in Toyota’s first car factory and still is doing sterling work.
The dialogue is in Japanese but English subtitles are available.
Motorsport fandom is strange. Back when I was a child motorsport fandom was a bunch of anoraks – literally. There was a category of clothing that you could buy from mail order catalogues and retailers like Demon Tweeks called a rally jacket. This was a coat good enough to deal with some cold wet weather branded by a car company or a tobacco brand.
Motorsport fandom, in particular around single-seater race series, is starting to see very different types of fans who learned their supporting ideas from the K-pop armies, which are a symbiote of the artist promotion machine. While the promotion machine and the fans are separate, with very different tactics, they are united by a common goal up to a point.
This isn’t the first time that media has brought in new fans, gaming created fans in the past. But the current motorsport fandom is interesting because of the cultural friction that it brings for drivers and legacy fans. From hate campaigns and death threats against drivers to ‘idol’ style objectification – women are demonstrating traits that would define toxic masculinity. All wrapped up in pastel tinted social media posts and Etsy products – so that makes it all fine, doesn’t it?
WPP has its next CEO – but what do clients make of heir apparent? – It’s not indifference. It’s pragmatism. Marketers like this don’t want to buy into the idea that a leadership change signals sweeping transformation. After all, Rose doesn’t start until September. Until then, they’d rather stay focused on the present, not the promise.
Ryan Kangisser, a bellwether for client perspective thanks to his proximity to them as the chief strategy officer at MediaSense, expanded on the point: “I do think that often the industry cares more about these sorts of appointments than clients do. Especially if clients have got a really solid client lead, or business lead, then they’re the people who they feel are the ones driving their business.”
The Financial Times opined on the obsolescence of business cards. This has been a common theme for the past quarter of a century, so whether or not it’s actually news is up for debate.
Business cards have been a surprisingly accurate marker of my career’s evolution. Before college, when I was working in laboratories to save up, business cards were strictly for management. If anyone needed to reach me, they’d receive my name and extension number scribbled on a company compliments slip.
Fast forward to my early agency days, and changing my business cards became the immediate priority after receiving a promotion letter. I vividly recall discussing new cards with our office manager, Angie, to reflect my new title: from Account Executive to Senior Account Executive. While that promotion enabled me to buy my first home, it was the tangible act of updating my business cards that truly solidified that future title for me in my memory.
Building a network was an important part of development in the early part of my career and my manager at the time would ask us each week how many business cards we’d given out as a way of quantifying that development.
Even today in Asian countries, business cards come loaded with cultural symbolism and a distinct etiquette of exchange. The exchange of them is handy as it allows you to lay out a model of who is around a meeting table based on the card collection, facilitating easier meeting communications.
Personal organisers
In the mid-1990s, the personal organiser was a staple, its prevalence varying depending on location and budget. These organisers typically featured loose-leaf pages for schedules, an address book, and a system for storing and archiving business cards, even those of people who had moved on. However, by 2001, the media was already concerned about the impending demise of the personal organiser and its potential impact on the business card’s future.
Filofax
Filofax has the reputation for being the most British of brands. It originally started off as an importer of an American product Lefax. Lefax was a Philadelphia-based business which made organisers popular within industry including power plant engineers in the early 20th century.
At that time electricity was considered to be the enabler that the internet is now, and Lefax helped to run power plants effectively and reliably. Filofax eventually acquired Lefax in 1992. During the 1980s, the Filofax became a symbol of professionalism and aspirational upward mobility. I was given one as soon as I started work and I still have it at my parents’ home. Its leather cover didn’t even develop a patina, despite the beating it took in various parts of my work life: in night clubs, chemical plants and agency life. Filofax even became part of cinematic culture in the James Belushi film Taking Care of Business, also known as Filofax in many markets.
Day-Timer
In the US, there was the Day-Timer system, which came out of the requirements of US lawyers in the early 1950s and became a personal management tool for white collar workers in large corporates like Motorola – who appreciated their whole system approach. Day-Timer was as much a lifestyle, in the same way that David Allen’s Getting Things Done® (GTD®) methodology became in the mid-2000s to 2010s. Customers used to go and visit the personal organiser factory and printing works for fun. Along the way, other products such as At-A-Glance and Day Runner had appeared as substitute products. Day-Timer inspired the Franklin Planner system; a similar mix of personal organiser and personal management philosophy launched in 1984.
By the mid-1990s, Day-Timer had a skeuomorphic PC programme that mirrored the real-world version of the Day-Timer. At the time this and competitor applications would allow print-outs that would fit in the real-world Day-Timer organiser. Day-Timer’s move to mobile apps didn’t go so well and now it exists in a paper-only form catering to home-workers and people wanting to organise their personal lives.
Rolodex
While the Filofax allowed you to take your world with you, the Rolodex allowed you to quickly thumb through contacts and find the appropriate name.
Back when I first started my first agency job, I was given my first Rolodex frame. I spent a small fortune on special Rolodex business card holders. At my peak usage of Rolodex as a repository for my business contacts, I had two frames that I used to rifle through names of clients, suppliers and other industry contacts.
Rolodex became a synonym for your personal network, you even heard of people being hired for ‘their Rolodex’. For instance, here’s a quote from film industry trade magazine Hollywood Reporter: Former British Vogue Chief Eyes September for Launch of New Print Magazine, Platform (May 8, 2025):
…to blend “the timeless depth of print with the dynamism of digital” with coverage of top creative forces, no doubt leaning into Edward Enninful’s enviable Rolodex of A-list stars, designers and creators gathered through years spent in the fashion and media space with tenures at British Vogue and as European editorial director of Vogue.
If I was thinking about moving role, the first thing I would do is take my Rolodex frames home on a Friday evening. The fan of business cards is as delicate as it is useful. It doesn’t do well being lugged around in a bag or rucksack. Each frame would go home in a dedicated supermarket shopping bag.
The Rolodex was anchored to the idea of the desk worker. The knowledge worker had a workstation that they used every day. Hot-desking, as much as the computer, is the enemy of the Rolodex. My Rolodex usage stopped when I moved to Hong Kong. My frames are now in boxes somewhere in my parents’ garage. They were doomed not by any lack of usefulness, but by their lack of portability.
Personal information management
The roots of personal information management software go back to ideas in information theory, cognitive psychology and computing that gained currency after the second world war.
As the idea of personal computers gained currency in the 1970s and early 1980s, personal information software appeared to manage appointments and scheduling, to-do lists, phone numbers, and addresses. The details of business cards would be held electronically.
At this time laptops were a niche computing device. Like the Rolodex, the software stayed at the office or in the den at home. NoteCards used software to provide a hybridisation of hypertext linkages with the personal information models of the real world. NoteCards was developed and launched in 1987, prefiguring applications like DevonTHINK, Evernote and Notion by decades.
As well as providing new links to data, computers also allowed one’s contacts to become portable. It started off with luggable and portable laptop computers.
Putting this power into devices that can fit in the hand and a coat pocket supercharged this whole process.
Personal digital assistants
Personal digital assistants (PDAs) filled a moment in time. Mobile computer data connections were very slow and very niche on GSM networks. Mobile carrier pricing meant that it only worked for certain niche uses, such as sports photographers sending their images through to their agency for distribution to picture desks at newspapers and magazines. While the transfer rate was painfully slow, it was still faster than burning the images on to CD and using a motorcycle courier to get them to their picture agency.
The PDA offered the knowledge worker their address book, calendar, email and other apps in their pocket. It was kept up to date by a cradle connected to their computer. When the PDA went into the cradle information went both ways, contacts and calendars updated, emails sent, content to be read on the PDA pushed from the computer. IBM and others created basic productivity apps for the Palm PDA.
IrDA
By 1994, several proprietary infra red data transmission formats existed, none of which spoke to each other. This was pre-standardisation on USB cables. IrDA was a standard created by an industry group, looking to combat all the proprietary systems. The following year, Microsoft announced support in Windows, allowing laptops to talk with other devices and the creation of a simple personal area network.
This opened the possibility of having mice and other input devices unconstrained by connecting cables. It also allowed PDAs to beam data to each other via ‘line of sight’ connections. The reality of this was frustrating. You would often have to hold the devices an inch from each other and keep them there for an eternity for the data to crawl across. It wasn’t until 1999 that the first devices with Bluetooth or wi-fi appeared, and it took a couple more years for them to become ubiquitous. Unsolicited messages over Bluetooth, aka bluejacking, started to appear in the early 2000s.
But IrDA provided a mode of communication between devices.
versit Consortium
versit Consortium sorted another part of the puzzle. In the early 1990s the blending of computer systems with telephony networks was gaining pace. A number of companies including Apple, IBM and Siemens came together to put together common standards to help computer systems and telephony work together. By 1995, they had come up with the versitcard format for address book contacts, better known now as ‘vCards’. These were digital business cards that could be exchanged by different personal information management software on phones, computers and PDAs. For a while in the late 1990s and early 2000s I would attach my vCard on emails to new contacts. I still do so, but much less often.
The following year the same thing happened with calendar events as well.
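To make the vCard idea concrete, here is a minimal sketch in Python of what one of these digital business cards looks like as plain text. It assumes the later vCard 3.0 layout rather than the original versit release, and the function name, the contact details and the choice of fields are all hypothetical, picked purely for illustration.

```python
def escape(value: str) -> str:
    """Escape characters that have special meaning inside vCard text values."""
    return (
        value.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def make_vcard(given: str, family: str, org: str, title: str, email: str) -> str:
    """Build a small vCard 3.0 record as a string (illustrative subset of fields)."""
    lines = [
        "BEGIN:VCARD",
        "VERSION:3.0",
        # N holds the structured name: family;given;additional;prefix;suffix
        f"N:{escape(family)};{escape(given)};;;",
        # FN is the formatted name shown to the reader
        f"FN:{escape(given)} {escape(family)}",
        f"ORG:{escape(org)}",
        f"TITLE:{escape(title)}",
        f"EMAIL;TYPE=INTERNET:{email}",
        "END:VCARD",
    ]
    # vCard content lines are separated by CRLF
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Hypothetical contact used only as an example
    print(make_vcard("Jane", "Doe", "Example Agency, Ltd",
                     "Senior Account Executive", "jane@example.com"))
```

Saved with a .vcf extension, a record like this is the kind of file that address book software can import, which is what made the format handy as an email attachment.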
Over time, the digital business card came to dominate, via device-to-device exchanges until the rise of LinkedIn – the professional social network.
Faster data networks allowed the digital business card sharing to become more fluid.
A future renaissance for the business card?
While business cards are currently seen as outdated in the west, could they enjoy a renaissance? There are key changes in behaviour that indicate trends which would support a revitalisation of business cards.
Digital detox
While information overload is a term that has been with us since the arrival of personal computers, digital detox is a newer phenomenon that first started to gain currency in 2008, according to Google Books data. Digital detox as a concept has continued to climb since then. It has manifested itself in people taking a break from their screens, including smartphones.
This creates a need for tangible contact details in the form of a business card in certain contexts.
The pivot of personal organisers
Day-Timer and Filofax didn’t disappear completely. While Day-Timer is no longer a professional ‘cult’, it now helps remote workers organise their own work day at home. Both brands also tap into the needs of people organising their own wedding. The paper plans also give them a memento of this event in a largely digital world.
If personal organisers continue to exist then real-world business cards would also make sense in those contexts.
Bullet-journaling
Ryder Carroll is known as the ‘father’ of the bullet journal which was a home-made organisation method which was similar to the kind of task lists I was taught to pull together in my first agency role. There were aspects of it that would be familiar to Day-Timer advocates as well.
When the world was going digital, Carroll used paper to help organise himself. Carroll tapped into the fact that even computer programmers use paper, including notebooks and post-it notes, to manage projects and personal tasks within those projects. Carroll took his ‘system’ public via a Kickstarter project in 2013.
Bullet journaling provided its users with simplicity, clarity and an increased sense of control in their life. What is of interest for this post, is the move from the virtual back into paper organisation.
Changing nature of work
Hybrid working, remote working and growing freelance communities in industries such as advertising have affected one’s professional identity. This has huge implications for personal standing and even mental health. Human connection becomes more important via virtual groups and real-world meet-ups. Controlling one’s own identity via a business card at these meet-ups starts to make an increasing amount of sense.
The poisoning of the LinkedIn well
On the face of it LinkedIn has been a wonderful idea. Have a profile that’s part CV / portfolio which allows your social graph of professional connections to move with you through your career. Services were bolted on like advertising, job applications and corporate pages to attract commercial interest and drive revenue.
Over time, LinkedIn has increased the number of its creator functions, driving thought leadership content that is a prime example of enshittification. 2025 saw ‘thought leaders’ publishing generative AI created posts as entirely their own work.
LinkedIn has become devalued as a digital alternative to the humble business card.
May 2025 introduction – two little ducks (22) edition
Welcome to my May 2025 newsletter, this newsletter marks my 22nd issue. 22 is known in bingo halls and the Spanish national lottery as two little ducks.
In France, 22 is the equivalent of 5-0 in the English speaking world as slang for the police. 22 is an important number for people who believe in numerology. In Hong Kong, 22 is associated with good fortune. This is down to the number sounding similar to ‘easy’ or ‘bright’ in Cantonese.
I hope that you are tricked into thinking I am bright based on this newsletter, so let’s jump in. Inspired by catching up with my old DJing partner Griff, this month I enjoyed the unashamedly joyous pumped-up sounds of Blackpool’s AZYR at the Boiler Room x TeleTech Festival in 2023. In particular the transition at the end of the set between Frankyeffe – Save me and Infectious! – I need your lovin’. (Extra trainspotter points if you knew that Infectious! is a homage / remake of N.R.G’s The Real Hardcore from a year earlier). Wear your headphones, it might be divisive playing the set out loud in the office. More bangers from AZYR here.
New reader?
If this is your first newsletter, welcome! You can find my regular writings here and more about me here.
Things I’ve written.
Predicting market share through share of search volume and what the rise of AI likely means.
Reaching a precipice in hydrogen power and trends in Chinese skincare amongst other things.
Books that I have read.
Careless People by Sarah Wynn-Williams. Wynn-Williams’ account of her time at Facebook has become the most discussed book of the spring in my social circle. I wrote a long review of it here.
The Road to Conscious Machines by Michael Wooldridge examines the profound cultural impact of generative AI, which is currently experiencing a surge in both its cultural influence and practical applications. Drawing parallels to the internet’s transformative impact in the mid-to-late 1990s, where it permeated various aspects of society and fostered rapid adoption, Wooldridge traces the evolution of generative AI as a phenomenon that emerged gradually over the past half-century. Throughout the book, Wooldridge provides a comprehensive historical overview of AI, including the periods of research stagnation known as AI winters. This historical perspective equips readers with a nuanced understanding of the strengths and weaknesses of AI, enabling them to approach AI adoption with a well-informed perspective.
As I finish this newsletter during the bank holiday weekend, my light reading is Rogue Asset by Andy McDermott. McDermott comes from a long line of British authors like Jack Higgins, Len Deighton, Frederick Forsyth and Mick Herron who provide novels aimed at a shrinking pool of readers – men. At least, if one is to believe what’s said in the media. Rogue Asset hinges on the premise that the UK has a unit which assassinates the country’s enemies on a regular basis. Think somewhere between The Troubles era Det and the modern deep state trope. Our hero is snared into the plot by being discovered on the run thanks to his online behaviour, which is attributed to GCHQ (but isn’t as mysterious as it sounds because of the programmatic advertising technology stack). So far so good for what it is. I will let you know next month if it goes downhill as a read.
Things I have been inspired by.
Mmrytok
Limitations are often the mother of invention. That seems to be the theory behind mmrytok. Mmrytok allows you to do one post a day. It doesn’t support HTML formatting, it doesn’t allow you to link out and doesn’t have a newsfeed. So it’s easy-to-use because it’s less sophisticated than Geocities was. In this respect it is to social media and blogs what Punkt is to smartphones. In an always-on social time, I have found it liberating to use. You can see my page here. I heard of Mmrytok thanks to Matt Muir’s great newsletter Web Curios.
No, AI isn’t making you dumber
Australian documentary maker ColdFusion put together an interesting video essay on How AI is making you dumber.
Yes, you could argue that under certain attributes the population isn’t as smart as they have been in the past. Just last month I shared an article by John Burn-Murdoch. In the article he shared data of a longitudinal trend across countries and age-groups struggling with concentration, declining verbal and numerical reasoning. The problem with Burn-Murdoch’s article vis-a-vis the ColdFusion video is the timeline.
His article charts a decline further back than the rise of generative AI services. Mia Levitin in an essay for the FT attributed the decline in reading to the quick dopamine hits of social media content.
A college professor interviewed by The Atlantic put the decline in reading amongst his undergraduate students down to a practice in secondary education of atomising content. Pupils in high schools were assigned excerpts, poetry and news articles to read, but not complete books. This has impacted the size of vocabulary and grasp of language that students starting university now have.
This isn’t new territory. James Gleick in his book Faster documented the massive acceleration of information through the late 20th century and its effects on the general public. The underlying accelerant was described by Kevin Kelly in What Technology Wants as the technium – a continuous forward progress due to a massively interconnected system of technology.
There were concerns in research as far back as the late 1980s that television could be adversely affecting children’s reading comprehension and attention spans.
TL;DR – with generative AI you could become dumber, if you use it unwisely – but the problem lies with all of us and what we choose to do with our personal agency.
CIA advertise for Chinese spies
The CIA commissioned a couple of high production value adverts that they’ve been running on social media channels. The adverts are designed to encourage Chinese government employees to come forward as agents. The sales pitch is about taking control.
A translation of the Chinese tagline: ‘The reason for choosing cooperation: to become the master of (one’s own) destiny‘. More details from the FT about the campaign here, and here are the two executions currently running on YouTube.
It remains to be seen if the campaign will be effective. The Chinese Ministry of State Security managed to roll-up the CIA’s spy network back in 2010-2012. Up to 30 informants in China were executed.
Montirex
Merseyside sports-inspired lifestyle brand Montirex have published a film telling the brand story from its origins to the present day. The brand is expanding beyond its Merseyside roots to get national and international sales.
Trust, attitudes and use of artificial intelligence
A 2025 global study covering some 48 countries was conducted by KPMG in association with the University of Melbourne. Some key insights from the report follow. Consumer generative AI is being used instead of enterprise options by workers. Generative AI adopters still have self-perceived low AI skills but that doesn’t slow their adoption. There are higher adoption and trust rates in emerging markets than in developed markets.
Year-on-year we are seeing an increase in both distrust and trust for specific AI use cases, indicating that it is becoming a polarising subject. The lowest trust level is in tech-savvy Finland. More here.
Chart of the month.
McDonald’s Restaurants saw a decline in sales. This was down to low income consumers spending less, while middle class earners still weren’t going into McDonald’s. Normally when there is a recession, McDonald’s should benefit from the more well-off trading down to its restaurants. Instead, fortunes have diverged into a ‘k-shaped’ recession. Lower income earners are hit, while the middle classes aren’t. This is what Axios called the ‘McRecession‘.
Things I have watched.
Tony Arzenta (also known as Big Guns). The film is an early 1970s giallo film. French star Alain Delon appears in this classic retribution story based in Milan. As Tony Arzenta, Delon exacts revenge on the former bosses who killed his family by accident in a botched assassination attempt to prevent him from retiring. The film uses a wintry Milan as a good atmospheric backdrop for the action that plays out in a series of shoot-outs and car chases. It’s John Wick before it was even conceived. Delon brings a tension that other stars of the era like Charles Bronson failed to bring to similar roles. As Arzenta’s targets flee across Europe, he goes through Germany and Denmark to catch up with them.
Sansho the Bailiff – as a film Sansho the Bailiff comes encumbered with a weight of praise. It is highly rated by film critics and Martin Scorsese had it as one of his must-watch films for young film makers. Director Kenji Mizoguchi assembled an ensemble cast of Japanese actors to tell a story of family hardship and poverty. Kazuo Miyagawa is key to the production, providing a signature look to the cinematography. There is a tension between the emotional rollercoaster of the story and the reflective nature of the scenes portrayed – I don’t want to say too much more, except that even the character actors like Kikue Môri (who plays a pivotal role in the plot as a priestess) are amazing in the film.
Warfare – I was a bit leery of watching Alex Garland’s Warfare after watching Civil War which was strong on aesthetics and emotion, but weak in terms of the creative conceits involved in making the story work. Warfare is the collective accounts of a US military unit during a two-hour fire fight. The story is told from multiple perspectives in real-time. The film captures the stress and boredom of inaction as well as what you would normally expect from this kind of film.
Useful tools.
Reddit Answers
Reddit Answers – alternative to Gigabrain that I recommended back in March. Like Gigabrain, Reddit Answers looks like the kind of knowledge search product that we failed to build at Yahoo! twenty years ago (or NORA as Microsoft has been calling the concept for the past few years). Reddit Answers is powered by Google Vertex AI.
Process online data like it’s peak web 2.0 all over again
While WordPress installations come with RSS enabled as standard (something that can then be disabled), many types of sites aren’t RSS enabled. And where they are, the web devs will often disable it just because. RSS app will create an RSS feed for websites that don’t have one. This allows you to pull the feed into data processing using something like Pipes. RSS app starts at $9.99 per month and goes up to $99.99 a month. Pipes starts at free and goes up to $79 per month.
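If you would rather process a feed yourself than use a hosted tool, a minimal sketch in Python looks something like the following. It assumes a plain RSS 2.0 feed; the sample feed, the function names and the keyword filter are all hypothetical and only show the general shape of the job, not how RSS app or Pipes work internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, standing in for whatever a feed generator returns.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example site</title>
    <link>https://example.com/</link>
    <description>Illustrative feed</description>
    <item>
      <title>First post</title>
      <link>https://example.com/first</link>
      <pubDate>Mon, 05 May 2025 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/second</link>
      <pubDate>Tue, 06 May 2025 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml: str) -> list[dict]:
    """Extract title, link and publication date from each <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return items


def filter_by_keyword(items: list[dict], keyword: str) -> list[dict]:
    """Keep only the items whose title contains the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [item for item in items if keyword in item["title"].lower()]


if __name__ == "__main__":
    for entry in filter_by_keyword(parse_items(SAMPLE_FEED), "post"):
        print(entry["published"], entry["title"], entry["link"])
```

From here the filtered items could be written to a spreadsheet or merged with other feeds, which is roughly the kind of chaining that visual tools make point-and-click.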
The sales pitch.
I am currently working on a brand and creative strategy engagement at Google’s internal creative agency.
I am now taking bookings for strategic engagements in Q4 (October) – keep me in mind; or discussions on permanent roles. Contact me here.
Ok this is the end of my May 2025 newsletter, I hope to see you all back here again in a month. Be excellent to each other and onward into spring, and I hope you enjoyed the last bank holiday until August.
Don’t forget to share if you found it useful, interesting or insightful.
Get in touch if there is anything that you’d like to recommend for the newsletter.
My thinking on the concept of intelligence per watt started as bullets in my notebook. It was more of a timeline than anything else at first and provided a framework of sorts from which I could explore the concept of efficiency in terms of intelligence per watt.
TL;DR (too long, didn’t read)
Our path to the current state of ‘artificial intelligence’ (AI) has been shaped by the interplay and developments of telecommunications, wireless communications, materials science, manufacturing processes, mathematics, information theory and software engineering.
Progress in one area spurred advances in others, creating a feedback loop that propelled innovation.
Over time, new use cases have become more personal and portable – necessitating a focus on intelligence per watt as a key parameter. Energy consumption directly affects industrial design and end-user benefits. Small low-power integrated circuits (ICs) facilitated fuzzy logic in portable consumer electronics like cameras and portable CD players. Low power ICs and power management techniques also helped feature phones evolve into smartphones.
A second-order effect of optimising for intelligence per watt is reducing power consumption across multiple applications. This spurs yet more new use cases in a virtuous innovation circle. This continues until the laws of physics impose limits.
Energy storage density and consumption are fundamental constraints, driving the need for a focus on intelligence per watt.
As intelligence per watt improves, there will be a point at which the question isn’t just what AI can do, but what should be done with AI? And where should it be processed? Trust becomes less about emotional reassurance and more about operational discipline. Just because it can handle a task doesn’t mean it should – particularly in cases where data sensitivity, latency, or transparency to humans is non-negotiable. A highly capable, off-device AI might be fine at drafting everyday emails, but a questionable choice for handling your online banking.
Good ‘operational security’ outweighs trust. The design of AI systems must therefore account not just for energy efficiency, but user utility and deployment context. The cost of misplaced trust is asymmetric and potentially irreversible.
Ironically the force multiplier in intelligence per watt is people and their use of ‘artificial intelligence’ as a tool or ‘co-pilot’. It promises to be an extension of the earlier memetic concept of a ‘bicycle for the mind’ that helped inspire early developments in the personal computer industry. The upside of an intelligence per watt focus is more personal, trusted services designed for everyday use.
The device was not a computer; instead, it integrated several radio parts in one glass-envelope vacuum valve. It had three triodes (early electronic amplifiers), two capacitors and four resistors. Inside the valve, the extra resistor and capacitor components went inside their own glass tubes. Normally each triode would be inside its own vacuum valve. At the time, German radio tax laws were based on the number of valve sockets in a device, making this integration financially advantageous.
Post-war scientific boom
Between 1949 and 1957 engineers and scientists from the UK, Germany, Japan and the US proposed what we’d think of as the integrated circuit (IC). These ideas were made possible when breakthroughs in manufacturing happened. Shockley Semiconductor built on work by Bell Labs and Sprague Electric Company to connect different types of components on the one piece of silicon to create the IC.
Credit is often given to Jack Kilby of Texas Instruments as the inventor of the integrated circuit. But that depends how you define IC, with what is now called a monolithic IC being considered a ‘true’ one. Kilby’s version wasn’t a true monolithic IC. As with most inventions it is usually the child of several interconnected ideas that coalesce over a given part in time. In the case of ICs, it was happening in the midst of materials and technology developments including data storage and computational solutions such as the idea of virtual memory through to the first solar cells.
Kirby’s ICs went into an Air Force computer[ii] and an onboard guidance system for the Minuteman missile. He went on to help invent the first handheld calculator and thermal printer, both of which took advantage of progress in IC design to change our modern way of life[iii].
TTL (transistor-transistor logic) circuitry was invented at TRW in 1961. The company licensed it out for use in data processing and communications, propelling the development of modern computing. TTL circuits powered mainframes. Mainframes were housed in specialised temperature and humidity-controlled rooms and owned by large corporates and governments. Modern banking and payments systems rely on the mainframe as a concept.
AI’s early steps
What we now think of as AI had been considered theoretically for as long as computers could be programmed. As semiconductors developed, a parallel track opened up to move AI beyond being a theoretical possibility. A pivotal moment was a workshop held in 1956 at Dartmouth College. The workshop focused on a hypothesis: ‘every aspect of learning or any other feature of intelligence can be so precisely described that a machine can be made to simulate it’. Later that year, a meeting at MIT (Massachusetts Institute of Technology) brought together psychologists and linguists to discuss the possibility of simulating cognitive processes using a computer. This is the origin of what we’d now call cognitive science.
Out of the cognitive approach came some early successes in the move towards artificial intelligence[iv]. A number of approaches were taken based on what is now called symbolic or classical AI:
Reasoning as search – essentially a step-wise trial-and-error approach to problem solving that was compared to wandering through a maze and back-tracking if a dead end was found.
Natural language – where related phrases existed within a structured network.
Micro-worlds – solving for artificially simple situations, similar to economic models relying on the concept of the rational consumer.
Single layer neural networks – to do rudimentary image recognition.
By the time the early 1970s came around AI researchers ran into a number of problems, some of which still plague the field to this day:
Symbolic AI wasn’t fit for purpose solving many real-world tasks like crossing a crowded room.
Trying to capture imprecise concepts with precise language.
Commonsense knowledge was vast and difficult to encode.
Intractability – many problems require an exponential amount of computing time.
Limited computing power available – there was insufficient intelligence per watt available for all but the simplest problems.
By 1966, US and UK funding bodies were frustrated with the lack of progress on the research undertaken. The axe fell first on a project to use computers for language translation. Around the time of the OPEC oil crisis, funding to major centres researching AI was reduced by both the US and UK governments. Despite the reduction of funding to the major centres, work continued elsewhere.
Mini-computers and pocket calculators
ICs allowed for mini-computers due to the increase in computing power per watt. As important as the relative computing power, ICs made mini-computers more robust and easier to manufacture and maintain. DEC (Digital Equipment Corporation) launched the first minicomputer, the PDP-8, in 1965. The cost of mini-computers allowed them to run manufacturing processes, control telephone network switching and control laboratory equipment. Mini-computers expanded computer access in academia, facilitating more work in artificial life and what we’d think of as early artificial intelligence. This shift laid the groundwork for intelligence per watt as a guiding principle.
A second development helped drive mass production of ICs – the pocket calculator, originally invented at Texas Instruments. It demonstrated how ICs could dramatically improve efficiency in compact, low-power devices.
LISP machines and PCs
AI researchers required more computational power than mini-computers could provide, leading to the development of LISP machines—specialised workstations designed for AI applications. Despite improvements in intelligence per watt enabled by Moore’s Law, their specialised nature meant that they were expensive. AI researchers continued with these machines until personal computers (PCs) progressed to a point that they could run LISP quicker than LISP machines themselves. The continuous improvements in data storage, memory and processing that enabled LISP machines, continued on and surpassed them as the cost of computing dropped due to mass production.
The rise of LISP machines and their decline was not only due to Moore’s Law in effect, but also that of Makimoto’s Wave. While Gordon Moore outlined an observation that the number of transistors on a given area of silicon doubled every two years or so, Tsugio Makimoto originally observed 10-year pivots from standardised semiconductor processors to customised processors[v]. The rise of personal computing drove a pivot towards standardised architectures.
PCs and workstations extended computing beyond computer rooms and laboratories to offices and production lines. During the late 1970s and 1980s standardised processor designs like the Zilog Z80, MOS Technology 6502 and the Motorola 68000 series drove home and business computing alongside Intel’s x86 processors.
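To put rough numbers on why standard PCs eventually outran specialised LISP machines, here is a minimal sketch of the compounding behind Moore’s observation. It assumes a clean doubling every two years, which is an idealisation of ‘two years or so’, and the function name is my own.

```python
def transistor_growth(years: float, doubling_period_years: float = 2.0) -> float:
    """Growth factor in transistor density after `years`,
    assuming an idealised doubling every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)


# One 10-year Makimoto swing versus two decades of compounding.
for span in (2, 10, 20):
    print(f"{span:>2} years -> x{transistor_growth(span):,.0f}")
```

Under that assumption, a single 10-year swing of Makimoto’s Wave spans a 32-fold increase in transistors on the same area of silicon, which gives a sense of how quickly a mass-produced standard design could overtake a specialised one.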
Personal computing started in businesses when office workers brought a computer to use early computer programmes like the VisiCalc spreadsheet application. This allowed them to take a leap forward in not only tabulating data, but also seeing how changes to the business might affect financial performance.
Businesses then started to invest more in PCs for a wide range of uses. PCs could emulate the computer terminal of a mainframe or minicomputer, but also run applications of their own.
Typewriters were being replaced by word processors that allowed the operator to edit a document in real time without resorting to using correction fluid.
A Bicycle for the Mind
Steve Jobs at Apple was as famous for being a storyteller as he was for being a technologist in the broadest sense. Internally with the Mac team he shared stories and memetic concepts to get his ideas across in everything from briefing product teams to press interviews. As a concept, a 1990 filmed interview with Steve Jobs articulates the context of this saying particularly well.
In reality, Jobs had been telling the story for a long time through the development of the Apple II and right from the beginning of the Mac. There is a version of the talk that was recorded some time in 1980 when the personal computer was still a very new idea – the video was provided to the Computer History Museum by Regis McKenna[vi].
The ‘bicycle for the mind’ concept was repeated in early Apple advertisements of the time[vii] and even informed the Macintosh project codename[viii].
Jobs articulated a few key concepts.
Buying a computer creates, rather than reduces, problems. You needed software to start solving problems and making computing accessible. Back in 1980, you programmed a computer if you bought one. This was the reason why early personal computer owners in the UK went on to birth a thriving games software industry including the likes of Codemasters[ix]. Done well, there should be no seam in the experience between hardware and software.
The idea of a personal, individual computing device (rather than a shared resource). My own computer builds on my years of how I have grown to adapt and use my Macs, from my first sit-up and beg Macintosh, to the MacBook Pro that I am writing this post on. This is even more true of most people and their use of the smartphone. I am of an age where my iPhone is still an appendage and emissary of my Mac. My Mac is still my primary creative tool. A personal computer is more powerful than a shared computer in terms of the real difference made.
At the time Jobs originally did the speech, PCs were underpowered for anything but data processing (through spreadsheets and basic word processor applications). But that didn’t stop his idea for something greater.
Jobs’ idea of the computer as an adjunct to the human intellect and imagination still holds true, but it doesn’t neatly fit into the intelligence per watt paradigm. It is harder to measure the effort developing prompts, or that expended evaluating, refining and filtering generative AI results. Of course, Steve Jobs’ Apple owed a lot to the vision shown in Doug Engelbart’s ‘Mother of All Demos’[x].
Networks
Work took a leap forward with networked office computers, pioneered by Apple’s Macintosh Office[xi]. Apple was soon overtaken by competitors. Networking facilitated workflow within an office and its impact can still be seen in offices today, even as components from print management to file storage have moved to cloud-based services.
At the same time, what we might think of as mobile was starting to gain momentum. Bell Labs and Motorola came up with much of the technology to create cellular communications. Martin Cooper of Motorola made the first phone call on a cellular phone to a rival researcher at Bell Labs. But Motorola didn’t sell the phone commercially until 1983, as a US-only product called the DynaTAC 8000x[xii]. This was four years after Japanese telecoms company NTT launched their first cellular network for car phones. Commercial cellular networks were running in Scandinavia by 1981[xiii].
In the same way that the networked office radically changed white collar work, the cellular network did a similar thing for everyone from self-employed plumbers, electricians and photocopier repair men to travelling sales people. Before the cell phone, messages had to be taken for them. If they were technologically advanced, they may have had an answer machine, but it would likely have to be checked manually by playing back the tape.
Often it was a receptionist in their office, if they had one, who took messages. Or, more likely, someone back home. The cell phone freed homemakers in a lot of self-employed households to go out into the workplace and helped raise household incomes.
Fuzzy logic
The first mainstream AI applications emerged from fuzzy logic, introduced by Lotfi A. Zadeh in a 1965 mathematical paper. Initial uses were for industrial controls in cement kilns and steel production[xiv]. The first prominent product to rely on fuzzy logic was the Zojirushi Micom Electric Rice Cooker (1983), which adjusted cooking time dynamically to ensure perfect rice.
Fuzzy logic reacted to changing conditions in a similar way to people. Through the 1980s and well into the 1990s, the power of fuzzy logic was underappreciated outside of Japanese product development teams. A spokesperson for the American Electronics Association’s Tokyo office told the Washington Post[xv]:
“Some of the fuzzy concepts may be valid in the U.S.,”
“The idea of better energy efficiency, or more precise heating and cooling, can be successful in the American market,”
“But I don’t think most Americans want a vacuum cleaner that talks to you and says, ‘Hey, I sense that my dust bag will be full before we finish this room.’”
By the end of the 1990s, fuzzy logic was embedded in various consumer devices:
Air-conditioner units – sensed the room, the temperature difference inside and out, and the humidity, then switched on and off to balance cooling and energy efficiency.
CD players – enhanced error correction on playback to deal with imperfections on the disc surface.
Dishwashers – sensed how many dishes were loaded and the type of dirt, then adjusted the wash programme.
Toasters – recognised different bread types and the preferred degree of toasting, and toasted accordingly.
TV sets – adjusted the screen brightness to the ambient light of the room and the sound volume to how far away the viewer was sitting from the TV set.
Vacuum cleaners – adjusted vacuum power when moving from carpeted to hard floors.
Video cameras – compensated for the movement of the camera to reduce blurred images.
Fuzzy logic sold on the benefits and concealed the technology from western consumers. Fuzzy logic embedded intelligence in the devices. Because it worked on relatively simple dedicated purposes it could rely on small lower power specialist chips[xvi] offering a reasonable amount of intelligence per watt, some three decades before generative AI. By the late 1990s, kitchen appliances like rice cookers and microwave ovens reached ‘peak intelligence’ for what they needed to do, based on the power of fuzzy logic[xvii].
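To show why this kind of control needs so little computing power, here is a minimal sketch of a fuzzy controller in Python. It is an illustration under my own assumptions, not any manufacturer’s algorithm: the temperature bands, the three rules and the names `falling`, `rising`, `triangle` and `heater_power` are all hypothetical.

```python
def falling(x: float, start: float, end: float) -> float:
    """Membership that is 1 up to `start` and falls linearly to 0 at `end`."""
    if x <= start:
        return 1.0
    if x >= end:
        return 0.0
    return (end - x) / (end - start)


def rising(x: float, start: float, end: float) -> float:
    """Membership that is 0 up to `start` and rises linearly to 1 at `end`."""
    return 1.0 - falling(x, start, end)


def triangle(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership: 0 outside (left, right), 1 at `peak`."""
    return min(rising(x, left, peak), falling(x, peak, right))


def heater_power(temp_c: float) -> float:
    """Return heater power (percent) for a pot temperature in Celsius."""
    # Fuzzify: how 'cool', 'warm' and 'hot' is the pot? (assumed bands)
    cool = falling(temp_c, 60, 90)
    warm = triangle(temp_c, 60, 90, 105)
    hot = rising(temp_c, 90, 105)
    # Rules (assumed): cool -> full power, warm -> half power, hot -> simmer.
    rules = [(cool, 100.0), (warm, 50.0), (hot, 10.0)]
    total = sum(weight for weight, _ in rules)
    if total == 0:
        return 0.0
    # Defuzzify with a weighted average of the rule outputs.
    return sum(weight * power for weight, power in rules) / total


for t in (50, 75, 90, 105):
    print(t, heater_power(t))
```

The whole controller is a handful of comparisons and multiplications per reading, which is the sort of workload a small, low-power chip can handle comfortably.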
Fuzzy logic also helped in business automation. It helped to automatically read hand-written numbers on cheques in banking systems and the postcodes on letters and parcels for the Royal Mail.
Decision support systems & AI in business
Decision support systems or Business Information Systems were being used in large corporates by the early 1990s. The techniques used were varied but some used rules-based systems. These were used in at least some capacity to reduce manual office work tasks. For instance, credit card approvals were processed based on rules that included various factors including credit scores. Only some credit card providers had an analyst manually review the decision made by the system. However, setting up each use case took a lot of effort involving highly-paid consultants and expensive software tools. Even then, vendors of business information systems such as Autonomy struggled with a high rate of projects that failed to deliver anything like the benefits promised.
Three decades on, IBM had a similar problem with its Watson offerings, with a particularly high-profile failure in mission-critical healthcare applications[xviii]. A further problem was that a lot of tasks were ad hoc in nature, or might require transposing data across disparate systems.
The rise of the web
The web changed everything. The underlying technology allowed for dynamic data.
Software agents
Examples of intelligence within the network included early software agents. A good example of this was PapriCom. PapriCom had a client on the user’s computer. The software client monitored price changes for products that the customer was interested in buying. The app then notified the user when the monitored price reached a price determined by the customer. The company became known as DealTime in the US and UK, or Evenbetter.com in Germany[xix].
The PapriCom client app was part of a wider set of technologies known as ‘push technology’, which brought content that the netizen would want directly to their computer, in a similar way to mobile app notifications now.
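The behaviour of a price-watching agent like this can be sketched in a few lines of Python. This is only an illustration of the idea described above, not PapriCom’s actual design: the `PriceWatch` class, the `check_prices` function and the dictionary standing in for a live price feed are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class PriceWatch:
    product: str
    target_price: float
    notified: bool = False  # avoid alerting twice for the same watch


def check_prices(
    watches: Iterable[PriceWatch],
    fetch_price: Callable[[str], float],
    notify: Callable[[str], None],
) -> None:
    """Poll each watched product once and notify when the target is reached."""
    for watch in watches:
        if watch.notified:
            continue
        price = fetch_price(watch.product)
        if price <= watch.target_price:
            notify(f"{watch.product} is now {price:.2f} "
                   f"(target {watch.target_price:.2f})")
            watch.notified = True


# Stand-in for a retailer price feed (made-up numbers).
prices = {"camera": 210.0, "laptop": 899.0}
watches = [PriceWatch("camera", 200.0), PriceWatch("laptop", 900.0)]
alerts: list[str] = []

check_prices(watches, prices.__getitem__, alerts.append)
print(alerts)
```

A real client would run the check on a timer and fetch prices over the network; the point of the sketch is that the ‘intelligence’ is a simple rule applied persistently on the user’s behalf.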
Web search
The wealth of information quickly outstripped netizens’ ability to explore the content. Search engines became essential for navigating the new online world. Progress was made in clustering vast amounts of cheap Linux-powered computers together and sharing the workload to power web search amongst them. As search started trying to make sense of an exponentially growing web, machine learning became part of the developer tool box.
Researchers at Carnegie Mellon looked at using games to help teach machine learning algorithms based on human responses that provided rich metadata about the given item[xx]. This became known as the ESP game. In the early 2000s, Yahoo! turned to web 2.0 start-ups that used user-generated labels called tags[xxi] to help organise their data. Yahoo! bought Flickr[xxii] and del.icio.us[xxiii].
All the major search engines looked at how deep learning could help improve search results relevance.
Given that the business model for web search was advertising-based, reducing the cost per search while maintaining search quality was key to Google’s success. Early on Google focused on energy consumption, with its (search) data centres becoming carbon neutral in 2007[xxiv]. This was achieved by a whole-system effort: carefully managing power in the silicon, storage, networking equipment and air conditioning to maximise intelligence per watt. All of this ran on optimised versions of open-source software and cheap general purpose PC components ganged together in racks and operating together in clusters.
General purpose ICs for personal computers and consumer electronics allowed easy access to relatively low-power computing. Much of this was down to process improvements that were being made at the time. You needed the volume of chips to drive innovation in mass-production at a chip foundry. While application-specific chips had their uses, commodity mass-volume products for everything from embedded applications to early mobile / portable devices and computers drove progress in improving intelligence per watt.
Makimoto’s tsunami back to specialised ICs
When I talked about the decline of LISP machines, I mentioned the move towards standardised IC design predicted by Tsugio Makimoto. This led to a surge in IC production, alongside other components including flash and RAM memory. From the mid-1990s to about 2010, Makimoto’s predicted phase was stuck in ‘standardisation’. It just worked. But several factors drove the swing back to specialised ICs.
Lithography processes got harder: standardisation got its performance and intelligence per watt bump because there had been a steady step change in improvements in foundry lithography processes that allowed components to be made at ever-smaller dimensions. The dimensions are a function of the wavelength of light used. The semiconductor industry hit an impasse when it needed to move to EUV (extreme ultraviolet) light sources. From the early 1990s on, US government research projects championed development of key technologies that allow EUV photolithography[xxv]. During this time Japanese equipment vendors Nikon and Canon gave up on EUV. Sole US vendor SVG (Silicon Valley Group) was acquired by ASML, giving the Dutch company a global monopoly on cutting edge lithography equipment[xxvi]. ASML became the US Department of Energy research partner on EUV photolithography development[xxvii]. ASML spent over two decades trying to get EUV to work. Once they had it in client foundries, further time was needed to get commercial levels of production up and running. All of which meant that production processes to improve IC intelligence per watt slowed down and IC manufacturers had to start thinking about systems in a more holistic manner. As foundry development became harder, there was a rise in fabless chip businesses. Alongside the fabless firms, there were fewer foundries: GlobalFoundries, Samsung and TSMC (Taiwan Semiconductor Manufacturing Company Limited). TSMC is the world’s largest ‘pure-play’ foundry, making ICs for companies including AMD, Apple, Nvidia and Qualcomm.
Progress in EDA (electronic design automation). Production process improvements in IC manufacture allowed for an explosion in device complexity as the number of components on a given size of IC doubled every 18 months or so. In the mid-to-late 1970s this led to technologists thinking about the idea of very large-scale integration (VLSI) within IC designs[xxviii]. Through the 1980s, commercial EDA software businesses were formed. The EDA market grew because it facilitated the continual scaling of semiconductor technology[xxix]. Secondly, it facilitated new business models. Businesses like ARM Semiconductor and LSI Logic allowed their customers to build their own processors based on ‘blocks’ of proprietary designs like ARM’s cores. That allowed companies like Apple to focus on optimisation in their custom silicon and integration with software to help improve the intelligence per watt[xxx].
Increased focus on portable devices. A combination of digital networks, wireless connectivity, the web as a communications platform with universal standards, flat screen displays and improving battery technology led the way in moving towards more portable technologies. From personal digital assistants, MP3 players and smartphones, to laptop and tablet computers – disconnected mobile computing was the clear direction of travel. Cell phones offered days of battery life; the Palm Pilot PDA had a battery life allowing for a couple of days of continuous use[xxxi]. In reality it would do a month or so of work. Laptops at the time could do half a day’s work when disconnected from a power supply. Manufacturers like Dell and HP provided spare batteries for travellers. Given changing behaviours, Apple wanted laptops that were easy to carry and could last most of a day without a charge. This was partly driven by a move to a cleaner product design that wanted to move away from swapping batteries. In 2005, Apple moved from PowerPC to Intel processors. During the announcement at the company’s worldwide developer conference (WWDC), Steve Jobs talked about the focus on computing power per watt moving forwards[xxxii].
Apple’s first in-house designed IC, the A4 processor, was launched in 2010 and marked the pivot of Makimoto’s wave back to specialised processor design[xxxiii]. This marked a point of inflection in the growth of smartphones and specialised computing ICs[xxxiv].
New devices also meant new use cases that melded data on the web, on device, and in the real world. I started to see this in action working at Yahoo! with location data integrated on to photos and social data like Yahoo! Research’s ZoneTag and Flickr. I had been the Yahoo! Europe marketing contact on adding Flickr support to Nokia N-series ‘multimedia computers’ (what we’d now call smartphones), starting with the Nokia N73[xxxv]. A year later the Nokia N95 was the first smartphone released with a built-in GPS receiver. William Gibson’s speculative fiction story Spook Country came out in 2007 and integrated locative art as a concept in the story[xxxvi].
Real-world QR codes helped connect online services with the real world, such as mobile payments or reading content online like a restaurant menu or a property listing[xxxvii].
I labelled the web-world integration as a ‘web-of-no-web’[xxxviii] when I presented on it back in 2008 as part of an interactive media module I taught to an executive MBA class at Universitat Ramon Llull in Barcelona[xxxix]. In China, wireless payment ideas would come to be labelled O2O (offline to online) and Kevin Kelly articulated a future vision for this fusion which he called Mirrorworld[xl].
Deep learning boom
Even as there was a post-LISP machine dip in funding of AI research, work on deep (multi-layered) neural networks continued through the 1980s. Other areas were explored in academia during the 1990s and early 2000s due to the large amount of computing power needed. Internet companies like Google gained experience in large clustered computing and had a real need to explore deep learning. Use cases included image recognition to improve search and dynamically altered journeys to improve mapping and local search offerings. Deep learning is probabilistic in nature, which dovetailed nicely with prior work Microsoft Research had been doing since the 1980s on Bayesian approaches to problem-solving[xli].
A key factor in deep learning’s adoption was having access to powerful enough GPUs to handle the neural network compute[xlii]. This has allowed various vendors to build Large Language Models (LLMs). The perceived strategic importance of artificial intelligence has meant that considerations on intelligence per watt have become a tertiary consideration at best. Microsoft has shown interest in growing its data centres, with less thought given to the electrical infrastructure required[xliii].
Google’s conference paper on attention mechanisms[xliv] highlighted the development of the transformer model. As an architecture it got around problems in previous approaches, but is computationally intensive. Even before the paper was published, the Google transformer model had created fictional Wikipedia entries[xlv]. A year later OpenAI built on Google’s work with the generative pre-trained transformer model better known as GPT[xlvi].
Since 2018 we’ve seen successive transformer-based models from Amazon, Anthropic, Google, Meta, Alibaba, Tencent, Manus and DeepSeek. All of these models were trained on vast amounts of information sources. One of the key limitations for building better models was access to training material, which is why Meta used pirated copies of e-books obtained using BitTorrent[xlvii].
These models were so computationally intensive that the large-scale cloud service providers (CSPs) offering these generative AI services were looking at nuclear power access for their data centres[xlviii].
The current direction of development in generative AI services is raw computing power, rather than having a more energy efficient focus of intelligence per watt.
Technology consultancy / analyst Omdia estimated how many GPUs were bought by hyperscalers in 2024[xlix].
| Company | Number of Nvidia GPUs bought | Number of AMD GPUs bought | Number of self-designed custom processing chips bought |
|---|---|---|---|
| Amazon | 196,000 | – | 1,300,000 |
| Alphabet (Google) | 169,000 | – | 1,500,000 |
| ByteDance | 230,000 | – | – |
| Meta | 224,000 | 173,000 | 1,500,000 |
| Microsoft | 485,000 | 96,000 | 200,000 |
| Tencent | 230,000 | – | – |
These numbers provide an indication of the massive deployment of GPT-specific computing power. Despite the massive amount of computing power available, services still weren’t able to cope[l], mirroring some of the service problems experienced by early web users[li] and the Twitter ‘fail whale’[lii] phenomenon of the mid-2000s. The race to bigger, more powerful models is likely to continue for the foreseeable future[liii].
There is a second class of players typified by Chinese companies DeepSeek[liv] and Manus[lv] that look to optimise the use of older GPT models to squeeze the most utility out of them in a more efficient manner. Both of these services still rely on large cloud computing facilities to answer queries and perform tasks.
Agentic AI
Thinking on software agents goes back to work done in computer science in the mid-1970s[lvi]. Apple articulated a view[lvii] of a future system dubbed the ‘Knowledge Navigator’[lviii] in 1987, which hinted at autonomous software agents. What we’d now think of as agentic AI was discussed as a concept at least as far back as 1995[lix]. This was mirrored in research labs around the world and captured in a 1997 survey of research on intelligent software agents[lx]. These agents went beyond the vision that PapriCom implemented.
A classic example of this was Wildfire Communications, Inc. who created a voice enabled virtual personal assistant in 1994[lxi]. Wildfire as a service was eventually shut down in 2005 due to an apparent decline in subscribers using the service[lxii]. In terms of capability, Wildfire could do tasks that are currently beyond Apple’s Siri. Wildfire did have limitations due to it being an off-device service that used a phone call rather than an internet connection, which limited its use to Orange mobile service subscribers using early digital cellular mobile networks.
Almost a quarter century later we’re now seeing devices that are looking to go beyond Wildfire with varying degrees of success. For instance, the Rabbit R1 could order an Uber ride or groceries from DoorDash[lxiii]. Google Duplex tries to call restaurants on your behalf to make reservations[lxiv] and Amazon claims that it can shop across other websites on your behalf[lxv]. At the more extreme end is Boeing’s MQ-28[lxvi] and the Loyal Wingman programme[lxvii]. The MQ-28 is an autonomous drone that would accompany US combat aircraft into battle, once it’s been directed to follow a course of action by its human colleague in another plane.
The MQ-28 will likely operate in an electronic environment that could be jammed. Even if it wasn’t jammed the length of time taken to beam AI instructions to the aircraft would negatively impact aircraft performance. So, it is likely to have a large amount of on-board computing power. As with any aircraft, the size of computing resources and their power is a trade-off with the amount of fuel or payload it will carry. So, efficiency in terms of intelligence per watt becomes important to develop the smallest, lightest autonomous pilot.
As well as a more hostile world, we also exist in a more vulnerable time in terms of cyber security and privacy. It makes sense to have critical, more private AI tasks run on a local machine. At the moment models like DeepSeek can run natively on a top-of-the-range Mac workstation with enough memory[lxviii].
This is still a long way from the vision of completely local execution of ‘agentic AI’ on a mobile device, because the intelligence per watt hasn’t scaled down to that level to be useful, given the vast number of possible uses that would be asked of the agentic AI model.
Maximising intelligence per watt
There are three broad approaches to maximise the intelligence per watt of an AI model.
Take advantage of the technium. The technium is an idea popularised by author Kevin Kelly[lxix]. Kelly argues that technology moves forward inexorably, each development building on the last. Current LLMs such as ChatGPT and Google Gemini take advantage of the ongoing technium in hardware development including high-speed computer memory and high-performance graphics processing units (GPUs). Their developers have been building large data centres to run their models in. They build on past developments in distributed computing going all the way back to 1962[lxx].
Optimise models to squeeze the most performance out of them. The approach taken by some of the Chinese models has been to optimise the technology just behind the leading-edge work done by the likes of Google, OpenAI and Anthropic. The optimisation may use both LLMs[lxxi] and quantum computing[lxxii] – I don’t know about the veracity of either claim.
Specialised models. Developing models by use case can reduce the size of the model and improve the applied intelligence per watt. Examples of this range from the fuzzy logic used for the past four decades in consumer electronics to Mistral AI[lxxiii] and Anduril’s Copperhead underwater drone family[lxxiv].
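I use intelligence per watt as a guiding principle rather than a formal metric, but a rough worked example helps compare the three approaches. The sketch below assumes one possible reading: useful results delivered per unit of energy consumed. The function name and every number in it are made up for illustration and are not measurements of any real model.

```python
def answers_per_watt_hour(correct_answers: int,
                          avg_power_watts: float,
                          runtime_seconds: float) -> float:
    """Correct answers delivered per watt-hour of energy used."""
    energy_wh = avg_power_watts * runtime_seconds / 3600
    return correct_answers / energy_wh


# Hypothetical figures: a large cloud-hosted model versus a small
# specialised model running on a local device, on the same 100 tasks.
cloud = answers_per_watt_hour(correct_answers=90,
                              avg_power_watts=700, runtime_seconds=60)
local = answers_per_watt_hour(correct_answers=70,
                              avg_power_watts=30, runtime_seconds=120)

print(f"cloud: {cloud:.1f} answers/Wh, local: {local:.1f} answers/Wh")
```

With these invented figures the smaller model gets fewer answers right and takes longer, yet delivers far more per unit of energy. That is the trade-off that optimised and specialised models are betting on.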
Even if an AI model can do something, should the model be asked to do so?
We have a clear direction of travel over the decades to more powerful, portable computing devices, which could function as an extension of their user once intelligence per watt allows AI to be run locally.
Having an AI run on a cloud service makes sense where you are on a robust internet connection, such as using the wi-fi network at home. This makes sense for general everyday tasks with no information risk, for instance helping you complete a newspaper crossword if there is an answer you are stuck on and the intellectual struggle has gone nowhere.
A private cloud AI service would make sense when working with, accessing or processing data held on the service. Examples of this would be Google’s Vertex AI offering[lxxv].
On-device AI models make sense in working with one’s personal private details such as family photographs, health information or accessing apps within your device. Apps like Strava which share data, have been shown to have privacy[lxxvi] and security[lxxvii] implications. ***I am using Strava as an example because it is popular and widely-known, not because it is a bad app per se.***
While businesses have the capability and resources to have a multi-layered security infrastructure to protect their data most[lxxviii] of[lxxix] the[lxxx] time[lxxxi], individuals don’t have the same security. As I write this there are privacy concerns[lxxxii] expressed about Waymo’s autonomous taxis. However, an individual’s mobile device is rarely out of physical reach, and for many their laptop or tablet is similarly close. All of these devices tend to be used in concert with each other. So, for consumers, having an on-device AI model makes the most sense. All of which results in a problem: how do technologists squeeze down their most complex models inside a laptop, tablet or smartphone?