Learning machines discussion

Why learning machines? Simply because I don’t want to get into an argument about what an AI actually is, hence the title change. It makes for interesting watching, though.

The Churchill Club managed to get top-drawer panelists for this session on learning machines:

Yoshua Bengio, Professor, Department of Computer Science and Operations Research, Universite de Montreal
John E. Kelly, Senior Vice President, Solutions Portfolio and Research, IBM

The panel was moderated by long-time New York Times technology journalist John Markoff.

Key takeouts for me were:

There has been a cycle of over-promising and under-delivering since the 1950s, which drove AI winters and booms. The current cycle goes back to the DARPA autonomous vehicle competitions of the past decade. Neural networks weren’t seen as ‘AI’; they went out of research fashion in the 1990s and research picked back up in 2005. Deep learning is basically layers of neural network: more layers = ‘deeper’. Deep nets are now the standard for learning machines. Object recognition improved in 2012 and both industrial and media interest took off.
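To make the ‘more layers = deeper’ point concrete, here is a minimal sketch in plain Python. Nothing here comes from the panel: the function names (`build_network`, `forward`), the layer sizes, the ReLU activation and the random weights are all my own illustrative assumptions, and there is no training involved. It only shows that a ‘deep’ net is the same building block stacked more times.

```python
import random


def make_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer (illustrative only)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def build_network(sizes, seed=0):
    """Build one layer per adjacent pair of sizes, e.g. [4, 8, 3] gives 2 layers."""
    rng = random.Random(seed)
    return [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]


def relu(x):
    """Assumed activation function: pass positives through, clamp negatives to 0."""
    return max(0.0, x)


def forward(layers, inputs):
    """Feed the inputs through every layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = [
            relu(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


# Same input and output sizes; the only difference is how many layers sit between.
shallow = build_network([4, 3])          # 1 layer
deep = build_network([4, 8, 8, 8, 3])    # 4 layers, i.e. 'deeper'

sample = [1.0, 0.0, 0.5, 0.2]
shallow_out = forward(shallow, sample)
deep_out = forward(deep, sample)
```

Both networks take four inputs and give three outputs; ‘depth’ is just `len(deep)` versus `len(shallow)`.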

Performance has been helped by improved hardware, which has driven the breakthroughs.

Learning machines still need a lot of human guidance; unsupervised learning isn’t doing as well. That probably explains why IBM has so many people working on Watson projects. It also explains why Watson is externally seen primarily as a ‘marketing concept’.

On the rate of change in semiconductors: Moore’s Law is likely to top out at 5nm. Carbon nanotube devices will be the new silicon in semiconductors. Quantum computing will drive performance in certain types of calculations by a factor of 20. Graphene is better for analogue devices, nanotubes are better for switching. The cost of transistors has stopped falling, which has an implication for new disruptive industries.

We’ll get performance and density, but the cost of getting them is more uncertain. Computing power is important for learning machine technologies. Power consumption (computing power per watt) is tremendously important. IBM Watson on Jeopardy used 85,000 watts to beat two 20-watt humans.
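The Jeopardy comparison is easier to appreciate as a ratio. The quick sum below uses only the two figures quoted on the panel (85,000 watts for Watson, 20 watts per human); the function name `power_ratio` is my own, and this is back-of-the-envelope arithmetic rather than any kind of benchmark.

```python
WATSON_WATTS = 85_000     # figure quoted for Watson on Jeopardy
HUMAN_BRAIN_WATTS = 20    # figure quoted per human contestant
NUM_HUMANS = 2


def power_ratio(machine_watts, human_watts, humans=1):
    """How many times more power the machine draws than the given number of humans."""
    return machine_watts / (human_watts * humans)


per_human = power_ratio(WATSON_WATTS, HUMAN_BRAIN_WATTS)                 # 4250.0
versus_both = power_ratio(WATSON_WATTS, HUMAN_BRAIN_WATTS, NUM_HUMANS)   # 2125.0

print(f"Watson vs one human: {per_human:,.0f}x")
print(f"Watson vs both humans combined: {versus_both:,.0f}x")
```

Even counting both contestants together, Watson drew over two thousand times the power, which is why computing power per watt matters so much.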

Back propagation allows the use of lower power processors. Speech and vision are areas of a big push, but the most exciting area is language recognition and understanding with recurrent networks – implications for conversational interfaces and services.

IBM think cognitive computing is a wider area than ‘machine learning’. Cognitive computing is what IBM think will transform ‘digital transformation’ through learning machines.

AI has a definition problem due to fashion and academic quarrels.

Watson was originally designed to deal with massive unstructured data rather than building an AI. Data was growing faster than IBM could develop for. Watson had a data centric focus. It sounds rather like the vision I once heard articulated for both Autonomy and Palantir.

Consumers will see Watson as ‘insights’. Watson as a learning machine focuses on comparing and contrasting to try and find patterns.

Interesting that IBM went in so hard on healthcare as an example, given how they eventually retreated from the sector after scandals over unsafe diagnosis.

UPDATE: October 6, 2020 – report on AI progress here.