Scanning ACM tech news recently, I came across a piece that spoke to my inner nerd; I hope it will appeal to some of you too. The discovery will have no impact on markets or investments, or probably on anyone outside the theory of machine learning. Its appeal is simply the beauty of connecting a profound but obscure corner of mathematical logic to a hot domain in AI.
There is significant activity in machine learning theory: figuring out how best to optimize neural nets, understanding what bounds we can put on the accuracy of results, and generally adding the predictive power you would expect of any scientifically and mathematically well-grounded discipline. Some of this work is fairly close to implementation; some delves into the foundations of machine learning.
In foundational theory, one question is whether it is possible to prove, within some appropriate framework, that an objective is learnable (or not). Identifying cat and dog breeds is simple enough: throw enough samples at the ML system and eventually you’ll cover all the useful variants. But what about identifying patterns in very long strings of numbers or letters? Since we can’t easily cross-check that the ML found exactly the cases it should and no others, and since the sample size could potentially be boundless (think of data streams in a network), a theoretical approach to validating learnability looks pretty attractive.
There’s a well-established mathematical framework for this analysis called “probably approximately correct (PAC) learning,” in which a learning system reads in samples and must build a generalization function, drawn from a class of possible functions, to represent what it has learned. The use of “functions” rather than implementation details is intentional; the goal is a very general analysis abstracted from any implementation. The target function is simply a map from an input sample (data set) to an output value: match or no match. The theory provides a way (based on the Vapnik–Chervonenkis, or VC, dimension of the function class) to characterize how many training samples will be needed for a given problem, which has apparently been widely and productively used in ML applications.
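To give a flavor of the sample-complexity results in this framework, here is a minimal sketch of the classic PAC bound for a finite class of candidate functions; this is a standard textbook result, not anything specific to the work discussed here, and the function name is my own:

```python
import math

def pac_sample_bound(epsilon, delta, hypothesis_count):
    """Classic PAC bound for a finite hypothesis class (realizable case):
    m >= (1/epsilon) * (ln|H| + ln(1/delta)) samples suffice so that, with
    probability at least 1 - delta, any hypothesis consistent with the
    samples has true error at most epsilon."""
    return math.ceil((math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon)

# e.g. 1000 candidate functions, 5% error tolerance, 95% confidence:
print(pac_sample_bound(0.05, 0.05, 1000))  # 199 samples suffice
```

For infinite function classes the count of hypotheses is replaced by the VC dimension of the class, but the shape of the bound is much the same: the required sample count grows only logarithmically in the confidence and gently in the richness of the class.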
However, when a theory deals in sets (of data) and functions on those sets, it strays onto mathematical-logic turf and becomes subject to known limitations in that domain. A group of mathematicians at the Technion-Israel Institute of Technology in Haifa has demonstrated that there exist families of sets, together with target learning problems, for which learnability can neither be proved nor disproved within the standard axioms of mathematics (ZFC). Learnability is undecidable; more precisely, it is independent of the base axiom system, to distinguish this from undecidability in the computability sense.
If you ever read “Gödel, Escher, Bach” or anything else on Gödel, this should sound familiar. He proved, back in 1931, that no consistent formal system rich enough to express arithmetic can prove all truths about the integers; there will always be statements about the integers that can be neither proved nor disproved within the system. The same restriction, it seems, applies to ML: there are learning problems for which learnability cannot be proved or disproved. More concretely, as I understand it, for this class of problem it is not possible to determine an upper bound on the number of training samples you would need to supply to adequately train the system. (You might wonder whether this could instead be derived from the halting problem; the authors used Gödelian methods, so that is what I describe here.)
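For background, this is the standard PAC picture the new result pushes against (textbook statistical learning theory, not from the paper itself): for binary classification, a class of functions is PAC-learnable exactly when its VC dimension $d$ is finite, and the number of samples needed is

```latex
m(\varepsilon, \delta) \;=\; \Theta\!\left(\frac{d + \log(1/\delta)}{\varepsilon^{2}}\right)
```

in the agnostic setting (with $\varepsilon$ in place of $\varepsilon^{2}$ in the realizable setting). The undecidability result concerns a more general learning setting where, within the standard axioms, no such finite combinatorial characterization settles the question.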
This is unlikely to affect ML as we know it. Even in mathematics, Gödelian traps are few and far between, most quite specialized, although a few, like Goodstein’s theorem, are simple to state. And of course we know of other problems, like the traveling salesman problem, which are theoretically intractable yet are managed effectively every day in chip physical design. So don’t sell your stock in ML-based enterprises; none of this will perturb their efforts even slightly. But it is pretty, nonetheless.
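Goodstein’s theorem, mentioned above, is concrete enough to play with. The sketch below (my own helper names, but the standard construction) computes a Goodstein sequence: write the current value in hereditary base-b notation, replace every b by b+1, subtract 1, and repeat with the next base. Every such sequence eventually hits zero, yet that fact cannot be proved in ordinary Peano arithmetic:

```python
def bump_base(n, b):
    """Rewrite n in hereditary base-b notation (exponents rewritten too),
    then replace every occurrence of b with b + 1."""
    total, exponent = 0, 0
    while n:
        digit = n % b
        if digit:
            total += digit * (b + 1) ** bump_base(exponent, b)
        n //= b
        exponent += 1
    return total

def goodstein(m, max_steps=50):
    """First terms of the Goodstein sequence starting at m."""
    seq, base = [m], 2
    while seq[-1] != 0 and len(seq) < max_steps:
        seq.append(bump_base(seq[-1], base) - 1)
        base += 1
    return seq

print(goodstein(3))  # [3, 3, 3, 2, 1, 0]: reaches zero quickly
```

Starting at 4 instead, the sequence climbs for an astronomically long time before descending; Goodstein showed it still reaches zero, but only by stepping outside Peano arithmetic into ordinal arithmetic.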