


December 27, 2016 by Bernard Murphy

The Other Half of AI

by Bernard Murphy on 12-27-2016 at 7:00 am
Categories: General

I touched earlier on challenges that can appear in AI systems which operate as black boxes, particularly deep learning systems. The problems are limited when such systems are applied to simple recognition tasks, e.g. recognizing a speed limit posted on a sign. In these cases, the recognition task is (from a human viewpoint) simply choosing from among a limited set of easily distinguished options, so an expert observer can easily determine if/when the system made a bad decision.

But as AI is extended to more complex tasks, it becomes increasingly difficult to accurately grade the performance of those systems. Certainly, there will still be many cases where an expert observer can classify performance easily enough. But what about cases where the expert observer isn’t sure? Where is the student surpassing the master and where is the student simply wrong?

This reminded me of an important Indian mathematician, Srinivasa Ramanujan, whose methods were in some ways as opaque as current AI systems. A movie released this year, The Man Who Knew Infinity, covers Ramanujan’s career and challenges. He had an incredible natural genius for mathematics, but chose to present results with little or no evidence for how he got there (apparently because he couldn’t afford the extra paper required to write out the proofs).

This lack of demonstrated proofs raised concerns among professional mathematicians of Ramanujan’s time. Mathematical rigor requires a displayed proof leading to the result, so that other experts can validate (or disprove) the claim. This is not unlike the above-mentioned concern with modern deep learning systems. For conclusions which a human expert can easily classify there is no problem, but for more complex assertions a bald statement of a conclusion is insufficient. We want to know how the system arrived at that conclusion for one of two reasons: it might be wrong and if so we want to know where it went wrong so we can fix it (perhaps by improving the training set), or it might be right in which case we’d like to know why so we can improve our own understanding.

Recent work at UC Berkeley and the Max Planck Institute for Informatics has made progress in this direction for deep learning systems. The underlying mechanics are the same, but they use multiple training datasets, both to deliver a conclusion and to justify the sub-steps leading to that conclusion. The domain for the study is image recognition, specifically determining aspects of what is happening in an image (for example, what sport is being played).

The research team noted that a system-generated chain of reasoning may not correspond to how a human expert would think of a problem, so a better approach needs some user friendliness. Instead of presenting the user with a proof, let them ask questions which the system should answer, an approach known as visual question answering (VQA). While this may not lead to mathematically rigorous proofs, it seems very appropriate for many domains where a human expert wants to feel sufficiently convinced but may not need every possible proof point.

The method requires two principal components: VQA augmented by spatial attention, where the system looks at localized image features (such as a figure) to draw conclusions (this person is holding a bat), and more global activity recognition/explanation (this is a baseball game). These datasets were annotated through crowdsourcing with “proposition because explanation” labels.
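To make the “proposition because explanation” label format concrete, here is a minimal sketch of how one such label might be held as structured data. The `Justification` class, the `parse_label` helper and the sample label are hypothetical illustrations, not taken from the paper or its datasets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Justification:
    """Hypothetical container for one annotated label."""
    proposition: str  # what is being asserted
    explanation: str  # the evidence offered for the assertion


def parse_label(label: str) -> Justification:
    """Split a 'proposition because explanation' label at the first ' because '."""
    proposition, sep, explanation = label.partition(" because ")
    if not sep or not proposition.strip() or not explanation.strip():
        raise ValueError(f"label is not of the form 'X because Y': {label!r}")
    return Justification(proposition.strip(), explanation.strip())


# Made-up example label, in the style described above.
label = "this is a baseball game because the player is holding a bat"
print(parse_label(label))
```

Keeping the two halves separate is what would let a reviewer accept the proposition while still challenging the explanation.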

The research team also wanted to point to an object supporting a proposition; for example, if the VQA asserted “this person is holding a bat”, they wanted to point to the bat. This is where the attention aspect of the model becomes important. You could imagine this kind of capability being critical in a medical diagnosis where perhaps a key aspect of the diagnosis rests on an assumption that a dark spot in an X-ray corresponds to a tumor. “This patient has a tumor as shown in this X-ray” is hardly a sufficient proposition, whereas “this patient has a tumor in the liver as shown at this location in this X-ray” is much more usable information and something a doctor could confirm or challenge.
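The “pointing” idea can be sketched in a few lines, assuming the image is divided into a grid of regions and that some model has already produced a relevance score per region for a given proposition. The scores, the grid size and the function names below are invented for illustration; in a real system the scores would come from a trained attention layer, and this is not the paper's implementation.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def point_to_evidence(score_grid):
    """Return ((row, col), weight) for the grid cell with the most attention."""
    cols = len(score_grid[0])
    flat = [s for row in score_grid for s in row]
    weights = softmax(flat)
    best = max(range(len(weights)), key=weights.__getitem__)
    return divmod(best, cols), weights[best]


# Hypothetical relevance scores for "this person is holding a bat"
# over a 3x3 grid of image regions (higher means more relevant).
scores = [
    [0.1, 0.2, 0.1],
    [0.3, 2.5, 0.4],
    [0.1, 0.6, 0.2],
]
cell, weight = point_to_evidence(scores)
print(f"evidence at cell {cell}, attention weight {weight:.2f}")
```

The returned cell is what a user interface could highlight, so that an expert can check whether the region the system relied on really contains the bat (or the tumor).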

As we aim to push AI into more complex domains, this kind of justification process will become increasingly important. Which of us would trust a medical diagnosis delivered by a machine without a medical expert first reviewing and approving that diagnosis? Collision avoidance in a car may not allow time to review before taking action, but subsequent litigation may quite possibly demand review on whether the action taken was reasonable. Even (and perhaps especially) where AI is being used to guide scientific discovery or proof, the AI will need to demonstrate a chain of reasoning which human experts can test for robustness. It will be a very long time before “because my AI system said so” will be considered a sufficient alternative to peer review. Which is really the point. Important decisions, whether made by people or machines, should not be exempt from peer review.

You can read the UCB/MPI arXiv paper HERE.
