[SydPhil] HPS Research Seminar, Monday 9 September 2024 at 5.30pm

Mon Sep 2 10:30:53 AEST 2024

School of History and Philosophy of Science
RESEARCH SEMINAR
The University of Sydney
Illusions of Explanation in Deep Learning

Raphaël Millière (Macquarie University)

Date: Monday, 9 September 2024
Time: 5:30pm
Venue: Madsen Seminar Room 331, Madsen Building (F09.331)
How to register: Free, no registration required

Abstract: Recent advancements in artificial intelligence have been largely driven by deep learning. However, deep neural networks (DNNs) are often characterized as inscrutable "black boxes": while we can study their performance on various tasks, we struggle to understand the internal mechanisms that drive it. Mechanistic interpretability has emerged as a promising approach to unveil the inner workings of DNNs by decoding the computations and representations underlying their behavior. While preliminary results in toy models show potential, scaling these techniques to large-scale DNNs remains a challenge. Here, I investigate a serious concern about the viability of this project: the possibility of illusory explanations that appear to reveal how DNNs process information but are, in fact, misleading. I present a novel typology of such interpretability illusions, and explore potential strategies to mitigate their occurrence and impact on explanations.

Bio: Raphaël Millière is a Lecturer (Assistant Professor) in Philosophy of Artificial Intelligence at Macquarie University in Sydney, Australia. Prior to joining Macquarie in 2023, he was the Robert A. Burt Presidential Scholar in Society and Neuroscience at Columbia University. His research interests lie at the intersection of philosophy, cognitive science, and artificial intelligence. His current work focuses on assessing the linguistic and reasoning abilities of large language models, drawing on philosophy and computer science to shed light on the potential of these models to advance our understanding of human cognition.

Copyright © 2024 The University of Sydney, NSW 2006 Australia
Phone +61 2 9351 2222 ABN 15 211 513 464 CRICOS Number: 00026A

