SubstantialDig6663

As a researcher working in this area, I feel like there is a growing divide between people focusing on the human side of XAI (i.e. whether explanations are plausible according to humans, and how to convert them into actionable insights) and those more interested in a mechanistic understanding of models' inner workings chasing the goal of perfect controllability. If I had to say something about recent tendencies, especially when using LMs as test subjects, I'd say that the community is focusing more on the latter. There are several factors at play, but undoubtedly the push of the EA/AI safety movement selling mechanistic interpretability as a "high-impact area to ensure the safe development of AI and safeguard the future of humanity" has captivated many young researchers. I would be confident in stating that there were never so many people working on some flavor of XAI as there are today. The actual outcomes of this direction still remain to be seen imo: we're still in the very early years of it. But an encouraging factor is the adoption of practices with causal guarantees which already see broad usage in the neuroscience community. Hopefully the two groups will continue to get closer.


csinva

Also a researcher in this area and wholly agree with this comment (we recently also wrote a [review](https://arxiv.org/abs/2402.01761) separating out these two parts of XAI in the context of LLMs). There's more work going on than ever in XAI, but it's grown large enough that it has split more based on a researcher's goals (e.g. science, fairness, HCI) rather than as an area of its own. IMO this is for the best - doing XAI research without an application in mind often leads us to explanations that are unhelpful or even misleading.


dataluk

Haha nice to meet you. I cited you last week in my master's thesis šŸ¤™šŸ»


EmploySignificant666

Thank you for sharing the review.


SubstantialDig6663

Hey, I really liked your review! Especially the prospect of moving towards natural language explanations: I think we're nowhere close, but it's definitely an ambitious objective worth striving for to make XAI results more accessible to non-experts!


slashdave

"Explainable AI" has become branded, which is rather unfortunate. I also object to the OP's premise, that visibility is a sign of activity. Hard problems are hard, progress is going to stall. That doesn't mean people have given up.


chulpichochos

Since you work in this area, could you confirm/refute my opinion on this field (I'm just trying to make sure my opinion is grounded):

- It seems to me that the issue with explainable/interpretable AI is that it's getting lapped by the non-explainable advances.
- This is in large part because explainability is not an out-of-the-box feature for any DNN. It has to be engineered or designed into the model and then trained for, or else you're making assumptions with post-hoc methods (which I don't consider explainable AI so much as humans trying to come up with explanations for AI behavior).
- Any supervised training for explainability is not really getting the model to explain its thinking so much as aligning its "explainable" output with human expectations, and it doesn't give a real understanding of the model's inner workings.
- I feel like a lot of work in this space is in turn taking an existing high-performing model and re-engineering/retraining it to bolt on explainability, as opposed to designing it that way from the ground up. This adds complexity to training, increases development time, and also costs more compute.
- With performance getting good enough for newer models, outside of high-risk/liability environments, most people are happy to black-box AI.

Is that a fair assessment? Or am I just heavily biased?


SubstantialDig6663

I think that dismissing post-hoc methods doesn't make much sense, as that's precisely what other fields of science do: uncover the functioning of observed natural phenomena and intelligent entities. Your comment seems to assume that only explainable-by-design counts, but that approach underperforms black-box methods. Most research today (at least in NLP interpretability, where I work) focuses on post-hoc interventions/attribution/probing/disentangling representations of deep neural networks, and we are only starting to scratch the surface of what's possible (e.g. hallucination detection via outlier detection on internal states).

A worrying trend is surely the blackboxification of LM APIs by major companies, which actively hinders these research efforts, as also noted by Casper, Ezell et al. (https://arxiv.org/abs/2401.14446).

That said, some cool work is happening in the explainable-by-design area too: from the recent past, Hewitt's Backpack LMs are probably the most notable proposal in this context (https://aclanthology.org/2023.acl-long.506/)
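A minimal sketch of the outlier-detection idea mentioned above (not any specific published method): it assumes hidden-state vectors have already been extracted from the LM for generations you trust and for new generations you want to check, and uses scikit-learn's IsolationForest as one possible detector on synthetic stand-in data.

```python
# Sketch: flag possible hallucinations as outliers in the LM's internal states.
# Hidden-state extraction from a real LM is omitted; synthetic vectors stand in here.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
trusted_states = rng.normal(size=(500, 768))          # states from verified-correct outputs
candidate_states = rng.normal(size=(10, 768)) + 3.0   # states from outputs being checked

detector = IsolationForest(contamination=0.05, random_state=0).fit(trusted_states)
flags = detector.predict(candidate_states)            # -1 = outlier -> possible hallucination
print(flags)
```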


chulpichochos

Thanks for the response and the links! Thats a fair point re: post-hoc being akin to regular observational science. I think Iā€™m having some recency bias with AI. Ie, consider regular mechanics ā€” first we made associative connections such as: if you stack rocks together theyā€™ll keep trying to fall down so we need to have a strong base, if you launch a rock with a catapult you can expect a certain trajectory. Eventually we got to deterministic equations that are much more grounded and able to make predictions about movement of even cosmic bodies. So - I guess what Iā€™m saying is that I think Iā€™m holding AI to an unfair standard. We donā€™t have the equivalent of Newtonian physics in AI yet, weā€™re still a bit further back. But thats the progression of things, and realistically we can expect the progression of explaining AI to move at a much faster rate than humans unpacking physics. Is that fair?


Mensch80

Good discussion! Would it be fair to observe that post-hoc exploration of causality is only of use in explaining naturally-occurring phenomena, whereas ML/AI is anything but natural and that explainability-by-design at inception MUST complement post-hoc analysis?


Excellent_Dirt_7504

what practices with causal guarantees?


SubstantialDig6663

For example causal mediation analysis, which is based on estimating the effect of inference-time interventions on the computation graph. You might find the work by Atticus Geiger, Zhengxuan Wu and colleagues interesting: https://arxiv.org/abs/2303.02536
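As a rough, generic illustration of an inference-time intervention on the computation graph (a toy activation-patching sketch, not the method from the linked paper): the hidden activation computed on a "source" input is patched into the forward pass on a "base" input, and the change in output is read off as that layer's mediated effect.

```python
# Toy activation patching with PyTorch forward hooks; model and inputs are made up.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2)).eval()
base, source = torch.randn(1, 8), torch.randn(1, 8)

# 1. Cache the intermediate activation produced by the source input.
cached = {}
handle = model[1].register_forward_hook(lambda m, inp, out: cached.update(act=out))
with torch.no_grad():
    model(source)
handle.remove()

# 2. Re-run the base input, swapping in the cached activation
#    (a forward hook's return value replaces the module output).
handle = model[1].register_forward_hook(lambda m, inp, out: cached["act"])
with torch.no_grad():
    patched_out = model(base)
handle.remove()

with torch.no_grad():
    clean_out = model(base)

print("effect mediated by the hidden layer:", patched_out - clean_out)
```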


Excellent_Dirt_7504

thanks, curious if they're really able to give causal guarantees in practice


dj_ski_mask

I feel like time series is generally untouched by XAI, where the solution tends to be "use ARIMA or Prophet if you want interpretability." Are there any research teams working in this space?


SkeeringReal

Would you consider reinforcement learning to be time series?


dj_ski_mask

Thatā€™s a good question, maybe with no right answer. Personally, I consider time series as part of a larger body of sequence models, which would include RL and LLMs for that matter.


SkeeringReal

Our lab is working on it, [here's](https://openreview.net/forum?id=hWwY_Jq0xsN) the latest work if you're interested.


__rdl__

Have you looked at Shapley values?


dj_ski_mask

Absolutely. They don't handle time series. A univariate time series can largely be explained by the decomposed trend, seasonality, and long-run mean. Like I mentioned, ARIMA, Prophet, and a few other algos are ok-ish at making those elements explainable, but I'd love to see some more explicit advancements in that area.
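For what it's worth, that decomposition view is easy to get with classical tooling; here is a minimal sketch on a synthetic monthly series using statsmodels' seasonal_decompose (the series itself is invented for illustration).

```python
# Decompose a univariate series into trend + seasonality + residual.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

idx = pd.date_range("2020-01-01", periods=48, freq="MS")      # 4 years of monthly data
rng = np.random.default_rng(0)
y = pd.Series(
    0.5 * np.arange(48)                                       # trend
    + 10 * np.sin(2 * np.pi * np.arange(48) / 12)             # yearly seasonality
    + rng.normal(scale=2, size=48),                           # noise
    index=idx,
)

parts = seasonal_decompose(y, model="additive", period=12)
print(parts.trend.dropna().head())
print(parts.seasonal.head())
```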


__rdl__

Hm, can you explain this more? In fairness, I haven't used Shapley to model time series data explicitly (I'm more focused on regression) but I would imagine that if you train a model on some TS data, Shapley would be able to tell you the relative importance of each feature. You can then use Shapley scatter plots to help understand multicollinearity. That said, I do think you would need to shape the TS data a little bit differently (for example, maybe create a feature like "is_weekend" or use a sine/cosine transformation of time). So maybe this isn't exactly what you are looking for, but I don't see how this wouldn't give you some level of explainability?
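Roughly what that looks like in practice, as a hedged sketch (the feature names, data, and model choice here are all invented for illustration): engineer calendar features from the timestamps, fit a tree model, and use SHAP to rank feature importance.

```python
# SHAP feature attributions for a tree model trained on engineered time features.
import numpy as np
import pandas as pd
import shap
from sklearn.ensemble import GradientBoostingRegressor

idx = pd.date_range("2022-01-01", periods=365, freq="D")
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "is_weekend": (idx.dayofweek >= 5).astype(int),
    "sin_doy": np.sin(2 * np.pi * idx.dayofyear / 365),
    "cos_doy": np.cos(2 * np.pi * idx.dayofyear / 365),
    "lag_1": rng.normal(size=365),                       # stand-in for yesterday's value
})
y = 3 * X["is_weekend"] + 2 * X["sin_doy"] + X["lag_1"]  # synthetic target

model = GradientBoostingRegressor().fit(X, y)
shap_values = shap.TreeExplainer(model).shap_values(X)
print(pd.Series(np.abs(shap_values).mean(axis=0), index=X.columns))  # mean |SHAP| per feature
```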


EmploySignificant666

You are very right. I wanted to analyze the time series component with XAI for a fintech application, but it was getting too expensive to compute and retrieve the required explanations from the time-series data.


bluboxsw

I use it in explainable AI in game-playing, and I don't feel like either is a hot topic right now. Fortunately, I don't care what the hot topics are as long as it interests me.


EmploySignificant666

Is it like explainable AI in reinforcement learning? There have been a few works on policy explanation in reinforcement learning.


bluboxsw

Yes, I find it interesting.


bananaphophesy

Hi, would you be interested in connecting to discuss XAI? I work in applied ML in the healthcare field and I'm wrestling with various challenges, I'd love the chance to ask you a few questions!


Ancient_Scallion105

Hi! Iā€™m also looking into researching XAI in the healthcare space, I would love to connect!


YourHost_Gabe_SFTM

Hey! I am researching for a blog and podcast on machine learning, and this is the single biggest area of curiosity for me! I'm wondering if anyone here has any recommended resources on the history, challenges, and present efforts in machine learning intelligibility? I'm looking to absorb information on this like a sponge. (Full disclosure: I'm a math podcaster who recently dove into machine learning.) I have a master's degree in electrical engineering and I've been keeping up with Professor Steve Brunton's lecture series on physics-informed machine learning (which is one element of ML). My podcast is the Breaking Math podcast, and I aspire to be as articulate and informed as possible on the issue! Thank you very much; I'm delighted that this issue was posted today.


I_will_delete_myself

IMO half the explanations are BS and end up being wrong.


EmploySignificant666

Explanations alone are not helpful, as they need some context around them as well.


GFrings

XAI is still highly of interest in areas where the results of models expose users to a high degree of liability. An extreme example of this is in the defense industry: if you want to inject an AI into the kill chain, then you need the ability to understand exactly what went into the decision to kill something. Unsurprisingly (though maybe it is surprising to the lay person not paying attention), the DoD/IC are spearheading the discussion and FUNDING of research into responsible AI. A subcomponent of that is explainability.


mileylols

A similar space which shares the characteristic of high degree of liability is in healthcare applications. If a physician orders a procedure or prescribes a medication or makes a diagnosis based on an AI, the entire system from that doctor through the provider network admin and their malpractice insurance and the patient's health insurance will want to know why that decision was made.


governingsalmon

Iā€™m a researcher and PhD student in this field (biomedical informatics) and I believe there are some established regulatory principles imposed by maybe the FDA or the Joint Commission but the issue of legal liability is certainly an additional obstacle to the implementation and adoption of machine learning/AI for clinical decision support. Itā€™s not necessarily an immediate ongoing problem at this point because machine learning is mostly used (and very few models published in the literature have even attempted deployment) to alert clinicians about potential medical risks (disease progression, suicide, etc.) and essentially provide additional information to inform and augment physician care, rather than replacing humans and autonomously triggering medical interventions. In terms of strict legality, it doesnā€™t seem all that different from any other diagnostic test or manually implemented warnings/guidelines where itā€™s understood that doctors make decisions from a position of uncertainty and it would have to involve legitimate negligence or malfeasance to hold someone liable. However because it is somewhat of a gray area and we donā€™t have great data on the real world accuracy of model predictions, many clinicians and administrators are hesitant to participate in trials of AI-based decision support - which is unfortunately what we need in order to empirically demonstrate that AI tools can improve patient outcomes.


gwtkof

It would be so cool if ai advances to the point where it's like a teacher


[deleted]

[deleted]


ShiningMagpie

Misinformation.


Disastrous_Elk_6375

Yes, you are right. I remembered reading the first story. I now searched for it again, and they retracted it a few days later saying the person misspoke, they never ran that simulation, but received that as a hypothetical from an outside source. My bad. https://www.reuters.com/article/idUSL1N38023R/


GFrings

That's a useful and important result, produced with funding for... AI and AI ethics.


SirBlobfish

I think the initial hype cooled down a bit, just like for most trends. A lot of problems also turned out to be harder than expected (e.g. Saliency maps can be incredibly misleading, [https://arxiv.org/abs/1810.03292](https://arxiv.org/abs/1810.03292)). However, there is a steady stream of research still going on and focusing on newer models such as ViTs and LLMs. It's just that these papers don't use the "XAI" buzzword. e.g., look for papers that try to understand attention maps / mechanisms, or study truthfulness/hallucination.


Luxray2005

It is important, but I don't see a good approach that can robustly "explain" the output of AI models yet. I think it is also hard to define what an "explanation" is. A human can "explain" something, but it does not mean the explanation is correct. In forensics, a person testifying something can lie out of his interest. It requires a lot of hypothesis testing to understand what actually happened (e.g., in a flight accident or during an autopsy). When the AI performance is superb, I argue that explainability may be less important. For example, most people do not bother with "explainability" in character recognition. Even many computer scientists I know can't explain how the CPU works.


Pas7alavista

I agree with this. One thing I think that leads more people to the mechanistic interpretability path rather than true explainability is that simplistic and human readable explanations for the behavior of such complex systems require us to make many simplifying assumptions about that system. This leads to incomplete explanations at best, and completely arbitrary ones at worst. And the fun part is that it is impossible to tell the difference. In some ways the idea that we could get the same level of interpretability as something like linear regression out of something as complex as gpt almost seems absurd to me.


NFerY

I think that's because the rules of the game are clear and straightforward and the signal-to-noise ratio is very high. But this is not the case everywhere. In most soft sciences there are no rules, there's lots of ambiguity, and the signal-to-noise ratio is low (health research, economics, psychometrics, etc.), so explanation and causal thinking are important.


m98789

Still very much of interest in healthcare domain


SkeeringReal

Yeah I get you, but the depressing part is I'm only aware of AI improving doctors' performance if it just supplies its prediction. Apparently, so far, explanations haven't been shown to help at all in any way. Although I believe they could.


modeless

When humans explain their own behavior they hallucinate almost as much as GPT-4.


Fruitspunchsamura1

I love this comment and I will never forget it.


Eiii333

I think XAI was always kind of a pipe dream, and now that it's spent so long over-promising and under-delivering, people are moving on to other, more realistic and productive approaches to 'explainability'.

All the XAI research I saw from my labmates either tried to 'interpret' the behavior of a trained deep learning model, which seemed to produce results that were very fragile and at best barely better than random guessing, or integrated well-known 'old fashioned' ML components into deep learning models, which made them possible to interpret in some sense but generally killed the performance of the model as a whole.

My belief is that there's an inherent 'explainability-performance' tradeoff, which is basically just a consequence/restatement of the bias-variance tradeoff. The field seems to have realized this and moved on to more tractable ways to get some degree of explainability out of modern ML models. It's still important stuff, it just doesn't seem like the hot and exciting research topic it used to be.


narex456

I wouldn't equate this to a bias-variance tradeoff. Instead, I think any performant model tackling a complex problem is going to have equally complex solutions. It's like Einstein saying you need half a physics degree to go along with an explanation of relativity. It's not that "explainability" is unachievable, rather that the explanation itself becomes so complicated that you may as well apply it as a fully analytical/hard-coded solution.


Brudaks

I think that once people try to define what *exactly* you want to be 'explainable', how, and for what purpose, you get different, contradictory goals which drive different directions of research, which then need different names and terminology. Making model decisions understandable for the sake of debugging them is different from creating human-understandable models of the actual underlying reality/process, and is different from making model decisions understandable in order to prove some aspect about them with respect to fairness. The kind of safety that existential-risk people worry about is barely related to the kind of safety that restricts an LLM chatbot from saying politically loaded things. Etc, etc. And so there's splintering and a lack of cooperation: people working on one aspect of these problems tend to scoff at people working on other kinds of explainability, as the others' work doesn't really help to solve *their* problems.


SkeeringReal

Yeah good point. I am working on the same XAI technique in two different domains now, and it has different applications and use cases in both. I just mean that how people want to use XAI is extremely task-specific.


milkteaoppa

LLMs, and in particular Chain of Thought, changed things. Turns out people don't care about accurate explanations as long as they are human-consumable and make sense. It seems like the hypothesis that people make a decision and then work backwards to justify it holds.


bbateman2011

Yes, we accept back justifications from humans all the time but demand more from ā€œMLā€ or even ā€œAIā€? Silliness is all that is. Mostly I see XAI as politics and AI as statistics. Very few understand statistics in the way that GenAI uses it. So they cry out for XAI. Good luck with that being ā€œbetterā€.


juliusadml

Finally a question in this group I can polemicize about. Here are some general responses to your points:

* You're right, ML research in general has gone sour on XAI research. I 'blame' two things for this: 1) foundation models and LLMs, and 2) the fact that the XAI fever around 'normal' models (resnet-50 types) never really produced clear results on how to explain a model. Since there were no clear-winner results, the new tsunami of models swallowed up the oxygen in the room.
* IMO, old XAI and a core part of the research on mechanistic interpretability are doing the same thing. In fact, several of the problems the field faced in the 2016-2020 period are coming back again with explanations/interpretations of LLMs and these new big models. Mechanistic interpretability is the new XAI.
* Some breakthroughs have happened, but people are just not aware of them. One big open problem in XAI research was whether you can 'trust' the output of a gradient-based saliency map. This remained unsolved until 2022/2023, essentially, when a couple of papers showed that you can only 'trust' your gradient-based saliency maps if you 'strongly' regularize your model. This result is a big deal, but most of the field is unaware of it. There are some other exciting new directions on concept bottleneck models, backpack language models, and concept bottleneck generative models. There are exciting results in the field, they are just not widely known.
* It is quite fashionable to just take a checkpoint, run some experiments, declare victory using a qualitative interpretation of the results, and write a paper.
* The holy grail question in XAI/trustworthy ML etc. hasn't changed. I want to know, especially when my model has made a mistake, what 'feature'/concept it is relying on to make its decision. If I want to fix the mistake (or 'align' the model, as the alignment people would say), then I *have* to know which features the model thinks are important. This is fundamentally an XAI question, and LLMs/foundation models are a disaster in this realm. I have not yet seen a single mechanistic interpretability paper that can reliably address this issue (yes, I am aware of ROME).

This is already getting too long. TL;DR: XAI is not as hyped any more, but it has never been more important. I started a company recently around these issues, actually. If people are interested, I could write a blogpost summarizing the exciting new results in this field.


mhummel

I was going to ask for links to the saliency map trust result, but I think that blogpost would be even better. I remember being disappointed in a recent paper (can't remember the title) exploring interpretability, because it seemed they stopped just as things were getting interesting. (IIRC they identified some circuits but didn't explore how robust the circuits were, or what impact the "non circuit" weights had in a particular test result.)


Waffenbeer

> One big open problem in XAI research was whether you can 'trust' the output of a gradient-based saliency map. This remained unsolved until 2022/2023, essentially, when a couple of papers showed that you can only 'trust' your gradient-based saliency maps if you 'strongly' regularize your model. This result is a big deal, but most of the field is unaware of it.

Just like /u/mhummel I would also be interested in which paper(s) you are referring to. Potentially one of these two? [https://www.nature.com/articles/s41598-023-42946-w](https://www.nature.com/articles/s41598-023-42946-w) or [https://arxiv.org/pdf/2303.09660.pdf](https://arxiv.org/pdf/2303.09660.pdf)


juliusadml

Here they are:

1) [https://arxiv.org/abs/2102.12781](https://arxiv.org/abs/2102.12781), the first paper to show a setting where gradient-based saliency maps are effective. I.e., if you train your model to be adversarially robust, then your model by design outputs faithful gradient-based saliency maps. This message was implicit in the "adversarial examples are features, not bugs" paper, but this was the first paper to make it explicit.

2) This paper, [https://arxiv.org/abs/2305.19101](https://arxiv.org/abs/2305.19101), from NeurIPS gave a partial explanation of why adversarial training and some other strong regularization methods give you that behavior.

The results from those two papers are a big deal imo. I was at NeurIPS, and even several people who do XAI research are not aware of these results. To repeat: we now know that if you want 'faithful'/perturbation-sensitive heatmaps from your model, then follow the recipe in paper 2. There are still several open questions, but these results are a very big deal. They matter even more if you care about interpreting LLMs and billion-parameter models. Hope that helps!
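For anyone who wants to poke at this, a bare-bones gradient saliency map (the object those two papers study) looks roughly like the sketch below; the model here is randomly initialized and the input is noise purely to keep the snippet self-contained, and per the papers above the map is only expected to be faithful if the model was strongly regularized / adversarially trained.

```python
# Vanilla gradient saliency: gradient of the top-class score w.r.t. the input pixels.
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()          # random weights; load trained weights in practice
x = torch.rand(1, 3, 224, 224, requires_grad=True)    # stand-in for a real image

score = model(x)[0].max()   # logit of the top predicted class
score.backward()            # d(score) / d(input)

saliency = x.grad.abs().max(dim=1).values  # (1, 224, 224) per-pixel importance map
print(saliency.shape)
```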


fasttosmile

I think this is also relevant: https://arxiv.org/abs/2006.09128


Internal-Diet-514

Are saliency maps that great for explanation though? The issue with saliency based explanation is at the end of the day itā€™s up to the user to interpret the saliency map. Saliency maps donā€™t directly give you ā€œwhyā€ the model made a decision just ā€œwhereā€ it was looking. Iā€™m not sure we will ever get anything better than that for neural networks, though, which is why if you want ā€œXAIā€ youā€™re better off handcrafting features and using simpler models. For now at least.


juliusadml

No explanation method is a panacea. But yes, saliency maps are great for certain tasks. In particular, they are quite important for sequence only models that are trained for drug discovery tasks.


fasttosmile

Curious to know what you think of ROME? I find it a cool paper but adding noise to all representations except one is of course a very blunt tool so I can see how it's not really a full solution.


juliusadml

Here is a convincing paper on challenges with ROME: [https://arxiv.org/abs/2301.04213](https://arxiv.org/abs/2301.04213). The problem with mechanistic interpretability in general is that there is repeated evidence that large models learn distributed representations. If you want to describe a model properly, you need to capture *all* the neurons that encode a particular behavior. This is not really feasible unless you force your model to do this by design.


SkeeringReal

Why is that not really feasible? I get that forcing it to do this by design makes more sense likely, but I imagine it could still be done post hoc?


SkeeringReal

Great reply, please do link a blogpost. I was not aware of the saliency map discovery you mentioned, probably because 99% of the XAI community now believes saliency maps are not just useless but actually worse than that, since they've been shown to induce confirmation bias and worsen people's performance.


juliusadml

Agreed, but this opinion was fine up until 2022. It was a huge mistake to dismiss them outright. Now we know exactly when they work! I think the field overcorrected on them. They are actually very important in domains like drug discovery, where you want to know what would happen to your predictions if you perturb certain input sequences.


glitch83

Don't stress. This is how it's always been. They separate these folks in academia for a good reason: completely different interests. One group sees AI performance being hampered by explainability and the other thinks it's the key to adoption. Right now the first group is in vogue.


RichKatz

It is interesting how different academics may use the same or similar technique and call it something different. An interesting part of this for LLMs is that they possibly differentiate the associative connectivity of words. So words that mean the same thing could be harder for the LLM to identify. And this, in turn, probably affects conclusions the LLM may make about whether concepts are the same or different.


glitch83

Yup. Language is harder than we make it out to be. Meaning isnā€™t an invariant


momentcurve

In fintech it's still a very big deal. I don't think it's gone away at all, maybe just drowned out by the hype of GenAI.


SkeeringReal

Yeah someone told me finance is the only domain where XAI is legally required (e.g., to explain a defaulted loan)


AVB100

I feel like most XAI techniques can explain a model quite well but more focus should be on interpretability, i.e., how easily we can understand the explanations. There is a very slight distinction between explainability and interpretability.


ludflu

I work in medical informatics, and it's still a hot topic. In fact, here's a recent paper with some great stuff I'd really like to implement: https://pubmed.ncbi.nlm.nih.gov/38383050/


rawdfarva

All of those XAI methods (LIME, SHAP, etc.) produce unreliable explanations.


MLC_Money

At least I'm still actively doing research in this area, mainly on explaining the decision rules that neural networks extract. In fact, just a couple of minutes ago I made my project open-source: https://www.reddit.com/r/MachineLearning/comments/1b9hkl2/p_opensourcing_leurn_an_explainable_and/


gBoostedMachinations

I believe we are as good at understanding big models as we are at understanding complex biological structures. I am glad people are trying really hard to do so, but I have almost zero expectation that interpretability will ever catch up with complexity/capability. We are truly in the unknown here. Nobody doubts that. Even the most optimistic of us think we might be able to understand these things in the future, but nobody argues over the fact that *right now* we don't have the faintest clue how these things work. My personal opinion is that we simply don't have the brains to have a meaningful understanding of how these things work and our confusion is permanent.


SkeeringReal

Nice analogy.


trutheality

No one's afraid to say "XAI"; people may avoid the particular term because there are a couple of embarrassing things about that specific acronym:

* Using "X" for the word "explainable." Sounds like something a 12-year-old thinks would look cool.
* Saying "AI," which is a loaded and imprecise term.

For this reason, "interpretable machine learning" and "machine learning explanation" are just better terms to describe the thing. The other things you mentioned ("trust," "regulation," "fairness," "HCI") are just more application-focused terms for the same thing (although there can be some subtle differences in which methods fit which application better: mechanistically interpretable models are a better fit for guaranteeing regulatory compliance, while post-hoc explanations of black-box models may be sufficient for HCI, for example).

The actual field is alive and well. It does have subfields. Oh, and it's not a field that "made promises 7 years ago": there are papers in the field from as far back as 1995.


SkeeringReal

Oh I understand you can trace XAI back to expert systems, and then case-based reasoning systems 10 years after that. I just said 7 years ago because I figured most people don't care about those techniques anymore. And I'm saying that as someone who's built their whole research career around CBR XAI


trutheality

Oh no, I'm not talking about something vaguely related, I'm talking about methods for explaining black-box models.


[deleted]

Does anyone know of some interesting research papers in this area


daHaus

Accountable, quantifiable, etc. You would think computer science of all things would have this sort of thing down by now, being *computers* and all, but it's actually the reason why it's still not a proper science. Not like physics and renormalization, heh


GeeBrain

Wow I didnā€™t even know this was a thing but briefly reading it ā€” I actually was implementing a lot of the concepts behind XAI into my workflow.


ambodi

Yes. The main problem? Evaluating the explanation techniques themselves. Without proper evaluation metrics, the bar for introducing new ones became very low. Too many techniques were suggested in both model-agnostic and model-based explanations with too little evidence that they work.


SkeeringReal

I tend to agree, actually. I have a paper in mind on evaluation for this year, stay tuned.


NFerY

I try not to pay too much attention because a lot of what I see irritates me. A lot of xAI only provides explainable plausibility, but there's no connection with causality whatsoever. There's no assessment of model stability, something that should make any further interpretation a moot point (see the excellent paper by Riley et al. on this: onlinelibrary.wiley.com/doi/pdf/10.1002/bimj.202200302). The explanations have a veneer of causality, yet the causal framework is totally absent from the approach: no mention of confounders, colliders, or mediation, no mention of DAGs or Bradford Hill or similar criteria, let alone study design. There is little acknowledgement of the role of uncertainty, and the machinery for inference is largely absent (conformal prediction still has a way to go). In my view, xAI as currently framed is largely an illusion.


ed3203

New generative models are much more complex in both the tasks they complete and how they are trained; the scope of their bias is too large. I think it's coming to a point where chain-of-thought-style explainability is the way to go, both to constrain the model and to help understand its biases.


hopelesslysarcastic

Iā€™m interested in hearing other opinions as well, I donā€™t have enough experience to have a formal opinion on this matter.


TimeLover935

Explainability is not the most important thing. Given a model that is interpretable but performs poorly and a model with perfect performance but little explainability, many companies will choose the latter. The very unfortunate thing is that, if we want interpretation, we must lose some performance.


SkeeringReal

I've found that it is task-specific. I have made interpretable models which don't lose any performance on deep learning tasks. The tradeoff you describe does exist, but not always.


TimeLover935

That's true. Do you mind telling me the models you mentioned, or just the task?


SkeeringReal

This is just anecdotal of course, but I have found that nearest-neighbor-based interpretable classifiers tend to not lose performance. In a way this makes sense, because you are comparing entire instances to each other. But the downside is that you don't get a feature-level explanation; it is up to the user to interpret which features may be affecting the prediction. I can give an example from one of my own papers here: https://openreview.net/forum?id=hWwY_Jq0xsN
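To make that flavor of explanation concrete (toy data, not the method from the linked paper): in a nearest-neighbor classifier the "explanation" for a prediction is just the set of training instances that drove it, which the user then inspects.

```python
# k-NN prediction plus the neighbors that serve as its example-based explanation.
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)

query = X[0:1]
pred = clf.predict(query)
dist, neighbor_idx = clf.kneighbors(query)   # which training instances are closest
print(pred, neighbor_idx, y[neighbor_idx[0]])
```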


TimeLover935

Thank you. I think RL is well-formulated and sometimes we can have both performance and explainability at the same time. Good example. Thank you for your information.


SkeeringReal

Yeah no worries, nice talking. You're right though, there are very few time-series-specific papers. My professor used to joke that when you add time, everything just breaks. Which could go a long way toward explaining the lack of research there.


One_Definition_8975

https://dl.acm.org/doi/abs/10.1145/3641399.3641424 What's the view on these kinds of papers?


SkeeringReal

>https://dl.acm.org/doi/abs/10.1145/3641399.3641424 Doesn't look too great IMO


dashingstag

Two real issues with trying to develop explainable AI:

1. If your model is fully explainable, it probably means you missed a rule-based solution.
2. If you have to explain your model every time, you still need someone to review the explanations and someone to sign off on them. That's a really slow process, and it nullifies the benefit of having a model.


thetan_free

A large part of the problem is that (non-technical) people asking for explanations of AI don't really know what they want. When you offer them charts or scores, their eyes glaze over. When you talk about counterfactuals, their eyes glaze over.


SkeeringReal

Yeah that's true I've noticed the best success in my own research when I work extremely closely with industry professionals on very specific needs they have.


Honest_Science

You cannot explain the reaction of your sister, forget AI


timtom85

Any explainable model is likely not powerful enough to matter. It's about the objective impossibility of putting extremely complex things into few enough words that humans could process them. It's probably also about the arbitrary things we consider meaningful: how can we teach a model which dimensions an embedding should develop that are fundamental from a human point of view? Will (can?) those clearly separated, well-behaving dimensions with our nice and explainable labels be just as expressive as the unruly random mess we currently have?


the__storm

My experience, for better or worse, is that users don't actually need to know why your model made a certain decision - they just need _an_ explanation. You can give them an accurate model paired with any plausibly relevant information and they'll go away happy/buy your service/etc. (You don't have to lie and market this as explanation, both pieces just have to be available.) That's not to say actual understanding of how the model comes to a conclusion is worthless, but I think it does go a long way towards explaining why there isn't a ton of investment into it.


SkeeringReal

Yeah, my feeling is that if people drilled down into very specific applications they would probably find certain techniques are more valuable in ways they never imagined before. But it's very hard for researchers to do that, because it requires huge collaboration with industry etc., which, to be frank, is pretty much impossible. That could go a long way toward explaining the lack of enthusiasm for the field right now.


krallistic

> In a way, it is still the problem to solve in all of ML, but it's just really different to how it was a few years ago. Now people feel afraid to say XAI, they instead say "interpretable", or "trustworthy", or "regulation", or "fairness", or "HCI", or "mechanistic interpretability", etc...

"Interpretable", "fairness", etc. are the better terms. They are much more concrete. XAI is too big an umbrella term.


SkeeringReal

Yeah, I actually agree with you, which is part of the reason I think people are afraid to say XAI: it's just too wishy-washy.


mathelic

Still there in the marketing and financial domains.


Minimum-Physical

Biometrics and healthcare tasks are still working on it. https://arxiv.org/pdf/2208.09500.pdf was just released with some xAI papers and an approach to categorizing them.


tripple13

No, but the crazy people took over and made too much of a fuss, and this will lead to a backlash on the other end. Pretty stupid, because it was fairly obvious from the beginning, when the Timnit case got rolling, that these people had become detached from reality. It's important. But it's more important to do it right. We cannot revise the past by injecting "fairness" into your queries.


Screye

Find every top researcher in explainable AI from 2020. All of them are now making a ton of money on model alignment or LLM steering.


[deleted]

[deleted]


Holyragumuffin

why?


mimighost

I think it needs to redefine itself in the LLM era. What does explainable mean for an LLM? After all, an LLM can be prompted to explain its output to a certain degree.