MM-SHAP: A Performance-agnostic Metric for Measuring Multimodal Contributions in Vision and Language Models & Tasks

Vision and language (VL) models are known to exploit non-robust indicators in individual modalities (e.g., those introduced by distributional biases) instead of focusing on relevant information in each modality. When a unimodal model achieves accuracy on a VL task similar to that of a multimodal one, this indicates that so-called unimodal collapse has occurred. However, accuracy-based tests fail to detect cases where, for example, the model's prediction is wrong even though the model used relevant information from a modality. Instead, we propose MM-SHAP, a performance-agnostic multimodality score based on Shapley values that reliably quantifies the proportions in which a multimodal model uses individual modalities. We apply MM-SHAP in two ways: (1) to compare models by their average degree of multimodality, and (2) to measure, for individual models, the contribution of individual modalities across different tasks and datasets. Experiments with six VL models (LXMERT, CLIP and four ALBEF variants) on four VL tasks highlight that unimodal collapse can occur to different degrees and in different directions, contradicting the widespread assumption that unimodal collapse is one-sided. Based on our results, we recommend MM-SHAP for analysing multimodal tasks, to diagnose and guide progress towards multimodal integration. Code is available at https://github.com/Heidelberg-NLP/MM-SHAP .
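To make the idea of a "proportion of modality use" concrete, here is a minimal sketch of how such a score could be computed once per-token Shapley values are available. It assumes that each modality's contribution is the sum of the absolute Shapley values of its tokens (text tokens or image patches) and that the score is each modality's share of the total. The abstract does not spell out the formula, so treat this as one plausible reading; the function name `modality_shares` and the input lists are hypothetical, and the linked repository remains the authoritative reference.

```python
from typing import Sequence, Tuple


def modality_shares(
    text_shap: Sequence[float],
    image_shap: Sequence[float],
) -> Tuple[float, float]:
    """Return (text_share, image_share) for one prediction.

    Assumption: a modality's contribution is the sum of the absolute
    Shapley values of its tokens, so positive and negative attributions
    both count as "use" of that modality. Note that the model's accuracy
    never enters the computation.
    """
    text_mass = sum(abs(v) for v in text_shap)
    image_mass = sum(abs(v) for v in image_shap)
    total = text_mass + image_mass
    if total == 0.0:
        # No token influenced the prediction, so the shares are undefined.
        raise ValueError("all Shapley values are zero")
    return text_mass / total, image_mass / total


# Hypothetical Shapley values for three text tokens and three image patches.
text_share, image_share = modality_shares(
    [0.5, -0.25, 0.25],
    [1.0, -1.5, 0.5],
)
print(f"text: {text_share:.2f}, image: {image_share:.2f}")
```

Under this reading, a balanced model would sit near 0.5 for each modality, while shares close to 0 or 1 would hint at unimodal collapse towards the image or the text side. Averaging the per-sample shares over a dataset would give the kind of model-level comparison described in use (1).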

Thread

KingsmanVince

@KingsmanVince@kbin.social

Added: 6 months ago