Pruning for Performance: Efficient Idiom and Metaphor Classification in Low-Resource Konkani Using mBERT

Timothy Do, Pranav Saran, Harshita Poojary, Pranav Prabhu, Sean O'Brien, Vasu Sharma, Kevin Zhu

Algoverse AI Research

Processing of Konkani metaphorical expressions using mBERT+BiLSTM. The phrase highlighted in red is analyzed for metaphorical content, with contrasting classification outcomes shown.

Abstract

In this paper, we address the persistent challenges that figurative language poses for natural language processing (NLP) systems, particularly in low-resource languages such as Konkani. We present a hybrid model that combines a pre-trained Multilingual BERT (mBERT) with a bidirectional LSTM and a linear classifier, fine-tuned on a new annotated Konkani metaphor classification dataset introduced in this work. To improve the model's efficiency, we apply a gradient-based attention head pruning strategy. The pruned model achieves 78% accuracy on metaphor classification. We also apply our pruning approach to an existing idiom classification task, extending prior work and achieving 83% accuracy. These results demonstrate the effectiveness of attention head pruning for building efficient NLP tools in underrepresented languages.
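To make the architecture concrete, below is a minimal PyTorch sketch of an mBERT encoder feeding a BiLSTM and a linear classifier. The checkpoint name, BiLSTM hidden size, mean pooling over tokens, and the optional head_mask argument (used by the pruning sketch further down this page) are illustrative assumptions, not configuration details reported in the paper.

import torch
import torch.nn as nn
from transformers import AutoModel

class MBertBiLSTMClassifier(nn.Module):
    """mBERT encoder -> BiLSTM over token states -> linear classifier.

    A sketch of the hybrid architecture; hidden sizes and pooling are
    illustrative assumptions, not the paper's reported configuration.
    """

    def __init__(self, lstm_hidden: int = 256, num_labels: int = 2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("bert-base-multilingual-cased")
        self.lstm = nn.LSTM(
            input_size=self.encoder.config.hidden_size,  # 768 for mBERT base
            hidden_size=lstm_hidden,
            batch_first=True,
            bidirectional=True,
        )
        self.classifier = nn.Linear(2 * lstm_hidden, num_labels)

    def forward(self, input_ids, attention_mask, head_mask=None):
        # head_mask lets the pruning analysis zero out individual attention
        # heads without modifying the encoder weights.
        states = self.encoder(
            input_ids=input_ids,
            attention_mask=attention_mask,
            head_mask=head_mask,
        ).last_hidden_state
        lstm_out, _ = self.lstm(states)
        # Mean-pool the BiLSTM outputs over non-padding tokens.
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (lstm_out * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
        return self.classifier(pooled)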

Attention Head Importance Pruning

Heatmaps showing attention head importance scores across layers for idiom (left) and metaphor (right) classification. Idiom classification shows higher importance values in earlier layers than in later ones, while metaphor classification concentrates its highest importance scores near the middle layers and heads.
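Importance scores of this kind can be obtained with a gradient-based proxy in the spirit of Michel et al. (2019): attach a differentiable all-ones mask over the attention heads, compute the task loss, and accumulate the absolute gradient flowing into each mask entry; the lowest-scoring heads are then pruned. The sketch below follows that recipe against the classifier sketched earlier. The batch field names, the pruning budget k, and the use of Hugging Face's prune_heads utility are assumptions rather than the paper's exact procedure.

import torch

def score_attention_heads(model, dataloader, device="cpu"):
    """Gradient-based head importance: accumulate |dL/d(mask)| for an
    all-ones head mask (Michel et al., 2019-style proxy). Assumes each
    batch exposes input_ids / attention_mask / labels tensors."""
    cfg = model.encoder.config
    head_mask = torch.ones(
        cfg.num_hidden_layers, cfg.num_attention_heads,
        device=device, requires_grad=True,
    )
    importance = torch.zeros_like(head_mask)
    loss_fn = torch.nn.CrossEntropyLoss()
    model.to(device).eval()  # eval mode; gradients still flow to head_mask
    for batch in dataloader:
        logits = model(
            batch["input_ids"].to(device),
            batch["attention_mask"].to(device),
            head_mask=head_mask,
        )
        loss = loss_fn(logits, batch["labels"].to(device))
        loss.backward()
        importance += head_mask.grad.abs()
        head_mask.grad = None
        model.zero_grad(set_to_none=True)
    return importance / len(dataloader)

def prune_lowest(model, importance, k=36):
    """Remove the k least important heads (k is illustrative) via
    Hugging Face's prune_heads, which expects {layer: [head indices]}."""
    flat_order = importance.flatten().argsort()[:k]
    n_heads = importance.shape[1]
    to_prune = {}
    for i in flat_order.tolist():
        to_prune.setdefault(i // n_heads, []).append(i % n_heads)
    model.encoder.prune_heads(to_prune)

After pruning, the removed heads' parameters are physically deleted from the encoder, so the model is smaller and faster at inference time rather than merely masked.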

Results

Metric                    Idiom (Original)   Idiom (Pruned)   Metaphor (Original)   Metaphor (Pruned)
Precision                 0.87               0.86             1.00                  0.87
Recall                    0.89               0.91             0.75                  0.65
F1-Score                  0.88               0.88             0.86                  0.74
Accuracy                  0.82               0.83             0.88                  0.78
Macro Avg Precision       0.78               0.79             0.90                  0.79
Macro Avg Recall          0.77               0.77             0.88                  0.78
Weighted Avg Precision    0.82               0.82             0.90                  0.79
Weighted Avg Recall       0.82               0.83             0.88                  0.78
Comparison of original and pruned mBERT+BiLSTM models on idiom and metaphor classification. Idiom performance remains stable post-pruning, while metaphor classification shows metric drops, reflecting its reliance on a broader set of attention heads and the need for task-specific pruning strategies.
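Metrics of the kind shown above (per-class precision, recall, and F1 together with macro and weighted averages) can be computed with scikit-learn's classification_report; a minimal sketch with placeholder labels:

from sklearn.metrics import classification_report

# Placeholder gold and predicted labels (1 = figurative, 0 = literal);
# a real evaluation would use the model's test-set predictions.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print(classification_report(y_true, y_pred, digits=2))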

BibTeX Citation

@misc{do2025pruningperformanceefficientidiom,
  title={Pruning for Performance: Efficient Idiom and Metaphor Classification in Low-Resource Konkani Using mBERT},
  author={Timothy Do and Pranav Saran and Harshita Poojary and Pranav Prabhu and Sean O'Brien and Vasu Sharma and Kevin Zhu},
  year={2025},
  eprint={2506.02005},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2506.02005}
}