What Goes Where In Calgary? A Garbage Classification System Based on Images and Natural Language

Abstract

Disposing of garbage in the correct bin is important because it maximizes recycling and benefits the environment. However, this is a challenging task for individuals who lack the knowledge or training to sort garbage properly. Artificial Intelligence methods, deep learning in particular, can be leveraged for this task. Most current deep learning systems assume that all the information needed for garbage classification is contained in images. We hypothesize that a natural language description of the object, provided by the individual disposing of it, can add contextual information that may not be present in the image, and vice versa, and that combining these two sources of information, images and text, yields better garbage classification results than using image-only or text-only information. This thesis proposes (1) a novel public benchmark dataset, which includes 20,000 images of garbage with corresponding text descriptions and class labels; and (2) a multimodal garbage classification model based on what we call "Reverse Cross Attention" (RCA), which exploits the complementarity of information between image and text. Our proposed model achieved improved results compared to unimodal models based solely on images or text and to state-of-the-art multimodal models. Our work demonstrates that, when combining text and image information using the RCA mechanism, the proposed model outperforms the best unimodal results by an average of 2% across all metrics.
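
As a rough illustration of how image and text features can inform each other, the sketch below implements ordinary scaled dot-product cross-attention in both directions and concatenates the pooled results. It is not the thesis's Reverse Cross Attention, whose definition is not given in this abstract; the function names (`cross_attention`, `mean_pool`, `fuse`) and the toy feature vectors are assumptions made purely for illustration.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    # Standard scaled dot-product attention: each query vector
    # produces a weighted average of the value vectors.
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def mean_pool(vectors):
    # Average a list of equal-length vectors into one vector.
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]

def fuse(image_feats, text_feats):
    # Hypothetical fusion step (an assumption, not the thesis's model):
    # text attends to image, image attends to text, and the two pooled
    # context vectors are concatenated.
    text_ctx = cross_attention(text_feats, image_feats, image_feats)
    image_ctx = cross_attention(image_feats, text_feats, text_feats)
    return mean_pool(text_ctx) + mean_pool(image_ctx)

# Toy inputs: two image patch features and one text token feature.
fused = fuse([[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0]])
```

In a real system the fused vector would feed a classifier over the bin classes, and the features would come from trained image and text encoders rather than hand-written lists.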

Citation

Cazarin Filho, J. C. (2025). What goes where in Calgary? A garbage classification system based on images and natural language (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.
