There is increasing interest in understanding fashion trends from street photos. Although street photos usually contain rich clothing information, analyzing them poses several technical challenges. First, street photos collected from social media sites often carry noisy user-provided labels, and training models on these labels can degrade prediction performance. Second, most existing methods predict multiple clothing attributes individually and do not share knowledge between related tasks. In addition to these technical challenges, most fashion image datasets created by previous studies focus on American and European fashion styles. To address these challenges and understand fashion trends in Asia, we created RichWear, a new street fashion dataset containing 322,198 images with various text labels for fashion analysis. This dataset, collected from an Asian social network site, focuses on street styles in Japan and other Asian areas. RichWear provides a subset of expert-verified labels in addition to the noisy user-provided labels for model training and evaluation. To improve fashion recognition, we propose the Fashion Attributes Recognition Network (FARNet), which is based on a multi-task learning framework. Instead of predicting each clothing attribute individually, FARNet predicts three types of attributes simultaneously. Once trained, the network takes the input images together with their noisy labels and generates corrected labels. Experimental results show that this approach significantly outperforms existing methods. We use the predicted labels for trend discovery and cluster the images to explore street styles. With this approach, we identify street fashion trends and discover style dynamics in Asia.
Keywords: Fashion recognition, Multi-label classification, Noisy labels, Deep learning, Multi-task learning, Computer vision
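To make the multi-task idea concrete, the following is a minimal sketch of a shared trunk feeding three multi-label heads, written in plain Python. It is not the FARNet architecture, which the abstract does not specify. The class name `MultiTaskSketch`, the attribute-type names (`category`, `color`, `pattern`), the layer sizes, and the choice to concatenate the noisy label vector with the image features are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(rng, n_out, n_in):
    # Small random weights and zero biases for one dense layer.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(layer, x):
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


class MultiTaskSketch:
    """Hypothetical shared-trunk model with one multi-label head per attribute type."""

    def __init__(self, n_image_features, n_noisy_labels, n_hidden, head_sizes, seed=0):
        rng = random.Random(seed)
        # Assumption: noisy labels enter the model by concatenation with image features.
        self.trunk = init_layer(rng, n_hidden, n_image_features + n_noisy_labels)
        self.heads = {
            name: init_layer(rng, size, n_hidden) for name, size in head_sizes.items()
        }

    def forward(self, image_features, noisy_labels):
        x = list(image_features) + list(noisy_labels)
        # Shared representation used by every task (ReLU activation).
        hidden = [max(0.0, h) for h in linear(self.trunk, x)]
        # Each head emits independent per-label probabilities (multi-label output).
        return {
            name: [sigmoid(z) for z in linear(head, hidden)]
            for name, head in self.heads.items()
        }


def multitask_bce(predictions, targets, eps=1e-7):
    # Sum over tasks of the mean binary cross-entropy within each task.
    total = 0.0
    for name, probs in predictions.items():
        losses = []
        for p, y in zip(probs, targets[name]):
            p = min(max(p, eps), 1.0 - eps)
            losses.append(-(y * math.log(p) + (1 - y) * math.log(1.0 - p)))
        total += sum(losses) / len(losses)
    return total
```

Calling `MultiTaskSketch(8, 10, 4, {"category": 5, "color": 3, "pattern": 2}).forward(features, noisy)` returns one probability list per attribute type. Summing the per-task losses in `multitask_bce` means every task's error updates the shared trunk during training, which is how related tasks share knowledge.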