Clothing Image Parsing Method Based on Multi-scale Fusion Enhancement

Authors: CHEN Lifang, YU Enting

Affiliation: School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214000, China

About the first author: CHEN Lifang (b. 1973), female, professor and master's supervisor; her main research interests are digital image processing, deep learning theory and applications, and 3D object reconstruction. E-mail: may7366@163.com

CLC number: TP399

Funding: National Natural Science Foundation of China (61872166)



Abstract:

By using the features of each level in a convolutional neural network, a clothing image parsing method based on multi-scale fusion enhancement was proposed. Through the fusion enhancement module, semantic information and features at different scales were effectively fused while taking global information into account. The results show that the average F1 score on the Fashion Clothing test set reaches 60.57%, and the mean intersection over union (MIoU) on the Look Into Person (LIP) validation set reaches 54.93%. The method can effectively improve the accuracy of clothing image parsing.
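The two figures quoted above are standard segmentation metrics: a class-averaged F1 score (reported on Fashion Clothing) and the mean intersection over union (reported on LIP). For orientation only, the following Python sketch shows how these quantities are conventionally computed from an accumulated class confusion matrix; the function names and the averaging details (for example, whether the background class is included) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def confusion_matrix(pred, target, num_classes):
    """Accumulate a (num_classes x num_classes) confusion matrix from integer
    label maps; rows index ground-truth classes, columns index predictions."""
    valid = (target >= 0) & (target < num_classes)
    idx = num_classes * target[valid].astype(np.int64) + pred[valid].astype(np.int64)
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def mean_iou_and_f1(conf):
    """Class-averaged IoU and F1 from an accumulated confusion matrix."""
    tp = np.diag(conf).astype(np.float64)
    fp = conf.sum(axis=0) - tp   # predicted as class c but labelled otherwise
    fn = conf.sum(axis=1) - tp   # labelled as class c but predicted otherwise
    iou = tp / np.maximum(tp + fp + fn, 1)        # guard against empty classes
    f1 = 2 * tp / np.maximum(2 * tp + fp + fn, 1)
    return iou.mean(), f1.mean()

# Hypothetical usage over a validation set of (prediction, ground-truth) label maps:
# conf = sum(confusion_matrix(p, t, num_classes=20) for p, t in zip(preds, targets))
# miou, mf1 = mean_iou_and_f1(conf)
# print(f"MIoU = {100 * miou:.2f}%, mean F1 = {100 * mf1:.2f}%")
```

Under this convention a class that appears in neither the prediction nor the ground truth contributes zero to the mean; published protocols differ on such details, so the sketch should be read as indicative rather than as the evaluation code used by the authors.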

Cite this article:

CHEN Lifang, YU Enting. Clothing image parsing method based on multi-scale fusion enhancement[J]. Journal of Tongji University (Natural Science), 2022, 50(10): 1385-1391.

History
  • Received: 2022-05-10
  • Published online: 2022-11-03