A Fast Similarity Calculation Method Based on Cotangent Similarity and BP Neural Network
Affiliation: College of Electronics and Information Engineering, Tongji University, Shanghai 201804, China
CLC Number: TP311.1

    Abstract:

    Similarity measurement is of great significance in big-data applications. However, the traditional traversal calculation based on cosine similarity is neither accurate nor fast enough to provide an effective basis for assessing the quality of massive high-dimensional data. To improve the accuracy of similarity calculation, two cotangent similarity formulas were constructed from the cotangent trigonometric function and the differences between data dimensions. In addition, a back-propagation (BP) neural network model that approximates the similarity mapping of a dataset was established to reduce time complexity. Experimental results demonstrate that the improved fast similarity calculation method achieves good accuracy and timeliness, and that the performance improvement is more significant on large-scale datasets.
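The contrast the abstract draws between cosine similarity and a cotangent measure built on per-dimension differences can be illustrated with a minimal sketch. The abstract does not state the paper's two formulas, so `cotangent_similarity` below is an assumption: it uses a form known from cotangent-based measures in the fuzzy-set literature, the mean of cot(π/4 + π/4·|x_i − y_i|), and it assumes every feature has been scaled to [0, 1]. The function names and the `pairwise_traversal` helper are hypothetical, and the BP neural network stage is not shown.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between x and y; ignores vector magnitude."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def cotangent_similarity(x, y):
    """Assumed cotangent measure (not the paper's stated formula).

    Inputs are assumed to be scaled to [0, 1], so each angle lies in
    [pi/4, pi/2] and each term lies in [0, 1]: identical values give
    cot(pi/4) = 1, maximally different values give cot(pi/2) = 0.
    """
    terms = [
        1.0 / math.tan(math.pi / 4 + math.pi / 4 * abs(a - b))
        for a, b in zip(x, y)
    ]
    return sum(terms) / len(terms)


def pairwise_traversal(data, sim):
    """Traversal calculation: evaluate sim for every pair, O(n^2) calls."""
    n = len(data)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            matrix[i][j] = matrix[j][i] = sim(data[i], data[j])
    return matrix


if __name__ == "__main__":
    u, v = [0.1, 0.1], [0.9, 0.9]
    # Same direction, very different magnitudes.
    print("cosine:   ", cosine_similarity(u, v))
    print("cotangent:", cotangent_similarity(u, v))
```

For two vectors that point in the same direction but differ in magnitude, cosine similarity reports them as essentially identical, while the difference-based cotangent measure under these assumptions scores them as dissimilar. The quadratic number of calls in `pairwise_traversal` is the cost that a learned approximation of the similarity mapping would aim to avoid.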

    Table 5
    Fig.1 Schematic diagram of relationship between two-dimensional vectors
    Fig.2 Flowchart of fast similarity calculation based on cotangent similarity and BP neural network
    Fig.3 Pseudocode of fast similarity calculation based on cotangent similarity and BP neural network
    Fig.4 Similarity calculation error based on neural network and traversal calculation (CWRU subdatasets)
    Fig.5 Comparison of running time of similarity calculation based on different calculation formulas
    Fig.6 Comparison of running time of similarity calculation based on different calculation methods
Get Citation

QIAO Fei, GUAN Liuen, WANGE Qiaoling. A Fast Similarity Calculation Method Based on Cotangent Similarity and BP Neural Network[J]. Journal of Tongji University (Natural Science), 2021, 49(1): 153-162.

History
  • Received: August 27, 2020
  • Online: February 26, 2021