SkeletonNet: A Hybrid Network With a Skeleton-Embedding Process for Multi-View Image Representation Learning

doi:10.1109/TMM.2019.2912735

Authors	Yang, Shijie 1,2; Li, Liang 3; Wang, Shuhui 3; Zhang, Weigang 4,5; Huang, Qingming 1,2,3; Tian, Qi 6,7
Journal	IEEE TRANSACTIONS ON MULTIMEDIA
Publication date	2019-11-01
Volume	21, Issue 11, Pages 2916-2929
Keywords	Semantics; Correlation; Visualization; Skeleton; Matrix decomposition; Kernel; Laplace equations; Unsupervised multi-view subspace learning; semantic inconsistency; tensor factorization; deep auto-encoders
ISSN	1520-9210
DOI	10.1109/TMM.2019.2912735
Abstract	Multi-view representation learning plays a fundamental role in multimedia data analysis. Conventional models adopt specific inter-view alignment principles, assuming that different views share a common latent subspace. However, when dealing with views on diverse semantic levels, the view-specific characteristics are neglected, and the divergent inconsistency of similarity measurements hinders sufficient information sharing. This paper proposes a hybrid deep network by introducing tensor factorization into the multi-view deep auto-encoder. The network adopts a skeleton-embedding process for unsupervised multi-view subspace learning. It takes full consideration of view-specific characteristics, and leverages the strength of both shallow and deep architectures for modeling low- and high-level views, respectively. We first formulate the high-level-view semantic distribution as the underlying skeleton structure of the learned subspace, and then infer the local tangent structures according to the affinity propagation of low-level-view geometric correlations. As a consequence, a more discriminative subspace representation can be learned, from global semantic pivots to local geometric details. Experimental comparisons on three benchmark image datasets show the promising performance and flexibility of our model.
Funding	National Natural Science Foundation of China[61836002] ; National Natural Science Foundation of China[61771457] ; National Natural Science Foundation of China[61620106009] ; National Natural Science Foundation of China[61732007] ; National Natural Science Foundation of China[61672497] ; National Natural Science Foundation of China[U1636214] ; National Natural Science Foundation of China[61572488] ; National Natural Science Foundation of China[61772494] ; National Natural Science Foundation of China[61472389] ; National Basic Research Program of China (973 Program)[2015CB351800] ; Key Research Program of Frontier Sciences[CAS: QYZDJ-SSW-SYS013]
WOS research areas	Computer Science ; Telecommunications
Language	English
Publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
WOS accession number	WOS:000494363000018
Content type	Journal article
Source URL	[http://119.78.100.204/handle/2XEOYT63/14883]
Collection	Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English)
Corresponding author	Li, Liang
Author affiliations	1. Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 101408, Peoples R China; 2. Univ Chinese Acad Sci, Key Lab Big Data Min & Knowledge Management, Beijing 101408, Peoples R China; 3. Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China; 4. Harbin Inst Technol, Sch Comp Sci & Technol, Weihai 264209, Peoples R China; 5. Chinese Acad Sci, Univ Chinese Acad Sci, Beijing 100049, Peoples R China; 6. Univ Texas San Antonio, Dept Comp Sci, San Antonio, TX 78249 USA; 7. Huawei Noahs Ark Lab, Comp Vis, Shenzhen 518129, Peoples R China
Recommended citation (GB/T 7714)	Yang, Shijie,Li, Liang,Wang, Shuhui,et al. SkeletonNet: A Hybrid Network With a Skeleton-Embedding Process for Multi-View Image Representation Learning[J]. IEEE TRANSACTIONS ON MULTIMEDIA,2019,21(11):2916-2929.
APA	Yang, Shijie,Li, Liang,Wang, Shuhui,Zhang, Weigang,Huang, Qingming,&Tian, Qi.(2019).SkeletonNet: A Hybrid Network With a Skeleton-Embedding Process for Multi-View Image Representation Learning.IEEE TRANSACTIONS ON MULTIMEDIA,21(11),2916-2929.
MLA	Yang, Shijie,et al."SkeletonNet: A Hybrid Network With a Skeleton-Embedding Process for Multi-View Image Representation Learning".IEEE TRANSACTIONS ON MULTIMEDIA 21.11(2019):2916-2929.