Extracting High-level Multimodal Features
Authors
Xin Li, Ruifang Liu
Corresponding Author
Xin Li
Available Online October 2015.
- DOI
- 10.2991/icmii-15.2015.103
- Keywords
- deep learning; denoising autoencoder; multimodal
- Abstract
We consider the problem of building high-level multimodal features from unlabeled data only. We train a model consisting of a sparse stacked denoising autoencoder network with max pooling, which extracts high-level image features, on a large dataset of multimodal information, combined with a text-processing pipeline. Our model joins the image features and the text features into a single representation of a movie. We find that these representations can be used in a regression task, predicting a movie's rating, and that the model achieves better results than unimodal representations.
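The pipeline the abstract describes (denoising autoencoder for image features, a separate text feature vector, concatenation, then a rating regressor) can be sketched roughly as below. This is a minimal PyTorch illustration of the general technique, not the authors' implementation: the layer sizes, the Gaussian corruption, the L1 stand-in for the sparsity penalty, and the random stand-in data are all assumptions made for the example.

```python
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    """One layer of a stacked denoising autoencoder (illustrative sizes)."""

    def __init__(self, in_dim, hidden_dim, noise_std=0.3):
        super().__init__()
        self.noise_std = noise_std
        self.encode = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.Sigmoid())
        self.decode = nn.Sequential(nn.Linear(hidden_dim, in_dim), nn.Sigmoid())

    def forward(self, x):
        noisy = x + self.noise_std * torch.randn_like(x)  # corrupt the input
        code = self.encode(noisy)
        return self.decode(code), code                    # reconstruct clean x

def pretrain_layer(dae, patches, epochs=20, sparsity_weight=1e-3):
    """Unsupervised pretraining: reconstruct clean patches from corrupted ones.
    An L1 penalty on the hidden code stands in for the sparsity constraint."""
    opt = torch.optim.Adam(dae.parameters(), lr=1e-3)
    mse = nn.MSELoss()
    for _ in range(epochs):
        recon, code = dae(patches)
        loss = mse(recon, patches) + sparsity_weight * code.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

# --- toy demo with random stand-in data ---
n_patches, patch_dim, text_dim = 500, 256, 100
patches = torch.rand(n_patches, patch_dim)    # image patches from one movie's stills
dae = DenoisingAutoencoder(patch_dim, 64)
pretrain_layer(dae, patches)

# Max pooling over the per-patch codes gives one image feature vector per movie.
with torch.no_grad():
    codes = dae.encode(patches)               # encode clean patches at extraction time
image_feat = codes.max(dim=0).values          # shape: (64,)

# Fuse with a (here random) text feature vector and regress the movie rating.
text_feat = torch.rand(text_dim)
joint = torch.cat([image_feat, text_feat])    # the joint multimodal representation
regressor = nn.Linear(joint.numel(), 1)       # rating predictor (would be trained)
print(regressor(joint).item())
```

A stacked version would greedily pretrain a second autoencoder on the hidden codes of the first, and the final linear regressor would be fit on (joint representation, rating) pairs rather than used untrained as here.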
- Copyright
- © 2015, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Xin Li
AU - Ruifang Liu
PY - 2015/10
DA - 2015/10
TI - Extracting High-level Multimodal Features
BT - Proceedings of the 3rd International Conference on Mechatronics and Industrial Informatics
PB - Atlantis Press
SP - 605
EP - 610
SN - 2352-538X
UR - https://doi.org/10.2991/icmii-15.2015.103
DO - 10.2991/icmii-15.2015.103
ID - Li2015/10
ER -