Proceedings of the 3rd International Conference on Mechatronics and Industrial Informatics

Extracting High-level Multimodal Features

Authors
Xin Li, Ruifang Liu
Corresponding Author
Xin Li
Available Online October 2015.
DOI
10.2991/icmii-15.2015.103
Keywords
deep learning; denoising autoencoder; multimodal
Abstract

We consider the problem of building high-level multimodal features from unlabeled data alone. On a large dataset of multimodal information, we train a model consisting of a sparse stacked denoising autoencoder network with max pooling, which extracts high-level image features, together with a text-processing pipeline. Our model joins the image features and text features into a single representation of each movie. We find that these representations can be used in a regression task, predicting a movie's rating, and that the model performs better than unimodal representations.
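The denoising-autoencoder training described in the abstract can be illustrated with a minimal single-layer sketch in plain NumPy. This is an illustrative assumption, not the authors' actual model: it uses tied weights and masking noise on toy data, and omits the sparsity penalty, stacking, and max pooling the paper adds.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy unlabeled data with low-rank structure, so there is something to learn:
# 200 samples, 20 dimensions generated from 5 latent factors.
basis = rng.normal(0.0, 1.0, (5, 20))
X = sigmoid(rng.random((200, 5)) @ basis)

n_in, n_hid = X.shape[1], 8              # hidden size is an arbitrary choice
W = rng.normal(0.0, 0.1, (n_in, n_hid))  # tied weights: encoder W, decoder W.T
b_h = np.zeros(n_hid)
b_v = np.zeros(n_in)
lr, noise_p = 0.5, 0.3                   # learning rate, masking-noise fraction

losses = []
for epoch in range(200):
    # Corrupt the input with masking noise (randomly zero a fraction of entries).
    X_tilde = X * (rng.random(X.shape) > noise_p)
    # Encode the corrupted input, decode with the transposed weights.
    H = sigmoid(X_tilde @ W + b_h)
    X_hat = sigmoid(H @ W.T + b_v)
    # Reconstruction error is measured against the CLEAN input -- this is
    # what makes it a *denoising* autoencoder.
    err = X_hat - X
    losses.append((err ** 2).mean())
    # Backpropagation; the tied weight matrix gets gradients from both layers.
    d_vis = err * X_hat * (1.0 - X_hat)
    d_hid = (d_vis @ W) * H * (1.0 - H)
    W -= lr * (X_tilde.T @ d_hid + d_vis.T @ H) / len(X)
    b_h -= lr * d_hid.mean(axis=0)
    b_v -= lr * d_vis.mean(axis=0)

# The hidden activations of the clean input serve as learned features; in a
# stacked model these would feed the next autoencoder layer, and in the paper's
# setting the image features would be concatenated with text features.
features = sigmoid(X @ W + b_h)
```

Training against the uncorrupted target forces the hidden layer to capture structure that survives corruption, rather than copying the input, which is the core idea the stacked model builds on.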

Copyright
© 2015, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).


Volume Title
Proceedings of the 3rd International Conference on Mechatronics and Industrial Informatics
Series
Advances in Computer Science Research
Publication Date
October 2015
ISBN
978-94-6252-131-5
ISSN
2352-538X
DOI
10.2991/icmii-15.2015.103

Cite this article

TY  - CONF
AU  - Xin Li
AU  - Ruifang Liu
PY  - 2015/10
DA  - 2015/10
TI  - Extracting High-level Multimodal Features
BT  - Proceedings of the 3rd International Conference on Mechatronics and Industrial Informatics
PB  - Atlantis Press
SP  - 605
EP  - 610
SN  - 2352-538X
UR  - https://doi.org/10.2991/icmii-15.2015.103
DO  - 10.2991/icmii-15.2015.103
ID  - Li2015/10
ER  -