File(s) under permanent embargo
Learning temporal information from spatial information using CapsNets for human action recognition
Conference contribution
posted on 2019-01-01, 00:00, authored by A. M. Algamdi, V. Sanchez, Chang-Tsun Li

Capsule Networks (CapsNets) were recently introduced to overcome some of the shortcomings of traditional Convolutional Neural Networks (CNNs). CapsNets replace the neurons of CNNs with vectors in order to retain spatial relationships among features. In this paper, we propose a CapsNet architecture that employs individual video frames for human action recognition without explicitly extracting motion information. We also propose weight pooling, which reduces computational complexity and improves classification accuracy by appropriately removing some of the extracted features. We show how the capsules of the proposed architecture can encode temporal information by using the spatial features extracted from several video frames. Compared with a traditional CNN of the same complexity, the proposed CapsNet improves action recognition performance by 12.11% and 22.29% on the KTH and UCF-sports datasets, respectively.
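For readers unfamiliar with the capsule representation the abstract relies on, the sketch below illustrates the standard CapsNet building block (Sabour et al., 2017): each capsule is a vector whose length encodes detection probability and whose orientation encodes pose, normalized by the "squash" nonlinearity. This is a minimal illustrative example only; the paper's weight pooling and temporal-encoding details are not described in this record, and the capsule count and dimensions below are arbitrary assumptions.

```python
# Minimal sketch of the capsule representation used by CapsNets.
# Capsule lengths act as class-presence probabilities; orientations
# encode feature pose. Shapes here are illustrative, not the paper's.
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Squash nonlinearity: v = (|s|^2 / (1 + |s|^2)) * s / |s|.
    Shrinks each capsule vector so its norm lies in [0, 1) while
    preserving its direction."""
    norm_sq = np.sum(s ** 2, axis=axis, keepdims=True)
    norm = np.sqrt(norm_sq + eps)
    return (norm_sq / (1.0 + norm_sq)) * (s / norm)

# Toy example: 10 output capsules of dimension 8, e.g. one per action class.
caps = np.random.randn(10, 8)
v = squash(caps)
lengths = np.linalg.norm(v, axis=-1)  # each length is in [0, 1)
print(lengths)
```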
History
Event: IEEE Signal Processing Society. Conference (44th : 2019 : Brighton, Eng.)
Series: IEEE Signal Processing Society Conference
Pagination: 3867-3871
Publisher: Institute of Electrical and Electronics Engineers
Location: Brighton, Eng.
Place of publication: Piscataway, N.J.
Publisher DOI:
Start date: 2019-05-12
End date: 2019-05-17
ISSN: 1520-6149
ISBN-13: 9781479981311
Language: eng
Publication classification: E1 Full written paper - refereed
Copyright notice: 2019, IEEE
Editor/Contributor(s): [Unknown]
Title of proceedings: ICASSP 2019 : Proceedings of the 2019 44th IEEE International Conference on Acoustics, Speech and Signal Processing