Human Action Recognition from AD Movies


Action recognition is crucial for robots to perform around humans. Indeed, robots need to assess human actions and intentions in order to assist people in everyday tasks and to collaborate efficiently.

The field of action recognition aims to use the sensors typically found on robots to recognize agents, objects, and the actions they are performing. The typical approach is to record a dataset of various actions and label them. But these actions are often not natural, and it can be difficult to represent the variety of ways to perform an action with a lab-built dataset. In this project we propose to use audio description (AD) movies to label actions. AD movies integrate a form of narration that allows visually impaired viewers to understand the visual elements shown on screen. This narration often describes the actions actually depicted in the scene.

Goals & Milestones

During this project, the student will:

  • Develop a pipeline to collect and crop clips of AD movies for at-home actions. This extraction tool should be flexible and allow new actions to be integrated. It will, for instance, feature video and text processing to extract [Subject + Action + Object] type data.
  • Investigate methods for human action recognition (HAR)
  • Implement a tree model combining HAR with YOLO to identify agents and objects
  • Evaluate the HAR pipeline on the Toyota HSR robot
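
As an illustration of the first milestone, the text side of the extraction could start from something like the sketch below. It is only a starting point under several assumptions that the project does not specify: the AD narration is assumed to be available as SRT-style timed text, the narration is assumed to use simple present-tense sentences, and the names ACTION_LEXICON, LabeledClip, and label_clips are hypothetical. A real pipeline would likely replace the regular expression with a proper syntactic parser.

```python
import re
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple

# Hypothetical lexicon: verb form as narrated -> action label.
# Adding a new action only requires a new entry here.
ACTION_LEXICON = {
    "opens": "open",
    "closes": "close",
    "pours": "pour",
    "picks up": "pick_up",
    "drinks": "drink",
}


@dataclass
class LabeledClip:
    start: float  # clip start in seconds
    end: float    # clip end in seconds
    subject: str
    action: str
    obj: str


_TIME = re.compile(r"(\d+):(\d+):(\d+)[,.](\d+)")
# Longest verb forms first so multi-word verbs are tried before shorter ones.
_VERBS = "|".join(
    re.escape(v) for v in sorted(ACTION_LEXICON, key=len, reverse=True)
)
_TRIPLE = re.compile(
    rf"^(?P<subject>.+?)\s+(?P<verb>{_VERBS})\s+(?P<object>[^,.;]+)",
    re.IGNORECASE,
)


def to_seconds(stamp: str) -> float:
    """Convert an 'HH:MM:SS,mmm' timestamp to seconds."""
    match = _TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"unrecognized timestamp: {stamp!r}")
    h, m, s, frac = match.groups()
    return int(h) * 3600 + int(m) * 60 + int(s) + float("0." + frac)


def parse_cues(srt_text: str) -> Iterator[Tuple[float, float, str]]:
    """Yield (start, end, text) for each cue in SRT-style text."""
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        timing = next((i for i, ln in enumerate(lines) if "-->" in ln), None)
        if timing is None:
            continue  # not a cue block
        start, end = (to_seconds(p) for p in lines[timing].split("-->"))
        yield start, end, " ".join(lines[timing + 1:])


def extract_triple(sentence: str) -> Optional[Tuple[str, str, str]]:
    """Return (subject, action, object) if a lexicon verb is found."""
    m = _TRIPLE.match(sentence.strip())
    if m is None:
        return None
    action = ACTION_LEXICON[m["verb"].lower()]
    return m["subject"].strip(), action, m["object"].strip()


def label_clips(srt_text: str, padding: float = 0.5) -> List[LabeledClip]:
    """Turn narration cues into labeled clip intervals, padded in seconds."""
    clips = []
    for start, end, text in parse_cues(srt_text):
        triple = extract_triple(text)
        if triple is not None:
            clips.append(
                LabeledClip(max(0.0, start - padding), end + padding, *triple)
            )
    return clips


# Made-up narration used only to exercise the sketch.
SAMPLE = """1
00:00:12,000 --> 00:00:14,500
Anna opens the fridge.

2
00:00:15,000 --> 00:00:17,000
Rain falls outside.

3
00:00:20,250 --> 00:00:23,000
He picks up a red mug.
"""

if __name__ == "__main__":
    for clip in label_clips(SAMPLE):
        print(clip)
```

The resulting intervals could then be passed to a video tool to crop the matching clips. Keeping the action vocabulary in a single lexicon is one way to meet the flexibility requirement, since a new action is added without touching the parsing code. Note that the pattern is deliberately naive: it treats everything before the verb as the subject and everything up to the next punctuation mark as the object.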


Human Action Recognition


  • Skills: Python, C++, Git.


Senior Lecturer and ARC DECRA Fellow

My research interests include human-robot interaction, human-computer interaction, and intelligent and autonomous systems.