Robotic Perception

Academic Contact: Dr Mehmet Dogar
Academic Staff: Dr Duygu Sarikaya, Dr Mehmet Dogar, Dr Andy Bulpitt, Dr Des McLernon, Professor Anthony Cohn, Professor David Hogg, Professor Mark Mon-Williams, Professor Netta Cohen

Our work focuses on activity analysis from video, with fundamental research on categorisation, tracking, segmentation and motion modelling, through to the application of this research in several areas. Part of the work is exploring the integration of vision within a broader cognitive framework that includes audition, language, action, and reasoning.

Activity monitoring and recovery: Most existing approaches to activity recognition perform after-the-fact classification, labelling a video only once a single activity has been observed in full. Such systems also usually assume that the same people and objects remain visible throughout, whereas in realistic scenarios people and objects often enter and leave the scene while an activity is in progress. Our activity recognition approach has three main objectives: 1) to recognise the current event from a short observation period (typically two seconds); 2) to anticipate the most probable event that follows on from the current event; and 3) to recognise deviations from the expected activity.
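The sketch below is a minimal, illustrative take on these three objectives, not the group's actual system: a stand-in window classifier scores a short feature window against per-event models, a first-order transition matrix supplies the anticipated next event, and a score threshold flags deviations. All event labels, features and thresholds are invented for illustration.

```python
import numpy as np

EVENTS = ["enter", "use_machine", "leave"]   # hypothetical event labels

# Assumed learned components: per-event feature models and a first-order
# transition matrix estimated from training videos (here random stand-ins).
rng = np.random.default_rng(0)
event_means = {e: rng.normal(size=8) for e in EVENTS}
transitions = np.full((len(EVENTS), len(EVENTS)), 1.0 / len(EVENTS))

def score(window_features, event):
    """Match score between a ~2 s feature window and an event model."""
    return -np.sum((window_features - event_means[event]) ** 2)

def recognise(window_features, deviation_threshold=-20.0):
    """Return (current event, anticipated next event, deviation flag)."""
    scores = {e: score(window_features, e) for e in EVENTS}
    # Objective 1: recognise the current event from the short window.
    current = max(scores, key=scores.get)
    # Objective 2: anticipate the most probable following event.
    nxt = EVENTS[int(np.argmax(transitions[EVENTS.index(current)]))]
    # Objective 3: if even the best event explains the window poorly,
    # report a deviation from the modelled activity.
    deviated = scores[current] < deviation_threshold
    return current, nxt, deviated

current, anticipated, deviated = recognise(rng.normal(size=8))
print(current, anticipated, deviated)
```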

Diagram illustrating a neural network.

Seeing to Learn: Observation Learning in Robotics: Seeing to Learn (S2L) is an ambitious computer vision and robotics project under way at the University of Leeds, aimed at developing advanced observation learning methods for robotic systems. It addresses the inability of current robotic systems to learn from human demonstrations. The project envisions a future in which robots acquire new skills simply by observing humans perform a task, or even by watching online tutorial videos of demonstrations. Robots equipped with these learning methods could operate in a range of real-world settings, from homes to workplaces such as construction sites, where they could learn to perform tasks such as drilling holes, hammering nails or screwing in bolts just by observing other workers.

Diagram showing a robot observing a human carrying out a task and then reproducing the activity.

Unsupervised activity analysis: Our goal is to learn conceptual models of human activity from observation by a mobile robot, given minimal guidance. The ultimate aim is to enable a robot and people to have a shared understanding of what is occurring in the environment. This is a prerequisite for the robot to become a useful assistant.
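As a toy illustration of unsupervised activity discovery, and not the group's actual method, the sketch below clusters hypothetical per-clip motion descriptors into activity "concepts" without any labels; scikit-learn's k-means is assumed as the clustering step, and the feature layout is invented.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Hypothetical per-clip descriptors (e.g. pooled pose/motion features):
# two underlying activity types, 20 observed clips each.
clips = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 16)),
    rng.normal(2.0, 0.3, size=(20, 16)),
])

# Each cluster becomes a candidate activity concept that could later be
# named or grounded through minimal human guidance.
concepts = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(clips)
print(concepts)
```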

Photograph of a human using a water cooler with the person's body overlaid by a wire mesh showing torso and limb positions.

Tracking objects: This work applies local and global constraints in an optimisation procedure in which the interpretation of tracks (local constraints) and events (global constraints) mutually influence each other. We build a model of the person-object relationship over time, focusing on the carry event. Closed, approximately convex contours are detected as potential carried objects and form an initial set of object detections. From the high-confidence detections, a high-level interpretation is computed of which objects are carried, when, and by whom. This interpretation induces a set of object tracks, and the confidence estimates of the detections are then updated based on those tracks. The interpretation step is repeated until convergence.
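A minimal sketch of this alternating optimisation follows, built on invented data structures: each detection is a (frame, position, confidence) triple, the global step accepts high-confidence detections as carried objects, tracks are induced by spatio-temporal proximity, and the local step boosts the confidence of detections consistent with a track. All thresholds and linking rules are illustrative assumptions, not those of the actual system.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    frame: int
    x: float
    confidence: float

def interpret(detections, accept=0.6):
    """Global step: keep high-confidence detections as carried-object events."""
    return [d for d in detections if d.confidence >= accept]

def induce_tracks(carried, max_gap=2, max_dx=1.5):
    """Link accepted detections into tracks by temporal/spatial proximity."""
    tracks = []
    for d in sorted(carried, key=lambda d: d.frame):
        for t in tracks:
            last = t[-1]
            if d.frame - last.frame <= max_gap and abs(d.x - last.x) <= max_dx:
                t.append(d)
                break
        else:
            tracks.append([d])
    return tracks

def update_confidences(detections, tracks, boost=0.1, max_gap=2, max_dx=1.5):
    """Local step: raise confidence of detections consistent with some track."""
    for d in detections:
        for t in tracks:
            if any(abs(d.frame - o.frame) <= max_gap and abs(d.x - o.x) <= max_dx
                   for o in t):
                d.confidence = min(1.0, d.confidence + boost)
                break

detections = [Detection(0, 0.0, 0.7), Detection(1, 0.5, 0.55), Detection(2, 1.0, 0.65)]
previous = None
while True:  # alternate global and local steps until the interpretation is stable
    carried = interpret(detections)
    tracks = induce_tracks(carried)
    update_confidences(detections, tracks)
    signature = tuple(sorted(id(d) for d in carried))
    if signature == previous:
        break
    previous = signature
print(len(tracks), "track(s) after convergence")
```

Note how the mutual influence plays out in this toy run: the middle detection is initially below the acceptance threshold, but because it lies close to the track induced by its neighbours, its confidence is boosted until it joins the interpretation on the next iteration.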

A photograph of a man carrying a plastic container. The container has been outlined in red by the computer vision system.