Visual object perception problems in computer vision


dc.contributor.advisor	Raman, Shanmuganathan
dc.contributor.author	Vora, Aditya Narendrabhai
dc.date.accessioned	2025-04-25T14:23:43Z
dc.date.available	2025-04-25T14:23:43Z
dc.date.issued	2017
dc.identifier.citation	Vora, Aditya Narendrabhai (2017). Visual object perception problems in computer vision. Gandhinagar: Indian Institute of Technology Gandhinagar, 36p. (Acc. No.: T00223).
dc.identifier.uri	https://repository.iitgn.ac.in/handle/123456789/11348
dc.description.abstract	Video object segmentation is the task of separating foreground object segments from the background throughout a video. We propose a frame-by-frame approach to video object segmentation that uses cluster information to select foreground segments. Unlike previous approaches, which use optical flow to localize dynamic object segments throughout the video, we select a set of foreground segments from a pool of region proposals through clustering. This avoids optical flow and thus helps our algorithm scale to longer video sequences. Object localization is the task of estimating precise localized windows around all object instances in an image. We propose an algorithm for object localization for the case in which a single object instance appears in the image. Unlike previous supervised and weakly supervised techniques, which require heavy training to learn classifiers, our approach is completely unsupervised. It relies on iterative spectral clustering to select proposals that contain an object from a large set of proposals generated by an object proposal generation algorithm. From this set of filtered object proposals, we then estimate the final localized window by considering the inter-class and intra-class variations among the object proposals, so the entire algorithm remains unsupervised. We also consider the design of a fully automated action recognition system for uncontrolled environments. Most existing algorithms construct handcrafted features from the input and then learn classifiers on those features. However, handcrafted features are inefficient at modelling more complex scenes. Convolutional neural networks (CNNs) are a class of deep learning models that can learn features automatically from the input during training. We design a 3D convolutional neural network for human action recognition. This model extracts features in the spatio-temporal domain and can thereby capture the motion information encoded in multiple contiguous frames, which is required for all video processing applications.
dc.description.statementofresponsibility	by Aditya Narendrabhai Vora
dc.format.extent	36p.: 29 cm.
dc.language.iso	en_US
dc.publisher	Indian Institute of Technology Gandhinagar
dc.subject	14410006
dc.subject	Video Object Segmentation
dc.subject	Spatio-Temporal Domain
dc.subject	CNN (Convolutional Neural Network)
dc.subject	Deep Learning Models
dc.subject	Video Processing Applications
dc.title	Visual object perception problems in computer vision
dc.type	Thesis
dc.contributor.department	Electrical Engineering
dc.description.degree	M. Tech.
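
The abstract states that a 3D convolution extracts spatio-temporal features and so captures motion across contiguous frames. The following is a minimal sketch of that idea, not the thesis's network: it applies a single hand-set kernel with a naive pure-Python 3D convolution (cross-correlation, as in CNN layers). The function names, the toy videos, and the temporal-difference kernel are all hypothetical choices made for illustration; a real 3D CNN would learn many such kernels during training.

```python
# Illustrative sketch only: a naive "valid" 3D convolution (cross-correlation)
# over a video stored as nested lists indexed [T][H][W].
# All names and kernel values are hypothetical, not taken from the thesis.

def conv3d_valid(video, kernel):
    """Slide a (kt, kh, kw) kernel over a (T, H, W) video, stride 1, no padding."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over the temporal axis as well as the two spatial axes.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += video[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def make_video(num_frames, size, moving):
    """One bright pixel on a dark background; it shifts right each frame if `moving`."""
    frames = []
    for t in range(num_frames):
        frame = [[0.0] * size for _ in range(size)]
        frame[0][t if moving else 0] = 1.0
        frames.append(frame)
    return frames


def total_energy(response):
    """Sum of absolute responses over the whole output volume."""
    return sum(abs(v) for plane in response for row in plane for v in row)


# Kernel spanning two contiguous frames: next frame minus current frame.
temporal_diff = [[[-1.0]], [[1.0]]]

static_response = conv3d_valid(make_video(3, 4, moving=False), temporal_diff)
moving_response = conv3d_valid(make_video(3, 4, moving=True), temporal_diff)

print(total_energy(static_response), total_energy(moving_response))
```

Because the kernel extends along the time axis, the static clip produces no response while the clip with a moving pixel does. A purely 2D kernel applied frame by frame could not make this distinction.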

Files in this item

Files	Size	Format	View
There are no files associated with this item.

This item appears in the following Collection(s)

Dissertations and Theses (Masters and Doctoral) [294]
