publications
Publications in reverse chronological order
2024
- RGB-X Object Detection via Scene-Specific Fusion Modules
  Sri Aditya Deevi, Connor Lee, Lu Gan, and 3 more authors
  Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Jan 2024
Multimodal deep sensor fusion has the potential to enable autonomous vehicles to visually understand their surrounding environments in all weather conditions. However, existing deep sensor fusion methods usually employ convoluted architectures with intermingled multimodal features, requiring large coregistered multimodal datasets for training. In this work, we present an efficient and modular RGB-X fusion network that can leverage and fuse pretrained single-modal models via scene-specific fusion modules, thereby enabling joint input-adaptive network architectures to be created using small, coregistered multimodal datasets. Our experiments demonstrate the superiority of our method compared to existing works on RGB-thermal and RGB-gated datasets, performing fusion using only a small number of additional parameters. Our code is available at https://github.com/dsriaditya999/RGBXFusion.
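A minimal sketch of the general idea behind scene-specific fusion on top of frozen single-modal backbones: a small trainable gate mixes RGB and X (e.g. thermal) feature maps, so only a few extra parameters are learned. The module name, gating design, and tensor shapes below are illustrative assumptions, not the architecture actually used in the paper.

```python
import torch
import torch.nn as nn

class SceneSpecificFusionGate(nn.Module):
    """Illustrative gated fusion of feature maps from two frozen single-modal encoders.

    Only this small gate is trained, mirroring the idea of adding few parameters
    on top of pretrained models; the paper's actual module design may differ.
    """

    def __init__(self, channels: int):
        super().__init__()
        # Lightweight head that predicts per-pixel mixing weights in [0, 1].
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb_feat: torch.Tensor, x_feat: torch.Tensor) -> torch.Tensor:
        # rgb_feat, x_feat: (B, C, H, W) features from frozen RGB and X encoders.
        w = self.gate(torch.cat([rgb_feat, x_feat], dim=1))
        return w * rgb_feat + (1.0 - w) * x_feat
```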
- Efficient Self-Supervised Neural Architecture Search
  Sri Aditya Deevi, Asish Kumar Mishra, Deepak Mishra, and 3 more authors
  Accepted at the International Conference on Ubiquitous Information Management and Communication (IMCOM) 2025, Oct 2024
Deep Neural Networks (DNNs) have successfully demonstrated superior performance on many tasks across multiple domains. Their success is made possible by expert practitioners’ careful design of neural architectures. This manual handcrafted design requires a colossal amount of computational resources, time, and memory to arrive at an optimal architecture. Automated Neural Architecture Search (NAS) is a promising area to explore to overcome these issues. However, optimizing a network for a task is tedious, requiring lengthy search times, high processor needs, and a thorough examination of an enormous space of possibilities. The need of the hour is to develop a strategy that saves time while maintaining an excellent level of accuracy. In this paper, we design, explore, and experiment with various differentiable NAS methods that are memory-, time-, and compute-efficient. We also explore the role and efficacy of self-supervision in guiding the search for optimal architectures. Self-supervision offers numerous advantages, such as facilitating the use of unlabelled data and making the “learning” non-task-specific, thereby improving transfer to other tasks. To study the inclusion of self-supervision in the search process, we propose a simple loss function consisting of a convex combination of a supervised cross-entropy loss and a self-supervision loss. In addition, we carry out various analyses to characterize the performance of the different approaches considered in this paper. Inspection of the results from various experiments on CIFAR-10 reveals that the proposed methodology balances time and accuracy while remaining close to state-of-the-art results.
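The loss described in the abstract can be written directly as a convex combination of the two terms. The sketch below assumes a generic, precomputed self-supervision loss value and a mixing weight alpha; the paper's actual objective and weighting schedule may differ.

```python
import torch
import torch.nn.functional as F

def nas_search_loss(logits: torch.Tensor,
                    labels: torch.Tensor,
                    ssl_loss: torch.Tensor,
                    alpha: float = 0.5) -> torch.Tensor:
    """Convex combination of supervised cross-entropy and a self-supervision loss.

    alpha in [0, 1] weights the supervised term; (1 - alpha) weights the
    self-supervised term. Illustrative only.
    """
    ce = F.cross_entropy(logits, labels)
    return alpha * ce + (1.0 - alpha) * ssl_loss
```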
2022
- Expeditious object pose estimation for autonomous robotic grasping
  Sri Aditya Deevi and Deepak Mishra
  International Conference on Computer Vision and Image Processing, Nov 2022
The ability of a robot to sense and “perceive” its surroundings using vision-based sensors, and to interact with and influence various objects of interest by grasping them, is the main principle behind vision-based Autonomous Robotic Grasping. To realise this task of autonomous object grasping, one of the critical sub-tasks is the 6D Pose Estimation of a known object of interest from sensory data in a given environment. The sensory data can include RGB images and data from depth sensors, but determining the object’s pose using only a single RGB image is cost-effective and highly desirable in many applications. In this work, we develop a series of convolutional neural network-based pose estimation models without post-refinement stages, designed to achieve high accuracy on relevant metrics for efficiently estimating the 6D pose of an object using only a single RGB image. The designed models are incorporated into an end-to-end pose estimation pipeline based on Unity and ROS Noetic, where a UR3 Robotic Arm is deployed in a simulated pick-and-place task. The pose estimation performance of the different models is compared and analysed in both same-environment and cross-environment cases, utilising synthetic RGB data collected from cluttered and simple simulation scenes constructed in the Unity environment. In addition, the developed models achieve high Average Distance (ADD) metric scores, greater than 93% for most of the real-life objects tested from the LINEMOD dataset, and can be integrated seamlessly with any robotic arm for estimating 6D pose from only RGB data, making our method effective, efficient, and generic.
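For reference, the Average Distance (ADD) metric mentioned above is conventionally computed as the mean distance between object model points transformed by the predicted pose and by the ground-truth pose, with a pose typically counted as correct when this distance falls below 10% of the object diameter. The snippet below is a small sketch of that standard definition, not the paper's exact evaluation code.

```python
import numpy as np

def add_metric(model_points: np.ndarray,
               R_pred: np.ndarray, t_pred: np.ndarray,
               R_gt: np.ndarray, t_gt: np.ndarray) -> float:
    """Average Distance (ADD) between model points under predicted and ground-truth 6D poses.

    model_points: (N, 3) 3D points sampled from the object model.
    R_pred, R_gt: (3, 3) rotation matrices; t_pred, t_gt: (3,) translation vectors.
    """
    pred = model_points @ R_pred.T + t_pred
    gt = model_points @ R_gt.T + t_gt
    return float(np.linalg.norm(pred - gt, axis=1).mean())
```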
- Data Summarization in Internet of Things
  Sri Aditya Deevi and B. S. Manoj
  SN Computer Science, May 2022
With recent advances in the field of Internet of Things (IoT), the quantity of data generated by various sensors and Internet users has increased dramatically, which in turn has skyrocketed the need for efficient data compression methods. Data summarization is an efficient and effective technique for data compression that can generate a brief and succinct summary from typically larger quantities of data in an intelligent and highly useful manner, and it can be performed at various levels of abstraction. The impact of using such a technique in large IoT networks can be significantly advantageous in terms of reduced processing time, overall computation, data storage and transmission requirements, energy consumption, and possible workload on IoT users. In this work, existing data summarization techniques at various levels of abstraction of typical IoT networks are reviewed. The levels of categorization considered are Low-level and High-level. Under each abstraction level, various techniques are further classified while briefly describing their essential characteristics.
2021
- HeartNetEC: a deep representation learning approach for ECG beat classification
  Sri Aditya Deevi, Christina Perinbam Kaniraja, Vani Devi Mani, and 3 more authors
  Biomedical Engineering Letters, Feb 2021
One of the most crucial and informative tools at the disposal of a cardiologist for examining the condition of a patient’s cardiovascular system is the electrocardiogram (ECG/EKG). A major reason behind the need for accurate reconstruction of the ECG is that the shape of the ECG tracing is crucial for determining the health condition of an individual: whether a patient is prone to or diagnosed with cardiovascular diseases (CVDs) can be gathered through examination of the ECG signal. Among various other methods, one of the most helpful in identifying cardiac abnormalities is a beat-wise categorization of a patient’s ECG record. In this work, a highly efficient deep representation learning approach for ECG beat classification is proposed, which can significantly reduce the burden and time spent by a cardiologist on ECG analysis. The proposed approach consists of two sub-systems: a denoising block and a beat classification block. The denoising block acquires the ECG signal from the patient and denoises it. The beat classification block then processes the denoised ECG signal to identify the different classes of beats through an efficient algorithm. Deep learning-based methods are employed in both stages. Our proposed approach has been tested on PhysioNet’s MIT-BIH Arrhythmia Database for beat-wise classification into ten important types of heartbeats. The results show that the proposed approach is capable of making meaningful predictions and gives superior results on relevant metrics.
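A toy sketch of the two-stage structure described above (a denoising block followed by a beat classifier over ten classes); the layer choices, sizes, and class names are illustrative assumptions rather than the HeartNetEC architecture.

```python
import torch
import torch.nn as nn

class Denoiser(nn.Module):
    """Toy 1D convolutional denoiser for single-lead ECG segments (illustrative only)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.Conv1d(16, 1, kernel_size=7, padding=3),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, 1, L)
        return self.net(x)

class BeatClassifier(nn.Module):
    """Toy classifier mapping a denoised beat to one of ten heartbeat classes (illustrative only)."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Linear(32, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pool over time, then classify the pooled feature vector.
        return self.head(self.features(x).squeeze(-1))
```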