From BERT to Mamba: Evaluating Deep Learning for Efficient QA Systems
This work explores the trade-offs between accuracy and computational efficiency in question-answering (QA) systems built on LLMs.
This project aims to implement a distributed framework for financial risk assessment using Monte Carlo simulations to estimate Value at Risk (VaR) and Conditional Value at Risk (CVaR) by leveraging Spark’s distributed computing capabilities.
In this project, we analyze emergency response times and patterns for fire incidents and emergency medical services (EMS) in San Francisco to gain actionable insights into response efficiency.
Project to predict the segmentation mask of the 22nd frame of a video from its first 11 frames.
This repo improves Mixture-of-Experts (MoE) models by addressing load imbalance during dynamic routing, enhancing inference performance on hardware accelerators. It integrates cuBLAS and cuSPARSE to optimize batched GEMM tasks for variable-sized inputs, resulting in significant efficiency gains across different model sizes.
We formulate online speaker diarization as a contextual-bandit problem, in a manner similar to online semi-supervised learning.
Implementation and Analysis of a Weakly Consistent Key-Value Store - Dynamo
Project for mining tweets and classifying them by the sentiment they express.
Project architecture that enables continuous touch tracking to detect gestures and send commands to a wearable watch.
Project to execute the decision-making process for communication of an Unmanned Aerial Vehicle.