Kunwar Saaim

About Me

I'm Saaim, a Machine Learning Resident at the Alberta Machine Intelligence Institute (AMII) and a University of Alberta M.Sc. graduate, specializing in Natural Language Processing, Computer Vision, and Deep Learning.

I enjoy reading research papers and exploring how disparate topics can lead to innovative ideas. My research interests focus on Natural Language Processing, Multi-Agent Systems, and Applied AI, particularly multi-modal learning, biomedical applications, and deep learning applications in the natural sciences.

Currently, at AMII, I'm contributing to voice-based diagnostics by engineering pipelines for complex audio datasets and developing advanced acoustic models using novel machine learning techniques. We are exploring methods like contrastive learning to improve multi-disease detection capabilities.

Degree: MS Computing Science

University: University of Alberta, Canada

Here are a few technologies I've been working with recently:

Python
PyTorch
TensorFlow
JAX
Transformers
Vertex AI/GCP

vLLM/Ollama
FastAPI
Docker
OpenVINO/TensorRT
LlamaIndex
Model Context Protocol

Resume

Education

Master of Science: Computing Science

2022 - 2023

University of Alberta, Edmonton, Canada

Master's Thesis Title: Locating anomalies in aerial multi-spectral imagery

Bachelor of Technology: Computer Engineering

2017 - 2021

Aligarh Muslim University, Aligarh, India

Bachelor's Thesis Title: Nowcasting of Multispectral Satellite Imagery from INSAT-3D using Neural Networks

Professional Experience

Alberta Machine Intelligence Institute (AMII) | Machine Learning Resident

October 2024 – Present

Engineered an end-to-end data processing pipeline to standardize and unify large-scale (5,000+ hours) heterogeneous audio datasets from diverse sources, paving the way for fine-tuning Large Acoustic Models (LAMs) for voice-based diagnostics and multi-disease detection.
Developed an intelligent audio pre-processing system integrating Whisper-based speech recognition, speaker diarization, and Phi-4-based transcript analysis to automatically extract relevant patient speech segments from clinical conversations, preparing high-quality input for diagnostic model training.
Designed a hybrid Large Acoustic Model (LAM) architecture that integrates audio transformer embeddings, OpenSMILE-based voice metrics, and ModernBERT-encoded demographic features via self-attention; leveraged contrastive learning to align the fused audio features with text-encoded patient disease descriptions, achieving an F1-score of 80% and an MCC of 70%.

Vector Institute | Machine Learning Associate - Generative AI

May 2024 – September 2024

Spearheaded the development and deployment of a personalized medical scribe using the Llama-3 architecture, utilizing Vertex AI on GCP for efficient data processing and vLLM for optimized model serving and low-latency inference.
Implemented distributed training using Fully Sharded Data Parallel (FSDP) for fine-tuning and preference tuning of the Mistral-7B and Llama-3 models, and benchmarked their efficacy against multiple LLMs.
Designed and refined instruction prompts for MedLM (PaLM 2) and Claude 3.5 Sonnet, implementing recursive refinement and chain-of-thought methodologies to achieve state-of-the-art few-shot performance.

Aerium Analytics | Machine Learning Developer

May 2023 – April 2024

Developed multi-spectral foreign object debris (FOD) detection for drone imagery using normalizing flows and masked image modeling for robust feature extraction.
Achieved 99.28% IoU with SegFormer for precise grass segmentation on airport runways, enhancing overall system precision.
Optimized a YOLO-NAS model via TensorRT conversion, boosting inference speed by 15%, and deployed it with DeepStream on a Jetson Orin Nano.
Deployed models as versioned FastAPI endpoints in Docker within an Agile framework, ensuring seamless integration and maintenance.

University of Alberta | Graduate Research Assistant

October 2022 – April 2023

Implemented Implicit Neural Representation techniques to enhance spectral data richness in multi-spectral images.
Created algorithms for transforming multi-spectral data from Sentinel-2 and Landsat into hyper-spectral images.
Increased the number of spectral bands from 5 to 26, a 420% increase.

Fatima Fellowship | Natural Language Processing Fellow

June 2022 – February 2023

Developed the PyDebiaser library for addressing bias in transformer models (GPT-2, BERT, GPT-Neo).
Demonstrated the impact of bias mitigation techniques on large language models.
Fostered ethical AI practices and fairness in NLP applications.

Indian Institute of Technology, Delhi | Machine Learning Engineer

September 2021 – March 2022

Developed an intelligent edge device prototype for visually impaired pedestrians.
Ported PyTorch models to ONNX, reducing model size by 50%.
Achieved near real-time detection on a Raspberry Pi with an Intel Neural Compute Stick.

Indian Institute of Information Technology, Allahabad | Research Intern

December 2020 – February 2021

Worked on the development of an explainable AI model for seizure detection using EEG data.
Reviewed literature on explainable AI techniques.
Designed a residual depthwise separable convolution-based architecture for seizure detection.

King Fahd University of Petroleum & Minerals, Dhahran | Research Intern

June 2020 – November 2020

Worked on file fragment classification, a task in digital forensics and data carving.
Developed a depthwise separable convolution-based model for efficient classification of data fragments; the model is 24 times faster than the previous state of the art.

Interdisciplinary Biotechnology Unit, AMU | Undergrad Researcher

December 2019 – March 2020

Worked on the localisation of intrinsically disordered regions in protein sequences given in FASTA format, using a residual ConvLSTM network.
Reviewed literature on deep learning techniques for drug repurposing, specifically for COVID-19.

Projects

Nowcasting Satellite Images

Nowcasting (short-term forecasting) of multi-spectral satellite imagery using neural networks (Bachelor's Thesis, COC4980)

Nanoparticle Segmentation

Explaining different black-box segmentation networks for nanoparticle TEM images

File Fragment Classification

Efficient file fragment classification using an Inception-style depthwise separable convolutional network, trained on the FFT-75 dataset. The model achieves state-of-the-art accuracy and inference time.

Seizure Detection

Explainable residual depthwise separable convolution-based architecture for seizure detection, trained on the CHB-MIT EEG dataset.

Aquatic Animal Pose Estimation

Markerless pose estimation of aquatic animals in a visual feed

Extended Abstract Poster

Classification of Disordered Residues in Intrinsically Disordered Proteins

This is a many-to-many problem, for which we designed a neural network combining a bidirectional ConvLSTM with skip connections to predict the probability of disorder for each amino acid in a protein sequence.

TensorFlow Object Detection App

PyQt5 GUI application for object detection using the TensorFlow Object Detection API.

GitHub

Retinal Image Generation and Segmentation

Fully unsupervised two-step pipeline for synthesizing high-quality retinal images, along with the corresponding segmented vessel structure.

Smart Diagnose

Smart Diagnose is a web app that predicts whether a chest X-ray shows pneumonia; when pneumonia is detected, it also displays the X-ray with the affected region localized.