Karun Sharma
I am a final-year Artificial Intelligence and Computer Science undergraduate student,
currently working at Georgia Institute of Technology as a Research Intern under Prof. Vijay K Madisetti on multimodal visual grounding.
Prior to this, I worked at Zocket AI
as a Computer Vision Intern.
My research interests lie in Multimodal Learning (any-to-any modalities) and Embodied AI.
Email / GitHub / LinkedIn
Research Intern - Georgia Institute of Technology
08 - 2024
Working on multimodal visual grounding for images and videos using open-vocabulary computer vision techniques.
Computer Vision Intern - Zocket AI
02 - 2024
Worked on a content moderation engine for images
generated by our AI models, checking them against the policies of various social media sites.
Trained and tuned a background removal model for fine-grained, smooth output.
Research
I'm interested in computer vision, multimodal learning, machine learning, and optimization.
LLaVA-PlantDiag: Integrating Large-scale Vision-Language Abilities for Conversational Plant Pathology Diagnosis
Karun Sharma, Vidushee Vats, Abhinendra Singh, Rahul Sahani, Dr. Deepak Rai, Dr. Ashok Sharma
Preprint, 2024
website /
LLaVA-PlantDiag is a conversational AI system designed for plant pathology. We fine-tune the model using visual instruction tuning.
Our model outperforms others such as GPT-4 Vision and Gemini. We also release the first multimodal dataset on plant pathology.
An Improved Hybrid Model for Target Detection
Umesh Gupta, Richa Golash, Vidushee Vats, Karun Sharma
International Conference on Emerging Techniques in Computational Intelligence, 2023
IEEE /
We developed a refined model (YOLO and R-CNN families) for detecting multiple objects by fusing thermal and visible images. The fusion techniques, including multiscale fusion, channel-based fusion, and blind source separation, significantly improve target detection in hazardous environments, enhancing safety and security in critical applications such as autonomous driving and surveillance.