Amin Karimi Monsefi
I am a Ph.D. student in Computer Science at The Ohio State University, working on Computer Vision, Vision-Language Models, and Self-Supervised Learning under the supervision of Professor Rajiv Ramnath. My research covers image and video generation and self-supervised learning techniques for computer vision.
I am actively looking for a research internship for summer 2025!
Research Interests:
Image and Video Generation:
- Developing innovative methods for generating high-quality images and videos.
- Projects include Multi-Guided Image Inpainting and Multi-Modal Conditional Video Generation.
- Exploring how the creativity of Large Language Models (LLMs) can be utilized in video generation with diffusion models.
Self-Supervised Learning for Vision:
- Designing self-supervised approaches to learn meaningful representations from unlabeled data.
- Projects include Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning and a self-supervised approach for general images using multimodal architectures like CLIP.
- Applying self-supervised learning to medical image analysis to overcome the challenge of limited labeled data.
Medical Image Analysis:
- Utilizing self-supervised learning to train models on unlabeled medical images.
- The goal is to extract features that improve analysis and interpretation of medical images.
- Developed Masked LoGoNet, a neural network architecture with tailored self-supervised learning for efficient medical image segmentation.
Recent News and Updates:
Reviewer Appointments:
Selected to serve as a reviewer for ICLR 2025
Selected to serve as a reviewer for WACV 2025
Selected to serve as a reviewer for SIGKDD 2025
Selected to serve as a reviewer for SIGKDD 2024
Publications:
2024/10/01 - Submitted for Publication: KnobGen: Controlling the Sophistication of Artwork in Sketch-Based Diffusion Models
2024/10/01 - Submitted for Publication: Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning
2024/09/01 - Submitted for Publication: DetailCLIP: Detail-Oriented CLIP for Fine-Grained Tasks
2024/05/31 - Accepted in SIGKDD 2024: Masked LoGoNet: Fast and Accurate 3D Image Analysis for Medical Domain
Accepted in Biomedical Optics Express Journal - 2024: Reducing Manual Labeling Requirements and Improved Retinal Ganglion Cell Identification in 3D AO-OCT Volumes Using Semi-Supervised Learning
Accepted in ACM SIGSPATIAL International Workshop on Advances in Urban-AI - 2023: CrashFormer: A Multimodal Architecture to Predict the Risk of Crash
Accepted in SIGKDD 2023: Novel Physics-Based Machine-Learning Models for Indoor Air Quality Approximations
Accepted in Digital Communications and Networks Journal - 2023: Smart and collaborative industrial IoT: A federated learning and data space approach
Accepted in ACM SIGSPATIAL 2022: Will there be a construction? Predicting road constructions based on heterogeneous spatiotemporal data
Bachelor’s and Master’s Theses:
My Bachelor’s thesis applied reinforcement learning in a multi-object environment in which each object could train individually. I also incorporated federated learning so that the objects could generalize their models to one another. The work explored how combining these approaches can improve learning and decision-making in complex environments.
My Master’s thesis addressed software testing. I proposed an approach that uses machine learning to generate datasets covering the main paths within the software, enabling effective fault detection. The aim was to make software testing more efficient and accurate, and thereby improve the overall quality and reliability of software systems.