Amin Karimi Monsefi

I am a dedicated Ph.D. student in Computer Science at The Ohio State University, focusing on Computer Vision, Vision-Language Models, and Self-Supervised Learning under the supervision of Professor Rajiv Ramnath. My research encompasses image and video generation, as well as self-supervised learning techniques to advance the field of computer vision.

Research Interests:

Image and Video Generation:

  • Developing innovative methods for generating high-quality images and videos.
  • Extending image-based generative strategies into multi-frame sequences, focusing on preserving consistent identity, style, and motion dynamics.
  • Employing hierarchical knowledge structures to capture subtle morphological or stylistic variations, allowing for highly distinctive yet consistent image synthesis.
  • Exploring how the creativity of Large Language Models (LLMs) can be utilized in video generation with diffusion models.

Self-Supervised Learning for Vision:

  • Designing self-supervised approaches to learn meaningful representations from unlabeled data.
  • Projects include Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning and a self-supervised approach for general images using multimodal architectures like CLIP.
  • Applying self-supervised learning to medical image analysis to overcome the challenge of limited labeled data.

Medical Image Analysis:

  • Utilizing self-supervised learning to train models on unlabeled medical images.
  • Extracting features that support better analysis and interpretation in the medical domain.
  • Developed Masked LoGoNet, a neural network architecture with tailored self-supervised learning for efficient medical image segmentation.

Professional Experience:

  • Incoming ML Research Intern – Apple MIND Team
    Summer 2025 | Seattle, WA
    Working on multimodal vision-language models and efficient training for generative applications.
  • Machine Learning Intern – Higharc
    May 2024 – Aug 2024 · 4 months | Remote, Durham, NC
    • Conducted research on semantic and panoptic segmentation tasks.
    • Pre-trained a DETR-based model on unlabeled data, using self-supervised learning to address the challenge of limited labeled data.
  • Senior Data Scientist – JIBB
    Dec 2020 – Dec 2021 · 1 year 1 month | Remote, San Francisco, CA
    JIBB provides a smart platform to capture, save, and share handwriting from whiteboards or paper across devices.
    • Designed and deployed computer vision pipelines for object detection and dynamic content filtering in both images and videos.
    • Developed custom CNN architectures to accurately detect content color and remove shadows and reflections.
    • Built automated tools for enhancing visual clarity in real-time handwriting sessions.

Reviewer Appointments:

  • Selected to serve as a reviewer for SIGKDD 2025 Second Round
  • Selected to serve as a reviewer for CVPR 2025
  • Selected to serve as a reviewer for ICLR 2025
  • Selected to serve as a reviewer for WACV 2025
  • Selected to serve as a reviewer for SIGKDD 2025 First Round (selected as an Excellent Reviewer)
  • Selected to serve as a reviewer for SIGKDD 2024

Bachelor’s and Master’s Theses:

My Bachelor’s thesis applied reinforcement learning in a multi-object environment in which each object could train individually. I also incorporated federated learning techniques so that the objects could generalize their models to one another. The research explored how combining these two approaches can improve learning and decision-making in complex environments.

My Master’s thesis focused on software testing. I proposed an approach that uses machine learning techniques to generate test datasets covering the main paths within the software, enabling effective fault detection. The goal was to make software testing more efficient and accurate, and ultimately to improve the quality and reliability of software systems.