Amin Karimi Monsefi

I am a Ph.D. student in Computer Science at The Ohio State University, advised by Professor Rajiv Ramnath. My research spans generative modeling and representation learning, with an emphasis on making modern models efficient, controllable, and useful in real-world settings—from diffusion/flow-based generation to self-supervised and vision-language learning.

Research Interests:

Generative AI (Diffusion Models)

I develop diffusion- and flow-matching–based generative models that are efficient at inference (few-step / budgeted generation) and controllable (structured conditioning and user guidance). Representative work: FS-DFM, TaxaDiffusion, KnobGen

Foundation Models

I design self-supervised and vision-language learning methods that improve data efficiency, transfer, and detail sensitivity for downstream vision tasks (e.g., fine-grained recognition and segmentation). Representative work: Frequency-Guided Masking / FOLK, DetailCLIP

Applied ML (Health, Sensing, and Spatiotemporal Prediction)

I apply ML to high-impact domains where robustness and practicality matter—especially 3D medical imaging and multimodal spatiotemporal prediction for safety/infrastructure and sensor-driven systems. Representative work: Masked LoGoNet, ISOSNet, CrashFormer, Road Construction Forecasting, Indoor Air Quality Models

Recent News and Updates:

Professional Experience:

  • ML Research Intern – Apple MIND Team
    May 2025 – Sep 2025 · 5 months | Summer 2025 | Seattle, WA
    • Researched and developed generative models for efficient, few-step discrete diffusion, enabling faster and more scalable text generation.
    • Collaborated with a cross-functional ML team to design novel algorithms and architectures for large-scale language modeling.
    • Conducted experiments and delivered insights that advanced Apple's research in discrete generative modeling and shaped future projects.
    • Publishing the FS-DFM paper, "Fast and Accurate Long Text Generation with Few-Step Diffusion Language Models."
  • Machine Learning Intern – Higharc
    May 2024 – Aug 2024 · 4 months | Remote, Durham, NC
    • Conducted research on semantic and panoptic segmentation.
    • Pre-trained a DETR-based model on unlabeled data, using self-supervised learning to address the scarcity of labeled data.
  • Senior Data Scientist – JIBB
    Dec 2020 – Dec 2021 · 1 year 1 month | Remote, San Francisco, CA
    JIBB provides a smart platform to capture, save, and share handwriting from whiteboards or paper across devices.
    • Designed and deployed computer vision pipelines for object detection and dynamic content filtering in both images and videos.
    • Developed custom CNN architectures to accurately detect content color and remove shadows and reflections.
    • Built automated tools for enhancing visual clarity in real-time handwriting sessions.
  • Senior Data Scientist & Back-End Developer – TAPSI
    Mar 2018 – Dec 2020 · 2 years 10 months | Tehran, Iran
    TAPSI is a leading online ride-hailing platform in Iran, providing intelligent mobility solutions through advanced technology and AI.
    • Developed AI-powered pricing microservices in Python, communicating via RabbitMQ for real-time fare adjustments.
    • Designed a GPS anomaly detection system to prevent fraud and ensure user safety.
    • Built data-driven recommendation features (origin, destination, favorite places) using unsupervised learning.
    • Created a microservice to estimate ETA based on live driver GPS; published a paper on the proposed method.
    • Engineered a spatiotemporal forecasting tool to predict high-demand ride areas in urban regions.

Publications:

Reviewer Appointments:

  • CVPR: Reviewer (2025, 2026)
  • ICLR: Reviewer (2025, 2026)
  • SIGKDD: Reviewer (2024, 2025, 2026)
    • 2025 First Round: Excellent Reviewer (top 20%)
    • 2025 Second Round: Outstanding Reviewer (top 10%)
  • WACV: Reviewer (2025)

Bachelor's and Master's Theses:

My Bachelor’s thesis applied reinforcement learning in a multi-object environment in which each object trained individually. I also incorporated federated learning so that the objects could share and generalize their models across one another. The work explored how combining the two approaches can improve learning and decision-making in complex environments.

My Master’s thesis addressed software testing. I proposed a machine-learning approach to generating datasets that cover the main paths within the software, enabling effective fault detection. The goal was to make software testing more efficient and accurate, and in turn to improve the quality and reliability of software systems.