Amin Karimi Monsefi
I am a Ph.D. student in Computer Science at The Ohio State University, advised by Professor Rajiv Ramnath. My research spans generative modeling and representation learning, with an emphasis on making modern models efficient, controllable, and useful in real-world settings—from diffusion/flow-based generation to self-supervised and vision-language learning.
Research Interests:
Generative AI (Diffusion Models)
I develop diffusion- and flow-matching–based generative models that are efficient at inference (few-step / budgeted generation) and controllable (structured conditioning and user guidance). Representative work: FS-DFM, TaxaDiffusion, KnobGen
Foundation Models
I design self-supervised and vision-language learning methods that improve data efficiency, transfer, and detail sensitivity for downstream vision tasks (e.g., fine-grained recognition and segmentation). Representative work: Frequency-Guided Masking / FOLK, DetailCLIP
Applied ML (Health, Sensing, and Spatiotemporal Prediction)
I apply ML to high-impact domains where robustness and practicality matter—especially 3D medical imaging and multimodal spatiotemporal prediction for safety/infrastructure and sensor-driven systems. Representative work: Masked LoGoNet, ISOSNet, CrashFormer, Road Construction Forecasting, Indoor Air Quality Models
Professional Experience:
- May 2025 – Sep 2025 · 5 months | Summer 2025 | Seattle, WA
ML Research Intern – Apple MIND Team
• Researched and developed advanced generative models for efficient, few-step discrete diffusion, enabling faster and more scalable text generation.
• Collaborated with a cross-functional ML team to design novel algorithms and architectures for large-scale language modeling.
• Conducted experiments and delivered insights that advanced Apple's research in discrete generative modeling and shaped future projects.
• Published the FS-DFM paper, "Fast and Accurate Long Text Generation with Few-Step Diffusion Language Models."
- May 2024 – Aug 2024 · 4 months | Remote, Durham, NC
Machine Learning Intern – Higharc
• Conducted research on semantic and panoptic segmentation.
• Pre-trained a DETR-based model on unlabeled data, using self-supervised learning to address the scarcity of labeled data.
- Dec 2020 – Dec 2021 · 1 year 1 month | Remote, San Francisco, CA
Senior Data Scientist – JIBB
JIBB provides a smart platform to capture, save, and share handwriting from whiteboards or paper across devices.
• Designed and deployed computer vision pipelines for object detection and dynamic content filtering in both images and videos.
• Developed custom CNN architectures to accurately detect content color and remove shadows and reflections.
• Built automated tools for enhancing visual clarity in real-time handwriting sessions.
- Mar 2018 – Dec 2020 · 2 years 10 months | Tehran, Iran
Senior Data Scientist & Back-End Developer – TAPSI
TAPSI is a leading online ride-hailing platform in Iran, providing intelligent mobility solutions through advanced technology and AI.
• Developed AI-powered pricing microservices in Python, communicating via RabbitMQ for real-time fare adjustments.
• Designed a GPS anomaly detection system to prevent fraud and ensure user safety.
• Built data-driven recommendation features (origin, destination, favorite places) using unsupervised learning.
• Created a microservice to estimate ETA based on live driver GPS; published a paper on the proposed method.
• Engineered a spatiotemporal forecasting tool to predict high-demand ride areas in urban regions.
Publications:
- 01/2026 - Accepted in ICLR 2026: FS-DFM: Fast and Accurate Long Text Generation with Few-Step Diffusion Language Models
- 07/2025 - Accepted in Biomedical Optics Express Journal: ISOSNet: a unified framework for cone photoreceptor detection and inner segment and outer segment length measurement from AO-OCT B-scans
- 06/2025 - Accepted in ICCV 2025: TaxaDiffusion: Progressively Trained Diffusion Model for Fine-Grained Generation and Trait Discovery
- 03/2025 - Accepted in CVEU Workshop of CVPR 2025: KnobGen: Controlling the Sophistication of Artwork in Sketch-Based Diffusion Models
- 02/2025 - Accepted in SSI-FM Workshop of ICLR 2025: DetailCLIP: Detail-Oriented CLIP for Fine-Grained Tasks
- 01/2025 - Accepted in ICLR 2025: Frequency-Guided Masking for Enhanced Vision Self-Supervised Learning
- 05/2024 - Accepted in SIGKDD 2024: Masked LoGoNet: Fast and Accurate 3D Image Analysis for Medical Domain
- 03/2024 - Accepted in Biomedical Optics Express Journal: Reducing Manual Labeling Requirements and Improved Retinal Ganglion Cell Identification in 3D AO-OCT Volumes Using Semi-Supervised Learning
- 06/2023 - Accepted in ACM SIGSPATIAL International Workshop on Advances in Urban-AI: CrashFormer: A Multimodal Architecture to Predict the Risk of Crash
- 05/2023 - Accepted in SIGKDD 2023: Novel Physics-Based Machine-Learning Models for Indoor Air Quality Approximations
- 02/2023 - Accepted in Digital Communications and Networks Journal - 2023: Smart and collaborative industrial IoT: A federated learning and data space approach
- 08/2022 - Accepted in ACM SIGSPATIAL 2022: Will there be a construction? Predicting road constructions based on heterogeneous spatiotemporal data
Reviewer Appointments:
- CVPR: Reviewer (2025, 2026)
- ICLR: Reviewer (2025, 2026)
- SIGKDD: Reviewer (2024, 2025, 2026)
  - 2025 First Round: Excellent Reviewer (top 20%)
  - 2025 Second Round: Outstanding Reviewer (top 10%)
- WACV: Reviewer (2025)
Bachelor’s and Master’s Theses:
My Bachelor’s thesis applied reinforcement learning in a multi-object environment in which each object trained its own model individually. I also incorporated federated learning so that the objects could generalize their models to one another. The work explored how combining the two approaches can improve learning and decision-making in complex environments.
My Master’s thesis addressed software testing. I proposed a machine-learning approach to generating datasets that cover the main paths within the software, enabling effective fault detection. The goal was to make software testing more efficient and accurate, and ultimately to improve the quality and reliability of software systems.