I am a graduate student in CS at the University of Alberta, supervised by Prof. Osmar Zaiane. My research focuses on multimodal generative models, including LLMs, VLMs, and diffusion models, with an emphasis on 3D spatial reasoning. My current work includes scaling spatial and fine-grained visual reasoning in multimodal LLMs (potentially with image generation) through test-time scaling and RL post-training for efficient reasoning, along with curating customized vision-language datasets.
- Efficient Spatial & Visual Reasoning with LLMs/VLMs/MLLMs
- Vision-Language Understanding & Embodied Spatial Reasoning
- 3D Representations, Grounding, & Space Understanding
- Building Vision-Language Datasets for Embodied Multi-Agent Systems
- Visual and Geometry Retrieval Systems
- M.Sc. in CS, University of Alberta (Present)
- Ph.D. in ECE, University of Alberta (Transferred to CS)
- M.Sc. & B.Sc. in ME, Sharif University of Technology & Univ. of Tehran


