
Mahmoud Jahanshahi

Ph.D. in Computer Science

Hi, welcome to my page! I’m a research and data scientist with a Ph.D. in Computer Science, specializing in mining open-source software repositories using AI, machine learning, and NLP. My work focuses on software supply chains, particularly software reuse and its implications for licensing and security. I also have prior experience in business intelligence and financial analysis, especially in the telecommunications industry.

When I’m not immersed in computer science, I enjoy playing the piano or pushing my limits with CrossFit. You can learn more about my interests here.




Education

University of Tennessee, Knoxville, USA

Doctor of Philosophy, Computer Science
Dissertation: Copy-Based Reuse and its Implications in Open Source Software Supply Chains
May 2021 - May 2025

Sharif University of Technology, Tehran, Iran

Master of Science, Industrial Engineering - Industrial
Thesis: The Influence of Information Presentation and Risk Attitude on Asset Allocation in Financial Markets
September 2011 - September 2013

Mazandaran Institute of Technology, Babol, Iran

Bachelor of Science, Industrial Engineering - System Planning and Analysis
January 2007 - July 2011

National Organization for Development of Exceptional Talents, Babol, Iran

High School Diploma, Mathematics
September 1999 - September 2006

Awards

LLM4Code Best Paper Award

Paper: “Cracks in The Stack: Hidden Vulnerabilities and Licensing Risks in LLM Pre-Training Datasets”
Venue: Second International Workshop on Large Language Models for Code (LLM4Code), 2025
May 2025

ACM SIGSOFT Distinguished Paper Award

Paper: “Understanding the Response to Open-Source Dependency Abandonment in the npm Ecosystem”
Venue: 47th International Conference on Software Engineering (ICSE), 2025
May 2025

Experience

Graduate Research Assistant

University of Tennessee, Knoxville, USA

Conducting research on Open Source Software supply chains through repository mining, with a focus on copy-based reuse and its implications.

May 2021 - May 2025

Business Intelligence Consultant

Freelance

Providing specialized Business Intelligence consulting services, focusing on financial data analysis and strategic insights across diverse industries.

April 2020 - May 2021

Senior Data Scientist

Mobile Communications Company of Iran (Hamrahe Aval), Tehran, Iran

Analyzing business processes, data, and reporting needs. Developing and maintaining reports, dashboards, and analyses using Oracle Business Intelligence Enterprise Edition (OBIEE). Collaborating with the data warehouse team on requirements gathering, design, testing, and ongoing development.

May 2019 - April 2020

Strategic Investments Lead

Mobile Communications Company of Iran (Hamrahe Aval), Tehran, Iran

Managing a team of 5 professionals in handling complex investment projects, interfacing with C-level management, negotiating cooperation models and contract terms, and supporting the development of investment strategies and policies.

February 2018 - May 2019

International Investment Analyst

Mobile Communications Company of Iran (Hamrahe Aval), Tehran, Iran

Using financial models to project earnings, screening markets for acquisition opportunities, predicting market events, and interpreting financial statements.

February 2016 - February 2018

Publications ORCID

  • Jahanshahi, M. & Mockus, A. "Cracks in The Stack: Hidden Vulnerabilities and Licensing Risks in LLM Pre-Training Datasets." Accepted in Second International Workshop on Large Language Models for Code (LLM4Code 2025). Won the LLM4Code Best Paper Award.
    Preprint - Replication Package - GitHub Repo
  • Jahanshahi, M., Reid, D., & Mockus, A. "Beyond Dependencies: The Role of Copy-Based Reuse in Open Source Software Development." Accepted in ACM Transactions on Software Engineering and Methodology (TOSEM).
    Preprint - Replication Package - GitHub Repo
  • Jahanshahi, M., Reid, D., McDaniel, A., & Mockus, A. "OSS License Identification at Scale: A Comprehensive Dataset Using World of Code." Accepted in IEEE/ACM 22nd International Conference on Mining Software Repositories (MSR 2025). IEEE.
    Preprint - Replication Package
  • Miller, C., Jahanshahi, M., Mockus, A., Vasilescu, B., & Kästner, C. "Understanding the Response to Open-Source Dependency Abandonment in the npm Ecosystem." Accepted in 47th International Conference on Software Engineering (ICSE 2025). Won the ACM SIGSOFT Distinguished Paper Award.
    Preprint - Replication Package - GitHub Repo
  • Thakur, A., Milewicz, R., Jahanshahi, M., Paganini, L., Vasilescu, B., & Mockus, A. "Scientific Open-Source Software Is Less Likely to Become Abandoned Than One Might Think! Lessons from Curating a Catalog of Maintained Scientific Software." Accepted in the ACM International Conference on the Foundations of Software Engineering (FSE 2025).
    Preprint - Replication Package
  • Jahanshahi, M. & Mockus, A. (2024, April). "Dataset: Copy-based Reuse in Open Source Software." In 2024 IEEE/ACM 21st International Conference on Mining Software Repositories (MSR) (pp. 42-47). IEEE.
    Paper - GitHub Repo
  • Reid, D., Jahanshahi, M., & Mockus, A. (2022, May). "The extent of orphan vulnerabilities from code reuse in open source software." In Proceedings of the 44th International Conference on Software Engineering (ICSE) (pp. 2104-2115). Nominated for the ACM SIGSOFT Distinguished Paper Award.
    Paper - GitHub Repo
  • Lyulina, E., & Jahanshahi, M. (2021, May). "Building the collaboration graph of open-source software ecosystem." In 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR) (pp. 618-620). IEEE.
    Paper - GitHub Repo

Skills

Programming
  • Terminal/Bash Scripting
  • Databases (SQL, MongoDB, etc.)
  • Python
  • R
  • C
  • Data Visualization (Power BI, Tableau, etc.)
  • Project Management Tools
Languages
  • Persian (Native)
  • English (Fluent - C1)
  • German (Working Knowledge - B1)
Competencies
  • Advanced analytical and problem-solving capabilities
  • Ability to tackle complex, intellectually demanding problems with minimal guidance
  • Rapid learning and adaptability in dynamic environments
  • Leadership in cross-functional teams, with strong communication and organizational skills

Certificates

Specializations



Courses