Massih Forootan



massiHUB

About Me

Data Analytics Portfolio


What I am good at:

What I deliver:

What I am most proud of:

Why hire me?

I’m drawn to organizations that value clarity, efficiency, and impact, and I deliver all three. My track record in public health, education, biotech, and traffic safety analytics proves that I adapt quickly, produce consistently, and elevate the people and systems around me.

I offer a rare blend of technical rigor, process intuition, and people-centered leadership. I’m not just someone who builds dashboards; I build systems, relationships, and narratives that turn data into decisive action.

If you’re seeking a data professional who thrives in fast-paced, cross-functional environments, who’s as comfortable cleaning a messy dataset as presenting to executives, and who aligns with mission-driven, outcome-focused cultures, that’s me.

Back to top


Portfolio Projects

Tennessee Integrated Traffic Analysis Dashboards

Tools: Tableau, SQL
Key Features: Dynamic filtering, Multi-year traffic and incident trend analysis

Highlight: Fatal and Serious Injury Crashes

This dashboard presents near-real-time, interactive data on fatal and serious-injury collisions on Tennessee roadways for the current and previous years.

The dashboard enables a nuanced analysis of fatal and serious crashes through interactive filters and graphs, powered by a SQL database and Tableau. Users can analyze trends and patterns by location, road conditions, time of day, victim demographics, and other parameters. The dashboard provides actionable insights to inform traffic safety policies, enforcement initiatives, infrastructure improvements, public education campaigns, and other countermeasures aimed at reducing crash-related deaths and injuries on Tennessee roads.
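As a hedged illustration of the kind of query that powers this filtering, the sketch below uses Python's built-in `sqlite3` with a made-up crash table; the real dashboard runs on its own SQL database, whose schema and column names are not shown here.

```python
import sqlite3

# Hypothetical schema for illustration only; the production crash
# database behind the dashboard has its own structure.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE crashes (
        crash_year  INTEGER,
        county      TEXT,
        severity    TEXT,      -- 'Fatal' or 'Serious Injury'
        hour_of_day INTEGER
    )
""")
conn.executemany(
    "INSERT INTO crashes VALUES (?, ?, ?, ?)",
    [
        (2018, "Davidson", "Fatal", 17),
        (2018, "Davidson", "Serious Injury", 8),
        (2019, "Shelby", "Fatal", 23),
        (2019, "Davidson", "Fatal", 2),
    ],
)

def severity_trend(county):
    """Multi-year severity counts for one county: the SQL counterpart
    of a dashboard's dynamic location filter."""
    return conn.execute(
        """
        SELECT crash_year, severity, COUNT(*) AS n
        FROM crashes
        WHERE county = ?
        GROUP BY crash_year, severity
        ORDER BY crash_year, severity
        """,
        (county,),
    ).fetchall()

print(severity_trend("Davidson"))
```

In Tableau, a parameter or quick filter plays the role of the `county` bind variable, and the grouped counts feed the trend chart directly.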

The dashboard was presented at the 2019 LifeSavers Conference, an annual conference on injury prevention and traffic safety organized by the National Safety Council.

Back to top


Cancer Genomics Data QC Automation

Tools: Python, Linux, REST APIs
Key Features: Automated QC pipelines, API-based data retrieval & validation, End-user support & documentation

Highlights: GDC tutorial videos for tools OncoMatrix, Mutation Frequency, and ProteinPaint
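A minimal sketch of the validation side of such a QC pipeline is shown below. The field names and rules are hypothetical; a real pipeline would validate against the data portal's actual response schema after retrieving records over its REST API.

```python
# Hypothetical required fields for one file-metadata record; the real
# QC pipeline checks its own schema.
REQUIRED_FIELDS = {"file_id", "md5sum", "file_size", "data_format"}

def qc_record(record):
    """Return a list of QC problems found in one metadata record."""
    problems = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if "md5sum" in record and len(str(record["md5sum"])) != 32:
        problems.append("md5sum is not 32 characters")
    if "file_size" in record and record["file_size"] <= 0:
        problems.append("non-positive file_size")
    return problems

good = {"file_id": "a1", "md5sum": "0" * 32,
        "file_size": 1024, "data_format": "MAF"}
bad = {"file_id": "b2", "md5sum": "xyz", "file_size": 0}

assert qc_record(good) == []
print(qc_record(bad))
```

Running such checks automatically over every batch retrieved from the API turns a manual spot-check into a repeatable QC gate.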

Back to top


Student Life Data Integration

Tools: Python, Power BI, Semantic models, AI prompting
Key Features: Live data connections, Multivariate analysis, Role-based access dashboards

Highlights: Multiple Correspondence Analysis for student satisfaction data (Jupyter Notebook)

This report presents an analysis of student feedback data regarding study support, academic accessibility, and staff responsiveness. The analyst employed two statistical approaches: descriptive analysis examining frequency distributions and inferential analysis using Multiple Correspondence Analysis (MCA).

Key findings reveal that Net Promoter Scores (NPS) were consistently high across all student groups, though senior students showed significantly lower survey participation. Many students hadn’t used tutoring services, resulting in incomplete tutor-related responses. The MCA identified two main variation components: tutor usage and staff responsiveness (first component), and NPS with class standing (second component).

Notable patterns emerged showing freshmen ranked NPS lower than sophomores and juniors, suggesting satisfaction increases with experience. Tutor-related metrics dominated the data variation, masking other parameters’ relationships. The analysis revealed potential correlations between academic success, academic support, and clear communication, though statistical significance wasn’t formally tested.

The report recommends engaging senior students for their valuable long-term perspective, tracking students longitudinally to observe changing perceptions, and considering a simplified NPS scale given the skewed distribution toward high scores. The analyst suggests dedicated statistical tests are needed to properly assess correlations between non-tutor metrics, as current relationships remain masked by tutor-usage data.
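To make the MCA step concrete, here is a hedged sketch of the core computation (indicator matrix, standardized residuals, SVD) in plain NumPy. The survey columns and category labels below are invented stand-ins, not the actual Student Life data.

```python
import numpy as np
import pandas as pd

# Invented survey responses standing in for the real dataset.
df = pd.DataFrame({
    "class_standing": ["freshman", "sophomore", "junior",
                       "freshman", "senior", "junior"],
    "used_tutoring":  ["no", "yes", "yes", "no", "no", "yes"],
    "nps_band":       ["low", "high", "high", "low", "high", "high"],
})

Z = pd.get_dummies(df).to_numpy(dtype=float)  # indicator (disjunctive) matrix
P = Z / Z.sum()                               # correspondence matrix
r = P.sum(axis=1)                             # row masses
c = P.sum(axis=0)                             # column masses
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))  # standardized residuals

U, sing, Vt = np.linalg.svd(S, full_matrices=False)
inertia = sing ** 2                           # principal inertia per axis

# Respondent coordinates on the principal axes
rows = (U * sing) / np.sqrt(r)[:, None]
print("share of inertia, first two axes:", inertia[:2] / inertia.sum())
```

The first two columns of `rows` give each respondent's position on the two main variation components, analogous to the tutor-usage and NPS axes described above.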

Back to top


Data Quality Testing in Agribusiness Software

Tools: Excel VBA
Key Features: Automated data checks, Cross-platform behavior validation

Back to top


Research Data Analysis and Experimental Design Projects

Tools: Excel, VBA, SAS
Key Features: Multivariate analysis, Experimental designs

This report reviews quantitative aspects of scientific research on trees, plants, and rangeland species in Iran, analyzing what has been studied and how. Comparing government-funded projects with university dissertations revealed clear differences in priorities: students favor quicker lab-based topics like genetic analysis, while institutions cover broader practical areas.

Notable gaps were identified, with ecologically and commercially significant species like oak and Damask Rose receiving uneven attention across research domains. A panel of experts recommended prioritizing endangered, valuable, and ecosystem-critical species, proposing actions ranging from conservation to stress-response studies.

The report concludes by calling for better organization of existing findings and a clearer roadmap for future research investment. This report was prepared in Persian (with an English translation available in the slideshow notes) and presented at the Research Institute of Forests and Rangelands in Iran in 2009.

Publications: ORCID profile

Back to top


Miscellaneous Projects

Land Use Change in Tennessee - Nashville Software School

Tools: Python (scikit-learn), R (ShinyApp)

Key Features: Principal Component Analysis, k-Nearest Neighbors
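A hedged sketch of how PCA and k-nearest neighbors combine in a scikit-learn pipeline is shown below. The data here is synthetic; the actual project used the Tennessee land-use dataset with its own feature engineering and tuning.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the land-use features.
X, y = make_classification(n_samples=300, n_features=12,
                           n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scale, reduce to 5 principal components, then classify by 7 neighbors.
model = make_pipeline(StandardScaler(),
                      PCA(n_components=5),
                      KNeighborsClassifier(n_neighbors=7))
model.fit(X_train, y_train)
print(f"held-out accuracy: {model.score(X_test, y_test):.2f}")
```

Chaining the steps in one pipeline keeps the scaler and PCA fitted on training data only, which avoids leakage when scoring the held-out set.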

Back to top

Improve College-Going and College-Readiness - Division of Research and Evaluation, TN Dept of Education

Tools: Python

Key Features: Principal Component Analysis

Back to top

Logistic Regression - Nashville Software School

Tools: Python
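For context, a minimal logistic regression fit in scikit-learn looks like the sketch below; the data is synthetic, since the original exercise's dataset is not reproduced here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic two-feature data with a (noisy-free) linear class boundary.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

clf = LogisticRegression().fit(X, y)
print("training accuracy:", clf.score(X, y))
```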


Technical Skills

Languages: Python, R, SQL, GraphQL, DAX, VBA
Tools: Power BI, Tableau, SAS
Techniques: Experimental designs, Multivariate & Regression Analysis, A/B testing

Back to top


~/my_life/in_zeroes_and_one$
During my undergraduate years, as 16- and 32-bit PCs emerged, I started with QBASIC and, shortly afterward, QuickBASIC. With genetics and statistics dominating my studies, statistical programming became my top interest (selected scripts). The big leap was coding experimental designs and ANOVA, but before I made significant progress I learned about MSTAT-C, which sidetracked my interest in coding.
Meanwhile, Windows 3.x had already entered the market, and it was evident that the MS-DOS programming era was nearly over. It didn't take me long to find that Microsoft had released Visual Basic, so I grabbed it and started migrating my old QB code to the new platform to get my feet wet. With the move to the Windows 9x generation, I felt my programming skills had become obsolete, so I gradually abandoned programming and focused on mastering spreadsheets (starting with Lotus 1-2-3 but quickly switching to Excel) for data wrangling and exploration.
When I started postgraduate studies after a long gap, I needed statistical tools again. With Windows 98 in its glory days and Windows 2000 and XP on the way, coding skills were in demand once more; statistical packages (Minitab, SPSS, and SAS) had become more user-friendly than ever, and learning resources were well populated and accessible. I could hardly have been happier when I found out that Visual Basic had become the kernel of the macro language in Microsoft Office, specifically Excel. VBA scripting and the SAS package thus became my main tools for programming and statistical analysis for nearly 10 years (selected ANOVA templates).
After relocating to the United States, I initially joined the IT-agribusiness sector, but after a while I realized that my knowledge of statistics and coding was perilously out of date. That gave me the incentive to join a data science boot camp and add Python and R to my skill set, followed by SQL and Tableau for business intelligence after rejoining the job market.
Since then, my data analysis career in road safety, cancer genomics, and education has prompted me to explore business management and data wrangling tools such as REST APIs, GraphQL, Power BI, and Smartsheet, albeit with the aid of AI tools, and I look forward to reconciling my cross-domain experience to develop solutions for seamless, robust data streaming.
../fun_f@ct$
- My first email account, registered in 1999, was MS-DOS-based and powered by Pegasus Mail.
- I was among the first round of Blogger users whom Google invited to register for Gmail in 2004 (Story). Although a wide range of usernames was still available (including my first and last names individually), I chose to keep the fourteen-character username I had held with Yahoo since 1999 (registered a few days after the Pegasus one) as my online identity.
- Since then, I have owned email addresses with .net, .com, .ac.uk, .ac.ir, .gov, .org, and .edu domains; in that order.
- I am an Inbox Zero practitioner.

Back to top