Mandy McLean, PhD

Research Portfolio

Shared by request. Email mndmclean@gmail.com for access.

Mixed-methods UX research leader.

I build the systems that help organizations turn user understanding into better product and learning decisions at scale.

About

How I work.

I'm a mixed-methods UX research leader and former STEM educator who builds the systems that turn fragmented user signals into institutional wisdom. The throughline across a decade of work is the architecture of understanding: scaling democratized research programs, governing AI-enhanced workflows with strict human-in-the-loop guardrails, and embedding rigorous methodology directly into the product lifecycle.

I bridge the technical execution of a Staff Researcher with the grounded empathy of a former K-12 teacher. That combination shows up in three places.

Research as a shared practice. Most recently, I led the UX research function at Guild, where the work scaled by reach. I built the templates, training, and tiered governance that let partner teams across the company run their own tactical research while the central team focused where deep rigor was non-negotiable. Research moved from a service desk to a shared cultural practice, supporting more than 100,000 users.

AI transformation, governed. Starting in early 2023, I led Guild's company-wide adoption of generative AI: AI-enabled research operations, human-in-the-loop validation, vendor evaluation, governance for regulated employer partners, and a sustained enablement program of training, working groups, and weekly translation of frontier developments for a non-technical audience.

Methodological depth. Underneath it all is the craft: longitudinal and diary studies, behavioral segmentation, cross-product synthesis, funnel diagnostics, and research with constrained populations (working adults, shift workers, students). Earlier in my career I was a UX researcher at LogMeIn embedded in the GoTo product cycle, and a researcher at the Norwegian Geotechnical Institute in Oslo studying socioeconomic vulnerability. Before that, I taught high school sciences in the Bay Area.

I hold a PhD in Education and Quantitative Methods from UC Santa Barbara, with earlier graduate work in environmental and earth system science (Stanford) and engineering mathematics (Dalhousie).

My current independent work focuses on responsible AI, especially as it intersects with how people learn and how products affect users. I do applied research and synthesis on AI's effects on youth wellbeing, with co-authored pieces in Psychiatric Times and After Babel, and expert testimony to state legislatures.

Selected work

A leadership narrative and three case studies.

Confidentiality has been respected throughout — proprietary details have been anonymized or generalized.

Leadership

Embedding the user voice in how Guild works.

Built UX research as a function, a shared practice, and a culture across the company. By the time I left, research wasn't a checkbox or a service: it was a continuous input shaping product, strategy, and operations across every part of the business.

Research as culture · Research ops · Democratization · 2018–2025
A note on confidentiality. Artifacts and details below have been anonymized or generalized to respect Guild's proprietary information. Frameworks and approaches are shared in their general form; specifics can be discussed in conversation.
11 · Partner teams running their own research
Continuous · Mixed-method data collection across the journey
20+ · Studies running per quarter at peak
1 → 9 · Researchers on the central team
The question
How do you make the user voice a permanent part of how a company makes decisions, with both speed and rigor?

Context

Where research lived in 2018.

Guild was early-stage, growing fast, and figuring out what working adults needed from an education benefit. Research was scattered across functions, with no shared standard for how questions got asked or how answers traveled back into decisions. I was hired onto the product design team, but as I built credibility on early projects, requests started coming from outside product. Within a year, the function existed before it had a name.

Approach

Research as practice, not as a service.

A first researcher hired into a fast-growing company can either become a request-taker (running studies, delivering reports) or build something that can outgrow one person's bandwidth. The first model breaks the moment demand exceeds capacity, and it positions research as something other teams hand off rather than do.

I built it as a shared practice. Partnership across functions was the actual work, which meant being part of the conversations where decisions were made, not just the studies that followed them. The team grew alongside the practice, but reach mattered more than headcount: templates, training, and a tiered model let partner teams own tactical research while the central team focused where rigor mattered most.

This was about building research as a shared practice and a cultural expectation, not a centralized service. By the end, "what does the user voice say?" was a default question in product reviews, strategy meetings, and roadmap discussions.

The system I built

Three layers, working together.
LAYER 01 Templates For people who weren't researchers Research plans Discussion guides Survey design Doc-style reports Retro frameworks LAYER 02 Standards For the team's shared practice Project lifecycle SOPs Qual best practices Survey best practices QA processes Accessibility standards LAYER 03 Curriculum So partner teams could grow with us UX Research 101 Survey design intensive Advanced qual methods "How We Work" series Office hours & coaching RESULT · A research function 11 partner teams could trust and use

Each layer answered a different question. Templates let designers and PMs do their own tactical research without reinventing the basics. Standards kept quality from drifting as the team grew. Curriculum turned partner teams into capable collaborators. The system worked because all three did.

How I prioritized work

Tiered model: lean vs. rigorous, partner-led vs. central-team-owned.

The tension between what the rigorous answer would take and what the business needed by Friday was constant. Rigor isn't a fixed bar; it's calibrated to the consequences of being wrong. Some questions warranted six-month studies. Others needed a tight survey by end of week. The tiered model below kept the central team focused on the studies that genuinely required depth.

Three tiers, three modes of support
Tier 01 · Self-serve (run by partner teams): templates, no consult. Fits usability tests, simple surveys, internal interviews, retros.
Tier 02 · Consulted (co-led with the central team): we review and coach. Fits multi-method studies, segmentation, deeper qual, concept testing.
Tier 03 · Central-team-owned (led by the central team): high stakes, high rigor. Fits longitudinal studies, public-facing reports, strategic foundational work, cross-product synthesis.
The model taught partner teams to ask "which tier is this?" before "who runs it?"
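The rubric is simple enough to write down. A minimal sketch in Python, for illustration only: the flags are hypothetical, and in practice tiering was a judgment call made in conversation, not an algorithm.

```python
# Sketch of the tier-routing rubric. The flags are hypothetical; in
# practice tiering was a conversation, not a function.

TIERS = {
    1: "Self-serve: run by partner teams with templates, no consult",
    2: "Consulted: co-led, central team reviews and coaches",
    3: "Central-team-owned: high stakes, high rigor",
}

def route(study: dict) -> int:
    """Map a proposed study to a tier by stakes and method depth."""
    if study.get("public_facing") or study.get("longitudinal") or study.get("strategic"):
        return 3  # consequences of being wrong are highest here
    if study.get("multi_method") or study.get("segmentation") or study.get("deep_qual"):
        return 2  # worth a central-team review before it runs
    return 1      # usability tests, simple surveys, retros

print(TIERS[route({"multi_method": True})])
```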

Partner teams

The function reached across the company.
11 partner teams at exit
Marketing, Product, Product Design, Engineering, Academic Partnerships, Employer Partnerships, Strategy, Business Operations, Member Services, Coaching, Partner Enablement

Impact

What changed.
Before
Research as a checkbox. One-off studies. Decisions made on gut, opinion, or borrowed analyst data. User voice surfaced reactively, late in the process. Tactical research was a bottleneck.
After
Research as a continuous input. The user voice woven into product reviews, strategy meetings, and roadmap conversations. Tactical work owned by partner teams. The central team free to lead longitudinal, cross-product, and high-stakes work.

By 2024, partner teams across the company were running their own usability tests, surveys, and discovery work as routine practice. Research was no longer a checkbox or a service: it was an asset that informed strategy, shaped iteration, and was part of how decisions actually got made. The cultural shift mattered as much as the operational one.

01
Case Study

Leading AI transformation in research and across Guild.

Starting in early 2023, weeks after ChatGPT's launch, I led the company's adoption of generative AI as both a research operations capability and a workforce-wide transformation.

AI transformation · Research ops · Governance · 2023–2025
A note on confidentiality. Specific tools, vendors, and internal program details have been generalized to respect Guild's proprietary information.
Early 2023 · Adoption started weeks after ChatGPT launched
Annual priority · One of Guild's formal company-wide goals
Cross-functional · Working group across research, product, eng, legal, ops
75%+ · Time saved on key ops workflows with human-in-the-loop AI
The question
How do you put generative AI to work across a company before there's a roadmap, while keeping output trustworthy?

Context

Early 2023.

ChatGPT launched in late November 2022. Early in 2023, I started using it in my own research work to test where it actually helped (synthesis, lit review, prompt design for survey items, faster pattern-finding) and where it produced confident nonsense. Most of the company hadn't started yet. There was no AI policy, no enterprise tooling, no shared sense of what was acceptable use.

The work started informally. Colleagues began bringing me questions ("Can I use this for X? Is this risky? What tool should I pick?") that needed answers nobody was set up to give. By 2024, "AI Transformation" was part of my title. The role grew around the work.

Approach

Three layers, the same shape as the research function.

I structured the AI work the same way I'd structured the research function — enablement for non-experts, governance for shared practice, and a regular rhythm to keep the company connected to a fast-moving field.

Layer 01 · Enablement (for the whole company): weekly AI newsletter, workshops and trainings, office hours, working group, use-case library.
Layer 02 · Governance (human-in-the-loop, by design): acceptable-use guidance, vendor evaluations, privacy and data handling, output validation, risk review checkpoints.
Layer 03 · Research ops (AI inside the research workflow): synthesis and coding, market understanding, transcription and tagging, repository and search, sentiment and NLP.
Result: a company that adopted AI early, broadly, and responsibly.

The newsletter and the practice around it

"The AI Update @ Guild," plus the working group and consulting that surrounded it.

The weekly newsletter was the most visible artifact of the program. Each issue translated frontier AI research, product releases, and policy debates for a non-technical company audience: Stanford labor papers, OpenAI benchmarks, compute-economics primers, the evolving conversation from "AI safety" to "responsible AI." Every issue followed the same shape: a current development, a plain-language explanation of why it mattered for our work, links to original sources, a short section on how teams across Guild were trying things and what they were finding, and a "what's changing" framing for decisions ahead.

Around the newsletter sat the practice that made it work. A cross-functional working group convened regularly to surface use cases, share what was working, and resolve hard questions before they became fires. I consulted directly with leaders across the company on AI choices their teams were facing. I gave talks at all-hands meetings to a company of more than 1,000 people and built training programs that ran across functions. AI transformation became one of Guild's small handful of formal annual priorities, with measurable goals I owned.

Representative issue
The AI Update @ Guild · How AI is reshaping work, from product to SWE to responsible AI
AI & the workforce · How generative AI is changing jobs. AI is accelerating productivity unevenly, benefiting high-skill workers most. What this means for the adult learners we serve. Source: Stanford.
Product management · A week with AI in a PM workflow. A practical look at AI integrated into daily product work, with applications worth trying inside Guild's product cycle.
Responsible AI · Where human review still has to live. When AI outputs touch members, employers, or regulated data, the human-in-the-loop check is not optional. A short framework.
Plus: what teams are trying this week (confidential to Guild).
A representative issue. Sustained weekly cadence over more than a year.

People forwarded issues to their teams. The newsletter, working group, and consulting practice together became the artifact people associated with Guild's AI culture.

Human-in-the-loop, in practice

Where governance had to live in the daily work.

Enablement without governance is irresponsible. The fastest way to lose trust in AI inside a company is to let an early, confident, wrong AI output reach a customer, a partner, or a board. The role I held wasn't just to evangelize AI; it was to be the person who could tell when an AI output was credible and when it wasn't, and to teach others to do the same.

Guild's employer partners ran in regulated industries (healthcare, financial services, retail) where data privacy and AI governance were not abstract concerns. We worked closely with our partners' security and compliance teams to set clear governance: what data could go through which models, how outputs were validated before they reached customers, and where human review was non-negotiable. That partnership made the trust durable.

Across vendor evaluations, output validation, and privacy review, the same principle held: the value of AI in a research function isn't speed; it's speed plus rigor. Faster synthesis only counts if the synthesis is right. The job of the human in the loop is to be the one who knows when it isn't.

What changed in how we worked

Three concrete examples.
01 · Operations
Spreadsheet-bound workflows became 75%+ faster with human-in-the-loop AI.

Internal operations processes that previously took days or weeks of manual spreadsheet work were redesigned with simple AI workflows and human review checkpoints. The result was not just faster output but also better QA quality, since the AI surfaced patterns reviewers could check against.
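The shape of those workflows is easy to sketch. A minimal, hypothetical version in Python; the real workflows lived in ops tooling, and classify_row stands in for whatever model call a given process used:

```python
# Sketch of the human-in-the-loop pattern, under stated assumptions:
# the AI drafts, a person reviews, and only reviewed rows are final.
from dataclasses import dataclass
import random

@dataclass
class Proposal:
    row_id: int
    ai_label: str
    confidence: float

def classify_row(row: dict) -> Proposal:
    """Stand-in for the model call; drafts a label with a confidence score."""
    return Proposal(row["id"], "needs_follow_up", random.uniform(0.5, 1.0))

def human_review(p: Proposal) -> str:
    """Stand-in for the reviewer, who confirms or corrects the draft."""
    return p.ai_label

def process(rows: list[dict], threshold: float = 0.8) -> list[Proposal]:
    finalized = []
    for row in rows:
        p = classify_row(row)
        if p.confidence < threshold:
            p.ai_label = human_review(p)  # reviewer decides outright
        else:
            human_review(p)               # reviewer spot-checks before sign-off
        finalized.append(p)
    return finalized

print(len(process([{"id": i} for i in range(5)])))  # -> 5
```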

02 · Market understanding
Custom GPTs trained on years of research replaced low-stakes message testing.

For mature questions like value-prop messaging or naming on familiar terrain, custom GPTs grounded in our years of accumulated research and market data produced highly accurate directional answers without standing up new studies. The boundary mattered: we used these tools where the data was deep, not for genuinely new questions where AI would have been guessing.

03 · Research repositories
Years of qualitative data became searchable and queryable.

Cross-source synthesis on long-running studies became tractable for the first time. A researcher could query against months or years of accumulated qualitative data rather than relying on memory and notes. Catalog navigation studies, retention work, and longitudinal projects all benefited from being able to ask "what did we hear about X across all the work?"

I worked with the team to develop a working pattern for AI-assisted synthesis: AI surfaces candidate themes; the researcher tests them against the raw data; the researcher writes the final claim. Speed paired with validation, by default.
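A minimal sketch of that pattern's shape, with invented excerpts and the AI step stubbed out; the real repository work ran on enterprise tooling:

```python
# Sketch of surface-then-validate synthesis. propose_themes() stands in for
# the AI step; the evidence pull is the researcher's, deliberately manual.
notes = [
    "I paused because my shift schedule changed and I fell behind.",
    "The math course felt like a wall; I hadn't been a student in 15 years.",
    "Nobody reached out after I paused, so coming back felt harder.",
]

def propose_themes(corpus: list[str]) -> dict[str, list[int]]:
    """AI step (stubbed): candidate themes with the excerpts behind them."""
    return {"academic re-acclimation": [1], "post-pause silence": [2]}

for theme, ids in propose_themes(notes).items():
    print(theme)
    for i in ids:
        # The researcher reads the raw excerpts before any claim ships.
        print("  -", notes[i])
```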

Outcomes

What this looked like over time.

The work that mattered most was the rhythm: a weekly artifact that signaled the company was paying attention, a working group that turned questions into shared practice, an annual priority I owned and tracked, and a culture that took AI seriously without taking it as gospel. By the time I left in mid-2025, AI was woven into how Guild worked.

02
Case Study

Why learners disengage: a year-long diary study.

A longitudinal mixed-methods study with 12 working adult learners over a full year. The findings reframed how the company understood disengagement and reshaped coaching, comms, and program decisions across teams.

Longitudinal · Diary study · Mixed methods · 2023–2024
A note on confidentiality. Participant quotes, identifying details, and proprietary findings have been anonymized or generalized.
12 · Learners followed for a full calendar year
3 · Insights that reshaped strategy
4 teams · Adopted the framework
RAPID · Cross-functional ownership of the response
The question
Why do most paused learners say they will come back, but only one in four actually do?

Context

Who the learners are.

Guild's members are working adults whose employers (large companies in healthcare, retail, financial services, hospitality, and beyond) cover the cost of going back to school. Many are juggling shift work, caregiving, and complicated lives. They had to clear eligibility, choose a program, get approved, and start, all while continuing to work full-time. Pausing was common. Returning was rare.

Disengagement was a known problem. Most paused learners told us they intended to return; most never did. Fewer than one in four came back, even though over 80% had said they would. Behavioral data could tell us when a learner paused, what they reported at the moment of pause, and which segments paused at higher rates. What it could not tell us was the texture of the weeks before a pause. The moment a learner paused was not the moment the pause began.

To redesign anything that mattered (coaching, communications, program structure, support), we needed to understand the trajectory, not just the endpoint.

Approach

A year of weekly diaries, validated with quant.

I designed a year-long video diary study with 12 working adult learners, purposively sampled across employer partners, program types, and demographic segments. The study was built around what Guild called "moments that matter" — the points in a learner's journey where motivation, momentum, or fit were most likely to shift. Participants recorded short asynchronous video diaries each week and we held quarterly hour-long interviews to go deeper. The full study ran a calendar year per participant, long enough to capture the seasonal rhythms, life shocks, and recovery cycles that shorter studies kept missing.

Methodological choices were deliberate. Single in-depth interviews would have been retrospective. Surveys at scale already existed and told us what, not why. Focus groups would have let social pressure suppress unguarded disclosure. The diary study traded sample size for time depth, and time depth was what was missing.

Once the qualitative patterns surfaced, we tested them against Guild's behavioral and survey data. The trajectories I observed in the diaries were not just stories from 12 people. They held up across the broader population.

Findings

Three insights.
01
Disengagement is a trajectory, not an event.

Across the 12 participants, the shapes of their years were different in detail but recognizably the same in structure: an initial lift after enrolling, an inevitable dip in the middle, and then a fork. Some learners recovered. Others spiraled. The pause itself was usually weeks downstream of the moment things actually started to slip. Disengagement was a cascade, not a decision.

The pattern we found
Two panels: the assumed V-shape (quick recovery after a shock) vs. the observed U-shape (long bottom, slow recovery).
Life shocks did not produce a quick V-shaped recovery. They produced a U with a long bottom.
02
The fork is the point of leverage.

Life shocks could not be prevented (illness, caregiving, financial stress, a child sick at home), but the conditions at the fork could be shaped. By the time a learner paused, they were typically already behind, and being behind was itself a demotivation signal. We validated the U-shape against behavioral data: holidays did not produce the quick V everyone assumed. They produced a long, slow climb back. Coaching, comms, and product all had levers at the fork that nobody had been pulling.
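That behavioral check was, in shape, a simple event study: align each learner's weeks to their shock week and average. A sketch with invented numbers; the real version ran over Guild's weekly engagement data:

```python
# Sketch of the shape test: align each learner's weeks to their shock week,
# average engagement by relative week, and inspect the recovery curve.
import pandas as pd

events = pd.DataFrame({
    "learner":    [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
    "week":       [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    "engaged":    [5, 1, 1, 2, 4, 6, 2, 1, 1, 3],  # invented weekly activity
    "shock_week": [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
})

events["rel_week"] = events["week"] - events["shock_week"]
trajectory = events.groupby("rel_week")["engaged"].mean()
print(trajectory)
# A V-shape would recover within a week or two of rel_week 0; the U we
# observed stays low for several weeks before the slow climb back.
```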

03
Semester one is re-acclimation, not capacity.

Working adults stacked courses in their first term, and coaches encouraged it ("how much can you handle?" was the default). But semester one is where academic re-acclimation happens, especially in gateway math. The capacity frame was wrong: semester one is not a test of throughput; it is a re-entry into the role of student. Learners who took fewer courses up front persisted at meaningfully higher rates.

How insights traveled

The work after the work.

Findings did not change anything on their own. Translating insights into shipped change was its own discipline. I built that into the project plan from day one.

RAPID
Cross-functional ownership, named clearly.

Each recommendation was assigned a clear RAPID structure: who recommended, who agreed, who performed, who provided input, who decided. Coaching, comms, product, and strategy each owned different pieces. This stopped the common failure mode where research lands in a deck and dies there.

Channels
Insights pushed into existing rhythms.

Findings traveled through the channels people already used: Slack threads tied to specific decisions, biweekly check-ins with the teams owning the response, an internal email newsletter summarizing what shifted, and embedded slides in product and strategy reviews. The goal was for the right finding to surface at the moment a decision was being made, not in a quarterly readout three months later.

Validation
Quant tested the qual claims at scale.

For each major insight, we ran the corresponding test against behavioral and survey data. Did the U-shape hold across all paused learners, not just the 12? It did. Did first-semester course load predict persistence? It did. The mixed-methods sequence was the credibility engine: qual surfaced patterns, quant confirmed they generalized.
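For the course-load claim, the quant test is close to a one-liner. A sketch with invented data; the real check ran against Guild's behavioral data, with controls:

```python
# Sketch of the quant check: does first-semester course load predict
# persistence? Data here is invented; the direction is what mattered.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "courses_sem1": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "persisted":    [1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0],
})

model = smf.logit("persisted ~ courses_sem1", data=df).fit(disp=0)
print(model.params)  # a negative coefficient on courses_sem1 matches the finding
```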

Impact

What changed.
Before
Disengagement framed as event. Coaching outreach generic and immediate. Comms used performance language ("don't fall behind"). Course-load decisions framed as throughput. Holiday campaigns timed to the shock itself.
After
Disengagement framed as trajectory. Outreach fork-aligned and learner-specific. Comms used momentum language. Course-load conversations reframed around re-acclimation. Holiday campaigns built around the post-shock recovery window.

Each insight reshaped a different part of how the company supported learners. Coaches got vocabulary for disengagement patterns they could feel but could not name. Comms timing moved from immediate post-pause contact to the weeks after, when learners were actually trying to come back. The pause survey was rewritten to feed a real-time retention dashboard. Industry-specific shock calendars (retail's back-to-school, healthcare's shift demands) became standing input to campaign timing.

The spiral framework was adopted across product, coaching, marketing, and strategy as a shared vocabulary for retention. Disengagement is a gradual erosion of momentum, not a sudden decision. The work that matters most is not preventing the shock; it is meeting people at the fork.

03
Case Study

Rebuilding a stalled funnel: when motivation decays in the gap.

A mixed-methods diagnostic of a critical funnel step that had been declining year-over-year. The finding reshaped how Guild engaged learners in a part of the journey that had not been studied closely.

Mixed methods · Funnel diagnostic · UX research · 2023
+
A note on confidentiality. Specific funnel metrics, member quotes, and product details have been generalized.
Since 2020 · Year-over-year decline, reversed
8 · In-depth interviews
100s · Survey responses per month
4 teams · Coordinated to ship the response
The question
Why are people who already said yes not showing up?

Context

A quiet decline.

Guild's funnel had a step between application approval and program start whose conversion had been declining year-over-year since 2020. It was a slow erosion, easy to miss in any single quarter. Going deeper into the data made the trend clear.

The puzzle was simple to state and hard to answer. These were people who had cleared eligibility, completed an application, and gotten approved. And then a meaningful share never actually started. Existing assumptions ranged from "life got in the way" to "the product was confusing." Neither was specific enough to fix.

The pattern
Eligible → Application → Approved ("Yes") → The gap (approval to start: weeks of silence, no engagement built in) → Started (active learner). A meaningful share who said yes never made it to "started."

Approach

Watching the gap from both directions.

I designed a two-track study. Eight in-depth qualitative interviews with members who had paused at exactly that funnel step ran first, surfacing the language and the lived experience of the gap. Those interviews shaped the questions for an ongoing touchpoint survey that reached members at the moment they hit the funnel step, hundreds per month, capturing self-reported readiness, intent, and barriers in close-to-real-time.

Two-track research design
Track 01 · Qual
In-depth interviews
Eight interviews with members who had paused at exactly that funnel step. Surfaced the language and lived experience of the gap. Questions and themes informed the survey design.
Track 02 · Quant
Touchpoint survey
Sent at the moment members hit the funnel step. Designed for under-a-minute completion. One open-text field carried most of the signal. Hundreds of responses per month.

The hardest design choice was the touchpoint survey itself. Surveys at moments of decay are notoriously hard to run well: the people most likely to disengage are also the least likely to respond. The qualitative work made this tractable. We knew what to ask and how to phrase it because we had heard the language first. The survey answered "how many" and "how often." The interviews answered "why."
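Because the interviews supplied the vocabulary, the open-text field could be coded against a small interview-derived codebook. A sketch with invented codes and an invented response; in practice, rules like these were paired with human review of everything they missed:

```python
# Sketch of coding the survey's open-text field against interview-derived
# themes. Codebook and response are invented.
CODEBOOK = {
    "readiness": ["not ready", "nervous", "been a while"],
    "logistics": ["schedule", "shift", "start date"],
    "silence":   ["no one reached out", "heard nothing", "waiting"],
}

def code_response(text: str) -> list[str]:
    text = text.lower()
    themes = [theme for theme, phrases in CODEBOOK.items()
              if any(p in text for p in phrases)]
    return themes or ["uncoded"]  # uncoded responses go to a human

print(code_response("My shift schedule changed and I'm waiting to hear back."))
# -> ['logistics', 'silence']
```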

Finding

"Yes" decays.

The headline insight was simple and not what anyone expected. Motivation eroded in the gap. Approval was not a permanent state. "Yes" was a perishable signal. Without something to keep the commitment alive between approval and start, it quietly decayed.

The mechanism
Motivation starts high at approval ("yes") and, when the gap has no engagement, decays over the weeks to program start; with engagement, it is sustained.
The gap had no engagement. There was nothing to push against the natural decay of intent.

The interviews showed people who had been excited at approval describing themselves weeks later in language that revealed slipping commitment. It was rarely a clean break. It was small things accumulating: a delayed reply, a scheduling conflict, a question they meant to ask but never did. Inertia, not friction, was the failure mode. Members who did not start were members for whom nothing in particular had happened, and that was the problem.

Resolution

Filling the gap with momentum.

The recommendations were structural, not tactical. The gap could not be removed for many members (enrollment cycles and program calendars made it unavoidable), so we built engagement into it.

The work was about sustaining motivation through the wait. Concretely:

Coaching
Outreach moved earlier and personalized to the moment of decision.

Coaches reached out during the gap with messages that built on the excitement of acceptance, surfaced practical next steps, and helped learners visualize their first week. Outreach was tied to the rhythm of the wait, not the calendar.

Communications
Comms shifted from logistics to momentum.

Blog posts on what to expect in the first weeks, planning guidance for working adults balancing school with full-time work, and stories from learners further along in the journey. Tone moved from administrative to "you got this."

Product
The wait got something to do.

The next-steps experience inside the product was reworked so members had concrete tasks during the gap: orientation content, a personal to-do list for the first month, a way to connect with peers starting at the same time. Each task was small enough to feel doable and meaningful enough to reinforce commitment.

RAPID
Cross-functional ownership, named clearly.

Coaching, comms, product, and operations each owned different pieces of the response. We mapped the work using a RAPID structure so handoffs were explicit and accountability did not get lost in the seams between teams.

Impact

A measurable lift.

The funnel step rate improved measurably after the changes shipped. The exact numbers are confidential, but the direction was clear and the change held over time. More importantly, the framing (that motivation is perishable in the gaps between commitment and action) became a shared mental model across the teams that owned the journey. Subsequent work on other funnel steps adopted the same lens. The dangerous moments in a user journey are not always moments of friction. Sometimes they are moments of nothing.