Data Governance Workshop

A practical session to align your team on project structure, governance ownership, naming conventions, and the Amplitude features that keep your data clean at scale.

What You'll Build Today

By the end of this session, you'll walk away with four concrete decisions and artifacts:

Your project strategy
How you structure Amplitude projects — per platform, per business unit, aggregate, or Portfolios
Your governance model
A process framework matched to your team size, maturity, and org structure
Your taxonomy & instrumentation standards
Naming conventions, abstraction level, tracking plan setup, governance controls, and AI-assisted cleanup — the full operational layer
Your governance process table
Who owns what, how they do it, and when — ready to share with your team
Full agenda
Project Strategy — choose how to structure your Amplitude project(s)
What is Data Governance? — the pillars, the quality flywheel, and why it matters
Governance Model — pick the ownership and approval model that fits your org
Taxonomy Design — build your naming style guide and explore the abstraction spectrum
Tracking Plan — branches, schema settings, virtual data extensions, and transformations
Governance Controls — DAC tags, schema enforcement, and Observe monitoring
Data Assistant Agent — automate taxonomy cleanup and governance suggestions with AI
Workshop Activity — build your governance process table together
Next Steps — your prioritized action plan and recommended resources

What is Data Governance?

Data governance is the set of standards, processes, and ownership that ensures your analytics data stays accurate, trustworthy, and actionable over time.

📘 Education: Align teams on what to measure, why it matters, and how to name and classify events consistently across the org.

🔧 Instrumentation: Build the right tracking from the start — correct event schemas, property standards, and validation so data arrives clean.

🔄 Maintenance: Continuously review and clean your taxonomy — remove stale events, resolve duplicates, enforce ownership over time.

The Quality Flywheel
🏗️ Define standards: Style guides & tracking plan
🚀 Instrument cleanly: Branches & schema enforcement
🔍 Review & approve: Observe QA, merge requests
🧹 Maintain & prune: Data Assistant Agent, deprecation
📊 Trust the data: Better analysis outcomes

💡
Governance isn't a one-time project — it's an ongoing practice. The flywheel above shows how each step reinforces the next, compounding data quality over time.

Define Your Project Strategy

How you structure Amplitude projects determines your governance scope, cross-team visibility, and reporting flexibility. Choose the pattern that fits your organization.

🗂️ Choose a Project Strategy

Click a strategy to explore how it fits your organization. You can compare all options side-by-side using the toggle above.

📱 Per Platform: Best for platform-differentiated teams
🏢 Per Business Unit: Best for independent product lines
🌐 Aggregate Project: Best for unified analytics
📊 Portfolios: Best for cross-project rollups

💡
Your project strategy is a foundational decision — it shapes everything from your tracking plan scope to how you report cross-team metrics. Most teams start simple (Aggregate or Per Platform) and evolve to Portfolios as they scale.

Choose Your Governance Model

Answer four quick questions to get a recommended model — or select one directly below.

🎯 Model Configurator
How many teams send data to Amplitude? 1 team · 2–5 teams · 5+ teams
Who owns data governance today? Nobody owns it formally · A central data/analytics team · Each team manages their own · A cross-functional governance council
How mature is your governance process? Just starting — no formal process · Some standards, inconsistently applied · Established process, needs scaling
How are tracking changes approved? Anyone can add events · One person reviews all changes · Team lead approves for each squad · Cross-team committee reviews
🏆 Recommended Model

Program Owner
Approval Process
Effort Level

Resource Checklist for this Model

    💡
    These models aren't rigid — most teams blend elements. The goal is to match governance overhead to your team's capacity and data maturity.

    Build Your Style Guide

    Define your naming conventions once — the preview updates live as you configure your rules.

🔗 Connect to a Live Project (optional)

    Pull real events from your Amplitude project to auto-detect their naming conventions.


    Enter your Data tab URL for the project of interest to generate a direct link and a pre-configured bookmarklet.

1. Drag this to your bookmarks bar (one-time setup): 📊 Get Amplitude Events
2. Paste your Data tab URL above, then open your Data tab
3. Once Amplitude loads, click the 📊 Get Amplitude Events bookmark — your style guide will be configured automatically
    🎨 Style Rules

    Live Preview

    Properties on button_clicked
    🔍 Naming Validator

    Type an event name to check it against your style guide rules.
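
A check like this can be expressed in a few lines. The sketch below is illustrative, not Amplitude's implementation: the snake_case rule and the verb allow-list are example conventions you would swap for your own style guide.

```typescript
// Illustrative style-guide checker. The snake_case rule and the
// past-tense verb allow-list are example conventions, not Amplitude's.
const APPROVED_VERBS = ["clicked", "viewed", "started", "completed", "played"];

function validateEventName(name: string): string[] {
  const issues: string[] = [];
  // Rule 1: snake_case (lowercase words joined by single underscores)
  if (!/^[a-z][a-z0-9]*(_[a-z0-9]+)*$/.test(name)) {
    issues.push("not snake_case");
  }
  // Rule 2: object_action (final word must be an approved past-tense verb)
  const lastWord = name.split("_").pop() ?? "";
  if (!APPROVED_VERBS.includes(lastWord.toLowerCase())) {
    issues.push("missing approved past-tense verb");
  }
  return issues; // empty array = name passes the style guide
}
```

Under these rules, `button_clicked` passes while `ClickButton` fails both checks.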

    Explore how granularity decisions affect your event names and the properties needed to make them useful.

    ⚡ Too Granular ⚖️ Just Right 📦 Too Consolidated

    Tracking Plan & Branches

    The tracking plan is your source of truth. Branches let teams propose changes without breaking production data.

    📋 What Goes in the Tracking Plan

    Events

    • Event name, description, and expected event properties
    • Event ownership — who is responsible for each event
    • Event status: Planned → Live → Unexpected
    • Schema enforcement rules (project-wide)

    User Properties

    • User traits set via identify() calls — e.g. plan_type, country
    • Group properties for account-level analysis (company, org)
    • Persist across sessions — unlike event properties which are per-event

    Shared Resources

    • Property groups — reusable property sets applied across multiple events
⎇ Branch Workflow

1. Create a branch: Engineer proposes new events or property changes in an isolated branch
2. Instrument against branch: SDK validation runs against branch schema — catches issues before prod
3. Submit merge request: Auto-notifies reviewers in Slack — shows diff of what changed
4. Review & merge: Approver accepts or requests changes — clean merge into main

    Schema enforcement is configured project-wide — the same rules apply across all your sources. Observe then lets you filter violations by source to pinpoint which SDK or integration is sending bad data. These settings are the rules engine that powers Amplitude Observe.

    👁️
    How this connects to Observe: Your schema settings define what "good data" looks like. Amplitude Observe (found at Data → Events) continuously compares your live event stream against those rules — surfacing unexpected events, missing required properties, type mismatches, and volume anomalies in real time. No code needed.

    Unexpected Events

    Mark as Unexpected

    Event is ingested but flagged in Observe for review

    What Observe shows: Events appear in Amplitude tagged "Unexpected" with a distinct status indicator. They're queryable but clearly signal the event isn't in the approved tracking plan. Great for discovery without data loss — your team can review and add to plan directly from the Observe view.

    Reject at Ingestion

    Unplanned events are dropped at the edge — never ingested

    ⚠️ Caution: Data is permanently lost — Observe will not see it. Best for teams with mature tracking plans where any unapproved event is genuinely invalid. Requires high confidence in your tracking plan completeness.

    Unexpected Properties

    Allow

    Unexpected properties are ingested and visible

    Best for discovery phases. New properties sent from the SDK appear in Amplitude and can be retroactively added to the tracking plan. Observe will show them as part of the event stream.

    Mark as Unexpected

    Property is ingested but flagged for review

    Ingested and queryable, but tagged in Observe so data governors can review and approve or block. Good balance of control and data safety.

    Reject

    Unexpected property values are dropped

    ⚠️ Caution: Property values are permanently lost and won't appear in Observe. Best for properties with strict PII or compliance requirements.

    Property Type Validation

    Required Properties

    Mark properties as required in the tracking plan. Observe surfaces events where required properties are missing — the "% Seen" column turns red when a required field is absent, making gaps immediately visible.

    Type Enforcement

    Define expected type (string, number, boolean, array) per property. Observe flags type mismatches in red — preventing silent bugs where revenue arrives as a string.

    👁️ What Observe Surfaces (Data → Events)
Valid: Event matches your current schema exactly
Unexpected: Event not yet in the tracking plan — review and add, or reject
Invalid: Event deviates from schema — missing required props or wrong types
Out of Date: Event matches a previous version of the schema — SDK not yet updated
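
Conceptually, the first three statuses fall out of a comparison between each incoming event and the tracking plan. Here is a minimal sketch; the schema shape and function name are assumptions for illustration, not Amplitude's internals ("Out of Date" would additionally require prior schema versions, which this sketch omits).

```typescript
// Sketch of mapping an incoming event to an Observe-style status.
// The tracking-plan representation here is illustrative only.
type PropType = "string" | "number" | "boolean" | "array";
type TrackingPlan = Record<string, { required: Record<string, PropType> }>;

function classifyEvent(
  plan: TrackingPlan,
  name: string,
  props: Record<string, unknown>,
): "Valid" | "Unexpected" | "Invalid" {
  const schema = plan[name];
  if (!schema) return "Unexpected"; // event not yet in the tracking plan
  for (const [prop, expected] of Object.entries(schema.required)) {
    const value = props[prop];
    if (value === undefined) return "Invalid"; // missing required property
    const actual = Array.isArray(value) ? "array" : typeof value;
    if (actual !== expected) return "Invalid"; // type mismatch
  }
  return "Valid"; // matches the current schema exactly
}
```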

    Enrich your taxonomy without re-instrumentation — these three features create new data dimensions retroactively, no SDK changes required.

    Custom Events

    Combine multiple existing events with an OR clause into a single reusable metric. Useful when multiple actions represent the same user intent — e.g. "Play Song" OR "Search Song" as a unified engagement event.

    Key constraints

    • Appear with a [Custom] prefix in charts
    • Editing breaks charts that reference them
    • Event property queries only work if the property exists on all component events
    • Not supported in Redshift queries
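
The OR clause and the all-components property rule can be sketched as follows; the event names and properties are illustrative.

```typescript
// Sketch of custom-event semantics: the custom event matches any of its
// component events (OR), and a property is queryable only when every
// component event defines it. Names below are examples.
interface ComponentEvent { name: string; properties: string[]; }

const components: ComponentEvent[] = [
  { name: "Play Song", properties: ["song_id", "genre"] },
  { name: "Search Song", properties: ["song_id", "query"] },
];

const matchesCustomEvent = (eventName: string): boolean =>
  components.some((c) => c.name === eventName);

// Intersection of all component property sets = queryable properties
const queryableProps = components
  .map((c) => new Set(c.properties))
  .reduce((a, b) => new Set([...a].filter((p) => b.has(p))));
```

Here only `song_id` is queryable on the combined event, since `genre` and `query` each exist on just one component.
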
    🧮

    Derived Properties

    Compute new properties on the fly from existing event or user properties using formulas — no new data required. Calculated retroactively, so they update historical charts automatically.

    Supported functions

    • String: REGEXEXTRACT, CONCAT, SPLIT, LOWERCASE
    • Math: SUM, MULTIPLY, DIVIDE, MIN, MAX
    • Date: EVENT_HOUR_OF_DAY, DATE_TIME_FORMATTER
    • Conditional: IF, SWITCH, COALESCE
    • Max 10 property references per derived property
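
To make the formula model concrete, here is a TypeScript equivalent of a hypothetical derived property combining REGEXEXTRACT, LOWERCASE, and IF. In Amplitude the formula is written in the derived-property editor, not in code.

```typescript
// TypeScript equivalent of a hypothetical derived property:
//   IF(REGEXEXTRACT(page_url, "utm_source=(\w+)") != "",
//      LOWERCASE(REGEXEXTRACT(page_url, "utm_source=(\w+)")), "direct")
function derivedTrafficSource(pageUrl: string): string {
  const match = /utm_source=(\w+)/.exec(pageUrl); // REGEXEXTRACT
  return match ? match[1].toLowerCase() : "direct"; // IF + LOWERCASE
}
```

Because derived properties are computed at query time, this value would also appear on historical events.
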
    📡

    Channel Classifiers

    Categorize traffic by UTM parameters and referrer data into named marketing channels — computed on the fly, aligned to GA4's 29-category model by default. Retroactive definitions update all existing charts.

    Key details

    • Default classifier includes Paid/Organic Search, Social, Email, Display, LLM Search, and more
    • Define custom channels by building row-based rules (all conditions in a row must be true)
    • Use OR logic by adding separate rows with the same channel name
    • Max 149 rows and 1,000 total cells per classifier
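
The row model can be sketched like this: each row ANDs its conditions, and repeating a channel name across rows gives OR logic. The rules below are examples, not the default classifier.

```typescript
// Row-based channel rules: all conditions in a row must match (AND);
// rows sharing a channel name act as OR. Example rules only.
interface Row { channel: string; conditions: Record<string, RegExp>; }

const rows: Row[] = [
  { channel: "Paid Search", conditions: { utm_medium: /^(cpc|ppc)$/, utm_source: /google|bing/ } },
  { channel: "Organic Search", conditions: { referrer: /google\.|bing\./ } },
  { channel: "Email", conditions: { utm_medium: /^email$/ } },
];

function classifyChannel(params: Record<string, string>): string {
  for (const row of rows) {
    const allConditionsMatch = Object.entries(row.conditions).every(
      ([param, pattern]) => pattern.test(params[param] ?? ""),
    );
    if (allConditionsMatch) return row.channel; // first matching row wins
  }
  return "Direct"; // fallback when no row matches
}
```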

    Fix common instrumentation mistakes retroactively — no code changes, no re-deployment. Transformations apply at query time and leave your raw data untouched.

    💡
    When to use: Transformations are ideal for correcting data quality issues that are too costly or slow to fix in code — renaming inconsistent events, merging duplicates, or standardizing property values across historical data.
    🔀 Merge Events

    Consolidate multiple events that represent the same action into a single event name.

    // Before
    comment_reply_like
    comment_share
    // After merge
    comment (comment_type: "like" | "share")

    Optionally add a distinguishing property to preserve the original intent.

    🏷️ Merge Properties

    Combine properties with different names that represent the same dimension.

    // Before
    title
    TITLE
    item_title
    // After merge
    Title

    Works for both event properties and user properties.

    ✏️ Rename Property Values

    Correct misspellings or standardize inconsistent casing in property values.

    // Before
    paid_subscription: "true"
    paid_subscription: "TRUE"
    // After rename
    paid_subscription: "True"

    Reassigns specific values — does not affect all values of the property.

    🙈 Hide Property Values

    Remove unwanted or noisy values from charts and dropdowns without deleting raw data.

    // Hide test/internal values
    user_type: "(none)"
    user_type: "test_user"
    // Hidden from charts,
    // visible in event stream

    Raw data preserved — reversible at any time.
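
"Apply at query time" means a transformation is a lookup applied when reading, never a rewrite of stored events. Here is a sketch using the merge and rename examples above; the mapping shapes are assumptions, not Amplitude's internals.

```typescript
// Query-time view of transformations: raw events stay untouched; the
// mappings mirror the merge and rename examples in this section.
interface AnalyticsEvent { name: string; props: Record<string, string>; }

const mergeEvents: Record<string, { into: string; addProp?: [string, string] }> = {
  comment_reply_like: { into: "comment", addProp: ["comment_type", "like"] },
  comment_share: { into: "comment", addProp: ["comment_type", "share"] },
};
const renameValues: Record<string, Record<string, string>> = {
  paid_subscription: { TRUE: "True", true: "True" },
};

function applyAtQueryTime(raw: AnalyticsEvent): AnalyticsEvent {
  const merge = mergeEvents[raw.name];
  const props = { ...raw.props };
  if (merge?.addProp) props[merge.addProp[0]] = merge.addProp[1];
  for (const [prop, mapping] of Object.entries(renameValues)) {
    if (props[prop] !== undefined) props[prop] = mapping[props[prop]] ?? props[prop];
  }
  return { name: merge ? merge.into : raw.name, props };
}
```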

    ⚠️ Important Constraints
    • Only available on the main branch — not on feature branches
    • Default Amplitude user properties (e.g. platform, country) cannot be transformed
    • Transformed properties are not available for block/drop filters in Data Management
    • Transformations apply at query time — raw data in connected warehouses is unchanged
    • All transformations are non-permanent and can be edited or deleted anytime
    🔗
    Find transformations in Amplitude under Data → Transformations. Requires Manager or Admin role to create.

    Governance Controls

    Classify your properties, control who sees sensitive data, and manage event lifecycle.

    Property Classification
Property | Example Values | Classification | What Restricted Users See
user_email | alice@company.com | PII | Property hidden from charts and user streams
plan_mrr | $1,200 | Revenue | Values masked — "Restricted" shown in place of value
device_id | a1b2c3d4… | Sensitive | Not available in group-by or filters
plan_type | Pro, Growth, Enterprise | Standard | Fully visible to all users
experiment_variant | control, treatment_a | Standard | Fully visible to all users
    Event Lifecycle

    Hide vs Block vs Delete

    These are not the same operation — understand the consequences before acting.

    🙈 Hide

    • Event is removed from the event picker UI
    • Data is still ingested and stored
    • Existing charts still work
    • Reversible — unhide anytime
    • Best for: decluttering noise

    🚫 Block

    • New incoming data is dropped at edge
    • Historical data is preserved
    • Old charts using this event still render
    • Partially reversible (unblock re-enables)
    • Best for: stopping bad instrumentation

    🗑️ Delete

    • All historical data is permanently erased
    • New data is also dropped
    • All charts referencing event break
    • Irreversible
    • Best for: compliance / data minimization only

    Data Deprecation

    Automated Cleanup with Data Assistant

    Automated tasks run daily across your workspace — surfacing stale events so data governors can act without manual audits. Requires Admin/Manager permissions and must be enabled by Amplitude Support.

    🕐 Stale Events (90-day default)

    Events not queried in any chart for 90+ days are flagged as candidates. The threshold is configurable. Includes a mandatory 30-day notification window before deletion — event owners are notified via email and Slack so they can save events they still need.

    📅 Single-Day Events (test data)

    Events that fired only on a single calendar day (typically test instrumentation) are automatically identified and scheduled for deletion after the 30-day window. Threshold is configurable.
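
The stale-event check is essentially a date-threshold filter. A sketch with assumed field names follows; Amplitude tracks chart usage internally, so `lastQueried` here is purely illustrative.

```typescript
// Sketch of a stale-event scan: flag events with no chart queries inside
// the threshold window. Field names are assumptions for illustration.
interface EventUsage { name: string; lastQueried: Date | null; }

function findStaleEvents(
  events: EventUsage[],
  now: Date,
  thresholdDays = 90, // configurable; 90-day default
): string[] {
  const cutoff = now.getTime() - thresholdDays * 24 * 60 * 60 * 1000;
  return events
    .filter((e) => e.lastQueried === null || e.lastQueried.getTime() < cutoff)
    .map((e) => e.name);
}
```

Flagged events would then enter the 30-day notification window before any deletion.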

    ⚠️
    Recovery: Deleted events are restorable anytime via Data → Events → Deleted Events → Restore — this does not recover the data sent to Amplitude while the event was deleted.

    4-Step Deprecation Workflow

1. Identify: Data Assistant surfaces events unused for 90+ days or fired on a single day
2. Notify: Owners notified via email + Slack 30 days before deletion — they can save the event
3. Hide: Remove from picker, confirm no active charts depend on the event
4. Delete: Permanent removal — only for compliance or data minimization requirements

    AI-Powered Data Governance

    Amplitude's AI agents help you build, maintain, and scale your taxonomy — from first instrumentation through ongoing governance.

    Today — Data Assistant

    Available now. Continuously analyzes your event stream and surfaces issues for review.

🔍 Auto-Detect Issues: Surfaces stale events (90+ days without a query), duplicates, and events missing descriptions or owners.

Bulk Actions: Accept or reject suggested changes in bulk — categorize, tag, and clean your taxonomy in minutes, not days.

📊 Taxonomy Health Score: Proactive health scoring shows your taxonomy quality over time — trend it in your quarterly review.

    🤖 Review Suggestions in Data Assistant

    Data Assistant analyzes your live event stream and surfaces the most impactful recommendations — stale events, missing metadata, duplicates, and more — directly in your project.

    🚀 Try the Data Assistant Agent

    The Data Assistant Agent analyzes your taxonomy and creates a prioritized action plan — surfacing stale events, missing metadata, duplicates, and AI readiness improvements. Launch it directly in your project.


    Coming in 2026 — AI Governance Agents

    Three new agents that automate the full governance lifecycle — from initial setup through continuous self-maintenance.

🚀 Quickstart Agent (Setup & Instrumentation)

Scans your product or website during initial setup and suggests a complete tracking plan, then helps you instrument it — getting teams to clean data from day one without starting from a blank slate.

🤖 Data Assistant Agent (Ongoing Maintenance)

Surfaces the top recommended actions to improve data quality on an ongoing basis. Goes beyond today's Data Assistant with richer AI-driven prioritization across four areas:

• Updating descriptions and other missing metadata
• Deleting unused and test events
• Reducing duplicate events
• Improving AI readiness of your taxonomy

🔮 Self-Building Agent (Proactive Discovery)

Identifies new user paths and funnels in your product on a regular cadence, then proactively suggests new events and properties for your review — so your tracking plan evolves as your product does.

    Your Governance Process Table

    Fill in this table live during the workshop. This becomes your team's governance runbook — who does what, how, and when.

    📝
    Complete this table with your team now. When you're done, print or screenshot it — this is your governance process document.
Process Area | WHO is responsible? | HOW does it work? | WHEN does it happen?
📐 Taxonomy Schema | | |
➕ New Events | | |
🗑️ Event Removal | | |
🔧 Maintenance Issues | | |
🏷️ Style Guide | | |
🔐 Access Control | | |

    Review Cadence
Frequency | Activity | Owner | Tool
Weekly | Review Observe for unexpected events from recent releases | Engineer / PM | Observe
Bi-weekly | Review and approve pending branch merge requests | Data Steward | Branches
Quarterly | Full taxonomy health review — accept/reject Agent suggestions | Governance Owner | Data Assistant Agent
Annually | Style guide review + bulk event cleanup | Data Council | Bulk Edit + Agent

    Next Steps

    Your post-workshop action plan. Remove any items that aren't relevant to your team, then share or print this list.

    ✅ Your Post-Workshop Checklist
    • Share the completed governance process table with your team
    • Document your naming conventions from the style guide builder
    • Assign a governance owner or program lead
    • Set up your first branch in Amplitude Data
    • Configure schema enforcement settings for your project
    • Schedule a quarterly Data Assistant Agent review in your calendar