Blog
RevOps2025-08-2810 min

How to Design a RevOps Data Model That Connects Marketing, Sales, and CS

A unified data model is the foundation of RevOps. Here's how to design one that connects all revenue-generating functions.Includes process templates, metric definitions, and team alignment frameworks.

Revenue operations promises a unified view across marketing, sales, and customer success. In practice, most RevOps teams inherit three separate data architectures built by three separate teams over three separate eras of the company. Marketing tracks leads. Sales tracks deals. CS tracks tickets. And nobody can answer the most basic revenue question: what is the complete journey from first touch to renewal for a specific account? A RevOps data model is the schema-level infrastructure that makes that question answerable. Without it, every report is a reconciliation exercise, every dashboard is an approximation, and every cross-functional meeting devolves into arguments about whose numbers are right.

This guide walks through designing a data model that connects all three revenue functions into a single, queryable structure. Not the theory of data modeling. The specific objects, relationships, fields, and conventions that allow you to trace any revenue dollar from the marketing touch that initiated the relationship through the sales process that closed the deal to the customer success activities that drive retention and expansion.

TL;DR
  • A RevOps data model has five core objects: Person, Account, Opportunity, Activity, and Subscription. Every other entity (campaigns, tickets, invoices) connects to one of these five. Design the core objects first, then extend.
  • The account object is the center of gravity. In B2B, revenue is earned from companies, not individuals. The data model must support multiple people per account, multiple opportunities per account, and multiple subscriptions per account.
  • Activity is the most important and most neglected object. Every interaction (email, call, meeting, page view, support ticket, product login) should be logged as an activity with a timestamp, type, channel, and association to both a person and an account.
  • Build the data model in your warehouse, not your CRM. The CRM is the operational interface. The warehouse is the analytical foundation. Trying to make the CRM do both creates performance problems and limits analytical flexibility.

Why the Default Data Model Fails

Every CRM comes with a default data model: contacts, companies, deals, and activities. On the surface, this looks sufficient. But the default model was designed for a single team (sales) doing a single thing (managing deals). When marketing and customer success are layered on, the default model breaks in predictable ways.

The Lead vs. Contact Problem

Salesforce has a separate Lead object and Contact object. A person starts as a Lead and is "converted" to a Contact when they become a qualified opportunity. This creates a fundamental break in the data model. Pre-conversion activity lives on the Lead record. Post-conversion activity lives on the Contact record. Linking them requires matching logic that is fragile and often incomplete. HubSpot avoids this by treating all people as Contacts from the start, which is architecturally cleaner but creates different challenges (every form fill creates a Contact, which bloats the database with unqualified entries).

In a proper RevOps data model, there is one Person object. A person can have a lifecycle stage (subscriber, lead, MQL, SQL, opportunity, customer, churned) but it is always the same record. There is no conversion event that creates a new record. The lifecycle stage is a field on the Person object that changes over time, with the history of changes preserved for analysis. This allows you to reconstruct the complete journey of any person from their first interaction through their current state.

The Missing Account Layer

Many CRM implementations treat accounts as optional. Contacts exist independently, deals are associated with contacts directly, and the account is an afterthought. This works for transactional sales (one buyer, one deal) but breaks for B2B where the buying decision involves 3-10 people across different departments, the contract is with the company not the individual, and revenue metrics (ARR, NRR, expansion) are measured at the account level.

In the RevOps data model, the account is the central object. Every person belongs to an account. Every opportunity belongs to an account. Every subscription belongs to an account. Revenue is reported at the account level. Health scores are calculated at the account level. The account provides the aggregation point that allows you to see the total relationship: all people, all deals, all revenue, all activity, all support interactions, all product usage, all in one place.

The Activity Fragmentation Problem

Activities (interactions between your company and the customer) are typically scattered across multiple systems. Marketing emails are in the marketing automation platform. Sales calls are in the sales engagement tool. Support tickets are in the helpdesk. Product usage data is in the product database. Website visits are in the analytics platform. No single system has the complete picture, and reconstructing the full activity timeline requires querying and reconciling data from five or more sources.

The RevOps data model solves this by defining a unified Activity object that captures interactions from all sources in a single, standardized schema. Every activity has: a timestamp, a type (email, call, meeting, pageview, login, ticket, etc.), a channel (email, phone, web, in-app, support), an actor (who performed the activity), a subject (who the activity was with or about), an associated person, an associated account, and optional metadata (email subject line, page URL, call duration, ticket priority). When all activities are in one table with a consistent schema, building a complete timeline for any person or account is a simple query.

5
core objects
Person, Account, Opportunity, Activity, Subscription
3-10
people involved
in a typical B2B buying decision
5+
source systems
feeding activity data

Based on analysis of RevOps implementations across 200+ B2B companies

The Five Core Objects

The RevOps data model is built on five core objects. Every other entity in your revenue operation (campaigns, sequences, tickets, invoices, features, usage events) is either a subtype of one of these objects or a relationship between them. Getting these five objects right is the foundation of every report, dashboard, and analysis you will ever build.

Object 1: Person

A Person represents an individual human being who has interacted with your company in any capacity: marketing lead, sales prospect, customer contact, support requester, product user, or partner representative. The Person object should contain identity information (name, email, phone, LinkedIn URL), firmographic context (title, department, seniority level), behavioral state (lifecycle stage, lead score, engagement score), and metadata (source, creation date, last activity date).

Key design decisions for the Person object: Use a single canonical email as the primary identifier. Support multiple email addresses per person (personal, work, alternate work) but designate one as canonical for deduplication. Store the lifecycle stage as a field with a separate stage history table that logs every stage change with a timestamp and the trigger that caused it. Include derived fields for title level (C-suite, VP, Director, Manager, IC) and title function (Marketing, Sales, Engineering, Product) rather than relying on the raw title string for segmentation.

Object 2: Account

An Account represents a company or organization. The Account object should contain firmographic data (company name, domain, industry, employee count, revenue range, location, technology stack), relationship data (customer status, account tier, account owner, partner type), and aggregate metrics (total ARR, total deal value, total contacts, last activity date, health score).

Key design decisions for the Account object: Use the primary email domain as the matching key for associating People with Accounts. Maintain a domain alias table for companies with multiple domains (google.com, alphabet.com, youtube.com all map to the same account). Handle parent-child relationships for enterprises with multiple subsidiaries or divisions. Store ICP fit score as a calculated field based on firmographic data so that it updates automatically when account data is enriched or corrected.

Object 3: Opportunity

An Opportunity represents a potential or actual revenue event: a deal being worked, a contract being negotiated, a renewal being processed, or an expansion being upsold. The Opportunity object should contain deal mechanics (amount, stage, close date, probability, pipeline, deal type), associations (account, primary contact, sales owner), and temporal data (create date, stage change dates, days in current stage, total days open).

Key design decisions for the Opportunity object: Define clear, mutually exclusive deal types (new business, renewal, expansion, cross-sell) so that revenue can be segmented cleanly. Maintain a stage history table that logs every stage change with timestamp, from-stage, to-stage, and the rep who made the change. This history table is essential for pipeline velocity analysis, stage conversion rate analysis, and forecast accuracy measurement. Include a "forecast category" field separate from stage (most likely, commit, best case, pipeline) that the rep sets manually and that is used for forecasting.

Object 4: Activity

An Activity represents any interaction between your company and a person or account. Activities are the connective tissue of the data model. They link people to opportunities (which marketing touches influenced the deal?), opportunities to outcomes (what activities happened before a deal was won or lost?), and accounts to health (is the customer engaging or going quiet?).

The Activity object should have: a unique activity ID, a timestamp, an activity type (hierarchical: email > marketing email > nurture email), a channel (email, phone, web, chat, in-person, in-app), a direction (inbound or outbound), an actor (the person or system that performed the activity), a subject (the person or account the activity was with or about), an associated person ID, an associated account ID, an associated opportunity ID (if relevant), a source system (which tool generated this activity), and a metadata JSON field for type-specific details (email subject, page URL, call recording link, ticket priority, product feature used).

The Activity table is typically the largest table in the data model by far. A single account with 10 contacts might have thousands of activities: website visits, email sends, email opens, email clicks, calls, meetings, product logins, feature usage events, support tickets, and more. Design the table for volume: partition by date, index on person_id and account_id, and archive activities older than your analysis window (typically 24-36 months) to maintain query performance.

Object 5: Subscription

A Subscription represents the ongoing revenue relationship between an account and your company. For SaaS businesses, this is the contract that generates MRR/ARR. The Subscription object should contain: the associated account, the plan or product, the MRR/ARR amount, the billing frequency (monthly, annual), the start date, the current renewal date, the contract term, the expansion history (upgrades, add-ons), and the status (active, churned, paused, pending renewal).

Key design decisions for the Subscription object: Support multiple subscriptions per account (common in enterprise with multiple products or divisions). Track subscription changes in a separate change log table: every upgrade, downgrade, cancellation, pause, and renewal should be logged with timestamp, old value, new value, and reason. This change log is the foundation for expansion revenue analysis, churn analysis, and cohort-based retention modeling.

Start with the core five, then extend
It is tempting to model everything upfront: campaigns, sequences, products, features, tickets, invoices, partners. Resist this temptation. Get the five core objects right first. Once they are in production and being used for analysis, extend the model incrementally based on actual analytical needs, not hypothetical ones. Every object you add increases complexity, maintenance overhead, and the surface area for data quality issues.

Relationships Between Objects

The relationships between objects are where the analytical power of the data model lives. Getting the cardinality (one-to-one, one-to-many, many-to-many) right determines which questions you can answer.

Person to Account: Many-to-one. Multiple people belong to one account. A person can only belong to one account at a time (if they change companies, they get a new association). Track the history of person-to-account associations to handle job changes without losing historical data.

Opportunity to Account: Many-to-one. An account can have multiple opportunities (new business, renewals, expansions). An opportunity belongs to exactly one account. This allows you to calculate total pipeline, total won revenue, and total ARR at the account level by aggregating opportunities.

Person to Opportunity: Many-to-many through a contact role junction table. Multiple people can be associated with a single opportunity (the buying committee). A person can be associated with multiple opportunities (a champion who sponsors deals across different products). The junction table should include a role field (champion, decision maker, influencer, blocker, user, economic buyer) that describes the person's role in the specific deal.

Activity to Person/Account/Opportunity: Activities should be associated with at least a Person and an Account. Association with an Opportunity is optional and depends on whether the activity is related to a specific deal. The triple association (activity linked to person, account, and opportunity simultaneously) is what enables attribution analysis: which activities on which people at which accounts influenced which deals?

Subscription to Account: Many-to-one. An account can have multiple subscriptions (different products, different divisions). Each subscription belongs to exactly one account. The subscription amounts sum to the account's total ARR.

Building the Data Model Step by Step

1
Define Objects and Fields

Document the five core objects with their complete field lists, data types, and source systems. Identify the primary key for each object and the foreign keys that create relationships. Get sign-off from marketing, sales, and CS leaders on the field definitions to ensure each function's needs are represented.

2
Map Source Systems to Objects

For each object and field, identify the source system (CRM, marketing automation, product database, billing system, support platform). Determine which system is the master for each field. Document the mapping in a source-field matrix that specifies source, destination, transformation logic, and sync frequency.

3
Build ETL Pipelines

Create data pipelines that extract data from source systems, transform it to match your object schema, and load it into the data warehouse. Use tools like Fivetran or Airbyte for extraction. Use dbt for transformation. Define tests that validate data quality at each step: uniqueness tests, not-null tests, referential integrity tests, and freshness tests.

4
Build Derived Tables and Metrics

Create derived tables that answer common analytical questions without requiring analysts to join raw tables every time. Account summary (total ARR, deal count, contact count, last activity, health score). Person journey (lifecycle stages with timestamps and durations). Pipeline snapshot (daily snapshot of pipeline value by stage for trend analysis). Revenue waterfall (new, expansion, contraction, churn by period).

5
Connect to BI and Operational Tools

Expose the data model to analysts through a BI tool (Looker, Tableau, Metabase) and to operators through reverse ETL back to the CRM and engagement tools. Create dashboards that answer the most common cross-functional questions. Use reverse ETL to push calculated fields (health scores, ICP scores, engagement scores) back into the CRM where sales and CS can act on them.

The Activity Taxonomy

The Activity object is only useful if activities are categorized consistently. A taxonomy defines the hierarchy of activity types and ensures that activities from different source systems are classified the same way. Without a taxonomy, the activity table becomes a dumping ground of inconsistently labeled events that is difficult to query and impossible to analyze at scale.

CategoryTypeSource SystemDirection
MarketingEmail sent, email opened, email clicked, form submitted, content downloaded, webinar attended, ad clickedMarketing automation, ad platformsOutbound / Inbound
SalesCall made, email sent, meeting held, proposal sent, contract sent, deal stage changedSales engagement, CRMOutbound
ProductLogin, feature used, activation milestone, usage milestone, trial started, trial expiredProduct database, analyticsInbound
SupportTicket created, ticket resolved, escalation, NPS response, CSAT responseHelpdesk, survey toolsInbound
WebsitePage viewed, pricing page viewed, demo page viewed, blog post read, search performedWeb analytics, CDPInbound

The taxonomy should be hierarchical: category > type > subtype. This allows analysis at different levels of granularity. At the category level: "How much marketing activity versus sales activity does this account have?" At the type level: "How many meetings were held on this deal?" At the subtype level: "How many demo meetings versus check-in meetings?" Define the taxonomy before building ETL pipelines so that every source system maps to a consistent set of categories and types.

Unify your revenue data model automatically

OSCOM connects your marketing, sales, and CS data into a unified model with standardized activities, account-level aggregation, and cross-functional dashboards. No data engineering required.

Unify your data

Implementing the Model in Your Data Warehouse

The data model should live in your data warehouse, not your CRM. The CRM is the operational interface where reps and CSMs work. The warehouse is the analytical foundation where data from all sources is combined, transformed, and made available for analysis. Trying to build a comprehensive data model within the CRM creates performance problems (complex cross-object queries slow down the CRM for everyone), maintenance overhead (CRM schema changes require admin intervention), and analytical limitations (CRM reporting tools are typically less powerful than SQL or BI platforms).

The dbt Layer

dbt (data build tool) is the standard framework for transforming raw data into the analytical data model within the warehouse. A typical dbt project for the RevOps data model has three layers of models. The staging layer cleans and standardizes raw data from each source system: renaming columns, casting data types, deduplicating records, and handling null values. The intermediate layer performs cross-source joins and transformations: associating people with accounts via domain matching, stitching activities from multiple sources into a single timeline, and calculating derived fields like days-in-stage and engagement scores. The marts layer produces the final analytical tables that BI tools and reverse ETL tools consume: account summary, person journey, pipeline analysis, revenue waterfall, and activity timeline.

Each dbt model should include tests that validate data quality. Schema tests check that primary keys are unique, required fields are not null, and foreign keys reference valid records. Data tests check business logic: every opportunity should have an associated account, every activity should have a timestamp within a reasonable range, and revenue amounts should be positive. Run these tests on every dbt build and set up alerts when tests fail.

Handling Historical Data and Slowly Changing Dimensions

Some fields change over time and you need to track the history. A person's lifecycle stage changes as they progress through the funnel. An opportunity's stage changes as the deal progresses. An account's ARR changes as subscriptions are added, upgraded, or canceled. For these fields, implement slowly changing dimension (SCD) Type 2 tracking: instead of overwriting the old value, create a new row with the new value, an effective date, and an end date. This allows you to reconstruct the state of any record at any point in time, which is essential for analysis like "what was our pipeline on March 1?" or "when did this account become a customer?"

For fields that change frequently but where the history is not analytically important (last login date, email engagement score), use SCD Type 1 (overwrite the old value). The decision between Type 1 and Type 2 should be made for each field individually based on whether historical values are needed for analysis.

Derived Metrics and Calculated Fields

The raw data model answers questions about what happened. Derived metrics answer questions about what it means. These calculated fields and aggregate metrics are built on top of the core objects and provide the operational intelligence that drives daily decisions.

Account Health Score

The account health score is a composite metric that predicts retention risk. It combines product usage (frequency and breadth of feature usage relative to the account's plan), support sentiment (ticket volume, resolution time, CSAT/NPS scores), engagement (response rate to CSM outreach, attendance at QBRs, participation in community), and commercial signals (contract term remaining, expansion conversations, competitive mentions). Each component is weighted based on its correlation with historical churn, and the composite score is typically normalized to 0-100 with bands for healthy (80+), at-risk (50-79), and critical (below 50). Push the health score back to the CRM via reverse ETL so that CSMs can see it on the account record and act on it.

Pipeline Velocity

Pipeline velocity measures how fast pipeline converts to revenue. The formula is: (number of opportunities x average deal value x win rate) / average sales cycle length. Calculate this metric weekly and track the trend. Velocity can also be decomposed by stage: how fast do deals move from discovery to demo? From demo to proposal? From proposal to close? Stage-level velocity reveals bottlenecks that aggregate velocity obscures. Build a pipeline velocity dashboard that shows the trend over time and flags when velocity drops below historical norms.

Engagement Score

The engagement score quantifies how actively a person or account is interacting with your company. It is based on activity data from the unified Activity table: recency (how recently was the last interaction?), frequency (how many interactions in the past 30 days?), and breadth (how many different activity types?). A person who logged into the product yesterday, responded to a CSM email this week, and attended a webinar this month has a higher engagement score than a person whose last activity was a marketing email open 60 days ago. Use engagement scores as inputs to lead scoring (for prospects), health scoring (for customers), and re-engagement triggers (for contacts whose engagement is declining).

Common Pitfalls and How to Avoid Them

Over-modeling upfront. The temptation is to design a data model that covers every possible use case. Resist this. Start with the five core objects and the most critical derived metrics. Extend the model when you have a concrete analytical need that the current model cannot serve. Every additional object, field, and relationship adds maintenance overhead and increases the likelihood of data quality issues. Build what you need now, not what you might need someday.

Inconsistent granularity. Mixing different levels of granularity in the same table creates confusion and errors. Activities should always be at the individual interaction level (one row per email, one row per page view). Opportunities should always be at the deal level. Accounts should always be at the company level. If you need different levels of aggregation (daily activity counts, weekly pipeline snapshots, monthly revenue summaries), create separate derived tables for each granularity rather than trying to store multiple granularities in the same table.

Ignoring data freshness. A data model is only useful if the data is current. If activities sync daily but the dashboard is used for real-time decision-making, the 24-hour lag creates blind spots. Define freshness requirements for each table: real-time (minutes) for activities and deal stage changes, near-real-time (hourly) for enrichment and scoring updates, and daily for historical snapshots and aggregate metrics. Set up freshness monitoring that alerts when data falls behind schedule.

Not involving stakeholders. A data model designed by a data engineer without input from marketing, sales, and CS will miss critical fields and relationships. Marketing needs campaign attribution fields. Sales needs competitive and MEDDIC fields on opportunities. CS needs health score inputs and product usage fields. Involve a representative from each function in the design process and get sign-off on the field list before building pipelines. The cost of adding a field later (rebuilding pipelines, backfilling data) is much higher than including it from the start.

The CRM is not the data model
A common mistake is treating the CRM schema as the data model. The CRM is an operational tool optimized for daily use by reps and CSMs. The data model is an analytical framework optimized for cross-functional analysis. They share some objects and fields, but they serve different purposes and should be designed independently. The CRM schema should be as simple as possible to maximize adoption. The data model should be as comprehensive as needed to answer revenue questions. Use the warehouse for analysis and reverse ETL to push insights back to the CRM.

Governance and Maintenance

A data model without governance degrades over time. New fields get added without documentation. Old fields become stale but are never removed. Naming conventions drift. Data quality checks get disabled because they fail too often and nobody wants to fix the underlying issues. Governance is the set of processes that keep the data model reliable and maintainable.

Field naming conventions. Establish and enforce naming conventions: snake_case for database fields, consistent prefixes for source-specific fields (sf_ for Salesforce, hs_ for HubSpot), and descriptive names that make the field's purpose clear without needing documentation (account_mrr_amount, not amt or mrr). Document conventions in a style guide and enforce them through code review of dbt models.

Change management. Any change to the data model (new field, removed field, changed data type, new relationship) should go through a review process. The change proposal should include: what is changing, why, which downstream reports and dashboards are affected, and who needs to be notified. This prevents breaking changes from disrupting dashboards and reports that other teams depend on.

Data dictionary. Maintain a data dictionary that documents every object, field, relationship, and derived metric. For each field: the name, data type, description, source system, update frequency, and owner. The data dictionary is the reference that analysts use to understand what data is available and what it means. Without it, analysts spend hours exploring tables and guessing at field meanings, or worse, they use fields incorrectly and produce wrong results.

Quarterly review. Review the data model quarterly. Check for fields with high null rates (indicating data quality issues or irrelevance). Check for fields that are never queried (indicating they can be deprecated). Check for new analytical needs that require model extensions. The quarterly review keeps the model lean and relevant.

Key Takeaways

  • 1The RevOps data model has five core objects: Person, Account, Opportunity, Activity, and Subscription. Design these five first, then extend based on actual analytical needs.
  • 2The Account is the center of gravity. Every person, opportunity, and subscription belongs to an account. Revenue and health metrics are calculated at the account level.
  • 3The unified Activity table is the most important innovation. It consolidates interactions from marketing, sales, product, and support into a single timeline per person and account.
  • 4Build the data model in the warehouse, not the CRM. The CRM is the operational interface. The warehouse is the analytical foundation. Use reverse ETL to push insights back to the CRM.
  • 5Governance matters as much as design. Naming conventions, change management, data dictionaries, and quarterly reviews keep the model reliable and maintainable over time.

Data architecture playbooks for revenue operations teams

Data model design, warehouse architecture, dbt patterns, and governance frameworks for RevOps teams building their analytical foundation. Weekly.

A well-designed RevOps data model does not just answer questions. It changes which questions are possible to ask. When marketing, sales, and CS data live in separate silos, the only cross-functional analysis possible is manual reconciliation: export from three systems, paste into a spreadsheet, match on account name, and hope the data aligns. When all revenue data shares a common model with consistent objects, relationships, and identifiers, the entire revenue operation becomes queryable. You can trace a dollar of ARR from the ad click that initiated the relationship through the sales process that closed the deal to the product usage pattern that drives retention. That level of visibility is not a nice-to-have. It is the operational foundation that separates data-informed revenue teams from everyone else.

See exactly where revenue is leaking in your funnel

Oscom audits your funnel across 12 categories and surfaces the specific fixes that increase conversion and retention.