Skip to content

Build and activate datasets in Data Studio

Last updated: September 4, 2025

Available with any of the following subscriptions, except where noted:

Please note: if you're a Super Admin, learn how to opt your account into the beta.

HubSpot Data Studio consolidates external customer data to provide a unified platform for synchronization, integration, and activation across your technology stack. Users can build datasets from external sources like Google Sheets, Snowflake, and data sync apps, then integrate them into lists, workflows, and reports for efficient, data-driven decision-making.

Before you get started

  • In order to view, edit, and sync datasets, you must have Data Studio permissions.
  • View and Edit permissions are turned on by default. Super Admins must turn on Sync permissions for dataset users to let them use HubSpot Credits for Data Studio.

What is a dataset?

A dataset is a collection of data from across your HubSpot account that can consist of HubSpot and/or external data sources. A dataset can include properties for CRM objects and HubSpot assets, fields from external data sources, and formulas to calculate your data.

Learn more about dataset definitions in the Glossary.

Use cases for Data Studio

Below are key use cases demonstrating how you can leverage external datasets to enhance CRM records, optimize workflows, create dynamic customer lists, and use datasets to build reports.

Sync data from third-party applications

You can sync datasets from third-party applications directly to your HubSpot CRM. This allows you to:

  • Update CRM records using external datasets
  • Generate new properties from calculated formula fields
  • Instantly write data to your CRM with one click

For example, you could blend data from a product usage platform (e.g. Mixpanel) with company datasets to create a powerful Product engagement score property in HubSpot. 

Learn how to sync data from third-party applications.

Sync data to HubSpot's CRM

You can sync data to HubSpot's CRM to:

  • Improve your audience segmentation
  • Send more personalized marketing emails
  • Improve your sales outreach based on learnings derived from data

For example, you could create a dataset that tracks customer behavior across different channels. By syncing this data back to the CRM, you can refine audience segmentation to identify high-intent leads. Using these insights, you can send personalized email campaigns tailored to customer preferences, boosting engagement and conversion rates.

Learn how to sync data to HubSpot's CRM.

Use your dataset in a workflow

You can use your datasets in a workflow to:

  • Build one-time or recurring workflows for primary dataset records
  • Incorporate dataset fields into workflow actions

For example, leverage the engagement score dataset to build a workflow that automates customer alerts for potential upgrades or churn risks. This workflow can also proactively reach out to customers who may be a strong fit for an upgrade.

Learn how to use datasets in a workflow

Use your dataset to create segments

You can use your datasets to create segments to:

  • Import and merge product usage, lead scores, and customer data into Data Studio
  • Combine HubSpot contact data with external sources, refining insights with filters
  • Generate a filtered list of dataset contacts that meet specific criteria

For example, leverage a targeted customer data dataset to generate a dynamic segment that drives an automated marketing campaign, delivering personalized customer experiences at scale.

Learn how to use your datasets to create segments.

Use your dataset in a report

You can use your datasets in a report to:

  • Combine multiple data sources to create unified reports with information from your CRM, external applications, and analytics platforms
  • Segment data dynamically by applying filters to focus on specific groups, behaviors, or trends
  • Build custom metrics and calculations using formula fields to generate unique performance indicators. 

For example, you could create a field to calculate annual recurring revenue based on the deal amount property, and then use this data from the dataset in a report to track revenue trends, forecast future earnings, and analyze customer retention over time.

Learn how to use your datasets in a report.

Export datasets to Google Sheets and Office 365 Excel

You can export datasets to Google Sheets and Office 365 Excel to:

  • Maintain external copies for archival purposes or additional security.
  • Work with your data even without an internet connection.
  • Share datasets easily with team members for real-time edits and feedback.

For example, exporting HubSpot datasets to Google Sheets or Office 365 Excel allows businesses to refine data for clear insights into revenue trends and customer acquisition, then securely share it with external stakeholders without granting HubSpot access.

Learn how to export datasets

Create a dataset

To create a dataset:

  1. In your HubSpot account, navigate to Data Management > Data Studio.
  2. Click Create dataset in the top right.

Data sources

Data sources are the basis of your dataset. Learn more about different data sources in the Glossary.

  1. On the Sources tab, under Select a primary data source, click Choose a data source.
data-management-sources-tab
  1. On the Choose your data screen, select your primary data source:
    • On the Synced sources tab, click a HubSpot data source, or a data source from a third-party application. To add a new source, click + New source, then follow the prompts.
    • On the CSVs tab, click a CSV file to use as a data source. To add a CSV file, click + Upload new CSV, then follow the prompts.
  2. To add a secondary source, click + Add secondary source, then follow the steps above. Continue adding secondary sources as needed.
  3. To join additional data sources to construct a more complex data set, click Use advanced source configuration.

Advanced source configuration

To join another data source:

  1. Click Join another data source, or the + icon in the data source viewer.

data-management-join-data-source

  1. On the HubSpot data sources or External data sources tab, select a data source.
  2. Click Continue.
  3. On the Preview your join screen, configure the following:
data-management-customize-your-join 
    • In the Left table box, it will display the primary source you selected.
    • In the Right table box, it will display the additional source you selected.
    • Click the Left dropdown menu to select Left join or Inner join.
      • Left join: returns all rows from [primary data source] and any matching rows from [secondary data source]
      • Inner join: returns rows that have matching values in both [primary data source] and [secondary data source].
    • Click Use association labels or Use match key.
      • Use association labels: select an association label for the join.
      • Use match key: select a match key for the left table and right table.
    • Under Match behavior, specify the relationship between the selected data sources:
      • Multiple matches: one row in the left table can match multiple rows in the right table.
      • Only one match: one row in the left table can match at most one row in the right table. Additional rows won't be included.
    • Click Add custom join.

Columns

  1. On the Columns tab, drag and drop the columns to customize their order.
  data-management-customize-columns
  1. To choose columns you want to appear in the dataset, click Choose columns, then click the checkboxes next to columns you want to make appear.
  2. To create new columns, click + Create new column. Explore the different column creation options below:

Data Agent

Use data agent to conduct research and generate new data:

  1. Click to expand Data Agent. Then, click Create smart column
data-management-create-smart-column
  1.  In the Smart column name textbox, type the column name.
  2. In the Describe what you want to know textbox, write a description of the data you want to generate and extract from the smart column.
  3. Click + Insert column token to reference other properties within the dataset.
  4. Click the Where should this data come from? dropdown menu to select a source for your data.
    • Web research: data from searching the internet.
    • Company research: data from the company's primary website.
    • Dataset columns: data from another column in the dataset.
  5. Click the Data type dropdown menu to specify the kind of information in the cell (e.g., number, text, or date).
    • String
    • Number
    • Boolean
  6. Click Preview.
  7. The smart column will now appear in the table.

Enrichment

Use Enrichment to fill gaps or validate data:

  1. Click to expand Enrichment. Then, click Enrich [primary data source] columns or Enrich [secondary data source] columns

data-management-enrichment

  1. Click the Column for [data source] [property] dropdown menu, to select the column property you want to enrich. 
  2. Under Select columns to be created, click the checkboxes next to the properties. You can also use the search bar to find properties.
  3. Click Preview or Create [#] columns.

Formulas

Use Formulas to create calculations based on other columns:

  1. Click to expand Formulas. Then, click one of the following:
    • Apply switch case
    • Compare dates
    • Find and replace
    • Formula builder (Advanced)
    • Summary measure
  1. In the bottom panel, fill in the formula fields.
data-management-switch-case-column

  1. Then, click Save.

Modifiers

Use Modifiers to apply formatting changes and normalizations:

  1. Click Apply capitalization, then do the following:
    • Click the Column to reference dropdown menu to select a column.
    • Type a name in the New column name textbox.
    • Under Choose capitalization, click Title caseUPPERCASE, or lowercase.
    • Click Create column.
  2. Click Trim whitespace, then do the following:
    • Click the Column to reference dropdown menu to select a column.
    • Type a name in the New column name textbox.
    • Click Create column.

Summarized and unsummarized views

There are two types of views that are used to view your datasets:

  • Summarized view: the view for a dataset that's created when it has more than one data source and there are multiple matches possible between the primary and secondary data sources. This view will show data grouped by primary data source and will be used in CRM sync, workflows, and lists. Only columns available in the summarized view will be available in CRM sync, workflows, and lists.
  • Unsummarized view: the view for a dataset that displays ungrouped data. It is used for reporting. In cases where there's only one source or there's a 1:1 relationship between different sources, only unsummarized views will be available and usable across reporting, CRM sync, workflows, and lists.

Filters

Apply filters to your dataset.

  1. On the Filters tab, apply Row Level FiltersAggregate Filters, and Group Summary Filters to your dataset.
  2. Click the Include data if it matches dropdown menu, then select ALL of the filters belowANY of the filters below, or Custom filter rules.
  3. Click + Add filter, then click a filter from the pop-up box. Continue adding filters as needed.

data-management-filters

Review dataset

Once you have completed customizing your dataset, you can do the following:

  1. Review the table to make sure your dataset is appearing correctly.
  2. Click the Summarized view dropdown menu, and select one of the following:
data-management-switch-case-column 
    • Summarized view: shows data grouped by the primary source. This is used for CRM syncs, workflows, and lists.
    • Unsummarized view: shows ungrouped data with rows for all sources This is used for reporting.
  1. If you are satisfied with your dataset, click Save. Click Use in to create a report, workflow, list, do a CRM sync, or export your dataset.

Glossary

Data sources

Data sources include CRM objects (e.g., contacts, custom objects), assets (e.g., website pages, emails), and activities (e.g., conversations, sales) you want to report on. You can select up to five per dataset.

The primary data source anchors the dataset, with all other sources linking to it. HubSpot connects them using the shortest path, automatically joining related data like contacts and deals without extra steps.

Other data sources require intermediary connections for integration. For example, if Deals is the primary source and blog post data is needed, HubSpot links them via Contacts and Web activities, automatically selecting the necessary sources.

Additional data sources

You can add secondary data sources to your dataset to consolidate information into a single view. Some sources will integrate automatically, while others may require additional input to establish a connection.

Automatic joins

Automatic joins are available for common HubSpot data combinations (e.g., Companies + Deals) and integrations with sync apps (e.g., Contacts + Stripe Invoices).

When selecting these source pairs, rows are automatically matched using a default column, typically Email Address or Company Domain.

To customize the matching column, you can manage the join in the source panel.

Custom joins

When integrating data from sources with flexible column structures—such as Google Sheets, AirTable, CSVs, and warehouses—you must provide additional details to enable data merging.

If a custom join is required, you'll need to select matching columns from each source. This ensures that data aligns correctly, using common identifiers such as email addresses, IDs, or company domains.

Syntax

Within a function, you can use data from properties and fields or literal data. Property and field data will be dynamic based on the individual data sources, while literal data is constant. For example:

  • 2021-03-05 is a literal date, which is constant.
  • [CONTACT.createdate] is a property-based date, which is dynamic for each contact record.

Functions can include both literal and property/field data, as long as the data types are compatible with the function’s required arguments. For example:

DATEDIFF("MONTH", "2021-01-01", “[CONTACT.createdate]”)

Below, learn about the syntax for literals and property/field data, and how to incorporate them into formulas.

Literal syntax

Use literals to add specific, static strings of text, numbers, true or false values, and dates in your calculations.

  • String literal: text surrounded by quotation marks. For example, “My cool string”.
  • Number literal: numbers without quotation marks. For example, 42.
  • Boolean literal: true or false.
  • Date literal: a string formatted as "YYY-MM-DD" , or a date timestamp number (e.g, 1635715904). 

Property syntax

Property references allow you to directly include values from the properties of your selected data sources. You don't need to add the property as a dataset field to reference it. 

Use the following syntax when referencing properties:

  • Reference expressions are always surrounded by square brackets ( [ and ] ).
  • Property references specify the object or event name, followed by a period and the internal property name. For example:
    • [CONTACT.lifecyclestage]
    • [COMPANY.name]
    • [e_hs_scheduled_email_v2.__hs_event_native_timestamp]

Field reference syntax

You can reference fields in a formula by surrounding the field name with square brackets. For example:

    • [Field 1]
    • [My awesome custom field]

You can reference fields in formulas as long as the formula’s operators and functions accept the field’s data type. For example, if you create a new field that contains a string, you can reference the field in a function that accepts strings:

  • If Field 1 is [DEAL.name], it contains a string value (the deal's name).
  • CONCAT([Field 1], "Q4") would be valid because it contains two string values.
  • CONCAT([DEAL.name], 2012) would not be valid because it contains both a string and a number value.

Operators

You can use operators with literal and property/field values, and operators are evaluated in the standard PEMDAS order of operations. This allows you to nest operators using parentheses. For example:

  • Adding a number with a property reference: 1 + [DEAL.amount]
  • Using parentheses to nest operations: (1 + 2) * (3 + 4)
Operator Description Example usage

+

Add numbers. Returns a number.

1 + 1
= 2

[DEAL.amount] + 100

-

Subtract numbers. Returns a number.

100 - 1
= 99

EXP(1) - EXP(1)
= 0

WEEKNUM([DEAL.closedate]) - WEEKNUM([DEAL.createdate])

*

Multiply numbers. Returns a number.

2 * 2
= 4

POW(10, 2) * -1
= -100

[DEAL.amount] * 0.5

/

Divide numbers. Returns a number.

10/ 2
= 5

[DEAL.amount] / DATEDIFF("DAY", [DEAL.createdate], [DEAL.closedate])

-

Negates a number.

-100
= -100

-[DEAL.amount]

AND or &&

Checks if both of two boolean values are true. Returns a boolean.

true AND false
= false

CONTAINS(“HubSpot”, “Hub”) && CONTAINS(“HubSpot”, “Spot”)
= true

OR or ||

Check if either of two boolean values are true. Returns a boolean.

true OR false
= true

CONTAINS(“HubSpot”, “Hub”) || CONTAINS(“HubSpot”, “CRM”)
= true

!

Negatives a boolean value. Returns another boolean value.

!true
= false

= or ==

Equality operator. Returns a boolean value.

=true
= true

IF

IF logic is a set of rules performed if a certain condition is met. You can use IF logic to differentiate data. For example, you can use IF logic to:

  • Calculate different commissions based on the size of a deal (i.e. give a higher % for a bigger deal).
  • Bucket deals into tiers for analysis and action in your reports.
  • Translate feedback responses into categories (e.g. label 1-6 is detractor).
  • Define contact priority based on how many days they’ve been marked as lead.
     

LABEL

The LABEL function converts enumeration property interval values into user-friendly values. Some HubSpot defined properties, such as Deal and Contact owner are represented as internal values. This make analysis difficult. When used with HubSpot's defined properties that support translation, the LABEL function will provide the translation based on portal settings, not user settings. For example, you can use the LABEL function to:

  • Access contact or deal stage names directly in fields

LABEL([DEAL.dealstage]) = "Closed Won"(10)

  • Reference HubSpot owners by name directly in fields

LABEL([DEAL.hubspot_owner_id]) = "John Smith"


Numerical functions

Function Definition Arguments Example usage

ABS

Compute the absolute value of a number. Returns a number.

ABS(number)

number: the number to take the absolute value of.

ABS(-10)
= 10

ABS(10)
= 10

CEIL

Round a decimal value up to the nearest integer. Returns a number.

CEIL(number)

number: the number to take the ceiling of.

CEIL(3.14)
= 4

CEIL(EXP(1))
= 3

CEIL(LN([DEAL.amount]))

DIV0

Divide a number, but return zero when the divisor is zero. Returns a number.

DIV0(dividend, divisor)

dividend: the number to use as a dividend in the division operation.

divisor: the number to use as a divisor in the division operation, with zero resulting in a zero value overall

DIV0(5, 2)
= 2.5

DIV0(5, 0)
= 0

DIV0([DEAL.amount], DATEDIFF("DAY", [DEAL.createdate], [DEAL.closedate]))

EXP

Computer Euler's number raised to a value. Returns a number.

EXP(exponent)

exponent: the exponent to raise Euler’s number to.

EXP(1)
= 2.718281828459045

EXP(0)
= 1

FLOOR

Round a decimal value down to the nearest integer. Returns a number.

FLOOR(number)

number: the exponent to raise Euler’s number to.

FLOOR(3.14)
= 3

CEIL(EXP(1))
= 2

FLOOR(LN([DEAL.amount]))

LN

Compute the natural logarithm of a value. Returns a number.

LN(number)

number: the value to take the natural logarithm of.

LN(1)
= 0

LN(EXP(1))
= 1

LN([DEAL.amount])

LOG

Compute the logarithm of a value within a specified base. Returns a number.

LOG(base, value)

base: the base to be used in the logarithm calculation of the value.

value: the number to take the logarithm of.

LOG(10, 1)
= 0

LOG(10, 10)
= 1

LOG(10, [DEAL.amount])

POWER

Raise a base value to a specified power. Returns a number.

POWER(base, exponent)

base: the number to compute the power of.

exponent: the number to raise the base by.

POWER(2, 10)
= 1024

POWER(100, 0.5)
= 10

POWER([DEAL.hs_arr], 2)

SQRT

Take the square root of a non-negative number. Returns a number.

SQRT(number)

number: the number to take the square root of.

SQRT(100)
= 10

SQRT([DEAL.hs_arr])

WIDTH_BUCKET

Bin numerical values into buckets of equal width. Returns the number of the bucket that the value falls into.

If returned value is below the minimum, returns zero. If returned value is above the maximum, returns the number of buckets +1.

WIDTH_BUCKET(value, minValue, maxValue, bucketCount)

value: the number to compute in the bin number.

minValue: the minimum value to start binning from.

maxValue: the maximum value to bin to.

bucketCount: the desired number of equal-width buckets to bin values into between minValue and maxValue. 

WIDTH_BUCKET(25, 0, 100, 10)
= 3

WIDTH_BUCKET(95, 0, 100, 10)
= 10

WIDTH_BUCKET(-1000, 0, 100, 10)
= 0

WIDTH_BUCKET(9999, 0, 100, 10)
= 11

WIDTH_BUCKET([DEAL.amount], 0, 10000, 1000)

String functions

Function Definition Arguments Example usage

CONTAINS

Determine whether a string contains a case sensitive substring. Returns a boolean value.

CONTAINS("string", "substring")

string: the string value to test.

substring: the value to check for inside the string.

CONTAINS("HubSpot", "Hub")
= true

CONTAINS("foo", "bar")
= false

CONTAINS([CONTACT.firstname], "Mike")

CONCAT

Concatenate two strings. Returns a string.

CONCAT("string1", "string2")

string1: the string value to which string2 will be appended.

string2: the string value to append to string1.

CONCAT("Hub", "Spot")
= "HubSpot"

CONCAT([CONTACT.firstname], CONCAT(" ", [CONTACT.lastname]))

LENGTH

Compute the length of a string. Returns a number.

LENGTH("string")

string: the string value to calculate the length of.

LENGTH("HubSpot")
= 7

LENGTH([FEEDBACK_SUBMISSION.hs_content])

TRIM

Remove leader and trailing whitespace from a string. Returns a string.

TRIM(" string ")

string: the string value to trim whitespace from.

TRIM(" Cats are great ")
= "Cats are great"

Date functions

Function Definition Arguments Example usage

DATE_FROM_PARTS

Create a date value from year, month, and day parts. Returns a date.

DATE_FROM_PARTS(year, month, day)

year: the year part of the desired date.

month: the month part of the desired date.

day: the day part of the desired date.

DATE_FROM_PARTS(2021, 1, 1)
= 2021-01-01

DATEDIFF

Return the number of time units between the first date value and the second for a specified time unit. Returns a number.

DATEDIFF(“datePart”, “date1”, “date2”)

datePart: the year, quarter, month, week, or day unit to use in the difference calculation. 

date1: the starting date value to subtract from date2.

date2: the ending date value that date1 will be subtracted from.

DATEDIFF("DAY", "2021-01-01", "2021-02-01")
= 31

DATEDIFF("MONTH", "2021-01-01", DATE_FROM_PARTS(2021, 2, 1))
= 1

DATEDIFF("QUARTER", [DEAL.createdate], [DEAL.closedate])

DATEPART

Extract the year, quarter, month, week, or day from a date value. Returns a number.

DATEPART(“datePart”, “date”)

datePart: the year, quarter, month, week, or day unit to extract.

date: the date value to extract a date part from.

DATEPART("DAY", "2021-03-15")
= 15

DATEPART("MONTH", DATE_FROM_PARTS(2021, 3, 15))
= 3

DATEPART("YEAR", [DEAL.createdate])

DATETRUNC

Truncate a date value to its year, quarter, month, week, or day.

DATETRUNC(“datePart”, “date”)

datePart: the year, quarter, month, week, or day unit to truncate to.

date: the date value to truncate.

DATETRUNC("YEAR", DATE_FROM_PARTS(2021, 3, 15))
= 2021-01-01

DATETRUNC("MONTH", "2021-03-15")
= 2021-03-01

DATETRUNC("DAY", [e_visited_page.__hs_event_native_timestamp])

TIMESTAMP_FROM_PARTS

Create a timestamp value from year, month, day, hour, minute, and second parts. Returns a datetime value.

TIMESTAMP_FROM_PARTS(year, month, day)

year: the year part of the desired date.

month: the month part of the desired date.

day: the day part of the desired day. 

TIMESTAMP_FROM_PARTS(2021, 1, 1)
= 2021-01-01

WEEKNUM

Calculate the week number within a year for a date. Returns a number.

WEEKNUM(“date”)

date: the datetime value with which to compute the week number within a year.

WEEKNUM("2021-03-15")
= 11

WEEKNUM(“1609459200”)
= 11

WEEKNUM(“[deal.createdate]”)

NOW

Returns the current time based on your account's timezone. Returns a datetime value.

NOW()

 

NOW()
= 1633611966314

WORKINGDAYS

Returns the number of week days (Monday - Friday) between two dates.

WORKINGDAYS(value1, value2)

value1: the starting datetime value.

value 2: the ending datetime value. 

WORKINGDAYS("2022-01-01", "2022-01-31")
= 21

WORKINGDAYS("1640995200", "2022-01-31")
= 21

WORKINGDAYS([DEAL.createdate], NOW())

Was this article helpful?
This form is used for documentation feedback only. Learn how to get help with HubSpot.