Skip to content

Create and use datasets

Last updated: February 14, 2024

Available with any of the following subscriptions, except where noted:

Operations Hub   Enterprise

A dataset is a collection of data from across your HubSpot account that can then be used in custom reports. A dataset can include properties for CRM objects and HubSpot assets, along with formulas to calculate your data as needed. For example, you can create a field to calculate annual recurring revenue based on the deal amount property. 

Creating multiple datasets for your teams means that report creators won’t have to select their data sources every time they need to create a report. A dataset can also be updated after creation, making it possible to update all reports using that dataset at the same time.

Below, learn how to create a dataset, how to use the dataset in the report, and the definitions for available functions within the datasets tool.

Create a dataset

To create a dataset:

  • In your HubSpot account, navigate to Reports > Data Management Datasets.
  • In the upper right, click Create dataset. Or to create a dataset using a template, browse the available templates, then click Use template

Data sources

First, select the data sources to include in your dataset. Data sources are the objects, assets, and events that you want to report on. This includes all CRM objects such as contacts or custom objects, and assets such as website pages and emails, conversations, sales activities, and more. You can select up to 5 data sources per dataset.

The primary data source will be the focus of the dataset, with all other data sources relating to that primary source. To connect these data sources, HubSpot joins the data in the background using the shortest possible path. Contacts and deals, for example, are directly related and can be selected with no extra joins needed. 

However, other data sources cannot be associated directly and require additional sources to join the data together. For example, if your primary source is Deals and you want to include blog post data in the report, HubSpot can only link those sources through the Contacts and Web activities sources. These other sources will be selected automatically to join the data.

  • To select the primary data source, click the Primary data source dropdown menu, then select a data source

    datasets-select-data-source
  • Continue selecting additional data sources. In the right panel, you can view the relationships between the current selected data sources.
  • As you select sources, view the Preview pane at the bottom of the screen for a preview of your data. 
  • After selecting your data sources, in the upper right, click Next.

Add properties and fields

Select the fields to include in the dataset. You can add existing HubSpot properties to the dataset, as well as custom formula fields.

Add properties

  • To add properties to the dataset, drag and drop the properties from the left sidebar into the Dataset fields section.

datasets-add-properties

  • To rename a property or view a preview of its data, click the property under Dataset fields, then enter a new name into the name field on the right. Renaming a property will update its name only in this dataset. This allows you to customize how these fields appear in the report builder, but will not impact the name of the existing property.

Add formula fields

Formula fields are specific to the dataset and can be used to calculate values based on properties in the dataset. Learn more about building formulas using flexible expressions.

  • To create a formula field, click Create field > Formula field

    create-formula-field-datasets
  • On the bottom, enter a name for the field.
  • Enter your formula. You can reference properties that you've added to the dataset as well as other HubSpot properties outside of the dataset, and use functions to calculate based on property and literal data. Learn more about formula syntax and definitions below.
    • To manually enter your formula, begin typing in the Formula field. By default, HubSpot will display auto-complete options as you enter text.

      auto-complete-formula-datasets
    • To insert a property you've added to the dataset, click the Dataset field dropdown menu, then select the property.
    • To insert a property that isn't in the dataset, click the HubSpot property dropdown menu, then select the property.
    • To insert a function, click the Function dropdown menu, then select a function.
    • To insert a snippet, click the Snippet dropdown menu, then select a pre-built formula. Snippets are populated based on the sources included in the dataset. Therefore different snippets will be available for datasets made from the contacts data source compared to the deals datasource.
dataset-snippets
  • As you create formulas, the Formula field will display any issues it detects. When a formula is invalid, an indicator will display. Click [X] issues to view error details.
     view-dataset-issues
  • Once you've set up your fields, click Review in the upper right.

Add conditional fields

Conditional fields enable users to group or bucket data based set conditions. These fields can only be used in datasets or the custom report builder. You can use conditional fields to calculate different commissions based on the size of a deal or translate feedback responses in to categories, for example, a label between 1-6 is labeled as a detractor.

The conditional field allows you to build a formula using the IF() function. Any fields built using a conditional field can be recreated using an IF() function in a standard formula creation field.

  • To create a conditional field, click Create field > Conditional formula field
create-formula-field-datasets
  • At the bottom, enter a name for the field. This name will be displayed when building a report using this segment.
  • In the Condition section, enter your IF condition. For each row of data, if the condition set in this section is true, the row will be labeled by the value set in the Then field. If the condition is false, the row will be labelled by the value set in the Default value field. You can reference properties that you've added to the dataset as well as other HubSpot properties outside of the dataset, and use functions to calculate based on property and literal data. Learn more about formula syntax and definitions below.
    • To manually enter your formula, begin typing in the IF field. By default, HubSpot will display auto-complete options as you enter text. 

      enter-condition
    • To insert a property you've added to the dataset, click the Dataset field dropdown menu, then select the property.
    • To insert a property that isn't in the dataset, click the HubSpot property dropdown menu, then select the property.
    • To insert a function, click the Function dropdown menu, then select a function.

      insert-property-field-function
    • In the Then field, enter the value you want to assign if the condition is true.
  • To add another conditional field, click Add condition block. Conditional blocks are processed sequentially, if the first condition is true the value defined in that block will be set. If it is not true, HubSpot will go to the next block and so on. Any rows that do not meet the conditions set will be assigned the Default value.

    add-conditional-block-datasets
  • In the Default section, enter a value for Else if the condition is not met.

    enter-else-value
  • Once you've set up your fields, click Review in the upper right.
For example, to use conditional fields to evaluate your customers by their subscription level, create three conditional blocks. 

example-conditional-field

Filter

Refine your data further by adding filters to your fields.

To add a filter:

  • Navigate to the Filters tab.
  • From the left sidebar, click and drag fields.
  • Click a field to view its filter options. Select a filter, then click Apply.
  • You can group filters together by clicking a field, then clicking the Group with another filter dropdown menu. Select another active filter, then click Apply.

datasets-group-filters0

    • You can change how the filters are used by clicking the Include data if it matches dropdown menu:
      • All of the filters below: data must meet all of the filters set.
      • Any of the filters below: data can be included if it meets one or more of the filters set.
      • Custom filter rules: write your own filter rules. You can use boolean expressions like "1 and (2 or 3)".
    • To ungroup filters, click a filter, then click the Group with another filter dropdown menu. Select None, then click Apply.
  • After setting up your filters, click Next.

Review

Review your dataset before saving it. 

  • Under Sources, view the data sources you’ve selected. 
  • Under Fields, view the fields in the dataset. This includes:
    • Field: the name of the field.
    • Derived: whether the field is a standard HubSpot field or a custom calculated field.
    • Data type: the type of data contained in the field.
    • Input: the field’s expression.
    • Source: the source of the data (e.g., Contacts).
  • In the Preview panel, view the preview of your data. You can click View table relationships to see how the data is connected.
  • After reviewing your data, save the dataset by clicking Save.
  • In the right panel, enter a name and description for the dataset.
  • Click Apply.

You’ll then be brought to the report builder where you can create a report based on your dataset.

View and manage datasets

On the datasets dashboard, you can view and edit existing datasets.

  • In your HubSpot account, navigate to Reports > Data Management, then select Datasets.
  • You can filter your existing datasets by using the filters at the top of the table.
  • To edit a dataset, hover over the dataset, then click Edit. You’ll then be brought to the dataset details page.
    • On the Preview tab, view a preview of the dataset’s data. 
    • On the Metadata tab, view the data sources and fields included in the dataset. This includes:
      • Field: the name of the field.
      • Derived: whether the field is a standard HubSpot field or a custom calculated field.
      • Data type: the type of data contained in the field.
      • Input: the string used to bring data into the field.
      • Source: the source of the data (e.g., Contacts).
    • On the Reports tab, view the reports that’ve been created using the dataset.
    • From the datasets details page, you can also create a new report using the dataset by clicking Build report.

Create a report using a dataset

Once a dataset has been created, you can create a report based on the dataset either from the report builder or from the datasets tool.

  • To create a report from within a dataset:
  • To create a report from the custom report builder:
    • In your HubSpot account, navigate to Reporting & Data > Reports.
    • In the upper right, click Create report.
    • Select Custom Report Builder.
    • At the top, click the Datasets tab to view your existing datasets. 
    • Select the dataset you want to use, then click Next.

Reference

Syntax

Within a function, you can use data from properties and fields or literal data. Property and field data will be dynamic based on the individual data sources, while literal data is constant. For example:

  • 2021-03-05 is a literal date, which is constant.
  • [CONTACT.createdate] is a property-based date, which is dynamic for each contact record.

Functions can include both literal and property/field data, as long as the data types are compatible with the function’s required arguments. For example:

DATEDIFF("MONTH", "2021-01-01", “[CONTACT.createdate]”)

Below, learn about the syntax for literals and property/field data, and how to incorporate them into formulas.

Literal syntax

Use literals to add specific, static strings of text, numbers, true or false values, and dates in your calculations.

  • String literal: text surrounded by quotation marks. For example, “My cool string”.
  • Number literal: numbers without quotation marks. For example, 42.
  • Boolean literal: true or false.
  • Date literal: a string formatted as "YYY-MM-DD" , or a date timestamp number (e.g, 1635715904). 

Property syntax

Property references allow you to directly include values from the properties of your selected data sources. You don't need to add the property as a dataset field to reference it. 

Use the following syntax when referencing properties:

  • Reference expressions are always surrounded by square brackets ( [ and ] ).
  • Property references specify the object or event name, followed by a period and the internal property name. For example:
    • [CONTACT.lifecyclestage]
    • [COMPANY.name]
    • [e_hs_scheduled_email_v2.__hs_event_native_timestamp]

Field reference syntax

You can reference fields in a formula by surrounding the field name with square brackets. For example:

    • [Field 1]
    • [My awesome custom field]

You can reference fields in formulas as long as the formula’s operators and functions accept the field’s data type. For example, if you create a new field that contains a string, you can reference the field in a function that accepts strings:

  • If Field 1 is [DEAL.name], it contains a string value (the deal's name).
  • CONCAT([Field 1], "Q4") would be valid because it contains two string values.
  • CONCAT([DEAL.name], 2012) would not be valid because it contains both a string and a number value.

Operators

You can use operators with literal and property/field values, and operators are evaluated in the standard PEMDAS order of operations. This allows you to nest operators using parentheses. For example:

  • Adding a number with a property reference: 1 + [DEAL.amount]
  • Using parentheses to nest operations: (1 + 2) * (3 + 4)
Operator Description Example usage

+

Add numbers. Returns a number.

1 + 1
= 2

[DEAL.amount] + 100

-

Subtract numbers. Returns a number.

100 - 1
= 99

EXP(1) - EXP(1)
= 0

WEEKNUM([DEAL.closedate]) - WEEKNUM([DEAL.createdate])

*

Multiply numbers. Returns a number.

2 * 2
= 4

POW(10, 2) * -1
= -100

[DEAL.amount] * 0.5

/

Divide numbers. Returns a number.

10/ 2
= 5

[DEAL.amount] / DATEDIFF("DAY", [DEAL.createdate], [DEAL.closedate])

-

Negates a number.

-100
= -100

-[DEAL.amount]

AND or &&

Checks if both of two boolean values are true. Returns a boolean.

true AND false
= false

CONTAINS(“HubSpot”, “Hub”) && CONTAINS(“HubSpot”, “Spot”)
= true

OR or ||

Check if either of two boolean values are true. Returns a boolean.

true OR false
= true

CONTAINS(“HubSpot”, “Hub”) || CONTAINS(“HubSpot”, “CRM”)
= true

!

Negatives a boolean value. Returns another boolean value.

!true
= false

= or ==

Equality operator. Returns a boolean value.

=true
= true

IF

IF logic is a set of rules performed if a certain condition is met. You can use IF logic to differentiate data. For example, you can use IF logic to:

  • Calculate different commissions based on the size of a deal (i.e. give a higher % for a bigger deal).
  • Bucket deals into tiers for analysis and action in your reports.
  • Translate feedback responses into categories (e.g. label 1-6 is detractor).
  • Define contact priority based on how many days they’ve been marked as lead.
     

LABEL

The LABEL function converts enumeration property interval values into user-friendly values. Some HubSpot defined properties, such as Deal and Contact owner are represented as internal values. This make analysis difficult. When used with HubSpot's defined properties that support translation, the LABEL function will provide the translation based on portal settings, not user settings. For example, you can use the LABEL function to:

  • Access contact or deal stage names directly in fields

LABEL([DEAL.dealstage]) = "Closed Won"(10)

  • Reference HubSpot owners by name directly in fields

LABEL([DEAL.hubspot_owner_id]) = "John Smith"


Numerical functions

Function Definition Arguments Example usage

ABS

Compute the absolute value of a number. Returns a number.

ABS(number)

number: the number to take the absolute value of.

ABS(-10)
= 10

ABS(10)
= 10

CEIL

Round a decimal value up to the nearest integer. Returns a number.

CEIL(number)

number: the number to take the ceiling of.

CEIL(3.14)
= 4

CEIL(EXP(1))
= 3

CEIL(LN([DEAL.amount]))

DIV0

Divide a number, but return zero when the divisor is zero. Returns a number.

DIV0(dividend, divisor)

dividend: the number to use as a dividend in the division operation.

divisor: the number to use as a divisor in the division operation, with zero resulting in a zero value overall

DIV0(5, 2)
= 2.5

DIV0(5, 0)
= 0

DIV0([DEAL.amount], DATEDIFF("DAY", [DEAL.createdate], [DEAL.closedate]))

EXP

Computer Euler's number raised to a value. Returns a number.

EXP(exponent)

exponent: the exponent to raise Euler’s number to.

EXP(1)
= 2.718281828459045

EXP(0)
= 1

FLOOR

Round a decimal value down to the nearest integer. Returns a number.

FLOOR(number)

number: the exponent to raise Euler’s number to.

FLOOR(3.14)
= 3

CEIL(EXP(1))
= 2

FLOOR(LN([DEAL.amount]))

LN

Compute the natural logarithm of a value. Returns a number.

LN(number)

number: the value to take the natural logarithm of.

LN(1)
= 0

LN(EXP(1))
= 1

LN([DEAL.amount])

LOG

Compute the logarithm of a value within a specified base. Returns a number.

LOG(base, value)

base: the base to be used in the logarithm calculation of the value.

value: the number to take the logarithm of.

LOG(10, 1)
= 0

LOG(10, 10)
= 1

LOG(10, [DEAL.amount])

POWER

Raise a base value to a specified power. Returns a number.

POWER(base, exponent)

base: the number to compute the power of.

exponent: the number to raise the base by.

POWER(2, 10)
= 1024

POWER(100, 0.5)
= 10

POWER([DEAL.hs_arr], 2)

SQRT

Take the square root of a non-negative number. Returns a number.

SQRT(number)

number: the number to take the square root of.

SQRT(100)
= 10

SQRT([DEAL.hs_arr])

WIDTH_BUCKET

Bin numerical values into buckets of equal width. Returns the number of the bucket that the value falls into.

If returned value is below the minimum, returns zero. If returned value is above the maximum, returns the number of buckets +1.

WIDTH_BUCKET(value, minValue, maxValue, bucketCount)

value: the number to compute in the bin number.

minValue: the minimum value to start binning from.

maxValue: the maximum value to bin to.

bucketCount: the desired number of equal-width buckets to bin values into between minValue and maxValue. 

WIDTH_BUCKET(25, 0, 100, 10)
= 3

WIDTH_BUCKET(95, 0, 100, 10)
= 10

WIDTH_BUCKET(-1000, 0, 100, 10)
= 0

WIDTH_BUCKET(9999, 0, 100, 10)
= 11

WIDTH_BUCKET([DEAL.amount], 0, 10000, 1000)

String functions

Function Definition Arguments Example usage

CONTAINS

Determine whether a string contains a case sensitive substring. Returns a boolean value.

CONTAINS("string", "substring")

string: the string value to test.

substring: the value to check for inside the string.

CONTAINS("HubSpot", "Hub")
= true

CONTAINS("foo", "bar")
= false

CONTAINS([CONTACT.firstname], "Mike")

CONCAT

Concatenate two strings. Returns a string.

CONCAT("string1", "string2")

string1: the string value to which string2 will be appended.

string2: the string value to append to string1.

CONCAT("Hub", "Spot")
= "HubSpot"

CONCAT([CONTACT.firstname], CONCAT(" ", [CONTACT.lastname]))

LENGTH

Compute the length of a string. Returns a number.

LENGTH("string")

string: the string value to calculate the length of.

LENGTH("HubSpot")
= 7

LENGTH([FEEDBACK_SUBMISSION.hs_content])

TRIM

Remove leader and trailing whitespace from a string. Returns a string.

TRIM(" string ")

string: the string value to trim whitespace from.

TRIM(" Cats are great ")
= "Cats are great"

Date functions

Function Definition Arguments Example usage

DATE_FROM_PARTS

Create a date value from year, month, and day parts. Returns a date.

DATE_FROM_PARTS(year, month, day)

year: the year part of the desired date.

month: the month part of the desired date.

day: the day part of the desired date.

DATE_FROM_PARTS(2021, 1, 1)
= 2021-01-01

DATEDIFF

Return the number of time units between the first date value and the second for a specified time unit. Returns a number.

DATEDIFF(“datePart”, “date1”, “date2”)

datePart: the year, quarter, month, week, or day unit to use in the difference calculation. 

date1: the starting date value to subtract from date2.

date2: the ending date value that date1 will be subtracted from.

DATEDIFF("DAY", "2021-01-01", "2021-02-01")
= 31

DATEDIFF("MONTH", "2021-01-01", DATE_FROM_PARTS(2021, 2, 1))
= 1

DATEDIFF("QUARTER", [DEAL.createdate], [DEAL.closedate])

DATEPART

Extract the year, quarter, month, week, or day from a date value. Returns a number.

DATEPART(“datePart”, “date”)

datePart: the year, quarter, month, week, or day unit to extract.

date: the date value to extract a date part from.

DATEPART("DAY", "2021-03-15")
= 15

DATEPART("MONTH", DATE_FROM_PARTS(2021, 3, 15))
= 3

DATEPART("YEAR", [DEAL.createdate])

DATETRUNC

Truncate a date value to its year, quarter, month, week, or day.

DATETRUNC(“datePart”, “date”)

datePart: the year, quarter, month, week, or day unit to truncate to.

date: the date value to truncate.

DATETRUNC("YEAR", DATE_FROM_PARTS(2021, 3, 15))
= 2021-01-01

DATETRUNC("MONTH", "2021-03-15")
= 2021-03-01

DATETRUNC("DAY", [e_visited_page.__hs_event_native_timestamp])

TIMESTAMP_FROM_PARTS

Create a timestamp value from year, month, day, hour, minute, and second parts. Returns a datetime value.

TIMESTAMP_FROM_PARTS(year, month, day)

year: the year part of the desired date.

month: the month part of the desired date.

day: the day part of the desired day. 

TIMESTAMP_FROM_PARTS(2021, 1, 1)
= 2021-01-01

WEEKNUM

Calculate the week number within a year for a date. Returns a number.

WEEKNUM(“date”)

date: the datetime value with which to compute the week number within a year.

WEEKNUM("2021-03-15")
= 11

WEEKNUM(“1609459200”)
= 11

WEEKNUM(“[deal.createdate]”)

NOW

Returns the current time based on your account's timezone. Returns a datetime value.

NOW()

 

NOW()
= 1633611966314

WORKINGDAYS

Returns the number of week days (Monday - Friday) between two dates.

WORKINGDAYS(value1, value2)

value1: the starting datetime value.

value 2: the ending datetime value. 

WORKINGDAYS("2022-01-01", "2022-01-31")
= 21

WORKINGDAYS("1640995200", "2022-01-31")
= 21

WORKINGDAYS([DEAL.createdate], NOW())

Was this article helpful?
This form is used for documentation feedback only. Learn how to get help with HubSpot.