Create and use datasets
Last updated: October 24, 2024
Available with any of the following subscriptions, except where noted:
Marketing Hub Professional |
Sales Hub Professional |
Service Hub Professional |
Operations Hub Professional , Enterprise |
Content Hub Professional |
A dataset is a collection of data from across your HubSpot account that can then be used in custom reports. A dataset can include properties for CRM objects and HubSpot assets, along with formulas to calculate your data as needed. For example, you can create a field to calculate annual recurring revenue based on the deal amount property.
Access and use prebuilt datasets
- In your HubSpot account, navigate to Data Management > Datasets.
- Hover over the name of the dataset you want to use and click Build report.
- To edit a dataset, click Edit.
- To export the dataset, click Actions, then click Export.
Create a custom dataset (Operations Hub Professional and Enterprise only)
Creating multiple datasets for your teams means that report creators won’t have to select their data sources every time they need to create a report. A dataset can also be updated after creation, making it possible to update all reports using that dataset at the same time.
Below, learn how to create a dataset, how to use the dataset in the report, and the definitions for available functions within the datasets tool.
To create a dataset:
- In your HubSpot account, navigate to Data Management > Datasets.
- In the upper right, click Create dataset. Or to create a dataset using a template, browse the available templates, then click Use template.
Data sources
First, select the data sources to include in your dataset. Data sources are the objects, assets, and events that you want to report on. This includes all CRM objects such as contacts or custom objects, and assets such as website pages and emails, conversations, sales activities, and more. You can select up to 5 data sources per dataset.
The primary data source will be the focus of the dataset, with all other data sources relating to that primary source. To connect these data sources, HubSpot joins the data in the background using the shortest possible path. Contacts and deals, for example, are directly related and can be selected with no extra joins needed.
However, other data sources cannot be associated directly and require additional sources to join the data together. For example, if your primary source is Deals and you want to include blog post data in the report, HubSpot can only link those sources through the Contacts and Web activities sources. These other sources will be selected automatically to join the data.
- To select the primary data source, click Add primary data source, then select a data source.
- Click Continue.
- Continue selecting additional data sources. In the right panel, you can view the relationships between the current selected data sources.
- As you select sources, view the Preview pane at the bottom of the screen for a preview of your data.
- After selecting your data sources, in the upper right, click Next.
Add properties and fields
Select the fields to include in the dataset. You can add existing HubSpot properties to the dataset, as well as custom formula fields.
Add properties
-
To add properties to the dataset, drag and drop the properties from the left sidebar into the Dataset fields section.
- To rename a property or view a preview of its data, click the property under Dataset fields, then enter a new name into the name field on the right. Renaming a property will update its name only in this dataset. This allows you to customize how these fields appear in the report builder, but will not impact the name of the existing property.
Add formula fields
Formula fields are specific to the dataset and can be used to calculate values based on properties in the dataset. Learn more about building formulas using flexible expressions.
- To create a formula field, click Add derived field > Formula field.
- At the bottom, enter a name for the field.
- Enter your formula. You can reference properties that you've added to the dataset as well as other HubSpot properties outside of the dataset, and use functions to calculate based on property and literal data. Learn more about formula syntax and definitions below.
- To manually enter your formula, begin typing in the Formula field. By default, HubSpot will display auto-complete options as you enter text.
- To insert a property you've added to the dataset, click the Dataset field dropdown menu, then select the property.
- To insert a property that isn't in the dataset, click the HubSpot property dropdown menu, then select the property.
- To insert a function, click the Function dropdown menu, then select a function.
- To insert a snippet, click the Snippet dropdown menu, then select a pre-built formula. Snippets are populated based on the sources included in the dataset. Therefore different snippets will be available for datasets made from the contacts data source compared to the deals datasource.
- To manually enter your formula, begin typing in the Formula field. By default, HubSpot will display auto-complete options as you enter text.
- As you create formulas, the Formula field will display any issues it detects. When a formula is invalid, an indicator will display. Click [X] issues to view error details.
- Once you've set up your fields, click Review in the upper right.
Add conditional fields
Conditional fields enable users to group or bucket data based set conditions. These fields can only be used in datasets or the custom report builder. You can use conditional fields to calculate different commissions based on the size of a deal or translate feedback responses in to categories, for example, a label between 1-6 is labeled as a detractor.
The conditional field allows you to build a formula using the IF() function. Any fields built using a conditional field can be recreated using an IF() function in a standard formula creation field.
- To create a conditional field, click Add derived field > Conditional formula field.
- At the bottom, enter a name for the field. This name will be displayed when building a report using this segment.
- In the Condition section, enter your IF condition. For each row of data, if the condition set in this section is true, the row will be labeled by the value set in the Then field. If the condition is false, the row will be labelled by the value set in the Default value field. You can reference properties that you've added to the dataset as well as other HubSpot properties outside of the dataset, and use functions to calculate based on property and literal data. Learn more about formula syntax and definitions below.
- To manually enter your formula, begin typing in the IF field. By default, HubSpot will display auto-complete options as you enter text.
- To insert a property you've added to the dataset, click the Dataset field dropdown menu, then select the property.
- To insert a property that isn't in the dataset, click the HubSpot property dropdown menu, then select the property.
- To insert a function, click the Function dropdown menu, then select a function.
- In the Then field, enter the value you want to assign if the condition is true.
- To manually enter your formula, begin typing in the IF field. By default, HubSpot will display auto-complete options as you enter text.
- To add another conditional field, click Add condition block. Conditional blocks are processed sequentially, if the first condition is true the value defined in that block will be set. If it is not true, HubSpot will go to the next block and so on. Any rows that do not meet the conditions set will be assigned the Default value.
- In the Default section, enter a value for Else if the condition is not met.
- Once you've set up your fields, click Review in the upper right.
Filter
Refine your data further by adding filters to your fields.
To add a filter:
- Navigate to the Filters tab.
- From the left sidebar, click and drag fields.
- Click a field to view its filter options. Select a filter, then click Apply.
- You can group filters together by clicking a field, then clicking the Group with another filter dropdown menu. Select another active filter, then click Apply.
-
- You can change how the filters are used by clicking the Include data if it matches dropdown menu:
- All of the filters below: data must meet all of the filters set.
- You can change how the filters are used by clicking the Include data if it matches dropdown menu:
-
-
- Any of the filters below: data can be included if it meets one or more of the filters set.
- Custom filter rules: write your own filter rules. You can use boolean expressions like "1 and (2 or 3)".
-
-
- To ungroup filters, click a filter, then click the Group with another filter dropdown menu. Select None, then click Apply.
- After setting up your filters, click Next.
Review
Review your dataset before saving it.
- Under Sources, view the data sources you’ve selected.
- Under Fields, view the fields in the dataset. This includes:
- Field: the name of the field.
- Derived: whether the field is a standard HubSpot field or a custom calculated field.
- Data type: the type of data contained in the field.
- Input: the field’s expression.
- Source: the source of the data (e.g., Contacts).
- In the Preview panel, view the preview of your data. You can click View table relationships to see how the data is connected.
- After reviewing your data, save the dataset by clicking Save.
- In the right panel, enter a name and description for the dataset.
- Click Apply.
You’ll then be brought to the report builder where you can create a report based on your dataset.
View and manage datasets
On the datasets dashboard, you can view and edit existing datasets.
- In your HubSpot account, navigate to Reports > Data Management, then select Datasets.
- You can filter your existing datasets by using the filters at the top of the table.
- To edit a dataset, hover over the dataset, then click Edit. You’ll then be brought to the dataset details page.
- On the Preview tab, view a preview of the dataset’s data.
- On the Metadata tab, view the data sources and fields included in the dataset. This includes:
- Field: the name of the field.
- Derived: whether the field is a standard HubSpot field or a custom calculated field.
- Data type: the type of data contained in the field.
- Input: the string used to bring data into the field.
- Source: the source of the data (e.g., Contacts).
- On the Reports tab, view the reports that’ve been created using the dataset.
-
From the datasets details page, you can also create a new report using the dataset by clicking Build report.
Create a report using a dataset
Once a dataset has been created, you can create a report based on the dataset either from the report builder or from the datasets tool.
- To create a report from within a dataset:
- In your HubSpot account, navigate to Reports > Data Management, then select Datasets.
- Click the name of a dataset.
- In the upper right, click Build report.
- Continue building your report in the custom report builder.
- To create a report from the custom report builder:
- In your HubSpot account, navigate to Reporting > Reports.
- In the upper right, click Create report.
- Select Custom Report Builder.
- At the top, click the Datasets tab to view your existing datasets.
- Select the dataset you want to use, then click Next.
Reference
Syntax
Within a function, you can use data from properties and fields or literal data. Property and field data will be dynamic based on the individual data sources, while literal data is constant. For example:
2021-03-05
is a literal date, which is constant.[CONTACT.createdate]
is a property-based date, which is dynamic for each contact record.
Functions can include both literal and property/field data, as long as the data types are compatible with the function’s required arguments. For example:
DATEDIFF("MONTH", "2021-01-01", “[CONTACT.createdate]”)
Below, learn about the syntax for literals and property/field data, and how to incorporate them into formulas.
Literal syntax
Use literals to add specific, static strings of text, numbers, true or false values, and dates in your calculations.
- String literal: text surrounded by quotation marks. For example,
“My cool string”
.
- Number literal: numbers without quotation marks. For example,
42
. - Boolean literal:
true
orfalse
. - Date literal: a string formatted as
"YYY-MM-DD"
, or a date timestamp number (e.g,1635715904
).
Property syntax
Property references allow you to directly include values from the properties of your selected data sources. You don't need to add the property as a dataset field to reference it.
Use the following syntax when referencing properties:
- Reference expressions are always surrounded by square brackets (
[
and]
). - Property references specify the object or event name, followed by a period and the internal property name. For example:
[CONTACT.lifecyclestage]
[COMPANY.name]
[e_hs_scheduled_email_v2.__hs_event_native_timestamp]
Field reference syntax
You can reference fields in a formula by surrounding the field name with square brackets. For example:
-
[Field 1]
[My awesome custom field]
You can reference fields in formulas as long as the formula’s operators and functions accept the field’s data type. For example, if you create a new field that contains a string, you can reference the field in a function that accepts strings:
- If Field 1 is
[DEAL.name]
, it contains a string value (the deal's name). CONCAT([Field 1], "Q4")
would be valid because it contains two string values.CONCAT([DEAL.name], 2012)
would not be valid because it contains both a string and a number value.
Operators
You can use operators with literal and property/field values, and operators are evaluated in the standard PEMDAS order of operations. This allows you to nest operators using parentheses. For example:
- Adding a number with a property reference:
1 + [DEAL.amount]
- Using parentheses to nest operations:
(1 + 2) * (3 + 4)
Operator | Description | Example usage |
|
Add numbers. Returns a number. |
|
|
Subtract numbers. Returns a number. |
|
|
Multiply numbers. Returns a number. |
|
|
Divide numbers. Returns a number. |
|
|
Negates a number. |
|
|
Checks if both of two boolean values are true. Returns a boolean. |
|
|
Check if either of two boolean values are true. Returns a boolean. |
|
|
Negatives a boolean value. Returns another boolean value. |
|
|
Equality operator. Returns a boolean value. |
=true = true |
IF
IF logic is a set of rules performed if a certain condition is met. You can use IF logic to differentiate data. For example, you can use IF logic to:
- Calculate different commissions based on the size of a deal (i.e. give a higher % for a bigger deal).
- Bucket deals into tiers for analysis and action in your reports.
- Translate feedback responses into categories (e.g. label 1-6 is detractor).
- Define contact priority based on how many days they’ve been marked as lead.
LABEL
The LABEL function converts enumeration property interval values into user-friendly values. Some HubSpot defined properties, such as Deal and Contact owner are represented as internal values. This make analysis difficult. When used with HubSpot's defined properties that support translation, the LABEL function will provide the translation based on portal settings, not user settings. For example, you can use the LABEL function to:
-
Access contact or deal stage names directly in fields
LABEL([DEAL.dealstage]) = "Closed Won"(10)
- Reference HubSpot owners by name directly in fields
LABEL([DEAL.hubspot_owner_id]) = "John Smith"
Numerical functions
Function | Definition | Arguments | Example usage |
|
Compute the absolute value of a number. Returns a number. |
number: the number to take the absolute value of. |
|
|
Round a decimal value up to the nearest integer. Returns a number. |
number: the number to take the ceiling of. |
|
|
Divide a number, but return zero when the divisor is zero. Returns a number. |
dividend: the number to use as a dividend in the division operation. divisor: the number to use as a divisor in the division operation, with zero resulting in a zero value overall |
|
|
Computer Euler's number raised to a value. Returns a number. |
exponent: the exponent to raise Euler’s number to. |
|
|
Round a decimal value down to the nearest integer. Returns a number. |
number: the exponent to raise Euler’s number to. |
|
|
Compute the natural logarithm of a value. Returns a number. |
number: the value to take the natural logarithm of. |
|
|
Compute the logarithm of a value within a specified base. Returns a number. |
base: the base to be used in the logarithm calculation of the value. value: the number to take the logarithm of. |
|
|
Raise a base value to a specified power. Returns a number. |
base: the number to compute the power of. exponent: the number to raise the base by. |
|
|
Take the square root of a non-negative number. Returns a number. |
number: the number to take the square root of. |
|
|
Bin numerical values into buckets of equal width. Returns the number of the bucket that the value falls into. If returned value is below the minimum, returns zero. If returned value is above the maximum, returns the number of buckets +1. |
value: the number to compute in the bin number. minValue: the minimum value to start binning from. maxValue: the maximum value to bin to. bucketCount: the desired number of equal-width buckets to bin values into between minValue and maxValue. |
|
String functions
Function | Definition | Arguments | Example usage |
|
Determine whether a string contains a case sensitive substring. Returns a boolean value. |
string: the string value to test. substring: the value to check for inside the string. |
|
|
Concatenate two strings. Returns a string. |
string1: the string value to which string2 will be appended. string2: the string value to append to string1. |
|
|
Compute the length of a string. Returns a number. |
string: the string value to calculate the length of. |
|
|
Remove leader and trailing whitespace from a string. Returns a string. |
string: the string value to trim whitespace from. |
|
Date functions
Function | Definition | Arguments | Example usage |
|
Create a date value from year, month, and day parts. Returns a date. |
year: the year part of the desired date. month: the month part of the desired date. day: the day part of the desired date. |
|
|
Return the number of time units between the first date value and the second for a specified time unit. Returns a number. |
datePart: the year, quarter, month, week, or day unit to use in the difference calculation. date1: the starting date value to subtract from date2. date2: the ending date value that date1 will be subtracted from. |
|
|
Extract the year, quarter, month, week, or day from a date value. Returns a number. |
datePart: the year, quarter, month, week, or day unit to extract. date: the date value to extract a date part from. |
|
|
Truncate a date value to its year, quarter, month, week, or day. |
datePart: the year, quarter, month, week, or day unit to truncate to. date: the date value to truncate. |
|
|
Create a timestamp value from year, month, day, hour, minute, and second parts. Returns a datetime value. |
year: the year part of the desired date. month: the month part of the desired date. day: the day part of the desired day. |
|
|
Calculate the week number within a year for a date. Returns a number. |
date: the datetime value with which to compute the week number within a year. |
|
|
Returns the current time based on your account's timezone. Returns a datetime value. |
|
|
|
Returns the number of week days (Monday - Friday) between two dates. |
value1: the starting datetime value. value 2: the ending datetime value. |
|