Using HubL Filters for Data Normalization and Sanitization

Written by Alyssa Wilie | March 02, 2026

CMS-driven data is rarely authored with a single output in mind. The same value might be entered manually, pulled from a table, or reused across pages, emails, and modules. HubL filters allow us to normalize and sanitize that data at render time so it’s usable, consistent, and safe wherever it’s displayed.

Normalization: Shaping Data for Reuse

"Normalization" is not about correcting data; it is about transforming dissimilar data into a consistent presentation or shaping it for a specific use case. This allows multiple data sources to be used together and reused across different contexts without maintaining separate versions just to change how they render.

Text Transforms

While CSS can handle many text transformations, HubL operates at render time on the server. That means values are transformed before they ever reach the browser. This is important not just for display, but for logic such as filtering, comparisons, and URL parameters.

Case normalization (lower, upper, capitalize) transforms text into a specific letter case to ensure uniform display and avoid case-sensitive mismatches in comparisons and conditionals.
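As a minimal sketch of case normalization in a comparison, assume two hypothetical values, requested_tag and tag_name, that were entered with different casing. Both names and the markup are illustrative only.

```hubl
{# Hypothetical values standing in for editor-entered data #}
{% set requested_tag = "case study" %}
{% set tag_name = "CASE STUDY" %}

{# Lowercase both sides so the comparison ignores letter case #}
{% if requested_tag|lower == tag_name|lower %}
  {# capitalize uppercases the first character for display #}
  <span class="tag">{{ tag_name|capitalize }}</span>
{% endif %}
```

Without the lower filter on both sides, the equality check would treat the two spellings as different values.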

Trimming and whitespace cleanup (trim, truncate, truncatehtml, wordwrap) remove stray leading and trailing whitespace and constrain text length and line width so values render within expected visual and layout boundaries.
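The following sketch shows how these filters might be combined. The summary and body variables are hypothetical, and the lengths are arbitrary choices for illustration.

```hubl
{# Hypothetical editor-entered values #}
{% set summary = "   A long summary entered by an editor, with stray spaces around it.   " %}
{% set body = "<p>Rich text with <strong>nested markup</strong> that should stay well formed.</p>" %}

{# trim removes leading and trailing whitespace #}
<p>{{ summary|trim }}</p>

{# truncate shortens plain text to roughly the given length #}
<p>{{ summary|trim|truncate(40) }}</p>

{# truncatehtml shortens rich text while keeping tags closed #}
{{ body|truncatehtml(30) }}

{# wordwrap inserts line breaks, which are visible inside pre #}
<pre>{{ summary|trim|wordwrap(24) }}</pre>
```

Using truncatehtml rather than truncate on rich text avoids cutting a value off in the middle of an open tag.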

Simple substitutions or removals (cut, replace) remove or swap known substrings such as labels, prefixes, separators, or other text that isn’t relevant in the current rendering context.
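For instance, a stored value might carry a label prefix that is only useful in some contexts. This sketch assumes a hypothetical sku_label value; the prefix and replacement text are made up for illustration.

```hubl
{# Hypothetical value that includes a label prefix #}
{% set sku_label = "SKU: AB-1234" %}

{# cut removes every occurrence of the given substring #}
<span class="sku">{{ sku_label|cut("SKU: ") }}</span>

{# replace swaps one known substring for another #}
<span class="sku">{{ sku_label|replace("SKU:", "Item") }}</span>
```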

Alteration

Alteration filters change how a value is represented, not just how it’s formatted. These filters convert data into a different structure or display format so it can be used in a specific rendering context.

Programmatic replacement (regex_replace) reshapes values using pattern-based matching, making it possible to rewrite or extract portions of a string when simple substitutions are insufficient.
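A common case is reducing a formatted phone number to digits for a tel: link. This is a sketch using HubL's regex_replace filter; the phone value is a hypothetical placeholder.

```hubl
{# Hypothetical phone number as an editor might type it #}
{% set phone = "(555) 010-0199" %}

{# Remove every character that is not a digit #}
{% set phone_digits = phone|regex_replace("[^0-9]", "") %}

<a href="tel:{{ phone_digits }}">{{ phone }}</a>
```

The original value is still used for display, so the stored data does not need a second, link-friendly copy.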

Representation conversion (format, filesizeformat) converts raw values into human-readable strings, allowing numeric or structured data to be displayed in a form appropriate for UI and content output.
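As a rough illustration, the sketch below builds a sentence from separate values and renders a byte count as a file size. All variable names and values are hypothetical.

```hubl
{# Hypothetical raw values #}
{% set first_name = "Dana" %}
{% set open_tickets = 3 %}
{% set download_bytes = 5242880 %}

{# format fills each %s placeholder in order #}
<p>{{ "%s has %s open tickets"|format(first_name, open_tickets) }}</p>

{# filesizeformat renders a byte count with a readable unit #}
<p>Download size: {{ download_bytes|filesizeformat }}</p>
```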

Specialized format conversion (convert_rgb) transforms values into alternate representations required for specific rendering needs, such as converting hex colors into RGB values so opacity and color channels can be manipulated independently.
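One way this is often used is to build an rgba() color from a hex value. The brand_hex variable and the 0.5 opacity are assumptions for this sketch.

```hubl
{# Hypothetical hex color, such as one from a color field #}
{% set brand_hex = "#FF7A59" %}

{# convert_rgb outputs the red, green, and blue channels as a comma-separated list #}
<div style="background-color: rgba({{ brand_hex|convert_rgb }}, 0.5);">
  Semi-transparent panel
</div>
```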

Structural or contextual encoding (xmlattr) converts intentionally structured values into a format suitable for embedding directly in markup, such as transforming a dictionary into HTML attribute syntax.
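A minimal sketch, assuming a hypothetical link_attrs dictionary assembled elsewhere in the template:

```hubl
{# Hypothetical attribute dictionary #}
{% set link_attrs = {"class": "button", "id": "signup-cta"} %}

{# xmlattr renders each key and value as an HTML attribute #}
<a href="/signup" {{ link_attrs|xmlattr }}>Sign up</a>
```

Keeping attributes in a dictionary makes it possible to add or omit keys conditionally before the element is rendered.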

Deduplication

Some data will intentionally contain duplicate values that you may want to collapse when rendering. This commonly appears with shared tags or labels applied across multiple line items, or when merging results from multiple data sources.

Deduplication and set operations (unique, difference, intersect, union) operate on sequences to include or exclude values based on overlap, allowing intentional duplicates to be collapsed or compared when combining multiple data sources.
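The sketch below applies each of these filters to two hypothetical tag lists. The list contents are invented, and join is used only to print the results.

```hubl
{# Hypothetical tag lists from two data sources #}
{% set post_tags = ["news", "product", "news", "events"] %}
{% set featured_tags = ["product", "webinars"] %}

{# unique collapses repeated values within one sequence #}
<p>{{ post_tags|unique|join(", ") }}</p>

{# intersect keeps only values present in both sequences #}
<p>{{ post_tags|intersect(featured_tags)|join(", ") }}</p>

{# difference keeps values from the first sequence that are not in the second #}
<p>{{ post_tags|difference(featured_tags)|join(", ") }}</p>

{# union combines both sequences into one set of distinct values #}
<p>{{ post_tags|union(featured_tags)|join(", ") }}</p>
```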

It’s important to distinguish this from unintentional duplication. Repeated values caused by data entry issues, schema design, or improper associations should be resolved at the source. Deduplicating in HubL is appropriate only when duplicates are expected and meaningful for the underlying data, not as a corrective measure.

Normalization helps values behave consistently at render time, but it does not correct inaccurate or poorly maintained data. When underlying values are wrong, the issue should be addressed at the source rather than masked at render time.

Sanitization: Making Data Safe to Render

Data is often entered or submitted without any awareness of where or how it will ultimately be displayed. Content editors, form submissions, and CRM updates are typically focused on meaning, not markup structure, layout constraints, or rendering context. As a result, values that are technically valid can still produce broken layouts, invalid HTML, or unexpected output once rendered.

This is especially common with rich text fields, where markup may be incomplete, inconsistently structured, or formatted for one context but reused in another. Sanitization filters help clean and constrain this data at render time to ensure tags are properly closed and markup is safe to output.

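Encoding and hashing (md5, urlencode, urldecode) transform a value into a different representation for transport, comparison, or URL usage. urlencode and urldecode are reversible; md5 produces a one-way hash, so it suits comparison or lookup keys rather than round-tripping a value.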
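As an illustration, the sketch below encodes a search term for a query string, decodes an encoded value for display, and hashes a normalized email address. The variable names, the /search path, and the data attribute are all hypothetical.

```hubl
{# Hypothetical values #}
{% set search_term = "rock & roll shoes" %}
{% set encoded_term = "rock%20%26%20roll" %}
{% set email = " Person@Example.com " %}

{# urlencode makes the value safe to place in a query string #}
<a href="/search?q={{ search_term|urlencode }}">Search</a>

{# urldecode reverses the encoding for display #}
<p>{{ encoded_term|urldecode }}</p>

{# md5 hashes the value; trim and lower first so equivalent inputs hash the same #}
<span data-key="{{ email|trim|lower|md5 }}"></span>
```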

Escaping (escape_attr, escape_html, escape_jinjava, escape_js, escape_url) escapes characters so content can be safely rendered in a specific context without being interpreted as code or markup.
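Which filter applies depends on where the value lands in the output. This sketch places the same hypothetical label value in several contexts; user_link and raw_snippet are likewise made-up examples.

```hubl
{# Hypothetical values containing characters that are significant in markup #}
{% set label = 'Say "hello" <now>' %}
{% set user_link = "https://example.com/path?x=1&y=2" %}
{% set raw_snippet = "{{ this should print as text }}" %}

{# escape_attr for attribute values, escape_html for element content #}
<button title="{{ label|escape_attr }}">{{ label|escape_html }}</button>

{# escape_url for values used as a URL #}
<a href="{{ user_link|escape_url }}">Visit</a>

{# escape_js for values placed inside a JavaScript string #}
<script>
  var buttonLabel = "{{ label|escape_js }}";
</script>

{# escape_jinjava so HubL delimiters in the value are shown rather than evaluated #}
<code>{{ raw_snippet|escape_jinjava }}</code>
```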

Stripping (striptags) removes markup entirely, reducing content to plain text to avoid layout issues or invalid HTML.
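For example, rich text reused in a meta description should not carry its markup along. This sketch assumes a hypothetical rich_summary value.

```hubl
{# Hypothetical rich text value #}
{% set rich_summary = "<p>Join us for the <strong>spring</strong> launch.</p>" %}

{# striptags reduces the value to plain text; escape_attr makes it safe inside the attribute #}
<meta name="description" content="{{ rich_summary|striptags|escape_attr }}">
```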

While HubSpot performs significant security sanitization on data at the point of submission, additional sanitization at render time can still be useful when displaying user-submitted or editor-generated content. In these cases, sanitization is less about preventing attacks and more about ensuring safe, well-formed output that won’t interfere with the surrounding template or markup.
