Adobe Analytics Implementation: Adobe Analytics Tags only include ASCII characters

A guide to validating character encoding within Adobe Analytics beacons to prevent data corruption and broken reporting.

Written by Luiza Gircoveanu
Updated over 2 weeks ago

Overview

This check ensures that all data sent within Adobe Analytics tags, including eVars, props, page names, and events, consists exclusively of ASCII characters. ASCII (American Standard Code for Information Interchange) is a 7-bit character set of 128 code points (0x00 to 0x7F) covering unaccented English letters, digits, basic punctuation, and control characters.

While modern browsers and Adobe's servers can often handle UTF-8 encoding, certain "non-standard" or hidden characters can cause tracking requests to be malformed, truncated, or rejected by the Data Collection Servers (DCS).

Why it is important

The presence of non-ASCII characters (such as special symbols, emojis, "smart quotes," or unencoded foreign characters) in your tracking hits can lead to several critical issues:

  • Data Truncation: If a character is not correctly interpreted, the server may stop reading the variable at that point, cutting off the rest of the data.

  • Garbage Data in Reports: Non-ASCII characters often render as "gibberish" or replacement characters (e.g., ``) in Workspace, making the data unreadable for analysts.

  • Request Failures: In extreme cases, a malformed character in a URL parameter can cause the entire HTTP request to fail (Status Code 400), leading to total data loss for that hit.

  • Processing Rule Breaks: If you use Processing Rules to match or test certain strings, hidden non-ASCII characters can prevent those rules from matching correctly.

Implementation

ObservePoint scans the content of every Adobe Analytics request to verify that the query string parameters contain only valid, readable characters.

To do so, you can create an Audit to scan the pages you want to validate for character encoding.

Also, check the pre-built report in ObservePoint for Pages with Adobe Analytics tags capturing non-ASCII characters.
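
For a quick manual spot-check of a single request, the idea of scanning query string parameters can be sketched as follows. This is not ObservePoint's implementation: the function name findNonAsciiParams and the beacon URL are hypothetical, and the sketch only assumes a runtime with the standard URL API (a modern browser or Node.js).

```javascript
// Hypothetical helper: lists the query string parameters of a beacon URL
// whose decoded values contain characters outside the ASCII range.
function findNonAsciiParams(beaconUrl) {
  const findings = [];
  // URL.searchParams percent-decodes each value before we inspect it.
  const { searchParams } = new URL(beaconUrl);
  for (const [name, value] of searchParams) {
    // The "u" flag matches whole code points, so an emoji counts once.
    const matches = value.match(/[^\x00-\x7F]/gu);
    if (matches) {
      findings.push({ name, value, characters: [...new Set(matches)] });
    }
  }
  return findings;
}

// Illustrative URL only; the host, report suite, and values are made up.
const exampleBeacon =
  "https://metrics.example.com/b/ss/example-rsid/1" +
  "?pageName=Caf%C3%A9%20Menu&v1=promo&c3=%E2%80%9CSale%E2%80%9D";

// Flags pageName (accented letter) and c3 (curly quotes); v1 is clean.
console.log(findNonAsciiParams(exampleBeacon));
```

You can paste a request URL copied from the browser's network panel in place of exampleBeacon to see which parameters would be flagged.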

Remediation

If non-ASCII characters are detected in your tags, follow these steps to sanitize your data:

1. Clean the Data Layer

Most non-ASCII characters originate in the CMS or the database and are passed directly into the Data Layer.

  • Action: Work with the development team to ensure the Data Layer is "sanitized" or stripped of special characters before it is rendered on the page.

  • Action: Use a regular expression to strip non-ASCII characters from high-risk variables (like internal search terms or product names).
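
A minimal sketch of the regular-expression approach is shown below. The helper name toAscii and the digitalData object with its property names are hypothetical. As an optional refinement that is not part of the original recommendation, the sketch first applies Unicode NFD normalization so that accented Latin letters keep their base letter instead of disappearing.

```javascript
// Hypothetical helper: reduces a string to ASCII-only characters.
function toAscii(value) {
  // Leave non-string values (undefined, numbers, objects) untouched.
  if (typeof value !== "string") return value;
  return value
    // NFD splits "é" into "e" plus a combining accent (assumed refinement).
    .normalize("NFD")
    // Remove everything outside 0x00-0x7F, including the combining accents.
    .replace(/[^\x00-\x7F]/g, "");
}

// Hypothetical data layer object; the property names are illustrative only.
const digitalData = {
  searchTerm: "caf\u00E9 \u201Clatte\u201D", // café “latte”
  productName: "Cr\u00E8me br\u00FBl\u00E9e", // Crème brûlée
};

const cleaned = {
  searchTerm: toAscii(digitalData.searchTerm), // "cafe latte"
  productName: toAscii(digitalData.productName), // "Creme brulee"
};
console.log(cleaned);
```

Stripping is lossy: text in scripts with no ASCII base letters (for example Japanese) is removed entirely, so reserve this for variables where an ASCII-only value is acceptable.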

2. Avoid "Smart Quotes"

A common issue occurs when copying/pasting code or configuration values from word processors into your Tag Manager.

  • Action: Ensure all hard-coded values in Adobe Launch are entered using standard "straight" quotes (' or ") rather than "curly" or "smart" quotes (‘ ’ or “ ”).
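
Where a pasted value cannot be retyped at the source, one possible safety net is to convert curly quotes to straight ones in code. The sketch below is illustrative only: the function name straightenQuotes is hypothetical, and the character list covers the common single and double curly quote code points, not every typographic variant.

```javascript
// Hypothetical helper: replaces curly ("smart") quotes with straight quotes.
function straightenQuotes(value) {
  // Leave non-string values untouched.
  if (typeof value !== "string") return value;
  return value
    // Single curly quotes and low/reversed variants become an apostrophe.
    .replace(/[\u2018\u2019\u201A\u201B]/g, "'")
    // Double curly quotes and low/reversed variants become a straight quote.
    .replace(/[\u201C\u201D\u201E\u201F]/g, '"');
}

// Value as it might look after being pasted from a word processor.
const pasted = "\u201CSummer Sale\u201D \u2013 women\u2019s";
console.log(straightenQuotes(pasted));
```

Note that the en dash (U+2013) in the sample is left in place, since this helper targets quotes only; other typographic characters would need their own handling.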

3. URL Encoding

If you must send characters from foreign languages (e.g., Japanese or Cyrillic), they must be properly URL-encoded.

  • Action: Ensure your implementation is using encodeURIComponent() for custom variables that might contain special characters. Adobe’s standard libraries usually handle this, but custom code blocks often require manual encoding.
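
As an illustration for hand-built requests in custom code, the sketch below encodes each name and value with encodeURIComponent() so the resulting query string contains only ASCII characters. The function name buildQueryString and the parameter names are assumptions for the example. Do not apply this to variables that a standard Adobe library already encodes, as that would double-encode the value.

```javascript
// Hypothetical helper: builds a percent-encoded query string from an object.
function buildQueryString(params) {
  return Object.entries(params)
    // Skip unset values so they do not appear as "undefined" or "null".
    .filter(([, value]) => value !== undefined && value !== null)
    // Encode both sides; non-ASCII text becomes %XX UTF-8 byte sequences.
    .map(
      ([name, value]) =>
        `${encodeURIComponent(name)}=${encodeURIComponent(String(value))}`
    )
    .join("&");
}

// Illustrative parameter names and values only.
const query = buildQueryString({
  pageName: "東京 Store",
  v1: "Привет",
  v2: undefined,
});

// The encoded string is pure ASCII, e.g. it starts with
// "pageName=%E6%9D%B1%E4%BA%AC%20Store".
console.log(query);
```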

4. Audit "doPlugins" Stripping

You can implement a global "scrubbing" function within your doPlugins logic to catch and remove common non-ASCII characters before the hit is sent.

  • Example: if (s.pageName) s.pageName = s.pageName.replace(/[^\x00-\x7F]/g, ""); (This regex removes every character outside the ASCII range; the guard avoids an error when pageName is not set.)
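
The same idea can be extended to several variables at once. The following is a sketch under stated assumptions rather than a definitive implementation: the s object here is a plain stand-in for the AppMeasurement tracker so the example runs on its own, and the helper scrubNonAscii and the list VARS_TO_SCRUB are hypothetical names.

```javascript
// Stand-in for the AppMeasurement tracker object (assumption for this
// sketch); in a real implementation "s" is created by the Adobe library.
const s = {
  pageName: "Caf\u00E9 Menu", // Café Menu
  eVar5: "\u201Cpromo\u201D", // “promo”
  prop3: undefined, // unset variables must not throw
};

// Hypothetical list of variables to clean; adjust to your own setup.
const VARS_TO_SCRUB = ["pageName", "eVar5", "prop3"];

// Hypothetical helper: strips non-ASCII characters from the named variables.
function scrubNonAscii(tracker, names) {
  for (const name of names) {
    const value = tracker[name];
    // Only strings are cleaned; unset or non-string values are skipped.
    if (typeof value === "string") {
      tracker[name] = value.replace(/[^\x00-\x7F]/g, "");
    }
  }
}

s.usePlugins = true;
s.doPlugins = function (tracker) {
  scrubNonAscii(tracker, VARS_TO_SCRUB);
};

// Simulate the library invoking doPlugins before a hit is sent.
s.doPlugins(s);
console.log(s.pageName, s.eVar5); // "Caf Menu" "promo"
```

Keeping the variable list explicit makes it clear which values are being altered, which matters because stripping changes what appears in reports.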

Conclusion

Data integrity starts with data readability. By ensuring your Adobe Analytics tags only include ASCII characters, you prevent the "silent" data corruption that occurs when special symbols or hidden formatting characters interfere with the network request. Regular automated Audits via ObservePoint ensure your reports remain clean, professional, and accurate for all end-users.
