Include and Exclude Lists

The Include and Exclude lists in an audit define which pages are scanned after the Starting URLs. Any item in the Include List restricts the scan to pages that match that item. Any item in the Exclude List prevents pages that match that item from being scanned.

Note: sometimes these lists are also referred to as filters.

Items in the Include and Exclude lists can be full URLs, partial URLs, or regular expressions that match a valid page.

Order of Precedence

Starting URLs take precedence over everything else and will always be visited during an audit, even if a URL matches an item in the Exclude List. Starting URLs are always visited before any other URLs.

Exclude List items override the Include List: any URL that matches an Exclude item is removed from eligibility, even if it also matches an Include item.

URLs that match the Include List must still be linked from a page the audit visits, beginning with the starting pages; otherwise they cannot be discovered and won't be visited.

Starting pages: An audit discovers links from the starting page's document.links property. These links are eligible to be visited.
Include List: Adding an Include Filter restricts the eligible URLs to those that match the filter. In the illustrated example, only five links remain eligible to be visited.
Exclude List: Adding an Exclude Filter removes further URLs from the eligible links. In the illustrated example, only three links remain eligible to be visited.
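
The order of precedence described above can be sketched in a few lines of Python. This is only an illustration, not ObservePoint's implementation: the function names plan_visits and is_eligible are hypothetical, and the sketch assumes that every list item is applied as an unanchored regular expression search, so that a partial URL matches anywhere in the link.

```python
import re


def is_eligible(url, include_list, exclude_list):
    """Return True if a discovered link may be visited (assumed matching rules)."""
    # Exclude items win over Include items.
    if any(re.search(item, url) for item in exclude_list):
        return False
    # Otherwise the link must match at least one Include item.
    return any(re.search(item, url) for item in include_list)


def plan_visits(starting_urls, discovered_links, include_list, exclude_list):
    """Return URLs in visiting order: Starting URLs first, then eligible links."""
    # Starting URLs are always visited, even if an Exclude item matches them.
    visits = list(starting_urls)
    seen = set(visits)
    for link in discovered_links:
        if link not in seen and is_eligible(link, include_list, exclude_list):
            visits.append(link)
            seen.add(link)
    return visits


if __name__ == "__main__":
    order = plan_visits(
        starting_urls=["http://mysite.com/private/start"],
        discovered_links=[
            "http://mysite.com/products/a",
            "http://mysite.com/private/b",
            "http://other.com/x",
        ],
        include_list=[r"mysite\.com"],
        exclude_list=["/private/"],
    )
    for url in order:
        print(url)
```

In this sketch the Starting URL is kept even though it matches the Exclude item "/private/", while the discovered link under /private/ is dropped.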

Starting URLs

The Starting URL list can contain one or more URLs, which are always visited before any other URLs. Any links discovered from the starting pages are eligible to be visited, subject to the Include and Exclude filters. If an Exclude item matches a Starting URL, the Exclude item is ignored for that URL and the Starting URL is still visited.

Note: Auditor by Adobe in Adobe Marketing Cloud only allows a single page for the Starting URL. Simply turning on the Auditor-Adobe Reports Button in the ObservePoint interface will not limit the audit to one Starting URL.

Include List

The Include List limits what pages are eligible to be scanned during an audit. Each item can be a fully qualified URL, a partial URL, or a regular expression matching a full or partial URL.

Adding any URL or partial URL automatically limits what pages are eligible to be scanned in the audit. However, there is no guarantee that all the eligible pages or directories listed will actually be visited.

Default Include Filter

The default Include List allows any page from the primary domain of the Starting URL to be scanned. By default it is a modified version of the Starting URL:

^https?://([^/:?]*.)?mysite.com

This makes any link found on the Starting URL page eligible for visiting, as long as it points to a page on the Starting URL's domain or any of its subdomains. For example, if the Starting URL is http://mysite.com, the following pages would be eligible to be scanned by default:

http://mysite.com
https://mysite.com
http://www.mysite.com/home
https://dev.mysite.com/home
http://my.mysite.com/products/products_and_services.html
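
To check the default pattern against the example pages, a short Python test like the one below can help. It is a sketch under stated assumptions: Python's re module stands in for whatever regular expression engine the audit uses, and STRICT_INCLUDE is a hypothetical variant added here only for comparison. As printed above, the dots in the default pattern are unescaped, so each one matches any character; the variant escapes them.

```python
import re

# Default Include pattern, copied as printed in this article.
DEFAULT_INCLUDE = r"^https?://([^/:?]*.)?mysite.com"

# Hypothetical stricter variant with escaped dots (an assumption, not the product default).
STRICT_INCLUDE = r"^https?://([^/:?]*\.)?mysite\.com"

EXAMPLE_PAGES = [
    "http://mysite.com",
    "https://mysite.com",
    "http://www.mysite.com/home",
    "https://dev.mysite.com/home",
    "http://my.mysite.com/products/products_and_services.html",
]


def matches(pattern, url):
    """Return True if the pattern is found in the URL."""
    return re.search(pattern, url) is not None


if __name__ == "__main__":
    for page in EXAMPLE_PAGES:
        print(page, matches(DEFAULT_INCLUDE, page))
    # A look-alike host shows the effect of the unescaped dot.
    lookalike = "http://notmysite.com"
    print(lookalike, matches(DEFAULT_INCLUDE, lookalike), matches(STRICT_INCLUDE, lookalike))
```

All five example pages match both patterns. The look-alike host matches only the pattern with unescaped dots, which is worth keeping in mind when writing custom Include items.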

 

The Include List can contain exact URLs, partial URLs, or regular expressions.

If more than one Starting URL is listed (not available in Auditor by Adobe), the default Include List changes to allow any URL from any domain:

^https?://.*

Typically you won't change anything in this box unless you want to direct your audits to specific areas of the site. In that case, replace the default value with the directory or directories that you want the audit to scan. You can also use this to perform cross-domain auditing, where the audit starts on one domain and ends on another. To do this, enter the domains you want to traverse. In either case, a URL that matches the Include List is only visited if a link to it is discovered on a page that is audited.
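
As a rough illustration of the two uses described above, the Python sketch below tries a directory-only Include List and a cross-domain Include List against a few links. The domains, paths, and the helper name allowed are hypothetical, and the sketch assumes each Include item is applied as a regular expression search against the full URL.

```python
import re


def allowed(url, include_list):
    """Return True if the URL matches at least one Include item (assumed semantics)."""
    return any(re.search(item, url) for item in include_list)


# Restrict the audit to a single directory of the site (hypothetical path).
directory_only = [r"^https?://www\.mysite\.com/products/"]

# Cross-domain audit: list every domain the audit needs to traverse
# (checkout.example.com is a made-up second domain).
cross_domain = [
    r"^https?://([^/:?]*\.)?mysite\.com",
    r"^https?://checkout\.example\.com",
]

if __name__ == "__main__":
    print(allowed("http://www.mysite.com/products/item1", directory_only))
    print(allowed("http://www.mysite.com/blog/post", directory_only))
    print(allowed("https://checkout.example.com/cart", cross_domain))
```

Matching the Include List only makes a link eligible; the link still has to appear on a page that the audit actually visits.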

For complex URL patterns, use ObservePoint's regular expression tester.

Also refer to the Regular Expressions document for common pattern matching use cases.

Exclude List

The Exclude List prevents URLs from being audited. You may use exact URLs, partial URLs, or regular expressions, just as you would in the Include List. Any URL that matches an item in the Exclude List will not be visited unless it is expressly defined in the Starting URL field.
