ResourcesKnowledge Base

Working with data

The ability to access and use external data sources within templates is one of Mailkit's most powerful features. From external data sources it is possible to load data into recipient lists as well as into templates and thus create personalized message content. It is in templates where you can make full use of your BI data and where the power of the template language can be fully manifested.

Companies have a surplus of data, so marketers want to use it. Whether it's simple data to help marketers speed up content creation, or customer data to help manage personalization. Mailkit's superpowers are proven whenever it comes to working with data.

Let's start by creating basic integrations using data sources. In this document, we will focus on the use of XML and SQL data sources, because these are the most commonly used types of data sources in most companies. XML feeds can be both simple RSS feeds and complex feeds generated from Google Sheets. SQL resources are usually large datasets — whether they are data from product feeds or any other dataset, where using SQL resources to access the data provides an advantage.

This document will use sample data sources to help you understand the basics of working with data.

1) Start by creating a data source named WIKINEWS, with XML type, with the Target set to Template and URL source: https://en.wikipedia.org/w/api.php?action=featuredfeed&feed=onthisday

Add data source – template

2) Create a data source called PRODUCTS, with SQL type, Google Merchant Feed source format and URL source: https://static.mailkit.eu/templates/google-feed.xml

Add data source – SQL

3) Then open the newly created PRODUCTS data source and click the Import button. Once the import is complete, the products from the feed will be available in the SQL database. You can also set the source to update automatically according to a schedule so that the process is automated and the data is always up to date.

Once you have set up data sources in the Profile / Data sources menu, you will need to assign those sources to a template. Data sources can be assigned to a template both when creating a template and after creating it in its details. Just click the Edit button in the Template Info block.

By assigning data sources to a template, it will be possible to access data from these sources directly from the template.

The next step is to go to the template and start working with the data. Because each data source has a different structure, the template language allows you to list the data that is available for a template. The preparation of the template should therefore begin by obtaining a statement of the data structure, the so-called dump:

<pre>[% USE Dumper; Dumper.dump(data) -%]</pre>

In the above case the Dumper function dumps the content of the data structure but it could be used to dump any other structure as well. The most common structures are shared (for unstructured variables from API calls/delivery feeds), data (structured data from data sources/API calls), Event & EventItems (eventAPI event data), recipient (recipient data) and email (campaign variables).

In the case of our data sources the template output would be something like:

 

In the above case, the Dumper function lists the contents of the data structure, but it can also be used to dump any other structure. The most common structures are shared (for unstructured variables from API calls/delivery feeds), data (structured data from data sources/API call), Event & EventItems (eventAPI event data), recipient (recipient data) and email (campaign variables).

In the case of our data sources, the output of the template will be something like:

$VAR1 = {
    'DS_XML_WIKINEWS' => {
        'xmlns:dc' => 'http://purl.org/dc/elements/1.1/',
        'version' => '2.0',
        'channel' => {
            'lastBuildDate' => 'Fri, 14 Jun 2019 07:38:20 GMT',
            'title' => 'Wikipedia "On this day..." feed',
            'link' => 'https://en.wikipedia.org/wiki/Main_Page',
            'language' => 'en',
            'item' => [
                {
                'guid' => {
                    'content' => 'http://en.wikipedia.org/wiki/Special:FeedItem/onthisday/20190605000000/en',
                    'isPermaLink' => 'false'
                },
                'link' => 'http://en.wikipedia.org/wiki/Special:FeedItem/onthisday/20190605000000/en',
                'dc:creator' => {},
                'title' => 'On this day: June 5',
                'pubDate' => 'Wed, 05 Jun 2019 00:00:00 GMT',
                'description' => "
    ...
    ...
}

As you can see in our example, the structure is named after the data source (DS_XML_WIKINEWS) and corresponds to the original XML file. The data is in the data.DS_XML_WIKINEWS structure. To access the title element, follow the structure and use the following to list the title:

[% data.DS_XML_WIKINEWS.channel.title -%]

In the same way we can list the date of publication of pubDate using:

[% data.DS_XML_WIKINEWS.channel.pubDate -%]

Or better yet, we use the date & time functions to format the output as needed:

[% date.format(data.DS_XML_WIKINEWS.channel.pubDate) -%]

A more interesting part of XML is the list of articles in the data.DS_XML_WIKINEWS.channel.item structure. If we used a simple data access to list the data (as we did for title or pubDate):

[% data.DS_XML_WIKINEWS.channel.item -%]

The output would look similar to "ARRAY (0x56032b920168)" because it is a structured array that must be processed item by item. This is done by iterating through an array of items:

[% FOREACH data.DS_XML_WIKINEWS.channel.item -%]
[% END -%]

In this way we iterate over the content of the field and we can list the data in the structure according to their name:

[% FOREACH data.DS_XML_WIKINEWS.channel.item -%]
    <div><a href="[% link -%]">[% title -%]</a></div>
[% END -%]

Now that we have a simple output of the channel, it's time to see how we can optimize it. For example, we may want to see only 5 records from an array, highlight even and odd records, etc. This can be easily achieved with loops using the loop.count and loop.parity directives.

Using SQL data sources

The same iteration applies to all field data and its processing within templates, regardless of the source. Another source we've added is an SQL data source that needs to be initialized first. This is done by creating an SQL connection to the data source by its name:

[% USE dbh_sql = DBI (dbh = shared.DBI_DS_SQL_PRODUCTS) -%]

Now that the connection is established, SQL statements can be used to retrieve data from SQL table(s) of data source(s). Because all customer tables reside in the same database, it is not necessary to create a separate connection for each SQL data source. In order to run SQL statements, we must be able to identify the base table name. The naming scheme is ds_ID_name, where ds_ is the prefix, ID is the ID number of the data source, and name is a closed branch with entries. In the case of product feeds such as Google, Heureka, etc., the name is always "products". In our case, the data source is a Google Merchant source and has an ID of 3223, resulting in the database table name ds_3223_products. So we can easily query the database using the SQL statement:

[% query_items = dbh_sql.prepare('SELECT * FROM ds_3223_products ORDER BY RAND() LIMIT 5') %]

and execute the query loading the data into an items array:

[% items=query_items.execute() -%]

Now we can easily display records using a dumper:

[% FOREACH items -%]
<pre>[% USE Dumper; Dumper.dump(items) -%]</pre>
[% END -%]

The listing will give us information about the column names:

'ITEM' => {
    'custom_label_2' => 'pneu',
    'price' => '10982 CZK',
    'custom_label_1' => '38',
    'brand' => 'Pirelli',
    'custom_label_0' => 'osobni-zimni',
    'image_link' => 'https://www.mojepneu.cz/pics/pneu/pirelli_sottozero270.jpg',
    'link' => 'https://www.mojepneu.cz/pirelli-winter-270-sottozero-serie-ii-245-35-r20-95w-xl',
    'gtin' => undef,
    'condition' => 'new',
    'shipping' => undef,
    'id' => '283136',
    'availability' => 'in stock',
    'description' => "Pirelli WINTER 270 SOTTOZERO SERIE II - Pneumatika Sottozero Serie II, kter\x{e1} vych\x{e1}z\x{ed} p\x{159}\x{ed}mo z pneumatiky Sottozero W270 (uvedena v roce 2007) a ze \x{161}pi\x{10d}kov\x{fd}ch znalost\x{ed} segmentu U.H.P., byla vyvinuta jako \x{201e}modern\x{ed} zimn\x{ed} pneumatika\x{201c}, kter\x{e1} by splnila po\x{17e}adavky koncov\x{fd}ch u\x{17e}ivatel\x{16f} na v\x{fd}konnost a bezpe\x{10d}nost, p\x{159}i\x{10d}em\x{17e} do\x{161}lo k p\x{159}ijet\x{ed} ur\x{10d}it\x{fd}ch technick\x{fd}ch diferenciac\x{ed} mezi velikostmi W a H/V, a pneumatika tak v\x{17e}dy nab\x{ed}z\x{ed} to, co \x{159}idi\x{10d}i \x{17e}\x{e1}daj\x{ed}. Pirelli WINTER 270 SOTTOZERO SERIE II",
    'title' => "Pirelli WINTER 270 SOTTOZERO SERIE II 245/35 R20 95W  XL - zimn\x{ed} pneumatiky",
    'identifier_exists' => 'FALSE'
},

as well as some additional iteration details, such as the next item, dataset size, etc. This example selects 5 random items, but the select statement can be significantly more complex as needed:

[% query_items = dbh_sql.prepare('
    SELECT * FROM ds_3223_products WHERE `availability`="in stock"
    AND `condition`="new" ORDER BY RAND() LIMIT 5') %]

The next step would be to combine this query with recipient data to create your own output. First, you need to ensure that the recipient information is always available or there is alternative content. In this example, we assign a variable value from the recipient's own field No. 2 (CUSTOM2), or if the recipient has this value empty, we return all records.

[% IF recipient.CUSTOM2 !='' -%]
    [% brand = recipient.CUSTOM2 -%]
[% ELSE -%]
    [% brand = '%' -%]
[% END -%]

And update our SQL query:

[% query_items = dbh_sql.prepare('
    SELECT * FROM ds_3223_products WHERE `availability`="in stock"
    AND `brand` LIKE ? ORDER BY RAND() LIMIT 5') %]

and add the brand variable to the command:

[% items=query_items.execute(brand) -%]

The result will now be only 5 random products of a specific brand listed in the CUSTOM2 recipient field list, or 5 random products if the recipient has no value set in CUSTOM2.

Keep in mind that this is just a simple example, and the correct template should include additional alternate processing. Consideration should be given not only to failing to select data from the database due to missing values, but also to have empty result sets processed and to ensure that there is always backup content for such cases.

Treating empty results can be very easy:

[% items=query_items.execute(brand) -%]
[% IF items.size == 0 -%]
	[% items=query_items.execute('%') -%]
[% END -%]

First, an SQL query is run with the brand parameter corresponding to the user value. Subsequently, the size of the query result is evaluated. If the result of the query is zero, the query will be repeated, this time on all records without brand restrictions. However, there could be a completely different query to get a completely different set of data.

Always keep in mind that queries MUST be effective. Queries will be executed repeatedly for each recipient of your message, so in the case of poorly designed queries, the distribution of messages can be significantly slowed down. Always use only indexed fields in queries for your WHERE and GROUP conditions. The range of indexed fields varies depending on the type of resource used. For predefined product feeds, the product ID and name, category, availability, and brand fields are indexed depending on the availability of these fields in the feed. For a custom format, you can select the primary key and indexed fields when setting up the data source.

If you have any questions regarding the use of data in templates, feel free to contact our customer support.