ResourcesKnowledge Base

Working with data

The ability to access external data sources and use the data within the templates is one of the most powerful features of Mailkit. It allows not only to ingest data into mailing list but to generate hyper-personalized message content within templates. This is where you leverage your BI data and the power of the templating language comes to life.

Companies have an overflow of data - so much that it created a whole new business around data analysis. No wonder that marketers are eager to use the data they have. Whether it’s simple data that helps marketers speed up the content creation or customer data to drive personalization. Mailkit’s super-powers come to life whenever data is involved.

Let’s start our data driven journey by getting our data source integrations in place. In this document we will be focusing on using XML and SQL data sources as those are the most common use cases for most companies. XML data sources can range from short RSS feeds to complex feeds generated from Google Sheets. SQL feeds are usually large datasets - whether it would be from product feeds or any other dataset where using SQL for access provides an advantage.

This document will be using sample data sources to help you understand the concept of working with data.

1. Start by creating a data source named WIKINEWS, with a type XML, Target set to Template and URL of https://en.wikipedia.org/w/api.php?action=featuredfeed&feed=onthisday

2. Create a data source called PRODUCTS, with a type SQL, Source format Google Merchant Feed and URL of https://static.mailkit.eu/templates/google-feed.xml

Adding a SQL data source from Google Merchant Feed

3. Open the newly created data source PRODUCTS and click the import button. Once the import is complete the products from the feed will be available in SQL database. You can also set your feed to automatically update on schedule to automate the process and keep your data up to date.

Once you have setup your data sources in Profile/Data sources you will have to assign the sources needed to your template. This is done by clicking the edit button in the template info block of your template.

By assigning your data sources to your template you make the data from these sources directly accessible from the template.

The next step is to go into the template and start working with the data. Since each source has a different structure the templating language allows you to debug the data available to the template. Your template development should start by getting a debug dump:

<pre>[% USE Dumper; Dumper.dump(data) -%]</pre>

In the above case the Dumper function dumps the content of the data structure but it could be used to dump any other structure as well. The most common structures are shared (for unstructured variables from API calls/delivery feeds), data (structured data from data sources/API calls), Event & EventItems (eventAPI event data), recipient (recipient data) and email (campaign variables).

In the case of our data sources the template output would be something like:

$VAR1 = {
    'DS_XML_WIKINEWS' => {
        'xmlns:dc' => 'http://purl.org/dc/elements/1.1/',
        'version' => '2.0',
        'channel' => {
            'lastBuildDate' => 'Fri, 14 Jun 2019 07:38:20 GMT',
            'title' => 'Wikipedia "On this day..." feed',
            'link' => 'https://en.wikipedia.org/wiki/Main_Page',
            'language' => 'en',
            'item' => [
                {
                'guid' => {
                    'content' => 'http://en.wikipedia.org/wiki/Special:FeedItem/onthisday/20190605000000/en',
                    'isPermaLink' => 'false'
                },
                'link' => 'http://en.wikipedia.org/wiki/Special:FeedItem/onthisday/20190605000000/en',
                'dc:creator' => {},
                'title' => 'On this day: June 5',
                'pubDate' => 'Wed, 05 Jun 2019 00:00:00 GMT',
                'description' => "
    ...
    ...
}

As you can see in our example the structure has a name of the datasources (DS_XML_WIKINEWS) and matches the original XML file. Data are all within the data.DS_XML_WIKINEWS structure. To access the title element we follow the structure and use:

[% data.DS_XML_WIKINEWS.channel.title -%]

to output the title. In the very same fashion we can print the pubDate using:

[% data.DS_XML_WIKINEWS.channel.pubDate -%]

or even better use the date & time functions to format the output as needed by:

[% date.format(data.DS_XML_WIKINEWS.channel.pubDate) -%]

The more interesting part of the XML lies in the list of articles in the data.DS_XML_WIKINEWS.channel.item structure. If we used a simple data access to print the data as we did for title or pubDate:

[% data.DS_XML_WIKINEWS.channel.item -%]

The output would look similar to “ARRAY(0x56032b920168)” as the item is an array structure that needs to be processed item by item. This is done using iteration over the array items:

[% FOREACH data.DS_XML_WIKINEWS.channel.item -%]
[% END -%]

This way we iterate through the array content and can print data within the structure by their name:

[% FOREACH data.DS_XML_WIKINEWS.channel.item -%]
    <div><a href="[% link -%]">[% title -%]</a></div>
[% END -%]

Now that we have a simple output of the channel it’s time to look at how we can optimize the output. For example we may only want to see 5 records from the array, highlight even and odd records, etc. This can be done by simply using the loop functions like loop.count and loop.parity directives.

Using SQL data sources

The very same iteration applies to all array data and their processing within templates no matter what the source is. The other source we have added is a SQL data source that needs to be initialized first. This is done by establishing the SQL connection to the data source by it’s name:

[% USE dbh_sql = DBI (dbh = shared.DBI_DS_SQL_PRODUCTS) -%]

Now that the connection is established the SQL commands can be used to fetch data from the SQL table/s of the data source/s. Since all tables of a customer reside in the same database there is no need to create a separate connection for each SQL data source. To run SQL commands we need to be able to identify the underlying table name. The naming scheme is ds_ID_name where ds_ is a prefix, ID is the ID number of a data source and name represents the enclosing branch with items. In case of product feeds eg. Google, Heureka, etc. the name is always “products”. In our case the data source is a Google Merchant feed and has an ID of 1234 resulting in a database table name ds_1234_products. We can easily query the database using an SQL statement:

[% query_items = dbh_sql.prepare('SELECT * FROM ds_3223_products ORDER BY RAND() LIMIT 5') %]

and execute the query loading the data into an items array:

[% items=query_items.execute() -%]

Now we can easily see the results by dumping the records using:

[% FOREACH items -%]
<pre>[% USE Dumper; Dumper.dump(items) -%]</pre>
[% END -%]

Giving us the information about the names of the columns:

'ITEM' => {
    'custom_label_2' => 'pneu',
    'price' => '10982 CZK',
    'custom_label_1' => '38',
    'brand' => 'Pirelli',
    'custom_label_0' => 'osobni-zimni',
    'image_link' => 'https://www.mojepneu.cz/pics/pneu/pirelli_sottozero270.jpg',
    'link' => 'https://www.mojepneu.cz/pirelli-winter-270-sottozero-serie-ii-245-35-r20-95w-xl',
    'gtin' => undef,
    'condition' => 'new',
    'shipping' => undef,
    'id' => '283136',
    'availability' => 'in stock',
    'description' => "Pirelli WINTER 270 SOTTOZERO SERIE II - Pneumatika Sottozero Serie II, kter\x{e1} vych\x{e1}z\x{ed} p\x{159}\x{ed}mo z pneumatiky Sottozero W270 (uvedena v roce 2007) a ze \x{161}pi\x{10d}kov\x{fd}ch znalost\x{ed} segmentu U.H.P., byla vyvinuta jako \x{201e}modern\x{ed} zimn\x{ed} pneumatika\x{201c}, kter\x{e1} by splnila po\x{17e}adavky koncov\x{fd}ch u\x{17e}ivatel\x{16f} na v\x{fd}konnost a bezpe\x{10d}nost, p\x{159}i\x{10d}em\x{17e} do\x{161}lo k p\x{159}ijet\x{ed} ur\x{10d}it\x{fd}ch technick\x{fd}ch diferenciac\x{ed} mezi velikostmi W a H/V, a pneumatika tak v\x{17e}dy nab\x{ed}z\x{ed} to, co \x{159}idi\x{10d}i \x{17e}\x{e1}daj\x{ed}. Pirelli WINTER 270 SOTTOZERO SERIE II",
    'title' => "Pirelli WINTER 270 SOTTOZERO SERIE II 245/35 R20 95W  XL - zimn\x{ed} pneumatiky",
    'identifier_exists' => 'FALSE'
},

As well as some additinal details about the iteration like the next item, size of the dataset, etc. This example selects 5 random items but the select statement can be as complex as necessary:

[% query_items = dbh_sql.prepare('
    SELECT * FROM ds_3223_products WHERE `availability`="in stock"
    AND `condition`="new" ORDER BY RAND() LIMIT 5') %]

The next step would be to combine the this query with recipient data to produce custom output. Let’s just first make sure that the recipient data is present at all times or there is a fallback. In this case we are assigning the brand variable the value of recipient’s CUSTOM2 field or we fall back to all records if no value is present in CUSTOM2:

[% IF recipient.CUSTOM2 !='' -%]
    [% brand = recipient.CUSTOM2 -%]
[% ELSE -%]
    [% brand = '%' -%]
[% END -%]

And update our SQL query:

[% query_items = dbh_sql.prepare('
    SELECT * FROM ds_3223_products WHERE `availability`="in stock"
    AND `brand` LIKE ? ORDER BY RAND() LIMIT 5') %]

and it’s execution to pass the brand variable:

[% items=query_items.execute(brand) -%]

The result will now only return 5 random products of a specific brand listed in recipient’s CUSTOM2 field or 5 random products if the recipient has no value set in CUSTOM2.

Keep in mind that this is just a simple example and the proper template would probably contain more fallback handling. The things to consider are not only the select query failing completely due to missing values but also dealing with empty result sets and making sure there is always a fallback content. The handling can be as simple as this:

[% items=query_items.execute(brand) -%]
[% IF items.size == 0 -%]
	[% items=query_items.execute('%') -%]
[% END -%]

This will first execute the SQL statement with the brand of the recipient and evaluate the size of the result. If that result is empty the query will be executed again this time with a lookup for all records. This could also be a completely different query to provide fallback data.

Always keep in mind that queries MUST be efficient. Your select statements will be executed individually for each recipient of the message. Inefficient queries will slow down your message delivery. You should always use indexed fields for your WHERE and GROUP clauses. The fields that are indexed vary by the type of feed used. For predefined product feeds the indexed fields are always the ID and Name of the product, category, availability and brand depending on whether these fields exist in the feed. For custom feed format you can choose your own primary key and indexing during data source setup.

Feel free to contact our customer support for assistance using data sources in templates.