
Data sources

One of the most powerful features of the platform is its support for external data sources. Data sources help automate a range of activities: updating mailing lists, fetching dynamic content for your campaigns, loading product information and driving deliveries.

To start updating your mailing lists and feeding your templates with content, you need to set up a data source. Mailkit currently supports XML, RSS, JSON and CSV data sources. Some formats have limited use: for example, an RSS data source can only feed template content, while CSV can only be used to update a mailing list. XML data sources are the most universal and can be used to update mailing lists, provide template content, feed the SQL database and drive delivery.

Data source management can be found in the menu under Profile/Data sources, where your data sources are grouped into Active (currently in use), Unused (sources that have not been updated in a while) and All.

Using data source to update mailing lists

You can easily create a new mailing list (or update an existing one) using a data source in XML, JSON or CSV format. We strongly suggest using XML or JSON rather than CSV: the former two are structured formats and less prone to errors, while a small change in a CSV file can lead to mixed-up data.

Start by clicking the Add data source button. You will be presented with a dialog box that changes as you select various settings.

  • Name - the name of your new data source. It will be used as the name of the mailing list if a new list is to be created. If the data source will be used in templates, this is the name by which you will address the data.
  • Description - a description of the data source.
  • Source - the URL where your source data resides. We strongly suggest you take the necessary security precautions so that your data is not exposed unencrypted to the public: use the https protocol together with authorisation or IP access limitation whenever the source may contain any private data.
  • Type - the type of data source. The options are CSV, RSS, XML, SQL and JSON. Depending on the type you choose, new fields will be revealed.
  • Target (only available for XML feeds) - allows you to select how the data will be used. The options are Template, Mailing list and Delivery.
  • Mailing list (only available for mailing list feeds: CSV, JSON, XML) - allows you to select the mailing list this data source will update, or to feed a new list.
  • Empty fields - select how to treat empty fields during subsequent updates: you can either Reset or Keep the current values.
  • Authorization - select if your source requires access authorization using a username and password.
  • Scheduled update - allows you to set up an update schedule for the data source. Keep in mind that the source itself needs to be updated on the same (or a more frequent) schedule.
  • Auto update - select to update the feed right before campaign delivery. Keep in mind this will delay the scheduled delivery time, as the update of the data source needs to take place first and may take some time to complete.
  • Last update - information about when the data source was last updated.

The settings of your data source will be saved once you click the Save button. The data source is then ready for the next steps; for a mailing list data source, that means assigning the individual values to mailing list fields.

How to prepare a data source

Mailkit does not expect data sources to have a specific structure, only to follow some simple rules. Data sources for mailing lists can be provided in structured XML or JSON format or in unstructured CSV. The data sources must be in UTF-8 encoding and be valid according to the respective standard - watch out for proper encoding of characters like &, <, > and other entities or special characters. Be very careful with the CSV file format: it is unstructured, so columns can easily be swapped, removed or added, and the system won't be able to detect such a change.

The data source file must be provided at a URL accessible from Mailkit servers and protected from third-party access (especially if it contains PII). Sources can be protected by allowing access only from Mailkit IP addresses (IP network 185.136.200.0/22) or by HTTP authentication using a username and password.
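
The IP restriction described above can be checked on your side with a few lines of code. The following is a minimal sketch, assuming your own web server or application passes the caller's address as a string; the function name is hypothetical, and only the network range comes from the text above.

```python
import ipaddress

# Network range quoted in the text above for Mailkit servers.
MAILKIT_NETWORK = ipaddress.ip_network("185.136.200.0/22")


def is_mailkit_address(remote_addr: str) -> bool:
    """Return True when remote_addr lies inside the Mailkit IP network.

    Hypothetical helper: call it from whatever serves your feed and
    refuse the request when it returns False.
    """
    try:
        return ipaddress.ip_address(remote_addr) in MAILKIT_NETWORK
    except ValueError:
        # Not a parseable IP address at all, so never allow it.
        return False
```

A /22 network spans four consecutive /24 blocks, so in this sketch addresses from 185.136.200.0 through 185.136.203.255 are accepted and everything else is rejected. In practice the same rule is often placed in the web server or firewall configuration instead of application code.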

Because our customers use various systems with different capabilities, our data sources are built to be universal and do not require a specific structure. How the branches of the structure are named is left to our customers and their systems, as the individual branches can easily be mapped to specific mailing list fields in the user interface. There are some basic requirements that data sources must adhere to:

  1. Fully valid XML, JSON or CSV
  2. XML data sources do not support XML attributes (e.g. first_name="Jane" gender="F" country="USA")
  3. UTF-8 encoding is recommended
  4. Don't use "," as a separator of multiple values; use the "|" character instead
  5. The e-mail address is the unique record identifier and the only required field. If multiple records have the same e-mail address, the records will overwrite each other

To make it easier for you to prepare a data source, we have created sample structures with fields commonly used and needed in e-commerce.

XML

<?xml version="1.0" encoding="utf-8"?>
<contacts>
  <contact>
    <email>email@sample.com</email>
    <client_id>ID</client_id>
    <first_name>John</first_name>
    <last_name>Doe</last_name>
    <gender>m</gender>
    <mobile>+1xxxyyyzzzz</mobile>
    <street>One Mailkit way</street>
    <city>Utopia</city>
    <zip>12345</zip>
    <state>California</state>
    <country>USA</country>
    <birthdate>12/31/2000</birthdate>
    <reg_date>01/31/2018</reg_date>
    <first_sale>02/14/2018</first_sale>
    <last_sale>03/18/2018</last_sale>
    <last_active>06/21/2018</last_active>
    <top_category>|ID|ID|ID|</top_category>
    <top_brands>|ID|ID|ID|</top_brands>
    <top_products>|ID|ID|ID|</top_products>
    <bonus_points>123</bonus_points>
  </contact>
</contacts>
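
A feed like the XML sample above is best produced by an XML library rather than by string concatenation, so that characters such as &, < and > are escaped automatically. The sketch below is one possible way to do it in Python; the function name and the input shape (a list of dictionaries) are assumptions made for illustration, and the element names simply mirror the sample.

```python
import xml.etree.ElementTree as ET


def build_contacts_xml(contacts):
    """Serialise a list of dicts into a <contacts> feed as UTF-8 bytes.

    Every value becomes a child element (no XML attributes), and lists
    are joined with "|" as in the sample above.
    """
    root = ET.Element("contacts")
    for record in contacts:
        node = ET.SubElement(root, "contact")
        for field, value in record.items():
            if isinstance(value, (list, tuple)):
                # Multiple values: |ID|ID|ID| rather than a comma list.
                value = "|" + "|".join(str(v) for v in value) + "|"
            ET.SubElement(node, field).text = str(value)
    # ElementTree escapes &, < and > in element text when serialising.
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

For example, `build_contacts_xml([{"email": "email@sample.com", "top_brands": [12, 34]}])` returns bytes that can be written to the file served at your data source URL.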

JSON

[
	{
		"email":"email@sample.com",
		"client_id":"ID",
		"first_name":"John",
		"last_name":"Doe",
		"gender":"m",
		"mobile":"+1xxxyyyzzzz",
		"street":"One Mailkit way",
		"city":"Utopia",
		"zip":"12345",
		"state":"California",
		"country":"USA",
		"birthdate":"12/31/2000",
		"reg_date":"01/31/2018",
		"first_sale":"02/04/2018",
		"last_sale":"03/18/2018",
		"last_active":"06/21/2018",
		"top_category":"|ID|ID|ID|",
		"top_brands":"|ID|ID|ID|",
		"top_products":"|ID|ID|ID|",
		"bonus_points":"123"
	}
]
As mentioned above, e-mail is the only required field; all others are optional, yet very important. In general you should follow the "the more the better" rule while keeping in mind that "too much is too much". The data source should contain as much relevant information about the recipient as you expect to need for your current and future campaigns. If you use structured data sources, you can always safely add more data to the data source later. The example above contains the following fields and use cases:
  • email - it would be rather difficult to send emails without having an email address, so it is a required field. If your system produces a data source that includes records for all your clients, including those with no email contact, you don't have to worry - Mailkit will silently ignore these records, the same way as if the email address was invalid
  • client_id - internal client identifier, e.g. customer code or loyalty programme number
  • first_name and last_name - the first and last name of the client are very important, not only for personalization of content but also for deliverability. Having a full name in the sent email instead of just a plain email address not only looks better to recipients; some spam filters also take it into account, as it indicates an existing relationship with the customer. If your system can't produce the first and last name as separate fields in the data source, you can use the Fullname field, which will be split into first and last name automatically (the first name must come first)
  • gender - the recipient's gender can be used not only for a personalised greeting but also for segmentation or advanced personalization of dual content, such as a separate design for women and men. The value in the gender field must be m for male and f for female. The importance of knowing the gender is often underestimated, yet it is very easy to obtain using automated methods during subscription, whether you use our subscribe forms or third-party tools like genderize.io
  • mobile - a mobile phone number in international format can be used to run SMS campaigns and increase your reach with recipients who no longer engage with your emails
  • street, city, zip, state, country - these are all great data points to use for segmentation and personalization. You can easily include the nearest store location or location-based sales events in your emails
  • birthdate - the date of birth lets you run anniversary campaigns offering discounts, send birthday cards, etc. The date format should match the format you intend to use in your campaigns. Remember that the full date, not only the year, is required to calculate the anniversary
  • reg_date and first_sale - the date of registration (sign-up) and the date of the first order allow you to send shopping anniversary campaigns and use the information for segmentation. The date format should match the format you intend to use in your campaigns
  • last_sale and last_active - knowing the date of the last order and last activity can help you control your regular sales campaigns and reduce their frequency after a sale has been made, send after-sale satisfaction polls and run other after-sale and reactivation campaigns. The date format should match the format you intend to use in your campaigns
  • top_category, top_brands and top_products - the most popular category, top brands and top products allow you to do efficient segmentation and, more importantly, automate the content of your campaigns. Combined with a product feed data source, you can easily add dynamic content matching the interests of each recipient to the campaigns.
  • bonus_points - if you have a loyalty programme for your clients, it's nice to show them their current status in every email and tell them how they can use the points

This is just an example of possible values. Every company has a different data set and different needs. That is why Mailkit uses data sources in a universal way, and for XML and JSON sources, adding data has no impact on existing functionality. Quite the contrary: if you set up your data source with just the basic fields and add further fields later on, all you have to do is open the data source, click the Display structure button and add mappings for the new fields.
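
Some of the field rules above can be applied on your side before the feed is published. The following sketch shows one way to do that for a JSON feed. The function names are hypothetical, and several choices are assumptions rather than documented Mailkit behaviour: e-mail addresses are lower-cased before comparison, the last record for a given address wins, and a gender value other than m or f is dropped instead of being sent.

```python
import json


def normalise_records(rows):
    """Drop rows without a usable e-mail and keep one row per address."""
    by_email = {}
    for row in rows:
        email = (row.get("email") or "").strip().lower()
        if "@" not in email:
            # Mailkit would ignore these anyway; skipping keeps the feed small.
            continue
        record = dict(row, email=email)
        gender = str(record.get("gender", "")).strip().lower()
        if gender in ("m", "f"):
            record["gender"] = gender
        else:
            # Assumption: omit the field rather than send an invalid value.
            record.pop("gender", None)
        # Assumption: a later record overwrites an earlier one.
        by_email[email] = record
    return list(by_email.values())


def to_json_feed(rows):
    """Return the cleaned records as a JSON array, as in the sample above."""
    return json.dumps(normalise_records(rows), ensure_ascii=False, indent=2)
```

Making the overwrite order explicit in your own export avoids surprises when two source records share an address.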

Importing data from data source

Once your data source has been set up, you need to map the values to individual fields in the mailing list. To start mapping the fields or to see the current mapping, click the Display structure button. This will analyze the data source and display its structure and available fields. After completing the mapping of the fields, click Save. Once the mapping is done, you can initiate the import manually using the Import button. If the data source was set up to use a new mailing list as its target, the list will be created (with a name matching the name of the data source) and the data from the data source will start importing in the background.

If you have set up a schedule for your data source, it will be updated automatically according to that schedule. If you have opted for auto update, the data source will be updated before delivery of any campaign that uses it. Keep in mind that this may delay the delivery of your campaign, as it may take up to several minutes to process your data sources. It is generally recommended to use scheduled updates rather than auto update.

Using XML & RSS data sources in templates

Setting up XML & RSS data sources for use in templates is very similar to the setup described above, but no field mapping is needed. Access to the data from the template is determined by the structure of the XML/RSS data source and the names of its individual branches, which makes the setup easier.

[% FOREACH data.DS_RSS_EXAMPLE -%]
<div>
	<a href="[% URL -%]"><img src="[% ENCLOSURE -%]" alt="[% TITLE -%]"></a>
	<a href="[% URL -%]">[% TITLE -%]</a>[% DESCRIPTION -%]
</div>
[% END -%]

The sample template code above loads an RSS data source named EXAMPLE, referenced in the template as data.DS_RSS_EXAMPLE. The FOREACH command loops over all records of the data source and, for each record, outputs the given HTML code filled with values obtained from the data source. The standard RSS tags can be accessed directly and used within the HTML code. This is a trivial example that applies to a simple RSS data source. Mailkit's templating language allows for very complex loops, if/else statements, mathematical operations and advanced data manipulation. More information on templating can be found in the Using templates section.
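
To make the loop more concrete, here is a rough Python equivalent that can be run outside Mailkit, for example to preview what a feed would produce. It is only an illustration, not how Mailkit renders templates. The function name is hypothetical, and the mapping of template variables to RSS elements is an assumption: URL is taken from the item's link element and ENCLOSURE from the url attribute of its enclosure element.

```python
import xml.etree.ElementTree as ET
from html import escape


def render_rss_items(rss_xml: str) -> str:
    """Emit one <div> block per RSS <item>, like the FOREACH loop above."""
    root = ET.fromstring(rss_xml)
    blocks = []
    for item in root.iter("item"):
        # Assumed mapping of template variables to RSS elements.
        title = escape(item.findtext("title", default=""), quote=True)
        url = escape(item.findtext("link", default=""), quote=True)
        description = item.findtext("description", default="")
        enclosure = item.find("enclosure")
        image = escape(enclosure.get("url", ""), quote=True) if enclosure is not None else ""
        blocks.append(
            "<div>\n"
            f'\t<a href="{url}"><img src="{image}" alt="{title}"></a>\n'
            # The description is inserted unescaped, as in the template.
            f'\t<a href="{url}">{title}</a>{description}\n'
            "</div>"
        )
    return "\n".join(blocks)
```

Each item in the feed yields one block, so the length of the rendered section follows the number of records in the data source.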

Product data sources

Data sources can also be used to load your product portfolio from one of the many supported product feeds directly into Mailkit's SQL database, for later use of product information in campaigns. This is where the power of data sources and programmable templates truly shines, as it lets you combine data from multiple sources and automatically create truly dynamic content for each recipient.

Mailkit supports product information provided in many standard product feed formats, such as Google Merchant feed, Heureka or Shopero, or you can create your own feed with the information relevant to your business. Because product feeds can contain a huge amount of information about hundreds of thousands of items and speed of access is critical, this type of data source is loaded directly into our SQL database and can be accessed using SQL queries. To set up your product feed, select the SQL type of data source and proceed with the data source setup. For the standard feed formats there are no options and the system will process them automatically. If you choose to use a custom format, you can use the Display structure button to choose your primary key and up to 3 indexed fields. Feel free to contact our helpdesk if you need assistance setting up your feeds.
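
The idea of a primary key plus up to three indexed fields can be tried out locally before you decide how to configure a custom feed. The sketch below uses SQLite purely as a stand-in: the table name, column names and choice of indexed fields are all hypothetical, and nothing here describes Mailkit's actual schema or SQL dialect.

```python
import sqlite3


def load_products(conn, products):
    """Load product dicts into a table with one primary key and three indexes."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products ("
        " product_id TEXT PRIMARY KEY,"
        " category TEXT, brand TEXT, price REAL, title TEXT, url TEXT)"
    )
    # Index the fields you expect to filter on most often.
    for column in ("category", "brand", "price"):
        conn.execute(
            f"CREATE INDEX IF NOT EXISTS idx_products_{column} ON products ({column})"
        )
    # A repeated product_id replaces the earlier row, as a feed refresh would.
    conn.executemany(
        "INSERT OR REPLACE INTO products"
        " VALUES (:product_id, :category, :brand, :price, :title, :url)",
        products,
    )
    conn.commit()
```

With the data loaded, a query such as `SELECT title, url FROM products WHERE brand = ? ORDER BY price` shows which lookups are served by the indexed fields, which is a useful guide when picking them.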

Delivery data sources

Delivery data sources are feeds meant to pass structured information that drives campaign delivery. Normally a campaign uses mailing lists to drive delivery; with a delivery data source, the delivery is driven by the feed and sent only to the e-mail addresses present in the data source. It is an alternative to a mailkit.sendmail_mass API call - a way of passing highly structured data to your campaign. It is most commonly used by CRM systems to send invoices, by recommendation systems to send emails with specific content to specific users, etc. These feeds must adhere to the specific structure described below, which closely resembles that of the mailkit.sendmail_mass API.

<?xml version="1.0"?>
<deliveryFeed>
  <feedItem>
    <recipient>
      <email>recipient email (required)</email>
      <first_name>First name (optional)</first_name>
      <last_name>Last name (optional)</last_name>
      <gender>M (optional)</gender>
      ... other standard recipient fields
    </recipient>
    <subject>subject (optional)</subject>
    <message_data>static message content (optional)</message_data>
    <attachment>
      <file_url>url (optional - only for transactional messages)</file_url>
      <file_url>url (optional - only for transactional messages)</file_url>
      <file_url>url (optional - only for transactional messages)</file_url>
    </attachment>
    <content>
      <!-- XML structured values (optional) example    -->
    </content>
  </feedItem>
</deliveryFeed>
The recipient fields in the delivery data source take precedence over the values present in the mailing list and will replace the original values in the list during delivery. In this sense the delivery data source can both drive the delivery and update the underlying mailing list.
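
A delivery feed with the structure above can be generated from your CRM or recommendation system with a short script. This is a minimal sketch, assuming each delivery is described by a dictionary; the function name and the dictionary keys (recipient, subject, message_data, attachments) are hypothetical, while the element names follow the sample structure. The optional content element is left out because its inner structure depends on your template.

```python
import xml.etree.ElementTree as ET


def build_delivery_feed(items):
    """Serialise delivery items into a <deliveryFeed> document (UTF-8 bytes)."""
    root = ET.Element("deliveryFeed")
    for item in items:
        recipient_data = item.get("recipient", {})
        if not recipient_data.get("email"):
            # The recipient e-mail is the only required value.
            raise ValueError("each feed item needs a recipient email")
        feed_item = ET.SubElement(root, "feedItem")
        recipient = ET.SubElement(feed_item, "recipient")
        for field, value in recipient_data.items():
            ET.SubElement(recipient, field).text = str(value)
        for optional in ("subject", "message_data"):
            if item.get(optional):
                ET.SubElement(feed_item, optional).text = str(item[optional])
        if item.get("attachments"):
            attachment = ET.SubElement(feed_item, "attachment")
            for url in item["attachments"]:
                ET.SubElement(attachment, "file_url").text = url
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

Because the recipient fields in the feed replace the values stored in the mailing list, it is worth sending only fields you actually want to overwrite.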