Handling large data imports in Sitefinity

Several situations in Sitefinity call for importing large amounts of data, such as migrating a website from another CMS or importing data from an external source. Importing thousands or millions of records is time consuming, and there is a real possibility of a timeout. The risk is highest when each item takes a long time to process or is resource intensive, for example when an image, a document, or a video has to be attached to a dynamic item.

The idea is to loop through the records quickly, defer the actual processing of each item, and leave the heavy work to Sitefinity’s scheduled tasks. Each time a task is scheduled, an entry is made in the database table sf_scheduled_task. Querying this table also reveals the scheduled tasks that Sitefinity itself uses internally, such as those for synchronization, sitemap generation, and item deletion. This tutorial is very helpful:

http://docs.sitefinity.com/for-developers-publishing-system/tutorial-schedule-a-timed-task-to-upload-content

To do that, we need to pass custom data to the scheduled task. In the following code snippet, the item title and the file path of the PDF that should be attached as related media to the dynamic module item are passed as custom data. The custom data is stored as a single tab-separated string.
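A minimal sketch of how such a tab-separated string could be packed and unpacked. The `ImportTaskData` class and its method names are hypothetical and not part of Sitefinity. In a custom task class, these helpers would typically be called from the `GetCustomData` and `SetCustomData` overrides of `ScheduledTask`.

```csharp
using System;

// Hypothetical helper: packs and unpacks the custom data for one import task.
public static class ImportTaskData
{
    private const char Separator = '\t';

    // Joins the item title and the PDF path into one tab-separated string.
    public static string Pack(string title, string pdfPath)
    {
        if (title == null) throw new ArgumentNullException(nameof(title));
        if (pdfPath == null) throw new ArgumentNullException(nameof(pdfPath));
        if (title.IndexOf(Separator) >= 0 || pdfPath.IndexOf(Separator) >= 0)
        {
            // A tab inside a value would make the string ambiguous when split.
            throw new ArgumentException("Values must not contain a tab character.");
        }
        return title + Separator + pdfPath;
    }

    // Splits the stored string back into the title and the PDF path.
    public static void Unpack(string customData, out string title, out string pdfPath)
    {
        if (customData == null) throw new ArgumentNullException(nameof(customData));
        string[] parts = customData.Split(Separator);
        if (parts.Length != 2)
        {
            throw new FormatException("Expected two tab-separated values.");
        }
        title = parts[0];
        pdfPath = parts[1];
    }
}
```

Rejecting values that contain a tab keeps the format unambiguous. If titles can contain tabs, a structured format such as JSON would be a safer choice for the custom data.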

[Image: Sitefinity Data Import]

In the example below, for simplicity and code reusability, the data to be imported is in comma-separated values (CSV) format. That lets us use TextFieldParser to read the records, and for each record we schedule a task to take care of the time-consuming operations, such as extracting text from a PDF file and attaching the PDF as related media to the dynamic item. This reduces the time needed to loop through all the records. In this case we reuse the same scheduled task by passing it custom data, just as we pass arguments to a method.
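The loop can be sketched as follows, assuming a CSV file without a header row whose first two columns are the item title and the PDF path. `CsvImport`, `QueueRecords`, and the `scheduleTask` callback are hypothetical names. In Sitefinity, the callback would create the scheduled task with the given custom data and register it, for example as described in the tutorial linked above.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO; // TextFieldParser lives in this namespace

public static class CsvImport
{
    // Reads CSV records and hands each one to scheduleTask as tab-separated
    // custom data. Returns the number of records handed over.
    public static int QueueRecords(TextReader csv, Action<string> scheduleTask)
    {
        int queued = 0;
        using (var parser = new TextFieldParser(csv))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true;

            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                if (fields == null || fields.Length < 2)
                {
                    continue; // skip records that lack one of the two columns
                }

                // Assumed column order: title, PDF path.
                scheduleTask(fields[0] + "\t" + fields[1]);
                queued++;
            }
        }
        return queued;
    }
}
```

Because the loop only parses a line and hands it off, it does no heavy work itself. The expensive steps run later, one task at a time.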

[Image: Sitefinity Data Import v2]

It is helpful to query the sf_scheduled_task table to inspect the status of each task. Two columns, status and status_message, give the task status and the error message, if any. Note that a task that completes successfully is deleted from the table, so only tasks that failed and tasks that are still scheduled to run remain. Another useful column is progress. The custom data we pass is stored in task_data, and there are timestamp columns such as execute_time. Understanding this table gives valuable insight into the data being imported: we can find the tasks that failed and reschedule them to import that data again. We can also build a dashboard page that queries this table, shows the progress visually, and lists the data that is waiting to be processed. Below is sample data from this table:
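As a starting point for such a dashboard, the sketch below selects only the columns named above and converts the result rows into plain objects. `ScheduledTaskRow` and `ScheduledTaskReport` are hypothetical names. All values are read as text because the exact column types are not assumed here, and the reader is expected to come from running the query against the Sitefinity database (connection handling is left out).

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Globalization;

// Hypothetical row type holding the columns discussed in this post.
public sealed class ScheduledTaskRow
{
    public string Status { get; set; }
    public string StatusMessage { get; set; }
    public string Progress { get; set; }
    public string TaskData { get; set; }
    public string ExecuteTime { get; set; }
}

public static class ScheduledTaskReport
{
    // Only the columns named in this post are selected.
    public const string Query =
        "SELECT status, status_message, progress, task_data, execute_time " +
        "FROM sf_scheduled_task ORDER BY execute_time";

    // Converts every row of an open data reader into a ScheduledTaskRow.
    public static List<ScheduledTaskRow> ReadRows(IDataReader reader)
    {
        var rows = new List<ScheduledTaskRow>();
        while (reader.Read())
        {
            rows.Add(new ScheduledTaskRow
            {
                Status = Text(reader["status"]),
                StatusMessage = Text(reader["status_message"]),
                Progress = Text(reader["progress"]),
                TaskData = Text(reader["task_data"]),
                ExecuteTime = Text(reader["execute_time"])
            });
        }
        return rows;
    }

    // Returns null for database NULLs, otherwise a culture-invariant string.
    private static string Text(object value)
    {
        if (value == null || value == DBNull.Value)
        {
            return null;
        }
        return Convert.ToString(value, CultureInfo.InvariantCulture);
    }
}
```

Since successfully completed tasks are removed from the table, every row returned is either waiting to run or needs attention.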

[Image: Sitefinity Data Import v3]

Mani is a technical lead at Americaneagle.com with over 11 years of programming experience.

He has a master's degree in Information Systems and an engineering background.

He currently focuses on Sitefinity and has led the successful implementation of several Sitefinity upgrades. He loves problem solving and believes in simple solutions for complex problems.

1 review

  1. kelvin koh | Jun 25, 2018
    5.0000000000
    Hi Mani, Do you know if Sitefinity has its own import data function into its widgets? I've tried scouring the web but it seems there isn't one and the one i found requires me to have access to root directory which i don't have. https://www.sitefinity.com/blogs/tim-williamsons-blog/2013/11/20/importing-data-into-custom-modules I want to be able to import an excel sheet and customise the data to fit my website. Help much appreciated
