UPDATE: here’s an official blog post going into more technical detail on how they achieved the improvements in startup time.
If you want to run SSIS packages in Azure Data Factory, you need the Azure SSIS Integration Runtime (quite the mouthful), which is basically a cluster of virtual machines handling the packages like an SSIS scale-out cluster. You can read more about it in the article Configure an Azure SQL Server Integration Services Integration Runtime.
Previously, it took about 20-30 minutes to start the runtime, which was less than ideal. If you wanted to run your ETL multiple times a day, you’d lose quite a bit of time, unless you kept the runtime running the entire time, which costs money.
Luckily, the team behind the IR made some changes and the runtime now starts in about 4-5 minutes. Quite the improvement! Now it’s easier to have multiple batches in a day and still save money. Normally you don’t have to do anything because the change is automatic, but I did recently upgrade the virtual machines of the runtime to a newer version.
The startup time depends on the size of the cluster and on any custom setup you have configured.
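To get a feel for what the shorter startup means when you start and stop the runtime around each batch, here is a rough back-of-the-envelope sketch. The number of batches and the batch duration are made-up values for illustration. The startup times are midpoints of the ranges mentioned above. It only counts minutes and says nothing about how the time is billed.

```python
def daily_wait_minutes(batches_per_day: int, startup_minutes: float) -> float:
    """Minutes per day spent waiting for the runtime to start (one start per batch)."""
    return batches_per_day * startup_minutes


def daily_up_hours(batches_per_day: int, startup_minutes: float, batch_minutes: float) -> float:
    """Hours per day the runtime is starting or running when stopped after each batch."""
    return batches_per_day * (startup_minutes + batch_minutes) / 60


# Hypothetical workload: four ETL batches a day, 30 minutes each.
batches = 4
batch_minutes = 30

# Midpoints of the startup ranges from the post: 20-30 minutes before, 4-5 minutes now.
old_startup = 25
new_startup = 4.5

print("Waiting per day (old):", daily_wait_minutes(batches, old_startup), "minutes")
print("Waiting per day (new):", daily_wait_minutes(batches, new_startup), "minutes")
print("Up time per day (old):", round(daily_up_hours(batches, old_startup, batch_minutes), 2), "hours")
print("Up time per day (new):", round(daily_up_hours(batches, new_startup, batch_minutes), 2), "hours")
print("Up time per day (always on): 24 hours")
```

With these assumed numbers, the daily waiting time drops from 100 minutes to 18 minutes, and in both cases the runtime is up for far less than the 24 hours of an always-on setup.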
I was really excited to hear this, but it must be pointed out (info in the provided link to Microsoft's blog) that the benefits can only be seen on those SSIS-IR provisioned outside a VNet. Unfortunately, the startup time for those inside a VNet will remain at 20-30 minutes.
It probably has something to do with the pool of VMs they have standing idle. Much harder to do for a VNet.