In his Azure Data Week session, Azure Data Factory - Movement to and in the Cloud, Chris Seferlis walks through a traditional SSIS package that ETLs data and presents it for reporting, then compares it to the equivalent process in Azure Data Factory, sharing tips and a look at the roadmap.
There were many questions Chris was unable to get to during his session, and we're happy to share them, along with his answers, now. If you missed Chris' session or the entire week, you can still purchase access to the recordings by visiting azuredataweek.com.
Below are the questions and answers from session attendees:
Is the data movement activity currently a lot costlier than SSIS if we need to make multiple API calls?
A: It depends on your design. At most 4 activities run in parallel within a pipeline, so for a complex solution you will still get better performance from SSIS and should consider running the package in an ADF integration runtime.
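To get a feel for what a cap of 4 parallel activities means for a job that makes many API calls, here is a minimal Python sketch. It is not ADF code: call_api is a hypothetical stand-in that sleeps instead of making a request, and MAX_PARALLEL simply takes the figure quoted in the answer as an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
import time

# Assumption: the cap of 4 comes from the answer above, not from a measured limit.
MAX_PARALLEL = 4


def call_api(page):
    """Hypothetical stand-in for one API call; sleeps instead of sending a request."""
    time.sleep(0.05)
    return {"page": page, "rows": page * 10}


def fetch_all(pages, max_parallel=MAX_PARALLEL):
    """Run the calls with at most max_parallel in flight; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(call_api, pages))


def estimated_batches(n_calls, max_parallel=MAX_PARALLEL):
    """Rough number of sequential waves needed if every call takes the same time."""
    return -(-n_calls // max_parallel)  # ceiling division


if __name__ == "__main__":
    results = fetch_all(range(1, 11))
    print(len(results), "calls completed in about", estimated_batches(10), "waves")
```

With equal-length calls, the number of waves grows with the call count divided by the cap, which is why a design with many small API calls can feel slower than a package that controls its own parallelism.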
With the new ADF features, would we be able to migrate SSIS packages or would we have to rewrite them?
A: I don't have any information from Microsoft on this. Given that SSIS packages are stored as XML and ADF pipelines are defined in JSON, there's potential for a conversion tool, but I would say it's pretty unlikely anytime soon.
Can you migrate from ADFv1 to ADFv2 or do you have to recreate the objects?
A: There is a migration tool from v1 to v2 and it can be found here: https://www.microsoft.com/en-us/download/details.aspx?id=57070
Did you first copy from http to Azure Blob Storage and then move that to Azure SQL DB? What are the benefits of having the blob storage step in between?
A: I did first copy to blob storage. I did this because I wanted to keep the original file for archival purposes after I was done with my extraction. Also, there were originally some challenges with extracting a file of this size "on the fly", which would cause the pipeline to fail. This may have been fixed, but I haven't tried since.
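The stage-then-load pattern described in this answer can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the pipeline from the session: a local folder stands in for Azure Blob Storage, an in-memory SQLite database stands in for Azure SQL DB, and the function names, the sales table, and its columns are all hypothetical.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def stage_file(raw_bytes, staging_dir, name):
    """Write the downloaded bytes unchanged to the staging area and return the path."""
    staging_dir = Path(staging_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)
    target = staging_dir / name
    target.write_bytes(raw_bytes)
    return target


def load_staged_file(path, conn):
    """Load the staged CSV into a table; the staged file stays behind as the archive copy."""
    with open(path, newline="", encoding="utf-8") as handle:
        rows = [(r["id"], r["amount"]) for r in csv.DictReader(handle)]
    conn.execute("CREATE TABLE IF NOT EXISTS sales (id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO sales (id, amount) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    # Hypothetical payload standing in for the file fetched over HTTP.
    payload = b"id,amount\n1,9.5\n2,20.0\n"
    with tempfile.TemporaryDirectory() as staging:
        staged = stage_file(payload, staging, "sales.csv")
        conn = sqlite3.connect(":memory:")
        print(load_staged_file(staged, conn), "rows loaded; archive kept at", staged)
        conn.close()
```

Because the load step reads from the staged copy rather than the live download, a failed load can be retried without fetching the file again, and the original file remains available for archiving.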
Do you still use SSRS? Can you get better performance with SSRS when querying SSAS or a Data Warehouse vs Transactional DB?
A: No, I did away with SSRS for this use case because of the interface and the difficulty of handling so many records. I'm sure it would load quicker if I loaded the data into an AS cube, but even the export and navigation options of SSRS perform poorly and just don't provide a good experience for the end user.
Mac!? Boo Hiss :)
A: lol… shows the power of the web! I can do ADF pipelines on any platform!