in Azure by (17.6k points)

I'm setting up a pipeline in an Azure "Data Factory", for the purpose of taking flat files from storage and loading them into tables within an Azure SQL DB.

The template for this pipeline specifies that I need a start and end time, which the tutorial says to set to 1 day.

I'm trying to understand this. If it were a cron job in Linux or a scheduled task in Windows Server, I'd simply tell it when to start (e.g. daily at 6 am) and it would take however long it takes to complete.

This leads me to several related questions:

  • Why would I need to specify an end time?
  • What if I don't know how long it will take to run?
  • If I set it too far in the future, do I run the risk of the data pipeline not completing in a timely manner?
  • If I set it too soon, will the pipeline break?
  • Why is it hardcoded as a date instead of a frequency (i.e. it says to use this format: "2014-10-14T16:32:41Z")?

1 Answer

by (14.3k points)

On why you would need an end time: you don't actually need to specify one. If you want, the pipeline can run indefinitely.

On why start and end are dates rather than a frequency: together they define the overall date interval during which your pipeline stays active. They do not say how often it runs.

On not knowing how long a run will take, or setting the end time too soon: once the activities start, they run to completion.

On setting the end time too far in the future: no, that does not slow anything down. Whether the pipeline completes in a timely manner depends only on your cluster size, your data volume, and your concurrency setting.
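To picture the difference between the active period and the run frequency, here is a small Python sketch. It is only an illustration, not Data Factory's API: `parse_utc` and `activity_windows` are hypothetical helpers, and it assumes the service schedules one run per interval inside the active period, with a daily interval taken from the tutorial's 1-day setting.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(ts: str) -> datetime:
    """Parse a timestamp such as "2014-10-14T16:32:41Z" as a UTC datetime."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def activity_windows(start: str, end: str, interval: timedelta):
    """Yield (window_start, window_end) pairs that fit inside [start, end).

    Hypothetical helper: start and end bound the active period, while
    interval plays the role of the run frequency.
    """
    cursor, stop = parse_utc(start), parse_utc(end)
    while cursor + interval <= stop:
        yield cursor, cursor + interval
        cursor += interval


# A three-day active period with an assumed daily interval.
windows = list(
    activity_windows("2014-10-14T00:00:00Z", "2014-10-17T00:00:00Z", timedelta(days=1))
)
for w_start, w_end in windows:
    print(w_start.isoformat(), "->", w_end.isoformat())
```

With these inputs the loop produces three daily windows. The end date only limits how many windows exist; it says nothing about how long the work inside each window takes.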

I hope this will help.
