Oozie and Flume
Oozie runs as a server, and a client submits a workflow directly to it. A workflow is a DAG (directed acyclic graph) of action nodes and control-flow nodes. An action node performs a workflow task, such as moving files in HDFS or running a MapReduce or Pig job.
A control-flow node governs the workflow execution between actions by allowing constructs such as conditional logic or parallel execution. When the workflow finishes, Oozie can make an HTTP callback to the client to notify it of the workflow status. It is also possible to receive callbacks every time the workflow enters or exits an action node.
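The relationship between action nodes, control-flow nodes, and status callbacks can be illustrated with a small Python model. This is only a sketch of the idea, not Oozie's implementation or API: the class names, the `run_workflow` function, the `notify` callback (standing in for the HTTP callback), and the status labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass
class Action:
    """Action node: runs one task, then transitions to `ok` or `error`."""
    name: str
    task: Callable[[], None]
    ok: str
    error: str = "fail"


@dataclass
class Decision:
    """Control-flow node: chooses the next node from a condition."""
    name: str
    condition: Callable[[], bool]
    if_true: str
    if_false: str


Node = Union[Action, Decision]


def run_workflow(nodes: Dict[str, Node], start: str,
                 notify: Callable[[str, str], None]) -> str:
    """Walk the graph from `start` until the terminal node "end" or "fail".

    `notify(node_name, status)` stands in for the HTTP callback; it fires
    when an action node is entered, when it exits, and when the workflow
    as a whole finishes.
    """
    current = start
    while current not in ("end", "fail"):
        node = nodes[current]
        if isinstance(node, Decision):
            # Control-flow nodes do no work themselves; they only route.
            current = node.if_true if node.condition() else node.if_false
            continue
        notify(node.name, "RUNNING")      # entering the action node
        try:
            node.task()
        except Exception:
            notify(node.name, "ERROR")    # exiting with a failure
            current = node.error
        else:
            notify(node.name, "OK")       # exiting successfully
            current = node.ok
    notify("workflow", "SUCCEEDED" if current == "end" else "KILLED")
    return current


# Example: move files, then run the MapReduce step only if there is input.
events: List[Tuple[str, str]] = []
nodes: Dict[str, Node] = {
    "move-files": Action("move-files", lambda: None, ok="has-input"),
    "has-input": Decision("has-input", lambda: True,
                          if_true="mapreduce", if_false="end"),
    "mapreduce": Action("mapreduce", lambda: None, ok="end"),
}
final = run_workflow(nodes, "move-files",
                     lambda name, status: events.append((name, status)))
```

In this model the decision node produces no events of its own; only action nodes and the workflow as a whole report status, which mirrors the callbacks described above.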
Oozie allows failed workflows to be rerun from an arbitrary point. This is useful for dealing with transient errors when the early actions in the workflow are time-consuming to execute.
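The benefit of rerunning from a chosen point can be sketched in a few lines of Python. This is a simplified model rather than Oozie's rerun mechanism: `run_actions`, `make_task`, and the action names are hypothetical, and the workflow is reduced to a linear sequence of actions.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple


def run_actions(order: List[str], tasks: Dict[str, Callable[[], None]],
                skip: Iterable[str] = ()) -> Tuple[List[str], Optional[str]]:
    """Run actions in order, skipping any name listed in `skip`.

    Returns the actions completed in this run and the name of the action
    that failed (None if every action that was attempted succeeded).
    """
    skipped = set(skip)
    completed: List[str] = []
    for name in order:
        if name in skipped:
            continue  # finished in an earlier run, so do not repeat it
        try:
            tasks[name]()
        except Exception:
            return completed, name
        completed.append(name)
    return completed, None


# Count how often each action actually executes.
calls = {"ingest": 0, "aggregate": 0, "export": 0}


def make_task(name: str, fail_first: bool = False) -> Callable[[], None]:
    def task() -> None:
        calls[name] += 1
        if fail_first and calls[name] == 1:
            raise RuntimeError("simulated failure on the first attempt")
    return task


order = ["ingest", "aggregate", "export"]
tasks = {
    "ingest": make_task("ingest"),
    "aggregate": make_task("aggregate"),
    "export": make_task("export", fail_first=True),
}

# First run: the last action fails after the two slow ones have finished.
done, failed = run_actions(order, tasks)

# Rerun: skip what already succeeded and resume at the failed action.
done_again, failed_again = run_actions(order, tasks, skip=done)
```

After the rerun, `ingest` and `aggregate` have each executed once and only `export` has executed twice, which is the saving that matters when the early actions are expensive.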
Why Oozie Security?
- Users are not allowed to alter another user's jobs
- Hadoop does not support authentication of end users
- Oozie has to verify and confirm a user's identity before passing the job to Hadoop
Apache Flume is a continuous data ingestion system intended primarily for the big data ecosystem. It has the following features:
- Open source