Oozie and Flume
Oozie runs as a server, with a client that submits workflows to it. Here a workflow is a DAG (directed acyclic graph) of action nodes and control-flow nodes. An action node performs a workflow task, such as moving files in HDFS, running a MapReduce job, or running a Pig job.
A control-flow node governs the workflow execution between actions by allowing constructs such as conditional logic or parallel execution. When the workflow finishes, Oozie can make an HTTP callback to the client to notify it of the workflow status. It is also possible to receive callbacks every time the workflow enters or exits an action node.
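The structure described above can be sketched as a minimal Oozie workflow definition. The node names, paths, and property placeholders below are illustrative, not taken from the original text:

```xml
<!-- Minimal sketch of an Oozie workflow: a start node, one action node
     that moves files in HDFS, and control-flow routing on success/error. -->
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="move-files"/>

    <!-- Action node: performs a workflow task (here, an HDFS file move) -->
    <action name="move-files">
        <fs>
            <move source="${nameNode}/data/incoming"
                  target="${nameNode}/data/staged"/>
        </fs>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <!-- Control-flow nodes: terminal states for failure and success -->
    <kill name="fail">
        <message>Workflow failed at [${wf:lastErrorNode()}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

The `<ok>`/`<error>` transitions on each action are what make the definition a DAG: every action names its successors, and Oozie follows those edges at run time.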
Oozie allows failed workflows to be rerun from an arbitrary point. This is useful for dealing with transient errors when the early actions in the workflow are time-consuming to execute.
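A rerun is typically issued with the `oozie job -rerun <job-id>` command, with a couple of job properties controlling which nodes are re-executed. A sketch of those properties (values illustrative):

```properties
# Rerun only the nodes that failed, keeping the output of
# already-completed (possibly time-consuming) early actions:
oozie.wf.rerun.failnodes=true

# Alternatively, name specific nodes to skip on the rerun:
# oozie.wf.rerun.skip.nodes=move-files
```

The two properties are mutually exclusive: either rerun from the failed nodes, or explicitly list the completed nodes to skip.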
Why Oozie Security?
- One user should not be able to alter another user's job
- Hadoop does not authenticate the end user
- Oozie has to verify its user before passing the job on to Hadoop
Apache Flume is a continuous data ingestion system intended for the big data ecosystem. It has the following features:
- Open source