Cascading is an application framework that lets Java developers quickly and easily build robust data analytics and data management applications on Apache Hadoop.
At its core, Cascading is a rich Java API for defining complex data flows and creating sophisticated data-oriented frameworks. These frameworks can be Maven-compatible libraries or domain-specific languages (DSLs) for scripting.
Cascading allows developers to create and test rich functionality before tackling complex integration problems. As a result, integration points can be developed and tested on their own before they are plugged into a production data flow.
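The idea of testing functionality before integration can be illustrated with a minimal sketch in plain Java. It deliberately does not use the Cascading API: the class `FlowSketch` and the methods `countWords` and `run` are hypothetical names invented for this illustration. The processing step is written without any knowledge of where its input comes from or where its output goes, so it can be exercised with in-memory data first and wired to real sources and sinks later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical illustration only; none of these names come from Cascading.
class FlowSketch {

    // Pure processing step: counts words, independent of any source or sink.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted keys for stable output
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word.toLowerCase(), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Integration points are passed in, so tests can substitute in-memory
    // stand-ins and production code can supply real readers and writers.
    static void run(Supplier<List<String>> source,
                    Consumer<Map<String, Integer>> sink) {
        sink.accept(countWords(source.get()));
    }

    public static void main(String[] args) {
        // In-memory source and a capturing sink stand in for production endpoints.
        List<Map<String, Integer>> captured = new ArrayList<>();
        run(() -> List.of("to be or", "not to be"), captured::add);
        System.out.println(captured.get(0)); // prints {be=2, not=1, or=1, to=2}
    }
}
```

In this sketch the same `run` method would serve both the test and the production wiring; only the supplied source and sink change.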
The Process Scheduler, coupled with the Riffle lifecycle annotations, allows Cascading to schedule units of work from any third-party application.