Around 2009, the Stratosphere research project started at TU Berlin; a few years later it became the Apache Flink project. Flink is often compared with Apache Spark, but it additionally offers pipelining (inter-operator parallelism), which makes it better suited to incremental data processing and therefore to stream processing. The Stratosphere project produced the following contributions to Big Data processing, most of which can be found in Flink today:
- The Stratosphere platform for big data analytics
- Iterative Parallel Data Processing with Stratosphere: An Inside Look
- Large-Scale Social-Media Analytics on Stratosphere
- Massively-Parallel Stream Processing under QoS Constraints with Nephele
- Optimistic Recovery for Iterative Dataflows in Action
- Meteor/Sopremo: An Extensible Query Language and Operator Model
- PAXQuery: Parallel Analytical XML Processing
- “All Roads Lead to Rome:” Optimistic Recovery for Distributed Iterative Data Processing
- Flink Forward: Conference Around Apache Flink
- Quick Start: Setup