RDD: Resilient Distributed Dataset
- Basic abstraction in Spark
- Collection of elements that can be operated on in parallel
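
To make "a collection of elements that can be operated on in parallel" concrete, here is a minimal toy sketch in plain Python. It is not Spark's API: the class name ToyRDD, its methods, and the use of a local thread pool are all assumptions made for illustration. It only shows the idea of splitting a dataset into partitions and applying an operation to each partition independently; it does not model fault tolerance or distribution across machines.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


class ToyRDD:
    """Hypothetical stand-in for an RDD; NOT Spark's API."""

    def __init__(self, partitions):
        # Each partition is stored as an immutable chunk of the dataset.
        self.partitions = [tuple(p) for p in partitions]

    @classmethod
    def parallelize(cls, data, num_partitions):
        # Split the data into contiguous chunks, one per partition.
        data = list(data)
        size = -(-len(data) // num_partitions)  # ceiling division
        return cls(data[i * size:(i + 1) * size] for i in range(num_partitions))

    def map(self, fn):
        # Apply fn to every partition independently, one task per partition,
        # and return a new ToyRDD (the original is left unchanged).
        with ThreadPoolExecutor() as pool:
            mapped = pool.map(lambda part: [fn(x) for x in part], self.partitions)
            return ToyRDD(mapped)

    def reduce(self, fn):
        # Combine within each non-empty partition, then combine the partial results.
        partials = [reduce(fn, p) for p in self.partitions if p]
        return reduce(fn, partials)

    def collect(self):
        # Gather all elements back into one list, in partition order.
        return [x for p in self.partitions for x in p]


numbers = ToyRDD.parallelize(range(1, 7), num_partitions=3)
squares = numbers.map(lambda x: x * x)
total = squares.reduce(lambda a, b: a + b)
```

In this sketch, map returns a new collection instead of modifying the old one, and reduce first combines values inside each partition before merging the partial results. That two-stage shape is why the combining function should be associative when work is split across partitions.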