Apache Spark notes

An assorted collection of notes on Apache Spark: Spark architecture, programming concepts, and best practices.

Preface

  • Architecture

  • Anatomy of a Spark Application

    • Types of Spark Application
    • Programming Languages
    • Life cycle
      • Application/Jobs/Stages/Tasks
      • RDDs/DataFrames/Datasets
      • Shuffles/Caching/Persistence
      • Memory needs of the Application
    • Deep dive into Spark Application Configurations
  • Managing Spark Applications

    • Spark UI
    • Event log
    • Driver/Executor logs
    • Advanced Metrics
    • Best Practices
  • Spark and its Ecosystem

    • Hadoop
    • Kafka
    • Cassandra
