Monday, May 18, 2015

Guidelines for productive full stack data engineers

Rafal Wojdyla
Spotify
Published in: Engineering
Transcript
1. How to Be Productive Data Engineer. Rafal Wojdyla - rav@spotify.com. Note: My views are my own and don't necessarily represent those of Spotify.
2. • Operations • Development • Organization • Culture
3. What is Spotify? For everyone: • Streaming Service • Launched in October 2008 • 60 Million Monthly Users • 15 Million Paid Subscribers + and for me: • 1.3K-node Hadoop cluster
4. Automation
5. ME ADAM
6. Apache Ambari, Cloudera Manager
7. + Puppet
8. Not Invented Here
9. Never Invented Here
10. Wild Wild West
11. Apache Bigtop
12. Enable log aggregation
13. To enable log aggregation: yarn.log-aggregation-enable = true, yarn.log-aggregation.retain-seconds = ?
14. yarn.log-aggregation-enable = true, yarn.log-aggregation.retain-seconds = 315569260
15. Heap Memory used is 97%
16. Hellelephant
17. Custom logs • Profiling • Garbage collection
18. Right tool for the job
19. Right abstraction for the job
20. Scaling machines is easy, scaling people is hard
21. Automation • Map split size • Number of reducers • HDFS data retention • User feedback (ongoing)
22. Organization
23. Ownerless
24. Ownerless, Squad
25. Ownerless, Squad, Upgrades
26. Ownerless, Squad, Upgrades, Getting there
27. Culture
28. Experiment, Fail Fast, Embrace Failure
29. Spark. But we have tried! Non grata
30. Spark: spark.storage.memoryFraction (default 0.6), spark.shuffle.memoryFraction (default 0.2). In shuffle-heavy algorithms, reduce the cache fraction in favour of shuffle.
31. Spark: spark.executor.heartbeatInterval (default 10K), spark.core.connection.ack.wait.timeout (default 60). Increase in case of long GC pauses.
32. Learnings • Operations → Automation • Development → Abstraction • Organization → Team • Culture → Experiment
33. Join the band. Engineers wanted in NYC & Stockholm. http://spotify.com/jobs
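
The configuration values on slides 13, 14 and 30 can be sketched as plain key/value settings. The Python below is an illustration only, not the speaker's tooling: the helper names (`retain_seconds`, `yarn_site_properties`, `shift_cache_to_shuffle`) are hypothetical, and reading 315569260 as ten years of 31,556,926 seconds each is an assumption, since the slides do not say how the number was chosen. The property names and the 0.6 / 0.2 fractions are taken from the slides.

```python
import xml.etree.ElementTree as ET

# Assumption: 315569260 on slide 14 is ten years of 31,556,926 seconds each.
SECONDS_PER_YEAR = 31_556_926


def retain_seconds(years: int) -> int:
    """Retention period for aggregated logs, expressed in seconds."""
    return years * SECONDS_PER_YEAR


def yarn_site_properties(settings: dict) -> str:
    """Render settings as <property> entries in the Hadoop *-site.xml layout."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


def shift_cache_to_shuffle(storage: float, shuffle: float, delta: float) -> dict:
    """Move `delta` of the heap fraction from cache to shuffle (hypothetical helper)."""
    if not 0 <= delta <= storage:
        raise ValueError("delta must be between 0 and the storage fraction")
    return {
        "spark.storage.memoryFraction": round(storage - delta, 2),
        "spark.shuffle.memoryFraction": round(shuffle + delta, 2),
    }


# Slides 13-14: enable log aggregation and keep logs for roughly ten years.
log_settings = {
    "yarn.log-aggregation-enable": "true",
    "yarn.log-aggregation.retain-seconds": retain_seconds(10),
}
xml_fragment = yarn_site_properties(log_settings)

# Slide 30: start from the quoted 0.6 / 0.2 split and favour shuffle.
# The size of the shift (0.2) is an arbitrary example value.
spark_settings = shift_cache_to_shuffle(0.6, 0.2, 0.2)
```

How far to shift the fractions depends on the job; the sum of the two values is kept constant here so the rest of the heap is left untouched.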