File name: Big Data SMACK_ A Guide to Apache Spark, Mesos, Akka, Cassandra, and Kafka.pdf
File size: 11.09 MB
File format: PDF
Last updated: 2021-04-24 04:25:30
Big Data SMACK
Surveys from 2014, 2015, and 2016 show that, among all software developers, the highest paid are data engineers, data scientists, and data architects. The reason is a huge demand for technical professionals in data; unfortunately for large organizations, and fortunately for developers, the supply is very low.

Traditionally, large volumes of information have been handled by specialized scientists and people with PhDs from the most prestigious universities. This is due to the popular belief that not all of us have access to large volumes of corporate data or to large enterprise production environments.

Apache Spark is disrupting the data industry for two reasons. The first is that it is an open source project. In the last century, companies like IBM, Microsoft, SAP, and Oracle were the only ones capable of handling large volumes of data, and today there is so much competition among them that disseminating designs or platform algorithms is strictly forbidden. The benefits of open source therefore become stronger, because the contributions of so many people make free tools more powerful than proprietary ones.

The second reason is that you do not need a production environment with large volumes of data, or a large laboratory, to develop in Apache Spark. Apache Spark can be installed easily on a laptop, and the work developed there can be moved easily to enterprise environments with large volumes of data. Apache Spark also makes data development free and accessible to startups and small companies.

If you are reading this book, it is for one of two reasons: either you want to be among the best-paid IT professionals, or you already are and you want to learn how today's trends will become requirements in the not too distant future. In this book, we explain how to master the SMACK stack, also called Spark++, because it seems to be the open stack most likely to succeed in the near future.