Querying Hadoop data with standard MapReduce programming techniques is notoriously difficult. Pig and its scripting language, Pig Latin, provide a SQL-like platform that simplifies building queries against data sets in Hadoop, lowers the barrier that MapReduce programming presents, and opens large-scale data processing, including experimentation on data sets, to casual users. It also stands up well under stress: Yahoo uses Pig for over half the queries it runs on the world's largest Hadoop cluster.

Pig in Action introduces Pig and the Pig Latin language while teaching the fundamentals of big data processing. Readers will explore the intersection of business and data science as they work through practical tasks such as executing standard queries, establishing automated data management processes and policies, and developing useful reports. Most importantly, they'll learn techniques for extracting valuable insights from data while mastering the features of Pig.