Big Data Analytics Using Hadoop Presentation Transcript

Slide 1 - A Seminar On Big Data Analytics Using Hadoop. Presented by Sarita Bagul, TE Computer, Seat No. T120414208. Under the guidance of Asst. Prof. B.A. Khivsara.
Slide 3 - 1. Introduction BIG DATA
Slide 4 - “A massive volume of both structured and unstructured data that is so large that it's difficult to process with traditional database and software techniques”.
Slide 6 - Big data analytics is the process of collecting, organizing and analyzing large sets of data (called big data) to discover patterns and other useful information.
Slide 7 - 2. Literature Survey
Slide 9 -
2006 - Yahoo! created Hadoop, based on GFS and MapReduce (with Doug Cutting and team)
2007 - Yahoo! started using Hadoop on a 1000-node cluster
Jan 2008 - Hadoop became a top-level Apache project
Jul 2008 - A 4000-node cluster was tested successfully with Hadoop
2009 - Hadoop sorted a petabyte of data in less than 17 hours, supporting workloads such as handling billions of searches and indexing millions of web pages
Dec 2011 - Hadoop version 1.0 released
Aug 2013 - Version 2.0.6 available
Nov 2014 - Release 2.6.0 available
Dec 2015 - Release 2.6.3 available
Oct 2016 - Release 2.6.5 available
Slide 12 - Hadoop was introduced to overcome the disadvantages of RDBMS. Hadoop is an open-source, Java-based programming framework that supports the processing and storage of extremely large data sets in a distributed computing environment.
Slide 13 - Many technologies already exist for handling big data, and each has its own advantages and disadvantages. A few of them are listed below: column-oriented databases, NoSQL databases, MapReduce, Hive, Pig, WibiData, PLATFORA, Apache Zeppelin, Hadoop.
Slide 14 - Today many big-brand companies use Hadoop in their organizations to deal with big data, e.g. Facebook, Yahoo, Netflix, eBay, etc. The Hadoop architecture mainly consists of 4 components: MapReduce, HDFS (Hadoop Distributed File System), YARN (Yet Another Resource Negotiator), and Hadoop Common (common utilities).
Slide 16 - A MapReduce Example
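The slide's own worked example is not preserved in this transcript. As a stand-in, here is a minimal single-process sketch of the MapReduce pattern, assuming the classic word-count task. The function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are illustrative assumptions; they are not part of the Hadoop API, and a real Hadoop job would run the same three phases distributed across cluster nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, chosen only for illustration.
    sample = ["big data needs hadoop", "hadoop stores big data"]
    print(word_count(sample))
```

In this sketch each input line plays the role of an input split: the map step can run on every line independently, and only the shuffle step needs to bring values for the same key together before reducing.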
Slide 17 - A NoSQL (originally referring to "non-SQL" or "non-relational") database provides a mechanism for storage and retrieval of data that is modeled by means other than the tabular relations used in relational databases (RDBMS). A NoSQL database can serve as the backend database for Hadoop.
Slide 19 - Hadoop Technology In Monitoring Patient Vitals
Slide 22 - Hadoop is open-source software and a popular framework for handling big data and performing big data analytics.
Slide 25 - Thank You!!!