Hadoop Tutorials.CO.IN
Big Data - Hadoop - Hadoop Ecosystem - NoSQL - Spark

Featured Articles

Complex Event Processing (CEP) using Apache Flink - Oil & Gas Use case

by Tanmay Deshpande



CEP is a technique for analyzing streams of disparate events that occur at high frequency, and for doing so with low latency. In the Oil & Gas industry, such events could be sensor data coming from drilling equipment, or from an upstream assembly reporting temperature, pressure and similar readings.


Hadoop skills club an exclusive insight

by Vaishnavi Agrawal



Today we constantly feel the morbid shadow cast by the Hadoop skill gap, so much so that we now regard it as an entity in its own right. It is believed that, because of restrictive and demanding training standards, professionals are not motivated to extend their skill sets into trending technologies.


Real Time Alerting using ElasticSearch Watcher

by Tanmay Deshpande



This article explains how to generate real-time alerts and notifications on certain conditions using the Elasticsearch Watcher plugin. The step-by-step guide covers alert generation from start to finish, beginning with plugin installation.


Log Analytics using Elasticsearch, Logstash and Kibana

by Tanmay Deshpande



This series of articles explains how to install Elasticsearch, Logstash and Kibana on Windows. It then explains how to load data from Apache log files into Elasticsearch using the GeoCity database, so that IP addresses from the logs are automatically mapped to countries and cities of the world.


Introduction to Apache Sqoop

by Tanmay Deshpande



Using Hadoop for analytics requires loading data into Hadoop clusters and processing it in conjunction with data that resides on enterprise application servers and databases. Loading gigabytes or terabytes of data into HDFS from production databases, or accessing it from MapReduce applications, is a challenging task. While doing so, we have to consider data consistency, the overhead of running these jobs on production systems, and whether the process will be efficient in the end. Using batch scripts to load data is an inefficient approach.


Big Data Analytics - What is that ?

by Dipayan Dev



According to a recent IBM estimate, 2.5 quintillion bytes of data are created every day, so much that 90% of the data in the world today has been created in the last two years. It is a mind-boggling figure, and the irony is that we feel less informed in spite of having more information available today. This surprising growth in data volumes has strongly affected today's businesses. Online users create content such as blog posts, tweets, social networking interactions and photos, and servers continuously log messages about what those users are doing.


Hadoop Fundamentals

by Rakesh Porwal



Welcome to the unit on Hadoop Fundamentals. Before we examine Hadoop components and architecture, let's review some of the terms used in this discussion. A node is simply a computer. Nodes that contain data are typically non-enterprise, commodity hardware. Nodes are housed in racks. A rack is a collection of 30 or 40 nodes that are physically stored close together and are all connected to the same network switch. Network bandwidth between any two nodes in the same rack is greater than the bandwidth between two nodes on different racks. A Hadoop cluster is a collection of racks.




