Hadoop for Business Users Training Course in Montebello, California


Hadoop for Business Users Training in Montebello

Enroll in our Hadoop for Business Users class in Montebello, California, or hire us to teach it, by calling 303.377.6176. Like all HSG classes, Hadoop for Business Users may be offered either onsite or via instructor-led virtual training. Check our public training schedule to see whether the class is currently scheduled.

Provided there are enough attendees, Hadoop for Business Users may be taught at one of our local training facilities.


We offer private customized training for groups of 3 or more attendees.
Course Description
This course helps participants understand, from a business user's perspective, what the Hadoop platform is, and provides hands-on lab exercises for applying the concepts and for planning, running, and using the platform. Apache Hadoop is a widely used framework for processing Big Data. It provides rich and deep analytics capability, and it is making inroads into the traditional BI analytics world. This course introduces participants to the core components of the Hadoop Eco System and its analytics, as well as to planning, running, and administering a Hadoop Cluster. It emphasizes the use cases of Hadoop and Data Warehousing, and provides best practices and guidelines for combining the two.

Course Length: 2 Days
Course Tuition: $1090 (US)

Prerequisites
Participants should be able to navigate the Linux command-line interface and have basic knowledge of a Linux editor, such as vi or nano. Basic knowledge of Java and an understanding of ETL are also required.

Course Outline

Course Topics

• Hadoop Eco System Overview

• HDFS

• MapReduce

• Hive

• Pig

• Data Access, Integration

• Transformation, Aggregation

• Feature Generation

• Join Various Data Sources

• Filter, Search, Transpose

• Binning and Smoothing

Course Objectives

Upon completion of this course, participants will be able to:

• Describe what the Hadoop platform is and its purpose.

• Describe the core components of the Hadoop Eco System.

• Plan, run, and use a Hadoop Cluster.

I. Hadoop Eco System Overview

A. Eco System Review

B. High-level Architecture

II. Hadoop Distributed File System (HDFS)

A. Concepts

B. Overview

C. Labs

III. MapReduce

A. Concepts

B. Overview

C. Labs

IV. Hive

A. Concepts and Architecture

B. Data Types

C. Meta Data Management

D. Joins, Partitions, Indexes, Bucketing

E. Text Analysis with Hive

F. Labs

V. Pig

A. Pig versus Java Map Reduce

B. Pig Latin Language Introduction

C. Understanding Pig Job Flow

D. Basic Data Analysis with Pig

E. Complex Data Analysis with Pig

F. Advanced Concepts

a. User-Defined Functions

b. Best Practices

c. Complex transformations using Pig that Hive cannot handle gracefully

G. Labs

VI. Introduction

A. Data Access, Integration

a. Navigate in Hadoop

b. Access Data and Files in HDFS and Tables

B. Transformation, Aggregation

a. Consume large datasets/tables

b. Working with dates/timestamps and arrays

c. Use group by and summarize various attributes

d. Converting strings to date/time, numbers

e. Concatenating columns

f. Parsing semi-structured data

C. Feature Generation

a. Create new attributes, mathematical calculations, windowing functions

b. Use Character and string functions

D. Join Various Data Sources

a. Join multiple files/tables, in an optimized way

E. Filter, Search, Transpose

a. Limit the data using various predicate methods

b. Pivot the data in different ways (wide to long and vice versa)

c. Find missing values

F. Binning and Smoothing

a. Create buckets and groups for categorization
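
To make a few of the Section VI topics concrete, here is a minimal sketch of grouping and summarizing, binning, and pivoting between long and wide layouts. It uses plain Python on a tiny in-memory list rather than Hive or Pig, so it illustrates the ideas only and is not the course's lab material. The sample records, the bucket edges, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data in "long" layout: (customer, month, amount)
records = [
    ("alice", "2024-01", 120.0),
    ("alice", "2024-02", 80.0),
    ("bob", "2024-01", 15.0),
    ("bob", "2024-02", 310.0),
]


def total_by_customer(rows):
    """Group by customer and sum the amounts (a 'group by and summarize')."""
    totals = defaultdict(float)
    for customer, _month, amount in rows:
        totals[customer] += amount
    return dict(totals)


def bin_amount(amount, low=50.0, high=200.0):
    """Assign an amount to a labeled bucket; the edges are illustrative."""
    if amount < low:
        return "small"
    if amount < high:
        return "medium"
    return "large"


def long_to_wide(rows):
    """Pivot long rows into one {month: amount} mapping per customer."""
    wide = defaultdict(dict)
    for customer, month, amount in rows:
        wide[customer][month] = amount
    return dict(wide)


def wide_to_long(wide):
    """Reverse the pivot: one (customer, month, amount) row per cell."""
    return [
        (customer, month, amount)
        for customer, months in wide.items()
        for month, amount in months.items()
    ]


totals = total_by_customer(records)
buckets = [bin_amount(amount) for _c, _m, amount in records]
wide = long_to_wide(records)
```

On a real cluster the same operations would typically be expressed in HiveQL or Pig Latin and run over data in HDFS; the in-memory version above only shows the shape of each transformation.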

