Hadoop for Business Users Training in Harrisburg
 
To enroll in our Hadoop for Business Users class in Harrisburg, Pennsylvania, or to hire us to teach it, call 303.377.6176. Like all HSG
classes, Hadoop for Business Users may be offered either onsite or via instructor-led virtual training. Check our public training schedule to see whether it
is scheduled: Public Training Classes
                    
                
                        Provided there are enough attendees, Hadoop for Business Users may be taught at one of our local training facilities.  
                    
We offer private customized training for groups of 3 or more attendees.
Course Description
This course helps participants understand, from a business user's
perspective, what the Hadoop platform is, and provides hands-on lab
exercises to apply the concepts and to plan, run, and use the platform.
Apache Hadoop is one of the most popular frameworks for processing Big Data.
Hadoop provides rich and deep analytics capability, and it is making inroads
into the traditional BI analytics world. This course introduces
participants to the core components of the Hadoop ecosystem and its
analytics, as well as to planning, running, and administering a Hadoop
cluster. It emphasizes use cases for Hadoop and data warehousing,
and provides best practices and guidelines for combining the two.
Course Length: 2 Days
Course Tuition: $1090 (US)
Prerequisites
Participants should be able to navigate the Linux command-line interface and have a basic knowledge of Linux editors, such as vi or nano. Basic knowledge of Java and an understanding of ETL are also required.
Course Outline
	Course Topics 
	• Hadoop Ecosystem Overview 
	• HDFS 
	• MapReduce 
	• Hive 
	• Pig 
	• Data Access, Integration 
	• Transformation, Aggregation 
	• Feature Generation 
	• Join various Data Sources 
	• Filter, Search, Transpose 
	• Binning and Smoothing 
	Course Objectives 
	Upon completion of this course, participants will be able to: 
	• Describe what the Hadoop platform is and its purpose. 
	• Describe the core components of the Hadoop ecosystem. 
	• Plan, run, and use a Hadoop cluster. 
	I. Hadoop Ecosystem Overview 
	A. Ecosystem Review 
	B. High-level Architecture 
	II. Hadoop Distributed File System (HDFS) 
	A. Concepts 
	B. Overview 
	C. Labs 
	III. MapReduce 
	A. Concepts 
	B. Overview 
	C. Labs 
	IV. Hive 
	A. Concepts and Architecture 
	B. Data Types 
	C. Meta Data Management 
	D. Joins, Partitions, Indexes, Bucketing 
	E. Text Analysis with Hive 
	F. Labs 
	V. Pig 
	A. Pig versus Java Map Reduce 
	B. Pig Latin Language Introduction 
	C. Understanding Pig Job Flow 
	D. Basic Data Analysis with Pig 
	E. Complex Data Analysis with Pig 
	F. Advanced Concepts 
	a. User-Defined Functions 
	b. Best Practices 
	G. Complex transformations using Pig that Hive cannot do gracefully 
	H. Labs 
	VI. Introduction 
	A. Data Access, Integration 
	a. Navigate in Hadoop 
	b. Access Data and Files in HDFS and Tables 
	B. Transformation, Aggregation 
	a. Consume large datasets/tables 
	b. Work with dates/timestamps and arrays 
	c. Use group by and summarize various attributes 
	d. Convert strings to date/time and numbers 
	e. Concatenating columns 
	f. Parsing semi-structured data 
	C. Feature Generation 
	a. Create new attributes, mathematical calculations, windowing functions 
	b. Use Character and string functions 
	D. Join Various Data Sources 
	a. Join multiple files/tables in an optimized way 
	E. Filter, Search, Transpose 
	a. Limit the data using various predicate methods 
	b. Pivot the data in different ways: wide to long and vice versa 
	c. Find missing values 
	F. Binning and Smoothing 
	a. Create buckets and groups for categorization
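
As a rough illustration of the final outline topic, binning and smoothing, the sketch below groups numeric values into equal-width buckets and then replaces each value with the mean of its bucket. It is plain Python rather than Hive or Pig, and the function names, the sample `ages` list, and the choice of equal-width buckets are assumptions made for this example, not material taken from the course labs.

```python
from statistics import mean

def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width buckets (0-based index)."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1 when every value is identical.
    width = (hi - lo) / n_bins or 1
    # The maximum value would land one past the last bucket, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def smooth_by_bin_means(values, n_bins):
    """Replace each value with the mean of the bucket it falls in."""
    bins = equal_width_bins(values, n_bins)
    bin_means = {
        b: mean(v for v, vb in zip(values, bins) if vb == b)
        for b in set(bins)
    }
    return [bin_means[b] for b in bins]

# Hypothetical sample data for illustration only.
ages = [21, 23, 29, 35, 41, 44, 58, 60]
print(equal_width_bins(ages, 3))
print(smooth_by_bin_means(ages, 3))
```

The bucket index can serve as a category label, while the smoothed values reduce noise within each bucket. In Hive or Pig the same idea would typically be expressed with a computed bucket column followed by a group-by.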