IBM Open Source & Big Data Analytics Courses

* Courses listed below can also be customized based on user demand.

Don't see a course that you are looking for? Click Here to Request It.

DW601G – IBM BigInsights Overview

This course provides an overview of IBM’s big data strategy and reviews the importance of understanding and using big data. It covers IBM BigInsights as a platform for managing and gaining insights from your big data, as well as how BigInsights offerings have been aligned to better suit user needs with IBM Open Platform (IOP). Students will also be introduced to the three specialized value-add modules that sit on top of the IOP: Big SQL, BigSheets, and Big R. Participants will engage with the product through interactive exercises.

  • In-class
  • Instructor Led Online
  • Self-Paced Virtual Classroom
Register for self-paced course

DW606G – IBM Open Platform with Apache Hadoop

IBM Open Platform (IOP) with Apache Hadoop is the first premier collaborative platform to enable Big Data solutions to be developed on a common set of Apache Hadoop technologies. The Open Data Platform initiative (ODP) is a shared industry effort focused on promoting and advancing the state of Apache Hadoop and Big Data technologies for the enterprise. The current ecosystem is challenged and slowed by fragmented and duplicated efforts between different groups. The ODP Core aims to take the guesswork out of the process and accelerate many use cases by running on a common platform, allowing enterprises to focus on building business-driven applications.

This module provides an in-depth introduction to the main components of the ODP core, namely Apache Hadoop (inclusive of HDFS, YARN, and MapReduce) and Apache Ambari. It also covers the main open-source components that are generally made available with the ODP core in a production Hadoop cluster.

  • In-class
  • Instructor Led Online
  • Self-Paced Virtual Classroom
Register for self-paced course

DW613G – IBM BigInsights Foundation

This course provides a foundation of IBM BigInsights through two separate modules: IBM BigInsights Overview and IBM Open Platform with Apache Hadoop.

In the first module, students will cover IBM BigInsights as a platform for managing and gaining insights from your big data, as well as value-add tools including Big SQL, BigSheets, and Big R.

In the second module, students will gain an in-depth introduction to the main components of the ODP core – namely Apache Hadoop (inclusive of HDFS, YARN, and MapReduce) and Apache Ambari.

  • In-class
  • Instructor Led Online
  • Self-Paced Virtual Classroom
Register for self-paced course

DW633G – IBM BigInsights Big SQL

This course introduces students to the capabilities of Big SQL, a part of IBM BigInsights that allows you to access your HDFS data by providing a logical view of it. You can use the same SQL that was developed for your data warehouse data on your HDFS data. This course provides some context on why students would use Big SQL, followed by how to use Big SQL to access data. It also covers Big SQL federation, which allows students to join various data sources with Big SQL. Big SQL also integrates with a number of other components, including Spark, HBase, and BigSheets.

  • In-class
  • Instructor Led Online
  • Self-Paced Virtual Classroom
Register for self-paced course

DW644G – IBM BigInsights BigSheets

This course introduces students to the capabilities of BigSheets. BigSheets is a component of IBM BigInsights, available through the Analyst and Data Scientist modules. It gives analysts the ability to visualize and analyze data stored in HDFS using a spreadsheet-style interface, without any programming.

  • In-class
  • Instructor Led Online
  • Self-Paced Virtual Classroom
Register for self-paced course

DW653G – BigInsights Analytics For Programmers

This course is designed to aid programmers who are working with IBM’s InfoSphere BigInsights. Students will learn how to create annotators through the use of IBM’s Annotation Query Language (AQL). Analyzing data using Apache Hadoop normally requires that MapReduce programs be written. Students will learn how to use Jaql to create high-level programs that are decomposed into Hadoop MapReduce programs. Students will also learn the fundamentals of programming in the Apache Pig language, and how to publish a text analytics application from the BigInsights development environment to a BigInsights server.

  • In-class
  • Instructor Led Online
  • Self-Paced Virtual Classroom
Register for self-paced course

DW654G – IBM BigInsights Text Analytics (V4)

This course will teach students how to use IBM BigInsights Text Analytics, an information extraction system, to extract information from unstructured and semi-structured documents. Using IBM BigInsights Text Analytics, students can create extractors in a visual web interface. The visual extractors are then automatically translated into Annotation Query Language (AQL) rules that extract structured information from unstructured and semi-structured documents. Students can apply Text Analytics to big data at rest in IBM BigInsights and big data in motion in IBM Streams.

  • In-class
  • Instructor Led Online
  • Self-Paced Virtual Classroom
Register for self-paced course

DW724G – Programming for IBM InfoSphere Streams V4 with SPL

This course teaches students about the Streams Processing Language. It will begin with the basic concepts of InfoSphere Streams and the basic Streams Processing Language operators used in a Streams program. Students will learn how to access data from an external source using the Source type operators and write an output stream using the Sink type operators.

Students will then learn how and when to use the various Stream operators, such as the Functor, Punctor, Aggregation, Sort, Join, Split, Barrier, Delay, and Switch operators. The second half of the course shows how to control the placement of processing elements and covers the debugging capabilities of the Streams Processing Language. Students will also learn about consistent regions and how to use them for at-least-once tuple processing.

  • In-class
  • Instructor Led Online
  • Self-Paced Virtual Classroom
Register for self-paced course

DW732G – Administration of IBM Streams

This course enables students to acquire the skills necessary to administer an IBM Streams system. This course covers creating Streams domains and instances, using ZooKeeper in a high availability environment, viewing the state of Streams domain and instance services, stopping and starting processing elements, viewing the jobs and processing elements that are running, and a variety of other topics. In addition, it covers defining resource tags, adding a resource to a Streams domain and instance, setting the access control list for security objects to give permission to users to work with those objects, and submitting and cancelling Streams jobs.

  • In-class
  • Instructor Led Online
  • Self-Paced Virtual Classroom
Register for self-paced course