This course provides an overview of IBM’s big data strategy and reviews the importance of understanding and using big data. It will cover IBM BigInsights as a platform for managing and gaining insights from your big data, as well as how BigInsights offerings have been aligned to better suit user needs with IBM Open Platform (IOP). Students will also be introduced to the three specialized value-add modules that sit on top of the IOP: Big SQL, BigSheets, and Big R. The participant will be engaged with the product through interactive exercises.
IBM Open Source & Big Data Analytics Courses
DW606G – IBM Open Platform with Apache Hadoop
IBM Open Platform (IOP) with Apache Hadoop is the first premier collaborative platform to enable Big Data solutions to be developed on the common set of Apache Hadoop technologies. The Open Data Platform initiative (ODP) is a shared industry effort focused on promoting and advancing the state of Apache Hadoop and Big Data technologies for the enterprise. The current ecosystem is challenged and slowed by fragmented and duplicated efforts between different groups. The ODP Core will take the guesswork out of the process and accelerate many use cases by running on a common platform. It allows enterprises to focus on building business-driven applications.
This module provides an in-depth introduction to the main components of the ODP core, namely Apache Hadoop (inclusive of HDFS, YARN, and MapReduce) and Apache Ambari, and also covers the main open-source components that are generally made available with the ODP core in a production Hadoop cluster.
DW613G – IBM BigInsights Foundation
This course provides a foundation of IBM BigInsights through two separate modules: IBM BigInsights Overview and IBM Open Platform with Apache Hadoop.
In the first module, students will cover IBM BigInsights as a platform for managing and gaining insights from your big data, as well as value-add tools including Big SQL, BigSheets, and Big R.
In the second module, students will gain an in-depth introduction to the main components of the ODP core – namely Apache Hadoop (inclusive of HDFS, YARN, and MapReduce) and Apache Ambari.
DW633G – IBM BigInsights Big SQL
This course introduces students to the capabilities of Big SQL, a part of IBM BigInsights that allows you to access your HDFS data by providing a logical view to it. You can use the same SQL that was developed for your data warehouse data on your HDFS data. This course will provide some context on why students would use Big SQL, followed by how to use Big SQL to access data. It will also cover Big SQL federation, which allows students to join various data sources with Big SQL. Big SQL also integrates with a number of other components, including Spark, HBase, and BigSheets.
DW634G – IBM Big SQL for Developers (v5.0)
This course is designed to introduce the student to the capabilities of IBM Big SQL. IBM Big SQL allows you to access your HDFS data by providing a logical view to it. You can use the same SQL that was developed for your data warehouse data on your HDFS data.
This course provides context on why students would use Big SQL, followed by how to use Big SQL to access data. It covers what Big SQL is, how it is used, and the Big SQL architecture, then shows how to connect to Big SQL, create tables with a variety of data types, load data, and run queries against that data. Finally, it shows how to use Big SQL with other components of the Hadoop ecosystem.
DW644G – IBM BigInsights BigSheets
This course introduces students to the capabilities of BigSheets. BigSheets is a component of IBM BigInsights available through the Analyst and Data Scientist modules. It gives analysts the ability to visualize and analyze data stored in HDFS using a spreadsheet-style interface, without any programming.
DW653G – BigInsights Analytics For Programmers
This course is designed to aid programmers who are working with IBM’s InfoSphere BigInsights. Students will learn how to create annotators through the use of IBM’s Annotation Query Language (AQL). Analyzing data using Apache Hadoop normally requires that MapReduce programs be written. Students will learn how to use Jaql to create high-level programs that are decomposed into Hadoop MapReduce programs. Students will also learn the fundamentals of programming in the Apache Pig language, and how to publish a text analytics application from the BigInsights development environment to a BigInsights server.
DW654G – IBM BigInsights Text Analytics (V4)
This course will teach students how to use IBM BigInsights Text Analytics, an information extraction system, to extract information from unstructured and semi-structured documents. Using IBM BigInsights Text Analytics, students can create extractors using a visual web interface. The visual extractors are then automatically translated into Annotation Query Language (AQL) rules to extract structured information from unstructured and semi-structured documents. Students can apply Text Analytics to big data at rest in IBM BigInsights and big data in motion in IBM Streams.
DW664G – IBM Big SQL for Administrators (v5.0)
This course is designed to introduce the student to some of the additional capabilities and the administration of IBM Big SQL. IBM Big SQL allows you to access your HDFS data by providing a logical view to it. You can use the same SQL that was developed for your data warehouse data on your HDFS data. This course covers Big SQL security using row and column access controls, impersonation, and data federation. The course also covers best practices, performance tuning, monitoring techniques, and YARN integration, and includes an optional unit that explores a Big SQL installation.
DW724G – Programming for IBM InfoSphere Streams V4 with SPL
This course teaches students about the Streams Processing Language. It will begin with the basic concepts of InfoSphere Streams and the basic Streams Processing Language operators used in a Streams program. Students will learn how to access data from an external source using the Source type operators and write an output stream using the Sink type operators.
Students will then learn how and when to use the various Streams operators, such as the Functor, Punctor, Aggregate, Sort, Join, Split, Barrier, Delay, and Switch operators. The second half of the course shows how to control the placement of processing elements and covers the debugging capabilities of the Streams Processing Language. Students will also learn about consistent regions and how to use them to process tuples with at-least-once semantics.
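To make the at-least-once idea concrete, here is a minimal sketch in plain Python. It is not SPL and does not use any Streams API; the function name `process_at_least_once` and its parameters are hypothetical, chosen only for illustration. It assumes a simple model in which progress is checkpointed periodically and, after a failure, processing resumes from the last checkpoint, so every tuple is handled at least once and some may be handled twice.

```python
from typing import Callable, Iterable, List


def process_at_least_once(
    tuples: List[int],
    handle: Callable[[int], None],
    checkpoint_every: int,
    fail_at: Iterable[int] = (),
) -> None:
    """Handle every tuple, replaying from the last checkpoint after a failure.

    Illustrative model only: `fail_at` lists tuple positions at which a single
    simulated failure occurs right after the tuple has been handled.
    """
    pending_failures = set(fail_at)
    checkpoint = 0  # index of the first tuple not yet covered by a checkpoint
    position = 0
    while position < len(tuples):
        handle(tuples[position])
        if position in pending_failures:
            # Simulated failure: roll back and replay everything since the checkpoint.
            pending_failures.discard(position)
            position = checkpoint
            continue
        position += 1
        if position % checkpoint_every == 0:
            checkpoint = position  # progress up to here is now considered durable


seen: List[int] = []
process_at_least_once([10, 20, 30, 40, 50], seen.append, checkpoint_every=2, fail_at=[3])
print(seen)  # [10, 20, 30, 40, 30, 40, 50]
```

In this run the failure at position 3 forces a replay from the checkpoint at position 2, so the tuples 30 and 40 are handled twice. No tuple is lost, but downstream logic has to tolerate duplicates, which is the trade-off that at-least-once processing implies.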