Program for Tuesday, October 18th

PROGRAM FOR TUESDAY, OCTOBER 18TH

Days:

previous day

next day

all days

View: session overview talk overview

00:00-00:59 Session 17

Location: Virtual Room A

00:00

Ichiroh Fujita

Data Transfer in Multi-Cloud Environments

ABSTRACT. Data Transfer in Multi-Cloud Environments Keynote

01:00-02:00 Session 18A

Location: Virtual Room A

01:00

Mudit Verma and Sandeep Hans

AI driven optimisations for Chaos Testing

PRESENTER: Mudit Verma

ABSTRACT. The problem: Chaos Testing is a popular method to gauge the resiliency of a system under adverse conditions by injecting faults into its different components. However, existing Chaos testing practices often utilize faults that are injected randomly or selected intuitively by the tester or SRE. Furthermore, the overall chaos-test space is so huge that it is practically not possible to cover all scenarios in a time-bound, cost-effective manner. Many of these faults may not even be valid or suitable for a given system under test.

Purpose: This session will discuss how AI for Chaos Testing can provide a guided approach to chaos engineering where faults are selected or omitted intelligently, in turn leading to reduced, yet effective test cases. Specifically, the session shall cover various novel chaos-testing optimization techniques such as historical incidents and outage analysis, application behavior analysis, and reinforcement learning-based fault injection.

Expected learning outcomes: - What Chaos testing is, the current landscape and processes involved in the microservices era - Why is it important to navigate the huge test space with intelligence - How AI-driven offline analysis such as historical incidents and past outage studies can help reduce the number of faults a tester/SRE has to work with - How studying the characteristics of different components in an application and infrastructure topology, such as finding critical services, Network, Compute, and Memory utilization can help prioritize realistic faults and scenarios - How reinforcement learning-based fault injection, driven by a closed-loop feedback mechanism with rewards and penalties can help generate effective test cases and optimize the selected faults. - How these techniques can be integrated and utilized into existing chaos practices and pipelines

Presentation delivery shall also include a couple of real-life demonstrations and our implementation of these techniques with a microservices-based application and Litmus Chaos Engineering tool.

01:00-02:00 Session 18B

Location: Virtual Room B

01:00

Adinarayana Haridas

Chaos Engineering Implementation on Openshift Container Platform

ABSTRACT. Chaos Engineering is the discipline of experimenting on a system to build confidence in the system’s capability to withstand turbulent conditions in production. When you start breaking things on purpose or introducing chaos into the system, we start with basic steps and then proceed by introducing all the layers in the system. The goal of chaos engineering isn't just to install and run a tool to stop the system or run any random experiment. It’s all about the unexpected resiliency occurring on the system. With Orchestration platforms getting more mature and evolving with clients moving towards Modernization, to improve platform efficiency, better ease of development, responding to security risks, and managing the platform, Consultants need to understand the reliability of the system and requires lot of effort to make sure that the platform is available as Always On kind of model. Red Hat Openshift Container Platform has gained popularity with huge number of benefits in terms of accelerating development, deployments, collaboration and speeding up the migrations.

With the help of this lecture, I will be presenting a Chaos Engineering Implementation on Openshift Container Platform with the help of Chaoskube tool, and this will be kind of 101 primer on how it can help users to understand the reliability and resiliency of the platform.

02:00-03:00 Session 19A

Location: Virtual Room A

02:00

Vincent Chung, Louis Huang, David Hsu, Huai Ying Xia, Yi Chen Huang and Ting Ting Zhan

Cognitive translation quality evaluation for continuous delivery in devOPs

PRESENTER: Vincent Chung

ABSTRACT. The problem Translation on UI has high impact on user experience. The bad translation on UI context leads users to misunderstand the feature or deliver incorrect information. How to evaluate translation quality on UI context is critical to software quality.

Purpose The story captures the journey on we leverage “Intelligent Localization Test System” developed by us to evaluate the translation quality on localized UI. Software localization test is the essential process before releasing to the product. The translation is generally verified & improved by human translators on UI directly to make sure it conveys the correct message to users based on UI context. It costs a lot to perform software localization test. The talk will cover how we automate software localization test from translation quality perspective. We’ll demonstrate the system architecture, how to leverage Natural Language Processing service to evaluate translation quality on multiple languages & how to train custom model for NLP service to evaluate translation context quality. The solution aims to skip human verification process on translation quality at devops process so that we could achieve continuous delivery & integration without human intervention.

Methods We adopt the automation testing tool to inspect the translation & language information on localized UI and get the context information (ex: UI element type) in the meantime. It’s our client-end tool in the system. In the back-end, we leverage NLP service named sBert to evaluate translation quality for multi-language. We pick-up the pre-trained model of the service from multiple models with the proper assessment in the begging. To enable string context evaluation ability, we’ll retrain the model with context information of UI, business type of the product & translation update history through human verification to enhance context evaluation ability. The performance of model will be assessed if it’s positive correlation with translation update history to prove the model with translator’s insights.

02:00-03:00 Session 19B

Location: Virtual Room B

02:00

Walter Dietrich and Michael Shute

How did IBM’s Critical Treasury Application Break the Latency Barrier using Spectrum Scale?

PRESENTER: Walter Dietrich

ABSTRACT. While moving a mission critical application from legacy data centers into IBM cloud data centers, our team was facing a dilemma. How could we provide the performance our users demanded while also meeting IBM’s disaster recovery requirements? In the legacy data centers, our team used synchronous mirroring for disaster recovery. Synchronous mirroring had been working well for ten years, but when we tried to use it in the cloud data centers, it made some batch jobs intolerably slow. The team found that certain kinds of jobs ran slower because the cloud data centers were 4 times further from each other than the legacy data centers. (The legacy data centers were only 300 miles (480 km) apart.) How were we going to mirror millions of flat files and terabytes of databases using data centers that were more than 1200 miles (1900 km) apart? We could have cobbled together a mixture of point solutions, but we wanted something robust that had built-in automation, security, and monitoring.

In this presentation, we will describe the IBM Spectrum Scale software that was key to solving our dilemma and explain why this software was the right fit for the application given the performance requirements, the HA and DR requirements and the architecture of the application. We will demo some of the features that were key to the successful conclusion of our project, including Active File Management Disaster Recovery. We will describe the team that came together to solve the challenges we encountered. We will conclude with lessons learned and future plans.

03:00-04:00 Session 20: Poster Session

Location: Virtual Room A

04:00-04:59 Session 21: 8890 Ichiroh Fujita. Data Transfer in Multi-Cloud Environments

Location: Virtual Room A

05:00-06:00 Session 22A

Location: Virtual Room A

05:00

Dharma Teja Atluri

Data Fabric/Mesh - Journey towards Federated data architecture

ABSTRACT. The presentation/session provides insight into what are the evolving patterns when it comes to different data warehouse based solutions and what is known as Data Fabric/Mesh.

The session will explore into the functional and non-functional requirements along with other architectural decision points to be taken into consideration for arriving at a well architected data fabric platform. The session will then also explore each of the patterns to expand with industry use cases and journey points for clients into different hyperscalers such as Azure, AWS and Google Cloud explaining what to expect from the solution.

It then provides insights into the ideal deployment scenarios when clients want to adopt shared data services, DataOps, AI/ML adoption along with data quality standards into the data fabric/mesh deployment.

Audience will be able to walk away understanding the patterns, the reference architecture along with the lessons learned from other client engagements.

Note to reviewers: this lecture is from an IBM Academy of Technology Initiative

05:00-06:00 Session 22B

Location: Virtual Room B

05:00

Thomas Ullrich

Setup a highly available and secure hybrid Cloud infrastructure vs. sustainable and cost efficiency

ABSTRACT. Building critical infrastructures that are subject to a large number of requirements is often very difficult, especially if they are "first of its kind" projects that have never been implemented before and need to be built in a time-critical manner. It is challenging to develop in a cost-efficient and sustainable way. This presentation provides tools, methods and thought to help keep costs under control, avoid common mistakes and at the same time keep sustainability in mind. This in turn must have no impact on the security or stability of the solution. This presentation uses tools from the following areas: - Application Performance Monitoring (APM) - Financial Operation (FinOps), - Application Resource Management (ARM) - Compliance Tools - Carbon footprint tracking - Automation tools and their pitfalls

06:00-07:00 Session 23A

Location: Virtual Room A

06:00

Jaspreet Singh

The 7 Habits of Highly Effective SRE

ABSTRACT. The Financial Services Cloud (FS Cloud) is an Industry Architecture provided by IBM to overcome challenges of moving applications into cloud for regulated finance and industries. As a senior SRE that works with the IBM Hyper Protect platform of products and services that power the FS Cloud, I witness firsthand, the demands of a truly resilient and secure services in a Hybrid World, that is always-on 24/7 365 days a year. Come join me as I share with you my top key attributes that help me and my team succeed everyday in supporting one the most complex and deep stack of Hybrid Cloud services. I will also leave you with practical tips and tricks that will keep you ahead of your game as you tackle your next challenge to keep services highly resilient and reliable.

06:00-07:00 Session 23B

Location: Virtual Room B

06:00

Robert Barron

4000 years of Observability - The Universe is the Largest Production Environment in Existence

ABSTRACT. While "Observability" is considered one of the newest and shiniest items in the SRE toolbox, humans have been observing the skies for thousands of years and, ever since the development of astronomical observability tools such as telescopes and spectroscopes, have been adopting the concept of "understanding the universe by observing the stars" or, as an SRE would say "the practice of assessing a system's internal state by observing its external outputs." While standing far away from the target of their observations, astronomers can understand their inner workings and predict precisely how they will change in the near and far future. This is a level of observability SREs dream of having!

Of course, there are differences between astronomers and SREs - for example, SREs initiative and plan Chaos experiments to test their environments while astronomers merely aim their telescope in a new direction and see an experiment in action - but in the end, they are both using scientific and engineering tools to understand the innermost workings of the environments they are responsible for.

Astronomers have simply been doing it for orders of magnitude longer than SREs!

07:00-08:00 Session 24: Panel:Cloud Computing Performance and Resiliency

Location: Virtual Room A

David Jonas, Boris Jelic, Craig Fulton, David Goad and Suresh Eswaran

Panel: Performance and Resiliency with cloud computing

PRESENTER: David Jonas

ABSTRACT. Hypervisors are advertising that moving your solution to the cloud makes it easier to scale and be more resilient and each offer frameworks and approaches to achieve this goal, during this panel we explore concepts and views to consider in this evolving landscape.

08:00-09:00 Session 25

Location: Virtual Room A

08:00

Vita Bortnikov

How Kubernetes can be used to build Robust and Scalable Orchestration systems

ABSTRACT. How Kubernetes can be used to build Robust and Scalable Orchestration systems Keynote

09:00-10:00 Session 26A

Location: Virtual Room A

09:00

Stefaan Van Daele

The role of Enterprise Security Architecture in secure application development

ABSTRACT. Enterprise Security Architecture (ESA) is a mostly unknown aspect for most application developers. The primary security focus of developers lies on passing application security tests, which is just one part of the picture.

This session provides a brief introduction to ESA as a governance model for the CISO office. We explore how design principles, like zero trust, could be applied to solution design and what governance could mean in practice in a DevOps context. Since an ESA looks after all aspects of IT Security, it provides more contextual guidance for application developers.

09:00-10:00 Session 26B

Location: Virtual Room B

09:00

Gregor Resing, Michel Nouguier, Peter Schel and Frank Krämer

DevOps for the software-defined-vehicle

PRESENTER: Gregor Resing

ABSTRACT. In this session we introduce the Software-Defined-Vehicle (SDV) hardware and software concepts and the related architectural changes.

With the SDV, automotive OEMs will become software companies. The development and operational processes will shift to more agile processes based on DevOps principles. We describe how DevOps principles can be applied to the whole ecosystem of vehicles and cloud-based services to develop and operate the future vehicles. DevOps for automotive includes security, backend vehicle services and over-the-air updates, a continuous integration and deployment process, the usage of containerized, hybrid integration testing and software- and configuration management. We show reference architectures for containerized virtual and hybrid testing of automotive software.

The transition to the SDV requires fundamental changes in the OEM organization from siloed domain-oriented departments to more cross functional and agile organizations. These changes and the changes in the collaboration of OEMs and suppliers are shortly summarized.

Note to reviewers: this lecture is from an IBM Academy of Technology Initiative

10:00-11:00 Session 27A

Location: Virtual Room A

10:00

Mark McGloin

Adapting to a new regional cloud landscape

ABSTRACT. As EU landscape on “sovereign cloud” regulations hots up, IBM Cloud is in a battle to ensure we can provide the necessary capabilities for our major financial clients or else risk losing their business. The regulations impact cybersecurity resiliency and data sovereignty requirements, both very much interlocked and require a rethink on technical and operational functions to meet these while ensuring IBM can still provide competitive and scalable cloud offerings. In this talk I will highlight the impending changes, the potential impact on design of our services and look at current capabilities and gaps. I will provide example of how our engagement with one major client is attempting to deal with these changes, an approach that can be generalized across other client engagements. While capabilities to allow client content in region, or even on client premises via our Satellite offerings, exist in IBM Cloud, the impact extends to all client data and the related operations/support. In turn this means having the necessary disaster recovery within region or even in-country and ensuring resiliency to any external influence or attack. Data privacy and protection from non-EU law is a key area and there are some focus points that can assist in discussing these worries with clients. While EU regulations will be the focus for this talk, the drivers for these regulations are being repeated in other regions/countries and that will be touched upon briefly. As a core member of IBM Cloud Sovereign Cloud squad under Hillery Hunter and as the Cloud security focal for BNP Paribas, I will provide client perceptions coupled with strategies to react to these.

10:00-11:00 Session 27B

Location: Virtual Room B

10:00

Samir Nasser

Customer AIOps Adoption Roadmap

ABSTRACT. Customers are at different levels of interest in AIOps: on one side of the spectrum, customers may not be quite interested in AIOps. On the other side of the spectrum, customers find it imperative to adopt AIOps as this may be the only way to keep up with the complexity of their production environment. Other customers have various interest levels within this spectrum. This talk presents a methodical approach to assess the customer IT Ops environment so that a high-level AIOps adoption roadmap, that is unique to each customer, can be drafted. This assessment looks into the maturity of IT Ops processes, tools, data, and culture. Once the assessment is done, the high-level AIOps adoption roadmap is drafted to improve the relevant processes, tools, data, and culture. Consequently, the first MVP conversation is initiated to start the improvement effort. So, transitioning IT Ops teams from the traditional world to the AIOps world does not have to be a painful transformation.

11:00-12:00 Session 28

Location: Virtual Room A

12:00-13:00 Session 29: 8648 Vita Bortnikov. How Kubernetes can be used to build Robust and Scalable Orchestration systems

Location: Virtual Room A

13:00-14:00 Session 30A

Location: Virtual Room A

13:00

Ana Biazetti and Boas Betzler

Addressing Concentration Risk for Improved Resilience of Hybrid Cloud

PRESENTER: Ana Biazetti

ABSTRACT. In the current Hybrid Cloud environment, where clients use a mix of on premises data centers as well as public cloud providers to fulfill their needs for computing resources for their solutions, there is an increased focus on cybersecurity risk management, which includes cloud concentration risk and its effect on the resilience of solutions. Traditionally, concentration risk was considered in terms of vendor assessment and supply chain risk, but in the new Hybrid Cloud solutions, the context of assessing risk needs to evolve to include CSP (Cloud Service Providers) lock-in, workload placement strategy, and data portability. This presentation proposes an operational resilience model across the digital supply chain through optimal workload placement and automation of regulatory compliance across multiple clouds and estates which addresses concentration risk and increases resilience. We can do so by ensuring service availability and business continuity, balancing requirements with cost, expecting failure and designing for it, understanding shared responsibility, and hardening integrated service management, driving resilience through automation and continuous testing, designing for resilient and secure data, and systematically balancing concentration risk.

Learning Objectives and Expected Outcomes for attendees: - Understand concentration risk and its effect in resilience in hybrid cloud environments. - Learn a taxonomy and associated model that harmonizes the approach to managing concentration risk. - Apply leading practices for concentration risk management to improve resilience

Session Type: Innovative Point of View Delivery Method: Lecture

Biographies:

Ana Biazetti is a Distinguished Engineer at Financial Services Cloud organization, where, as Chief Architect for Payments Solutions, she is responsible for driving technology innovation in payments and for developing the Financial Services solution architectures that bring the best of IBM and our ISV payments ecosystem to win in the market. Ana is passionate about leading teams in developing cutting edge technology solutions that address real world problems. With extensive experience in the complete dev/sec/ops lifecycle, including High Availability, Disaster Recovery and Reliability of solutions, she brings technical and business knowledge in leading global teams to create strategic, innovative architectures that support clients’ digital transformation. Ana is an IBM Master Inventor and member of the IBM Academy of Technology Leadership Team.

Boas Betzler Boas is Distinguished Engineer for Solution Architecture at IBM Cloud. Clients trust him because he has built and operated Cloud solutions since 2009. Innovators credit him with breaking all the rules when he ported Linux to the mainframe as a kid.

13:00-14:00 Session 30B

Location: Virtual Room B

13:00

Srinivas Tummalapenta, Arkadii Kosoburov and John Hendley

Prepare to Prevent business disruption

PRESENTER: John Hendley

ABSTRACT. In the age of ransomware, a client's business is likely to be disrupted causing catastrophic consequences including going out of business. The session will discuss an approach to take a protection first approach to prevent business disruption. During this session 3 form approaches, Exposure Management, Improving the effectiveness of security technologies, and Faster detection and response will be discussed.

The audience can learn about the modular ways to approach security implementation and protect clients' business.

14:00-15:00 Session 31A

Location: Virtual Room A

14:00

Iain Furneaux

Customers don't care about performance

ABSTRACT. We will explore what our customers and some consultants attitudes towards the performance of systems being delivered. The testing a proving of performance is almost always left to the last minute on purpose, do customers not care about it?

We will learn why this apparent lack of care is being demonstrated and what effect this has on consultants and on the customer

We will also learn some of the things that both the consultants and the customer can do about this to ensure better outcomes when they launch a product.

To conclude we will answer the statement "Customers don't care about performance"

14:00-15:00 Session 31B

Location: Virtual Room B

14:00

Rakesh Jain, Ian Watts and Jack Hu

Go Multi Cloud with Canopy - Cross Cloud Distributed Databases on Kubernetes based Platforms

PRESENTER: Rakesh Jain

ABSTRACT. As enterprises are moving their applications from on-premise to the cloud, deploying the critical applications to only one cloud creates cloud concentration risk. There are regulations like Digital Operations Resilience Act (DORA) of European Union which suggest that the financial enterprises should not rely on one cloud provider only for their critical business applications. There are similar upcoming regulations from other countries. In this talk, we present Canopy, a novel solution from IBM Research and IBM Consulting, which allows the enterprises to set up their NoSQL or SQL distributed databases across two or more clouds on Kubernetes based platforms, such that they can be used for active-active or active-passive setup and allow failover from one cloud to other. The key takeaways from this talk are - what are the technical challenges when dealing with multi cloud setup, and, when designing your multi cloud and cloud native strategy you have new technologies available which will help address your business continuity needs and also meeting the regulatory requirements. We will also demonstrate the real world scenario by setting up a distributed NoSQL database across two clouds, in two Kubernetes clusters, and show: • Read & write data cross cloud, • Simulate cloud-1 outage & read database data in cloud-2, • Write data during an outage in cloud-2 and be able to read it in cloud-1 when it is back up, • All data processed within milliseconds. • Monitoring of data replication and connectivity between clouds.

This technology has been applied to a leading financial institution in the United Kingdom. The delivery method for this session is: Lecture.

15:00-16:00 Session 32: Panel:Cloud Computing Performance and Resiliency

Location: Virtual Room A

16:00-16:59 Session 33

Location: Virtual Room A

16:00

Andrew Hately

Survivability for Always-On Cloud Architecture

ABSTRACT. IBM Cloud has focused on regulated workloads which led to a new set of principles for exceeding client security, confidentiality, availability and integrity requirements. The session will describe a culture shift in Cloud architecture as we moved from building services focused on systems of engagement and edge applications to bringing systems of record to this same digital platform through resilient systems engineering and a focus on stringent availability and cybersecurity principles. Balancing the cost of defense-in-depth implementation and the high development velocity in a distributed and segmented micro-services architecture has changed the way we build, test and operate.

17:00-18:00 Session 34A

Location: Virtual Room A

17:00

Yuri Gankin, Michael Bouzinier and Shamil Sunyaev

AnFiSA: An open-source computational platform for the analysis of sequencing data on IBM Cloud

PRESENTER: Yuri Gankin

ABSTRACT. Despite genomic sequencing rapidly transforming from being a bench-side tool to a routine procedure in a hospital, there is a noticeable lack of genomic analysis software that supports both clinical and research workflows as well as crowdsourcing. Furthermore, most existing software packages are not forward-compatible in regards to supporting ever-changing diagnostic rules adopted by the genetics community. Regular updates of genomics databases pose challenges for reproducible and traceable automated genetic diagnostics tools. Lastly, most of the software tools score low on explainability amongst clinicians. Researchers from Harvard Medical School, clinicians from Brigham and Women’s hospital, software developers from Quantori and engineers from IBM have created a fully open-source variant curation tool, AnFiSA, with the intention to invite and accept contributions from clinicians, researchers and professional software developers. The design of AnFiSA addresses the aforementioned issues via the following architectural principles: using a multidimensional database management system (DBMS) for genomic data to address reproducibility, curated decision trees adaptable to changing clinical rules, and a crowdsourcing-friendly interface to address difficult-to-diagnose cases. Originally, AnFiSA was developed for on-prem deployment and also deployed on Amazon Web Services (AWS) and was later migrated to IBM OpenShift cluster. We will discuss how we have chosen our technology stack and describe the design and implementation of the software and our experience in migration of the software to IBM hybrid cloud. Finally, we will show in detail how selected workflows can be implemented using the current version of AnFiSA by a medical geneticist.

17:00-18:00 Session 34B

Location: Virtual Room B

17:00

Mathews Thomas, Utpal Mangla, Praveen Jayachandran, Sai Srinivasan and Sharath Prasad

Building resilient 5G edge computing environments in a hybrid cloud environment

PRESENTER: Mathews Thomas

ABSTRACT. 5G edge computing is maturing and being adopted by multiple industries, but several challenges remain. One key challenge is operating a resilient 5G network across a true hybrid cloud environment spanning from the public cloud to core networks to multi-edge compute nodes to far edge devices. This session will discuss the challenges and how intelligent and optimized day 0 – day 2 operations of a 5G network across a hybrid cloud environment ensure network deployment and operational aspects are met to create a resilient 5G edge environment. The architecture is built on emerging and maturing technologies using IBM Cloud Satellite with various Cloud Paks including Cloud Pak for Network Automation, AIOps, Data, and Security integrated with 5G networking components to enable key players including Communication Service Providers, Network Equipment Providers, Edge application providers and System integrators to monetize on the investment they are making. Examples of client engagements with underlying architectures, technical research innovation, impact on emerging standards and lessons learned will be discussed. A brief demo will also be presented so that the challenges and solutions to address 5G computing in a hybrid cloud environment are clear to the audience.

18:00-19:00 Session 35A

Location: Virtual Room A

18:00

Mark Buckwell and Stefaan Van Daele

Architecting Security for Regulated Workloads in Hybrid Cloud

PRESENTER: Mark Buckwell

ABSTRACT. Clients have been slow to migrate regulated workloads to hybrid cloud due to the additional risk to the sensitive data and the need to adopt stronger security controls. Organizations need a systematic approach to architect security that integrates zero trust architectural principles. The approach needs to ensure effective security controls are embedded into the solution that is appropriate to regulated workloads hosted in hybrid cloud.

The main objectives of the session are to i) summarize the architectural thinking practices required to integrate security and compliance into regulated workloads on hybrid cloud and ii) provide the next steps to develop skills to enable confident architectural thinking for security.

The session will summarize how security can be integrated into an architectural thinking process by describing the techniques and concepts to use with standard architecture artefacts, together with additional artefacts specific to security.

The practices being discussed are from the Architectural Thinking for Security class that has been updated, based on recent client engagements, to take a cloud-first perspective with zero trust practices. Over 600 students have now completed the class, either as an internal IBM class or as an MSc degree module. The artefacts have also been updated to use the new IBM Design language for technical diagrams and the cloud deployment model recently integrated into the Cognitive Architect tool.

18:00-19:00 Session 35B

Location: Virtual Room B

18:00

Edward McCain, Jeffrey Willoughby and Matthew Pickering

IBM Z development transformation

PRESENTER: Edward McCain

ABSTRACT. This article discusses how the product development cycle is being transformed with “Artificial Intelligence” (AI) for the first time in zSeries history. This new era of AI, under the project name IBM Z Development Transformation (zDT), has allowed the team to grow and learn new skills in data science. This transformation forces change structurally in how data is prepared and stored. In z14, there were incremental productivity gains with enhancements to automation with eServer Automation Test Solution and a technology data analysis engine called zDataAssist. However, in z15, AI will significantly accelerate our efficiency. This article explains how Design Thinking and Agile principles were used to identify areas that are of high impact and feasible to implement: 1) what and how data is collected via System Test Event Logging and Analysis engine, Problem ticket management system (Jupitr), and Processor data analysis engine (Xrings); 2) problem identification, analysis, and management (AutoJup) along with Intelligent Recovery Verification Assistant; 3) product design documentation search engine (AskTheMachine); and 4) prototype microprocessor allocation processes Intelligent Commodity Fulfillment System using Machine Learning. This article details the approach of these areas for z15, the implementation of these solutions under the zDT project, as well as the results and future work.

19:00-20:00 Session 36: Poster Session

Location: Virtual Room A

20:00-20:59 Session 37: Andrew Hately Keynote

Location: Virtual Room A

21:00-22:00 Session 38A

Location: Virtual Room A

21:00

Kevin Yu

Recipe for SRE

ABSTRACT. Have you wondered how teams and organizations achieve that successful SRE outcome? What does success look like?

This session is a highlight of the Podcast Making of the SRE Omelette on Spotify where experts share how they influenced the culture and mindset shift that led to positive business and client success outcome from Site Reliability Engineering.

We will discuss what success look like and how it is measured. How automation and AIOps are in relation to SRE. Innovation and career insights from experts. As well as ingredients and recipes shared by the experts to achieve that SRE outcome.

21:00-22:00 Session 38B

Location: Virtual Room B

21:00

Sima Nadler, Alexey Roytman and Flora Gilboa Solomon

Fybrik: A cloud-native platform for addressing non-functional aspects of data usage

PRESENTER: Sima Nadler

ABSTRACT. Making the most of enterprise data is a huge challenge, especially in a multi-cloud and hybrid cloud environment, and in a world that highly regulates the use of sensitive data. Fybrik enables easier access to data, while orchestrating optimal data flows according to business needs and enforcing data governance policies. Fybrik (https://fybrik.io/v1.0.0/) is an open-source cloud-native infrastructure that enables enterprise wide data governance enforcement based on pre-defined rules. It is a key component of IBM Data and AI's Data Fabric. The use of Fybrik decreases the manual processes currently in place, providing access to data in seconds to minutes rather than months. In addition. Fybrik also negates the need for sharing credentials with data users, increasing security. And, it ensures that new data and copies are only written to allowed locations, based on IT admin preferences as well as data governance restrictions, and automatically registers them in the enterprise data catalog.

Fybrik supports hybrid and multi-cloud environments. To this end, it also provides capabilities for determining the optimal data path between a declared workload and the data sets it uses. Which capabilities should be included (read, write, copy, transform, etc), which storage accounts should be used when writing, and in which cluster/region each capability should run, are all decisions made by Fybrik as it optimizes the data plane for a given workload. It takes into account the context of the workload, metadata of the data, data governance rules, and infrastructure capabilities and the enterprise's priorities for how to leverage its infrastructure.

In this talk, we will introduce Fybrik and its architecture, as well as sharing an example use cases from a pilot done with a major European bank and invite participants to download and try out Fybrik.

22:00-23:00 Session 39A

Location: Virtual Room A

22:00

Sourav Mazumder and Yongli An

Scaling Cloud Pak for Data for Data Fabric usecases

PRESENTER: Sourav Mazumder

ABSTRACT. Enterprises can build different kinds of Data Fabric use cases using IBM Cloud Pak for Data. The use cases can range from Meta Data Ingestion, Meta Data discovery, Data Virtualization, to consumption of Data from Watson Knowledge Catalog or Project. It is extremely important for enterprises to test and benchmark these use cases for performance at scale in terms of data volumes and concurrent users. In this presentation we shall cover best practices around isolating various use cases of Data Fabric into different workloads, based on the flows covering key user actions, data preparation, and a list of APIs for performance and scalability testing of several key data Fabric use cases. We shall also cover the results and learning from the case study, and some performance monitoring best practices to help proactively identifying resource constraints and custom tuning.

22:00-23:00 Session 39B

Location: Virtual Room B

22:00

Cynthia Unwin

Nobody Calls Me If Things Are Going Well

ABSTRACT. Learning Objective: The repeated patterns that lead to teams needing to call the Major Incident Response and Site Reliability team.

Expected Outcome: By following a series of real world examples encountered during Site Reliability (Major Incident) engagements, listeners will learn to understand the importance of non-functional testing; managing technical debt; that observability and scads of irrelevant data are not the same; and that platforms need end to end technical owners. This session will focus on the overarching themes that carry through all of the stories to illustrate common issues causing failure in our customer's Enterprise systems while identifying the solutions that brought stability to these projects.

Session Type: Experiential Sharing of real world examples delivered as lecture.

Cynthia Unwin is an Executive Architect with the IBM Consulting Canada Hybrid Cloud CTO team. She is the Site Reliability Engineering and Major Incident Leader for Hybrid Cloud Services Canada (HCS).

23:00-23:59 Session 40

Location: Virtual Room A

Tamar Eilam, Laura Shwartz, Marc Peters and Vijaya Bashyam

PANEL:Responsible Computing: The Blueprint to Make Sustainability Part of Your Organization’s DNA

PRESENTER: Marc Peters

ABSTRACT. Responsible Computing goals are to ensure IT organization is contributing to – and recognized for – the planet’s sustainable development goals. A focus on responsible computing reduces costs and enhances operational efficiencies while addressing the most pressing challenges of our day environmental sustainability, efficient infrastructure, secure coding, and ethical and transparent systems that reflect our diversity.  The Responsible Computing blueprint ties IT decisions to environmental, social, and corporate governance (ESG) KPIs to meet sustainability goals that also make your organization more operationally efficient. Learn how to integrate your digital transformation efforts into an overall environmental sustainability strategy that transforms business processes into green, intelligent workflows.