DATATALKS 2019

ABOUT CLOUDERA

At Cloudera, we believe that data can make what is impossible today, possible tomorrow. We empower people to transform complex data into clear and actionable insights. Cloudera delivers an enterprise data cloud for any data, anywhere, from the Edge to AI. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises. Learn more at Cloudera.com.

RSVPs Closed

About this Session

Please join us for a unique Tech Meetup hosted by Cloudera, the enterprise data cloud company, on July 24th at the Novotel Bengaluru Outer Ring Road at 1:30 PM.

This Meetup will unveil and demo Cloudera's flagship product, the Cloudera Data Platform (CDP), to be released in early Fall 2019. It will also feature talks about YuniKorn (Next Generation Scheduler for Apache YARN & Kubernetes) and Ozone (Scaling HDFS to trillions of objects).

Our distinguished panel consists of Cloudera Apache PMC members, committers, and our India Site Leader.

If you are a technology professional, or just someone who is interested in learning more about Cloudera's new products, we encourage you to register and attend this event.

Agenda

01:30pm

Let's go!!!!

Registration

This is where we begin our partnership. Register and let's get started.

02:00pm

Drum roll!!!! by Chid Kollengode

Opening Remarks

Our India Head, Chid, will kick-start the event.

02:15pm

Product demo by Hemanth Yamijala

Cloudera Data Platform (CDP) - Overview And Demo

Cloudera Data Platform (CDP) is a new offering from Cloudera that will enable enterprise customers to consume and manage data in cloud and on-prem environments through a consolidated suite of applications including Data Engineering, Analytics and Machine Learning. Whether customers want to consume their data completely on-prem, burst some workloads to the cloud for a short-term capacity expansion, or migrate from on-prem to cloud or from one cloud to another for reasons of cost efficiency or corporate policy, CDP promises to partner with them through their data journey.

A cloud-first offering followed by an on-prem release, CDP provides a microservice-based control plane through which customers can manage their hybrid compute and data environments and, most importantly, a security and governance framework for managing them consistently. CDP combines the strengths of the Hortonworks Data Platform (HDP) and the Cloudera Distribution of Hadoop (CDH), and further enhances the usability of the platform through a fresh product experience and ruthless automation of infrastructure setup.

03:00pm

Tech talk by Vinod Vavilapalli & Sunil Govindan

YuniKorn: Next Generation Scheduler for Apache YARN & Kubernetes

The resource scheduler of a container orchestration system, such as YARN or Kubernetes, is a critical component that users rely on to plan resources and manage applications. YARN has two powerful schedulers (the Fair Scheduler and the Capacity Scheduler), and both serve many strong use cases in the big data ecosystem. The default Kubernetes (K8s) scheduler is an industry-proven solution for efficiently managing long-running services.

Fragmented resource scheduling is a major obstacle to a seamless big data user experience across container orchestrators. At this point, no solution exists that provides a unified resource scheduling experience across platforms. That makes it extremely difficult to manage workloads running in different environments, from on-premises to cloud.

YuniKorn is a unified scheduler that builds on the established capabilities of YARN and K8s and extends them toward cloud use cases. YuniKorn will be a common scheduler for both YARN and Kubernetes.

03:45pm

Tech Session by Mukul Singh

Ozone: Scaling HDFS to trillions of objects

Ozone is an object store for Hadoop. Ozone solves the small-file problem of HDFS, allowing users to store trillions of files in Ozone and access them as if they were on HDFS. Ozone plugs into existing Hadoop deployments seamlessly, and programs like Hive, LLAP, and Spark work without any modifications. This talk looks at the architecture, reliability, and performance of Ozone.

In this talk, we will also explore the Hadoop distributed storage layer, a block storage layer that makes this scaling possible, and how we plan to use it for scaling HDFS.

04:15pm

Tech Session by Anishek Agarwal

Data Warehouse Experience (DWX)

Data Warehouse Experience is a new offering from Cloudera that will allow enterprise customers to run their warehouse on cloud infrastructure. With Apache Hive at its core, this offering will provide a strong foundation for various warehouse use cases, along with the flexibility of a cloud service that seamlessly autoscales clusters up and down in the cloud.

The service will include tools that let customers identify which workloads are running in their clusters, debug problems with those workloads, and report on how well their data model suits their workloads.

04:45pm

Winding up

Closing remarks / Q&A

Curtains down.... We will wind up with a Q&A session.

05:00pm

It's high tea time!!

High Tea and Networking

We will end the event with a High Tea and Networking session where you can interact with fellow participants and the speakers, because we believe in We, and that is Cloudera's Code.

Speakers

CHID KOLLENGODE

VP of Engineering / Site Leader-India

Chid Kollengode has served as VP of Engineering and Country Head at Cloudera since January 2019. Chid is an engineering and management professional with 25+ years of experience who assembled the big data team at Nokia and centralized all of the company’s worldwide data.

Previously, he led the open source Hadoop MapReduce team at Yahoo! in building the scalable platform driving Yahoo! Search and user data analytics. As Senior Manager/Architect on the Amazon A9 team, he built the company’s first non-Oracle system, one of the early big data systems, to store web search and advertisement data for rigorous analysis.

HEMANTH YAMIJALA

Director of Engineering & Apache Committer

Hemanth currently leads the effort at Cloudera Bangalore to build DataPlane Services, a new generation of capabilities that form a platform for building hybrid, multi-cluster data and infrastructure management applications. He also leads the team that builds Hortonworks Data Steward Studio, an application that addresses data governance and security problems for large organisations using the Hadoop stack.

His primary area of interest is building large-scale distributed systems, and he has experience building both frameworks and the applications that use them. He was an early contributor, committer and project lead of Hadoop MapReduce; of Hadoop on Demand, the earliest provisioning system for Hadoop on a shared cluster; and of the first version of the Capacity Scheduler, which continues to be one of the main schedulers in Hadoop today.

Vinod Vavilapalli

Director of Engineering & VP, Apache Hadoop

Vinod Kumar Vavilapalli has been contributing to the Apache Hadoop project full-time since mid-2007. At the Apache Software Foundation, he is V.P. of Apache Hadoop, a long-term Hadoop contributor, committer, member of the Project Management Committee, and an ASF member. He is Director of Engineering at Cloudera and runs the Compute platform teams there. Before Hortonworks, he was at Yahoo!, working in the Grid team that made Hadoop what it is today, running at large scale on up to tens of thousands of nodes.

Vinod loves reading books of all kinds and is passionate about using computers to change the world for the better, bit by bit. He has a bachelor’s degree in computer science and engineering from the Indian Institute of Technology Roorkee.

Sunil Govindan

Engineering Manager & Apache Hadoop PMC

Sunil Govindan is an Engineering Manager at Cloudera, leading the Compute Platform team from Bengaluru, India. He has been contributing to the Apache Hadoop project since 2013 in various roles: Hadoop contributor, Hadoop committer, and member of the Project Management Committee (PMC). He contributes mainly to YARN scheduling improvements such as intra-queue resource preemption, support for multiple resource types in YARN with resource profiles, and absolute resource configuration support in queues.

Mukul Singh

Engineering Manager & Apache Hadoop PMC

Mukul is currently an Engineering Manager at Cloudera, where he leads the HDFS team. He has worked on storage systems and file systems for 9 years in various roles, including open source contributor, PMC member, researcher, and software developer.

He has also worked at Nimble Storage and NetApp, on the CASL and WAFL file systems respectively. He graduated from Carnegie Mellon University, where his thesis was on a file system for shingled magnetic recording disks.

Anishek Agarwal

Engineering Manager & Apache Hive Committer

Anishek has 15+ years of experience in the software industry and is currently an Engineering Manager based in our Bangalore office. He oversees various teams at Cloudera, including those working on replication for Apache Hive, Data Analytics Studio, Hive Warehouse Connector, and the DWX UI. He has been working in the big data space for about 8 years, with experience in building entire data platforms. He is also an Apache Hive Committer.

Cloudera + Hortonworks, from the Edge to AI

To know more about CDP, please visit here.
