At Kavi Global, we believe the best way to learn is to share knowledge, and we wanted to extend this spirit of learning to the data enthusiast community as well.

With the aim of spreading awareness about emerging big data technologies, Kavi Global organized a meetup on Sep 21, 2017, on “Spark Machine Learning libraries”. Srinivasan Chandrasekaran and Neetha Sindhu from Kavi Global’s R&D Team introduced machine learning concepts and demonstrated models through case studies.

The following topics were explained in detail:
• What is Machine Learning?
• Introduction to Spark
• Spark Machine Learning Libraries
• Feature Extraction, Selection and Transformation
• Regression, Classification, Clustering and Collaborative Filtering
• Pipeline Concepts
• Model Tuning
• Case Studies


The talk highlighted the difference between machine learning and statistical modeling. Srinivasan explained real-life applications of various feature engineering and modeling algorithms, ranging from text mining, fraudulent insurance claim detection, and predicting the success of college football players to stock price prediction.

The participants were then presented with two case studies that used Spark’s machine learning libraries through the analytics tool Advana:

1. Predicting a person’s favorite song using a song’s attributes. Classification algorithms such as logistic regression, a multi-layer perceptron, and a decision tree classifier were used for this objective.
2. Customer segmentation for a wholesale company based on customers’ spending patterns across various commodities. A K-means clustering model was used to demonstrate this case study.
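
For readers who want a feel for what K-means does in a segmentation setting like the second case study, here is a minimal sketch in plain Python. It is not the Advana or Spark workflow shown at the meetup: the `kmeans` helper, the two commodity columns, and the spend figures are all made up for illustration, and seeding the centroids with the first k points is a simplification.

```python
import math

def kmeans(points, k, iterations=20):
    """Basic Lloyd's algorithm; the first k points seed the centroids (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical annual spend per customer: (fresh produce, grocery).
spend = [
    (12000, 3000), (2500, 9000), (11000, 2500),
    (13000, 3500), (3000, 9500), (2000, 10000),
]
labels, centroids = kmeans(spend, k=2)
print(labels)
print(centroids)
```

On this toy data the customers split into a produce-heavy group and a grocery-heavy group. In Spark, the same idea is available through the `KMeans` estimator in `pyspark.ml.clustering`, which is better suited to data that does not fit on one machine.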

In the closing Q&A session, participants quizzed Srinivasan and Neetha about how machines learn and about the increasing adoption of machine learning algorithms in our daily lives.

Kavi Global’s first meetup was met with an enthusiastic reception, and we will soon be hosting another, this time on MongoDB!