Submission #170

Submission information
Submitted by Anonymous (not verified)
Mon, 01/16/2017 - 10:26
Experimenting with machine learning for automatic user segmentation
Shapr is a leading networking application developer. At Shapr, we believe networking should be a joyful experience, because it's about meeting new people, not just a trick to generate more sales or find some humdrum job. We see it as a lifestyle, and that's why we made an app. The Shapr app brings you a daily dose of inspiring people to meet. It comes up with the most relevant recommendations for you and, once you both accept the connection, you can chat and meet.

Shapr has delivered millions of personalized suggestions to over 100,000 users on every continent. Over 80% of our users are based in the U.S. (New York, Los Angeles, Chicago, Washington...). The app is growing by over 100% a month and is one of the fastest-growing apps in the meeting industry. In 2017, Shapr will focus on its growth strategy and expand to new U.S. cities as well as leading world capitals.
Shapr is headquartered in New York City with offices in Paris. With us, you will be part of an international, dynamic, and responsible company that fosters people’s inspiration around the world. You will join a team with a successful entrepreneurial background (the co-founders of the Attractive World dating platform).

The Shapr app comes up with a daily batch of people to meet. The underlying suggestion models should make the suggestions most likely to lead to matches (e.g., based on common interests). To this end, we are now experimenting with new machine learning methods in a data-driven fashion, which raises issues of data quality. This internship is part of a larger project that aims to ensure a high-quality, real-time data flow to the learning models.

Specifically, you will be in charge of implementing a full-stack process for automatic user segmentation. Segmentation aims to characterize each user with a score or a discriminant category based on their job title, experience, and other profile information. The task can naturally be cast as a supervised machine learning task: predicting a category (segment) from a mix of structured and unstructured data.
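
To give a rough idea of what casting segmentation as supervised learning can look like, here is a minimal sketch that predicts a segment from a free-text job title. It uses a multinomial naive Bayes classifier, picked only because it fits in a few lines of standard-library Python; it is not a model prescribed for the internship. The class name `NaiveBayesSegmenter`, the toy job titles, and the "tech"/"business" segment labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a free-text profile field and split it on whitespace."""
    return text.lower().split()


class NaiveBayesSegmenter:
    """Multinomial naive Bayes with add-one smoothing over word counts."""

    def fit(self, texts, segments):
        self.segment_counts = Counter(segments)
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()
        for text, segment in zip(texts, segments):
            tokens = tokenize(text)
            self.word_counts[segment].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def log_scores(self, text):
        """Return the unnormalized log-probability of each segment."""
        total = sum(self.segment_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for segment, count in self.segment_counts.items():
            words = self.word_counts[segment]
            denom = sum(words.values()) + vocab_size
            score = math.log(count / total)  # log prior of the segment
            for token in tokenize(text):
                if token in self.vocabulary:  # words never seen in training are ignored
                    score += math.log((words[token] + 1) / denom)
            scores[segment] = score
        return scores

    def predict(self, text):
        """Return the segment with the highest score."""
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical toy data: job titles with made-up segment labels.
TITLES = [
    "software engineer", "senior data engineer", "backend developer",
    "sales manager", "account executive sales", "business development manager",
]
SEGMENTS = ["tech", "tech", "tech", "business", "business", "business"]

model = NaiveBayesSegmenter().fit(TITLES, SEGMENTS)
print(model.predict("data engineer"))
print(model.predict("sales director"))
```

In practice the input would combine structured fields (such as years of experience) with unstructured text, and the per-segment scores from `log_scores` could serve as the "score" mentioned above instead of a hard category.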

As part of the Data Science team, you will work closely with the Data Science Lead and the CTO. Your mission will include, but is not limited to:

- Experiment with a set of supervised learning models for automatic user segmentation (linear/logistic models, kernel-based models, latent-variable models…)
- Perform and evaluate relevant data preprocessing
- Evaluate the predictive models for real-time segmentation (quality, efficiency)
- Build a proof of concept (POC) of unsupervised learning models for user segmentation
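
As an illustration of the evaluation item above, the following sketch measures both criteria for any prediction function: quality as accuracy on held-out labeled examples, and efficiency as mean wall-clock time per prediction. The `evaluate` helper, the `keyword_baseline` stand-in model, and the held-out examples are hypothetical; a real evaluation would use a trained model and far more data.

```python
import time


def evaluate(predict, examples):
    """Return (accuracy, mean seconds per prediction) over labeled examples."""
    correct = 0
    start = time.perf_counter()
    for text, expected in examples:
        if predict(text) == expected:
            correct += 1
    elapsed = time.perf_counter() - start
    return correct / len(examples), elapsed / len(examples)


def keyword_baseline(title):
    """Hypothetical stand-in for a trained model: a single keyword rule."""
    return "tech" if "engineer" in title.lower() else "business"


# Made-up held-out examples: (job title, expected segment).
HELD_OUT = [
    ("Software Engineer", "tech"),
    ("Sales Manager", "business"),
    ("Data Scientist", "tech"),
    ("Account Executive", "business"),
]

accuracy, seconds_per_prediction = evaluate(keyword_baseline, HELD_OUT)
print(f"accuracy={accuracy:.2f}, latency={seconds_per_prediction:.6f}s")
```

Reporting the two numbers side by side makes the trade-off explicit: a more accurate model is only useful for real-time segmentation if its per-prediction latency stays acceptable.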

As part of the data science process, you’ll be expected to implement some data access and data cleaning routines. You will also carry out a brief literature review prior to the analysis.

Keywords: Machine learning, Text mining, Profiling, Categorization, Scoring, Descriptive statistics, Linear models.

Thu, 02/16/2017