Submission #366

Submission information
Submitted by Anonymous (not verified)
Sat, 12/08/2018 - 10:05
R&D Data Science Internship in Industry 4.0 data pipeline development
Computer Science
Ryax Technologies
Headquartered in Lyon, France, Ryax Technologies is an early-stage startup providing software
that enables companies to industrialize their data science. Industrializing data science
requires strong data engineering foundations. Data engineers' responsibilities include complex
tasks such as automating data analytics pipelines, managing hybrid infrastructure, scheduling
workflows, configuring and operating distributed systems, deploying virtualized and
containerized environments, orchestrating batch and stream processing workloads, monitoring
infrastructure and applications, and optimizing Big Data and AI frameworks.

Our software platform, Ryax, implements the necessary data engineering plumbing by
abstracting away the complexity of the underlying infrastructure and systems, giving data
scientists a simple-to-use interface. It lets them deploy their data analytics pipelines while
focusing only on their own expertise: extracting more business value from their data.

Industry 4.0 is the digital transformation of industry. All entities involved in the smart
factories of the Industry 4.0 era, such as machines, people, sensors, actuators, and software
modules, are connected through networking. This enables manufacturing data to be gathered,
monitored, analyzed, and computed to automatically and intelligently control and improve
manufacturing processes. In this context, SCADA (Supervisory Control and Data Acquisition)
plays a crucial role, since it is the system that allows an industrial organization to monitor,
gather, and process real-time data.

This internship will focus on implementing the integrations needed in the Ryax software to
enable the use of SCADA software frameworks within Industry 4.0 data analytics pipelines.
The intern will develop within the Ryax software, making use of its container-based (Docker)
environment bundling and deployment. At least one of two well-known SCADA systems will be
supported: Siemens WinCC, which is proprietary and will be connected through its open-source
SDK, and Eclipse NeoSCADA, which is open source. Integrations of other external tools, such as
preprocessing (ETL tools), temporary storage (SQL or NoSQL databases), deep learning
(TensorFlow, Keras, MXNet), and visualization (Grafana, Tableau), may be implemented or used
to complete a full Industry 4.0 data analytics pipeline.
A realistic testbed will be configured using Raspberry Pi and Intel NUC gateways along with
public or private cloud infrastructures.

The intern will study the state of the art in Industry 4.0 data analytics pipelines. She or he
will develop in Python, R, or Go and will make use of well-known open-source tools such as
Kubernetes, Docker, TensorFlow, and Grafana. After the development phase, the intern will run
experiments on the designed testbed to validate the implementations, provide a performance
analysis, and describe possible paths for optimization.

The intern should ideally have a data science background and must be confident in at least one
language such as Python, R, or Go. No previous experience with Kubernetes, SCADA, or other
tools is needed. However, experience with C, C++, or Java, along with Docker containers and
deep learning frameworks, will be a plus.

Apply by sending CV and motivation letter to Yiannis Georgiou:
577 euros/month
Tue, 01/15/2019