ML6 Internship Projects
At ML6, we focus on cutting-edge technologies, which is only possible by investing time in research. While doing research, our ML6 agents sometimes discover interesting topics. Unfortunately, an ML6 agent doesn't always have enough resources to investigate these topics in depth.
That's why ML6 is opening up parts of its research in the form of internships and theses. As an intern or thesis ML6 agent, you'll be guided by an expert on the topic and enjoy the full ML6 experience, while gaining valuable experience with the latest technologies in a fast-paced environment.
You will be a full member of our team and work on one of the following internal projects as a Machine Learning Engineer or Data Engineer. Keep a close eye on this page as we are continuously adding new exciting projects across our different locations.
Note: We also offer the opportunity to do an internship for longer than 3 months (up to 6 months). This means you will be working on one of the projects below for the first 3 months, after which you will be closely involved in the day-to-day activities of ML6.
1. ML for Bike Sharing Systems
Time series - Supervised Learning - Unsupervised Learning - Machine Learning - Artificial Intelligence
Bike and other transport sharing systems are becoming available in a lot of cities around the world. In this internship, we are going to focus on the systems available in Antwerp.
We’ve created or requested data sets from each provider. During this internship, we want to learn more about the opportunities and challenges of these providers. Based on this input, you will look for solutions using ML and related (open) data.
For this internship, you will work in a team of ML6 interns with a mix of business and technical profiles. The team will be guided by Koen, an ML6 mentor. You will learn not only from ML6 but also from the client (Antwerp transport providers) and your fellow interns.
We created a data set based on the availability of Velo bikes in Antwerp.
In addition, public local weather information is available in Google BigQuery.
In the internship, we would like to investigate the value ML can bring to all the stakeholders.
Examples include better availability forecasting for Velo and its users, segmentation of user types for marketing and planning purposes, and anomaly detection.
You will work with Google Cloud Platform and the ML libraries we typically use at ML6 (TensorFlow / scikit-learn / …). As part of the internship, you will have time to investigate which other open-source libraries for time series are available.
The goal is to present how you analysed and pre-processed the data sets. Then show what benefits ML can bring and explain what the customer could do to improve.
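To make the availability forecasting idea more concrete, here is a minimal sketch of a seasonal-naive baseline: it predicts each future reading as the average of past readings at the same position in the cycle. Any ML model built during the internship would presumably need to beat a baseline like this. The function names and the sample counts are hypothetical and are not taken from the actual Velo data set.

```python
from statistics import mean


def seasonal_naive_forecast(history, season_length, horizon):
    """Forecast each future step as the mean of past values at the same phase of the season."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    forecasts = []
    for step in range(horizon):
        # Position within the season of the step being forecast.
        phase = (len(history) + step) % season_length
        # All past observations that share this position.
        same_phase = history[phase::season_length]
        forecasts.append(mean(same_phase))
    return forecasts


def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and forecast values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical bikes-available counts at one station, 4 readings per "day".
history = [10, 4, 2, 8,
           12, 6, 2, 10]
forecast = seasonal_naive_forecast(history, season_length=4, horizon=4)
print(forecast)

# Compare against made-up observations for the next "day".
print(mean_absolute_error([12, 5, 3, 9], forecast))
```

In practice the season length would follow the sampling rate of the data (for example, readings per day or per week), and weather features from BigQuery could be added once a learned model replaces the baseline.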
2. Drone Imagery Anonymization
Drone - Computer Vision - Anonymization - Machine Learning - Artificial Intelligence
The GDPR introduced new regulations on how to process and handle personal data. As such, meeting the GDPR requirements is key when selling software applications to customers. Anonymization of textual data is largely accepted as feasible, but the same cannot yet be said for images. For example, a face detection algorithm sometimes fails, leaving a face unblurred. Furthermore, vehicles are often recognizable not just by their number plates but also by logos or other characteristics. A specific research challenge for drone imagery is that the camera is mobile and that real-world footage often features rapid zooming in and out. In this thesis, we want to mitigate these problems.
Detection and tracking of moving objects such as people and vehicles is challenging in drone footage, as camera movement can be sudden and fast and zooming further complicates matters. Because anonymization cannot afford to miss a single frame or object edge, this is a particularly challenging problem. One possible research direction is to use motion sensors and/or drone input signals to help anticipate movement and improve tracking.
Research and create a Machine Learning algorithm that can anonymize faces and vehicles in images captured by a drone, taking into account the above challenges. Technologies that can be used are Python, TensorFlow, Keras and, in general, the Python data science and machine learning stack.
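As a rough illustration of the "no edges can be missed" requirement, the sketch below pads a detected bounding box with a safety margin before masking it, so that a slightly misaligned detection still covers the whole object. It assumes the box comes from some detector that is not shown here. The image is a plain nested list of grayscale values, and the function names, box convention (left, top, right, bottom, with right and bottom exclusive) and default margin are all hypothetical choices for this example.

```python
def expand_box(box, margin, width, height):
    """Grow a (left, top, right, bottom) box by `margin` pixels, clipped to the image."""
    left, top, right, bottom = box
    return (max(0, left - margin), max(0, top - margin),
            min(width, right + margin), min(height, bottom + margin))


def anonymize_region(image, box, margin=1):
    """Return a copy of a grayscale image with the padded box replaced by its mean intensity."""
    height, width = len(image), len(image[0])
    left, top, right, bottom = expand_box(box, margin, width, height)
    pixels = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
    result = [row[:] for row in image]  # leave the input image untouched
    if not pixels:
        return result
    fill = sum(pixels) // len(pixels)
    for y in range(top, bottom):
        for x in range(left, right):
            result[y][x] = fill
    return result


# Toy 4x4 frame with one bright "face" pixel at (x=1, y=1).
frame = [
    [0, 0, 0, 9],
    [0, 90, 0, 9],
    [0, 0, 0, 9],
    [9, 9, 9, 9],
]
for row in anonymize_region(frame, box=(1, 1, 2, 2), margin=1):
    print(row)
```

With real footage the margin could be scaled with the estimated camera motion or zoom speed, which is one place where the drone input signals mentioned above might help.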
3. Mirror Of Erised
Mirror - Computer Vision - Object Detection - Machine Learning - Artificial Intelligence
During the last Christmas project, we built a smart mirror platform, complete with a basic set of implementations such as people recognition and the Google voice assistant. We used an existing open-source project to implement our features (https://github.com/MichMich/MagicMirror). The hardware is fully functional and is an open canvas for impressive software implementations.
The ML6 office has agents and clients coming in and out all the time. With the Mirror of Erised, ML6 wants to personalise the welcome of every person that walks in or out of the office.
The final goal of this project consists of various milestones:
- Solidify the facial recognition module on the existing MagicMirror setup
- Integrate a mechanism to switch profiles, based on the detected ML6 employee
  - A possible solution here is to have a separate database with an employee configuration, and a backend API that interfaces with this database and that the MagicMirror can call
- Define and integrate a mechanism to onboard new, previously unseen ML6 employees
- Assess an appropriate candidate third-party library within the ML6 software ecosystem to integrate with
  - This should allow potential use cases such as:
    - The employee goes home at the end of the day
    - He or she passes by the mirror
    - The mirror provides him or her with information on the train schedule, based on the registered home city
    - The mirror notifies the employee that his or her timesheets are incomplete
- Create a demo mode for conventions by implementing cool showcases on anonymous persons, like the Google Vision API detecting your mood
- Badge detection for all ML6 employees + stranger alerts!
Based on the previously defined milestones, a large number of interesting extensions can be developed, such as:
- The employee can fill in his or her timesheets through a voice-guided assistant integrated with the MagicMirror
- ML6 has purchased a Leap Motion Controller (link) which could be integrated with the MagicMirror platform
A new and upcoming Meetup group is the ‘Home Automation’ meetup group (link). The result of this project would make a perfect talk at this event, if the student is up for the challenge.
4. Email Clustering
Emails - Natural Language Processing - Unsupervised Learning - Machine Learning - Artificial Intelligence
At ML6, we are launching a product that automates parts of big email inboxes (e.g. typical info@ email inboxes).
We split the problem into 2 parts:
- Finding relevant categories of emails (denoted as labels hereafter).
- Training a classifier to correctly classify emails into labels and automate the processing of these emails.
At this moment, we already have a working version of the 2nd part, and we are manually looking for interesting clusters for each individual client. Automating the retrieval of interesting clusters/labels is very important to make this a scalable product. This would be the topic of this thesis.
We found a dataset that contains labeled English emails. This dataset will be used for the unsupervised clustering. The labels could be used to assess the quality of the clustering algorithm.
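One simple way to use the existing labels for assessing a clustering is cluster purity: the fraction of emails whose label matches the majority label of the cluster they ended up in. The sketch below is only an illustration of that metric. The function name and the example labels are hypothetical, and the thesis may well settle on a different measure.

```python
from collections import Counter, defaultdict


def cluster_purity(cluster_ids, true_labels):
    """Fraction of items whose label matches the majority label of their cluster."""
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have the same length")
    if not cluster_ids:
        raise ValueError("at least one item is required")
    # Count how often each true label occurs inside each cluster.
    labels_per_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        labels_per_cluster[cluster][label] += 1
    # Each cluster contributes the size of its largest label group.
    majority_total = sum(counts.most_common(1)[0][1]
                         for counts in labels_per_cluster.values())
    return majority_total / len(cluster_ids)


# Hypothetical cluster assignments and ground-truth labels for six emails.
cluster_ids = [0, 0, 0, 1, 1, 2]
true_labels = ["invoice", "invoice", "complaint", "complaint", "complaint", "job"]
print(cluster_purity(cluster_ids, true_labels))
```

Purity rewards many small clusters (one email per cluster scores 1.0), so it is usually read together with the number of clusters or a complementary measure.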
Ideally, the student should look into automating the retrieval of potential email clusters based on the content of the emails. Furthermore, automatically retrieving suitable “label” names for a cluster might be interesting. We could also try to do this in a multi-language context instead of an English-only context.
Create an unsupervised Machine Learning algorithm that can create clusters when given a dataset of emails. Technologies that can be used are Python, TensorFlow, Keras and, in general, the Python data science and machine learning stack.
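To give a feel for what content-based clustering of emails involves, here is a deliberately simple sketch: each email becomes a bag-of-words vector, and a single pass assigns it to the most similar existing cluster or starts a new one when nothing is similar enough. This is not the method used in the ML6 product. The technique (single-pass threshold clustering on cosine similarity), the function names, the threshold and the sample emails are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors stored as Counters."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def single_pass_clustering(texts, threshold=0.3):
    """Assign each text to the most similar cluster leader, or open a new cluster."""
    leaders = []      # the first vector of each cluster acts as its representative
    assignments = []
    for text in texts:
        vector = Counter(tokenize(text))
        best_cluster, best_score = None, threshold
        for cluster_id, leader in enumerate(leaders):
            score = cosine_similarity(vector, leader)
            if score >= best_score:
                best_cluster, best_score = cluster_id, score
        if best_cluster is None:
            leaders.append(vector)
            best_cluster = len(leaders) - 1
        assignments.append(best_cluster)
    return assignments


# Made-up emails; a real inbox would need stop-word handling and better features.
emails = [
    "Please send me the invoice for my order",
    "Where is the invoice for my last order",
    "I want to apply for the open job position",
    "Is the job position still open",
]
print(single_pass_clustering(emails))
```

The result depends on the order of the emails and on the threshold, which is one reason more robust approaches (TF-IDF weighting, learned embeddings, k-means or hierarchical clustering) are worth exploring.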
5. Office of the Future
Sensors - IoT - Hardware - Data visualisation - Data engineering
At ML6, we’re always working with the latest Machine Learning models and technologies. It’s time to make our office future-proof as well. By using sensors and dashboards, we would like to know if the dishwasher is running, if there are any open doors, what the temperature and humidity in the office are, or if we need to order new snacks. We have a 3D printer and a hardware corner where we can prototype and build new IoT devices and sensors.
Your challenge is to analyse the needs of an office and look for new ways in which IoT can make our office management more efficient.
- You’ll build an IoT infrastructure from scratch using Raspberry Pis and Arduinos for the hardware and Python to write the software.
- You will look at different ways to set up a network of sensors securely and capture all the data with technologies like InfluxDB.
- Once the data is gathered you will extract relevant metrics and look into different ways of visualising the data with Grafana or Plotly.
- To connect our IoT network to the cloud we’ll integrate it with Google Cloud Platform’s IoT capabilities.
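As an illustration of the data capture step in the list above, the sketch below formats a single sensor reading in InfluxDB's line protocol, the text format InfluxDB accepts for writes: measurement name, comma-separated tags, a space, comma-separated fields, a space, and a timestamp. The measurement, tag and field names are hypothetical, only float fields are handled, and escaping of special characters is left out to keep the example short.

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one reading as an InfluxDB line-protocol string (float fields only).

    Assumes names and tag values contain no spaces, commas or equals signs,
    since escaping is not implemented in this sketch.
    """
    if not fields:
        raise ValueError("at least one field is required")
    # Tags are sorted by key, which InfluxDB recommends for write performance.
    tag_part = "".join(f",{key}={value}" for key, value in sorted(tags.items()))
    field_part = ",".join(f"{key}={float(value)}"
                          for key, value in sorted(fields.items()))
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


# Hypothetical reading from a Raspberry Pi in the kitchen.
line = to_line_protocol(
    measurement="office_climate",
    tags={"room": "kitchen", "device": "pi-01"},
    fields={"temperature": 21.5, "humidity": 40},
    timestamp_ns=1700000000000000000,
)
print(line)
```

A real setup would more likely rely on an InfluxDB client library, which takes care of escaping, batching and authentication.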
We are a team of AI experts and the fastest-growing AI company in Belgium. With offices in Ghent, Amsterdam, Berlin and London, we build and implement self-learning systems across different sectors to help our clients operate more efficiently. We do this by staying on top of research and innovation and by applying our expertise in practice. To find out more, please visit www.ml6.eu