Keywords: Machine Learning, Online Platform, Collaboration, Open Science, Reproducible Research

OpenML (Vanschoren et al. 2014) is an online platform for collaborative and, as the name says, open machine learning. Users can upload, organize, search and download data sets and the corresponding prediction tasks. They can run algorithms (flows) on these tasks and upload the resulting runs for everyone to see and compare with other solutions. By connecting data, tasks, flows and runs online, OpenML enables fast and easy collaboration on prediction tasks.

Aim of the tutorial

In this tutorial we introduce R users to the OpenML R package (Casalicchio et al. 2017) and show how everyone can easily make use of OpenML in their work and share that work with others online. By the end of the tutorial, every attendee should be able to interact with the platform. To get there, we will work hands-on throughout.

Tutorial content

The tutorial will introduce the OpenML concept, the web interface of the server and how to interact with it from R scripts. A large part of the tutorial will demonstrate the different objects and features of the OpenML R package. Participants will learn how to run dozens of different machine learning algorithms from the mlr R package on many OpenML data sets and tasks with very few lines of code, and how to upload their experiments automatically to the server to share them with others. The last hour of the tutorial will be devoted to building a demo project with the participants that answers a realistic machine learning question using OpenML and R.
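To give a flavour of the "very few lines of code" mentioned above, here is a minimal sketch of the task, learner, run and upload workflow. It assumes the OpenML and mlr packages are installed and that the server is reachable. The helper name run_learner_on_task is hypothetical (it is not part of either package), and the task ID and learner in the example call are only illustrative choices.

```r
# Minimal sketch; needs the OpenML and mlr packages and an internet connection.
library(OpenML)
library(mlr)

# Hypothetical helper (not part of either package): download an OpenML task,
# run an mlr learner on it, and optionally upload the resulting run.
run_learner_on_task <- function(task.id, learner.name, upload = FALSE) {
  task <- getOMLTask(task.id = task.id)   # task: data set, target and resampling splits
  learner <- makeLearner(learner.name)    # any mlr learner that fits the task type
  run <- runTaskMlr(task, learner)        # evaluates the learner on the task's splits
  if (upload) {
    # Uploading requires a personal API key from an OpenML account,
    # set beforehand, e.g. via setOMLConfig(apikey = "<your key>").
    uploadOMLRun(run)
  }
  run
}

# Example call (commented out because it contacts the server; the task ID is
# an illustrative assumption):
# run <- run_learner_on_task(59L, "classif.rpart")
# run$bmr  # mlr benchmark result holding the performance values
```

Uploading is kept optional in this sketch so that a run can be inspected locally before it is shared on the server.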

Target audience and prerequisites

We aim to address a broad audience, since OpenML is of interest to many different people.

The only prerequisite is a very basic understanding of machine learning and of R.

The instructors

Joaquin Vanschoren is the founder of OpenML and professor of machine learning at the Eindhoven University of Technology.

Heidi Seibold is a PhD candidate in biostatistics at the University of Zurich.

Bernd Bischl is a professor of computational statistics at LMU Munich, a long-term member of the OpenML project and the creator of the mlr package for machine learning.

OpenML is an open source community project. We gladly represent the amazing OpenML team.

OpenML at useR!

We believe that many useR! attendees will be interested in using OpenML for their work, research and studies. The open source philosophy of R aligns with that of OpenML. A hands-on tutorial is a natural format for introducing OpenML, since the platform is most useful to people who actively participate. The OpenML R package makes participation easy for R users, and three hours should be enough for everyone to start using and benefiting from the package.

References

Casalicchio, Giuseppe, Jakob Bossek, Michel Lang, Dominik Kirchhoff, Pascal Kerschke, Benjamin Hofner, Heidi Seibold, Joaquin Vanschoren, and Bernd Bischl. 2017. “OpenML: An R Package to Connect to the Networked Machine Learning Platform OpenML.” ArXiv E-Prints. https://arxiv.org/abs/1701.01293.

Vanschoren, Joaquin, Jan N. van Rijn, Bernd Bischl, and Luis Torgo. 2014. “OpenML: Networked Science in Machine Learning.” SIGKDD Explorations 15 (2). New York, NY, USA: ACM: 49–60. doi:10.1145/2641190.2641198.