DSS Online #4: Tech & Ethics for the Open Source AI: The Linux Foundation AI

On Friday, 23 October 2020, starting as usual at 18:00, DataScienceSeed was back, this time in an international edition held entirely in English, with exceptional guests from the U.S.!

For the first international DataScienceSeed event, our 4th online meetup, we had the pleasure of meeting the Linux Foundation AI, part of the Linux Foundation. Their mission is to build and support an open AI community and to drive open source innovation in the AI, ML and DL domains by enabling collaboration and creating new opportunities for all members of the community.

Give us your feedback on the event at this link!

We started with an intro to LF AI, then dug deeper into two of their projects, touching on both technical and ethical topics: two sides of the same Artificial Intelligence coin, today and increasingly in the future.

LF AI and Open Source: Accelerating Innovation in the AI Market

Over the past two decades, open source software — and its collaborative development model — has disrupted multiple industries and technology sectors, including the Internet/web, telecom, and consumer electronics. Today, large scale open source projects in new technology sectors like blockchain and artificial intelligence are driving the next wave of disruption in an even broader span of verticals ranging from finance, energy and automotive to entertainment and government.

In this talk, Dr. Haddad provided a quick overview of the efforts of the LF AI Foundation in supporting the development, harmonization, and acceleration of open source AI projects and how to get involved.

Download Ibrahim’s slides (pdf)

The easiest way to get in touch with LFAI is to join the Slack channel

If you want to know more, have a look at the session Ibrahim held at the AI for People summer workshop, which is where we first met him!

Ibrahim Haddad (Ph.D.) is the Executive Director of the LF AI Foundation. Prior to the Linux Foundation, Haddad served as Vice President of R&D and Head of the Open Source Division at Samsung Electronics. Throughout his career, Haddad has held several technology and portfolio management roles at Ericsson Research, the Open Source Development Lab, Motorola, Palm and Hewlett-Packard. He graduated with Honors from Concordia University (Montréal, Canada) with a Ph.D. in Computer Science, where he was awarded the J. W. McConnell Memorial Graduate Fellowship and the Concordia University 25th Anniversary Fellowship.

End-to-End Deep Learning Deployment with ONNX

A deep learning model is often viewed as fully self-contained, freeing practitioners from the burden of data processing and feature engineering. However, in most real-world applications of AI, these models have similarly complex requirements for data pre-processing, feature extraction and transformation as more traditional ML models.

Any non-trivial use case requires care to ensure no model skew exists between the training-time data pipeline and the inference-time data pipeline. This is not simply theoretical – small differences or errors can be difficult to detect but can have dramatic impact on the performance and efficacy of the deployed solution. Despite this, there are currently few widely accepted, standard solutions for enabling simple deployment of end-to-end deep learning pipelines to production.
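To make the skew problem concrete, here is a minimal sketch in plain Python. It is not taken from the talk: the threshold "model", the feature values and the stale serving mean are all hypothetical, chosen only to show how a small mismatch in pre-processing can flip a prediction.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Learn standardization statistics from the training data."""
    return {"mean": mean(values), "std": pstdev(values)}


def transform(value, scaler):
    """Standardize one raw feature value with the given statistics."""
    return (value - scaler["mean"]) / scaler["std"]


def predict(scaled_value, threshold=0.0):
    """Hypothetical stand-in for a trained model: a simple threshold rule."""
    return 1 if scaled_value > threshold else 0


# Training time: statistics are computed from the training set.
train_values = [10.0, 20.0, 30.0, 40.0]
scaler = fit_scaler(train_values)

# Inference, consistent pipeline: the same statistics are reused.
request = 28.0
consistent = predict(transform(request, scaler))

# Inference, skewed pipeline: the serving code was re-implemented with a
# hard-coded, out-of-date mean (an assumption made for illustration).
skewed_scaler = {"mean": 30.0, "std": scaler["std"]}
skewed = predict(transform(request, skewed_scaler))

print(consistent, skewed)
```

The same request gets a different prediction from each pipeline, with no error or warning raised. Exporting the pre-processing steps together with the model is one way to rule out this class of bug.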

Recently, the Open Neural Network Exchange (ONNX) standard has emerged for representing deep learning models in a standardized format. While this is useful for representing the core model inference phase, we need to go further to encompass deployment of the end-to-end pipeline. In this talk, Nick introduced ONNX for exporting deep learning computation graphs, as well as the ONNX-ML component of the specification, which covers “traditional” ML models along with common feature extraction, data transformation and post-processing steps. He covered how to use ONNX and the growing ecosystem of exporter libraries for common frameworks (including TensorFlow, PyTorch, Keras, scikit-learn and Apache SparkML) to deploy complete deep learning pipelines. Finally, he explored best practices for working with and combining these disparate exporter toolkits, and highlighted the gaps, issues and missing pieces that need to be taken into account or are still to be addressed.

Nick Pentreath (Open Source Developer, Developer Advocate) – Principal Engineer, IBM CODAIT – Nick is a Principal Engineer at IBM. He is an Apache Spark committer and PMC member and author of Machine Learning with Spark. Previously, he co-founded Graphflow, a startup focused on recommendations and customer intelligence. He has worked at Goldman Sachs, Cognitive Match, and led the Data Science team at Mxit, Africa’s largest social network. He is passionate about combining commercial focus with machine learning and cutting-edge technology to build intelligent systems that learn from data to add business value.

Download Nick’s slides (pdf)

AI Fairness 360 – an open source toolkit to mitigate discrimination and bias in machine learning models

Machine learning models are increasingly used to inform high-stakes decisions. Discrimination by machine learning becomes objectionable when it places certain privileged groups at a systematic advantage and certain unprivileged groups at a systematic disadvantage. Bias in training data, due to prejudice in labels and under- or oversampling, yields models with unwanted bias. The AIF360 R package is an R interface to AI Fairness 360, a comprehensive toolkit that provides metrics to check for unwanted bias in datasets and machine learning models, and state-of-the-art algorithms to mitigate such bias. This session explored the metrics and algorithms provided in the AI Fairness 360 toolkit and included a hands-on lab in R.
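To give a feel for what a group fairness metric measures, here is a small sketch in plain Python. It does not use the AIF360 API and is not the material from the hands-on lab: the function names and the toy data are hypothetical. It computes two widely used metrics under their usual definitions: statistical parity difference (favorable-outcome rate of the unprivileged group minus that of the privileged group) and disparate impact (the ratio of the same two rates).

```python
from typing import Sequence


def favorable_rate(labels: Sequence[int], groups: Sequence[int], group: int) -> float:
    """Share of favorable outcomes (label 1) among members of one group."""
    selected = [y for y, g in zip(labels, groups) if g == group]
    if not selected:
        raise ValueError(f"no samples for group {group}")
    return sum(selected) / len(selected)


def statistical_parity_difference(labels, groups, privileged=1, unprivileged=0):
    """Unprivileged rate minus privileged rate; 0 means parity."""
    return (favorable_rate(labels, groups, unprivileged)
            - favorable_rate(labels, groups, privileged))


def disparate_impact(labels, groups, privileged=1, unprivileged=0):
    """Unprivileged rate divided by privileged rate; 1 means parity.

    Raises ZeroDivisionError if the privileged group has no favorable outcomes.
    """
    return (favorable_rate(labels, groups, unprivileged)
            / favorable_rate(labels, groups, privileged))


# Toy data (hypothetical): 1 = favorable outcome; group 1 = privileged.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
groups = [1, 1, 1, 1, 0, 0, 0, 0]

spd = statistical_parity_difference(labels, groups)
di = disparate_impact(labels, groups)
print(f"statistical parity difference: {spd:.2f}")
print(f"disparate impact: {di:.2f}")
```

In this toy dataset the privileged group receives the favorable outcome three times out of four and the unprivileged group once out of four, so both metrics fall well short of parity.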

AIF360 is a sub-project of Trusted AI

Saishruthi Swaminathan (Developer Advocate, Open Source Developer) is a developer advocate and data scientist in the IBM CODAIT team whose main focus is to democratize data and AI through open source technologies. She has a Master's in Electrical Engineering specializing in Data Science and a Bachelor's degree in Electronics and Instrumentation. Her passion is to dive deep into the ocean of data, extract insights and use AI for social good. Previously, she worked as a Software Developer. She is on a mission to spread the knowledge and experience she acquired in her learning process. She also leads an education initiative for rural children and organizes meetups focusing on women's empowerment.