This article is the first in a series on the application of artificial intelligence in education. We will discuss the challenges of data science and the operationalization of AI.
At Optania, caring is at the core of our business. We are innovating with ISA, a technology that uses machine learning to improve student retention, with the well-being and success of students at heart. In education, improving college graduation rates requires modern statistics combined with computer science (Castro & New, 2016). According to the Ministère de l’Enseignement supérieur, the dropout rate in 2015 was close to 29% after 5 years of study, all Cégep programs combined.
The predictive model developed by the artificial intelligence team at Optania is a decision support tool to identify students who are at risk of course failure and dropout; all stakeholders and managers responsible for student success and retention in college will benefit from using this tool.
Because the ISA tool is predictive and based on historical student data, college staff know each student's risk level even before the first results of the semester are entered. The educational stakeholders in charge can then make preventive interventions quickly.
A decision-making aid
Our main Cégep partners, Cégep de Chicoutimi, Cégep de Rimouski and Cégep de Trois-Rivières, have a total of nearly 10,000 students per year. Since the stakeholders have to manage a large number of student files, they need to analyze thousands of academic files quickly and efficiently in order to detect the early signs of a risk of failure or dropping out.
The machine learning model developed by Optania is calibrated to be more sensitive to detecting failure and dropout than to detecting success and persistence. Indeed, according to a study on educational persistence and success (Guillemette et al., 2018) "the consensus is that one would rather mistakenly label a potential graduate as a potential dropout (and provide them with interventions to prevent dropping out) than mistakenly label a dropout as a potential graduate and provide them with no interventions..."
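One common way to make a classifier more sensitive to the at-risk class is to lower the probability threshold above which a student is flagged. The sketch below illustrates that trade-off only; it is not Optania's method, and the scores, outcomes, thresholds and function names are all hypothetical.

```python
# Hypothetical illustration: none of these numbers come from ISA.
# Each pair is (predicted probability of dropout, actual outcome: 1 = dropped out).
scored_students = [
    (0.92, 1), (0.81, 1), (0.66, 0), (0.58, 1), (0.47, 1),
    (0.41, 0), (0.35, 1), (0.22, 0), (0.15, 0), (0.08, 0),
]


def confusion_counts(scored, threshold):
    """Count outcomes when every score >= threshold is flagged as at risk."""
    tp = sum(1 for p, y in scored if p >= threshold and y == 1)  # dropouts caught
    fp = sum(1 for p, y in scored if p >= threshold and y == 0)  # false alarms
    fn = sum(1 for p, y in scored if p < threshold and y == 1)   # dropouts missed
    tn = sum(1 for p, y in scored if p < threshold and y == 0)   # correctly left alone
    return tp, fp, fn, tn


def recall(scored, threshold):
    """Share of actual dropouts that the threshold manages to flag."""
    tp, _, fn, _ = confusion_counts(scored, threshold)
    return tp / (tp + fn) if (tp + fn) else 0.0


if __name__ == "__main__":
    for threshold in (0.5, 0.3):
        tp, fp, fn, tn = confusion_counts(scored_students, threshold)
        print(
            f"threshold={threshold}: caught {tp}/{tp + fn} dropouts, "
            f"{fp} false alarms, {fn} missed"
        )
```

On this toy data, moving the threshold from 0.5 to 0.3 catches every dropout at the cost of one extra false alarm, which is the kind of trade the quoted consensus accepts.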
The ISA dashboard does not replace the professional judgment of the practitioner, but rather accompanies it. It provides an overview of the student population at the beginning of each semester and its predictions adapt throughout the school year based on the information that is available.
Because its predictions are updated dynamically, the machine learning algorithm speeds up the stakeholders' targeting work by raising red flags on at-risk individuals throughout the semester. The stakeholders remain at the heart of the decisions: they can study the student's file themselves and decide whether to carry out a preventive intervention according to their best judgment.
No algorithm without quality data
Before even thinking about producing an artificial intelligence algorithm, the team working on it must ensure that the data is of high quality. The data scientist must carry out the following main checks on the training data:
- Ensure the consistency of the data;
- Investigate the relevance of extreme values;
- Study and rigorously manage class imbalance: is the model more used to encountering individuals who persevere?
- Rigorously manage missing data;
- Study the questions related to ethical issues (Besse, 2020):
  - What is the level of accuracy and reproducibility of a decision made by a learning-based artificial intelligence system?
  - What explanation can be given for a decision made by a learning-based artificial intelligence system?
  - What are the risks of discriminatory bias against a vulnerable group protected by law?
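Several of the checks listed above can be roughed out in a few lines before any modelling starts. The following is a minimal sketch using only the Python standard library; the records, field names and the 1.5 x IQR outlier rule are assumptions chosen for illustration, not a description of the ISA pipeline.

```python
from collections import Counter
from statistics import quantiles

# Hypothetical records; field names and values are assumptions for illustration.
records = [
    {"student_id": 1, "hs_average": 78.0, "credits": 15, "persisted": 1},
    {"student_id": 2, "hs_average": 85.5, "credits": 14, "persisted": 1},
    {"student_id": 3, "hs_average": None, "credits": 16, "persisted": 1},
    {"student_id": 4, "hs_average": 62.0, "credits": 12, "persisted": 0},
    {"student_id": 5, "hs_average": 91.0, "credits": 15, "persisted": 1},
    {"student_id": 6, "hs_average": 74.5, "credits": 90, "persisted": 1},
    {"student_id": 7, "hs_average": 69.0, "credits": 13, "persisted": 0},
    {"student_id": 8, "hs_average": 88.0, "credits": 15, "persisted": 1},
]


def duplicate_ids(rows):
    """Consistency check: identifiers that appear more than once."""
    counts = Counter(row["student_id"] for row in rows)
    return sorted(sid for sid, n in counts.items() if n > 1)


def missing_counts(rows):
    """Missing data: number of None values per field."""
    counts = Counter()
    for row in rows:
        for field, value in row.items():
            if value is None:
                counts[field] += 1
    return dict(counts)


def class_shares(rows, label):
    """Class imbalance: proportion of each outcome in the training data."""
    counts = Counter(row[label] for row in rows)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def iqr_outliers(rows, field, k=1.5):
    """Extreme values: ids whose value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    values = [row[field] for row in rows if row[field] is not None]
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [
        row["student_id"]
        for row in rows
        if row[field] is not None and not (low <= row[field] <= high)
    ]


if __name__ == "__main__":
    print("duplicate ids:", duplicate_ids(records))
    print("missing values:", missing_counts(records))
    print("class shares:", class_shares(records, "persisted"))
    print("credit outliers:", iqr_outliers(records, "credits"))
```

A flagged value is a prompt for investigation rather than automatic deletion: a credit load of 90 may be a data-entry error, but whether an extreme value is relevant is a question for someone who knows the source system.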
"Data is food for AI." - Andrew Ng, adjunct professor in the Department of Computer Science at Stanford University
In Part 2, we will explore the importance of data processing in machine learning.
- Besse, P. (2020). Détecter, évaluer les risques des impacts discriminatoires des algorithmes d'IA.
- Castro, D., & New, J. (2016). The promise of artificial intelligence. Center for Data Innovation, 115(10), 32-35.
- Press, G. (2021, June 16). Andrew Ng Launches A Campaign For Data-Centric AI. Forbes.
- Guillemette, J., Bhatnagar, S., Dugdale, M., Bhatnagar, S., & Lasry, N. (2018). Persévérance et réussite scolaire par le forage de données d'éducation.
- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning (Vol. 112, p. 18). New York: Springer.
- Ministère de l’Enseignement supérieur. (2021, June 30). Taux d’obtention d’une sanction des études collégiales.