American Journal of Education and Information Technology
Volume 4, Issue 2, December 2020, Pages: 56-65
Received: Jun. 23, 2020;
Accepted: Jul. 15, 2020;
Published: Aug. 4, 2020
Song Guo, Department of Computing, The Hong Kong Polytechnic University, Hong Kong SAR, China; The Hong Kong Polytechnic University Shenzhen Research Institute, Shenzhen, China
Deze Zeng, School of Computer Science, China University of Geosciences, Wuhan, China
Shifu Dong, School of Computer Science, China University of Geosciences, Wuhan, China
Pedagogical data analysis is recognized as one of the most important capabilities in pursuing Education 4.0. Recent rapid advances in information and communication technology (ICT) have benefited and reshaped pedagogical data analysis by providing techniques such as big data analysis and machine learning. Meanwhile, student privacy has become a growing concern, which makes educational institutions reluctant to share their students' data. The result is a set of isolated data islands that hinder big educational data analysis. To tackle this challenge, we propose FEEDAN, a federated learning based education data analysis framework through which a number of institutions can form education data analysis federations. No institution needs to exchange its students' data directly with the others; each keeps its data on its own premises to protect student privacy. We apply the framework to two real education datasets using two different federated learning paradigms. The experimental results show that it not only protects student privacy but also breaks down the borders between data islands by achieving higher analysis quality. Its performance closely approaches that of centralized analysis, which must collect the data in a common place at the risk of privacy exposure.
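To make the idea of a federation concrete, the following is a minimal sketch of one common federated learning scheme, federated averaging, and is not taken from FEEDAN itself. It assumes a toy linear model that predicts a grade from a single feature. The function names (`local_update`, `federated_average`, `train_federation`), the learning rate, and the sample data are all hypothetical. Each institution trains on its own records and shares only model parameters, never student data.

```python
from typing import List, Tuple

# One student record: (feature, label), e.g. (study hours, grade). Hypothetical data layout.
Sample = Tuple[float, float]


def local_update(weights: List[float], data: List[Sample],
                 lr: float = 0.05, epochs: int = 5) -> List[float]:
    """Refine the shared model [w, b] on one institution's private data only."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradient of the mean squared error of y = w * x + b on local data.
        grad_w = sum((w * x + b - y) * x for x, y in data) / n
        grad_b = sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(updates: List[List[float]], sizes: List[int]) -> List[float]:
    """Combine local models, weighting each by its number of training samples."""
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[i] * s for u, s in zip(updates, sizes)) / total
            for i in range(dim)]


def train_federation(institutions: List[List[Sample]],
                     rounds: int = 200) -> List[float]:
    """Run several rounds; only model parameters leave each institution."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in institutions]
        sizes = [len(data) for data in institutions]
        global_weights = federated_average(updates, sizes)
    return global_weights


if __name__ == "__main__":
    # Two hypothetical institutions whose records follow y = 2x + 1.
    school_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
    school_b = [(3.0, 7.0), (4.0, 9.0)]
    print(train_federation([school_a, school_b]))
```

In this toy setting the federated model should land close to the underlying relation (slope near 2, intercept near 1), even though neither institution sees the other's records. Real deployments would add secure aggregation or similar protections, since shared parameters can still leak information.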
Pedagogical Data Analysis Via Federated Learning Toward Education 4.0, American Journal of Education and Information Technology. Vol. 4, No. 2, 2020, pp. 56-65.