user2code2vec: embeddings for profiling students based on distributional representations of source code
Azcona, David; Arora, Piyush; Hsiao, I-Han; Smeaton, Alan F.
In this work, we propose a new methodology to profile individual students of computer science based on their programming design using a technique called embeddings. We investigate different approaches to analyze user source code submissions in the Python language. We compare the performance of different source code vectorization techniques to predict the correctness of a code submission. In addition, we propose a new mechanism to represent students based on their code submissions for a given set of laboratory tasks on a particular course. This way, we can make deeper recommendations for programming solutions and pathways to support student learning and progression in computer programming modules effectively at a Higher Education Institution. Deep Learning approaches tend to perform better as more data is provided. However, in Learning Analytics, the number of students in a course is an unavoidable limit. Thus, we cannot simply generate more data as is done in other domains such as FinTech or Social Network Analysis. Our findings indicate there is a need to develop better mechanisms to extract and learn effective data features from students so as to analyze the students' progression and performance effectively.
Keyword(s): Artificial intelligence; Machine learning; user2code2vec; code2vec; Code Embeddings; Distributed Representations; Representation Learning for Source Code; Computer Science Education
Publication Date:
Type: Other
Peer-Reviewed: Unknown
Language(s): English
Institution: Dublin City University
Citation(s): Azcona, David ORCID: 0000-0003-3693-7906, Arora, Piyush ORCID: 0000-0002-4261-2860, Hsiao, I-Han ORCID: 0000-0002-1888-3951 and Smeaton, Alan F. ORCID: 0000-0003-1028-8389 (2019) user2code2vec: embeddings for profiling students based on distributional representations of source code. In: The 9th International Learning Analytics & Knowledge Conference, LAK 2019, 4-8 Mar, 2019, Tempe, AZ, USA. ISBN 978-1-4503-6256-6/19/03
Publisher(s): ACM
File Format(s): application/pdf
First Indexed: 2019-01-22 06:14:28 Last Updated: 2020-02-05 06:06:05