CareerMap- Mapping Tech Roles With Personality & Skills

Published: 3 April 2023 | Version 1 | DOI: 10.17632/5z68cvxssn.1
Arya Shah


Research on career guidance systems is still nascent and limited in its approaches. Building an ML-based AI career guidance system requires a substantial amount of labeled data, and no single dataset could meet that requirement. We therefore took the ensemble approach of combining datasets. The literature review showed that, apart from proficiency in various domains and concepts of Computer Science, a candidate’s psychological traits also play an important role in predicting their career. With sufficient literature reviewed and the individual datasets obtained, we formed our problem definition as: “To combine datasets of psychological traits and proficiency in technological skills and apply Machine Learning models to predict the career of an individual in the field of Computer Science delivered through an interactive frontend.”

The dataset consists of 9180 rows and 28 features. The features include:
1. Database Fundamentals
2. Computer Architecture
3. Distributed Computing Systems
4. Cyber Security
5. Networking
6. Development
7. Programming Skills
8. Project Management
9. Computer Forensics Fundamental
10. Technical Communication
11. AI ML
12. Software Engineering
13. Business Analysis
14. Communication skills
15. Data Science
16. Troubleshooting skills
17. Graphics Designing
18. Openness
19. Conscientiousness
20. Extraversion
21. Agreeableness
22. Emotional Range
23. Conversation
24. Openness to Change
25. Hedonism
26. Self-enhancement
27. Self-transcendence
28. Role


Steps to reproduce

Before datasets 1 and 2 were combined, several steps were taken to prepare them.

Label encoding of technical skill proficiency: The proficiency levels were categorical, with 7 labels, and were encoded numerically (1-7) as follows:
1 - Not Interested
2 - Poor
3 - Beginner
4 - Average
5 - Intermediate
6 - Excellent
7 - Professional

Elimination of non-technical roles: The dataset also included non-technical roles. These were removed manually so that the system predicts only computer science roles.

Role mapping from dataset 2 to dataset 1: To combine the datasets, the predicted roles (the dependent variable) had to be consistent across both. It was decided that the roles established in dataset 1 would make up the final dataset, since the roles defined in dataset 2 had a greater number of occurrences and the roles defined in dataset 1 were more general. Each role in dataset 2 was therefore mapped to the most similar role in dataset 1. For example:
.NET Developer -> Software Developer

Moreover, to normalize the dataset, the skill ratings from dataset 1 were divided by the maximum value to bring them into the range 0-1, as in dataset 2.

Merging (includes techniques considered): Several methods of merging were studied and tried:
1. Merging each data row in dataset 1 with the average value of the corresponding role in dataset 2.
2. Merging each data row in dataset 1 with all data rows with that role in dataset 2.
3. Merging each row in dataset 1 with a random data row with the corresponding role in dataset 2.

After all three methods were implemented, option 3 was found to be the most viable, since the other two options led to overfitting.
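The preparation and merging steps above could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the tiny example frames, the random seed, and the choice of 7 (the highest label) as the "maximum value" for normalization are all assumptions. Only the label encoding and the .NET Developer mapping come from the description.

```python
from __future__ import annotations

import numpy as np
import pandas as pd

# Ordinal encoding of proficiency labels, as listed in the description.
PROFICIENCY_LEVELS = {
    "Not Interested": 1, "Poor": 2, "Beginner": 3, "Average": 4,
    "Intermediate": 5, "Excellent": 6, "Professional": 7,
}

# Only this single mapping is given in the description; the full table is not.
ROLE_MAP = {".NET Developer": "Software Developer"}


def encode_and_scale(skills: pd.DataFrame, skill_cols: list[str]) -> pd.DataFrame:
    """Encode proficiency labels as 1-7, then divide by the maximum label.

    Assumption: "maximum value" means the highest possible label (7).
    """
    out = skills.copy()
    max_level = max(PROFICIENCY_LEVELS.values())
    for col in skill_cols:
        out[col] = out[col].map(PROFICIENCY_LEVELS) / max_level
    return out


def merge_random_by_role(skills: pd.DataFrame, traits: pd.DataFrame,
                         role_col: str = "Role", seed: int = 0) -> pd.DataFrame:
    """Option 3: pair each skills row with one random traits row of the same role."""
    rng = np.random.default_rng(seed)
    # Maps each role to the positional indices of its rows in `traits`.
    positions = traits.groupby(role_col).indices
    # Keep only skills rows whose role also appears in `traits`.
    skills = skills[skills[role_col].isin(list(positions))].reset_index(drop=True)
    picks = [rng.choice(positions[role]) for role in skills[role_col]]
    sampled = traits.drop(columns=role_col).iloc[picks].reset_index(drop=True)
    return pd.concat([skills.drop(columns=role_col), sampled, skills[role_col]], axis=1)


# Made-up toy rows, for illustration only (not taken from the dataset).
skills = pd.DataFrame({
    "Programming Skills": ["Professional", "Beginner"],
    "Role": ["Software Developer", "Software Developer"],
})
traits = pd.DataFrame({
    "Openness": [0.2, 0.8],
    "Role": [".NET Developer", "Software Developer"],
})

encoded = encode_and_scale(skills, ["Programming Skills"])
traits["Role"] = traits["Role"].replace(ROLE_MAP)  # align role names with dataset 1
merged = merge_random_by_role(encoded, traits, seed=0)
print(merged)
```

Sampling one random matching row keeps the merged dataset the same length as dataset 1, whereas merging with all matching rows (option 2) would multiply the number of rows per role.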


Narsee Monjee Institute of Management Studies University, Mukesh Patel School of Technology Management and Engineering


Machine Learning, Recommendation System, Assessment of Career Development, Assessment of Career Decision-Making, Personality Trait