Hey, I got bored today so I tried to do something I like! I did some data science :p

This is the time of year when we have to pick our specialty for next year, so I tried to work out who my classmates will be!

The selection will be based on our choice, our marks, and the available places in each specialty, so here is what I know so far about the selection.

There are two specialties: IREV and ILOG.

IREV will have a maximum of 15 students. It's clear that most students prefer ILOG and will put it as their first choice, so ILOG will have more students, but there must be students for IREV (they can't have an empty class :v !)

IREV will have at least 12 students, and I know for a fact that ILOG will have only one class, so they have to balance the two specialties: we can't end up with 12 students in IREV and 30 in ILOG. For now, I will go with 15 students for IREV, and the rest, around 20-25 students, will be in ILOG.

I got the students' marks from the ISAMM website. They are public, but I hid the CIN while making the script for privacy. I converted the PDF to a CSV file using an online tool, did some cleaning, and then imported the files using pandas in Python.
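
To give an idea of the import step, here is a minimal sketch, not the actual script. The column names (`CIN`, `average`), the made-up rows, the `load_marks` helper and the salted-hash trick are all assumptions for illustration; the real files may look different and the CIN may have been hidden another way.

```python
import hashlib
import io

import pandas as pd

# Made-up rows standing in for the converted CSV (fake CIN values).
RAW_CSV = """CIN,average
00000001,12.5
00000002,14.0
"""

def load_marks(source, salt="change-me"):
    """Hypothetical helper: load the marks and swap the CIN for an opaque id."""
    # Read the CIN as text so leading zeros are not lost.
    df = pd.read_csv(source, dtype={"CIN": str})
    # Replace each CIN with a short salted hash so rows stay distinguishable.
    df["student_id"] = df["CIN"].map(
        lambda cin: hashlib.sha256((salt + cin).encode()).hexdigest()[:10]
    )
    return df.drop(columns="CIN")

marks = load_marks(io.StringIO(RAW_CSV))
print(marks)
```

With a real file you would pass its path instead of the `io.StringIO` object.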

At this point I started doing some manipulation. I filled in the average mark for the students who came from other universities, because they didn't study with us in the first year, so their marks were NaN. I also merged the first-year and second-year marks...
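
Roughly, that manipulation could look like the sketch below. The column names (`student_id`, `avg_y1`, `avg_y2`, `overall`), the toy numbers and the `merge_marks` helper are made up for the example, and I'm assuming "the average" means the class average and that the two years are combined with a plain mean.

```python
import pandas as pd

# Toy data; student 4 joined in the second year and has no first-year row.
year1 = pd.DataFrame({"student_id": [1, 2, 3], "avg_y1": [12.0, 14.0, 10.0]})
year2 = pd.DataFrame({"student_id": [1, 2, 3, 4], "avg_y2": [13.0, 15.0, 11.0, 12.5]})

def merge_marks(year1, year2):
    """Hypothetical helper: merge both years and fill missing first-year marks."""
    # A left merge on the second-year list keeps transfer students (avg_y1 is NaN).
    merged = year2.merge(year1, on="student_id", how="left")
    # Assumption: missing first-year marks get the first-year class average.
    merged["avg_y1"] = merged["avg_y1"].fillna(merged["avg_y1"].mean())
    # Assumption: the overall mark is the plain mean of both years.
    merged["overall"] = merged[["avg_y1", "avg_y2"]].mean(axis=1)
    return merged

marks = merge_marks(year1, year2)
print(marks)
```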

Finally, I applied the rules I described above and got the result ^^.
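
The selection rule (choice, marks, available places) can be sketched as a simple greedy pass. This is only an illustration under my own assumptions, not the script itself: the `assign_specialties` helper, the column names and the tiny capacities are hypothetical, and the total capacity is assumed to cover every student.

```python
import pandas as pd

def assign_specialties(students, capacity):
    """Hypothetical helper: best marks pick first; a full class sends you to the other one."""
    remaining = dict(capacity)
    assigned = {}
    # Students with the highest overall mark are served first.
    for row in students.sort_values("overall", ascending=False).itertuples():
        first = row.first_choice
        other = "IREV" if first == "ILOG" else "ILOG"
        # Keep the first choice if a seat is left, otherwise fall back.
        pick = first if remaining[first] > 0 else other
        remaining[pick] -= 1
        assigned[row.student_id] = pick
    return pd.Series(assigned, name="specialty")

# Toy example with 5 students and small capacities.
students = pd.DataFrame({
    "student_id": [1, 2, 3, 4, 5],
    "overall": [15.0, 14.0, 13.0, 12.0, 11.0],
    "first_choice": ["ILOG", "ILOG", "IREV", "ILOG", "ILOG"],
})
result = assign_specialties(students, {"ILOG": 3, "IREV": 2})
print(result)
```

In this toy run, student 5 wanted ILOG but ends up in IREV because the three ILOG seats are already taken by students with better marks.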

I can't use machine learning here because I don't have many features to rely on, and everyone makes their own choice, so even data from past years wouldn't make a difference. What I'd really need for machine learning is to know more about each person: do they like video games, are they into programming, do they prefer design...

If I think about it, I could extract this from the marks in each subject: I'm pretty sure that someone who likes programming will have good marks in programming, and someone who likes games will have good marks in the game development subjects. This is interesting, but it would take a while.
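
A first rough version of that idea could be as simple as averaging the marks per theme and taking the strongest one. The subject names, the `THEMES` grouping and the `interest_profile` helper below are invented for the example; the real subjects would have to be grouped by hand.

```python
import pandas as pd

# Hypothetical grouping of subject columns by interest.
THEMES = {
    "programming": ["algorithms", "java"],
    "games": ["game_dev", "graphics_3d"],
}

def interest_profile(marks):
    """Hypothetical helper: mean mark per theme plus the strongest theme per student."""
    profile = pd.DataFrame(
        {theme: marks[cols].mean(axis=1) for theme, cols in THEMES.items()}
    )
    # The theme with the highest mean mark is taken as the likely interest.
    profile["strongest"] = profile[list(THEMES)].idxmax(axis=1)
    return profile

# Toy marks for two students.
subject_marks = pd.DataFrame({
    "algorithms": [16.0, 9.0],
    "java": [14.0, 11.0],
    "game_dev": [10.0, 15.0],
    "graphics_3d": [12.0, 17.0],
})
profile = interest_profile(subject_marks)
print(profile)
```

The per-theme means could then serve as features if I ever try machine learning on this.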

Anyway, I will stop here. One hour of my life, or let's say two including writing this post, was well spent :D

I uploaded the script (without the data) to GitHub here: https://github.com/HosniMansour/scripts/tree/master/ILOGOrIREV

