Project Overview
This collaborative project focused on building a comprehensive
European database for the artificial intelligence job market.
Our team was responsible for collecting and organizing the data
through web scraping: using BeautifulSoup and Selenium in Python,
we extracted job offers from Glassdoor and curated a substantial
dataset for further analysis.
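As a rough illustration of that step, the sketch below drives a browser with Selenium and parses the rendered page with BeautifulSoup. The search URL and the CSS selector are hypothetical placeholders, not the selectors the project actually used; Glassdoor's real markup changes over time.

```python
# A minimal sketch of the scraping step, not the project's exact code.
# The URL and the "li.job-card" selector are hypothetical placeholders.
from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome()  # assumes Chrome and a matching driver are installed
driver.get("https://www.glassdoor.com/Job/jobs.htm")  # illustrative URL

# Selenium renders the JavaScript-heavy page; BeautifulSoup parses the HTML.
soup = BeautifulSoup(driver.page_source, "html.parser")

jobs = []
for card in soup.select("li.job-card"):  # hypothetical selector
    link = card.select_one("a")
    if link:
        jobs.append({"title": link.get_text(strip=True), "url": link.get("href")})

driver.quit()
print(f"Scraped {len(jobs)} job cards")
```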
With thousands of job offers collected, we then cleaned and
preprocessed the data to ensure its reliability and quality.
This involved removing duplicates, handling missing values, and
standardizing fields to produce a consistent, usable database.
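A minimal pandas sketch of that cleaning pipeline might look as follows; the file name and the column names ("title", "location", "salary") are assumptions for illustration, not the project's actual schema.

```python
# Sketch of the cleaning pipeline; file and column names are assumed.
import pandas as pd

df = pd.read_csv("glassdoor_jobs.csv")  # assumed export of the scraped data

# Remove exact duplicates (the same offer scraped more than once).
df = df.drop_duplicates()

# Handle missing values: drop rows lacking essential fields,
# fill optional ones with an explicit marker.
df = df.dropna(subset=["title", "location"])
df["salary"] = df["salary"].fillna("not disclosed")

# Standardize text fields so variants like "Paris " and "paris" collapse.
for col in ["title", "location"]:
    df[col] = df[col].str.strip().str.lower()

df.to_csv("glassdoor_jobs_clean.csv", index=False)
```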
To draw insights from this extensive dataset, we applied
regression and classification models in R, including linear
regression, logistic regression, and other statistical methods.
These analyses identified patterns, correlations, and trends
that deepened our understanding of the AI job market.
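The project fitted these models in R (for example with lm() and glm()); the Python sketch below, using statsmodels, shows analogous linear and logistic regressions. The columns salary_eur, years_experience, and is_remote are hypothetical stand-ins for the dataset's actual variables.

```python
# Analogous Python sketch of the R analyses; column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("glassdoor_jobs_clean.csv")

# Linear regression: how does experience relate to the posted salary?
linear = smf.ols("salary_eur ~ years_experience", data=df).fit()
print(linear.summary())

# Logistic regression: probability that an offer is remote (0/1 column).
logistic = smf.logit("is_remote ~ years_experience", data=df).fit()
print(logistic.summary())
```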
In parallel, we collaborated with a dedicated natural language
processing (NLP) team to develop a program that extracted
relevant information from free-text job descriptions, letting
us focus on the aspects most relevant to our project objectives.
This collaboration improved the quality and relevance of the
extracted data and added further value to the database.
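The NLP team's actual program is not reproduced here; the keyword-matching sketch below only illustrates the general idea of pulling structured fields out of a job description. The skill list is a hypothetical example of the kind of information targeted.

```python
# Illustrative sketch of extracting fields from free text; the skill
# list is hypothetical, not the NLP team's actual approach.
import re

SKILLS = ["python", "r", "sql", "tensorflow", "pytorch"]

def extract_skills(description: str) -> list[str]:
    """Return the known skills mentioned in a job description."""
    text = description.lower()
    return [s for s in SKILLS if re.search(rf"\b{re.escape(s)}\b", text)]

print(extract_skills("We seek an ML engineer fluent in Python and PyTorch."))
# -> ['python', 'pytorch']
```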
Although our team's involvement ended with the academic year, the project continued under other students. Our work on web scraping with BeautifulSoup and Selenium, together with data cleaning and analysis in R, laid a solid foundation for the subsequent stages of the project. The initiative demonstrated our ability to collaborate effectively, apply modern tooling, and address complex challenges in the field of artificial intelligence.