My name is Konstantinos and this is my website. I was born in Athens, in the early spring of 1999. In 2022, I received a bachelor's degree in computer science from NKUA's Department of Informatics and Telecommunications.
I am currently an M.Sc. student and a member of the AI team at the Department of Informatics - NKUA. Artificial Intelligence and Machine Learning are two of the research areas I currently explore.
Courses: Data Structures & Algorithms in C,
Deep Learning for NLP
AI Team fellow & Research Associate.
R&D for HORIZON-Europe projects (STELAR).
Supervised by Dr. George Papadakis and Prof. Manolis Koubarakis.
Python, NLP, Entity Resolution, Hugging Face, Gensim
Developed a data analytics feature for Vodafone Greece. R&D on user linkage in telecom networks [EU-funded project].
Java, Spring Boot, Docker, Apache Karaf, SQL/NoSQL DBs
An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows.
2022-now
Description: Link Discovery constitutes a crucial task for increasing the connections between data sources in the Linked Open Data Cloud. Part of this task is Entity Resolution (ER), which aims to identify relations between different entity descriptions that pertain to the same real-world object. Due to its quadratic time complexity, ER is typically carried out in two steps: first, blocking restricts the computational cost to similar descriptions, and then matching estimates the actual similarity between them. A plethora of techniques has been proposed for each step. To facilitate their use by researchers and practitioners, we present pyJedAI, an open-source library that leverages Python's data science ecosystem to build powerful end-to-end ER workflows. We demonstrate how this can be accomplished by both expert and novice users in an intuitive, yet efficient and effective way.
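The two-step idea of blocking followed by matching can be illustrated with a minimal, standard-library sketch. This is not pyJedAI's API: the function names (`token_blocking`, `jaccard`, `resolve`), the toy records, and the 0.5 threshold are all assumptions chosen for illustration, and token blocking with Jaccard similarity is just one simple choice among the many techniques proposed for each step.

```python
from collections import defaultdict
from itertools import combinations


def tokens(text):
    """Lowercased whitespace tokens of a description."""
    return set(text.lower().split())


def token_blocking(records):
    """Blocking: only records sharing at least one token become candidate pairs."""
    blocks = defaultdict(set)
    for rid, text in records.items():
        for tok in tokens(text):
            blocks[tok].add(rid)
    candidates = set()
    for ids in blocks.values():
        candidates.update(combinations(sorted(ids), 2))
    return candidates


def jaccard(a, b):
    """Matching: Jaccard similarity between the token sets of two descriptions."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def resolve(records, threshold=0.5):
    """Compare only the candidate pairs and keep those above the threshold."""
    return sorted(
        (a, b)
        for a, b in token_blocking(records)
        if jaccard(records[a], records[b]) >= threshold
    )


# Hypothetical toy data, not taken from any real data set.
records = {
    1: "Apple iPhone 13 128GB",
    2: "iphone 13 apple 128gb",
    3: "Samsung Galaxy S21",
    4: "Galaxy S21 by Samsung",
}
matches = resolve(records)
```

With these four records, blocking leaves two candidate pairs instead of the six pairs an exhaustive comparison would examine, which is the saving that blocking is meant to provide.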
A Winner-Take-All Hashing-Based Unsupervised Model for Entity Resolution Problems [B.Sc. Thesis]
2021-2022
Description: In this project, we propose an end-to-end unsupervised learning model for Entity Resolution problems on string data sets. An innovative prototype selection algorithm is used to create a rich space that is both Euclidean and a dissimilarity space. Part of this work is a presentation of the theoretical benefits of such a space. We then present an embedding scheme based on rank-ordered vectors that circumvents the curse of dimensionality. The core of our framework is a locality-sensitive hashing algorithm named Winner-Take-All, which reduces our model's run time while maintaining high scores in the similarity checking phase. For that phase, we adopt the Kendall tau rank correlation coefficient, a metric for comparing rankings. Finally, we use two state-of-the-art frameworks to evaluate our methodology consistently on a well-known Entity Resolution data set.
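As a rough illustration of the two building blocks named above, the following standard-library sketch shows a generic Winner-Take-All hash and a plain Kendall tau computation. It is not the thesis implementation: the function names, the parameters (`num_hashes`, `window`, `seed`), and the example vectors are assumptions, and the Kendall tau variant shown is the simple tau-a, which ignores tie corrections.

```python
import random
from itertools import combinations


def wta_hash(vector, num_hashes=8, window=3, seed=0):
    """Generic Winner-Take-All hash (assumes window <= len(vector)).

    For each random permutation, look at the first `window` permuted
    entries and record the position of the largest one. The codes depend
    only on the relative order of the values, not on their magnitudes.
    """
    rng = random.Random(seed)  # fixed seed so every vector sees the same permutations
    n = len(vector)
    codes = []
    for _ in range(num_hashes):
        perm = rng.sample(range(n), n)
        first = perm[:window]
        codes.append(max(range(window), key=lambda i: vector[first[i]]))
    return codes


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / pairs


# Hypothetical vectors: b is a rescaled copy of a, so both rank their entries identically.
a = [0.1, 0.7, 0.3, 0.9, 0.5]
b = [10 * v + 1 for v in a]

same_codes = wta_hash(a) == wta_hash(b)   # True: identical ordering gives identical codes
tau = kendall_tau(a, b)                   # 1.0: the two rankings agree on every pair
```

Because the hash codes depend only on rank order, vectors with similar rankings tend to collide, which is what lets hashing narrow down the pairs that need a full Kendall tau comparison.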