Career Profile
Incisive and resourceful Senior AI Engineer with extensive experience in Natural Language Processing (NLP) and Large Language Models (LLMs). Adept at visual storytelling and communicating complex information to diverse audiences. My professional journey spans both academic and industry settings, contributing to projects in both private and public sectors.
Currently, I'm Envisso's AI Lead, where I use Generative AI to transform merchant risk into a growth strategy for payments companies. Previously, I was a Marie Skłodowska-Curie Fellow at University College London, where I earned my Ph.D., and a Machine Learning Researcher at Queen Mary University of London. I also hold an M.Sc. in Information Security and an M.Eng. in Software Engineering.
Experience
- Leading the development and implementation of a Retrieval-Augmented Generation (RAG) application that extracts, processes, and generates merchant information, enabling payments companies to enhance their growth strategies through data-driven insights.
- Spearheading the company's strategic vision in utilizing Generative AI (GenAI) technologies to transform merchant risk into a growth strategy for payments companies.
- Spearheaded the development of a Retrieval-Augmented Generation (RAG) application using Large Language Models (LLMs) from inception to production for a major bank. Rigorously investigated the efficacy of several GenAI techniques and tools (e.g., chains, agents, vector databases) to determine the optimal architecture. Designed and implemented an evaluation framework for the application. Tracked the latest advancements in GenAI, including relevant academic papers, and provided weekly updates on key developments to stakeholders.
- Designed and implemented a prototype solution to address common bottlenecks in Document Understanding projects. Led a small team of Data Scientists and Data Engineers. Leveraged Synthetic Data generation methods to augment limited real-world labeled data.
- Designed and implemented Machine Learning (ML) and Natural Language Processing (NLP) models for the clients of a global Robotic Process Automation (RPA) company (fintech, logistics, filtration).
- Designed and implemented a methodology for de-listing and replacing products for a leading UK-based grocery chain through basket analysis and customer behavior analytics.
- Conducted interviews for prospective data scientist hires.
- Guided and supported the professional development of junior data scientists.
- Obtained security clearance to work on two government client projects, overseeing the collection, cleansing, and analysis of large-scale social platform data.
- Developed the project's classification pipeline using BERT, XGBoost, Random Forest, and feature engineering.
- Visualized data and extracted actionable insights using Plotly's Dash for various stakeholders.
- Maintained data quality by building pipelines that ensured the accuracy, completeness, and consistency of deliverables.
- Utilized AWS services (S3, Athena) for data storage, management, and extraction.
- Collaborated with the grant proposal writing team, leveraging academic expertise to secure funding.
- Advised stakeholders on extremist speech detection, social platform data analysis, and far-right ideologies.
- Mentored and coached a Junior Data Scientist.
- Trained, evaluated, and optimized the performance of various Machine Learning and Natural Language Processing models (including XGBoost, Random Forests, BERT, and Logistic Regression) for the identification of online hate speech.
- Investigated the impact of engineered features on the accuracy and efficiency of hate speech classifiers, driving improvements in model performance.
- Assessed the influence of different dataset annotation techniques on the performance and reliability of online hate speech classifiers, contributing to the development of more robust detection methods.
- Awarded a prestigious Horizon 2020 Marie Skłodowska-Curie Fellowship as part of the Privacy & Usability Innovative Training Network.
- Employed Machine Learning and Natural Language Processing techniques (Word Embeddings, LDA, Sentiment Analysis, Computer Vision) to investigate the use of direct-to-consumer genetic testing by far-right groups for promoting racist ideologies.
Media Coverage: The Times, StatNews
- Conducted three large-scale studies on social media platform datasets (Twitter, Reddit, 4chan), analyzing over 1.3 million comments. Leveraged NoSQL (MongoDB) for data storage, management, and transformation.
- Performed a critical evaluation and synthesis of research within the genome privacy community, focusing on privacy-enhancing technologies for testing, storing, and sharing genomic data.
- Designed and ran a survey on the public perceptions of direct-to-consumer genetic testing.
Acted as a Teaching Assistant for the following courses:
- Database Structures I.
- Information System Analysis and Design.
- Comprehending Data Structures using C and/or C++.
- Learning the Java programming language.
- In 2011, I co-created OSArena.net, which was at the time the largest Greek community on Open Source Operating Systems and Software, featuring news, guides, and opinion articles on Linux, Android, Hardware, Hacking, Privacy, and Security. I was an active author until November 2013.
Skills & Proficiency
Expertise
Technical Skills
Libraries: LangChain, LlamaIndex, HuggingFace, LLMFlows, Sklearn, Pandas, Matplotlib, Plotly
Databases: Vector Databases, NoSQL, SQL
Platforms & Tools: Google Cloud Platform (GCP), Amazon Web Services (AWS), Microsoft Azure, Docker, Git, Bash
Soft Skills
Selected Publications
- Ella Guest, Bertie Vidgen, Mittos Alexandros, Nishanth Sastry, Gareth Tyson, Helen Margetts. An Expert Annotated Dataset for the Detection of Online Misogyny. Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics, 2021 (EACL).
- Mittos Alexandros, Savvas Zannettou, Jeremy Blackburn, and Emiliano De Cristofaro. 'And We Will Fight For Our Race!' A Measurement Study of Genetic Testing Conversations on Reddit and 4chan. In Fourteenth International AAAI Conference on Web and Social Media, 2020 (ICWSM).
Media Coverage: The Times, StatNews
Acceptance Rate: 21%
- Mittos Alexandros, Savvas Zannettou, Jeremy Blackburn, and Emiliano De Cristofaro. Analyzing Genetic Testing Discourse on the Web Through the Lens of Twitter, Reddit, and 4chan. ACM Transactions on the Web, 2020 (TWEB).
Open-Source Projects
Machine Learning, Abusive Speech Detection, Feature Engineering
Java Cryptography
Full Stack Development
Awards
Received a three-year Horizon 2020 Marie Skłodowska-Curie Fellowship to investigate the societal challenges stemming from the rise of personal genomic testing. Acceptance Rate: 6%
Received a scholarship for my M.Sc. studies at the University of the Aegean.