About

szia.ai (szia means hello/goodbye in Hungarian) was created by Marton Szel (Director of Data Science at Lynx Analytics), a data scientist since 2008 with experience across telecommunications, government, finance, retail, and life sciences. Over the years, he has built models to detect tax fraud, optimize fiber network expansions, cluster SIM cards into households, prove mathematical theorems using AI, and predict cancer treatment outcomes from digital cell data. He leads data science teams mostly across Europe and Asia - including in Singapore, Hong Kong, Shanghai, Düsseldorf, and Budapest - and has also spent time working in San Francisco.

His recent focus is on applied, agentic AI: developing systems that combine large language models with graph-based methods to make retrieval and reasoning more efficient and grounded. He enjoys working on ideas that are slightly ahead of their time.


> szia.ai_ is a space to explore, explain, and apply AI - helping others make sense of it, and building something useful, open, and evolving together.

Projects Behind This Mindset

 06 

[2023/24, Singapore] Generative AI for Mathematical Theorem Proving

Participated in developing reasoning systems that use large language models and hybrid intelligent methods, including graph-based approaches, to assist in formalized mathematical theorem proving. Multiple strategies were explored to help LLMs navigate formal proofs: step-by-step proof generation, heuristic guidance, integration with search and reinforcement learning loops, and the creation of synthetic training data to enhance performance.
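
For readers new to formal proving, the snippet below shows what "step-by-step proof generation" refers to in a proof assistant such as Lean 4. It is a toy illustration only: the theorem and its name (swap_conj) are made up for this page and are not taken from the project. Each tactic line is one proof step of the kind a model would have to propose.

```lean
-- Hypothetical toy theorem: if p and q both hold, then q and p both hold.
theorem swap_conj (p q : Prop) (h : p ∧ q) : q ∧ p := by
  constructor      -- step 1: split the goal into two subgoals, q and p
  · exact h.2      -- step 2: q is the second component of h
  · exact h.1      -- step 3: p is the first component of h
```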

The work examined how generative AI can support symbolic reasoning - not only by identifying valid proof steps, but also by uncovering novel problem-solving approaches. Challenges included selecting theorems that open pathways for further work (e.g., Erdős-style problems) and addressing the demands of formalizing problems themselves.

05

[2023/24, Singapore] Graph RAG for Multi Document Retrieval

Image by Ming Han Low

Developed a graph-based Retrieval-Augmented Generation (RAG) system where document chunks form nodes and trainable edges propagate similarity across the graph. This structure retrieves all relevant documents with fewer queries, outperforming standard embedding-only RAG systems.
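
The idea of spreading similarity across a chunk graph can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the deployed system: the function names, the mixing factor alpha, the number of steps, and the fixed edge weights are all hypothetical (in the project the edges were trainable, which is not modeled here).

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def propagate_scores(query_vec, chunk_vecs, edges, alpha=0.5, steps=2):
    """Score chunks against the query, then spread scores along weighted edges.

    edges maps (i, j) -> weight and is treated as undirected.
    alpha and steps are illustrative assumptions, not tuned values.
    """
    base = [cosine(query_vec, v) for v in chunk_vecs]
    neighbors = {i: [] for i in range(len(chunk_vecs))}
    for (i, j), w in edges.items():
        neighbors[i].append((j, w))
        neighbors[j].append((i, w))

    scores = base[:]
    for _ in range(steps):
        updated = []
        for i in range(len(scores)):
            total_w = sum(w for _, w in neighbors[i])
            if total_w:
                # Weighted average of the neighbors' current scores.
                neigh = sum(w * scores[j] for j, w in neighbors[i]) / total_w
            else:
                neigh = scores[i]
            # Blend the chunk's own query similarity with its neighborhood.
            updated.append((1 - alpha) * base[i] + alpha * neigh)
        scores = updated
    return scores


# Chunk 1 does not match the query directly but is linked to chunk 0, which does.
chunks = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
print(propagate_scores([1.0, 0.0], chunks, {(0, 1): 1.0}))
```

In this toy setup a chunk with no direct similarity to the query still receives a score if it is connected to a matching chunk, which is the intuition behind retrieving related documents with fewer queries.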

The first version was presented at the NVIDIA GTC 2024 RAG Special Event and deployed in copilot chatbots that learned from agent feedback to refine retrievals. This reduced average response times from six minutes to one while achieving >80% acceptance of answers without modification - a strong benchmark in 2023. The latest iteration integrates ontology-based connections and artificial nodes generated by LLMs, significantly improving multi-document reasoning on benchmarks like MultiHopRAG (~2,500 questions, 609 documents).

04

[2021/22, China & Global] Assortment Optimization for a Global Fashion Retailer

Image by Martin Bammer

Developed an assortment optimization framework for a global fashion retailer to maximize store-level revenue within each market. Historical data and external sources were combined to model customer behavior and location context, capturing factors like weather patterns, luxury-brand clustering in premium malls, and the distinct shopping dynamics of city-edge outlets. These inputs shaped "store fingerprints" that guided assortment decisions and enabled accurate sales predictions even for products not previously stocked in a given store.

The optimization balanced multiple business constraints, such as limiting the number of assortments and ensuring minimum sales thresholds, while accounting for cross-product effects and addressing challenges in measuring performance during COVID recovery. The approach was first deployed in Chinese stores and later extended to US and EU markets following its success.

03

[2018, Germany] Fibre Network Extension Design with AlphaGo-Inspired Algorithms

Image by Elena Popova

Participated in developing an optimization framework for fibre network extension in Germany, targeting maximum expected revenue from new household connections while minimizing deployment costs. The problem was formalized as a Prize-Collecting Steiner Tree on a road network graph, with nodes representing intersections and edges representing streets with associated costs.
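
For readers unfamiliar with the Prize-Collecting Steiner Tree, one common way to write its objective is given below. The symbols are assumptions chosen for illustration, not notation from the project: G = (V, E) is the road graph, c_e is the deployment cost of street e, p_v is the expected revenue attached to intersection v, and T is the connected subtree chosen for the extension.

```latex
\max_{\substack{T = (V_T, E_T) \subseteq G \\ T \text{ is a tree}}}
\quad \sum_{v \in V_T} p_v \;-\; \sum_{e \in E_T} c_e
```

The first sum rewards every node the tree reaches and the second charges for every edge it uses, so a branch is only worth building when the prizes it collects exceed the cost of reaching them.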

Early experiments applied Monte Carlo Tree Search (MCTS) techniques inspired by AlphaGo to explore the solution space. This was later replaced by a more effective dual ascent branch-and-bound method, introduced into LynxKite (Lynx Analytics' graph analytics tool) as a new graph optimization function. Business constraints, such as clustering extensions to enable cost-effective equipment and team deployment, were incorporated into the design.

02

[2016, Hong Kong] Clustering SIM Cards into Households Using Graphs

Image by Chi Hung Wong

Participated in developing graph-based algorithms to cluster SIM cards into single customer and household views for a telecom provider. Positive and negative edges were constructed to model relationships between MSISDNs, devices, and services - capturing links such as shared addresses, common night locations, frequent co-location patterns, and calling behaviors, while excluding impossible connections (e.g., simultaneous activity in different locations or self-calls).

The approach combined behavioral, geographic, and registration data to infer relationships ranging from individual customers to multi-generational households. An optimization process on the resulting graph refined these clusters, balancing weak and strong signals to produce the most probable groupings.
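
One simple way to picture clustering with positive and negative edges is a greedy merge: join SIMs along the strongest positive links first, and refuse any merge that would place a negatively linked pair in the same group. The sketch below is a stand-in for illustration only, not the optimization used in the project; the function name, the edge weights, and the hard-veto rule for negative edges are assumptions.

```python
def cluster_signed(nodes, positive, negative):
    """Greedy clustering on a signed graph.

    positive maps (a, b) -> strength of the evidence that a and b belong together.
    negative is a list of (a, b) pairs that must never share a cluster
    (treated here as a hard veto, which is a simplifying assumption).
    """
    parent = {n: n for n in nodes}
    members = {n: {n} for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    blocked = {frozenset(pair) for pair in negative}

    # Process the strongest positive evidence first.
    for (a, b), _weight in sorted(positive.items(), key=lambda kv: -kv[1]):
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        conflict = any(
            frozenset((x, y)) in blocked
            for x in members[ra]
            for y in members[rb]
        )
        if conflict:
            continue
        parent[rb] = ra
        members[ra] |= members.pop(rb)

    return sorted(sorted(group) for group in members.values())


# Hypothetical SIM identifiers: A-B and C-D are strongly linked,
# B-C is weakly linked, and A-C were seen active in different places at once.
print(cluster_signed(
    ["A", "B", "C", "D"],
    {("A", "B"): 0.9, ("B", "C"): 0.4, ("C", "D"): 0.8},
    [("A", "C")],
))
```

A production approach would more likely weigh weak and strong signals against each other, as described above, rather than treat every negative edge as absolute.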

01

[2013/14, Hungary] Graph Models for VAT Fraud Detection

Image by Alina Grubnyak

Participated in developing graph-based models to uncover carousel VAT fraud and estimate the illicit gains of hidden beneficiaries. Carousel fraud schemes involve chains of fictitious companies issuing and reclaiming VAT without real economic activity, with profits often concealed downstream in complex networks. Using real-world invoice data, company networks were constructed, and graph algorithms were applied to trace suspicious flows and highlight likely endpoints.
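
As a generic illustration of how an invoice network can expose circular trading, the sketch below builds a seller-to-buyer graph and lists short directed cycles. It is not the method used in the project: the function name, the invoice tuple layout, and the max_len cutoff are assumptions, and real detection would also need amounts, timing, and VAT reclaim data.

```python
def find_cycles(invoices, max_len=6):
    """Return simple directed cycles in a seller -> buyer invoice graph.

    invoices is an iterable of (seller, buyer, amount) tuples.
    Each cycle is reported once, starting from its smallest node.
    """
    graph = {}
    for seller, buyer, _amount in invoices:
        graph.setdefault(seller, set()).add(buyer)

    cycles = set()

    def dfs(start, node, path):
        for nxt in graph.get(node, ()):
            if nxt == start:
                # Closed a loop back to the starting company.
                cycles.add(tuple(path))
            elif nxt not in path and nxt > start and len(path) < max_len:
                # Only visit nodes larger than start so each cycle appears once.
                dfs(start, nxt, path + [nxt])

    for start in graph:
        dfs(start, start, [start])
    return sorted(cycles)


# Hypothetical companies: A sells to B, B to C, C back to A, and C also to D.
invoices = [("A", "B", 100), ("B", "C", 100), ("C", "A", 100), ("C", "D", 40)]
print(find_cycles(invoices))
```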

In parallel, early-warning models were created on a firm-level knowledge graph to flag high-risk companies at the time of their foundation, leveraging ownership links, shared addresses, and other firm-level features.