AI & Machine Learning Consultant

Jordy Van Landeghem

World-class AI researcher with a PhD in Computer Science, specializing in Document AI, Agentic Systems, and Generative AI. I help organizations build ambitious AI roadmaps and turn cutting-edge research into production-ready solutions.

10+ Years in AI/ML
3 Advanced Degrees
15+ Publications

About Me

Bridging world-class research with real-world impact

I am a computer scientist with three advanced degrees from KU Leuven, including a PhD in Engineering Sciences focused on AI-driven document understanding. My research has been published at top-tier venues including ICML, ICCV, WACV, and ICDAR.

Currently, I serve as a Senior Software Engineer in Machine Learning at Instabase, where I lead initiatives in agentic AI systems, LLM/VLM development, and benchmarking frameworks. Previously, I spent seven years at Contract.fit as Lead AI Research Engineer, building production-grade Document AI systems and mentoring the next generation of AI engineers.

My unique combination of deep academic research and hands-on product engineering experience allows me to translate cutting-edge AI advances into practical, scalable solutions that deliver measurable business value.

Education

  • Ph.D. in Engineering Sciences, Computer Science; KU Leuven, 2024. Thesis: Intelligent Automation for AI-driven Document Understanding
  • M.Sc. in Artificial Intelligence; KU Leuven, 2017 (Magna Cum Laude). Specialization: Speech and Language Technology
  • M.A. in Linguistics; KU Leuven, 2015 (Magna Cum Laude). Specialization: Corpus and Usage-based Methodologies

Experience

10+ years building production AI systems

2024 – present
Full-time

Senior Software Engineer, Machine Learning

Instabase — Remote

  • Designed and shipped DocumentReActAgent, an agentic self-correction loop over repositories of 1M+ documents; drove straight-through processing to 100% by resolving the hard edge cases where document automation typically hits its ceiling.
  • Designed evaluation & benchmarking framework (multi-provider, LLM-as-a-Judge, structured logprobs); drove Gemini model selection and contributed to enterprise deal closure.
  • Authored the PRD for the Unified Extractor v2 architecture redesign, which separates schema, state, prompts, engines, and orchestration.
  • Technical lead, Agent Mode team; mentored engineers across Project Accuracy, Agent Mode, and AXIS teams.
  • Published two ICML 2025 technical blog posts on agentic document AI; ranked #1 in Cursor AI productivity company-wide.

2017 – 2024
Full-time · 7 years

Lead AI Research Engineer

Contract.fit — Brussels, Belgium

  • Led end-to-end Document AI engineering (NLP + CV) for insurance, finance, and legal domains across a production SaaS platform.
  • Designed and shipped production-grade ML pipelines for document classification and information extraction, and maintained them through 7 years of platform growth.
  • Secured 4 Flemish innovation grants (VLAIO) as lead researcher; co-wrote all applications.
  • Supervised 11 Master's AI/CS thesis internships across KU Leuven and VUB.
  • Translated academic advances (DUDE benchmark, uncertainty estimation) into scalable product features.

2017
Research Intern

Language Modelling Research

Nuance Communications — Aachen, Germany

  • Researched regularisation techniques for RNN language models; implemented biLSTM character-based word embeddings.

2016 – 2017
Research Intern

NLP for Dialogue Systems

Oracle — Barcelona, Spain

  • Investigated Seq2Seq neural networks for chatbot and virtual assistant technology.

Areas of Expertise

Deep technical knowledge across the AI/ML stack

Document AI

End-to-end document understanding systems combining NLP and computer vision. Expert in multimodal architectures, layout analysis, and information extraction from complex, visually-rich documents.

  • OCR & Layout Analysis
  • Document VQA
  • Information Extraction
  • Multi-page Processing

Agentic AI Systems

Design and implementation of autonomous AI agents with reasoning, planning, and self-correction capabilities. Pioneering work on document-centric agents that reached 100% straight-through processing through reflection loops.

  • ReAct Agents
  • Tool Use & Planning
  • Self-Correction
  • Multi-Agent Systems

LLM/VLM Engineering

Custom large language model development, fine-tuning, and deployment. Deep expertise in vision-language models, RAG architectures, and production-scale inference optimization.

  • Model Fine-tuning
  • RAG Systems
  • Prompt Engineering
  • Knowledge Distillation

Evaluation & Benchmarking

Rigorous AI evaluation methodologies including LLM-as-a-Judge, calibration assessment, and uncertainty quantification. Creator of the DUDE benchmark adopted by the research community.

  • LLM-as-a-Judge
  • Calibration Metrics
  • Uncertainty Quantification
  • Benchmark Design

ML Operations

Production-grade ML pipeline design and implementation. Experience deploying AI systems at scale in finance, insurance, legal, and enterprise document processing domains. Cloud infrastructure expertise across AWS, GCP, and Azure.

  • Pipeline Architecture
  • Model Deployment
  • AWS, GCP & Azure
  • Monitoring & Observability
  • Scalable Inference

Research & Innovation

Published researcher at top AI venues with expertise in translating academic advances into product features. Strong track record of identifying and executing high-impact research directions.

  • Research Strategy
  • Technical Writing
  • Innovation Roadmaps
  • Patent Development

Consulting Services

Partnering with you to unlock AI's full potential

GenAI Prototype Development

Rapid prototyping of generative AI solutions to validate concepts, demonstrate feasibility, and accelerate your path to production.

  • LLM/VLM application development
  • RAG system implementation
  • Custom model fine-tuning
  • Proof-of-concept delivery

Agentic Automation

Design and implementation of autonomous AI agents that reason, plan, and execute complex workflows with minimal human intervention.

  • Agent architecture design
  • Tool integration & orchestration
  • Self-correction mechanisms
  • Production deployment guidance

Technical Due Diligence

Expert assessment of AI/ML systems, teams, and strategies for investment decisions, acquisitions, or internal audits.

  • Code & architecture review
  • Team capability assessment
  • Technology stack evaluation
  • Competitive positioning analysis

Projects

Research and engineering that shipped

Benchmark · Dataset

DUDE

Document Understanding Dataset and Evaluation — a large-scale benchmark for real-world document QA spanning extractive, multi-hop, closed-world, and visually-grounded question types. Led design, dataset construction, and the ICDAR 2023 competition. Adopted by the Document AI research community (154 citations).

Agentic AI · Production

DocumentReActAgent

Agentic self-correction loop for large-scale document repositories. As the go-to engineer for last-mile automation challenges, I drove straight-through processing to 100% by tackling the edge cases where conventional approaches hit their ceiling. Designed and shipped at Instabase; integrated into production workflows handling millions of documents.

Open Source · Research

DRAG

Document Retrieval with Agentic Grounding — trains vision-language models on successful agentic search trajectories using SFT and DPO with LoRA fine-tuning. Bridges the gap between human and agent document reasoning strategies.

Selected Publications

Peer-reviewed research at top-tier venues

ICML 2026 — Spotlight/Oral (top 2%)

Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections

Borchmann, L., Van Landeghem, Jordy, et al.

MADQA: 2,250 questions across 800 PDFs — largest human vs. agent head-to-head study in document understanding

ICML 2025

A Novel Characterization of the Population Area Under the Risk Coverage Curve (AURC) and Rates of Finite Sample Estimators

Zhou, Han, Van Landeghem, Jordy, et al.

ICDAR 2025

Where Layout Meets Language

Biescas, A., Biswas, S., Lladós, J., Van Landeghem, Jordy

ICDAR 2024

DistilDoc: Knowledge Distillation for Visually-Rich Document Applications

Van Landeghem, Jordy, et al.

WACV 2024

Beyond Document Page Classification: Design, Datasets, and Challenges

Van Landeghem, Jordy, Biswas, S., Blaschko, M., Moens, M.F.

ICCV 2023

Document Understanding Dataset and Evaluation (DUDE)

Van Landeghem, Jordy, Borchmann, L., Tito, R., et al.

Established benchmark adopted by the Document AI research community

ICDAR 2023

ICDAR 2023 Competition on Document UnderstanDing of Everything (DUDE)

Van Landeghem, Jordy, Stanisławek, T., et al.

IEEE Access 2022

Benchmarking Scalable Predictive Uncertainty in Text Classification

Van Landeghem, Jordy, Blaschko, M., Anckaert, B., Moens, M.F.

ICML 2020 Workshop

Predictive Uncertainty for Probabilistic Novelty Detection in Text Classification

Van Landeghem, Jordy, Blaschko, M., Anckaert, B., Moens, M.F.

Workshop on Uncertainty and Robustness in Deep Learning

Information 2019

Transfer Learning for Named Entity Recognition in Financial & Biomedical Documents

Francis, S., Van Landeghem, Jordy, Moens, M.F.

Talks & Presentations

Invited seminars, conference presentations, and tutorials

DAS 2026
Upcoming · Full-day Tutorial

Parse, Reflect, Retrieve, Compile: An Agent Stack for Enterprise Document AI

Document Analysis Systems 2026 — Industry Tutorial

WACV 2024
Oral Presentation

Beyond Document Page Classification: Design, Datasets, and Challenges

Winter Conference on Applications of Computer Vision, January 2024

ICDAR 2023
Oral Presentation

ICDAR 2023 Competition on Document UnderstanDing of Everything (DUDE)

International Conference on Document Analysis and Recognition, August 2023

Adobe Research
Invited Seminar

DUDE — What's Next?

Document Intelligence Seminar, October 2023

CVC / UAB
Invited Seminar

Calibration Primer for Document AI

Computer Vision Center, Barcelona, June 2023

Grants & Funding

Successfully secured research and innovation funding

2021
Company Innovation Grant

Leveraging Document Structure for Improved Document Understanding

HBC.2021.0787 — Contract.fit

2019
Company Innovation Grant

Development of a Performant and User-friendly API Self-service Portal and World-class Classification Modules

HBC.2019.2376 — Contract.fit

2017
Company Innovation Grant

Self-Learning Platform for Simplifying Data-intensive Client Interactions

HBC.2017.0264 — Contract.fit

Let's Build Something Ambitious

Whether you're exploring AI strategy, need help building a GenAI prototype, or want to discuss how agentic automation can transform your workflows, I'm here to help.

Belgium (EU) — Available Globally
6 Languages: Dutch, English, Spanish, French, German, Portuguese