
Streamlining Mortgage Processing with Custom Generative AI Solutions

Client
Willow
Industry
Mortgage
Partners
Foundation model
End products
Applications
Tax form information extraction
Near 100% extraction accuracy

Minimal errors in data extraction

10x faster document extraction

Time for an AI Agent to get the correct answer

90% 

Reduction in response time


Challenge

  • Manual extraction of information from various tax forms is time-consuming and error-prone;
  • The need for high accuracy and reliability in data extraction and strict regulatory standards in finance make automation difficult;
  • Complex tax documents and the need for precise data extraction make the process hard to automate without AI;
  • Existing automation solutions don’t offer the precision and consistency needed to extract pages and document details accurately.

Solution

  • Development of a custom Document AI model to automate classification and extraction processes;
  • Implementation of a Document Embedding Model for precise document and page classification;
  • Use of a vector database to enhance the speed and accuracy of document classification;
  • Training with specific tax forms and schedules to ensure high extraction accuracy;
  • Automated extraction of key information from tax forms and schedules into structured JSON outputs.

Results

  • Creation of a complex geometric algorithm to accurately identify and extract relevant information;
  • Near 100% extraction accuracy with minimal errors in data extraction;
  • 10x faster document extraction speed enables rapid data processing;
  • 90% reduction in analyst time, significantly enhancing operational efficiency.

Summary

This client is a third-party loan partner in mortgage processing. They faced challenges with manual data extraction from various tax forms and schedules, which was time-consuming and error-prone. To resolve these issues, they partnered with us to develop a custom AI-driven solution.

Our Generative AI solution uses advanced document embedding models and vector databases to classify and extract data efficiently and accurately. This system makes document extraction 10x faster and reduces analyst involvement by 90%, while achieving near-perfect extraction accuracy.

This customized AI solution now enhances their document processing capabilities. It allows them to handle tasks more quickly and efficiently with less human involvement. By automating data classification and extraction with AI, they can process documents faster than with other available methods.


The Need for Advanced AI Solutions in Mortgage Processing

Mortgage processing involves handling vast amounts of customer data essential for making lending decisions. This data comes from various documents submitted by customers, such as bank statements, tax returns, schedules, and more.

U.S. tax returns detail income, expenses, and other financial information over a fiscal year. These documents often include structured elements like tables, forms, headers, footers, and labels, each requiring precise data extraction.

Traditionally, classifying these documents and extracting their data required extensive manual input. Manual extraction from tax forms was slow, tedious, and error-prone, which delayed later processing steps and decision-making.

The reliance on non-AI systems that lacked contextual understanding and advanced learning capabilities made accurate document processing nearly impossible. The mortgage industry's strict regulations demand high accuracy and reliability, which off-the-shelf automation systems often fail to meet.

Faced with these issues, the client needed a custom Generative AI solution that could handle this processing complexity. They partnered with us to create an application to automate their document-handling workflow.

Deploying Customized AI to Enhance Accuracy and Accelerate Document Processing 

We built an API that combines an Optical Character Recognition (OCR) system with a geometric extraction technique. The API accepts a PDF file as input, which may contain one or more tax returns. It then identifies each page of the tax return and performs geometric extraction of predefined lines. Because each page is identified individually, the application can process even incomplete tax forms.

To tackle the specific challenges faced by the client, we created a system to automate the classification and extraction process with the following workflow:

  • Customers submit tax forms

For training, we used completed tax forms (1040, 1065, 1120, 1120-S, 4562, 8825) and schedules (1040 Schedule C, 1040 Schedule D, 1040 Schedule E, 1040 Schedule F, 1065 Schedule K-1, 1120 Schedule K-1) provided by the client.

  • Our application classifies each page

For each page, we first extract the text using OCR and then calculate its embedding. This embedding is compared to pre-calculated embeddings to classify the page. To speed up this comparison, we store the calculated embeddings in a well-known closed-source vector database.
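The classification step can be illustrated with a minimal sketch. It assumes cosine similarity as the comparison metric and uses an in-memory dictionary as a stand-in for the vector database; the labels, the toy 3-dimensional vectors, the `min_similarity` threshold, and the function names are all hypothetical and not taken from the client system.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_page(page_embedding, reference_embeddings, min_similarity=0.8):
    """Return (label, score) for the closest reference embedding.

    The label is None when even the best match falls below the
    (assumed) similarity threshold, so unknown pages are not forced
    into a known form type.
    """
    best_label, best_score = None, -1.0
    for label, reference in reference_embeddings.items():
        score = cosine_similarity(page_embedding, reference)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < min_similarity:
        return None, best_score
    return best_label, best_score


# Hypothetical pre-calculated embeddings; real ones would come from an
# embedding model and live in a vector database.
REFERENCE_EMBEDDINGS = {
    "1040_page_1": [0.9, 0.1, 0.0],
    "1040_schedule_c_page_1": [0.1, 0.9, 0.1],
    "1065_page_1": [0.0, 0.2, 0.9],
}

label, score = classify_page([0.85, 0.2, 0.05], REFERENCE_EMBEDDINGS)
print(label, round(score, 3))
```

A vector database replaces the linear scan in `classify_page` with an indexed nearest-neighbor lookup, which is where the speed-up comes from as the number of reference pages grows.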

  • Our application then extracts the information the client needs

For example, line-item data such as name, address, and customer number.

During the training process, we found it difficult to reliably locate the correct information on each page, so we created a complex geometric algorithm to identify and extract the relevant information accurately.
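The actual geometric algorithm is not described here, so the following is only a simplified sketch of the general idea: once a page type is known, each field can be read from a predefined region of the page. It assumes the OCR step returns words with bounding boxes in normalized page coordinates (0 to 1); the `Box` and `Word` types, the region coordinates, and the field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in normalized page coordinates."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


@dataclass(frozen=True)
class Word:
    """A single OCR token with its bounding box."""
    text: str
    box: Box

    @property
    def center(self):
        return (
            (self.box.left + self.box.right) / 2,
            (self.box.top + self.box.bottom) / 2,
        )


def extract_fields(words, field_regions):
    """Collect, for each field, the words whose centers lie in its region."""
    result = {}
    for name, region in field_regions.items():
        hits = [w for w in words if region.contains(*w.center)]
        # Approximate reading order: by vertical position (rounded), then left to right.
        hits.sort(key=lambda w: (round(w.center[1], 2), w.center[0]))
        result[name] = " ".join(w.text for w in hits)
    return result


# Hypothetical regions for one page type.
FIELD_REGIONS = {
    "first name": Box(0.05, 0.10, 0.40, 0.14),
    "address": Box(0.05, 0.15, 0.90, 0.19),
}

# Hypothetical OCR output, deliberately out of reading order.
words = [
    Word("Main", Box(0.20, 0.155, 0.28, 0.185)),
    Word("John", Box(0.06, 0.105, 0.12, 0.135)),
    Word("12", Box(0.06, 0.155, 0.10, 0.185)),
    Word("Street", Box(0.30, 0.155, 0.40, 0.185)),
    Word("Form", Box(0.05, 0.02, 0.12, 0.05)),  # outside every region, ignored
]

fields = extract_fields(words, FIELD_REGIONS)
print(fields)
```

A production version would also have to cope with skewed scans, shifted layouts, and form revisions, which is likely where most of the real algorithm's complexity lies.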

  • Our application then returns relevant information in structured JSON output

Each extracted line item is returned in the JSON output as a key-value pair (e.g., the key is “first name” and the value is “John”).

This process required extensive customization to satisfy the customer’s specific needs. We opted for JSON, an open standard format that employs human-readable text to store and transmit data. This choice enabled us to effectively structure the extracted information, ensuring straightforward storage, analysis, and conversion.
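As an illustration of the output format, here is a minimal sketch that serializes extracted key-value pairs with Python's standard `json` module. The envelope keys (`form_type`, `page`, `fields`), the helper name, and the sample values are assumptions made for the example; the client's actual schema is not described here.

```python
import json


def to_json_output(form_type, page_number, fields):
    """Serialize extracted key-value pairs into a JSON string."""
    payload = {
        "form_type": form_type,  # hypothetical envelope key
        "page": page_number,     # hypothetical envelope key
        "fields": fields,        # extracted line items as key-value pairs
    }
    return json.dumps(payload, indent=2, ensure_ascii=False)


output = to_json_output(
    "1040",
    1,
    {"first name": "John", "address": "12 Main Street"},
)
print(output)
```

Because the result is plain JSON, downstream systems can parse it back with `json.loads` or load it into a database without custom parsing code.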

Enhanced Accuracy and Speed Transform Mortgage Processing Workflow

We developed a custom Generative AI solution that simplifies data interaction, enhances processing accuracy, and reduces response times. Our advanced geometric algorithm precisely identifies and extracts relevant information, achieving nearly 100% accuracy in data extraction.

The solution processes documents faster than off-the-shelf alternatives, making document extraction 10x faster and saving significant time. Analyst involvement has been reduced by 90%, which dramatically enhances operational efficiency and frees analysts for other tasks.

As the client completes final testing, we are confident the impact on mortgage processing industry practices will become more evident. Using custom AI solutions to meet our clients' needs demonstrates how automation plays a game-changing role in mortgage processing.
