Hackathon for IT developers
"Digital solutions for optical recognition"

About the project

The company’s development team took part in a hackathon organized by INTER RAO Energetika and won a special nomination, "For the most innovative solution".
The organizers allocated three months to implement a document recognition system. The Mindset team proposed its own solution based on Transformer models.
The Transformer is a neural network architecture based on the attention mechanism, proposed in the 2017 paper "Attention Is All You Need". To be processed by a transformer, text is converted into a sequence of so-called tokens, which are in turn converted into numeric embedding vectors. An advantage of transformers is that they have no recurrent modules, so their computations can be parallelized and they take less time to train than architectures such as RNNs, LSTMs and the like. Various versions of transformers have become widespread as the basis of large language models (LLMs) such as GPT, Claude, LLaMA and others.
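To make the token, embedding and attention steps more concrete, here is a minimal pure-Python sketch of scaled dot-product attention. It is an illustration only, not the team's code: the toy vocabulary and the two-dimensional embedding values are invented for the example, and the learned query, key and value projections of a real transformer are left out.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    d_k = len(keys[0])
    outputs = []
    # Each query is handled independently of the others, which is
    # what allows the computation to be parallelized.
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        mixed = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(mixed)
    return outputs

# Hypothetical toy vocabulary and embedding table (assumed values).
vocab = {"scan": 0, "the": 1, "invoice": 2}
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

tokens = "scan the invoice".split()             # text -> tokens
vectors = [embeddings[vocab[t]] for t in tokens]  # tokens -> embeddings
context = attention(vectors, vectors, vectors)    # self-attention

for token, row in zip(tokens, context):
    print(token, [round(x, 3) for x in row])
```

Each output row is a weighted mix of all token embeddings, so every token "sees" the whole sequence in a single step instead of passing through a recurrent loop.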
Task
Creating a document recognition system.
Business effect
Thanks to the special nomination, several meetings with potential clients were held after the hackathon.
Solution
Document recognition using Transformer models (Donut and UReader), together with a Streamlit-based interface.
Technologies
Streamlit
Streamlit serves as a resource for demonstrating the results visually once the neural network has been trained. With it, the customer can test the neural network before the service is launched, so the product can be refined if the results are not satisfactory.
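A demo page of this kind could look roughly like the sketch below. It is an assumption about the general shape of such an interface, not the team's actual application: `recognize_document` is a hypothetical placeholder that stands in for the trained model, and the page title and labels are invented.

```python
try:
    import streamlit as st
except ImportError:  # lets the helper below be imported without Streamlit
    st = None

def recognize_document(file_bytes: bytes) -> dict:
    """Hypothetical placeholder for the trained recognition model."""
    return {"size_bytes": len(file_bytes), "fields": {}}

def main() -> None:
    st.title("Document recognition demo")
    uploaded = st.file_uploader(
        "Upload a scanned document", type=["png", "jpg", "jpeg"]
    )
    if uploaded is not None:
        data = uploaded.getvalue()
        st.image(data, caption="Uploaded document")
        # Show the structured result so the customer can judge its quality.
        st.json(recognize_document(data))

if st is not None:
    main()
```

Saved as, say, `app.py`, the page would be started with `streamlit run app.py`.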
Donut Model
Donut is a model that extracts text from a given image. This is useful in various scenarios, for example when scanning receipts.
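As a rough illustration of the receipt scenario, the sketch below runs Donut through the Hugging Face `transformers` library. It assumes the public `naver-clova-ix/donut-base-finetuned-cord-v2` checkpoint and its `<s_cord-v2>` task prompt; the team's own checkpoint, prompt and training data are not described in the source, and the helper names here are invented for the example.

```python
import re

# Assumed public receipt checkpoint, not necessarily the one the team used.
MODEL_NAME = "naver-clova-ix/donut-base-finetuned-cord-v2"
TASK_PROMPT = "<s_cord-v2>"

def clean_sequence(sequence: str, eos_token: str, pad_token: str) -> str:
    """Strip end/padding tokens and the leading task-prompt tag."""
    sequence = sequence.replace(eos_token, "").replace(pad_token, "")
    return re.sub(r"<.*?>", "", sequence, count=1).strip()

def extract_fields(image_path: str) -> dict:
    """Run Donut on one image and return the parsed fields."""
    # Imported lazily so the module loads without the heavy dependencies.
    from PIL import Image
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    processor = DonutProcessor.from_pretrained(MODEL_NAME)
    model = VisionEncoderDecoderModel.from_pretrained(MODEL_NAME)
    tokenizer = processor.tokenizer

    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(image, return_tensors="pt").pixel_values
    prompt_ids = tokenizer(
        TASK_PROMPT, add_special_tokens=False, return_tensors="pt"
    ).input_ids

    output_ids = model.generate(
        pixel_values,
        decoder_input_ids=prompt_ids,
        max_length=model.decoder.config.max_position_embeddings,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )
    sequence = processor.batch_decode(output_ids)[0]
    cleaned = clean_sequence(sequence, tokenizer.eos_token, tokenizer.pad_token)
    # Donut emits tagged text; token2json converts it to a nested dict.
    return processor.token2json(cleaned)
```

Calling `extract_fields("receipt.png")` (a hypothetical file name) downloads the checkpoint on first use and returns a dictionary of the fields the model found.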
UReader Model
UReader is a research project on universal language understanding based on a Multimodal Large Language Model (MLLM) that does not use optical character recognition (OCR). It can understand text that appears in visual form, for example in documents, web pages and photographs.
Stages of development
1. Designing an MVP
2. Refining the solution
3. Interface implementation
Project Features

  • Works with documents laid out as tables
  • The only team to apply a multimodal model
  • The models achieved good results despite a limited dataset
The project team
Frank Sh.
Manager
Evgeniy M.
Analyst
Victor Sh.
Analyst
Developer
Nikolay D.
Analyst
Mikhail V.
Areas of use
Any organization that needs document recognition as part of its business processes.
Get in touch
Email us for cooperation or if you have any questions.