New Deepseek model drastically reduces resource usage by converting text and documents into images — 'vision-text compression' uses up to 20 times fewer tokens

Source: Tom's Hardware

TL;DR (AI generated)
DeepSeek's new model, DeepSeek-OCR, converts text into images using 'vision-text compression,' cutting token usage by up to 20 times. The model pairs a DeepEncoder with a DeepSeek3B-MoE-A570M decoder; together they recover textual content from images using far fewer tokens than the plain text would require. The approach is well suited to tabulated data, graphs, and other visual information in fields such as finance, science, and medicine. At compression ratios below 10x the model maintains about 97% accuracy, but at a 20x reduction accuracy drops to roughly 60%, indicating diminishing returns alongside the potential cost savings. The model is available on Hugging Face and GitHub.
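To make the trade-off concrete, here is a minimal sketch of the arithmetic behind the reported figures. This is illustrative only and is not the DeepSeek-OCR API: the `vision_tokens` helper is a hypothetical name, and the accuracy figures are simply the ones quoted in the article.

```python
def vision_tokens(text_tokens: int, compression_ratio: float) -> int:
    """Estimate how many vision tokens are needed to represent a page
    that would otherwise take `text_tokens` text tokens, at a given
    text-to-vision compression ratio."""
    return max(1, round(text_tokens / compression_ratio))

# A hypothetical 1,000-token page:
page = 1_000
print(vision_tokens(page, 10))  # 100 tokens; the article reports ~97% accuracy here
print(vision_tokens(page, 20))  # 50 tokens; accuracy reportedly falls to ~60%
```

In other words, doubling the compression from 10x to 20x only halves the remaining token cost (100 to 50 tokens on this page) while accuracy drops by more than a third, which is the diminishing-returns pattern the article describes.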
