Ocr Captcha

Ocr Captcha by xiaolv, an image-to-text model with OCR capabilities. Understand and compare OCR features, benchmarks, and capabilities.

Comparison

Feature	Ocr Captcha	Interfaze
Input Modalities	image	image, text, audio, video, document
Native OCR	Yes	Yes
Long Document Processing	No	Yes
Language Support	unknown	162+
Native Speech-to-Text	No	Yes
Native Object Detection	No	Yes
Guardrail Controls	No	Yes
Context Input Size	unknown	1M
Tool Calling	No	Tool calling supported, plus built-in browser, code execution, and web search

OCR Capabilities

Feature	Ocr Captcha	Interfaze
Text Bounding Boxes	No	Yes
Confidence Scores	No	Yes
Dense Image Processing	No	Yes
Low Quality Images	No	Yes
Handwritten Text	No	Yes
Charts, Tables & Equations	No	Yes

Scaling

Feature	Ocr Captcha	Interfaze
Scaling	Self-hosted/Provider-hosted with quantization	Unlimited


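Introduction

**Captcha recognition model (ocr-captcha)** is a model dedicated to recognizing common captchas. Two trained models are available:

1. small: trained on 700 MB of data (about 84,000 captcha images) for 27 epochs, reaching a final accuracy close to 100%. This is the recommended model to download.

2. big: trained on 11 GB of data (about 1.35 million captcha images) for 1 epoch, reaching a final accuracy of about 93.95% (limited resources prevented longer training).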
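The model card does not say how accuracy was measured. A common choice for captcha recognition is whole-string exact match, where a prediction counts as correct only if every character matches. The sketch below assumes that metric and case-sensitive comparison; the function name and the sample data are hypothetical, for illustration only.

```python
from typing import Sequence


def exact_match_accuracy(predictions: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of captchas whose predicted string equals the label exactly.

    Assumption: the comparison is case-sensitive, since the data includes
    both upper- and lower-case letters.
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == t for p, t in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical predictions and ground-truth labels, for illustration only.
preds = ["8K3d", "12345", "abCDef", "77x1"]
truth = ["8K3d", "12345", "abCDeF", "77x1"]
print(exact_match_accuracy(preds, truth))  # 0.75 (the third prediction differs in case)
```

Under this metric a single wrong character makes the whole captcha count as a miss, which is usually what matters in practice because a partially correct captcha is still rejected.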

Data distribution

1. Types: (1) digits only; (2) digits and letters; (3) letters only (upper and lower case)

2. Length: 4, 5, or 6 characters
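The distribution above implies a simple check for whether a string falls inside the range the models were trained on. The following is a minimal sketch, assuming the character set is ASCII digits and ASCII letters; the function and constant names are hypothetical and not part of the released code.

```python
import string

# Assumption: the training labels use only ASCII digits and ASCII letters.
ALLOWED_LENGTHS = {4, 5, 6}
ALLOWED_CHARS = set(string.digits + string.ascii_letters)


def classify_captcha_label(text: str) -> str:
    """Return 'digits', 'letters', or 'mixed' for an in-distribution label.

    Raises ValueError if the length or character set is outside the
    distribution described in the model card.
    """
    if len(text) not in ALLOWED_LENGTHS:
        raise ValueError(f"length {len(text)} is not 4, 5, or 6")
    if not set(text) <= ALLOWED_CHARS:
        raise ValueError("label contains characters other than digits and letters")
    if text.isdigit():
        return "digits"
    if text.isalpha():
        return "letters"
    return "mixed"


print(classify_captcha_label("1234"))    # digits
print(classify_captcha_label("aBcDe"))   # letters
print(classify_captcha_label("a1B2c3"))  # mixed
```

A check like this can be used to flag predictions that fall outside the trained range, for example a 3-character output, which may indicate an input the models were not trained on.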

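Fine-tuning

1. Base model: based on the 读光-文字识别-行识别模型-中英-通用领域 model (Duguang OCR line recognition, Chinese-English, general domain) released by DAMO Academy

2. For fine-tuning details, see the base model page referenced above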

Model demo link

modelscope: Captcha recognition model (ocr-captcha)

Individual model links (modelscope)

1. Captcha recognition model (small) - small

2. Captcha recognition model (big) - big

Quickstart

The code for the web UI version is in myself_train_model.py:

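from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
import gradio as gr  # kept from the original; not used in this excerpt
import os


class xiaolv_ocr_model():
    """Wraps the small and big captcha recognition pipelines."""

    def __init__(self):
        # Local directories containing the downloaded model files.
        model_small = r"./output_small"
        model_big = r"./output_big"
        self.ocr_recognition_small = pipeline(Tasks.ocr_recognition, model=model_small)
        self.ocr_recognition1_big = pipeline(Tasks.ocr_recognition, model=model_big)

    def run(self, pict_path, moshi="small", context=None):
        # Accept either an uploaded file object (with a .name path) or a plain path string.
        is_upload = hasattr(pict_path, "name")
        if is_upload:
            pict_path = pict_path.name
        # The history is rebuilt on every call: [image path, recognized text].
        context = [pict_path]

        # moshi ("mode") selects which model to use.
        if moshi == "small":
            result = self.ocr_recognition_small(pict_path)
        else:
            result = self.ocr_recognition1_big(pict_path)

        context += [str(result['text'][0])]
        # Pair the flat list into (input, output) tuples.
        responses = [(u, b) for u, b in zip(context[::2], context[1::2])]
        print(f"Recognition result: {result}")
        # Delete only uploaded temporary files, never a path supplied by the caller.
        if is_upload:
            os.remove(pict_path)
        return responses, context


if __name__ == "__main__":
    pict_path = r"C:\Users\admin\Desktop\图片识别测试\企业微信截图_16895911221007.png"
    ocr_model = xiaolv_ocr_model()
    # ocr_model.run(pict_path)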
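The run method returns its results as (input, output) pairs built with zip(context[::2], context[1::2]). The sketch below isolates that pairing step so its behavior is easier to see. The helper name and the sample values are hypothetical. The pair format suggests a chat-style display, though the web UI code itself is not shown in the source.

```python
from typing import List, Tuple


def pair_history(context: List[str]) -> List[Tuple[str, str]]:
    """Pair a flat [input, output, input, output, ...] list into tuples.

    Even-indexed items are treated as inputs (image paths) and odd-indexed
    items as outputs (recognized text). A trailing unpaired input is dropped,
    because zip stops at the shorter sequence.
    """
    return list(zip(context[::2], context[1::2]))


# Hypothetical file names and recognized strings, for illustration only.
context = ["captcha_1.png", "8K3d", "captcha_2.png", "12345"]
print(pair_history(context))
# [('captcha_1.png', '8K3d'), ('captcha_2.png', '12345')]
```

Because run resets context to a single image path on each call, the list it returns always contains exactly one pair. The slicing pattern only matters if the history is allowed to accumulate across calls.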

Contact Us

To leave a message for our R&D and product teams, please contact us by email (2240560729@qq.com).