[Dependabot] Update(deps): Bump transformers from 4.57.3 to 5.1.0 by dependabot[bot] · Pull Request #2665 · pytorch/benchmark

dependabot · 2026-02-05T21:23:06Z

Bumps transformers from 4.57.3 to 5.1.0.

Release notes

v5.1.0: EXAONE-MoE, PP-DocLayoutV3, Youtu-LLM, GLM-OCR

New Model additions

EXAONE-MoE

K-EXAONE is a large-scale multilingual language model developed by LG AI Research. Built using a Mixture-of-Experts architecture, K-EXAONE features 236 billion total parameters, with 23 billion active during inference. Performance evaluations across various benchmarks demonstrate that K-EXAONE excels in reasoning, agentic capabilities, general knowledge, multilingual understanding, and long-context processing.

Add EXAONE-MoE implementations (#43080) by @nuxlear

PP-DocLayoutV3

PP-DocLayoutV3 is a unified and high-efficiency model designed for comprehensive layout analysis. It addresses the challenges of complex physical distortions—such as skewing, curving, and adverse lighting—by integrating instance segmentation and reading order prediction into a single, end-to-end framework.

[Model] Add PP-DocLayoutV3 Model Support (#43098) by @zhang-prog

Youtu-LLM

Youtu-LLM is a new, small, yet powerful LLM that contains only 1.96B parameters, supports a 128k long context, and has native agentic capabilities. On general evaluations, Youtu-LLM significantly outperforms SOTA LLMs of similar size in commonsense, STEM, coding, and long-context capabilities; in agent-related testing, Youtu-LLM surpasses larger-sized leaders and can complete multiple end-to-end agent tasks.

Add Youtu-LLM model (#43166) by @LuJunru

GlmOcr

GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture. It introduces Multi-Token Prediction (MTP) loss and stable full-task reinforcement learning to improve training efficiency, recognition accuracy, and generalization. The model integrates the CogViT visual encoder pre-trained on large-scale image–text data, a lightweight cross-modal connector with efficient token downsampling, and a GLM-0.5B language decoder. Combined with a two-stage pipeline of layout analysis and parallel recognition based on PP-DocLayout-V3, GLM-OCR delivers robust and high-quality OCR performance across diverse document layouts.

[GLM-OCR] GLM-OCR Support (#43391) by @zRzRzRzRzRzRzR

Breaking changes

🚨 T5Gemma2 model structure (#43633) - Makes sure that the attn implementation is set on all sub-configs. The config.encoder.text_config was not getting its attn implementation set because it was not being passed to PreTrainedModel.__init__. The model structure can't be changed without breaking compatibility, so a call to self.adjust_attn_implemetation was manually re-added in the modeling code.

🚨 Generation cache preparation (#43679) - Refactors cache initialization in generation to ensure sliding window configurations are now properly respected. Previously, some models (like Afmoe) created caches without passing the model config, causing sliding window limits to be ignored. This is breaking because models with sliding window attention will now enforce their window size limits during generation, which may change generation behavior or require adjusting sequence lengths in existing code.
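The behavior change described above can be illustrated with a toy cache. This is a minimal sketch, not the transformers cache API: the class `ToySlidingWindowCache` and its `sliding_window` argument are hypothetical names used only to show what "ignoring" versus "respecting" a window limit means for the number of retained positions.

```python
from collections import deque


class ToySlidingWindowCache:
    """Hypothetical stand-in for a KV cache; NOT the transformers API."""

    def __init__(self, sliding_window=None):
        # deque(maxlen=None) is unbounded; deque(maxlen=n) silently drops
        # the oldest entry once n entries are stored.
        self.entries = deque(maxlen=sliding_window)

    def update(self, token_state):
        # Store the state for one new position and return what is retained.
        self.entries.append(token_state)
        return list(self.entries)


# Cache built without the window setting: the limit is ignored.
ignored = ToySlidingWindowCache()
# Cache built with the window setting: only the last 4 positions survive.
respected = ToySlidingWindowCache(sliding_window=4)

for position in range(6):
    ignored.update(position)
    respected.update(position)

print(len(ignored.entries), len(respected.entries))
```

After six steps the unbounded cache holds all six positions while the windowed one holds only the last four, which is why generation output can differ once the limit is enforced.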

🚨 Delete duplicate code in backbone utils (#43323) - This PR cleans up backbone utilities. Specifically, there are currently 5 different config attributes that decide which backbone to load, most of which seem redundant and can be merged into one. After this PR, there is only one, config.backbone_config, as a single source of truth. The models will load the backbone from_config and load pretrained weights only if the checkpoint has any weights saved. The overall idea is the same as in other composite models. A few config arguments are removed as a result.

🚨 Refactor DETR to updated standards (#41549) - standardizes the DETR model to be closer to other vision models in the library.

🚨 Fix floating-point precision in JanusImageProcessor resize (#43187) - replaces an int() with round(); expect slight numerical differences.
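Why swapping int() for round() changes results can be seen in plain Python. This is an illustrative sketch only, not the JanusImageProcessor code: the helper names and the example value are assumptions chosen to expose a well-known floating-point rounding error.

```python
def to_pixels_truncate(value: float) -> int:
    # int() drops the fractional part, so 7.999... becomes 7.
    return int(value)


def to_pixels_round(value: float) -> int:
    # round() picks the nearest integer, so 7.999... becomes 8.
    return round(value)


# Mathematically this is 8, but binary floating point stores
# 7.999999999999999 (hypothetical stand-in for a scaled edge length).
scaled_edge = (0.1 + 0.7) * 10

truncated = to_pixels_truncate(scaled_edge)
rounded = to_pixels_round(scaled_edge)
print(scaled_edge, truncated, rounded)
```

A one-pixel difference in a resized dimension is enough to shift downstream pixel values slightly, which matches the "numerical differences" the release note warns about.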

🚨 Remove deprecated AnnotionFormat (#42983) - removes a misnamed class in favour of AnnotationFormat.

... (truncated)

Commits

3fa4da7 urllib 3
c781039 and sam
be15dcc fix sam hq
0895df7 v5.1.0
48d10c6 Fix EP post merge (#43730)
6c4f766 Fix T5 v1.1 detection (#43681)
452c179 Docs: fix Training step by removing tokenizer from trainer initialization (#4...
1744f8f Fix scheduler initialization order (#43711)
8dce310 Fix accelerate integration import (#43732)
d75266f 🚨 T5Gemma2 model structure (#43633)
Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR
@dependabot recreate will recreate this PR, overwriting any edits that have been made to it
@dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [transformers](https://github.com/huggingface/transformers) from 4.57.3 to 5.1.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.57.3...v5.1.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-version: 5.1.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
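The commit metadata labels this bump as version-update:semver-major. A minimal sketch of how such a label could be derived from the two version strings follows; the functions `parse_version` and `classify_bump` are hypothetical helpers written for illustration, not Dependabot's implementation, and they assume plain MAJOR.MINOR.PATCH versions without pre-release suffixes.

```python
def parse_version(text: str) -> tuple:
    # "4.57.3" -> (4, 57, 3); assumes three purely numeric components.
    return tuple(int(part) for part in text.split("."))


def classify_bump(old: str, new: str) -> str:
    # Compare components left to right; the first one that differs
    # determines the kind of update.
    old_v, new_v = parse_version(old), parse_version(new)
    if new_v[0] != old_v[0]:
        return "semver-major"
    if new_v[1] != old_v[1]:
        return "semver-minor"
    return "semver-patch"


bump = classify_bump("4.57.3", "5.1.0")
print(bump)
```

Because the major component changes from 4 to 5, the bump is classified as major, which is the case where the breaking changes listed in the release notes are most likely to affect existing code.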

dependabot bot added the dependencies (Pull requests that update a dependency file) and python (Pull requests that update python code) labels Feb 5, 2026

dependabot bot temporarily deployed to docker-s3-upload February 5, 2026 21:23 Inactive

dependabot bot had a problem deploying to docker-s3-upload February 5, 2026 21:23 Failure

dependabot bot temporarily deployed to docker-s3-upload February 5, 2026 21:23 Inactive

dependabot bot mentioned this pull request Feb 5, 2026

[Dependabot] Update(deps): Bump transformers from 4.57.3 to 5.0.0 #2663

Closed

meta-cla bot added the cla signed label Feb 5, 2026

dependabot[bot] wants to merge 1 commit into main from dependabot/pip/main/transformers-5.1.0
