Skip to content

[Dependabot] Update(deps): Bump transformers from 4.57.3 to 5.1.0#2665

Open
dependabot[bot] wants to merge 1 commit intomainfrom
dependabot/pip/main/transformers-5.1.0
Open

[Dependabot] Update(deps): Bump transformers from 4.57.3 to 5.1.0#2665
dependabot[bot] wants to merge 1 commit intomainfrom
dependabot/pip/main/transformers-5.1.0

Conversation

@dependabot
Copy link
Contributor

@dependabot dependabot bot commented on behalf of github Feb 5, 2026

Bumps transformers from 4.57.3 to 5.1.0.

Release notes

Sourced from transformers's releases.

v5.1.0: EXAONE-MoE, PP-DocLayoutV3, Youtu-LLM, GLM-OCR

New Model additions

EXAONE-MoE

K-EXAONE is a large-scale multilingual language model developed by LG AI Research. Built using a Mixture-of-Experts architecture, K-EXAONE features 236 billion total parameters, with 23 billion active during inference. Performance evaluations across various benchmarks demonstrate that K-EXAONE excels in reasoning, agentic capabilities, general knowledge, multilingual understanding, and long-context processing.

PP-DocLayoutV3

PP-DocLayoutV3 is a unified and high-efficiency model designed for comprehensive layout analysis. It addresses the challenges of complex physical distortions—such as skewing, curving, and adverse lighting—by integrating instance segmentation and reading order prediction into a single, end-to-end framework.

Youtu-LLM

Youtu-LLM is a new, small, yet powerful LLM, contains only 1.96B parameters, supports 128k long context, and has native agentic talents. On general evaluations, Youtu-LLM significantly outperforms SOTA LLMs of similar size in terms of Commonsense, STEM, Coding and Long Context capabilities; in agent-related testing, Youtu-LLM surpasses larger-sized leaders and is truly capable of completing multiple end2end agent tasks.

GlmOcr

GLM-OCR is a multimodal OCR model for complex document understanding, built on the GLM-V encoder–decoder architecture. It introduces Multi-Token Prediction (MTP) loss and stable full-task reinforcement learning to improve training efficiency, recognition accuracy, and generalization. The model integrates the CogViT visual encoder pre-trained on large-scale image–text data, a lightweight cross-modal connector with efficient token downsampling, and a GLM-0.5B language decoder. Combined with a two-stage pipeline of layout analysis and parallel recognition based on PP-DocLayout-V3, GLM-OCR delivers robust and high-quality OCR performance across diverse document layouts.

Breaking changes

  • 🚨 T5Gemma2 model structure (#43633) - Makes sure that the attn implementation is set to all sub-configs. The config.encoder.text_config was not getting its attn set because we aren't passing it to PreTrainedModel.init. We can't change the model structure without breaking so I manually re-added a call to self.adjust_attn_implemetation in modeling code

  • 🚨 Generation cache preparation (#43679) - Refactors cache initialization in generation to ensure sliding window configurations are now properly respected. Previously, some models (like Afmoe) created caches without passing the model config, causing sliding window limits to be ignored. This is breaking because models with sliding window attention will now enforce their window size limits during generation, which may change generation behavior or require adjusting sequence lengths in existing code.

  • 🚨 Delete duplicate code in backbone utils (#43323) - This PR cleans up backbone utilities. Specifically, we have currently 5 different config attr to decide which backbone to load, most of which can be merged into one and seem redundant After this PR, we'll have only one config.backbone_config as a single source of truth. The models will load the backbone from_config and load pretrained weights only if the checkpoint has any weights saved. The overall idea is same as in other composite models. A few config arguments are removed as a result.

  • 🚨 Refactor DETR to updated standards (#41549) - standardizes the DETR model to be closer to other vision models in the library.

  • 🚨Fix floating-point precision in JanusImageProcessor resize (#43187) - replaces an int() with round(), expect light numerical differences

  • 🚨 Remove deprecated AnnotionFormat (#42983) - removes a missnamed class in favour of AnnotationFormat.

... (truncated)

Commits

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [transformers](https://github.com/huggingface/transformers) from 4.57.3 to 5.1.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.57.3...v5.1.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-version: 5.1.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot added dependencies Pull requests that update a dependency file python Pull requests that update python code labels Feb 5, 2026
@dependabot dependabot bot temporarily deployed to docker-s3-upload February 5, 2026 21:23 Inactive
@dependabot dependabot bot temporarily deployed to docker-s3-upload February 5, 2026 21:23 Inactive
@meta-cla meta-cla bot added the cla signed label Feb 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla signed dependencies Pull requests that update a dependency file python Pull requests that update python code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0 participants