v0.4.0: Export as HF `datasets.Dataset`; `unstructured` preprocessing task

Pre-release

@rmitsch released this 25 Jan 18:57

✨ New features and improvements

  • Support exporting task results as a Hugging Face `datasets.Dataset` for easier distillation and model training (#63)
  • Add new task: preprocessing documents with `unstructured` (#61)
  • Introduce strict mode, which raises errors when result parsing fails (#57)
  • Use reasoning traces/CoT for existing tasks (#59)
  • Simplify serialization implementation for tasks (#61)
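
The strict mode item above can be illustrated with a minimal sketch. These notes do not show the library's actual API, so the function name `parse_result`, the `strict` parameter, and the choice to return `None` in non-strict mode are all assumptions made for illustration only. The idea is that a parsing failure either surfaces as an error (strict) or is swallowed and reported as a missing result (non-strict).

```python
import json
from typing import Any, Optional


def parse_result(raw: str, strict: bool = True) -> Optional[dict[str, Any]]:
    """Parse a raw model response into a dict (hypothetical helper, not the library's API).

    With strict=True, a response that cannot be parsed raises ValueError.
    With strict=False, the failure is swallowed and None is returned instead.
    """
    try:
        parsed = json.loads(raw)
        # Treat valid JSON of the wrong shape as a parsing failure too.
        if not isinstance(parsed, dict):
            raise ValueError(f"Expected a JSON object, got {type(parsed).__name__}")
        return parsed
    except ValueError as err:  # json.JSONDecodeError is a subclass of ValueError
        if strict:
            raise ValueError(f"Could not parse result: {raw!r}") from err
        return None
```

In a batch pipeline, non-strict behavior keeps the run going when a few responses are malformed, while strict behavior is the safer default when silent gaps in the results would be worse than a failed run.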

🔴 Bug fixes

  • Fixed bugs in serialization mechanism (#61)

⚠️ Backwards incompatibilities

  • `tasks.parsing` and `tasks.chunkers` have been merged into `tasks.preprocessing`
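
Code that has to run against both the old and the new module layout could resolve the module at import time. The helper below is a generic sketch built on the standard library's `importlib`; `import_first_available` is a hypothetical name, and the dotted paths in the usage comment assume the package is importable as `sieves`, which these notes do not state.

```python
import importlib
from types import ModuleType


def import_first_available(*module_paths: str) -> ModuleType:
    """Return the first module in module_paths that can be imported."""
    for path in module_paths:
        try:
            return importlib.import_module(path)
        except ImportError:  # also covers ModuleNotFoundError
            continue
    raise ImportError(f"None of these modules could be imported: {module_paths}")


# Hypothetical usage (package name is an assumption):
# chunking = import_first_available("sieves.tasks.preprocessing", "sieves.tasks.chunkers")
```

Since two modules were merged into one, a fallback like this only covers the names that lived in the old module passed as the second argument. Pinning the dependency version and updating the imports directly is the simpler option when supporting both layouts is not required.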

📖 Documentation and examples

-

👥 Contributors

@rmitsch


Full Changelog: v0.3.0...v0.4.0