Releases: symflower/eval-dev-quality

v1.0.8

06 Mar 13:58
12b82ce

What's Changed

Full Changelog: v1.0.7...v1.0.8

v1.0.7

06 Mar 09:25
d9b0914

What's Changed

  • Collect usage metrics of each query to be able to calculate costs by @ahumenberger in #424

Full Changelog: v1.0.6...v1.0.7

v1.0.6

20 Feb 07:54
377f295

What's Changed

  • fix, Allow selecting models with attributes for OpenRouter as well by @ahumenberger in #421

Full Changelog: v1.0.5...v1.0.6

v1.0.5

13 Feb 09:37
99f5feb

What's Changed

  • Bump symflower version to v46727 (from 45435) by @Munsio in #418

Full Changelog: v1.0.4...v1.0.5

v1.0.4

12 Feb 13:22
d8e766e

What's Changed

Full Changelog: v1.0.3...v1.0.4

v1.0.3

09 Feb 15:35
eb7086b

What's Changed

  • Upgrade Go to 1.23.6 (from 1.21.5) by @zimmski in #413
  • fix, Do not allow total tests to be negative for Go by @zimmski in #413

Full Changelog: v1.0.2...v1.0.3

v1.0.2

03 Feb 17:28
d2683d5

What's Changed

  • fix, Multiply the "tests-passing" values of transpile tasks because we are transpiling for two languages and therefore we are also running twice the tests by @Munsio in #402
  • Allow setting reasoning_effort for models (e.g. OpenAI's o3-mini) by @zimmski in #408

Full Changelog: v1.0.1...v1.0.2

v1.0.1

27 Jan 16:44
09933af

What's Changed

  • Release script for tagging and updating the binary version by @bauersimon in #391
  • Clarify order of tasks for a release by @zimmski in #397
  • fix, Retry resetting repository if "git clean" fails by @ahumenberger in #400
  • Update "github.com/zimmski/osutil" to v1.3.0 to sync with local repository by @zimmski in #401

Full Changelog: v1.0.0...v1.0.1

v1.0.0

22 Jan 14:53
32de99d

Highlights 🌟

  • Spring Boot 🥬 unit test support: evaluating how models generate tests for Spring
  • Code Migration 🔁 task: asking models to migrate code, e.g. tests from JUnit 4 to JUnit 5

Changes 💡

Details 🔍

  • Update symflower.com with DevQualityEval entries by @zimmski in #339
  • Bump golang.org/x/image from 0.11.0 to 0.18.0 by @dependabot in #340
  • Issue templates for bugs and feature requests by @bauersimon in #342
  • Overhaul bug and feature request issue templates by @bauersimon in #345
  • fix, Input form template does not allow special rendering by @bauersimon in #346
  • Fix typos in Ruby README paths and update to new deep dive by @bauersimon in #348
  • Write run count to CSV report by @bauersimon in #337
  • Script to convert all SVG files to PNG with Inkscape by @bauersimon in #349
  • Clarify wording, logs, debugging regarding symflower fix by @bauersimon in #352
  • Simplify assessment cleanup for tests by @bauersimon in #353
  • Always apply "symflower fix" and refactor "write test" task and LLM prompting to prepare for templates by @bauersimon in #354
  • "Write-test" task with Symflower Smart Test Template as base by @bauersimon in #351
  • Install and run some Go linters by @zimmski in #355
  • Remove category SVG from the report, because we are moving to a leaderboard by @zimmski in #356
  • fix, Use all the rules of Go linter "revive" but exclude some that do not make sense and exit "1" on errors, always by @zimmski in #359
  • Remove categorization and total score of reporting, and the "report" command by @zimmski in #357
  • "Write-test" task with Symflower Smart Test Template as base by @bauersimon in #358
  • VSCode recommended extensions and debugging configuration by @bauersimon in #361
  • fix, Solving Plain once (with or without template) is enough to not get disqualified by @bauersimon in #360
  • Enable Symflower linter by default in VSC by @ruiAzevedo19 in #362
  • Remove evaluation logs for older versions by @bauersimon in #371
  • Mention sponsoring in README by @bauersimon in #372
  • Update README.md with new "support us" message and call-to-action by @kristofhorvath in #373
  • Host CTA graphics directly on GitHub to avoid caching inconsistencies by @bauersimon in #374
  • Support Spring by @bauersimon in #367
  • New task for code migration by @ruiAzevedo19 in #376
  • Remove scoring from within the evaluation and remove and lint unused functions by @bauersimon in #369
  • Lint unused code in GitHub Actions by @bauersimon in #377
  • Remove assessment collection by @bauersimon in #378
  • Script to find max coverage score from result data by @bauersimon in #380
  • JSON, structured logging by @bauersimon in #379
  • Upgrade Symflower to v45435 to include license changes by @zimmski in #393
  • Clarify roadmap/release process by @bauersimon in #388
  • Clean up container images on GitHub by @zimmski in #394
  • fix, Do not clean up the "main" or "latest" container image tags because they point to the latest stable version by @zimmski in #395
  • Build container images on tagging releases by @zimmski in #396
  • Save maximum scores for cases and write CSV per case by @bauersimon in #392

New Contributors

Full Changelog: v0.6.2...v1.0.0

v0.6.2

09 Sep 13:08
bbf6ab9

Ruby Support

Fully added Ruby as a new language.

Further merge requests

Full Changelog: v0.6.1...v0.6.2