Releases: symflower/eval-dev-quality
v1.0.8
What's Changed
- Fetch total costs from OpenRouter after query by @ahumenberger in #425
Full Changelog: v1.0.7...v1.0.8
v1.0.7
What's Changed
- Collect usage metrics of each query to be able to calculate costs by @ahumenberger in #424
Full Changelog: v1.0.6...v1.0.7
v1.0.6
What's Changed
- fix, Allow selecting models with attributes for OpenRouter as well by @ahumenberger in #421
Full Changelog: v1.0.5...v1.0.6
v1.0.5
v1.0.4
What's Changed
- Update Roadmap Template with Deep Dive Template by @bauersimon in #417
- Replace "Inkscape" with "svgexport" by @Munsio in #416
Full Changelog: v1.0.3...v1.0.4
v1.0.3
v1.0.2
What's Changed
- fix, Multiply the "tests-passing" values of transpile tasks by two, because we transpile for two languages and therefore run twice the tests by @Munsio in #402
- Allow setting reasoning_effort for models (e.g. OpenAI's o3-mini) by @zimmski in #408
Full Changelog: v1.0.1...v1.0.2
v1.0.1
What's Changed
- Release script for tagging and updating the binary version by @bauersimon in #391
- Clarify order of tasks for a release by @zimmski in #397
- fix, Retry resetting repository if "git clean" fails by @ahumenberger in #400
- Update "github.com/zimmski/osutil" to v1.3.0 to sync with local repository by @zimmski in #401
Full Changelog: v1.0.0...v1.0.1
v1.0.0
Highlights 🌟
- Spring Boot 🥬 unit test support: evaluating how models generate tests for Spring
- Code Migration 🔁 task: asking models to migrate code, e.g. tests from JUnit 4 to JUnit 5
Changes 💡
- Development & Management 🛠️
  - Linters in development environment and CI #18 #355 #359 #377 @zimmski
  - Bump golang.org/x/image from 0.11.0 to 0.18.0 #340 @dependabot
  - VSCode recommended extensions and debugging configuration #361 @bauersimon
  - Enable Symflower linter by default in VSC #362 @ruiAzevedo19
  - Script for tagging and releasing #344 #391 @bauersimon
  - Add Symflower License in GitHub Actions so the CI works again #389 #393 @zimmski
  - Clean up old container images #394 #395 @zimmski
  - Tag container images of releases #396 @zimmski
- Documentation 📚
  - Add issue templates for bug and feature requests #338 #342 #345 #346 @bauersimon
  - Improve wording of templates and documentation #343 @bauersimon
  - Update symflower.com with DevQualityEval entries #339 #348 @zimmski
  - Roadmap issue template clarification on workflow and payment page #388 @bauersimon
  - Roadmap issue template clarification on how to use the roadmap issue itself to create release notes #397
- Evaluation ⏱️
  - refactor, Clarify wording, logs, debugging regarding symflower fix #352 @bauersimon
  - refactor, Simplify assessment cleanup for tests #353 @ruiAzevedo19
  - Cleanup logging and log to JSON for easier parsing #379 @bauersimon
  - fix, Solving Plain once (with or without template) is enough to not get disqualified #360 @bauersimon
- Models 🤖
  - none
- Reports & Metrics 🗒️
  - Write run count to CSV report #337 @bauersimon
  - Remove category SVG from the report, because we are moving to a leaderboard #356 @zimmski
  - Remove categorization and total score of reporting, and the "report" command #357 @zimmski
  - Remove scoring from within the evaluation and remove and lint unused functions #369 @bauersimon
  - Remove assessment collection #378 @bauersimon
  - Max scores per case in repository #390 #392 @bauersimon
- Operating Systems 🖥️
  - none
- Tools 🧰
  - Script to convert all SVG files to PNG with Inkscape #349 @bauersimon
  - Script to find max coverage score from result data #380 @bauersimon
- Tasks 🔢
  - "Write-test" task with Symflower Smart Test Template as base #350 #351 #354 #358 @bauersimon
  - Spring Boot "write-test" #365 #367 @bauersimon
  - Migration from JUnit4 to JUnit5 #375 #376 @ruiAzevedo19
Details 🔍
- Update symflower.com with DevQualityEval entries by @zimmski in #339
- Bump golang.org/x/image from 0.11.0 to 0.18.0 by @dependabot in #340
- Issue templates for bugs and feature requests by @bauersimon in #342
- Overhaul bug and feature request issue templates by @bauersimon in #345
- fix, Input form template does not allow special rendering by @bauersimon in #346
- Fix typos in Ruby README paths and update to new deep dive by @bauersimon in #348
- Write run count to CSV report by @bauersimon in #337
- Script to convert all SVG files to PNG with Inkscape by @bauersimon in #349
- Clarify wording, logs, debugging regarding symflower fix by @bauersimon in #352
- Simplify assessment cleanup for tests by @bauersimon in #353
- Always apply "symflower fix" and refactor "write test" task and LLM prompting to prepare for templates by @bauersimon in #354
- "Write-test" task with Symflower Smart Test Template as base by @bauersimon in #351
- Install and run some Go linters by @zimmski in #355
- Remove category SVG from the report, because we are moving to a leaderboard by @zimmski in #356
- fix, Use all the rules of Go linter "revive" but exclude some that do not make sense and exit "1" on errors, always by @zimmski in #359
- Remove categorization and total score of reporting, and the "report" command by @zimmski in #357
- "Write-test" task with Symflower Smart Test Template as base by @bauersimon in #358
- VSCode recommended extensions and debugging configuration by @bauersimon in #361
- fix, Solving Plain once (with or without template) is enough to not get disqualified by @bauersimon in #360
- Enable Symflower linter by default in VSC by @ruiAzevedo19 in #362
- Remove evaluation logs for older versions by @bauersimon in #371
- Mention sponsoring in README by @bauersimon in #372
- Update README.md with new "support us" message and call-to-action by @kristofhorvath in #373
- Host CTA graphics directly on GitHub to avoid caching inconsistencies by @bauersimon in #374
- Support Spring by @bauersimon in #367
- New task for code migration by @ruiAzevedo19 in #376
- Remove scoring from within the evaluation and remove and lint unused functions by @bauersimon in #369
- Lint unused code in GitHub Actions by @bauersimon in #377
- Remove assessment collection by @bauersimon in #378
- Script to find max coverage score from result data by @bauersimon in #380
- JSON, structured logging by @bauersimon in #379
- Upgrade Symflower to v45435 to include license changes by @zimmski in #393
- Clarify roadmap/release process by @bauersimon in #388
- Clean up container images on GitHub by @zimmski in #394
- fix, Do not clean up the "main" or "latest" container image tags because they point to the latest stable version by @zimmski in #395
- Build container images on tagging releases by @zimmski in #396
- Save maximum scores for cases and write CSV per case by @bauersimon in #392
New Contributors
- @dependabot made their first contribution in #340
- @kristofhorvath made their first contribution in #373
Full Changelog: v0.6.2...v1.0.0
v0.6.2
Ruby Support
Fully added Ruby as a new language.
- Mistakes repository for Ruby by @ruiAzevedo19 in #316
- Transpile repository for Ruby by @ruiAzevedo19 in #318
- Infer the language we want to transpile from the file extension, so other languages are supported for transpilation by @ruiAzevedo19 in #317
- Define the mistakes logic for Ruby by @ruiAzevedo19 in #326
- fix, Use another example for the missing import package, since the current one does not work because Ruby auto-loads the JSON module by @ruiAzevedo19 in #332
- Finalize Ruby support by @ruiAzevedo19 & @bauersimon in #327
Further merge requests
- Throw an error when trying to use a configuration file with containerized runtimes by @Munsio in #331
- Set up Ruby inside the GitHub Actions workflow by @Munsio in #333
- v0.6 results by @Munsio & @bauersimon in #324
Full Changelog: v0.6.1...v0.6.2