Contributing to Sklearn-Optuna¶

Thank you for your interest in contributing to Sklearn-Optuna! This document provides guidelines for contributing to the project.

Code of Conduct¶

We are committed to providing a welcoming and inclusive environment for all contributors. Please be respectful and considerate in all interactions.

Getting Started¶

Prerequisites¶

Python 3.11+
uv (recommended)
just (optional, for task automation)
Git

Development Setup¶

Fork the repository on GitHub
Clone your fork:

git clone https://github.com/YOUR_USERNAME/sklearn-optuna.git
cd sklearn-optuna

Install dependencies:

uv sync --group dev

Install the git hooks (required):

uv run prek install -f

Development Workflow¶

Making Changes¶

Create a new branch:

git checkout -b feature/my-feature

Make your changes
Run tests:

With just:

just test

With nox:

uvx nox -s test

With uv run:

uv run pytest

Format and fix code:

With just:

just fix

With nox:

uvx nox -s fix

With uv run:

uv run ruff format src tests
uv run ruff check src tests --fix
uv run ty check src

Commit your changes:

git add .
git commit -m "feat: add my feature"

We follow Conventional Commits for commit messages. The format is enforced by a commitizen commit-msg hook, which validates your commit messages automatically.

Valid commit message examples:

feat: add new feature
fix: resolve bug in calculation
docs: update installation guide
chore: update dependencies
test: add tests for new feature

Running Tests¶

Sklearn-Optuna uses pytest with markers to categorize tests into different types:

Fast tests: Unit tests that run quickly without subprocess calls or heavy I/O
Slow tests: Tests marked with @pytest.mark.slow that take longer to execute
Integration tests: Tests marked with @pytest.mark.integration that run subprocesses or test multiple components together

Test Commands¶

Run fast tests only (recommended during development):

With just:

just test-fast

With nox:

uvx nox -s test_fast

With uv run:

uv run pytest -m "not slow and not integration"

Run slow and integration tests:

With just:

just test-slow

With nox:

uvx nox -s test_slow

With uv run:

uv run pytest -m "slow or integration"

Run all tests:

With just:

just test

With nox:

uvx nox -s test

With uv run:

uv run pytest

Run tests with coverage:

With just:

just test-cov

With nox:

uvx nox -s test_coverage

With uv run:

uv run pytest --cov=sklearn_optuna --cov-report=html

Run tests across multiple Python versions:

With nox:

uvx nox -s test

Run example notebook tests:

With just:

just test-examples

With nox:

uvx nox -s test_examples

With uv run:

uv run pytest tests/test_examples.py -m example -n auto

This runs all notebooks in the examples/ directory as Python scripts in parallel using pytest-xdist (-n auto). Each notebook is executed non-interactively to validate it runs without errors.
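The discover-then-execute idea described above can be sketched in a few lines. This is only an illustration and not the contents of tests/test_examples.py: the function names `discover_notebooks` and `run_notebook` are hypothetical, and the sketch assumes every notebook is a top-level `.py` file that can be executed directly with the current interpreter.

```python
import subprocess
import sys
from pathlib import Path


def discover_notebooks(examples_dir: Path) -> list[Path]:
    """Return every top-level .py file in the directory, in a stable order."""
    return sorted(examples_dir.glob("*.py"))


def run_notebook(path: Path, timeout: float = 300.0) -> subprocess.CompletedProcess:
    """Execute one notebook non-interactively as a plain Python script."""
    return subprocess.run(
        [sys.executable, str(path)],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


if __name__ == "__main__":
    # In a pytest suite the discovered paths would typically feed
    # pytest.mark.parametrize so that each notebook becomes its own test case.
    for notebook in discover_notebooks(Path("examples")):
        result = run_notebook(notebook)
        status = "ok" if result.returncode == 0 else "FAILED"
        print(f"{notebook.name}: {status}")
```

Treating each notebook as a separate test case is what lets pytest-xdist spread them across workers with -n auto.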

When to Mark Tests as Slow or Integration¶

Mark your tests appropriately to help maintain fast feedback during development:

Use @pytest.mark.slow for tests that:
- Take more than a few seconds to run
- Perform heavy computations
- Make network requests
- Access external resources
Use @pytest.mark.integration for tests that:
- Run subprocess commands
- Test multiple components working together
- Require complex setup or teardown
- Exercise end-to-end workflows
@pytest.mark.example is used in tests/test_examples.py to:
- Validate example notebooks execute without errors
- Run notebooks in the examples/ directory
- Test interactive documentation and tutorials

Example:

import pytest

@pytest.mark.slow
def test_large_computation():
    # Long-running test
    pass

@pytest.mark.integration
@pytest.mark.slow
def test_end_to_end_workflow():
    # Complex integration test
    pass
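Custom markers generally need to be registered so that pytest does not warn about unknown marks. The sketch below shows one way to do that from a conftest.py, using pytest's `pytest_configure` hook and `config.addinivalue_line`. It is an assumption that registration happens this way here; the project may declare its markers in its pytest configuration file instead, and the marker descriptions are illustrative.

```python
# conftest.py (illustrative sketch, not the project's actual file)

# Marker names come from this guide; the descriptions are assumptions.
MARKERS = {
    "slow": "tests that take more than a few seconds to run",
    "integration": "tests that run subprocesses or span multiple components",
    "example": "tests that execute the example notebooks",
}


def pytest_configure(config):
    """Register the custom markers so `pytest -m ...` expressions can use them."""
    for name, description in MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")
```

With the markers registered, an expression such as `-m "not slow and not integration"` selects only the fast tests.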

Test Organization¶

Follow these conventions when writing tests:

Class-based test structure: Group related tests into classes using the Test<Component><Scenario> naming pattern.

Fixture usage: Prefer fixtures from conftest.py over module-level data. See tests/conftest.py for available factories.

Property-based testing: Hypothesis is available for property-based testing of edge cases and invariants.
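As an illustration of the Test<Component><Scenario> naming pattern, the sketch below groups tests for one component into one class per scenario. The `normalize_direction` helper is hypothetical and exists only to give the tests something to exercise; it is not part of sklearn_optuna.

```python
def normalize_direction(direction: str) -> str:
    """Hypothetical component under test: tidy up a direction string."""
    return direction.strip().lower()


class TestNormalizeDirectionCasing:
    """Component: NormalizeDirection. Scenario: mixed-case input."""

    def test_uppercase_is_lowercased(self):
        assert normalize_direction("MAXIMIZE") == "maximize"

    def test_lowercase_is_unchanged(self):
        assert normalize_direction("minimize") == "minimize"


class TestNormalizeDirectionWhitespace:
    """Component: NormalizeDirection. Scenario: padded input."""

    def test_surrounding_spaces_are_stripped(self):
        assert normalize_direction("  minimize ") == "minimize"
```

Keeping one scenario per class makes it easy to run a single scenario with `pytest -k TestNormalizeDirectionWhitespace`.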

CI Test Strategy¶

The CI pipeline uses a two-tier testing strategy optimized for fast feedback:

Fast tests (test-fast job): Runs on the minimum and maximum supported Python versions only (3.11 and 3.14):
- Draft PRs: Ubuntu only, for quick feedback in about 2-3 minutes
- Ready PRs and main: Ubuntu, Windows, and macOS, for cross-platform validation
Full test suite (test-full job): Runs all tests (fast, slow, and integration) on Ubuntu across all supported Python versions (3.11-3.14). It runs for PRs that are not drafts and for the main branch, and includes coverage reporting on the minimum supported Python version.
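The two-tier strategy can be read as a small decision table. The sketch below restates it in Python purely for illustration: the real logic lives in the CI workflow files, and the function name `plan_jobs` and the lowercase OS labels are assumptions.

```python
PYTHON_VERSIONS = ["3.11", "3.12", "3.13", "3.14"]


def plan_jobs(draft: bool) -> dict[str, list[tuple[str, str]]]:
    """Return the (os, python) combinations each CI job would cover.

    Pass draft=False for ready PRs and for the main branch.
    """
    # Fast tests only use the oldest and newest supported interpreters.
    fast_versions = [PYTHON_VERSIONS[0], PYTHON_VERSIONS[-1]]
    fast_systems = ["ubuntu"] if draft else ["ubuntu", "windows", "macos"]
    fast = [(system, version) for system in fast_systems for version in fast_versions]

    # The full suite is skipped for drafts and otherwise runs on Ubuntu only.
    full = [] if draft else [("ubuntu", version) for version in PYTHON_VERSIONS]
    return {"test-fast": fast, "test-full": full}


if __name__ == "__main__":
    for draft in (True, False):
        jobs = plan_jobs(draft)
        print(f"draft={draft}: {len(jobs['test-fast'])} fast, {len(jobs['test-full'])} full")
```

Under these assumptions a draft PR triggers two fast jobs and no full-suite jobs, while a ready PR triggers six fast jobs and four full-suite jobs.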

Code Quality¶

Run linters and type checkers:

With just:

just lint

With nox:

uvx nox -s lint

With uv run:

uv run ruff check src tests
uv run ty check src

Format code and fix issues:

With just:

just fix

With nox:

uvx nox -s fix

With uv run:

uv run ruff format src tests
uv run ruff check src tests --fix
uv run ty check src

Run all quality checks:

With just:

just check

With uv run:

just fix && just test

Docstring Standards¶

All public functions, methods, and classes require NumPy-style docstrings. Coverage is enforced at 100% by interrogate.

Check docstring coverage:

uvx interrogate src

Required sections (as applicable):

Parameters - All function/method parameters with types and descriptions
Returns - Return value type and description
Raises - Exceptions raised
See Also - Related classes/functions
References - Academic references for algorithms or methods used
Notes - Implementation details, mathematical background
Examples - Usage examples (tested via pytest --doctest-modules)
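A minimal sketch of a NumPy-style docstring that uses several of these sections is shown below. The `scale_trials` function is hypothetical and is not part of sklearn_optuna; it only serves to show the layout.

```python
def scale_trials(n_trials: int, factor: float = 1.0) -> int:
    """Scale a trial budget by a constant factor.

    Parameters
    ----------
    n_trials : int
        Base number of trials. Must be non-negative.
    factor : float, default=1.0
        Multiplier applied to ``n_trials``.

    Returns
    -------
    int
        The scaled budget, truncated toward zero.

    Raises
    ------
    ValueError
        If ``n_trials`` is negative.

    Examples
    --------
    >>> scale_trials(10, 1.5)
    15
    >>> scale_trials(3, 0.5)
    1
    """
    if n_trials < 0:
        raise ValueError("n_trials must be non-negative")
    return int(n_trials * factor)
```

The Examples section doubles as a test: pytest --doctest-modules executes each `>>>` line and compares the result with the line that follows it.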

See Also format:

Use standard numpydoc format with short names:

See Also
--------
OtherClass : One-line description of how it relates.
other_function : Another related object.

Names are linked to their API pages automatically, whether or not you wrap them in backticks. Names that cannot be resolved (a private helper, or a concept rather than an API object) are left as plain text rather than failing the build, so you can reference anything that reads well.

Fully qualified names work too (sklearn_optuna.module.OtherClass), and resolve to the same page as the short form. A member reference (OtherClass.method) links to that member on its class page. A name from another project (for example sklearn.linear_model.Ridge) links to that project's documentation when its inventory is configured in mkdocs.yml.

For hyperlinks, always use Markdown syntax: [text](url).

Glossary¶

A glossary is optional. Create docs/pages/explanation/glossary.md and define terms as a definition list, giving each one an explicit anchor:

Memory buffer { #memory-buffer .autolink }
:   The internal store of recent rows a stateful component maintains.

Step { #step }
:   One timestep.

A term marked .autolink has its first occurrence on every other page turned into a link to its definition. The glossary page is the only place terms are listed, so a definition and its links cannot drift apart.

Opting in is per term because defining a word and advertising it everywhere are different decisions. A glossary is free to define short, common words such as step above, and auto-linking those wherever prose happens to use them is noise. Text inside code, headings and existing links is never touched.
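To make the "first occurrence only, never inside code" behaviour concrete, here is a rough sketch of the idea. It is not the documentation build's actual implementation: the function name `autolink_first` is hypothetical, it handles a single term, and it only skips inline code spans, whereas the real feature also leaves headings and existing links alone.

```python
import re


def autolink_first(text: str, term: str, url: str) -> str:
    """Link the first occurrence of `term` that is outside inline code spans."""
    # Splitting on a capturing group keeps the `code` spans in the result,
    # so they can be passed through unchanged.
    parts = re.split(r"(`[^`]*`)", text)
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)

    linked = False
    output = []
    for part in parts:
        is_code = len(part) >= 2 and part.startswith("`") and part.endswith("`")
        if not linked and not is_code:
            # Replace at most one match and remember that the term is now linked.
            part, count = pattern.subn(lambda m: f"[{m.group(0)}]({url})", part, count=1)
            linked = count > 0
        output.append(part)
    return "".join(output)


if __name__ == "__main__":
    page = "Use `memory buffer` here. The memory buffer grows. memory buffer again."
    print(autolink_first(page, "memory buffer", "glossary/#memory-buffer"))
```

Linking only the first occurrence keeps a page readable while still giving readers one route to the definition.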

Documentation¶

Build documentation:

With just:

just build

With nox:

uvx nox -s build_docs

With uv run:

uv run python docs_build/build.py prebuild && uv run zensical build

Serve documentation locally. just serve and nox -s serve_docs run the preview supervisor (docs_build/serve.py), which watches src/ and regenerates the API pages when you add or change a public symbol, so it appears in the preview without a restart. Raw zensical serve still works but is a static preview: it does not regenerate the API pages on a source edit, because that regeneration is not tied to the documentation engine.

With just:

just serve

With nox:

uvx nox -s serve_docs

With uv run (static preview):

uv run zensical serve

Empty site with no error? Check your inotify limits

If just build/just serve finishes successfully but produces an empty site (no pages, no error), the documentation engine could not register the source files to watch: your machine's inotify instances are exhausted, which is common on a desktop running an editor plus other file watchers. Raise the limit and rebuild:

sudo sysctl fs.inotify.max_user_instances=512
sudo sysctl fs.inotify.max_user_watches=524288

Continuous integration and Read the Docs run in fresh environments and are not affected; this only bites busy local machines.

View all available commands:

just --list

Adding Examples¶

All examples are interactive marimo notebooks that combine code, markdown, and visualizations.

Creating a Notebook¶

Create a new marimo notebook in examples/<name>.py:

With just:

just example <name>.py

With uv run:

uv run marimo new examples/<name>.py

Required Structure¶

Notebooks serve tutorials or how-to guides only - never explanation or reference. The structure depends on the quadrant:

Tutorial notebooks (category: tutorial):

Title: # In this notebook, we will [goal]
Prerequisites: One-liner stating required prior knowledge
Numbered sections: ## 1. Section Name, ## 2. Section Name, etc. with visible output in every cell
What We Built: Closing section summarizing what was accomplished and linking to next steps

How-to notebooks (category: how-to):

Title: # How to [Verb] [Object]
Prerequisites: One-liner stating required prior knowledge
Numbered sections: ## 1. Section Name, ## 2. Section Name, etc. with action-only prose
No closing summary - the notebook ends after the last step

Example intro cell (tutorial):

# Your First Pipeline

In this notebook, we will build a small Sklearn-Optuna pipeline end to end
and inspect what it produces.

**Prerequisites:** Python 3.11+ and basic familiarity with sklearn_optuna.

Example intro cell (how-to):

# How to Handle Missing Values

This notebook shows how to configure sklearn_optuna to drop incomplete
records before processing.

**Prerequisites:** Familiarity with the
quickstart ([View](../../examples/quickstart/) · [Open in marimo](https://marimo.app/github.com/stateful-y/sklearn-optuna/blob/5457d74b3a3b550f00d3efad26b143b2a31cebd5/examples/quickstart.py)).

Marimo Cell Conventions¶

Use hide_code=True on all markdown cells, import cells, and utility/helper cells
Use r"""...""" (raw triple-quoted strings) for markdown cell content

All notebooks declare dependencies using PEP 723 inline script metadata at the top of the file:

# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "plotly",
#     "scikit-learn",
# ]
# ///

Dependencies are sorted alphabetically and only list third-party packages actually imported by the notebook.
marimo itself is NOT listed as a dependency (it is the runner, not a dependency of the script).
To add a dependency: uv add --script notebook.py <package> or edit the header manually.
To run in an isolated sandbox: uv run marimo edit --sandbox notebook.py.
Group all imports into a single hidden cell after the metadata header

Content Guidelines¶

Gallery metadata: Every example notebook should include a __gallery__ variable defining title, description, and category ("tutorial" or "how-to") for the example gallery. Add a companion key pointing to the matching doc page path when one exists.
Markdown density: Each numbered section should open with a short markdown cell (one to two sentences) before any code cells. Tutorial sections may be slightly longer; how-to sections should be action-only.
No emojis: Do not use emojis anywhere in notebooks, including headings, bullet points, and concluding remarks.
API cross-links: When mentioning sklearn_optuna classes or functions in markdown cells, wrap them in backtick-link syntax pointing to the API page.
Voice: Tutorials use "we" (first-person plural). How-to guides use imperative or conditional imperatives ("If you need X, pass Y").

Testing and Documentation¶

Run the example test suite to verify your notebook passes:

With just:

just test-examples

With nox:

uvx nox -s test_examples

With uv run:

uv run pytest tests/test_examples.py -m example

Add a link to your example in docs/pages/examples/index.md:

- [Example Name](../examples/<name>/) - Brief description

The build's prebuild step (docs_build/build.py prebuild) exports notebooks to HTML before the site is built; the export itself lives in docs_build/_notebooks.py, which you can also run on its own with uvx nox -s build_steps when you want to re-export without building the whole site. All notebooks in examples/ are automatically discovered and tested by test_examples.py using pytest's parametrization feature, which runs them in parallel for fast validation.

Before You Open a PR¶

Run just test-fast and confirm that all fast tests pass
Run just fix so that the code is formatted and linted
Write or update tests for your changes
If you changed docs, run just serve and verify they render
Use conventional commit messages
Keep the PR focused on a single concern

Submitting Changes¶

Push your changes to your fork:

git push origin feature/my-feature

Open a Pull Request on GitHub
Ensure all CI checks pass
Wait for review and address any feedback

Pull Request Guidelines¶

Write clear, descriptive PR titles following Conventional Commits
Include a description of the changes
Add tests for new functionality
Update documentation as needed
Ensure all tests pass
Keep PRs focused and atomic

Commit Message Convention¶

We use Conventional Commits enforced by commitizen:

feat: - New features (triggers minor version bump)
fix: - Bug fixes (triggers patch version bump)
docs: - Documentation changes
style: - Code style changes (formatting, etc.)
refactor: - Code refactoring
test: - Adding or updating tests
chore: - Maintenance tasks
perf: - Performance improvements
ci: - CI/CD changes

Breaking changes: Add ! after the type or add BREAKING CHANGE: in the footer to trigger a major version bump.

Example with scope:

git commit -m "feat(api): add new endpoint for user data"

Example with breaking change:

git commit -m "feat!: redesign authentication system

BREAKING CHANGE: authentication now requires API keys instead of passwords"

The commit-msg hook validates your commit messages and prevents commits that don't follow the convention. CI validates them again on single-commit PRs, which is the case where your commit message, not the PR title, becomes the squash commit and lands in the changelog. On a multi-commit PR the PR title ships instead, and the individual commit messages are free-form.

Release Process¶

Maintainers only

The release process is managed by project maintainers. Contributors do not need to create releases. Open PRs and a maintainer will handle versioning and publishing.

Releases are automated through GitHub Actions and start when a new tag is pushed. A manual approval gate before publishing to PyPI provides a final quality check.

graph LR
    A[Push Tag<br/>v*.*.*] --> B[changelog.yml]
    B --> C[Generate<br/>CHANGELOG.md]
    B --> D[Build Package<br/>validation]
    C --> E[Create PR]
    E --> F[Review & Merge<br/>PR]
    F --> G[publish-release.yml]
    G --> H[Create GitHub<br/>Release]
    H --> I{Manual<br/>Approval}
    I -->|Approve| J[Publish to PyPI]
    style I fill:#f59e0b,stroke:#333,stroke-width:2px,color:#fff
    style J fill:#10b981,stroke:#333,stroke-width:2px,color:#fff

How It Works¶

Tag a release (signed with gitsign keyless Sigstore signing):

git tag -s v0.2.0 -m "Release v0.2.0"
git push origin v0.2.0

One-time gitsign setup so git tag -s signs via Sigstore, with no long-lived GPG key to manage:

git config --local gpg.x509.program gitsign
git config --local gpg.format x509
git config --local tag.gpgSign true

The first signing authenticates via your identity provider in a browser; verify a tag with gitsign verify-tag. Signed tags complement the PEP 740 artifact attestations the publish workflow already produces, so both the tag and the published artifacts are verifiable.
Automated changelog workflow (changelog.yml):
- Generates changelog from conventional commits using git-cliff
- Creates a Pull Request with the updated CHANGELOG.md
- Builds the package distributions (wheels and sdist) for immediate validation
- Stores distributions as workflow artifacts (reused later to avoid rebuilding)
Review and merge the changelog PR:
- A maintainer reviews the generated changelog
- Once approved, merge the PR to main
Automated release workflow (publish-release.yml):
- Creates a GitHub Release with generated release notes
- Attaches distribution files to the release
- Waits for manual approval before proceeding to PyPI
Manual approval for PyPI publishing:
- Designated reviewers receive a notification
- Review the GitHub Release to verify everything is correct
- Approve the deployment to publish to PyPI
- Package is published using Trusted Publishing (OIDC, no tokens needed)
Release notes generation:
- All commits since the last tag are analyzed
- Commits are grouped by type (Added, Fixed, Documentation, etc.)
- Only commits following conventional format are included
- Breaking changes are highlighted

Version Numbering¶

This project uses Semantic Versioning:

Major (1.0.0): Breaking changes
Minor (0.1.0): New features (backward compatible)
Patch (0.0.1): Bug fixes (backward compatible)

Use conventional commits to communicate the type of change, and select the appropriate version number when tagging.
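For maintainers choosing the next tag, the arithmetic behind the three kinds of bump can be sketched as follows. The `bump` helper is hypothetical and only handles plain MAJOR.MINOR.PATCH strings, without pre-release or build suffixes.

```python
def bump(version: str, part: str) -> str:
    """Return the next version after bumping the given part."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # Breaking change: reset minor and patch.
        return f"{major + 1}.0.0"
    if part == "minor":
        # New backward-compatible feature: reset patch.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Backward-compatible bug fix.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


if __name__ == "__main__":
    current = "0.1.3"
    for part in ("patch", "minor", "major"):
        print(f"{current} -> {bump(current, part)} ({part})")
```

The result is prefixed with v when tagging, for example v0.2.0.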

Questions?¶

If you have any questions, feel free to:

Thank you for contributing! 🎉