Getting Started

In this tutorial, we will run an Optuna-powered hyperparameter search using Scikit-Learn's familiar fit() / best_params_ API. Along the way, we will install the package, define an Optuna search space, create an OptunaSearchCV instance, and inspect the results.

Try it interactively

OptunaSearchCV Quickstart

Run a fast hyperparameter search and read the best parameters and score.

Prerequisites

Python 3.11+ installed
A terminal or command prompt

Installation

Choose your preferred package manager:

pip:

pip install sklearn_optuna

uv:

uv add sklearn_optuna

Verify the installation:

import sklearn_optuna

print(sklearn_optuna.__version__)

The output should look something like:

0.1.0a3

Your First Hyperparameter Search

Now let's set up a classification problem and find the best regularization strength for a logistic regression model.

Prepare the data

We will use Scikit-Learn's make_classification to generate a small synthetic dataset:

from sklearn.datasets import make_classification

X, y = make_classification(
    n_samples=200, n_features=6, n_informative=3,
    n_redundant=0, random_state=42,
)

Define the search space

We tell Optuna which hyperparameter to tune and over what range. Here we search for the regularization parameter C on a log scale between 0.01 and 10:

from optuna.distributions import FloatDistribution

param_distributions = {
    "C": FloatDistribution(1e-2, 10.0, log=True),
}

Notice that we use Optuna's FloatDistribution instead of a plain list or range. The log=True argument means values of C are sampled in log-space, so each order of magnitude in the range (0.01 to 0.1, 0.1 to 1, 1 to 10) gets comparable attention. This is important for parameters that span several orders of magnitude.
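
To build intuition for what sampling in log-space means, here is a small standard-library sketch. It is an illustration of the idea only, not Optuna's implementation, and the helper `sample_log_uniform` is a hypothetical name: it draws uniformly between log(low) and log(high) and maps the result back with exp.

```python
import math
import random

LOW, HIGH = 1e-2, 10.0  # same bounds as the search space above


def sample_log_uniform(rng: random.Random, low: float, high: float) -> float:
    """Draw uniformly in log-space, then map back to the original scale."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)
samples = [sample_log_uniform(rng, LOW, HIGH) for _ in range(10_000)]

# Fraction of draws that land in each decade of the range.
shares = {
    "[0.01, 0.1)": sum(c < 0.1 for c in samples) / len(samples),
    "[0.1, 1)": sum(0.1 <= c < 1.0 for c in samples) / len(samples),
    "[1, 10]": sum(c >= 1.0 for c in samples) / len(samples),
}
for decade, share in shares.items():
    print(f"{decade:<12} {share:.1%}")
```

Each decade receives roughly a third of the draws. With plain uniform sampling over the same range, values below 0.1 would make up less than 1% of the draws, so small values of C would almost never be tried.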

Create and run the search

Now we bring it all together with OptunaSearchCV. We pass a Sampler wrapper with a fixed seed so that results are reproducible:

import optuna
from sklearn.linear_model import LogisticRegression

from sklearn_optuna import OptunaSearchCV, Sampler

search = OptunaSearchCV(
    LogisticRegression(max_iter=200),
    param_distributions,
    n_trials=10,
    sampler=Sampler(sampler=optuna.samplers.TPESampler, seed=0),
    cv=3,
)
search.fit(X, y)

After a moment, you should see Optuna's trial log. The search runs 10 trials, each evaluating a different value of C using 3-fold cross-validation.
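
To make the per-trial work concrete, the sketch below scores a few hand-picked values of C with 3-fold cross-validation using plain Scikit-Learn. It is only a conceptual stand-in for what one trial evaluates: the helper `evaluate_candidate` is hypothetical, the use of the estimator's default scorer (accuracy) is an assumption, and the real search lets the sampler choose the values of C.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Same synthetic dataset as in the tutorial.
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=3,
    n_redundant=0, random_state=42,
)


def evaluate_candidate(C: float) -> float:
    """Mean 3-fold cross-validated score for one value of C."""
    model = LogisticRegression(C=C, max_iter=200)
    fold_scores = cross_val_score(model, X, y, cv=3)  # default scorer: accuracy
    return float(fold_scores.mean())


# Hand-picked candidates; a sampler would propose these values instead.
for C in (0.01, 0.1, 1.0, 10.0):
    print(f"C={C:<5} mean CV score={evaluate_candidate(C):.3f}")
```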

Inspect the results

Let's check what the search found:

print(f"Best params: {search.best_params_}")
print(f"Best score:  {search.best_score_:.3f}")

You should see output in this form (the values shown here are illustrative, and yours will differ):

Best params: {'C': 1.234}
Best score:  0.870

The best model is already refitted on the full dataset and ready to use:

predictions = search.best_estimator_.predict(X[:5])
print(predictions)

Notice that best_estimator_ is a regular Scikit-Learn estimator. Everything you normally do with a fitted model (predict, score, serialize) works here too.
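
As a quick illustration of the predict, score, and serialize workflow, the sketch below uses a directly fitted LogisticRegression as a stand-in for search.best_estimator_ (an assumption made so the example runs without the search). The value C=1.0 is arbitrary, and pickle is just one common serialization option.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(
    n_samples=200, n_features=6, n_informative=3,
    n_redundant=0, random_state=42,
)

# Stand-in for search.best_estimator_: an ordinary fitted estimator.
model = LogisticRegression(C=1.0, max_iter=200).fit(X, y)

# Round-trip through pickle, as you might when saving a model to disk.
restored = pickle.loads(pickle.dumps(model))

# The restored model makes the same predictions as the original.
same_predictions = bool((restored.predict(X) == model.predict(X)).all())
print(f"Predictions match after round-trip: {same_predictions}")
print(f"Training accuracy: {restored.score(X, y):.3f}")
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.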

What We Built

We ran an Optuna-powered hyperparameter search through Scikit-Learn's standard API. Along the way, we:

Installed Sklearn-Optuna and verified it
Defined an Optuna search space with FloatDistribution
Created and ran an OptunaSearchCV search with a seeded sampler for reproducibility
Inspected best_params_, best_score_, and best_estimator_

Try Interactive Examples

For hands-on learning with interactive notebooks, see the Examples page where you can:

Run code directly in your browser via WebAssembly
Experiment with different parameters
See visual outputs in real-time

Or run locally:

just:

just example quickstart

uv run:

uv run marimo edit examples/quickstart.py

Next Steps

How to Configure Samplers: choose the right optimization algorithm
How to Use Callbacks: control when optimization stops
How to Use in Pipelines: tune hyperparameters inside Scikit-Learn pipelines
Concepts and Architecture: understand how OptunaSearchCV works under the hood
Examples: interactive notebooks with more advanced use cases