What Is Hyperparameter Tuning?

Written by Coursera Staff • Updated on

Learn what hyperparameter tuning is and how you can use different techniques to balance the performance, computational cost, and efficiency of your machine learning model.

[Featured Image] A computer programmer tunes the hyperparameters of a machine learning model at work.

When designing a machine learning model, choosing the appropriate hyperparameters is an important step in optimizing your algorithm’s function and performance. This process is called hyperparameter tuning, and it provides a structured approach to testing different hyperparameter values and observing how they affect your model’s output. To understand what hyperparameter tuning is and how it works, explore the basics of machine learning models, hyperparameters, tuning techniques, and how to start building your knowledge base.

Understanding hyperparameter tuning

Understanding what machine learning models are and how hyperparameters influence performance provides a basis for exploring hyperparameter tuning techniques. Each technique offers a unique approach to finding the best model, allowing you to tailor your algorithm design based on your data and industry.

What is a machine learning model?

Machine learning models, also known as machine learning algorithms, help computers learn how to recognize patterns and make predictions without explicit instructions. Machine learning models are guidelines that help a computer understand how to perform a certain task. For example, your machine learning model might tell the computer how to take input data (like photos or text) and produce a certain output (like identifying cats in photos or the sentiments contained in a text block). 

What are hyperparameters?

When building a machine learning model, you need to specify certain aspects called hyperparameters. Hyperparameters are like “settings” or “dials” that control how the model learns and processes data. 

What hyperparameters are used for tuning?

You can choose to tune several types of hyperparameters during model optimization in order to find the best fit. Common examples you might explore include:

  • Learning rate: This specifies how large the adjustments to your model’s parameters (such as weights) are after each training step, and therefore how quickly the model adapts to new information.

  • Learning rate decay: This sets how quickly the learning rate decreases over time, allowing the model to gradually home in on the best solution.

  • Mini-batch size: This instructs the model on how many data points to process at a time. Choosing the right size helps you strike a balance between accuracy and computational efficiency.

  • Number of epochs: The number of epochs refers to how many times the model is exposed to the entire training data set during the learning process. This number balances training data performance with the ability to generalize to new data.

  • Number of hidden layers: For artificial neural networks, this decides how many layers of “thinking” your model has.
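To make these settings concrete, here is a minimal, hypothetical gradient-descent loop in Python. The objective function f(w) = (w − 3)² and all default values are illustrative only, but the loop shows where the learning rate, learning rate decay, and number of epochs each enter the training process.

```python
# A toy training loop (illustrative only) showing how three common
# hyperparameters control learning. The "model" is a single weight w,
# and the loss to minimize is f(w) = (w - 3)^2, whose optimum is w = 3.

def train(learning_rate=0.1, decay=0.99, num_epochs=50):
    w = 0.0                                  # model parameter, initialized arbitrarily
    lr = learning_rate
    for epoch in range(num_epochs):          # number of epochs: full passes over the data
        grad = 2 * (w - 3)                   # gradient of the loss at the current w
        w -= lr * grad                       # learning rate scales each parameter update
        lr *= decay                          # learning rate decay shrinks later steps
    return w

print(train())  # converges toward the optimum w = 3
```

Changing any one of these values alters how training behaves: a learning rate that is too large can overshoot the optimum, while too few epochs can stop training before the model converges.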

What is hyperparameter tuning?

To design a machine learning model with the most accurate output, you need to optimize the hyperparameters. This involves trial and error, as your model will make adjustments after each training run. Think of hyperparameter tuning like developing the perfect cake recipe: You might try a number of different ingredient ratios and cook times before landing on the perfect combination.

Types of hyperparameter tuning techniques

When you engage in hyperparameter tuning, you can choose from a number of different techniques. Some practitioners prefer manual tuning because it lets you see directly how each adjustment affects your model’s results, but this becomes tedious for larger models. Automated techniques are more widely used; these include Bayesian optimization, grid search, and random search.

Bayesian optimization

Bayesian optimization is based on Bayes’ theorem, which describes how to update the probability of an event as new evidence arrives. When used in hyperparameter tuning, this means Bayesian optimization algorithms build a probabilistic model that specifies which combination of hyperparameters is “most likely” to lead to the best outcomes. They then iteratively try new combinations based on the likelihood of a successful outcome.

Advantages

  • More efficient than grid or random searches for finding the best hyperparameter settings

  • Able to develop more accurate and tailored models based on known information for multiple hyperparameter types

Disadvantages

  • Can lack the transparency of other hyperparameter optimization methods

  • Requires an understanding of prior probabilities to design the optimization function

  • May lack interpretability and consistency compared to other methods
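Real Bayesian optimization fits a probabilistic surrogate model (often a Gaussian process) over past trials; libraries such as Optuna or scikit-optimize implement this properly. The toy sketch below mimics only the iterative propose-evaluate-update loop described above, biasing new trials toward the best result seen so far. The objective function, search range, and exploration probability are all hypothetical.

```python
import random

def objective(lr):
    """Hypothetical validation score for a learning rate; peaks at lr = 0.1."""
    return -(lr - 0.1) ** 2

def tune(n_trials=100, seed=0):
    rng = random.Random(seed)
    best_lr = rng.uniform(0.001, 1.0)        # start from a random guess
    best_score = objective(best_lr)
    for _ in range(n_trials):
        # Propose: usually exploit near the current best, sometimes explore.
        # (A real Bayesian optimizer would pick the point its surrogate model
        # rates as most promising instead of this simple heuristic.)
        if rng.random() < 0.7:
            candidate = min(1.0, max(0.001, rng.gauss(best_lr, 0.05)))
        else:
            candidate = rng.uniform(0.001, 1.0)
        score = objective(candidate)
        if score > best_score:               # update the running best
            best_lr, best_score = candidate, score
    return best_lr

print(tune())  # a learning rate near the peak of the objective
```

The key idea the sketch preserves is that each trial is informed by the results of earlier trials, rather than being chosen blindly as in grid or random search.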

Grid search

Grid search is a formulaic type of hyperparameter tuning. As you might guess, it involves picking out a grid of hyperparameter values and evaluating each one to find the best fit. This method is fully automated and considered to be an exhaustive search. 

Advantages

  • Simple to implement

  • Easy to expand model hyperparameters beyond your original test values 

  • Allows for more control over your desired hyperparameter combinations

  • Ideal for smaller data sets

Disadvantages

  • Specifying hyperparameter values requires guesswork

  • Computational time requirements are high compared to other methods

  • May be less effective for high-dimensional data
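A grid search can be sketched in a few lines of plain Python. The validation-score function and the grid values below are hypothetical stand-ins for a real training-and-evaluation run; the point is that every combination in the grid gets evaluated.

```python
from itertools import product

def validation_score(lr, batch_size):
    """Hypothetical score for a hyperparameter pair; best at lr=0.1, batch_size=32."""
    return -(lr - 0.1) ** 2 - ((batch_size - 32) / 100) ** 2

grid = {
    "lr": [0.001, 0.01, 0.1, 1.0],
    "batch_size": [16, 32, 64],
}

# Exhaustively evaluate every combination in the grid: 4 x 3 = 12 runs.
results = [
    ((lr, bs), validation_score(lr, bs))
    for lr, bs in product(grid["lr"], grid["batch_size"])
]
best_params, best_score = max(results, key=lambda r: r[1])
print(best_params)  # (0.1, 32)
```

Note how the cost grows multiplicatively: adding a third hyperparameter with five candidate values would raise the run count from 12 to 60, which is why grid search becomes expensive for high-dimensional searches.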

Random search

Random search is loosely based on grid search. However, it selects hyperparameter values randomly during each training iteration rather than going through every single combination. 

Advantages

  • Works well with a small number of hyperparameters

  • Requires fewer training iterations than grid search

  • Balances performance with computational power

Disadvantages

  • May not perform as well with sensitive machine learning models

  • Noisy data can lead to reduced performance with this method compared to grid search

  • The random combinations might not include the actual best combination
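Random search reuses the same evaluate-and-compare idea as grid search but samples combinations instead of enumerating them. The sketch below uses the same hypothetical validation-score function as before and draws only ten random trials, far fewer than a full grid over the same ranges would require.

```python
import random

def validation_score(lr, batch_size):
    """Hypothetical score for a hyperparameter pair; best at lr=0.1, batch_size=32."""
    return -(lr - 0.1) ** 2 - ((batch_size - 32) / 100) ** 2

rng = random.Random(42)  # fixed seed so the sampled trials are reproducible

# Draw 10 random hyperparameter combinations instead of enumerating a grid.
trials = [
    (rng.uniform(0.001, 1.0), rng.choice([16, 32, 64, 128]))
    for _ in range(10)
]
best_params = max(trials, key=lambda p: validation_score(*p))
print(best_params)
```

Because the draws are random, the exact best combination may never be sampled, which is the trade-off the disadvantages above describe: you save evaluations at the risk of missing the true optimum.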

Who uses hyperparameter tuning?

Hyperparameter tuning is an important part of building machine learning models, meaning professionals involved in model building are likely to use hyperparameter tuning techniques. Professionals in this area include roles such as data scientists, researchers, and machine learning engineers.

As a machine learning engineer, your role will vary depending on the company you work for and the exact nature of your position, but you’ll likely work in model design and optimization. You might use artificial intelligence and machine learning techniques to analyze and interpret complex input data. You would then use different techniques, such as hyperparameter tuning, to evaluate your design and refine it as needed to create the model with the best performance.

How to learn more about hyperparameter tuning

To learn more about hyperparameters, building a foundation in machine learning can help you understand how to design and optimize your models more effectively. You can do this through degree programs, online courses, boot camps, and online guided projects. To start, consider exploring machine learning basics such as supervised, unsupervised, semi-supervised, and reinforcement learning. 

Once you’ve mastered the basics, you can start learning about model design and common algorithms such as neural networks, regression analyses, clustering, decision trees, and random forests. Each comes with its own applications and advantages, and the way you optimize each one will be different. Following this, you can explore different training and optimization techniques, including hyperparameter tuning optimization methods. Over time, you’ll find the methods that work best for your applications and professional field.

Explore more about machine learning on Coursera.

Hyperparameters allow you to design “settings” within your machine learning model that optimize how your algorithm learns and performs. To explore more about machine learning models and how to design ones that fit your needs, consider taking an online course or Specialization on Coursera. To begin, consider the Machine Learning Specialization by Stanford and DeepLearning.AI, which introduces you to basic algorithms and model design techniques to help you build effective algorithms.


