In this case, a really simple way to do black-box optimization is random search, which explores high-dimensional spaces faster than grid search.
With random search, every trial samples a fresh value along every dimension, whereas grid search keeps revisiting the same fixed set of values per dimension.
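Here's a minimal sketch of the idea in Python; `train_and_evaluate` is just a dummy stand-in for your own training script, and the parameter names and ranges are made up for illustration:

```python
import random

# Placeholder objective: replace this dummy with your own training run
# that returns a validation score for a given hyperparameter setting.
def train_and_evaluate(learning_rate, dropout):
    return -((learning_rate - 0.01) ** 2) - (dropout - 0.2) ** 2

def random_search(n_trials=50, seed=0):
    rng = random.Random(seed)
    best_score, best_params = float("-inf"), None
    for _ in range(n_trials):
        # Every trial draws a new value on every dimension,
        # unlike grid search, which reuses a fixed grid per dimension.
        params = {
            "learning_rate": 10 ** rng.uniform(-5, -1),  # log-uniform draw
            "dropout": rng.uniform(0.0, 0.5),
        }
        score = train_and_evaluate(**params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score

print(random_search())
```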
Bayesian optimization comes with solid theoretical guarantees, and implementations like Spearmint make it easy to wrap any script you already have.
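If you'd rather not set up Spearmint, here's a rough sketch of the same idea using the open-source scikit-optimize library instead; the objective below is a placeholder for your real validation loss, and the search space is invented for the example:

```python
from skopt import gp_minimize
from skopt.space import Real

# Placeholder objective: gp_minimize minimizes, so return a loss
# (e.g. validation error) from your real training script here.
def objective(params):
    learning_rate, dropout = params
    return (learning_rate - 0.01) ** 2 + (dropout - 0.2) ** 2

search_space = [
    Real(1e-5, 1e-1, prior="log-uniform", name="learning_rate"),
    Real(0.0, 0.5, name="dropout"),
]

# A Gaussian-process surrogate models the objective and chooses each
# next trial by trading off exploration against exploitation.
result = gp_minimize(objective, search_space, n_calls=20, random_state=0)
print("best params:", result.x, "best loss:", result.fun)
```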
Hyperband, a novel bandit-based approach to hyperparameter optimization, converges faster than naive Bayesian optimization because it can run different networks for different numbers of iterations, which Bayesian optimization doesn't support out of the box. While it is possible to do better with a Bayesian optimization algorithm that takes this into account, such as FABOLAS, in practice Hyperband is so simple that you're probably better off using it and adjusting the search space at intervals as you watch it run.
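Hyperband works by running several brackets of successive halving at different levels of aggressiveness; the sketch below shows a single successive-halving round, its core building block, not the full algorithm. `train_for` is a stand-in for training a configuration for a given number of iterations:

```python
import math
import random

# Dummy stand-in: train a config for num_iters iterations and return
# its validation loss. Replace with your real training loop.
def train_for(config, num_iters):
    return abs(config["learning_rate"] - 0.01) / math.log(num_iters + 2)

def successive_halving(configs, min_iters=1, eta=3):
    budget = min_iters
    while len(configs) > 1:
        # Evaluate every surviving config at the current budget.
        scored = [(train_for(c, budget), c) for c in configs]
        scored.sort(key=lambda pair: pair[0])
        # Keep the best 1/eta fraction and give them eta times the budget,
        # so cheap short runs weed out bad configs early.
        configs = [c for _, c in scored[: max(1, len(configs) // eta)]]
        budget *= eta
    return configs[0]

rng = random.Random(0)
candidates = [{"learning_rate": 10 ** rng.uniform(-5, -1)} for _ in range(27)]
print(successive_halving(candidates))
```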
Hope this answer helps you!