Your data contains a lot of categorical features. Most machine learning models cannot train on text labels directly, so you need to convert the categorical data to numbers first.
To solve this problem, you can use one-hot encoding, which turns each category into its own 0/1 column so that words become numbers.
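To illustrate the idea, here is a minimal pure-Python sketch of one-hot encoding a single column. The function name `one_hot_encode` is hypothetical and used only for illustration; it is not part of any library.

```python
def one_hot_encode(values):
    """Return (categories, rows); each row has a single 1 marking its category."""
    # Sort the unique values so the column order is deterministic.
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}

    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # switch on the column for this value's category
        rows.append(row)
    return categories, rows


labels, one_hot_rows = one_hot_encode(["Male", "Female", "Female"])
print(labels)        # ['Female', 'Male']
print(one_hot_rows)  # [[0, 1], [1, 0], [1, 0]]
```

Each distinct value gets its own column, so no artificial ordering is implied between categories.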
scikit-learn provides an efficient implementation of one-hot encoding.
OneHotEncoder:
Encodes categorical features as a one-hot numeric array. By default, the encoder derives the categories from the unique values in each feature.
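>>> from sklearn.preprocessing import OneHotEncoder
>>> enc = OneHotEncoder(handle_unknown='ignore')
>>> X = [['Male', 1], ['Female', 3], ['Female', 2]]
>>> enc.fit(X).transform([['Female', 1], ['Male', 4]]).toarray()
array([[1., 0., 1., 0., 0.],
       [0., 1., 0., 0., 0.]])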
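As a rough sketch of how you might inspect the result on the same toy data, the script below encodes `X`, lists the categories the encoder learned, and decodes the array back. The variable names (`encoded`, `learned`, `decoded`) are my own, and it assumes a reasonably recent scikit-learn is installed.

```python
from sklearn.preprocessing import OneHotEncoder

X = [['Male', 1], ['Female', 3], ['Female', 2]]

# handle_unknown='ignore' encodes unseen categories as all zeros
# at transform time instead of raising an error.
enc = OneHotEncoder(handle_unknown='ignore')

# fit_transform returns a sparse matrix by default; toarray() makes it dense.
encoded = enc.fit_transform(X).toarray()

# One sorted list of categories per input column.
learned = [list(column_categories) for column_categories in enc.categories_]

# inverse_transform maps the one-hot rows back to the original values.
decoded = enc.inverse_transform(encoded)

print(learned)
print(encoded)
print(decoded)
```

The first two output columns correspond to 'Female' and 'Male', and the last three to the values 1, 2 and 3.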
Hope this answer helps.