in AI and Deep Learning by (28.1k points)

I am new to Artificial Intelligence. I understand the K nearest neighbor (KNN) algorithm and how to implement it. However, how do you calculate the distance or weight of things that aren't on a scale?

For example, the distance between two ages can be easily calculated, but how do you calculate how near red is to blue? Maybe colors are a bad example because you could still use the frequency of the light. How about a burger, pizza, and fries, for example?

I got a feeling there's a clever way to do this.

Thank you in advance for your kind attention.

Can I do it this way? Let's say I am using my KNN algorithm to predict whether a person will eat at my restaurant, which serves all three of the above foods. Of course, there are other factors, but to keep it simple, take the favorite food field: out of 300 people, 150 love burgers, 100 love pizza, and 50 love fries. Common sense tells me that favorite food affects people's decision on whether to eat there or not.

So now a person enters his/her favorite food, say burger, and I want to predict whether he/she is going to eat at my restaurant. Ignoring other factors, and based on my previous (training) knowledge base, common sense tells me that for this particular field, favorite food, the k nearest neighbors are more likely to be close if he/she entered burger than if he/she entered pizza or fries.

The only problem with that is that I used probability, and I might be wrong because I don't know, and probably can't calculate, the actual distance. I also worry about this field putting too much or too little weight on my prediction, because the distance probably isn't on the same scale as the other factors (price, time of day, whether the restaurant is full, etc., which I can easily quantify), but I guess I might be able to get around that with some parameter tuning.

Oh, everyone put up a great answer, but I can only accept one. In that case, I'll just accept the one with the highest votes tomorrow. Thank you all once again.

1 Answer

by (57.5k points)

Represent each food you collect data on as a "dimension" (a column in a table). Then record the likes of every person you collected data from, and place the results in a table like the one below:

[Image: table of each person's food likes, one column per food]
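As a rough sketch of how such a table could be built, the snippet below turns each person's liked foods into a 0/1 vector with one entry per food. The names `likes`, `FOODS`, and `to_vector`, and the sample people, are made up for illustration and are not from the original table.

```python
# Hypothetical survey data: each person and the set of foods they like.
likes = {
    "alice": {"burger", "fries"},
    "bob": {"pizza"},
    "carol": {"burger", "pizza", "fries"},
}

# One column ("dimension") per food; the order is fixed so vectors line up.
FOODS = ["burger", "pizza", "fries"]


def to_vector(liked, foods=FOODS):
    """Return a 0/1 vector with one entry per food column."""
    return [1 if food in liked else 0 for food in foods]


# Each row of the table is one person's vector of likes.
table = {person: to_vector(liked) for person, liked in likes.items()}

for person, row in table.items():
    print(person, row)
```

With this encoding no food is "closer" to another by construction; any notion of nearness comes from how people's likes overlap.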

Now, given a new person and information about some of the foods he/she likes, you can measure that person's similarity to other people using a simple measure such as the Pearson correlation coefficient or cosine similarity.

Now you have a way to find K nearest neighbors and make some decisions.
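A minimal sketch of that idea, assuming 0/1 like-vectors with columns (burger, pizza, fries) and a yes/no label for whether each person ate at the restaurant. The training rows, the label, and the helper names `cosine_similarity` and `knn_predict` are invented for illustration; cosine similarity is used here, and Pearson correlation could be swapped in the same way.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(query, rows, k=3):
    """Majority vote among the k rows most similar to query.

    rows is a list of (vector, label) pairs.
    """
    ranked = sorted(
        rows,
        key=lambda row: cosine_similarity(query, row[0]),
        reverse=True,  # highest similarity first
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data: (burger, pizza, fries) likes, and whether the person ate here.
training = [
    ([1, 0, 1], True),
    ([1, 0, 0], True),
    ([0, 1, 0], False),
    ([0, 1, 1], False),
    ([1, 1, 0], True),
]

# A new person who only likes burgers.
prediction = knn_predict([1, 0, 0], training, k=3)
print(prediction)
```

Because cosine similarity always falls between -1 and 1 (0 and 1 for these non-negative vectors), it is bounded, but combining it with numeric fields such as price would still need some choice of scaling or weighting.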
