3 Months Free Update
You have used k-means clustering to classify the behavior of 100,000 customers for a retail store. You decide to use household income, age, gender, and yearly purchase amount as measures. You have chosen to use 8 clusters and notice that 2 clusters have only 3 customers assigned. What should you do?
Your company has run an online campaign to collect feedback on product quality, and you have all the responses to the product reviews. The response form has check boxes as well as a text field. Responses in which the text field is left empty or contains non-dictionary words are not considered valid feedback; responses in which the text field contains proper English words are considered valid. Which of the following methods should you not use to identify whether a response is valid?
A fruit may be considered to be an apple if it is red, round, and about 3" in diameter. A naive Bayes classifier considers each of these features to contribute independently to the probability that this fruit is an apple, regardless of the
Suppose you are given two random variables X and Y whose joint distribution is already known. The marginal distribution of X is simply the probability distribution of X averaged over information about Y; that is, the probability distribution of X when the value of Y is not known. How do you calculate the marginal distribution of X?
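For discrete variables, the definition above amounts to summing the joint probability over every value of Y. The following is a minimal sketch of that sum; the joint table and the helper name `marginal_x` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical joint distribution P(X=x, Y=y); the numbers are illustrative only.
joint = {
    ("x1", "y1"): 0.10,
    ("x1", "y2"): 0.30,
    ("x2", "y1"): 0.20,
    ("x2", "y2"): 0.40,
}


def marginal_x(joint_table):
    """Return P(X=x) by summing P(X=x, Y=y) over all values of Y."""
    totals = defaultdict(float)
    for (x, _y), probability in joint_table.items():
        totals[x] += probability
    return dict(totals)


marginal = marginal_x(joint)
print(marginal)
```

For continuous variables the sum would become an integral over y, which this sketch does not cover.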
Your customer provided you with 2,000 unlabeled records and asked you to separate them into three groups. What is the correct analytical method to use?
You are using a classification approach in which the agent is taught not by giving explicit categorizations but by using some sort of reward system to indicate success: agents might be rewarded for certain actions and punished for others. What kind of learning is this?
Which of the following is true with regard to the K-means clustering algorithm?
Refer to the exhibit.
You are using K-means clustering to classify customer behavior for a large retailer. You need to determine the optimum number of customer groups. You plot the within-cluster sum of squares (WSS) as shown in the exhibit. How many customer groups should you specify?
Google AdWords studies the number of men and women who click an advertisement on the search engine during one hour around midnight each day. Google finds that the number of men who click can be modeled as a random variable with distribution Poisson(X), and likewise the number of women who click as Poisson(Y).
What is likely to be the best model of the total number of advertisement clicks during that midnight hour?
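A standard result is relevant here: if the two counts are independent (an assumption the question does not state explicitly), the sum of a Poisson(X) and a Poisson(Y) variable is itself Poisson with rate X + Y. The sketch below checks this numerically by convolving the two probability mass functions; the rates 3.0 and 5.0 are hypothetical stand-ins for X and Y.

```python
import math


def poisson_pmf(k, rate):
    """P(N = k) for N ~ Poisson(rate)."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def pmf_of_sum(n, rate_a, rate_b):
    """P(A + B = n) for independent Poisson variables A and B, by direct convolution."""
    return sum(
        poisson_pmf(k, rate_a) * poisson_pmf(n - k, rate_b)
        for k in range(n + 1)
    )


# Hypothetical hourly click rates for men and women.
rate_men, rate_women = 3.0, 5.0

# Largest gap between the convolved pmf and Poisson(rate_men + rate_women).
max_gap = max(
    abs(pmf_of_sum(n, rate_men, rate_women) - poisson_pmf(n, rate_men + rate_women))
    for n in range(30)
)
print(max_gap)
```

The gap is at the level of floating-point rounding, consistent with the total being Poisson with the summed rate.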
Refer to the exhibit.
You are building a decision tree. In this exhibit, four variables are listed with their respective values of info-gain.
Based on this information, on which attribute would you expect the next split to be in the decision tree?
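The exhibit's values are not reproduced here, so the following is only a sketch of how information gain is computed and compared. The rows, attribute names, and target column are hypothetical. In greedy tree builders such as ID3, the next split is normally placed on the attribute with the highest gain.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def info_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical training rows; "buys" is the class label.
rows = [
    {"age": "young", "income": "low", "buys": "no"},
    {"age": "young", "income": "high", "buys": "no"},
    {"age": "old", "income": "low", "buys": "yes"},
    {"age": "old", "income": "high", "buys": "yes"},
]

gains = {name: info_gain(rows, name, "buys") for name in ("age", "income")}
best = max(gains, key=gains.get)
print(gains, best)
```

Here "age" separates the classes perfectly while "income" carries no information about the label, so the split would go on "age".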