A Data Scientist is developing a machine learning model to predict future patient outcomes based on information collected about each patient and their treatment plans. The model should output a continuous value as its prediction. The data available includes labeled outcomes for a set of 4,000 patients. The study was conducted on a group of individuals over the age of 65 who have a particular disease that is known to worsen with age. Initial models have performed poorly. While reviewing the underlying data, the Data Scientist notices that, out of 4,000 patient observations, there are 450 where the patient age has been input as 0. The other features for these observations appear normal compared to the rest of the sample population. How should the Data Scientist correct this issue?
A) Drop all records from the dataset where age has been set to 0.
B) Replace the age field value for records with a value of 0 with the mean or median value from the dataset.
C) Drop the age feature from the dataset and train the model using the rest of the features.
D) Use k-means clustering to handle missing features.
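Correct Answer: B
An age of 0 is impossible in a study restricted to patients over 65, so these 450 values are effectively missing. Dropping the records (A) discards about 11% of the labeled data even though their other features look normal, and dropping the feature (C) removes a predictor the scenario describes as clinically relevant. Imputing the mean or median age keeps both the records and the feature.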
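The imputation described in option B can be sketched as follows. This is a minimal illustration using pandas, not a reference solution: the column name `age`, the helper `impute_invalid_age`, and the sample values are all hypothetical. The sketch assumes the median should be computed from the valid ages only, since including the zeros would pull the statistic far below the study's age range.

```python
import pandas as pd


def impute_invalid_age(df, age_col="age", invalid_value=0):
    """Return a copy of df with invalid ages replaced by the median valid age.

    The column name and the helper itself are illustrative assumptions.
    """
    out = df.copy()
    ages = out[age_col].astype(float)

    # Flag the impossible entries (age recorded as 0).
    invalid = ages == invalid_value

    # Compute the median from valid rows only, so the zeros do not bias it.
    median_age = ages[~invalid].median()

    # Replace flagged entries with the median; leave all other rows untouched.
    out[age_col] = ages.mask(invalid, median_age)
    return out


# Hypothetical sample data: two of six patients have age entered as 0.
patients = pd.DataFrame(
    {
        "age": [70, 0, 82, 66, 0, 75],
        "outcome": [1.2, 0.8, 2.5, 0.9, 1.7, 1.4],
    }
)

cleaned = impute_invalid_age(patients)
print(cleaned)
```

In a real pipeline the median would normally be computed on the training split only and then reused for validation and test data, to avoid leaking information across splits.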