A retail company wants to combine its customer orders with the product description data from its product catalog. The structure and format of the records in the two datasets are different. A data analyst tried to use a spreadsheet to combine the datasets, but the effort resulted in duplicate records and records that were not properly combined. The company needs a solution that it can use to combine similar records from the two datasets and remove any duplicates. Which solution will meet these requirements?
A) Use an AWS Lambda function to process the data. Use two arrays to compare equal strings in the fields from the two datasets and remove any duplicates.
B) Create AWS Glue crawlers for reading and populating the AWS Glue Data Catalog. Call the AWS Glue SearchTables API operation to perform a fuzzy-matching search on the two datasets, and cleanse the data accordingly.
C) Create AWS Glue crawlers for reading and populating the AWS Glue Data Catalog. Use the FindMatches transform to cleanse the data.
D) Create an AWS Lake Formation custom transform. Run a transformation for matching products from the Lake Formation console to cleanse the data automatically.
Correct Answer: C
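To illustrate why the exact string comparison in option A falls short when records differ in format, here is a minimal sketch that contrasts exact equality with similarity-based matching. It is not the AWS Glue FindMatches transform, which is a managed machine learning transform. It only uses Python's standard `difflib` module to show the general idea. The helper names, the sample records, and the 0.85 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def dedupe(records: list[str], threshold: float = 0.85) -> list[str]:
    """Keep the first record of each group of similar records.

    The threshold is an illustrative assumption, not a recommended value.
    """
    kept: list[str] = []
    for record in records:
        if not any(similarity(record, k) >= threshold for k in kept):
            kept.append(record)
    return kept


# Hypothetical product names as they might appear across two datasets.
orders = [
    "Acme Wireless Mouse",
    "ACME  wireless mouse",   # different casing and spacing
    "Acme Wireles Mouse",     # typo
    "Zenith USB-C Cable",
]

# Exact comparison treats all four strings as distinct records.
exact_unique = set(orders)

# Similarity-based comparison groups the three mouse variants together.
fuzzy_unique = dedupe(orders)
```

In this sketch, exact comparison keeps all four strings, while the similarity-based pass keeps one mouse record and the cable record. A pairwise scan like this scales quadratically, which is one reason a managed matching transform is attractive for large datasets.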