A media company has been performing analytics on log data generated by its applications. There has been a recent increase in the number of concurrent analytics jobs running, and the overall performance of existing jobs is decreasing as the number of new jobs is increasing. The partitioned data is stored in Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA) and the analytic processing is performed on Amazon EMR clusters using the EMR File System (EMRFS) with consistent view enabled. A data analyst has determined that it is taking longer for the EMR task nodes to list objects in Amazon S3. Which action would MOST likely increase the performance of accessing log data in Amazon S3?
A) Use a hash function to create a random string and add that to the beginning of the object prefixes when storing the log data in Amazon S3.
B) Use a lifecycle policy to change the S3 storage class to S3 Standard for the log data.
C) Increase the read capacity units (RCUs) for the shared Amazon DynamoDB table.
D) Redeploy the EMR clusters that are running slowly to a different Availability Zone.
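For readers unfamiliar with the technique named in option A, the following is a minimal sketch of what "adding a hash-derived string to the beginning of the object prefix" could look like. It only illustrates the mechanics of that option and does not indicate which option is correct. The function name `hashed_key`, the 4-character prefix length, and the sample key are assumptions made for illustration; none of them come from the question.

```python
import hashlib


def hashed_key(key: str, prefix_len: int = 4) -> str:
    """Return the key with a short hash-derived string prepended.

    The prefix is the first `prefix_len` hex characters of the SHA-256
    digest of the original key, so the same key always maps to the same
    prefixed key (hypothetical naming scheme, for illustration only).
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{key}"


# Hypothetical partitioned log key; the layout is assumed, not from the question.
example = hashed_key("logs/year=2020/month=01/day=15/app.log")
print(example)
```

Because the prefix is derived from the key itself, a reader can recompute it to locate an object, but keys that share a partition path no longer share a common leading prefix.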
Correct Answer:
Verified