Deck 3: Apache Hadoop Distributed File System (HDFS), Cloud Storage, and Database Instances
1
A file in HDFS that is smaller than a single block size
A)cannot be stored in HDFS
B)occupies the full block's size.
C)can span over multiple blocks
D)occupies only the size it needs and not the full block
Answer: D)occupies only the size it needs and not the full block
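The storage behavior in this card can be checked with a little arithmetic. The sketch below is plain Python, not an HDFS API. The function name `block_layout` and the 128 MiB block size are assumptions chosen for illustration (block size is configurable in a real cluster).

```python
MIB = 1024 * 1024
BLOCK_SIZE = 128 * MIB  # assumed block size; configurable in a real cluster


def block_layout(file_size, block_size=BLOCK_SIZE):
    """Return the number of bytes stored in each block of a file.

    Every block except possibly the last is full. The last block holds
    only the remaining bytes, so a small file does not consume a whole
    block's worth of disk space.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    layout = [block_size] * full_blocks
    if remainder:
        layout.append(remainder)
    return layout


small = block_layout(1 * MIB)     # a single block holding just 1 MiB
large = block_layout(300 * MIB)   # two full blocks plus a 44 MiB tail
print(len(small), sum(small) // MIB)
print(len(large), [b // MIB for b in large])
```

In this model the bytes stored always sum to the file size, which is the point of option D: the block size is an upper bound per block, not a fixed allocation.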
2
Which of the following is a duty of the NameNode?
A)manage file system namespace
B)it is responsible for storing actual data
C)perform read-write operation as per request for the clients
D)none of the above
Answer: A)manage file system namespace
3
If the IP address or hostname of a data node changes
A)the namenode updates the mapping between file name and block name
B)the data in that data node is lost forever
C)the namenode need not update the mapping between file name and block name
D)the namenode has to be restarted
Answer: C)the namenode need not update the mapping between file name and block name
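One way to see why the file-to-block mapping is unaffected is to model the two kinds of metadata separately. The sketch below is a toy in-memory model, not the HDFS API. The class `ToyNameNode`, its method names, and the block and host names are all hypothetical.

```python
class ToyNameNode:
    """Hypothetical in-memory model of namenode metadata (not the HDFS API)."""

    def __init__(self):
        self.file_to_blocks = {}   # file name -> ordered list of block ids
        self.block_to_nodes = {}   # block id -> set of datanode addresses

    def add_file(self, name, placements):
        """Register a file; placements maps each block id to its datanodes."""
        self.file_to_blocks[name] = list(placements)
        for block_id, nodes in placements.items():
            self.block_to_nodes[block_id] = set(nodes)

    def rename_datanode(self, old, new):
        """Record that a datanode is now reachable under a new address."""
        for nodes in self.block_to_nodes.values():
            if old in nodes:
                nodes.remove(old)
                nodes.add(new)

    def locate(self, name):
        """Return (block id, sorted datanode addresses) for each block of a file."""
        return [(b, sorted(self.block_to_nodes[b])) for b in self.file_to_blocks[name]]


nn = ToyNameNode()
nn.add_file("/logs/a.txt", {"blk_1": ["dn1", "dn2"], "blk_2": ["dn2", "dn3"]})
before = {name: list(blocks) for name, blocks in nn.file_to_blocks.items()}
nn.rename_datanode("dn2", "dn2-new")
print(nn.file_to_blocks == before)   # the file-to-block mapping is untouched
print(nn.locate("/logs/a.txt"))
```

Only the block-to-location table changes when a datanode's address changes. The list of blocks that make up each file stays exactly as it was.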
4
For frequently accessed HDFS files, the blocks are cached in
A)the memory of the data node
B)the memory of the namenode
C)both of the above
D)none of the above
5
Which scenario demands the highest bandwidth for data transfer between nodes?
A)different nodes on the same rack
B)nodes on different racks in the same data center.
C)nodes in different data centers
D)data on the same node
6
When a client contacts the namenode for accessing a file, the namenode responds with
A)size of the file requested
B)block id and hostname of all the data nodes containing that block
C)block id of the file requested
D)all of the above
7
In HDFS the files cannot be
A)read
B)deleted
C)executed
D)archived
8
Which of the following is a duty of the DataNodes?
A)manage file system namespace
B)stores metadata
C)regulates clients' access to files
D)perform read-write operation as per request for the clients
9
NameNode and DataNode communicate using
A)active pulse
B)heartbeats
C)h-signal
D)data pulse
10
Amazon EC2 provides virtual computing environments, known as:
A)chunks
B)instances
C)messages
D)none of the mentioned
11
Data stored in ___________ domains doesn't require maintenance of a schema.
A)SimpleDB
B)SQL Server
C)Oracle
D)RDS
12
Which of the following is a system for creating block-level storage devices that can be used for Amazon Machine Instances in EC2?
A)CloudWatch
B)Amazon Elastic Block Store
C)AWS Import/Export
D)all of the mentioned
13
Which of the following types of virtualization is also characteristic of cloud computing?
A)storage
B)application
C)CPU
D)all of the mentioned
14
Which of the following is the most important feature of cloud storage?
A)logon authentication
B)bare file
C)multiplatform support
D)adequate bandwidth
15
Which of the following is an open cloud storage management standard by SNIA?
A)CDMI
B)OCCI
C)CEA
D)adequate bandwidth
16
Which of the following systems does not provision storage to most users?
A)PaaS
B)IaaS
C)CaaS
D)SaaS
17
Which of the following services is provided by Google for online storage?
A)Drive
B)SkyDrive
C)Dropbox
D)all of the mentioned
18
Which of the following backups creates a cloned copy of your current data or drive?
A)continuous data protection
B)open file backup
C)reverse delta backup
D)none of the mentioned
19
Which of the following storage devices exposes its storage to clients as raw storage that can be partitioned to create volumes?
A)block
B)file
C)disk
D)all of the mentioned
20
Which of the following should be used considering factors shown in the figure?
A)SimpleDB
B)RDS
C)Amazon EC2
D)all of the mentioned
21
Point out the wrong statement.
A)the metrics obtained by CloudWatch may be used to enable a feature called Auto Scaling
B)a number of tools are used to support EC2 services
C)Amazon Machine Instances are sized at various levels and rented on a computing/hour basis
D)none of the mentioned
22
Which of the following is an edge-storage or content-delivery system that caches data in different physical locations?
A)Amazon Relational Database Service
B)Amazon SimpleDB
C)Amazon CloudFront
D)Amazon Associates Web Services
23
Which of the following allows you to create instances of the MySQL database to support your Web sites?
A)Amazon Elastic Compute Cloud
B)Amazon Simple Queue Service
C)Amazon Relational Database Service
D)Amazon Simple Storage System
24
Which of the following allows you to create instances of the MySQL database to support your Web sites?
A)Amazon Elastic Compute Cloud
B)Amazon Simple Queue Service
C)Amazon Relational Database Service
D)Amazon Simple Storage System
25
Point out the correct statement.
A)Amazon Elastic Cloud is a system for creating virtual disks (volumes)
B)SimpleDB interoperates with both Amazon EC2 and Amazon S3
C)EC3 is an analytics as a service provider
D)none of the mentioned
26
Point out the correct statement.
A)with Atmos, you can create your own cloud storage system or leverage a public cloud service with Atmos Online
B)IBM is a major player in cloud computing, particularly for businesses
C)in managed storage, the storage service provider makes storage capacity available to users
D)all of the mentioned
27
Redundancy has to be implemented at the ___________ architectural level for effective results in cloud computing.
A)lower
B)higher
C)middle
D)all of the mentioned
28
Which of the following can manage data from CIFS and NFS file systems over HTTP networks?
A)StorageGRID
B)DataGrid
C)DiskGrid
D)all of the mentioned
29
Point out the wrong statement.
A)AWS S3 essentially lets you create your own cloud storage
B)AWS created "availability zones" within regions, which are sets of systems that are isolated from one another
C)Amazon Web Services (AWS) adds redundancy to its IaaS systems by allowing EC2 virtual machine instances
D)none of the mentioned
30
A ___________ is a logical unit that serves as the target for storage operations, such as the SCSI protocol READs and WRITEs.
A)GETs
B)PUN
C)LUN
D)all of the mentioned
31
Which of the following use LUNs to define a storage volume that appears to a connected computer as a device?
A)SAN
B)iSCSI
C)Fibre Channel
D)all of the mentioned
32
Which of the following protocols is used for discovering and retrieving objects from a cloud?
A)OCCI
B)SMTP
C)HTTP
D)all of the mentioned
33
Which of the following disk operations is performed when a tenant is granted access to a virtual storage container?
A)CRUD
B)file system modifications
C)partitioning
D)all of the mentioned
34
Which of the following standards connects distributed hosts or tenants to their provisioned storage in the cloud?
A)CDMI
B)OCMI
C)COA
D)all of the mentioned
35
IBM and ___________ have announced a major initiative to use Hadoop to support university courses in distributed computer programming.
A)Google Latitude
B)Android (operating system)
C)Google Variations
D)Google
36
Point out the correct statement.
A)Hadoop is an ideal environment for extracting and transforming small volumes of data
B)Hadoop stores data in HDFS and supports data compression/decompression
C)the Giraph framework is less useful than a MapReduce job to solve graph and machine learning problems
D)none of the mentioned
37
What license is Hadoop distributed under?
A)Apache License 2.0
B)Mozilla Public License
C)shareware
D)commercial
38
Sun also has the Hadoop Live CD ___________ project, which allows running a fully functional Hadoop cluster using a live CD.
A)OpenOffice.org
B)OpenSolaris
C)GNU
D)Linux
39
Hadoop achieves reliability by replicating the data across multiple hosts and hence does not require ___________ storage on hosts.
A)RAID
B)standard RAID levels
C)ZFS
D)operating system
40
What was Hadoop written in?
A)Java (software platform)
B)Perl
C)Java (programming language)
D)Lua (programming language)
41
The Hadoop list includes the HBase database, the Apache Mahout ___________ system, and matrix operations.
A)machine learning
B)pattern recognition
C)statistical classification
D)artificial intelligence
42
The Mapper implementation processes one line at a time via the ___________ method.
A)map
B)reduce
C)mapper
D)reducer
43
Point out the correct statement.
A)Mapper maps input key/value pairs to a set of intermediate key/value pairs
B)applications typically implement the Mapper and Reducer interfaces to provide the map and reduce methods
C)Mapper and Reducer interfaces form the core of the job
D)none of the mentioned
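The roles of the map and reduce methods named in this card can be sketched with a word count. This is a single-process illustration in plain Python, not Hadoop's Java `Mapper` and `Reducer` interfaces. The names `map_line`, `reduce_counts`, and `run_job` are hypothetical, and a line index stands in for the input key.

```python
from collections import defaultdict


def map_line(key, line):
    """Mapper: turn one input pair (key, line) into intermediate (word, 1) pairs."""
    for word in line.split():
        yield word.lower(), 1


def reduce_counts(word, counts):
    """Reducer: collapse all values seen for one key into a single output pair."""
    return word, sum(counts)


def run_job(lines):
    """Tiny single-process driver: map, group by key, then reduce."""
    grouped = defaultdict(list)
    for index, line in enumerate(lines):   # the line index stands in for the input key
        for key, value in map_line(index, line):
            grouped[key].append(value)
    return dict(reduce_counts(k, v) for k, v in sorted(grouped.items()))


result = run_job(["Hello Hadoop", "hello HDFS"])
print(result)
```

The mapper emits one intermediate pair per word and knows nothing about totals. The reducer sees every value for a single key and produces the final count.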
44
The Hadoop MapReduce framework spawns one map task for each ___________ generated by the InputFormat for the job.
A)OutputSplit
B)InputSplit
C)InputSplitStream
D)all of the mentioned
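The relationship between input splits and map tasks can be sketched as simple range arithmetic. This is an illustration under simplifying assumptions, not Hadoop's actual split computation, which applies further rules (for example, minimum and maximum split sizes and formats that cannot be split). The function name `compute_splits` is hypothetical.

```python
def compute_splits(file_size, split_size):
    """Return (offset, length) pairs covering a file of file_size bytes.

    In this simplified model the framework would launch one map task
    per returned pair.
    """
    if file_size < 0 or split_size <= 0:
        raise ValueError("file_size must be >= 0 and split_size must be > 0")
    splits = []
    offset = 0
    while offset < file_size:
        length = min(split_size, file_size - offset)
        splits.append((offset, length))
        offset += length
    return splits


splits = compute_splits(250, 100)
print(len(splits), splits)
```

Here a 250-byte input with a 100-byte split size yields three splits, so three map tasks would be spawned in this model.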
45
Users can control which keys (and hence records) go to which Reducer by implementing a custom ___________.
A)Partitioner
B)OutputSplit
C)Reporter
D)all of the mentioned
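How a partitioning function decides which reducer receives a key can be sketched in a few lines. This is plain Python, not Hadoop's Java `Partitioner` API. All function names are hypothetical, and `zlib.crc32` is an assumed stand-in for a stable key hash.

```python
import zlib


def hash_partition(key, num_reducers):
    """Default-style partitioner: a stable hash of the key modulo the reducer count."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def first_letter_partition(key, num_reducers):
    """Custom partitioner (hypothetical): route keys by their first character."""
    if not key:
        return 0
    return ord(key[0].lower()) % num_reducers


def partition_records(records, num_reducers, partitioner):
    """Place each (key, value) record in the bucket of the reducer that will receive it."""
    buckets = [[] for _ in range(num_reducers)]
    for key, value in records:
        buckets[partitioner(key, num_reducers)].append((key, value))
    return buckets


records = [("apple", 1), ("avocado", 1), ("banana", 1), ("apple", 1)]
buckets = partition_records(records, 2, first_letter_partition)
print(buckets)
```

Either function sends every record with the same key to the same reducer, and the number of buckets equals the number of reducers. Swapping the function changes only which keys end up together.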
46
Point out the wrong statement.
A)the Mapper outputs are sorted and then partitioned per Reducer
B)the total number of partitions is the same as the number of reduce tasks for the job
C)the intermediate, sorted outputs are always stored in a simple (key-len, key, value-len, value) format
D)none of the mentioned
47
Applications can use the __________ to report progress and set application-level status messages.
A)Partitioner
B)OutputSplit
C)Reporter
D)all of the mentioned
48
The right level of parallelism for maps seems to be around ___________ maps per node.
A)1-10
B)10-100
C)100-150
D)150-200
49
The number of reduces for the job is set by the user via ___________
A)JobConf.setNumTasks(int)
B)JobConf.setNumReduceTasks(int)
C)JobConf.setNumMapTasks(int)
D)all of the mentioned
50
The framework groups Reducer inputs by key in the ___________ stage.
A)sort
B)shuffle
C)reduce
D)none of the mentioned
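Grouping reducer inputs by key amounts to sorting the intermediate pairs and collecting runs of equal keys. The sketch below shows that idea in plain Python. It is not Hadoop code, and the function name `group_for_reduce` is hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def group_for_reduce(pairs):
    """Sort intermediate pairs by key, then yield (key, [values]) groups."""
    ordered = sorted(pairs, key=itemgetter(0))   # stable sort on the key only
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield key, [value for _, value in group]


pairs = [("b", 2), ("a", 1), ("b", 3), ("a", 4)]
print(list(group_for_reduce(pairs)))
```

Because different mappers may emit the same key, the pairs arrive interleaved. Sorting brings equal keys next to each other so that each reduce call receives one key with all of its values.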