Quiz 12: Secondary Storage Management

Computing

Modifications in the directory contents

The table showing the directory contents is given below:
[img]

The figure showing the storage allocations is given below:
[img]

The modified table, which adds the new file and allocates its storage units, is given below:
[img]

The modified figure, which shows the storage allocations for the added file (units and pointers), is given below:
[img]
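The kind of update this answer describes can be sketched in Python. The sketch assumes a linked allocation scheme in which the directory entry records a file's first storage unit and each unit holds a pointer to the next. The file names, unit numbers, and helper names (`add_file`, `units_of`) are hypothetical and are not taken from the original tables or figures.

```python
# Hypothetical sketch: a directory maps file names to their first storage
# unit, and the allocation list holds, for each unit, a pointer to the next.
END = -1     # pointer value marking the last unit of a file
FREE = None  # marks a unit that is not allocated to any file


def add_file(directory, allocation, name, units_needed):
    """Add a directory entry and chain free units together with pointers."""
    if name in directory:
        raise ValueError(f"{name!r} already exists")
    if units_needed < 1:
        raise ValueError("a file needs at least one storage unit")
    free_units = [u for u, nxt in enumerate(allocation) if nxt is FREE]
    if len(free_units) < units_needed:
        raise RuntimeError("not enough free storage units")
    chosen = free_units[:units_needed]
    # Point every chosen unit at the next one; the last one gets END.
    for current, following in zip(chosen, chosen[1:]):
        allocation[current] = following
    allocation[chosen[-1]] = END
    directory[name] = {"first_unit": chosen[0], "units": units_needed}
    return chosen


def units_of(directory, allocation, name):
    """Follow the pointers from the directory entry to list a file's units."""
    unit = directory[name]["first_unit"]
    chain = []
    while unit != END:
        chain.append(unit)
        unit = allocation[unit]
    return chain


# Made-up starting state: 8 units, one existing file stored in 0 -> 2 -> 3.
allocation = [2, FREE, 3, END, FREE, FREE, FREE, FREE]
directory = {"report.txt": {"first_unit": 0, "units": 3}}

new_units = add_file(directory, allocation, "notes.txt", 3)
print(new_units)                                     # units given to the new file
print(units_of(directory, allocation, "notes.txt"))  # same units, via pointers
```

Adding a file changes two structures at once: the directory gains one entry, and the allocation list gains one pointer chain, which corresponds to the modified table and the modified figure in the answer.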

Google BigTable

BigTable was designed by Google to scale to very large amounts of data across thousands of servers. It is a high-performance distributed storage system for managing large amounts of data. BigTable is used in various Google projects and products, such as Google Analytics, Google Finance, Personalized Search, Orkut, and Google Earth. A BigTable cluster stores a number of tables; each table consists of a set of tablets, and each tablet contains all the data associated with a range of rows.

Implementation of BigTable

The implementation of BigTable has three components:
• Library - It is linked into every client.
• Master server - It assigns tablets to tablet servers, detects the addition and expiration of tablet servers, balances tablet-server load, and garbage-collects files in GFS.
• Tablet servers - They can be added to or removed from a cluster to handle changes in workload. Each tablet server handles read and write requests to the tablets it has loaded and splits tablets that have grown too large.

Process of Implementation

The process of implementation of BigTable is given below:
• Tablet Location - Tablet location information is stored in a three-level hierarchy. The first level is a file, stored in Chubby (Google's distributed lock service), that contains the location of the root tablet. The second level is the root tablet, which contains the locations of all tablets of the METADATA table; it is the first tablet of that table and is never split. The third level is the remaining METADATA tablets, each of which contains the locations of a set of user tablets. The client library caches the tablet locations it has looked up.
• Tablet Assignment - Each tablet is assigned to one tablet server at a time. The master server keeps track of assigned and unassigned tablets. The master assigns a tablet to a tablet server by sending it a tablet load request.
• Tablet Serving - Read and write operations on a tablet server are performed after proper authorization. Authorization is performed by reading the list of permitted writers and readers from a Chubby file.
• Compactions - There are two types of compaction: minor compaction and major compaction. A minor compaction writes the in-memory state of a tablet out as a new SSTable (the immutable file format in which BigTable stores tablet data in GFS). It reduces the memory usage of the tablet server and the amount of data that has to be read during recovery. A major compaction is a merging compaction that rewrites all SSTables into exactly one SSTable. The SSTable produced by a major compaction contains no deletion information.

Application services of BigTable

The application services of BigTable are given below:
• Locality Group - Clients can group multiple column families together into a locality group. In every tablet, a separate SSTable is generated for each locality group. Separating column families that are not accessed together enables more efficient reads.
• Compression - Clients can choose the compression format of the SSTables for a locality group, and each SSTable block is compressed separately. A two-pass custom compression scheme is also available: the first pass compresses long common strings across a large window, and the second pass uses a fast compression algorithm that looks for repetitions in a small window.
• Caching - Tablet servers use two levels of caching to improve read performance. The higher-level cache is the Scan Cache, which caches the key-value pairs returned by the SSTable interface. It is useful for applications that repeatedly read the same data. The lower-level cache is the Block Cache, which caches SSTable blocks that were read from GFS. It is useful for applications that tend to read data close to the data they recently read, such as sequential reads.
• Speeding up tablet recovery - When the master moves a tablet from one tablet server to another, the source tablet server first does a minor compaction on that tablet. This reduces recovery time because it reduces the amount of uncompacted state that the new tablet server has to recover.
• Exploiting immutability - All SSTables generated by BigTable are immutable, so the problem of permanently removing deleted data is transformed into garbage collecting obsolete SSTables. Each tablet's SSTables are registered in the METADATA table, and the master removes obsolete SSTables by garbage collection.

Applications that use BigTable benefit from high availability and scalability for large amounts of data. Products built on BigTable get a flexible, high-performance storage solution.
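The compaction behavior described above can be illustrated with a small Python model. This is a simplified sketch, not BigTable's implementation: the `Tablet` class, the `memtable_limit` threshold, and the `TOMBSTONE` marker are assumptions made for illustration, and plain read-only dictionaries stand in for the immutable on-disk tables (SSTables in Google's design).

```python
from types import MappingProxyType

TOMBSTONE = object()  # hypothetical marker recording that a key was deleted


class Tablet:
    """Toy tablet: one in-memory buffer plus a list of immutable tables."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}                 # recent writes, held in memory
        self.tables = []                   # immutable tables, oldest first
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.minor_compaction()

    def delete(self, key):
        # A delete is recorded as a write of a deletion marker.
        self.write(key, TOMBSTONE)

    def minor_compaction(self):
        """Freeze the in-memory buffer into a new immutable table."""
        if self.memtable:
            self.tables.append(MappingProxyType(dict(self.memtable)))
            self.memtable = {}

    def major_compaction(self):
        """Merge every table into exactly one and drop deletion markers."""
        self.minor_compaction()
        merged = {}
        for table in self.tables:          # oldest first, so newer values win
            merged.update(table)
        live = {k: v for k, v in merged.items() if v is not TOMBSTONE}
        self.tables = [MappingProxyType(live)]

    def read(self, key):
        # Check the newest data first: memory, then tables newest to oldest.
        for source in [self.memtable, *reversed(self.tables)]:
            if key in source:
                value = source[key]
                return None if value is TOMBSTONE else value
        return None


tablet = Tablet(memtable_limit=2)
tablet.write("a", 1)
tablet.write("b", 2)      # buffer is full: a minor compaction creates a table
tablet.delete("a")
tablet.write("c", 3)      # second minor compaction creates another table
tables_before = len(tablet.tables)
tablet.major_compaction()
print(tables_before, len(tablet.tables), dict(tablet.tables[0]))
```

In this model each minor compaction adds one table, so reads have more tables to consult over time. The major compaction collapses them into one table and is the only step that physically discards deleted data.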

File management system and its layers

The group of system software that manages secondary storage and provides access to it is called the file management system (FMS). The layers of the file management system are given below:
• Command layer
• File control layer
• Storage input/output (I/O) control layer

The command layer provides the services used to manage applications, files, and directories. The file control layer provides the services for manipulating files and directories. Storage I/O control is the part of the kernel that manages the storage devices and the movement of data to and from them.
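The layering can be pictured with a short Python sketch in which each layer calls only the layer directly below it. All class names, method names, commands, and the block size are hypothetical; an in-memory list stands in for a storage device.

```python
class StorageIOControl:
    """Lowest layer: moves fixed-size blocks to and from a simulated device."""

    def __init__(self, block_count=16, block_size=8):
        self.block_size = block_size
        self.blocks = [b""] * block_count

    def read_block(self, number):
        return self.blocks[number]

    def write_block(self, number, data):
        if len(data) > self.block_size:
            raise ValueError("data does not fit in one block")
        self.blocks[number] = data


class FileControl:
    """Middle layer: maps file names to blocks and keeps the directory."""

    def __init__(self, io):
        self.io = io
        self.directory = {}                      # name -> list of block numbers
        self.free = list(range(len(io.blocks)))  # unallocated block numbers

    def create(self, name, data):
        if name in self.directory:
            raise FileExistsError(name)
        size = self.io.block_size
        chunks = [data[i:i + size] for i in range(0, len(data), size)]
        if len(chunks) > len(self.free):
            raise OSError("not enough free blocks")
        used = []
        for chunk in chunks:
            block = self.free.pop(0)
            self.io.write_block(block, chunk)
            used.append(block)
        self.directory[name] = used

    def read(self, name):
        return b"".join(self.io.read_block(b) for b in self.directory[name])

    def delete(self, name):
        self.free.extend(self.directory.pop(name))
        self.free.sort()

    def list_names(self):
        return sorted(self.directory)


class CommandLayer:
    """Top layer: turns text commands into file control service calls."""

    def __init__(self, files):
        self.files = files

    def run(self, line):
        command, *args = line.split(maxsplit=2)
        if command == "write":
            name, text = args
            self.files.create(name, text.encode())
            return "ok"
        if command == "cat":
            return self.files.read(args[0]).decode()
        if command == "ls":
            return " ".join(self.files.list_names())
        if command == "rm":
            self.files.delete(args[0])
            return "ok"
        raise ValueError(f"unknown command: {command}")


shell = CommandLayer(FileControl(StorageIOControl()))
shell.run("write notes.txt secondary storage")
print(shell.run("cat notes.txt"))
print(shell.run("ls"))
```

The command layer never touches blocks and the storage I/O layer never sees file names, which is the separation of responsibilities the three layers are meant to provide.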

There is no answer for this question