Managing Big Data in Clusters and Cloud Storage Quiz

Managing Big Data in Clusters and Cloud Storage is a course offered by Cloudera on Coursera (rated 4.7 stars, 188 ratings). This post collects the graded quiz questions and answers for the course, organized by week.

Week 1 Graded Quiz

1.
Question 1
Use the Table Browser in Hue to view the tables in the fun database. (See the environment installation instructions for how to log in.) Which of the following tables are in the fun database? Check all that apply.

1 point

  • card_rank
  • card_suit
  • crayons
  • customers
  • games
  • inventory
  • makers
  • orders
  • toys

===================================================

 

2.
Question 2
Use the Table Browser in Hue to view the columns in the employees table in the default database. Which of the following columns are in the employees table? Check all that apply.

1 point

  • city
  • country
  • empl_id
  • first_name
  • grade
  • last_name
  • makers
  • office_id
  • salary

===================================================

 

3.
Question 3
Which of the following is a way to show what tables are available in the fly database using the query editor (center panel of Hue)?

1 point

  • Select the fly database as the active database and then run the command DESCRIBE DATABASE;
  • Select the fly database as the active database and then run the command DESCRIBE TABLES;
  • Select the fly database as the active database and then run the command SHOW DATABASES;
  • Select the fly database as the active database and then run the command SHOW TABLES;
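
For reference, in both Hive and Impala the statement that lists tables is SHOW TABLES, run against the active database. A minimal sketch:

USE fly;
SHOW TABLES;        -- lists the tables in fly
SHOW TABLES IN fly; -- equivalent, without switching the active database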

===================================================

 

4.
Question 4
By default, what directory in HDFS would store the data for a table named artists in a database named music?

1 point

  • /user/hive/warehouse/music.db/artists
  • /user/hive/warehouse/music/artists
  • /music.db/artists
  • /user/hive/warehouse/artists
  • /music/artists

===================================================

 

5.
Question 5
By default, what directory in HDFS would store the data for a table named facilities in the default database?

1 point

  • /user/hive/warehouse/facilities
  • /user/hive/warehouse/default/facilities
  • /default.db/facilities
  • /user/hive/warehouse/default.db/facilities
  • /default/facilities
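
For reference, you can check any table's storage directory with DESCRIBE FORMATTED, which prints a Location field. Tables in the default database sit directly under the warehouse root; only non-default databases get a .db subdirectory. A sketch using the tables from these two questions (assuming they exist):

DESCRIBE FORMATTED music.artists; -- Location ends in /user/hive/warehouse/music.db/artists
DESCRIBE FORMATTED facilities;    -- Location ends in /user/hive/warehouse/facilities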

===================================================

 

6.
Question 6
What delimiter is used to separate the values in the lines of the text file containing the data in the orders table in the default database?

1 point

  • tab (\t)
  • comma (,)

===================================================

 

7.
Question 7
Which command will list the files in an HDFS directory to the terminal screen?

1 point

  • hdfs dfs ls /path/to/directory/
  • hdfs dfs get /path/to/directory/
  • hdfs dfs -ls /path/to/directory/
  • hdfs dfs -cat /path/to/directory/
  • hdfs dfs -get /path/to/directory/
  • hdfs dfs cat /path/to/directory/
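
For reference, hdfs dfs subcommands always take a leading hyphen; without it the command fails. A minimal sketch (the paths are only examples):

$ hdfs dfs -ls /user/hive/warehouse/              # lists a directory
$ hdfs dfs -cat /user/hive/warehouse/orders/data.txt  # prints a (hypothetical) file instead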

===================================================

 

8.
Question 8
Which is an advantage of using S3?

1 point

  • There are many instances of S3, each with its own resources and connections
  • Hive and Impala only work with S3
  • Scalability is easier with S3 (or other cloud storage) than on-premises storage
  • S3 enables data locality

===================================================

 

9.
Question 9
Which commands can you use to print the contents of a file on S3 to your terminal screen? Check all that apply.

1 point

  • aws s3 -pr
  • hdfs dfs cat
  • aws s3 cat
  • aws s3 pr
  • hdfs dfs -cat
  • aws s3 -cat
  • hdfs dfs -pr
  • hdfs dfs pr
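
For reference, hdfs dfs -cat also works with s3a:// paths on a cluster configured with the S3 connector, while the aws s3 command set (cp, ls, mv, rm, sync, and so on) has no cat or pr subcommand. Sketches with a hypothetical bucket and file:

$ hdfs dfs -cat s3a://example-bucket/data/file.csv
$ aws s3 cp s3://example-bucket/data/file.csv -   # AWS CLI equivalent: cp to stdout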

 

Week 2 Graded Quiz

1.
Question 1
Which statement creates a database named mydatabase?

1 point

  • CREATE DATABASE mydatabase LOCATION '/user/mydatabase.db';
  • CREATE NEW DATABASE mydatabase;
  • CREATE NEW DATABASE mydatabase LOCATION '/user/mydatabase.db';
  • CREATE DATABASE mydatabase;
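
For reference, CREATE DATABASE takes the new name directly (there is no NEW keyword), and LOCATION is an optional clause rather than a required one. A minimal sketch; the second name and path are hypothetical:

CREATE DATABASE mydatabase;
CREATE DATABASE IF NOT EXISTS otherdb LOCATION '/user/otherdb.db';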

===================================================

 

2.
Question 2
A new table is created using the following statement. The database used is in the default storage location in the Hive warehouse. Which statements describe the expected outcomes of this statement? Check all that apply.

CREATE TABLE thisdb.thistable (id TINYINT, name STRING);

1 point

  • The table is configured to store data in a directory named thistable
  • The table is configured to store data in /user/hive/warehouse/thistable
  • The table’s storage directory is a subdirectory of /user/hive/warehouse/thisdb.db
  • The table’s name is thisdb
  • The table is in the database thisdb
  • The table has four columns called id, TINYINT, name, and STRING
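
For reference, in (id TINYINT, name STRING) each name-type pair defines one column, so this table has two columns, id and name. You can confirm the schema and storage directory after running the statement:

DESCRIBE thisdb.thistable;           -- shows the two columns and their types
DESCRIBE FORMATTED thisdb.thistable; -- Location ends in /user/hive/warehouse/thisdb.db/thistable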

===================================================

 

3.
Question 3
A data file specifies a song (such as “Bohemian Rhapsody”) on an album (in this case A Night At The Opera) by an artist or group (Queen). The file uses the pipe character (|) to separate values, so the example would look like this row:

Bohemian Rhapsody|A Night At The Opera|Queen

Which statement is appropriate to define a table using data in this format?

1 point

  • CREATE TABLE songs (song STRING, album STRING, artist STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\|';
  • CREATE TABLE songs (song STRING, album STRING, artist STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY |;
  • CREATE TABLE songs (song STRING, album STRING, artist STRING) ROW FORMAT DELIMITED BY '\|';
  • CREATE TABLE songs (song STRING, album STRING, artist STRING) ROW FORMAT DELIMITED BY |;
  • CREATE TABLE songs (song STRING, album STRING, artist STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|';
  • CREATE TABLE songs (song STRING, album STRING, artist STRING) ROW FORMAT DELIMITED BY '|';
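
For reference, the delimiter is declared with the two-part clause ROW FORMAT DELIMITED followed by FIELDS TERMINATED BY, and the delimiter character goes in single quotes; a plain | needs no escaping there. The canonical shape is:

CREATE TABLE songs (song STRING, album STRING, artist STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|';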

 

===================================================

 

4.
Question 4
The data for a table to be called weblogs is provided in Parquet files (with format PARQUET) placed in S3 in a directory named weblogs in the bucket named training-coursera1. Which statement correctly creates this table? (Assume the column list is correct.)

1 point

  • CREATE EXTERNAL TABLE weblogs (…) FILE FORMAT PARQUET LOCATION 's3a://training-coursera1/weblogs/';
  • CREATE EXTERNAL TABLE weblogs (…) STORED AS PARQUET LOCATION 's3a://training-coursera1/weblogs/';
  • CREATE EXTERNAL TABLE weblogs (…) FILE FORMAT PARQUET STORED AT 's3a://training-coursera1/weblogs/';
  • CREATE EXTERNAL TABLE weblogs (…) ROW FORMAT PARQUET LOCATION 's3a://training-coursera1/weblogs/';
  • CREATE EXTERNAL TABLE weblogs (…) ROW FORMAT DELIMITED PARQUET STORED AT 's3a://training-coursera1/weblogs/';
  • CREATE EXTERNAL TABLE weblogs (…) STORED AS PARQUET AT 's3a://training-coursera1/weblogs/';
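
For reference, a file format is declared with STORED AS and the storage directory with LOCATION; FILE FORMAT and STORED AT are not Hive/Impala keywords. A sketch with a hypothetical column list:

CREATE EXTERNAL TABLE weblogs (id INT, url STRING) -- column list is a placeholder
STORED AS PARQUET
LOCATION 's3a://training-coursera1/weblogs/';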

 

===================================================

 

5.
Question 5
Which statement correctly uses a SerDe for reading and writing newtable’s data?

1 point

  • CREATE TABLE newtable (col1 STRING, col2 INT) ROW FORMAT SERDE org.apache.hadoop.hive.serde2.OpenCSVSerde;
  • CREATE TABLE newtable (col1 STRING, col2 INT) WITH SERDE org.apache.hadoop.hive.serde2.OpenCSVSerde;
  • CREATE TABLE newtable (col1 STRING, col2 INT) STORED AS SERDE org.apache.hadoop.hive.serde2.OpenCSVSerde;
  • CREATE TABLE newtable (col1 STRING, col2 INT) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde';
  • CREATE TABLE newtable (col1 STRING, col2 INT) WITH SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde';
  • CREATE TABLE newtable (col1 STRING, col2 INT) STORED AS SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde';
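
For reference, a SerDe is named with ROW FORMAT SERDE, and the fully qualified Java class name must be given as a quoted string. A minimal sketch:

CREATE TABLE newtable (col1 STRING, col2 INT)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde';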

 

===================================================

 

6.
Question 6
An alternative to using CREATE EXTERNAL TABLE when creating an externally managed table is to set the table property EXTERNAL to TRUE. Which of the following would correctly do this?

1 point

  • CREATE TABLE table_with_header (col1 INT, col2 STRING) TBLPROPERTIES ('EXTERNAL'='TRUE');
  • CREATE TABLE table_with_header (col1 INT, col2 STRING) TBLPROPERTIES 'EXTERNAL'='TRUE';
  • CREATE TABLE table_with_header (col1 INT, col2 STRING) SET TABLEPROPERTIES ('EXTERNAL'='TRUE');
  • CREATE TABLE table_with_header (col1 INT, col2 STRING) TABLEPROPERTIES ('EXTERNAL'='TRUE');
  • CREATE TABLE table_with_header (col1 INT, col2 STRING) SET TBLPROPERTIES ('EXTERNAL'='TRUE');
  • CREATE TABLE table_with_header (col1 INT, col2 STRING) SET TABLEPROPERTIES 'EXTERNAL'='TRUE';
  • CREATE TABLE table_with_header (song INT, col2 STRING) TABLEPROPERTIES 'EXTERNAL'='TRUE';
  • CREATE TABLE table_with_header (col1 INT, col2 STRING) SET TBLPROPERTIES 'EXTERNAL'='TRUE';
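
For reference, table properties are attached with TBLPROPERTIES followed by a parenthesized list of quoted key=value pairs; there is no SET keyword inside CREATE TABLE and no TABLEPROPERTIES spelling. The valid shape is:

CREATE TABLE table_with_header (col1 INT, col2 STRING)
TBLPROPERTIES ('EXTERNAL'='TRUE');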

 

===================================================

 

7.
Question 7
Which commands are valid ways to change existing table schemas using Apache Impala? Check all that apply.

1 point

  • ALTER TABLE investors DROP COLUMNS amount, share;
  • ALTER TABLE investors DROP COLUMN share;
  • ALTER TABLE investors CHANGE amount INT TO quantity BIGINT;
  • ALTER TABLE investors CHANGE amount TO quantity;
  • ALTER TABLE investors CHANGE amount quantity INT;
  • ALTER TABLE investors CHANGE amount quantity;
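
For reference, Impala renames or retypes a column with ALTER TABLE … CHANGE old_name new_name type (the type is required even when it stays the same) and removes columns one at a time with DROP COLUMN. Sketches assuming the investors table exists:

ALTER TABLE investors CHANGE amount quantity INT;
ALTER TABLE investors DROP COLUMN share;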

===================================================

 

8.
Question 8
Which statements describe the differences between dropping int_table, which was created using CREATE TABLE (with no later alterations), and ext_table, which was created using CREATE EXTERNAL TABLE (with no later alterations)?

1 point

  • Dropping int_table might delete the directory in which its data is stored, but dropping ext_table will not.
  • Dropping int_table will not delete the directory in which its data is stored, but dropping ext_table might delete the directory for its table.
  • Dropping int_table will not delete the data for int_table, but dropping ext_table might drop the data for that table.
  • Dropping int_table might delete the data for the table, but dropping ext_table will not drop the data for that table.

===================================================

 

9.
Question 9
For which of the following is Impala a better choice than Hive? Check all that apply.

1 point

  • Altering the order of columns using FIRST or AFTER
  • Querying a table created using a SerDe
  • Running queries on tables that are available in both engines and for which you want a fast response
  • Creating a table using a SerDe
  • Dropping a column using the ALTER TABLE … DROP COLUMN command

===================================================

 

10.
Question 10
Suppose you have been querying a table named mytable using Impala. You then added data to mytable using an hdfs dfs command, and you want to query the table in Impala again, with the new data. What is your best course of action?

1 point

  • Run REFRESH mytable; then query as usual
  • Run REFRESH; then query as usual
  • Run INVALIDATE METADATA; then query as usual
  • Query as usual, no other action needed
  • Run REFRESH METADATA mytable; then query as usual
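
For reference, Impala's REFRESH takes a table name and reloads that table's file listing, which is much cheaper than INVALIDATE METADATA (which discards cached metadata wholesale). After adding files outside of Impala:

REFRESH mytable;
SELECT * FROM mytable; -- subsequent queries see the new data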

 

Week 3 Graded Quiz

1.
Question 1
Which of these data types allows the greatest precision?

1 point

  • DOUBLE
  • FLOAT
  • DECIMAL

===================================================

 

2.
Question 2
Which of the following are character string data types used by Hive and Impala?

1 point

  • TEXT
  • CHAR
  • VARCHAR
  • CHARACTER
  • STRING

===================================================

 

3.
Question 3
Which of the following are numeric data types?

1 point

  • BIGINT
  • STRING
  • BINARY
  • TEXT
  • DOUBLE

===================================================

 

4.
Question 4
Which of these are good advice for choosing data types, assuming you want flexibility with both Hive and Impala?

1 point

  • Use the DATE type rather than TIMESTAMP
  • Choose the smallest integer type that accommodates the required range
  • Use the DOUBLE type for currency or other exact quantities
  • Use the STRING type rather than CHAR or VARCHAR, for general use

===================================================

 

5.
Question 5
Which statements about the typeof function are true?

1 point

  • You can use it with Impala
  • It only works when a column reference is used
  • You can use it with expressions as well as column references
  • You can use it with Hive
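
For reference, typeof is an Impala built-in function (Hive does not provide it) and it accepts arbitrary expressions, not only column references. For example:

SELECT typeof(2 + 1.5);  -- reports the result type of an expression
SELECT typeof('hello');  -- works on literals too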

===================================================

 

6.
Question 6
You have a column of integer values that you expect to range from -10 to 127. Using Impala, what’s the smallest integer data type you can use that allows you to identify which values are out of this range? The ranges of the integer data types are provided below as a reminder.

Integer Type   Range
TINYINT        -128 to 127
SMALLINT       -32,768 to 32,767
INT            -2,147,483,648 to 2,147,483,647 (approximately 2.1 billion)
1 point

  • TINYINT
  • SMALLINT
  • INT

===================================================

 

7.
Question 7
Which are advantages of using Parquet files?

1 point

  • Good interoperability (can be used by many applications)
  • Columnar format (which improves performance for some access patterns)
  • Choice of different delimiters to separate column values
  • Easily read by humans

===================================================

 

8.
Question 8
Why is interoperability an important consideration when choosing a file type for storing data?

1 point

  • You might need to use different tools to access the same data
  • Your data might grow significantly over time, requiring a shift in your storage options
  • You might sometimes need to load all the data at once, but other times only need some of the data

===================================================

 

9.
Question 9
Which are important considerations when choosing a file type for storing data?

1 point

  • On-premises or cloud storage
  • Ingest pattern
  • Data lifetime
  • Names of the columns in the data
  • Data size and query performance

 

Week 4 Graded Quiz

1.
Question 1
Which of these are ways to load data into HDFS using Hue on the VM? Check all that apply.

1 point

  • Issue a -put command in Hue’s query editor
  • Use the Table Browser to load data into the storage directory of an existing table.
  • Load data into any HDFS directory through the data source assist panel on the left side
  • Use the Table Browser to load data into the storage directory of a new table at the time when you create the new table
  • Use the File Browser to load data into any HDFS directory
  • Drag and drop files onto the Hue application

===================================================

 

2.
Question 2
Which of these tasks can you complete using HDFS shell commands? Check all that apply.

1 point

  • Load data into the storage directory of an existing table
  • Load data into the storage directory for a table that hasn’t been created yet
  • Create a new table and load data into its storage directory

===================================================

 

3.
Question 3
Which hdfs dfs command option can be used to upload a file from your local filesystem to HDFS?

1 point

  • -cp
  • -get
  • -ls
  • -mv
  • -put
  • -rm
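
For reference, -put copies a local file into HDFS and -get copies in the opposite direction. A sketch with hypothetical paths:

$ hdfs dfs -put /home/training/games.csv /user/training/data/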

===================================================

 

4.
Question 4
On the course VM, which command could you use to upload the local file /home/training/training_materials/analyst/data/games.csv to an S3 bucket named bucket-o-games? For purposes of this question, assume you have write access to this bucket.

1 point

  • aws s3 put /home/training/training_materials/analyst/data/games.csv s3://bucket-o-games/
  • aws s3 cp /home/training/training_materials/analyst/data/games.csv s3a://bucket-o-games/
  • aws s3 put /home/training/training_materials/analyst/data/games.csv s3a://bucket-o-games/
  • aws s3 cp /home/training/training_materials/analyst/data/games.csv s3://bucket-o-games/
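
For reference, the AWS CLI copies files with cp (it has no put subcommand) and expects s3:// URIs; the s3a:// scheme belongs to Hadoop's S3 connector, not the AWS CLI. So an upload takes this shape:

$ aws s3 cp /home/training/training_materials/analyst/data/games.csv s3://bucket-o-games/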

===================================================

 

5.
Question 5
What is the effect of including TBLPROPERTIES ('serialization.null.format' = '') in a CREATE TABLE statement?

1 point

  • Whitespace characters inside character strings in the table’s data files will not appear in query results
  • NULL or missing values in the table’s data files will be enclosed in single quotes in query results
  • Empty strings in the table’s data files will appear as NULL in query results
  • NULL or missing values in the table’s data files will appear as empty strings in query results
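
For reference, serialization.null.format tells the SerDe which string in the data files stands for NULL; setting it to the empty string makes empty fields come back as NULL in query results. A sketch with a hypothetical table:

CREATE TABLE example (col1 STRING, col2 INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
TBLPROPERTIES ('serialization.null.format' = '');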

===================================================

 

6.
Question 6
This command uses Sqoop to import data from MySQL to HDFS on the VM (with the user training).

$ sqoop import --connect jdbc:mysql://localhost/mydb \
    --username training --password training \
    --table example_table \
    --target-dir /user/hive/warehouse/example_table

Where in HDFS will the data be found?

1 point

  • /user/hive/warehouse/example_table
  • /user/training/example_table
  • /user/hive/warehouse/mydb.db/example_table
  • /user/training/mydb.db/example_table

===================================================

 

7.
Question 7
Which are valid uses for Sqoop? Check all that apply.

1 point

  • Importing data to HDFS from an RDBMS (such as MySQL)
  • Downloading a file to HDFS from your local filesystem
  • Exporting data from HDFS to an RDBMS (such as MySQL)
  • Uploading a file to HDFS from your local filesystem
  • Importing data to HDFS from cloud storage (such as S3)
  • Exporting data from HDFS to cloud storage (such as S3)
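
For reference, Sqoop transfers data between HDFS and relational databases in both directions; it is not the tool for local-file uploads or S3 transfers. A hypothetical export back to MySQL, mirroring the import in Question 6:

$ sqoop export --connect jdbc:mysql://localhost/mydb \
    --username training --password training \
    --table example_table \
    --export-dir /user/hive/warehouse/example_table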

===================================================

 

8.
Question 8
Which are advantages of using the LOAD DATA SQL statement instead of the hdfs dfs -mv shell command to load data? Check all that apply.

1 point

  • If you use Impala to execute the LOAD DATA command, the metadata cache is automatically updated
  • LOAD DATA renames any files that are the same as files that already exist within the directory
  • hdfs dfs -mv cannot move files into the Hive warehouse
  • LOAD DATA works with S3, while hdfs dfs -mv does not
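
For reference, LOAD DATA moves files from an HDFS path into the table's storage directory, and when issued through Impala it updates the metadata cache as part of the statement. A sketch with a hypothetical source path:

LOAD DATA INPATH '/user/training/staging/mydata/' INTO TABLE mytable;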

===================================================

 

9.
Question 9
Why is it a poor practice to use INSERT commands to load data into a table a few rows at a time?

1 point

  • You actually can’t; INSERT will only allow adding one row at a time
  • You get many small files, which can slow big data systems down
  • INSERT overwrites existing data, so only the data from the last command will be in the table
  • The syntax varies too much between engines

===================================================

 

10.
Question 10
Which are valid ways to create a copy of an existing table? Check all that apply.

1 point
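
For reference, the standard ways Hive and Impala copy an existing table are CREATE TABLE … AS SELECT (schema plus data in one statement) and CREATE TABLE … LIKE followed by an INSERT (schema first, data second). Sketches with hypothetical table names:

CREATE TABLE sales_copy AS SELECT * FROM sales;
CREATE TABLE sales_copy2 LIKE sales;
INSERT INTO sales_copy2 SELECT * FROM sales;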
