Which type of numeric value does a logistic regression model estimate?

- A. Probability
- B. A p-value
- C. Any integer
- D. Any real number
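As a minimal sketch of what such a model computes, assuming a single input feature and hypothetical coefficients `b0` and `b1` (illustrative values, not taken from the exam material), the logistic function squeezes any real-valued linear score into the open interval (0, 1):

```python
import math

def logistic(score: float) -> float:
    """Map a real-valued linear score to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -1.5, 0.8

def predict_probability(x: float) -> float:
    """Estimated probability of the positive class for feature value x."""
    return logistic(b0 + b1 * x)

# The linear score b0 + b1 * x can be any real number,
# but the model's output always stays inside (0, 1).
probabilities = [predict_probability(x) for x in (-10.0, 0.0, 2.0, 10.0)]
```

The linear score itself is unbounded; only after the logistic transform is the value bounded, which is why it is read as a probability.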

# Category: E20-007

Data Science Associate Exam

## Which query interface would you recommend?

Your colleague, who is new to Hadoop, approaches you with a question. They want to know how best to access their data. This colleague has a strong background in data flow languages and programming. Which query interface would you recommend?

- A. Pig
- B. Hive
- C. Howl
- D. HBase

## Which tool should they use?

The web analytics team uses Hadoop to process access logs. They now want to correlate this data with structured user data residing in a production single-instance JDBC database. They collaborate with the production team to import the data into Hadoop. Which tool should they use?

- A. Sqoop
- B. Pig
- C. Chukwa
- D. Scribe

## What does the R code z <- f[1:10, ] do?

What does the R code `z <- f[1:10, ]` do?

- A. Assigns the first 10 rows of f to the vector z
- B. Assigns the 1st 10 columns of the 1st row of f to z
- C. Assigns a sequence of values from 1 to 10 to z
- D. Assigns the 1st 10 columns to z

## In R, functions like plot() and hist() are known as what?

In R, functions like plot() and hist() are known as what?

- A. generic functions
- B. virtual methods
- C. virtual functions
- D. generic methods

## Which combination of subtotals do you expect to be returned by the query?

Review the following code:

`SELECT pn, vn, sum(prc*qty) FROM sale GROUP BY CUBE(pn, vn) ORDER BY 1, 2, 3;`

Which combination of subtotals do you expect to be returned by the query?

- A. (pn, vn)
- B. ( (pn, vn), (pn) )
- C. ( (pn, vn), (pn), (vn) )
- D. ( (pn, vn), (pn), (vn), ( […]
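In standard SQL, `CUBE` over n columns groups by every subset of those columns, giving 2^n grouping sets, with the empty set producing the grand total. The sketch below only enumerates those sets for the two columns in the query; it does not run SQL, and the helper name `cube_grouping_sets` is hypothetical:

```python
from itertools import combinations

def cube_grouping_sets(columns):
    """Return every subset of `columns`, largest first, as CUBE would group by."""
    sets = []
    for size in range(len(columns), -1, -1):
        sets.extend(combinations(columns, size))
    return sets

grouping_sets = cube_grouping_sets(["pn", "vn"])
# [('pn', 'vn'), ('pn',), ('vn',), ()]
# The empty tuple stands for the grand total over all rows.
```

By contrast, `ROLLUP(pn, vn)` would produce only the prefixes (pn, vn), (pn), and the grand total.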

## In MADlib what does MAD stand for?

In MADlib what does MAD stand for?

- A. Magnetic, Agile, Deep
- B. Machine Learning, Algorithms for Databases
- C. Mathematical Algorithms for Databases
- D. Modular, Accurate, Dependable

## Which tool should they use to export the structured data from Hadoop?

The web analytics team uses Hadoop to process access logs. They now want to correlate this data with structured user data residing in their massively parallel database. Which tool should they use to export the structured data from Hadoop?

- A. Sqoop
- B. Pig
- C. Chukwa
- D. Scribe

## When would you prefer a Naive Bayes model to a logistic regression model for classification?

When would you prefer a Naive Bayes model to a logistic regression model for classification?

- A. When you are using several categorical input variables with over 1000 possible values each.
- B. When you need to estimate the probability of an outcome, not just which class it is in.
- C. When all the input variables are […]

## Before you build an ARMA model, how can you tell if your time series is weakly stationary?

Before you build an ARMA model, how can you tell if your time series is weakly stationary?

- A. There appears to be a constant variance around a constant mean.
- B. The mean of the series is close to 0.
- C. The series is normally distributed.
- D. There appears to be no apparent trend component.
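Weak stationarity requires a constant mean, a constant variance, and an autocovariance that depends only on the lag. A rough, informal way to eyeball the first two conditions is to compare the mean and variance across successive windows of the series. The sketch below uses synthetic data and a hypothetical helper `window_stats`; it is an illustration of that idea, not a formal stationarity test:

```python
import random
import statistics

def window_stats(series, n_windows=4):
    """Split the series into equal windows and return (mean, variance) per window."""
    size = len(series) // n_windows
    windows = [series[i * size:(i + 1) * size] for i in range(n_windows)]
    return [(statistics.fmean(w), statistics.pvariance(w)) for w in windows]

def mean_spread(stats):
    """Difference between the largest and smallest window mean."""
    means = [m for m, _ in stats]
    return max(means) - min(means)

random.seed(0)
# Synthetic white noise: mean and variance should look similar in every window.
noise = [random.gauss(0.0, 1.0) for _ in range(400)]
# Same noise with a linear trend added: the window means drift upward.
trended = [0.05 * t + e for t, e in enumerate(noise)]

noise_stats = window_stats(noise)
trend_stats = window_stats(trended)
```

Window means that drift apart, as in `trend_stats`, suggest the series should be differenced or detrended before an ARMA model is fitted.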