Coursera Google Data Analytics Professional Certificate Course 4 – Process Data from Dirty to Clean: quiz answers to all weekly questions (weeks 1 – 6)
You may also be interested in Google Data Analytics Professional Certificate Course 1: Foundations – Cliffs Notes.
As you start thinking about how to prepare your data for exploration, this part of the course will highlight why data integrity is so essential to successful decision-making. You’ll learn about how data is generated and the techniques analysts use to decide what data to collect for analysis. And you’ll discover structured and unstructured data, data types, and data formats.
Which process do data analysts use to make data more organized and easier to read?
To make data more organized and easier to read, data analysts use data manipulation.
Fill in the blank: The degree to which data conforms to certain business rules or constraints determines the data’s _.
The degree to which data conforms to certain business rules or constraints determines the data’s validity.
Which of the following is an example of invalid data?
A mandatory value left blank is invalid because mandatory values must be filled in.
Fill in the blank: Data being used for analysis should align with _ and help answer stakeholder questions.
Data being used for analysis should align with business objectives and help answer stakeholder questions.
Before analysis, a company collects data from countries that use different date formats. Which of the following updates would improve the data integrity?
Changing all of the dates to the same format would improve the data integrity.
When should data analysts think about modifying a business objective? Select all that apply.
Data analysts should think about modifying a business objective when the data doesn’t align with the original objective and when there is not enough data to meet the objective.
What should an analyst do if they do not have the data needed to meet a business objective? Select all that apply.
If an analyst does not have the data needed to meet a business objective, they should gather related data on a small scale and request additional time. Then, they can find more complete data or perform the analysis by finding and using proxy data from other datasets.
Which of the following are limitations that might lead to insufficient data? Select all that apply.
Limitations that might lead to insufficient data include data that updates continually, outdated data, and data from a single source.
How can a data analyst eliminate sampling bias of a population for a study about the most popular ice cream flavors?
To eliminate sampling bias of a population for this study, a data analyst can use random sampling. Sampling on the basis of geographical location can still lead to sampling bias.
Question 4: A data analyst wants to find out how many people in Utah have swimming pools. It’s unlikely that they can survey every Utah resident. Instead, they survey enough people to be representative of the population. This describes what data analytics concept?
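As a rough illustration of the random sampling mentioned above, the sketch below draws a sample in which every member of the population has the same chance of being picked. The population list, the sample size of 50, and the helper name `draw_random_sample` are all hypothetical, chosen only for the example.

```python
import random


def draw_random_sample(population, sample_size, seed=None):
    """Return sample_size distinct members chosen uniformly at random."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, sample_size)


# Hypothetical population of 1,000 survey candidates.
residents = [f"resident_{i}" for i in range(1000)]
sample = draw_random_sample(residents, 50, seed=42)
print(len(sample))
```

Because `random.Random.sample` selects without replacement, no resident can appear in the sample twice.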
L5 Testing your data
Question 1: A research team runs an experiment to determine if a new security system is more effective than the previous version. What type of results are required for the experiment to be statistically significant?
Question 2: In order to have a high confidence level in a customer survey, what should the sample size accurately reflect?
In order to have a high confidence level in a customer survey, the sample size should accurately reflect the entire population.
Question 3: A data analyst determines an appropriate sample size for a survey. They can check their work by making sure the confidence level percentage plus the margin of error percentage add up to 100%.
L6 Consider the margin of error
Question 1: Fill in the blank: Margin of error is the _ amount that the sample results are expected to differ from those of the actual population.
Question 2: In a survey about a new cleaning product, 75% of respondents report they would buy the product again. The margin of error for the survey is 5%. Based on the margin of error, what percentage range reflects the population’s true response?
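The arithmetic behind the margin-of-error question can be sketched as follows: the range runs from the sample result minus the margin to the sample result plus the margin. The function name `margin_of_error_range` is made up for this example; the 75% and 5% figures come from the question.

```python
def margin_of_error_range(sample_pct, margin_pct):
    """Return the (low, high) range implied by a result and its margin of error."""
    return (sample_pct - margin_pct, sample_pct + margin_pct)


low, high = margin_of_error_range(75, 5)
print(f"{low}% to {high}%")  # prints "70% to 80%"
```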
Weekly challenge 1
Question 1: Which of the following conditions are necessary to ensure data integrity? Select all that apply.
Question 2: What is one potential problem associated with data manipulation that analysts must be aware of?
Question 3: A data analyst is given a dataset for analysis. It includes data about the total population of every country in the previous 20 years. Based on the available data, an analyst will be able to determine which country was the most populous from 2016 to 2017.
Question 4: A data analyst is given a dataset for analysis. June 2014 Invoices – Sheet1.csv Which of the following has duplicate data?
Question 5: A data analyst is working on a project about the global supply chain. They have a dataset with lots of relevant data from Europe and Asia. However, they decide to generate new data that represents all continents. What type of insufficient data does this scenario describe?
Question 6: A car manufacturer wants to learn more about the brand preferences of electric car owners. There are millions of electric car owners in the world. Who should the company survey?
Question 7: Fill in the blank: Sampling bias in data collection happens when a sample isn’t representative of _.
Question 8: Which of the following processes helps ensure a close alignment of data and business objectives?
Week 2: Sparkling-clean data
All about clean data
Every data analyst wants clean data to work with when performing an analysis. In this part of the course, you’ll learn the difference between clean and dirty data. You’ll also explore data cleaning techniques using spreadsheets and other tools.
Answers to week 2 quiz questions
L2 Recognize clean vs. dirty data
Question 1: Describe the difference between a null and a zero in a dataset.
Question 2: What are the most common processes and procedures handled by data engineers? Select all that apply.
Question 3: What are the most common processes and procedures handled by data warehousing specialists? Select all that apply.
Question 4: A data analyst is cleaning a dataset. They want to confirm that exactly three characters are present in each cell of a certain spreadsheet column. Which tool can they use?
L3 Data cleaning techniques
L4 Cleaning data in spreadsheets
Weekly challenge 2
Question 1: Which of the following terms describe dirty data? Select all that apply.
Question 2: Field length is a spreadsheet tool for determining if a field has been duplicated.
Question 3: A data analyst notices that the customer in row 2 shares the same Customer ID as the customer in row 6. What does this scenario describe?
Question 4: Fill in the blank: Conditional formatting is a spreadsheet tool that changes how _ appear when values meet a specific condition.
Question 5: A data analyst uses the SPLIT function to divide a text string around a specified character and put each fragment into a new, separate cell. What is the specified character separating each item called?
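To make the SPLIT behavior described above concrete, here is a minimal Python analogue, not the spreadsheet function itself. In Google Sheets the equivalent call has the form `=SPLIT(A2, ",")`; the helper name `split_cell` and the sample text are hypothetical, and stripping the surrounding spaces is an extra convenience added for the example.

```python
def split_cell(text, separator):
    """Divide text around the separator, one fragment per output 'cell'."""
    return [fragment.strip() for fragment in text.split(separator)]


cells = split_cell("red, blue, green", ",")
print(cells)  # prints ['red', 'blue', 'green']
```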
Question 6: For a function to work properly, data analysts must follow each function’s predetermined structure. What is this structure called?
Question 7: You are working with the following selection of a spreadsheet:
In order to extract the five-digit postal code from Burlington, MA, what is the correct function?
Question 8: A data analyst in a human resources department is working with the following selection of a spreadsheet:
They want to create employee identification numbers (IDs) in column D. The IDs should include the year hired plus the last four digits of the employee’s Social Security Number (SS#). What function will create the ID 20093208 for the employee in row 5?
Question 9: An analyst is cleaning a new dataset containing 500 rows. They want to make sure the data contained from cell B2 through cell B300 does not contain a number greater than 50. Which of the following COUNTIF function syntaxes could be used to answer this question? Select all that apply.
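One spreadsheet formula that performs this kind of check is `=COUNTIF(B2:B300,">50")`, which counts the cells in the range holding a value above 50. The same counting logic can be sketched in Python; the sample values and the helper name `count_if_greater` are invented for the illustration.

```python
def count_if_greater(values, threshold):
    """Count how many values are strictly greater than threshold."""
    return sum(1 for value in values if value > threshold)


# Hypothetical contents of a short column; 50 itself is not over the limit.
column_b = [12, 50, 51, 7, 88, 50]
over_limit = count_if_greater(column_b, 50)
print(over_limit)  # prints 2
```

A result of zero would mean that no cell in the range breaks the rule.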
Question 10: The V in VLOOKUP stands for what?
Question 11: Fill in the blank: Data mapping is the process of _ fields from one data source to another.
Question 12: Describe the relationship between a primary key and a foreign key.
Week 3: Cleaning data with SQL
Cleaning data in SQL
Knowing a variety of ways to clean data can make an analyst’s job much easier. In this part of the course, you’ll check out how to clean your data using SQL. You’ll explore queries and functions that you can use in SQL to clean and transform your data to get it ready for analysis.
Answers to week 3 quiz questions
L2 More about SQL
Question 1: Which of the following are benefits of using SQL? Select all that apply.
Question 2: Which of the following tasks can data analysts do using both spreadsheets and SQL? Select all that apply.
Question 3: SQL is a language used to communicate with databases. Like most languages, SQL has dialects. How should data analysts approach SQL dialects? Select all that apply.
L3 Learn basic SQL queries
Question 1: Which of the following SQL functions can data analysts use to clean string variables? Select all that apply.
Question 2: You are working with a database of information about middle school students. The student_data table contains the name and eight-digit identification (ID) number for each student. The first four digits of each ID number correspond to the student’s graduation year. For example, 20267482 indicates the student will graduate in 2026. The identification number is stored as a string in the id_number column. How do you complete this query to return the name of all students who will graduate in 2026?
SELECT name FROM student_data WHERE
The query is completed with SUBSTR(id_number, 1, 4) = '2026'. This function instructs the database to return four characters of each student ID, starting with the first character. Because id_number is stored as a string, the result is compared with the string '2026'. The query will only retrieve data about students who will graduate in 2026.
SELECT name FROM student_data WHERE SUBSTR(id_number, 1, 4) = '2026'
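The SUBSTR query above can be tried out with a small sketch that uses Python's built-in sqlite3 module as a stand-in database. The student names and two of the ID numbers are made up for the example (20267482 comes from the question), and SQLite is an assumption; the course's own database may behave slightly differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student_data (name TEXT, id_number TEXT)")
conn.executemany(
    "INSERT INTO student_data VALUES (?, ?)",
    [("Ana", "20267482"), ("Ben", "20251133"), ("Chloe", "20260019")],
)

# SUBSTR(id_number, 1, 4) returns four characters starting at position 1.
rows = conn.execute(
    "SELECT name FROM student_data "
    "WHERE SUBSTR(id_number, 1, 4) = '2026' ORDER BY name"
).fetchall()
graduating_2026 = [name for (name,) in rows]
print(graduating_2026)  # prints ['Ana', 'Chloe']
```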
Question 3: A data analyst wants to confirm that all of the text strings in a table are the correct length. How would they complete the following query to return any routes greater than 10 characters long?
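The query from this question is not reproduced here, so the following is only a sketch of the general technique: filtering on a length function in the WHERE clause. The table name `routes`, the column name `route`, and the sample rows are hypothetical, and SQLite (through Python's sqlite3 module) is used as a stand-in database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE routes (route TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO routes VALUES (?)",
    [("Riverfront",), ("Airport Express Loop",), ("Harbor Line",), ("Main St",)],
)

# LENGTH counts the characters in each string; keep only those over 10.
rows = conn.execute(
    "SELECT route FROM routes WHERE LENGTH(route) > 10 ORDER BY route"
).fetchall()
long_routes = [route for (route,) in rows]
print(long_routes)  # prints ['Airport Express Loop', 'Harbor Line']
```

"Riverfront" is exactly 10 characters long, so it is not returned.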
Weekly challenge 3
Question 1: Data analysts choose SQL for which of the following reasons? Select all that apply.
Question 2: In which of the following situations would a data analyst use spreadsheets instead of SQL? Select all that apply.
Question 3: A data analyst creates many new tables in their company’s database. When the project is complete, the analyst wants to remove the tables so they don’t clutter the database. What SQL commands can they use to delete the tables?
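A common way to remove a table is `DROP TABLE IF EXISTS`, which deletes the table when it is present and does nothing otherwise. The sketch below runs it against an in-memory SQLite database; the table name `temp_cleaning_results` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp_cleaning_results (id INTEGER)")

# The first statement removes the table; the second is a harmless no-op
# because IF EXISTS suppresses the "no such table" error.
conn.execute("DROP TABLE IF EXISTS temp_cleaning_results")
conn.execute("DROP TABLE IF EXISTS temp_cleaning_results")

remaining = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(remaining)  # prints []
```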
Question 4: A data analyst is cleaning customer data for an online retail company. They are working with the following section of a database:
The analyst wants to find out if the state data is consistent and if any text strings contain more than two characters. What is the correct SQL clause to use to find any text strings containing more than two characters?
Question 5: Fill in the blank: The _ function counts the number of characters a string contains.
Question 6: In SQL databases, what data type refers to a number that contains a decimal?
Question 7: Fill in the blank: In SQL databases, the _ function can be used to convert data from one datatype to another.
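As an illustration of converting data from one datatype to another, the sketch below uses CAST to turn prices stored as text into decimal numbers so that they sort numerically. The `purchases` table and its rows are hypothetical. SQLite is used as a stand-in, and its decimal type is called REAL; other SQL dialects name their floating-point type differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (item TEXT, price TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [("lamp", "19.99"), ("rug", "120.5"), ("vase", "8")],
)

# Sorted as text, '8' would come before '19.99'; CAST makes the order numeric.
rows = conn.execute(
    "SELECT item, CAST(price AS REAL) AS price_num "
    "FROM purchases ORDER BY price_num DESC"
).fetchall()
items_by_price = [item for item, _ in rows]
print(items_by_price)  # prints ['rug', 'lamp', 'vase']
```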
Question 8: Fill in the blank: The _ function can be used to return non-null values in a list.
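The standard SQL function that returns the first non-null value from a list of arguments is COALESCE. A minimal sketch, assuming a hypothetical `products` table in which some rows lack a product code:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product TEXT, product_code TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("Interior paint", "IP-100"), ("Paint roller", None)],
)

# COALESCE falls back to the product name when product_code is NULL.
rows = conn.execute(
    "SELECT COALESCE(product_code, product) AS label "
    "FROM products ORDER BY rowid"
).fetchall()
labels = [label for (label,) in rows]
print(labels)  # prints ['IP-100', 'Paint roller']
```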
Week 4: Verify and report on your cleaning results
Verify and report your cleaning results
Cleaning your data is an essential step in the data analysis process. Verifying and reporting your cleaning is a way to show that your data is ready for the next step. In this part of the course, you’ll find out the processes involved with verifying and reporting data cleaning as well as their benefits.
Answers to week 4 quiz questions
L2 Manually cleaning data
Question 1: Making sure data is properly verified is an important part of the data-cleaning process. Which of the following tasks are involved in this verification? Select all that apply.
Question 2: An analyst has just finished cleaning a dataset. Before analysis, why might the analyst want to revisit the business problem? Select all that apply.
Question 3: A data analyst is cleaning a dataset with inconsistent formats and repeated cases. They use the TRIM function to remove extra spaces from string variables. What other tools can they use for data cleaning? Select all that apply.
L3 Documenting cleaning results
Question 1: Fill in the blank: While cleaning data, documentation is used to track _. Select all that apply.
Question 2: Why is it important for a data analyst to document the evolution of a dataset? Select all that apply.
L4 Documenting the cleaning process
Question 1: Which of the following data errors can be eliminated by documenting the data-cleaning process? Select all that apply.
Question 2: Documenting data-cleaning makes it possible to achieve what goals? Select all that apply.
Weekly challenge 4
Question 1: The data collected for an analysis project has just been cleaned. What are the next steps for a data analyst? Select all that apply.
Question 2: A data analyst is in the verification step. They consider the business problem, the goal, and the data involved in their analytics project. What scenario does this describe?
Question 3: Which function removes leading, trailing, and repeated spaces in data?
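The behavior described in this question (dropping leading, trailing, and repeated spaces) is what the spreadsheet TRIM function does. The sketch below is a Python analogue of that behavior, not the spreadsheet function itself; the helper name `trim_spaces` and the sample strings are made up. Note that TRIM in many SQL dialects only strips the ends of a string and leaves repeated interior spaces alone.

```python
def trim_spaces(text):
    """Remove leading, trailing, and repeated space characters."""
    # Splitting on a single space yields empty strings wherever spaces
    # repeat; filtering those out and rejoining collapses the runs.
    return " ".join(part for part in text.split(" ") if part)


cleaned = trim_spaces("  Meer   Kitty  Interior ")
print(cleaned)  # prints "Meer Kitty Interior"
```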
Question 4: A data analyst uses the COUNTA function to count which of the following?
Question 5: A WHEN statement considers one or more conditions and returns a value as soon as that condition is met.
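In standard SQL, conditional logic of this kind is written as a CASE expression whose WHEN clauses are checked in order, with the first match supplying the value. A minimal sketch, assuming a hypothetical `survey` table and made-up score bands:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey (respondent TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO survey VALUES (?, ?)",
    [("r1", 9), ("r2", 6), ("r3", 2)],
)

# WHEN clauses are evaluated top to bottom; ELSE covers everything left over.
rows = conn.execute(
    "SELECT respondent, "
    "CASE WHEN score >= 8 THEN 'high' "
    "     WHEN score >= 5 THEN 'medium' "
    "     ELSE 'low' END AS band "
    "FROM survey ORDER BY respondent"
).fetchall()
print(rows)  # prints [('r1', 'high'), ('r2', 'medium'), ('r3', 'low')]
```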
Question 6: What is the process of tracking changes, additions, deletions, and errors during data cleaning?
Question 7: Fill in the blank: A changelog contains a _ list of modifications made to a project.
Question 8: Reviewing version history is an effective way to view a changelog in SQL.
Week 5: Optional: Adding data to your resume
Creating an effective resume will help you on your data analytics career path. In this part of the course, you’ll learn all about the job application process with a focus on crafting a resume that highlights your strengths and applicable experience. Even if you aren’t applying to jobs yet, it’s still a good time to improve your resume. It’s like spring training for a first season in a major league: you don’t want to miss it!
Career-building expertise on YouTube
How to build a compelling data science portfolio and resume: A hiring manager from Quora reviews actual resumes from data science candidates and gives candid feedback on areas of improvement. Learn what to include and omit from your resume and portfolio as well as formatting tips. This offers a great firsthand look into what hiring managers are seeking when reviewing your resume and portfolio.
Portfolio and resume analysis with data science hiring managers: We put together a panel of hiring managers to discuss what they are seeking in candidates and how they examine different resumes submitted by job seekers like you. Learn from the mistakes of others and get ahead of the curve by adapting your resume/portfolio to avoid the noted mistakes and capitalize on what others have done well in their resumes.
Overview of the Data Science Interview Process: Hiring managers at Google discuss typical data science interviews, including the soft and hard skills you will want to prioritize. You will get a better sense of the interview process from both sides, and better prepare yourself for what to expect when interviewing for a data science role.
Live Breakdown of Common Data Science Interview Questions: Watch a mock interview to see how a Kaggle data scientist answers questions during a data science interview. The video also includes live coding! This video is great preparation for some of the most commonly asked data science interview questions.
Am I a Good Fit? Identifying Your Best Data Science Job Opportunities: Ever wonder where you will fit in for your future career? This chat with Jessica Kirkpatrick, an intelligence manager, gives you a great breakdown of the different types of categories within the data science job market, the different types of job opportunities you may notice, and how you can frame previous work and skills from another career to fit into the data science job market.
Real Stories from a Panel of Successful Career Switchers: Are you switching careers? Awesome! Learn from people who were in the same position as you and successfully switched their careers into data science. This panel discusses the different experiences in their careers and life that shifted them into the data science field.
Week 6: Course challenge
Prepare for the course challenge by reviewing terms and definitions in the glossary. Then, demonstrate your knowledge of the importance of sample size, data integrity, and the connection of data to business objectives during the quiz. You will also have an opportunity to apply your skill with data cleaning techniques in both spreadsheets and SQL. Finally, document, report on, and verify your data-cleaning process and results.
Course challenge
Scenario 1, questions 1-5
Question 1: You are a data analyst at a small analytics company. Your company is hosting a project kick-off meeting with a new client, Meer-Kitty Interior Design. The agenda includes reviewing their goals for the year, answering any questions, and discussing their available data.
Meer-Kitty Interior Design About Us Page.pdf
Meer-Kitty Interior Design Business Plan.pdf
Meer-Kitty Interior Design has two goals. They want to expand their online audience, which means getting their company and brand known by as many people as possible. They also want to launch a line of high-quality indoor paint to be sold in-store and online. You decide to consider the data about indoor paint first.
Kitty Survey Feedback – Meer-Kitty survey feedback.csv
You are pleased to find that the available data is aligned to the business objective. However, you do some research about confidence level for this type of survey and learn that you need at least 120 unique responses for the survey results to be useful. Therefore, the dataset has two limitations: First, there are only 40 responses; second, a Meer-Kitty superfan, User 588, completed the survey 11 times. As the survey has too few responses and numerous duplicates that are skewing results, what are your options? Select all that apply.
Question 2: During the meeting, you also learn that Meer-Kitty videos are hosted on their website. For each product offered, there is an accompanying video for customers to learn more. So, more views for a video suggests greater consumer interest. Your goal is to identify which videos are most popular, so Meer-Kitty knows what topics to explore in the future. Unfortunately, Meer-Kitty has just three months of data available because they only recently launched the videos on their site. Without enough data to identify long-term trends about the video subjects that people prefer, what should you do?
Question 3: Now that you’ve identified some limitations with Meer-Kitty’s data, you want to communicate your concerns to stakeholders. In addition to insufficient video trend data, your main concern with the indoor paint survey is that the data isn’t representative of the population as a whole. Clearly, one particular respondent, the superfan, is overrepresented. This means the data doesn’t represent the population as a whole. When surveying people for Meer-Kitty in the future, what are some best practices you can use to address some of the issues associated with sampling bias? Select all that apply.
Question 4: The stakeholders understand your concerns and agree to repeat the indoor paint survey. In a few weeks, you have a much better dataset with more than 150 responses and no duplicates.
Kitty Survey Feedback – New Meer-Kitty survey feedback.csv
You notice that questions 4 and 5 are dependent on the respondent’s answer to question 3. So, you need to determine how many people answered Yes to question 3, then compare that to responses to questions 4 and 5. That way, you will know if questions 4 and 5 have any nulls. You decide to use a spreadsheet tool that changes how cells appear when they contain the word Yes. Which tool do you use?
Question 5: You continue cleaning the data. You use tools such as remove duplicates and COUNTIF to ensure the dataset is complete, correct, and relevant to the problem you’re trying to solve. Then, you complete the verification and reporting processes to share the details of your data-cleaning effort with your team. While reviewing, your team notes one aspect of data cleaning that would improve the dataset even more. They point out that the new survey also has a new question in Column G: “What are your favorite indoor paint colors?” This was a free-response question, so respondents typed in their answers. Some people included multiple different colors of paint. In order to determine which colors are most popular, it will be necessary to put each color in its own cell. What spreadsheet function enables you to put each of the colors in Column G into a new, separate cell?
Scenario 2, questions 6-10
Question 6: You’ve completed this program and are interviewing for a junior data scientist position. The job is at B.Spoke Market Research, a company that analyzes market conditions using customer surveys and other research methods. The detailed job description can be found below:
C4 B.Spoke Market Research Job Description.pdf
So far, you’ve had a phone interview with a recruiter and you’ve secured a second interview with the B.Spoke team. The recruiter’s email can be found below:
C4 S2 Email from Recruiter.pdf
You arrive 15 minutes early for your interview. Soon, you are escorted into a conference room, where you meet Jodie Choi, the data science lead. After welcoming you, the behavioral interview begins. For your first question, your interviewer wants to learn about your experience with spreadsheets. She says: Sometimes the team needs data that is stored in different spreadsheets. So, we use a spreadsheet function to find the information we need. There is a spreadsheet function that searches for a value in the first column of a given range and returns the value of a specified cell in the row in which it is found. It is called SEARCH.
Question 7: Next, your interviewer wants to know more about your understanding of tools that work in both spreadsheets and SQL. She explains that the data her team receives from customer surveys sometimes has many duplicate entries. She says: Spreadsheets have a great tool for that called remove duplicates. In SQL, you can include DISTINCT to do the same thing. In which part of the SQL statement do you include DISTINCT?
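To show where DISTINCT sits in a query, the sketch below places it directly after the SELECT keyword and runs the statement against an in-memory SQLite database. The `survey_responses` table and its rows are hypothetical; the repeated user ID 588 echoes the superfan from Scenario 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey_responses (user_id INTEGER)")
conn.executemany(
    "INSERT INTO survey_responses VALUES (?)",
    [(588,), (588,), (101,), (588,), (202,)],
)

# DISTINCT follows SELECT and collapses repeated values into one row each.
rows = conn.execute(
    "SELECT DISTINCT user_id FROM survey_responses ORDER BY user_id"
).fetchall()
unique_users = [user_id for (user_id,) in rows]
print(unique_users)  # prints [101, 202, 588]
```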
Question 8: Now, your interviewer explains that the data team usually works with very large amounts of customer survey data. After receiving the data, they import it into a SQL table. But sometimes, the new dataset imports incorrectly and they need to change the format. She asks: What function would you use to convert data in a SQL table from one datatype to another?
Question 9: Next, your interviewer explains that one of their clients is an online retailer that needs to create product numbers for a vast inventory. Her team does this by combining the text strings for product number, manufacturing date, and color. She asks: Which SQL function would you use to add strings together to create new text strings?
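Many SQL dialects provide a CONCAT function for joining strings. The sketch below shows the same idea with the standard `||` concatenation operator instead, because it is run on SQLite through Python's sqlite3 module, where CONCAT is only available in newer versions. The `inventory` table, its column names, and the sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (product_number TEXT, mfg_date TEXT, color TEXT)"
)
conn.execute("INSERT INTO inventory VALUES ('P1001', '20230115', 'BLUE')")

# The || operator joins the three text columns into one new string.
(product_id,) = conn.execute(
    "SELECT product_number || mfg_date || color FROM inventory"
).fetchone()
print(product_id)  # prints "P100120230115BLUE"
```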
Question 10: For your final question, your interviewer explains that her team often comes across data with extra spaces. She asks: Which function would enable you to eliminate those extra spaces?
You respond: To eliminate extra spaces for consistency, use the TRIM function.
What function can be used to convert data from one datatype to another?
Additionally, there are a number of OLAP DML functions that you can use to convert values from one data type to another.
What function would you use to convert data in a SQL table?
The CONVERT() function in SQL Server is used to convert a value of one type to another type.
What are the SQL data types?
Data types in SQL Server are organized into the following categories: exact numerics, approximate numerics, date and time, character strings, Unicode character strings, binary strings, and other data types. The exact numerics include bigint, numeric, bit, smallint, decimal, smallmoney, int, and tinyint.
What is a double data type in SQL?
DOUBLE(size, d) is a normal-size floating point number (this syntax is used in MySQL). The total number of digits is specified in size. The number of digits after the decimal point is specified in the d parameter.