Image by author gotten from Part 5 of the Google Data Analytics Professional Certificate In the data analysis phases, the fifth (5th) phase is “The analyze” phase which is an important phase such that part of the insights that will be gotten from a data is in this phase. This is the fifth part of the Google Data Analytics Professional Course and I’ll be discussing a few things I learnt from this part. Show
What does analysis mean? Analysis is the process used to make sense of the data collected. The goal of analysis is to identify trends and relationships within data so you can accurately answer the question you’re asking and to do this, you need to stick to the 4 phases of analysis which are:
Moving to the first phase which is Organize data. There are ways to organize data and some ways are:
When sorting a dataset in a spreadsheet, you can sort in two ways: Sort sheet or Sort range.
There’s two methods for sorting spreadsheet data where one involves using the menu and the other involves writing out the sort function. To sort a spreadsheet data using the menu bar, you follow the steps below:
Now you have two choices: sort sheet or sort range. Image by authorTo sort a spreadsheet using the sort function, you can follow this steps:
Sorting data in SQL was also treated and some of the ways treated were: Using the ORDER BY clause to sort results returned in a query which can be done using this syntax: SELECT column 1, column 2 or you can just put * to select all
columns You can also order in either a descending order or ascending order. See below the syntax: SELECT column 1, column 2 or you can just put * to select all columns Some other ways of sorting data in SQL was covered but won’t be discussed in this article. After sorting data, before you go on to analyze, you need to check if each column has the right data type because incorrectly formatted data can lead to: Mistakes which may take time to fix and then affect the stakeholder’s decision-making. Typecasting is a way of converting data from one type to another in databases (SQL). To change data type in spreadsheet, you can click on data then you’ll have a dropdown to change the data type. You can also use the convert function. Example below: Image by author: Using the convert functionIn the above example, the convert function was used to convert the column which contains temperature from Fahrenheit to Celsius. Furthermore, “data validation” which is a spreadsheet feature was discussed. This discussion was basically about the function and not the process of data validation. It makes you control what can and can’t be entered in your worksheet. You can use this function to:
Data can also be transformed in SQL using the CAST function. There are also more specialized functions like COERCION to work with big numbers, and UNIX_DATE to work with dates. UNIX_DATE returns the number of days that have passed since January 1, 1970 and is used to compare and work with dates across multiple time zones. The syntax of the CAST function is seen below: Image from Google Data Analytics Professional CourseWhere expression is the data to be converted and typename is the data type to be returned. Other ways to transform data from one type to another using the CAST function was also treated. Furthermore, some functions that can be used to perform operations on strings were talked about. Those functions are: LEN, LEFT, RIGHT and FIND. You can also manipulate strings in SQL. Some functions you can use are: the CONCAT function (adds strings together to create new text strings that can be used as unique keys), CONCAT-WS function (this adds two or more strings together with a separator like period (.)), CONCAT with + (this adds two or more strings together using the + operator). Moving on, some ways to seek for help when we get stuck with something was treated because sometimes when we get stuck, we feel like we can sort out an issue ourselves which can slow down the pace at which we do something but when we engage others, we can be more productive. Two (2) ways one can seek for help when we get stuck with things are:
When using the web to search for solutions, there are best practices which are:
The course moved further to data aggregation and how data aggregation helps a data analyst. Aggregation means collecting or gathering many separate pieces into a whole therefore, data aggregation is the process of gathering data from multiple sources in order to combine it into a single summarized collection. Data aggregation helps data analyst:
An aggregation tool is VLOOKUP which can also be called Vertical Lookup. This is a function that searches for a certain value in a column to return a corresponding piece of information. I moved further to learn about questions to ask during a troubleshooting process and some of the questions are:
Also, I learnt about understanding Joins in SQL. Join is a SQL clause that is used to combine rows from two or more tables based on a related column. Common Joins analysts use are:
Aliases are used in SQL queries to create temporary names for a column or table. Aliases makes referencing tables and columns in your SQL queries much simple when you have table or column names that are too long or complex to make use of in queries. We also have subquery in SQL which is basically an SQL query nested inside a larger query. Some terms were also defined and they are as follows:
Note that SUMIF and COUNTIF is different from SUMIFS and COUNTIFS.
Embedding calculations in SQL was also discussed and some things i picked were:
Data validation process is checking and rechecking the quality of your data so that it is complete, accurate, secure and consistent. It’s a form of data cleaning. Types of validation were mentioned and they are: Data type, data range, data constraints, data consistency, data structure and code validation. Everything in this write-up was learnt from the part 5 of the Google Data Analytics professional course which I recommend if you want to learn the concept of Data Analysis from the foundational level. I’m open to questions, discussion and comments and we could connect via LinkedIn. Watch out for the part 6 of this and before then, keep learning. What function would you use to search for a certain value in a spreadsheet column to return the corresponding piece of information?The VLOOKUP or Vertical Lookup function is used when data is listed in columns. This function searches for a value in the left-most column and matches it with data in a specified column in the same row. You can use VLOOKUP to find data in a sorted or unsorted table.
What function would you use to search for a certain value in a spreadsheet column returns the corresponding piece of information 1 point Countif search Vlookup return?The VLOOKUP function searches for a certain value in a column to return a corresponding piece of information.
What function would you use to convert data in a SQL table from one datatype to another 1 point?The CONVERT() function in SQL server is used to convert a value of one type to another type.
What should an analyst do if they do not have the data needed to meet a business objective select all that apply 1 point?If an analyst does not have the data needed to meet a business objective, they should gather related data on a small scale and request additional time. Then, they can find more complete data or perform the analysis by finding and using proxy data from other datasets.
|