A data analyst wants to search for a certain value in a column, then return a corresponding piece of information. which function should they use?

Image by author gotten from Part 5 of the Google Data Analytics Professional Certificate

In the data analysis phases, the fifth (5th) phase is “The analyze” phase which is an important phase such that part of the insights that will be gotten from a data is in this phase. This is the fifth part of the Google Data Analytics Professional Course and I’ll be discussing a few things I learnt from this part.

What does analysis mean?

Analysis is the process used to make sense of the data collected. The goal of analysis is to identify trends and relationships within data so you can accurately answer the question you’re asking and to do this, you need to stick to the 4 phases of analysis which are:

  • Organize data
  • Format and adjust data
  • Get input from others
  • Transform data

Moving to the first phase which is Organize data. There are ways to organize data and some ways are:

  • Sorting data: This is when you arrange data into a meaningful order to make it easier to understand, analyze and visualize.
  • Filtering data: This is the process of showing only the data that meets a specific criteria while hiding the rest.
Image from Google Data Analytics Professional Course

When sorting a dataset in a spreadsheet, you can sort in two ways: Sort sheet or Sort range.

  • Sort sheet: All of the data in a spreadsheet is sorted by the ranking of a specific sorted column data across rows is kept together.
  • Sort range: Nothing else on the spreadsheet is rearranged besides the specified cells in a column.

There’s two methods for sorting spreadsheet data where one involves using the menu and the other involves writing out the sort function.

To sort a spreadsheet data using the menu bar, you follow the steps below:

  • Click on the alphabet’s button to highlight all the cells in the column.
Image by author
  • Head to the Data tab in the menu.

Now you have two choices: sort sheet or sort range.

Image by author

To sort a spreadsheet using the sort function, you can follow this steps:

  • Enter the sort function beginning with the equal to sign =sort(.
  • Enter the range of data that you want to sort, for example, G1:I
  • Put a comma after the number that represents the column you wish to sort by; for instance, put the number 3 after the first comma to represent the second column. If you don’t want to do that, you can also enter the desired column’s range, for example, C2:C.
  • If you want to sort in ascending order, type TRUE, and if you want to sort in descending order, type FALSE after a comma.
  • Type a closing parenthesis, and then press enter on the keyboard. After following the steps above, you should have this as your sort function =sort(A3:C, 2, TRUE).

Sorting data in SQL was also treated and some of the ways treated were:

Using the ORDER BY clause to sort results returned in a query which can be done using this syntax:

SELECT column 1, column 2 or you can just put * to select all columns
FROM table_name (inputting the table name)
ORDER BY column 1, column 2 (you then input the column name of the column that you want to order by)

You can also order in either a descending order or ascending order. See below the syntax:

SELECT column 1, column 2 or you can just put * to select all columns
FROM table_name (inputting the table name)
ORDER BY column 1, column 2 DESC (you then input the column name of the column that you want to order by and desc showing in descending order)

Some other ways of sorting data in SQL was covered but won’t be discussed in this article.

After sorting data, before you go on to analyze, you need to check if each column has the right data type because incorrectly formatted data can lead to: Mistakes which may take time to fix and then affect the stakeholder’s decision-making.

Typecasting is a way of converting data from one type to another in databases (SQL).

To change data type in spreadsheet, you can click on data then you’ll have a dropdown to change the data type.

Image by author: Changing data type

You can also use the convert function. Example below:

Image by author: Using the convert function

In the above example, the convert function was used to convert the column which contains temperature from Fahrenheit to Celsius.

Furthermore, “data validation” which is a spreadsheet feature was discussed. This discussion was basically about the function and not the process of data validation. It makes you control what can and can’t be entered in your worksheet.

You can use this function to:

  • Add dropdown lists with predetermined options.
  • Create custom check boxes.
  • Protect structured data and formulas.

Data can also be transformed in SQL using the CAST function. There are also more specialized functions like COERCION to work with big numbers, and UNIX_DATE to work with dates. UNIX_DATE returns the number of days that have passed since January 1, 1970 and is used to compare and work with dates across multiple time zones.

The syntax of the CAST function is seen below:

Image from Google Data Analytics Professional Course

Where expression is the data to be converted and typename is the data type to be returned.

Other ways to transform data from one type to another using the CAST function was also treated.

Furthermore, some functions that can be used to perform operations on strings were talked about. Those functions are: LEN, LEFT, RIGHT and FIND.

You can also manipulate strings in SQL. Some functions you can use are: the CONCAT function (adds strings together to create new text strings that can be used as unique keys), CONCAT-WS function (this adds two or more strings together with a separator like period (.)), CONCAT with + (this adds two or more strings together using the + operator).

Moving on, some ways to seek for help when we get stuck with something was treated because sometimes when we get stuck, we feel like we can sort out an issue ourselves which can slow down the pace at which we do something but when we engage others, we can be more productive. Two (2) ways one can seek for help when we get stuck with things are:

  • Asking someone for help
  • Searching the internet for answers

When using the web to search for solutions, there are best practices which are:

  • Thinking skills: The way you think be it analytical, mathematical or structured thinking help to build your mental model (your thought process and the way you approach a problem).
  • Using the right data analytics term: Ability to us the right terms to search for solutions such that those terms you use will make you get options as to what other data analysts are talking about.
  • Basic knowledge of tools

The course moved further to data aggregation and how data aggregation helps a data analyst. Aggregation means collecting or gathering many separate pieces into a whole therefore, data aggregation is the process of gathering data from multiple sources in order to combine it into a single summarized collection.

Data aggregation helps data analyst:

  • Identify trends
  • Make comparisons
  • Gain insights

An aggregation tool is VLOOKUP which can also be called Vertical Lookup. This is a function that searches for a certain value in a column to return a corresponding piece of information.

I moved further to learn about questions to ask during a troubleshooting process and some of the questions are:

  • How should I prioritize these issues?
  • In a single sentence, what’s the issue I’m facing?\
  • What resources can help solve the problem?
  • How can I stop this problem from happening in the future?

Also, I learnt about understanding Joins in SQL. Join is a SQL clause that is used to combine rows from two or more tables based on a related column.

Common Joins analysts use are:

  • Inner Join: A function that returns records with matching values in both tables.
  • Left Join: It’s a function that will return all the records from the left table and only the matching records from the right table.
  • Right Join: This is a function that will return all records from the right table and only the matching records from the left.
  • Outer Join: a function that combines RIGHT and LEFT join to return all matching records in both tables.

Aliases are used in SQL queries to create temporary names for a column or table. Aliases makes referencing tables and columns in your SQL queries much simple when you have table or column names that are too long or complex to make use of in queries.

We also have subquery in SQL which is basically an SQL query nested inside a larger query.

Some terms were also defined and they are as follows:

  • Functions: A great tool to help do some calculations. COUNTIF, SUMIF are examples. =COUNTIF (range, “value”), is the COUNTIF syntax.
  • Summary table: A table used to summarize statistical information about data.
  • SUMIF: A function that adds numeric data based on one condition.

Note that SUMIF and COUNTIF is different from SUMIFS and COUNTIFS.

  • SUMPRODUCT: A function that multiplies arrays and returns the sum of those products.
  • Array, a collection of value in cells.
  • Profit margin: A percentage that indicates how many cents of profit has been generated for each dollar of sale.
  • Pivot table: This allows you view data in multiple ways to find insights and trends. Pivot table has 4 basic parts which are: Rows, Columns, Values and Filters.
  • An operator is a symbol that names the type of operator or calculation to be performed in a formula. Spreadsheet and SQL have some similarities when it comes to operators.

Embedding calculations in SQL was also discussed and some things i picked were:

  • In SQL, underscores are lines used to underline words and connect text characters.
  • Embedding calculations in your queries will help your analysis to be organized while getting your results.
  • GROUPBY command is a command that group rows that have the same values from a table into summary rows.
  • Extract command lets you pull one part of a given date to use.

Data validation process is checking and rechecking the quality of your data so that it is complete, accurate, secure and consistent. It’s a form of data cleaning.

Types of validation were mentioned and they are: Data type, data range, data constraints, data consistency, data structure and code validation.

Everything in this write-up was learnt from the part 5 of the Google Data Analytics professional course which I recommend if you want to learn the concept of Data Analysis from the foundational level.

I’m open to questions, discussion and comments and we could connect via LinkedIn.

Watch out for the part 6 of this and before then, keep learning.

What function would you use to search for a certain value in a spreadsheet column to return the corresponding piece of information?

The VLOOKUP or Vertical Lookup function is used when data is listed in columns. This function searches for a value in the left-most column and matches it with data in a specified column in the same row. You can use VLOOKUP to find data in a sorted or unsorted table.

What function would you use to search for a certain value in a spreadsheet column returns the corresponding piece of information 1 point Countif search Vlookup return?

The VLOOKUP function searches for a certain value in a column to return a corresponding piece of information.

What function would you use to convert data in a SQL table from one datatype to another 1 point?

The CONVERT() function in SQL server is used to convert a value of one type to another type.

What should an analyst do if they do not have the data needed to meet a business objective select all that apply 1 point?

If an analyst does not have the data needed to meet a business objective, they should gather related data on a small scale and request additional time. Then, they can find more complete data or perform the analysis by finding and using proxy data from other datasets.