What are the 5 sources of secondary data?

Primary Sources are immediate, first-hand accounts of a topic, from people who had a direct connection with it. Primary sources can include:

Texts of laws and other original documents.

Newspaper reports, by reporters who witnessed an event or who quote people who did.

Speeches, diaries, letters and interviews - what the people involved said or wrote.

Original research.

Datasets, survey data, such as census or economic statistics.

Photographs, video, or audio that capture an event.

Primary Data: Data generated by the researcher through surveys, interviews, or experiments designed specifically for understanding and solving the research problem at hand.

Secondary Data: Existing data generated by large government institutions, healthcare facilities, etc. as part of organizational record keeping. The researcher extracts the data from these varied data files rather than collecting it firsthand.

Supplementary Data: A few years ago the Obama Administration determined that any research done using federal public funds should be available to the public for free. Moreover, Data Management Plans should be in place to store and preserve the data almost indefinitely. These data sets are published as Supplementary Materials in the journal literature, and they can be downloaded and manipulated for research.

NOTE: Even though the original research is a primary source, the supplemental files become a secondary source when they are downloaded and used by others.

 Pros and Cons for each. 

Comparison Chart

BASIS FOR COMPARISON | PRIMARY DATA | SECONDARY DATA
Meaning | Primary data refers to the first-hand data gathered by the researcher himself. | Secondary data means data collected by someone else earlier.
Data | Real-time data | Past data
Process | Very involved | Quick and easy
Source | Surveys, observations, experiments, questionnaire, personal interview, etc. | Government publications, websites, books, journal articles, internal records, etc.
Cost effectiveness | Expensive | Economical
Collection time | Long | Short
Specific | Always specific to the researcher's needs. | May or may not be specific to the researcher's needs.
Available in | Crude form | Refined form
Accuracy and Reliability | More | Relatively less
 

Secondary data are data gathered by a party other than the user. Common sources of secondary data in the social sciences include statements, data collected by government agencies, organisational documents, and data originally collected for other research objectives. Primary data, by contrast, are gathered by the investigator conducting the research.


Sources of Secondary Data

Secondary data are second-hand pieces of information. Unlike primary data, they are not gathered directly from the source. In other words, secondary data are data that have already been collected by someone else. They are therefore considered comparatively less reliable than primary data.

They are usually used when the time available for the enquiry is limited and some loss of exactness can be tolerated. Secondary data can be gathered from many different sources, which fall into two categories:

1. Published sources

2. Unpublished sources

1. Published sources

Secondary data is usually gathered from the published (printed) sources. A few major sources of published information are as follows:

  • Published articles of local bodies, and central and state governments
  • Statistical synopses, census records, and other reports issued by the different departments of the government
  • Official statements and publications of the foreign governments
  • Publications and reports of chambers of commerce, financial institutions, trade associations, etc.
  • Magazines, journals, and periodicals
  • Publications of government organisations like the Central Statistical Organisation (CSO), National Sample Survey Organisation (NSSO)
  • Reports presented by research scholars, bureaus, economists, etc.

2. Unpublished sources

Statistical data can be obtained from several unpublished references. Some of the major unpublished sources from which secondary data can be gathered are as follows:

  • The research works conducted by teachers, professors, and professionals
  • The records that are maintained by private and business enterprises
  • Statistics maintained by different departments and agencies of the central and the state government, undertakings, corporations, etc.

Practice questions

Q.1. _____________ sources mean data available in printed form.
a. Published

b. Unpublished

c. Both (a) and (b)

d. None of the above

Q.2. Records maintained by various government and private offices are examples of ________ source of collecting secondary data.
a. Published

b. Unpublished

c. Both (a) and (b)

d. None of the above

Q.3. Reports issued by agencies like WHO, UNO, IMF, etc., are examples of ________ source of collecting secondary data.
a. Published

b. Unpublished

c. Both (a) and (b)

d. None of the above

Answer Key
1 – a, 2 – b, 3 – a

The above-mentioned concept is for CBSE Class 11 Statistics for Economics – Meaning and Sources of Secondary Data.

What is secondary data, and why is it important? Find out in this post.

Within data analytics, there are many ways of categorizing data. A common distinction, for instance, is that between qualitative and quantitative data. In addition, you might also distinguish your data based on factors like sensitivity. For example, is it publicly available or is it highly confidential? 

Probably the most fundamental distinction between different types of data is their source. Namely, are they primary, secondary, or third-party data? Each of these vital data sources supports the data analytics process in its own way. In this post, we’ll focus specifically on secondary data. We’ll look at its main characteristics, provide some examples, and highlight the main pros and cons of using secondary data in your analysis. 


Ready to learn all about secondary data? Then let’s go.

Secondary data (also known as second-party data) refers to any dataset collected by any person other than the one using it. 

Secondary data sources are extremely useful. They allow researchers and data analysts to build large, high-quality databases that help solve business problems. By expanding their datasets with secondary data, analysts can enhance the quality and accuracy of their insights. Most secondary data comes from external organizations. However, the term also covers data collected within an organization and then repurposed.

Secondary data has various benefits and drawbacks, which we’ll explore in detail in section four. First, though, it’s essential to contextualize secondary data by understanding its relationship to two other sources of data: primary and third-party data. We’ll look at these next.

To best understand secondary data, we need to know how it relates to the other main data sources: primary and third-party data.

What is primary data?

‘Primary data’ (also known as first-party data) are those directly collected or obtained by the organization or individual that intends to use them. Primary data are always collected for a specific purpose. This could be to inform a defined goal or objective or to address a particular business problem. 

For example, a real estate organization might want to analyze current housing market trends. This might involve conducting interviews, collecting facts and figures through surveys and focus groups, or capturing data via electronic forms. Focusing only on the data required to complete the task at hand ensures that primary data remain highly relevant. They’re also well-structured and of high quality.

What is secondary data?

As explained, ‘secondary data’ describes those collected for a purpose other than the task at hand. Secondary data can come from within an organization but more commonly originate from an external source. If it helps to make the distinction, secondary data is essentially just another organization’s primary data. 

Secondary data sources are so numerous that they’ve started playing an increasingly vital role in research and analytics. They are easier to source than primary data and can be repurposed to solve many different problems. While secondary data may be less relevant for a given task than primary data, they are generally still well-structured and highly reliable.

What is third-party data?

‘Third-party data’ (sometimes referred to as tertiary data) refers to data collected and aggregated from numerous discrete sources by third-party organizations. Because third-party data combine data from numerous sources and aren’t collected with a specific goal in mind, the quality can be lower. 

Third-party data also tend to be largely unstructured. This means that they’re often beset by errors, duplicates, and so on, and require more processing to get them into a usable format. Nevertheless, used appropriately, third-party data are still a useful data analytics resource.
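As a rough illustration of the extra processing that duplicates can require, here is a minimal Python sketch. The records, the field names, and the deduplicate helper are all hypothetical, and real aggregated data usually needs fuzzier matching than this.

```python
def deduplicate(rows, key_fields):
    """Keep the first row for each normalised key; drop later repeats."""
    seen = set()
    unique = []
    for row in rows:
        # Normalise case and surrounding whitespace so near-identical
        # entries compare as equal.
        key = tuple(str(row[field]).strip().lower() for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical aggregated records containing one near-duplicate.
records = [
    {"company": "Acme Ltd", "city": "Leeds", "sales": 1200},
    {"company": "acme ltd ", "city": "Leeds", "sales": 1200},
    {"company": "Birch & Co", "city": "York", "sales": 800},
]

clean = deduplicate(records, key_fields=("company", "city"))
print(len(records), "->", len(clean))  # 3 -> 2
```

Matching on a normalised key is only one possible choice; which fields identify a duplicate depends on the dataset.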

OK, now that we’ve placed secondary data in context, let’s explore some common sources and types of secondary data.

3. What are some examples of secondary data?

External secondary data

Before we get to examples of secondary data, we first need to understand the types of organizations that generally provide them. Frequent sources of secondary data include: 

  • Government departments
  • Public sector organizations
  • Industry associations
  • Trade and industry bodies
  • Educational institutions
  • Private companies
  • Market research providers

While all these organizations provide secondary data, government sources are perhaps the most freely accessible. Government bodies are legally obliged to keep records when registering people, providing services, and so on. This type of secondary data is known as administrative data. It’s especially useful for creating detailed segment profiles, where analysts home in on a particular region, trend, market, or demographic.

Types of secondary data vary. Popular examples of secondary data include:

  • Tax records and social security data
  • Census data
  • Electoral statistics
  • Health records
  • Books, journals, or other print media
  • Social media monitoring, internet searches, and other online data
  • Sales figures or other reports from third-party companies
  • Libraries and electronic filing systems
  • App data, e.g. location data, GPS data, timestamp data, etc.

Internal secondary data 

As mentioned, secondary data doesn’t have to come from a different organization. It can also come from within an organization itself.

Sources of internal secondary data might include:

  • Sales reports
  • HR filings
  • Annual accounts
  • Quarterly sales figures
  • Customer relationship management systems
  • Emails and metadata
  • Website cookies

In the right context, we can define practically any type of data as secondary data. The key takeaway is that the term ‘secondary data’ doesn’t refer to any inherent quality of the data themselves, but to how they are used. Any data source (external or internal) used for a task other than that for which it was originally collected can be described as secondary data.
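To make the point concrete, here is a small Python sketch in which the same dataset is labelled primary or secondary depending only on the task it is used for. The Dataset class, the data_role function, and the example purposes are illustrative assumptions, not part of any standard library or methodology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    collected_for: str  # the purpose the data were originally gathered for


def data_role(dataset: Dataset, current_task: str) -> str:
    """Label the data by how they are used, not by what they contain."""
    if dataset.collected_for == current_task:
        return "primary"
    return "secondary"


# Hypothetical internal dataset.
sales = Dataset(name="quarterly sales figures", collected_for="financial reporting")

print(data_role(sales, "financial reporting"))      # primary
print(data_role(sales, "customer churn analysis"))  # secondary
```

The dataset itself never changes between the two calls; only the task does.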

4. What are the advantages and disadvantages of using secondary data?

Secondary data is suitable for any number of analytics activities. The only limitation is a dataset’s format, structure, and whether or not it relates to the topic or problem at hand. 

When analyzing secondary data, the process has some minor differences, mainly in the preparation phase. Otherwise, it follows much the same path as any traditional data analytics project.

More broadly, though, what are the advantages and disadvantages of using secondary data? Let’s take a look.

Advantages of using secondary data

It’s an economic use of time and resources: Because secondary data have already been collected, cleaned, and stored, this saves analysts much of the hard work that comes from collecting these data firsthand. For instance, for qualitative data, the complex tasks of deciding on appropriate research questions or how best to record the answers have already been completed. Secondary data saves data analysts and data scientists from having to start from scratch. 

It provides a unique, detailed picture of a population: Certain types of secondary data, especially government administrative data, can provide access to levels of detail that it would otherwise be extremely difficult (or impossible) for organizations to collect on their own. Data from public sources, for instance, can provide organizations and individuals with a far greater level of population detail than they could ever hope to gather in-house. You can also obtain data covering longer time spans if you need them, e.g. stock market data, which provides decades’ worth of information.

Secondary data can build useful relationships: Acquiring secondary data usually involves making connections with organizations and analysts in fields that share some common ground with your own. This opens the door to a cross-pollination of disciplinary knowledge. You never know what nuggets of information or additional data resources you might find by building these relationships.

Secondary data tend to be high-quality: Unlike some data sources, e.g. third-party data, secondary data tends to be in excellent shape. In general, secondary datasets have already been validated and therefore require minimal checking. Often, such as in the case of government data, datasets are also gathered and quality-assured by organizations with much more time and resources available. This further benefits the data quality, while benefiting smaller organizations that don’t have endless resources available.

It’s excellent for both data enrichment and informing primary data collection: Another benefit of secondary data is that they can be used to enhance and expand existing datasets. Secondary data can also inform primary data collection strategies. They can provide analysts or researchers with initial insights into the type of data they might want to collect themselves further down the line.
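Data enrichment can be sketched in a few lines of Python. Everything here is made up for illustration: the survey rows stand in for primary data, the census lookup stands in for secondary data, and enrich is a hypothetical helper rather than a library function.

```python
# Hypothetical primary data: survey responses collected in-house.
survey = [
    {"respondent": "r1", "region": "North", "satisfaction": 4},
    {"respondent": "r2", "region": "South", "satisfaction": 2},
    {"respondent": "r3", "region": "East", "satisfaction": 5},
]

# Hypothetical secondary data: population per region from a public source.
census = {"North": 120000, "South": 85000}


def enrich(rows, lookup, key, new_field):
    """Return copies of rows with a field added from the secondary lookup."""
    enriched = []
    for row in rows:
        merged = dict(row)  # copy, so the primary data stay untouched
        merged[new_field] = lookup.get(row[key])  # None when there is no match
        enriched.append(merged)
    return enriched


enriched = enrich(survey, census, key="region", new_field="population")

# Rows the secondary source could not cover are worth tracking explicitly.
unmatched = [row["respondent"] for row in enriched if row["population"] is None]
print(unmatched)  # ['r3']
```

Keeping unmatched rows visible, rather than silently dropping them, shows how much of the primary dataset the secondary source actually covers.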

Disadvantages of using secondary data

They aren’t always free: Sometimes it’s unavoidable, and you may have to pay for access to secondary data. However, while this can be a financial burden, in reality the cost of purchasing a secondary dataset is usually far lower than the cost of having to plan for and collect the data firsthand.

The data isn’t always suited to the problem at hand: While secondary data may tick many boxes concerning its relevance to a business problem, this is not always true. For instance, secondary data collection might have been in a geographical location or time period ill-suited to your analysis. Because analysts were not present when the data were initially collected, this may also limit the insights they can extract.

The data may not be in the preferred format: Even when a dataset provides the necessary information, that doesn’t mean it’s appropriately stored. A basic example: numbers might be stored as categorical data rather than numerical data. Another issue is that there may be gaps in the data. Categories that are too vague may limit the information you can glean. For instance, a dataset of people’s hair color that is limited to ‘brown, blonde and other’ will tell you very little about people with auburn, black, white, or gray hair. 
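The two format problems mentioned above (numbers stored as text, and gaps) can be checked with a short Python sketch. The raw_incomes values and the to_number helper are hypothetical, and the cleaning rules (stripping commas and whitespace) are assumptions that would need adjusting for a real dataset.

```python
def to_number(value):
    """Convert a text value to float; return None when it cannot be parsed."""
    try:
        return float(str(value).replace(",", "").strip())
    except ValueError:
        return None


# Hypothetical column where numbers arrived as text, with some gaps.
raw_incomes = ["42,000", "38500", "", "n/a", " 51000 "]

incomes = [to_number(value) for value in raw_incomes]
gaps = sum(1 for value in incomes if value is None)

print(incomes)  # [42000.0, 38500.0, None, None, 51000.0]
print(gaps)     # 2
```

Counting the gaps before analysis gives a quick sense of how much of the column is usable.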

You can’t be sure how the data were collected: A structured, well-ordered secondary dataset may appear to be in good shape. However, it’s not always possible to know what issues might have occurred during data collection that will impact their quality. For instance, poor response rates will provide a limited view. While issues relating to data collection are sometimes made available alongside the datasets (e.g. for government data) this isn’t always the case. You should therefore treat secondary data with a reasonable degree of caution.
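When a provider does publish collection details, a basic check is the response rate. The Python sketch below uses made-up figures, and response_rate is a hypothetical helper; what counts as a "poor" rate depends on the study and is not defined here.

```python
def response_rate(responses_received: int, people_contacted: int) -> float:
    """Share of contacted people who actually responded."""
    if people_contacted <= 0:
        raise ValueError("people_contacted must be positive")
    return responses_received / people_contacted


# Hypothetical figures taken from a dataset's documentation.
rate = response_rate(responses_received=180, people_contacted=1200)
print(f"{rate:.0%}")  # 15%
```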

Being aware of these disadvantages is the first step towards mitigating them. While you should be aware of the risks associated with using secondary datasets, in general, the benefits far outweigh the drawbacks.

5. Wrap-up and further reading

In this post we’ve explored secondary data in detail. As we’ve seen, it’s not so different from other forms of data. What defines data as secondary data is how it is used rather than an inherent characteristic of the data themselves. 
