By the end of this chapter you'll be able to…

  1. Distinguish between primary and secondary data with their respective advantages and disadvantages
  2. Compare census (complete enumeration) and sample survey methods and identify when each is appropriate
  3. Describe the three main random sampling methods: simple random, stratified, and systematic
  4. Explain the difference between random and non-random sampling and the problem with non-random methods
  5. Identify five major sources of secondary data in India (Census, NSSO, RBI, Economic Survey, PLFS)
Why this chapter matters
Every statistical analysis is only as good as the data behind it — this chapter teaches how data is collected, the difference between primary and secondary sources, and how to choose a sampling method. These skills underpin all quantitative research and policy design.

Collection of Data

"Garbage in, garbage out. The quality of your analysis depends on the quality of your data."

1. Chapter Overview

Before you can ANALYSE data, you must COLLECT it. This chapter covers: the TWO SOURCES (primary and secondary), METHODS of collecting primary data, the difference between CENSUS and SAMPLE surveys, TYPES of sampling (random and non-random), and key SOURCES of secondary data in India.


2. Primary vs Secondary Data

Aspect | Primary Data | Secondary Data
Definition | Data collected by the investigator FIRST-HAND for their SPECIFIC purpose | Data ALREADY collected by someone else for SOME OTHER purpose
Examples | A survey you conduct to study student spending habits | Census of India data. NSSO consumption survey. RBI bulletins.
Advantages | Tailored to YOUR question. You know the quality. | Cheap, fast. Covers large populations and long time periods.
Disadvantages | Expensive, time-consuming. Requires fieldwork. | May not perfectly fit YOUR question. Quality may be uncertain.

3. Methods of Collecting Primary Data

A. Census vs Sample

Aspect | Census (Complete Enumeration) | Sample Survey
What | Survey EVERY unit in the population | Survey a SUBSET (sample) and INFER about the whole population
Cost | VERY EXPENSIVE | Cheaper
Time | Very TIME-CONSUMING | Faster
Accuracy | In theory: PERFECT. In practice: errors possible in huge operations. | Sampling error exists, but CAN be measured. If well-designed: reliable.
When Used | Population is small. Or high precision required (national census). | Population is large. When census is impractical.
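
The point that sampling error "CAN be measured" can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population of 10,000 student spending figures is invented, and the names `population`, `sample`, and `standard_error` are hypothetical. It compares a full count with a random sample of 400 and estimates the standard error from the sample alone.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: monthly spending (in rupees) of 10,000 students.
population = [random.gauss(2000, 500) for _ in range(10_000)]

# "Census": measure every unit.
census_mean = statistics.mean(population)

# Sample survey: measure only 400 randomly chosen units.
sample = random.sample(population, 400)
sample_mean = statistics.mean(sample)

# Estimated standard error of the sample mean, computed from the sample alone.
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

print(f"Census mean: {census_mean:.1f}")
print(f"Sample mean: {sample_mean:.1f} (standard error about {standard_error:.1f})")
```

The sample mean will usually fall within about two standard errors of the census mean, which is why a well-designed sample can stand in for a full count at a fraction of the cost.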

B. Methods of Sampling

Random (Probability) Sampling

  • Each unit has a KNOWN, NON-ZERO probability of being selected
  • Simple Random Sampling: Each unit has EQUAL chance. Like drawing names from a hat.
  • Stratified Sampling: Population divided into GROUPS (strata) first (by age, gender, income). Then random sample from EACH stratum. Ensures all groups are represented.
  • Systematic Sampling: Pick a random starting unit, then select every Kth unit after it (every 10th house on a street, every 100th name on a list).
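
The three random methods listed above can be sketched in a few lines of Python. This is an illustration, not a survey-grade procedure: the sampling frame of 1,000 households, the rural/urban split, and all variable names are assumptions made up for the example.

```python
import random

random.seed(7)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: 1,000 households, 70% rural and 30% urban.
frame = [{"id": i, "area": "rural" if i % 10 < 7 else "urban"} for i in range(1000)]
sample_size = 100

# Simple random sampling: every household has an equal chance of selection.
simple = random.sample(frame, sample_size)

# Stratified sampling: group households by area, then draw a random 10%
# from EACH stratum so that both groups are represented.
strata = {}
for household in frame:
    strata.setdefault(household["area"], []).append(household)
stratified = []
for households in strata.values():
    stratified.extend(random.sample(households, len(households) // 10))

# Systematic sampling: pick a random start, then take every k-th household.
k = len(frame) // sample_size   # k = 1000 / 100 = 10
start = random.randrange(k)     # random start among the first k units
systematic = frame[start::k]

print(len(simple), len(stratified), len(systematic))  # prints: 100 100 100
```

All three samples have 100 households, but only the stratified one guarantees a fixed number from each area (70 rural, 30 urban in this made-up frame).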

Non-Random (Non-Probability) Sampling

  • Selection is based on INVESTIGATOR'S JUDGMENT or CONVENIENCE
  • Judgment / Purposive Sampling: Investigator CHOOSES units they think are representative
  • Convenience Sampling: Choose whoever is EASIEST to reach
  • PROBLEM: Cannot measure sampling error. May NOT be representative. Bias risk.
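
The bias risk of convenience sampling can be seen in a toy simulation. Everything here is assumed for illustration: a hypothetical population of 5,000 households in which the 1,000 that are easiest to reach happen to be richer than the rest.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: the 1,000 households near the investigator's
# office are richer on average than the 4,000 that live farther away.
near = [random.gauss(60_000, 8_000) for _ in range(1_000)]
far = [random.gauss(30_000, 8_000) for _ in range(4_000)]
population = near + far
true_mean = statistics.mean(population)

# Convenience sample: the 200 easiest households to reach (all nearby).
convenience = near[:200]

# Simple random sample of the same size from the whole population.
random_sample = random.sample(population, 200)

print(f"True mean income:        {true_mean:,.0f}")
print(f"Convenience sample mean: {statistics.mean(convenience):,.0f}")
print(f"Random sample mean:      {statistics.mean(random_sample):,.0f}")
```

In this setup the convenience sample overstates average income by a wide margin, and nothing inside that sample reveals how large the error is. The random sample lands close to the true mean.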

4. Sources of Secondary Data in India

Source | What It Provides
Census of India (every 10 years; last held in 2011, 2021 round delayed) | Population, literacy, occupation, housing, amenities, for every village and town
NSSO (National Sample Survey Office, now merged into the National Statistical Office, NSO, under MoSPI) | Consumption expenditure, employment, health, education, through continuous sample surveys
RBI Bulletin | Banking, money supply, forex reserves, interest rates, inflation
Economic Survey (Ministry of Finance, annual) | Comprehensive review of the Indian economy
Registrar General of India | Birth rates, death rates, IMR, life expectancy
Periodic Labour Force Survey (PLFS) | Employment and unemployment

5. Pilot Survey and Questionnaire Design

  • Pilot survey: a SMALL-SCALE TRIAL before the full survey. Tests: are the questions CLEAR? Do they generate USEFUL answers? Are there ambiguities?
  • Questionnaire: the form with questions. Must be: CLEAR, SPECIFIC, UNAMBIGUOUS, LOGICALLY ORDERED. Avoid leading questions. Pre-test with a pilot.

6. Exam Focus

  1. Primary vs Secondary data — distinction, pros/cons
  2. Census vs Sample survey — when each is used
  3. Random sampling — simple random, stratified, systematic
  4. Non-random sampling — judgment, convenience. Problem: cannot measure error.
  5. Key Indian data sources — Census, NSSO, RBI, Economic Survey

7. Conclusion

Good data doesn't grow on trees. It is COLLECTED — with care, method, and awareness of sources of error:

  • PRIMARY data: You collect it. Tailored but expensive.
  • SECONDARY data: Someone else collected it. Cheap but may not fit perfectly.
  • SAMPLING: When you can't survey everyone, pick a REPRESENTATIVE sample using RANDOM methods.
  • INDIA's DATA INFRASTRUCTURE: Census, NSSO, RBI, Economic Survey — these are the building blocks of economic knowledge in India.

'To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem.' — R.A. Fisher. Statistical thinking begins BEFORE data is collected — with the design of the survey.

Key formulas & results

Everything you need to memorise, in one card. Screenshot this for revision.

Primary Data
Data collected first-hand by the investigator for their specific research purpose
Tailored to the question but expensive and time-consuming; you control the quality
Secondary Data
Data already collected by someone else for some other purpose, used second-hand
Cheap and fast, covers long time periods; may not perfectly fit the research question
Simple Random Sampling
Every unit in the population has an equal probability of selection; equivalent to drawing from a hat
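On any single draw, each unit's probability of selection = 1/N, where N is the population size; for a sample of n units, each unit's chance of being included is n/N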
Systematic Sampling
Select every Kth unit from a list; K = Population size / Sample size
Example: if population = 1000 and sample = 100, select every 10th name on the list
Stratified Sampling
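Divide the population into internally homogeneous strata, then take a random sample from each stratum
Ensures every sub-group is represented (e.g., rural and urban, male and female); representation is proportional when each stratum's sample size is set in proportion to its population share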
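
The arithmetic behind the card can be worked through with made-up numbers. The figures below (a population of 1,000, a sample of 100, and a 650/350 rural-urban split) are hypothetical, chosen only to show how the sampling interval K and a proportional allocation across strata are computed.

```python
# Hypothetical figures, chosen only to illustrate the arithmetic.
population_size = 1000
sample_size = 100

# Systematic sampling interval: K = population size / sample size.
k = population_size // sample_size

# Chance that any given unit ends up in a simple random sample of this size.
inclusion_probability = sample_size / population_size

# Proportional allocation for stratified sampling: each stratum's share of
# the sample matches its share of the population.
stratum_sizes = {"rural": 650, "urban": 350}
allocation = {
    name: sample_size * size // population_size
    for name, size in stratum_sizes.items()
}

print(k)                      # prints: 10
print(inclusion_probability)  # prints: 0.1
print(allocation)             # prints: {'rural': 65, 'urban': 35}
```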

Common mistakes & fixes

These are the exact errors that cost students marks in board exams. Read them once, save yourself the trouble.

WATCH OUT
Saying census is always better than sampling because it covers everyone
A census is often impractical for large populations because it is too costly and time-consuming. A well-designed sample survey is reliable, faster, and cheaper. Sampling error exists but CAN be measured; census errors are harder to detect.
WATCH OUT
Confusing stratified sampling with quota sampling (non-random)
In stratified sampling, selection WITHIN each stratum is RANDOM — this makes it a probability method. Quota sampling also divides into groups but selection within quota is non-random (by convenience or judgment).
WATCH OUT
Treating non-random sampling as acceptable for generalising to the population
Non-random (non-probability) sampling cannot be used to make inferences about the whole population because sampling error cannot be measured. Results may be biased.

Practice problems

Try each one yourself before reading the solution. Active recall beats rereading.

Q1 (Easy · primary-vs-secondary)
Distinguish between primary data and secondary data. Give one example of each from the Indian economy.
Solution
Primary data: Collected first-hand by the researcher for their specific purpose. Example: A researcher conducting a household survey on consumer spending in Mumbai to study the impact of rising prices.
Secondary data: Already collected by someone else for another purpose. Example: Using Census of India data on literacy rates to study the relationship between education and poverty.
Key difference: Primary data is original and specific to the research question; secondary data is second-hand and may require adaptation.
Q2 (Medium · sampling-methods)
Explain three methods of random sampling with examples. Why is random sampling preferred over non-random sampling?
Solution
Three methods of random (probability) sampling:
  1. Simple Random Sampling: Every unit has an equal chance of selection. Example: selecting 500 students from a list of 10,000 by lottery or random number tables, like drawing names from a hat.
  2. Stratified Sampling: The population is divided into groups (strata) first, then random samples are taken from each stratum. Example: a survey on employment divides workers into formal and informal sectors, urban and rural, then randomly samples from each group, which ensures all groups are represented.
  3. Systematic Sampling: Select every Kth unit from a list, where K = population size / sample size. Example: to draw a sample of 50 houses from a street of 1,000 houses, K = 1000/50 = 20, so every 20th house is surveyed.
Why random sampling is preferred: every unit has a known, non-zero probability of selection, so:
  (a) sampling error can be measured and reduced;
  (b) results can be generalised to the population with a known level of confidence;
  (c) there is no selection bias from the investigator's judgment.
Non-random methods (convenience, judgment) carry bias that cannot be measured.
Q3 (Hard · data-sources)
What is a census survey? Compare it with a sample survey. Discuss five important sources of secondary data in India and what each provides.
Solution
Census survey (complete enumeration): Every unit in the population is surveyed. Example: India's decennial Census of India surveys every household in the country. Advantages: no sampling error; complete coverage. Disadvantages: extremely expensive; very time-consuming; prone to errors in large-scale operations.
Sample survey: Only a representative subset (sample) is surveyed and inferences are drawn about the population. Advantages: cheaper, faster, manageable. Disadvantages: sampling error exists (but can be measured if random sampling is used).
When to use: a census when the population is small or when precision is critical; a sample survey for large populations where a census is impractical.
Five major sources of secondary data in India:
  1. Census of India (every 10 years, last in 2011, next delayed): Population, literacy, occupation, housing, amenities for every village and town.
  2. NSSO/NSO surveys (National Statistical Office): Consumption expenditure, employment and unemployment (now covered by PLFS), health, education, through periodic sample surveys.
  3. RBI Bulletin (Reserve Bank of India): Banking data, money supply, forex reserves, interest rates, inflation. Published monthly.
  4. Economic Survey (Ministry of Finance, annual): Comprehensive review of India's economy, covering GDP, inflation, fiscal deficit, and sector performance.
  5. Periodic Labour Force Survey (PLFS, conducted by the NSO): Employment and unemployment data. Tracks work participation rate, wage rates, informalisation trends.

5-minute revision

The whole chapter, distilled. Read this the night before the exam.

  • Primary data = collected first-hand for specific purpose; Secondary data = already collected by someone else
  • Census = every unit surveyed; Sample = representative subset — census is expensive and time-consuming
  • Simple random sampling: every unit has equal probability (lottery method or random number tables)
  • Stratified sampling: divide population into strata, random sample from each stratum — best for heterogeneous populations
  • Systematic sampling: every Kth unit selected; K = population size / sample size
  • Non-random sampling (judgment, convenience): cannot measure sampling error; results cannot be generalised
  • Pilot survey: small-scale trial before the full survey to test questionnaire clarity and identify problems
  • Key Indian secondary data sources: Census of India, NSSO/NSO surveys, RBI Bulletin, Economic Survey, PLFS

CBSE marks blueprint

Where the marks come from in this chapter — so you can plan your prep.

Typical chapter weightage: 4-6 marks

Question type | Marks each | Typical count | What it tests
Short Answer | 3-4 | 1 | Primary vs secondary data, census vs sample, or types of sampling
Long Answer | 6 | 0-1 | Detailed comparison of sampling methods or sources of secondary data in India
Prep strategy
  • Make a comparison table: primary vs secondary data (definition, examples, advantages, disadvantages) — this table format works directly in exam answers
  • Memorise the three random sampling methods with one concrete example each; exams ask 'explain stratified sampling with example'
  • Know at least four Indian secondary data sources with what each provides — Census, NSSO/PLFS, RBI Bulletin, Economic Survey

Where this shows up in the real world

This chapter isn't just an exam topic — it lives in the world around you.

NSSO Consumption Surveys

India's National Sample Survey Office uses a stratified, multi-stage random sampling design to survey lakhs of households across India. The consumption data collected feeds directly into poverty estimates and food security policy.

RBI Policy Surveys

RBI collects primary data through its periodic surveys of households (inflation expectations) and industries (order books, capacity utilisation). This data directly shapes India's monetary policy decisions.

Exam strategy

Battle-tested tips from teachers and toppers for this chapter.

  1. When comparing census vs sample: structure the answer as a table — examiners award marks for clear comparison, not long paragraphs
  2. For sampling methods: name the method, give the definition with the probability criterion, then give one concrete example. These three steps score full marks
  3. Always distinguish random from non-random sampling by the key criterion: in random sampling, probability of selection is KNOWN and NON-ZERO
  4. Secondary data questions: be specific — name the institution, what the data covers, and how it is used in India (e.g., PLFS tracks employment and unemployment trends)

Going beyond the textbook

For olympiad aspirants and curious learners — topics that build on this chapter.

  • Explore the concept of sampling frame and sampling frame errors — a problem where the list used for sampling does not match the target population
  • Study cluster sampling as a fourth probability method — used by large surveys like NFHS when a complete population list is unavailable

Where else this chapter is tested

CBSE board isn't the only one — other exams test this chapter too.

CBSE Class 11 Board: High
CUET: Medium
Research Methods (undergraduate): Medium

Questions students ask

The real ones — pulled from the Q&A community and tutor sessions.

Q: What is the difference between a questionnaire and a schedule?
A questionnaire is mailed or handed to the respondent, who fills it in themselves. A schedule is filled in by the enumerator (interviewer) based on the respondent's answers during a personal interview. Schedules are more reliable for illiterate populations.

Q: Is a given data set always either primary or secondary?
No. Data is primary for the researcher who collected it. For anyone who uses it later for a different purpose, it is secondary data. The classification depends on who collected it and for what purpose.
Verified by the tuition.in editorial team
Last reviewed on 26 May 2026. Written and reviewed by subject-matter experts.