By the end of this chapter you'll be able to…

  1. Identify types of correlation (positive/negative, linear/non-linear, simple/multiple/partial) with economic examples
  2. Interpret degrees of correlation: perfect (±1), high (±0.7–0.99), moderate, low, and zero
  3. Draw and interpret a scatter diagram to visually assess the direction and strength of correlation
  4. Calculate Karl Pearson's coefficient of correlation (r) using the deviation method
  5. Calculate Spearman's rank correlation coefficient and explain when to use it instead of Pearson's r
💡
Why this chapter matters
Correlation is the tool that measures whether two economic variables move together — GDP and employment, savings and investment, price and demand. Understanding that correlation does not imply causation is one of the most important critical thinking skills in economics.

Correlation

"Correlation does not imply causation. But it often whispers where to look."

1. Chapter Overview

CORRELATION measures the STRENGTH and DIRECTION of the RELATIONSHIP between TWO variables. This chapter covers: types of correlation (positive/negative, linear/non-linear, simple/multiple/partial), DEGREES of correlation (perfect, high, low, zero), and KARL PEARSON'S COEFFICIENT OF CORRELATION (r).


2. What Is Correlation?

  • A STATISTICAL MEASURE of the relationship between two variables
  • Answers: When X changes, does Y change SYSTEMATICALLY? In which DIRECTION? How STRONGLY?

Types of Correlation

By Direction | By Form | By Number of Variables
Positive: X↑ → Y↑ (height & weight) | Linear: Points cluster around a straight line | Simple: Two variables
Negative: X↑ → Y↓ (price & quantity demanded) | Non-linear: Relationship is curved | Multiple: 3+ variables
Zero: No relationship | | Partial: Controlling for other variables

Degree of Correlation

  • Perfect (r = +1 or -1): ALL points fall exactly on a straight line
  • High (r ~ ±0.7 to ±0.99): Strong relationship
  • Moderate (r ~ ±0.3 to ±0.7): Moderate relationship
  • Low (r ~ 0 to ±0.3): Weak relationship
  • Zero (r = 0): No linear relationship (there could be non-linear)
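
The degree bands above can be turned into a small lookup. This is only a sketch: the function name `degree_of_correlation` is made up for illustration, and treating exactly 0.7 as "high" and exactly 0.3 as "moderate" is an assumption, since the bands above overlap at their edges.

```python
def degree_of_correlation(r: float) -> str:
    """Label r using the cut-offs listed above (0.3 and 0.7).

    Assumption: a value sitting exactly on a boundary goes to the stronger band.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size == 0.0:
        return "zero"
    if size == 1.0:
        strength = "perfect"
    elif size >= 0.7:
        strength = "high"
    elif size >= 0.3:
        strength = "moderate"
    else:
        strength = "low"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"


print(degree_of_correlation(0.98))   # high positive
print(degree_of_correlation(-0.45))  # moderate negative
print(degree_of_correlation(0.0))    # zero
```

The label describes only the LINEAR relationship; a "zero" result does not rule out a curved one.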

3. Methods of Measuring Correlation

1. Scatter Diagram

  • Plot (X, Y) pairs on a graph. Each point = one observation.
  • The pattern of dots SHOWS the relationship VISUALLY
  • Rough idea. Not precise. Good for FIRST GLANCE.
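
To see what "plotting the pairs" means in practice, here is a rough text-only scatter diagram. It is a minimal sketch, not a charting tool: the helper `text_scatter` and the grid size are invented for illustration, the data are the X and Y values from practice problem Q3, and the code assumes X and Y each take at least two different values.

```python
def text_scatter(xs, ys, width=21, height=11):
    """Return a rough text scatter plot; each '*' is one (X, Y) observation."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column/row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 of the grid is the TOP of the plot
    body = "\n".join("|" + "".join(line) for line in grid)
    return body + "\n+" + "-" * width


X = [10, 20, 30, 40, 50]
Y = [25, 35, 55, 65, 70]
plot = text_scatter(X, Y)
print(plot)
```

The stars climb from bottom-left to top-right, which is the visual signature of positive correlation.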

2. Karl Pearson's Coefficient of Correlation (r)

r = Cov(X, Y) / (σX × σY)

Or equivalently:

r = Σ(X − X̄)(Y − Ȳ) / √[Σ(X − X̄)² · Σ(Y − Ȳ)²]

Properties of r

  • Always between −1 and +1 (inclusive)
  • +1: Perfect positive correlation. All points on a RISING line.
  • −1: Perfect negative correlation. All points on a FALLING line.
  • 0: No LINEAR correlation
  • r is UNIT-FREE (doesn't depend on the units of X or Y)
  • r is SYMMETRIC: Correlation(X,Y) = Correlation(Y,X)
  • r is AFFECTED by outliers
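
Three of these properties (unit-free, symmetric, affected by outliers) are easy to check numerically. The sketch below uses made-up height and weight figures and a hand-written `pearson_r` helper based on deviations from the means; both are assumptions for illustration only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from deviations about the actual means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    return sum_dxdy / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Hypothetical data: heights in cm and weights in kg for five people.
heights_cm = [150, 160, 165, 172, 180]
weights_kg = [50, 56, 61, 68, 77]

r = pearson_r(heights_cm, weights_kg)

# UNIT-FREE: switching height from cm to metres leaves r unchanged.
r_metres = pearson_r([h / 100 for h in heights_cm], weights_kg)

# SYMMETRIC: swapping X and Y leaves r unchanged.
r_swapped = pearson_r(weights_kg, heights_cm)

# AFFECTED BY OUTLIERS: one tall-but-very-light person pulls r down sharply.
r_outlier = pearson_r(heights_cm + [185], weights_kg + [40])

print(round(r, 3), round(r_metres, 3), round(r_swapped, 3), round(r_outlier, 3))
```

The first three printed values match, while the fourth is much smaller, even though only one observation was added.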

4. Correlation vs Causation — THE MOST IMPORTANT CAVEAT

Correlation ≠ Causation

  • Just because X and Y move together does NOT mean X CAUSES Y
  • Examples:
    • Ice cream sales and drowning deaths are POSITIVELY correlated. Does ice cream cause drowning? NO. Both increase in SUMMER (the hidden variable: temperature).
    • Shoe size and reading ability in children are positively correlated. Do big feet make you read better? NO. Older children have bigger feet AND read better. AGE is the hidden variable.

Spurious Correlation

  • A correlation that appears real but is COINCIDENTAL or explained by a THIRD VARIABLE
  • Always ask: is there a LOGICAL CONNECTION? Could a THIRD VARIABLE explain this?
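
The ice cream and drowning example can be imitated with numbers. The monthly figures below are INVENTED purely for illustration (they are not real data), and `pearson_r` is a hand-written helper; the point is only that two series which both follow temperature end up highly correlated with each other.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from deviations about the actual means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    return sum_dxdy / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


# Invented monthly figures, January to December.
temperature = [12, 15, 20, 25, 30, 34, 36, 33, 28, 22, 16, 13]    # degrees C
ice_cream = [30, 38, 52, 70, 85, 98, 104, 95, 80, 60, 42, 33]     # sales, thousand units
drownings = [2, 2, 4, 5, 7, 9, 9, 8, 6, 4, 3, 2]                  # reported cases

r_ice_drown = pearson_r(ice_cream, drownings)
r_temp_ice = pearson_r(temperature, ice_cream)
r_temp_drown = pearson_r(temperature, drownings)

print("ice cream vs drownings:", round(r_ice_drown, 2))
print("temperature vs ice cream:", round(r_temp_ice, 2))
print("temperature vs drownings:", round(r_temp_drown, 2))
```

All three values come out high. The first one looks alarming on its own, but the other two show the likely explanation: temperature is the THIRD VARIABLE driving both series.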

5. Exam Focus

  1. Types — positive/negative, linear/non-linear, simple/multiple
  2. Degree — perfect, high, moderate, low, zero
  3. Scatter diagram — visual inspection of correlation
  4. Karl Pearson's r — formula, properties (range -1 to +1, unit-free, symmetric)
  5. Correlation ≠ Causation — examples

6. Conclusion

Correlation is the first step in understanding relationships between variables:

  • SCATTER DIAGRAM: Look at the dots. Do they suggest a pattern?
  • PEARSON'S r: The numerical measure. -1 to +1. The STRENGTH and DIRECTION of the linear relationship.
  • CAUSATION: Correlation is a CLUE, not a CONCLUSION. Always ask: WHY? What's the MECHANISM? Is there a THIRD VARIABLE?

'The data say A and B go together. The scientist asks: but WHY? Correlation opens the door. Causation walks through it.'

Key formulas & results

Everything you need to memorise, in one card. Screenshot this for revision.

Karl Pearson's r — Deviation Method
r = Σdxdy / √(Σdx² × Σdy²), where dx = X − X̄ and dy = Y − Ȳ
Most commonly used formula in CBSE exams; dx and dy are deviations from respective means
Karl Pearson's r — Actual Mean Method
r = Σ(X − X̄)(Y − Ȳ) / √[Σ(X − X̄)² · Σ(Y − Ȳ)²]
Equivalent to deviation method; numerator = covariance × N; denominator = σX × σY × N
Karl Pearson's r — Assumed Mean (Step Deviation) Method
r = [NΣdxdy − ΣdxΣdy] / {√[NΣdx² − (Σdx)²] × √[NΣdy² − (Σdy)²]}
Used when actual means are in decimals; here dx = X − A and dy = Y − B are deviations from the assumed means A and B; both square roots are in the denominator; frequently used in exams
Spearman's Rank Correlation
r = 1 − (6Σd²) / [N(N² − 1)], where d = difference between ranks of corresponding X and Y values
Used for ordinal data (ranks), qualitative variables, or when the data is not normally distributed; d² must be summed for all pairs
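
How the three Pearson formulas on this card connect can be sketched in a few lines of algebra. This derivation is an addition for clarity: it assumes the population definitions (dividing by N) of covariance and standard deviation, and in the second part d_x and d_y are redefined as deviations from the assumed means A and B.

```latex
% Part 1: from covariance and standard deviations to the deviation formula,
% with d_x = X - \bar{X} and d_y = Y - \bar{Y}.
\begin{align*}
r = \frac{\operatorname{Cov}(X,Y)}{\sigma_X\,\sigma_Y}
  = \frac{\frac{1}{N}\sum d_x d_y}
         {\sqrt{\frac{1}{N}\sum d_x^{2}}\;\sqrt{\frac{1}{N}\sum d_y^{2}}}
  = \frac{\sum d_x d_y}{\sqrt{\sum d_x^{2}\cdot\sum d_y^{2}}}
\end{align*}

% Part 2: with assumed means, d_x = X - A and d_y = Y - B.
\begin{align*}
\sum (X-\bar{X})(Y-\bar{Y}) &= \sum d_x d_y - \frac{\sum d_x \sum d_y}{N}, \\
\sum (X-\bar{X})^{2} &= \sum d_x^{2} - \frac{\left(\sum d_x\right)^{2}}{N}, \qquad
\sum (Y-\bar{Y})^{2} = \sum d_y^{2} - \frac{\left(\sum d_y\right)^{2}}{N}, \\
r &= \frac{N\sum d_x d_y - \sum d_x \sum d_y}
          {\sqrt{N\sum d_x^{2} - \left(\sum d_x\right)^{2}}\;
           \sqrt{N\sum d_y^{2} - \left(\sum d_y\right)^{2}}}
\end{align*}
```

The last line follows by multiplying the numerator and the denominator of the actual-mean formula by N, which is why the choice of A and B does not change r.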
⚠️

Common mistakes & fixes

These are the exact errors that cost students marks in board exams. Read them once, save yourself the trouble.

WATCH OUT
Concluding that correlation proves causation
Correlation measures statistical association, NOT causation. Ice cream sales and drowning deaths are positively correlated (both rise in summer) — a third variable (temperature/season) explains both. Always ask: is there a logical mechanism? Could a third variable explain the relationship?
WATCH OUT
Incorrectly computing Σdx², Σdy², or Σdxdy in the step-deviation formula
For the assumed mean method, build a table with columns: X, Y, dx = X−A, dy = Y−B, dx², dy², dxdy. Calculate each column carefully. A single arithmetic error propagates through — show all steps for partial marks.
WATCH OUT
Forgetting that r must lie between −1 and +1
If your calculated r is less than −1 or greater than +1, there is an arithmetic error somewhere. Check that you have correctly computed Σdx², Σdy², and Σdxdy before re-examining the formula substitution.

Practice problems

Try each one yourself before reading the solution. Active recall > rereading.

Q1 · EASY · types-of-correlation
Identify the type of correlation (positive, negative, or zero) for each pair: (a) Price and quantity demanded. (b) Height and income. (c) Study hours and marks. (d) Temperature and sales of woollen clothes.
Solution
  (a) Price and quantity demanded: NEGATIVE correlation. As price rises, quantity demanded falls (Law of Demand).
  (b) Height and income: ZERO (or very low) correlation. There is no systematic economic relationship between a person's height and their income.
  (c) Study hours and marks: POSITIVE correlation. More study time generally goes with better marks.
  (d) Temperature and sales of woollen clothes: NEGATIVE correlation. As temperature rises, demand for woollens falls.
Q2 · MEDIUM · spearman-rank
Calculate Spearman's Rank Correlation for the following data on students' ranks in Maths and Economics: Student: A, B, C, D, E, F. Maths rank: 1, 2, 3, 4, 5, 6. Economics rank: 2, 4, 1, 5, 3, 6.
Solution
Step 1: Calculate d = Maths rank − Economics rank for each student: A: 1−2 = −1. B: 2−4 = −2. C: 3−1 = 2. D: 4−5 = −1. E: 5−3 = 2. F: 6−6 = 0.
Step 2: Calculate d²: 1, 4, 4, 1, 4, 0.
Step 3: Σd² = 1+4+4+1+4+0 = 14.
Step 4: N = 6. r = 1 − (6Σd²) / [N(N²−1)] = 1 − (6 × 14) / [6(36−1)] = 1 − 84/210 = 1 − 0.4 = 0.6.
Interpretation: r = 0.6 indicates moderate positive correlation. Students who rank well in Maths tend to rank well in Economics, but the relationship is not very strong.
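
The arithmetic in Q2 can be double-checked with a few lines of code. This is a sketch under one assumption: there are no tied ranks, which is the case here. The helper name `spearman_from_ranks` is invented for illustration.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's rank correlation, r = 1 - 6*sum(d^2) / [N(N^2 - 1)].

    Assumes the inputs are already ranks and that no ranks are tied.
    """
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


maths_rank = [1, 2, 3, 4, 5, 6]
economics_rank = [2, 4, 1, 5, 3, 6]

r_s = spearman_from_ranks(maths_rank, economics_rank)
print(round(r_s, 2))  # 0.6
```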
Q3 · HARD · karl-pearson
Calculate Karl Pearson's r using assumed mean method for: X: 10, 20, 30, 40, 50; Y: 25, 35, 55, 65, 70. Use assumed mean A = 30 for X and B = 55 for Y.
Solution
Step 1: Build the table, with dx = X − 30 and dy = Y − 55.

X | Y | dx | dy | dx² | dy² | dxdy
10 | 25 | −20 | −30 | 400 | 900 | 600
20 | 35 | −10 | −20 | 100 | 400 | 200
30 | 55 | 0 | 0 | 0 | 0 | 0
40 | 65 | 10 | 10 | 100 | 100 | 100
50 | 70 | 20 | 15 | 400 | 225 | 300
Σ | | 0 | −25 | 1000 | 1625 | 1200

Step 2: Read off the totals: Σdx = 0, Σdy = −25, Σdx² = 1000, Σdy² = 1625, Σdxdy = 1200, N = 5.
Step 3: r = [NΣdxdy − ΣdxΣdy] / {√[NΣdx² − (Σdx)²] × √[NΣdy² − (Σdy)²]}
= [5 × 1200 − 0 × (−25)] / {√[5 × 1000 − 0²] × √[5 × 1625 − (−25)²]}
= 6000 / {√5000 × √(8125 − 625)}
= 6000 / (√5000 × √7500)
= 6000 / (70.71 × 86.60)
= 6000 / 6123.7 ≈ 0.98.
Interpretation: r ≈ 0.98 indicates a very high positive correlation between X and Y.
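
The Q3 working can be verified in code, and the same code shows that the choice of assumed means does not affect the answer. This is a minimal sketch; the function name `pearson_assumed_mean` and the second pair of assumed means (20 and 35) are chosen here only for illustration.

```python
from math import sqrt


def pearson_assumed_mean(xs, ys, a, b):
    """Pearson's r via the assumed mean formula, with dx = X - a and dy = Y - b."""
    n = len(xs)
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    sum_dxdy = sum(p * q for p, q in zip(dx, dy))
    sum_dx2 = sum(p * p for p in dx)
    sum_dy2 = sum(q * q for q in dy)
    numerator = n * sum_dxdy - sum(dx) * sum(dy)
    # Both square roots belong in the denominator.
    denominator = sqrt(n * sum_dx2 - sum(dx) ** 2) * sqrt(n * sum_dy2 - sum(dy) ** 2)
    return numerator / denominator


X = [10, 20, 30, 40, 50]
Y = [25, 35, 55, 65, 70]

r = pearson_assumed_mean(X, Y, 30, 55)        # assumed means from the question
r_other = pearson_assumed_mean(X, Y, 20, 35)  # a different, arbitrary choice

print(round(r, 2))        # 0.98
print(round(r_other, 2))  # 0.98
```

Because any A and B give the same r, pick assumed means that keep the dx and dy columns small and easy to square.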

5-minute revision

The whole chapter, distilled. Read this the night before the exam.

  • Positive correlation: both variables move in the SAME direction (GDP and employment); Negative: OPPOSITE directions (price and quantity demanded)
  • r always lies between −1 and +1: +1 = perfect positive; −1 = perfect negative; 0 = no linear relationship
  • r is unit-free (dimensionless) and symmetric: r(X,Y) = r(Y,X)
  • Karl Pearson's r: Σdxdy / √(Σdx² × Σdy²) where dx = X − X̄, dy = Y − Ȳ
  • Step-deviation formula: [NΣdxdy − ΣdxΣdy] / {√[NΣdx² − (Σdx)²] × √[NΣdy² − (Σdy)²]}
  • Spearman's rank correlation: r = 1 − 6Σd²/[N(N²−1)]; d = difference in ranks; used for ordinal data
  • Correlation ≠ causation: a third variable (lurking variable) can create spurious correlation
  • Scatter diagram: rising cloud of points = positive; falling = negative; circular cloud = zero correlation

CBSE marks blueprint

Where the marks come from in this chapter — so you can plan your prep.

Typical chapter weightage: 6-8 marks

Question type | Marks each | Typical count | What it tests
Short Answer | 3 | 1 | Types of correlation, degree classification, scatter diagram interpretation, or correlation vs causation
Long Answer | 6 | 1 | Full computation of Karl Pearson's r or Spearman's rank correlation with a table
Prep strategy
  • For Karl Pearson's r: always set up the full table (X, Y, dx, dy, dx², dy², dxdy) — computation marks are given column by column, so a neat table is essential
  • For Spearman's: the d² table is mandatory — calculate d for each pair, square it, sum it, then apply the formula
  • Memorise the degree classification used in this chapter: r = ±1 (perfect), ±0.7–0.99 (high), ±0.3–0.7 (moderate), 0–±0.3 (low), 0 (zero); these interpretations are asked explicitly

Where this shows up in the real world

This chapter isn't just an exam topic — it lives in the world around you.

RBI Correlation Studies

RBI economists calculate the correlation between credit growth and GDP growth, between inflation and interest rates, and between rupee depreciation and import prices. These correlations inform monetary policy decisions.

Education and Income Studies

NSO surveys show a positive correlation between years of education and monthly wages in India. This correlation supports government investment in education — though economists are careful to control for other factors (family background, caste, location) before claiming causation.

Exam strategy

Battle-tested tips from teachers and toppers for this chapter.

  1. For numerical problems: always show the complete table (all columns, all rows, row sums) before substituting into the formula — each correct column earns marks independently
  2. For theory questions about types of correlation: structure the answer by direction (positive/negative) AND by form (linear/non-linear) AND by number of variables (simple/multiple/partial) — this systematic classification earns full marks
  3. Scatter diagram question: plot points carefully, draw the axis labels, and describe the pattern in words ('points show a positive linear trend') — the verbal description earns marks
  4. Correlation vs causation: always give TWO examples — one where correlation and causation coincide (education and income) and one of spurious correlation (ice cream and drowning)

Going beyond the textbook

For olympiad aspirants and curious learners — topics that build on this chapter.

  • Explore partial correlation: correlation between X and Y after controlling for the effect of a third variable Z — used to isolate the true relationship between two economic variables
  • Study the concept of autocorrelation in time series data: consecutive observations are correlated with each other, which violates the independence assumption behind the usual significance tests for Pearson's r
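
For readers who want to try partial correlation, the standard first-order formula is r_xy.z = (r_xy − r_xz × r_yz) / √[(1 − r_xz²)(1 − r_yz²)]. This formula is not given in the chapter, and the three input values below are hypothetical numbers chosen only to show the effect; treat the block as a sketch.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation between X and Y, controlling for Z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical simple correlations: X and Y are both strongly tied to Z.
r_xy, r_xz, r_yz = 0.80, 0.90, 0.85

r_xy_given_z = partial_r(r_xy, r_xz, r_yz)
print(round(r_xy_given_z, 2))
```

With these inputs the partial correlation comes out far below the simple correlation of 0.80, suggesting that most of the apparent link between X and Y runs through Z.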

Where else this chapter is tested

CBSE board isn't the only one — other exams test this chapter too.

CBSE Class 11 Board: High
CUET: High
Class 12 Economics / Statistics: Medium

Questions students ask

The real ones — pulled from the Q&A community and tutor sessions.

Use Spearman's rank correlation when: (1) data is in ranks/ordinal form (e.g., performance ratings, competitive exam ranks); (2) the data is qualitative but can be ranked (e.g., beauty, intelligence ranked by a judge); (3) the distribution is highly skewed or not normal. Pearson's r is preferred when data is quantitative, continuous, and approximately normally distributed.

r = 0 means no LINEAR relationship — the variables are uncorrelated in a straight-line sense. But there could still be a strong NON-LINEAR relationship (e.g., U-shaped or curved). Always check the scatter diagram before concluding there is no relationship.
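
A tiny example makes the r = 0 point concrete. In the sketch below Y is exactly X², so Y is completely determined by X, yet Pearson's r is zero because the relationship is U-shaped rather than linear. The data and the `pearson_r` helper are illustrative assumptions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from deviations about the actual means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    return sum_dxdy / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


X = [-3, -2, -1, 0, 1, 2, 3]
Y = [x ** 2 for x in X]  # perfect U-shaped (non-linear) relationship

r = pearson_r(X, Y)
print(round(r, 2))
```

The positive and negative halves of the U cancel in Σdxdy, so r is zero even though knowing X tells you Y exactly.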
Verified by the tuition.in editorial team
Last reviewed on 26 May 2026. Written and reviewed by subject-matter experts.