
Features (X) and Labels (y)

Features are the inputs you learn from; the label is the answer you want to predict.

What you will learn

Define features and labels
Lay data out as X and y
Use the standard X / y naming

The two parts of every dataset

Supervised learning needs your data split into two parts:

Features — the inputs, the clues you learn from (e.g. a house’s size and number of bedrooms).
Label — the single answer you want to predict (e.g. the house’s price).

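By convention, features are stored in a capital X and the label in a lowercase y. The capital letter signals a table with several columns; the lowercase letter signals a single column of values. You will see X and y in nearly every ML example, so get comfortable with them now.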

A tiny table

Here is a small dataset. The first two columns are features; the last column is the label we want to predict.

Size (sqft) — feature	Bedrooms — feature	Price (lakh) — label
600	1	30
900	2	45
1200	3	60
1500	3	72

Reading across one row: a 600 sqft, 1-bedroom home sold for 30 lakh. The features describe it; the label is the outcome.

The same table as X and y

X holds the input features; y holds the matching answers.

# Features: each inner list is one house [size, bedrooms]
X = [
    [600, 1],
    [900, 2],
    [1200, 3],
    [1500, 3],
]

# Label: the matching price for each house, in lakh
y = [30, 45, 60, 72]

print('Number of houses:', len(X))
print('Features of house 3:', X[2], '-> price', y[2])

Output:
Number of houses: 4
Features of house 3: [1200, 3] -> price 60

Note: House 3 is X[2] because Python counts from 0. X and y line up row by row: X[2] describes the same house whose price is y[2]. Keeping them aligned is essential.
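
Data often arrives as whole rows, with the label in the last column, just like the table. The sketch below shows one way to split such rows into X and y. The name rows is an assumption made for this illustration; it is not part of the lesson's own code.

```python
# Each row is one house from the table: [size_sqft, bedrooms, price_lakh]
rows = [
    [600, 1, 30],
    [900, 2, 45],
    [1200, 3, 60],
    [1500, 3, 72],
]

# Everything except the last column is a feature; the last column is the label
X = [row[:-1] for row in rows]
y = [row[-1] for row in rows]

print(X)  # [[600, 1], [900, 2], [1200, 3], [1500, 3]]
print(y)  # [30, 45, 60, 72]
```

Because both lists are built from the same rows in the same order, X[i] and y[i] always refer to the same house.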

Watch out: X and y must stay in the same order, with one label per row of features. If they get shuffled apart, the model learns nonsense.
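
One way to guard against this, sketched with Python's standard library: check that the lengths match, and when you need to shuffle, shuffle the (features, label) pairs together rather than each list on its own. The seed value and the names pairs, X_shuffled and y_shuffled are illustrative choices, not a fixed recipe.

```python
import random

X = [[600, 1], [900, 2], [1200, 3], [1500, 3]]
y = [30, 45, 60, 72]

# One label per row of features
assert len(X) == len(y), "X and y must have the same number of rows"

# Pair each house with its price, then shuffle the pairs as a unit
pairs = list(zip(X, y))
random.seed(0)  # fixed seed only so the run is repeatable
random.shuffle(pairs)

# Unzip back into separate lists; each house keeps its own price
X_shuffled = [features for features, _ in pairs]
y_shuffled = [label for _, label in pairs]

for features, label in zip(X_shuffled, y_shuffled):
    print(features, '->', label)
```

Whatever order the shuffle produces, every printed line still shows a house next to its own price. Shuffling X and y with two separate calls would break that pairing.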

Good features vs useless features

A feature is only worth including if it actually helps predict the label. Compare four candidate features for predicting a house price:

Candidate feature	Good or useless?	Why
Size in sqft	Good	Bigger homes really do cost more — it carries signal
Number of bedrooms	Good	More bedrooms usually means a higher price
Owner’s favourite colour	Useless	It has nothing to do with price — it is just noise
House number on the door	Useless	A door numbered 7 is not pricier than one numbered 6

Feeding in useless features does more than waste effort: it can mislead the model, which may latch onto a random coincidence in your small dataset. Picking features that truly relate to the label is half the battle in ML.
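
A rough way to see the difference between signal and noise is to measure how closely each candidate feature moves with the price. The sketch below uses Pearson correlation, which is only one simple check and not the lesson's prescribed method. The pearson helper is written here for illustration, and the door numbers are invented values, since the table above does not include them.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (v - mean_b) for x, v in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)

price = [30, 45, 60, 72]        # label, from the table above
size = [600, 900, 1200, 1500]   # real feature, from the table above
door_number = [7, 6, 12, 3]     # hypothetical values, invented for this sketch

print('size vs price:       ', round(pearson(size, price), 2))
print('door number vs price:', round(pearson(door_number, price), 2))
```

Size comes out very close to 1 (it rises and falls with price), while these door numbers land near 0. With only four rows, a different set of made-up door numbers could score high by pure chance, which is exactly the kind of coincidence a model can be fooled by.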

Tip: Choosing good features is one of the most important skills in ML. A feature like “house size” is useful for price; a feature like “owner’s favourite colour” is just noise.

Q. In ML, what is the “label” (y)?

Answer: The label (y) is the answer you want to predict. The features (X) are the inputs used to predict it.

✍️ Practice

For “predict a student’s exam score”, list two sensible features and the label.
Add a fifth house to the X and y above and print its features and price.

🏠 Homework

Design a tiny 5-row dataset for any prediction idea. Clearly mark which columns are features and which is the label.