Data Sources
Where our India PIN code data comes from and how we process it
Primary Data Source — GeoNames
GeoNames is a free geographical database covering all countries of the world, available under the Creative Commons Attribution 4.0 License.
We use the India data file (IN.txt) from GeoNames, which contains postal codes, place names, administrative divisions, and geographic coordinates for locations across India.
- 📄 Format: Tab-separated values (TSV)
- 🏳️ Country: India (IN)
- 📊 Records: ~155,000 rows
- 🔄 Updated: Periodically by GeoNames
- 📜 License: Creative Commons Attribution 4.0
Data Fields
Each GeoNames record for India contains the following fields that we use in our import pipeline:
| Field | Description | Used For |
|---|---|---|
| country_code | Always "IN" for India | Validation |
| postal_code | 6-digit PIN code | PIN Code records |
| place_name | Area / locality name | Area locations |
| admin_name1 | State name | State hierarchy |
| admin_name2 | District name | District hierarchy |
| admin_name3 | City/Taluk name | City hierarchy |
| latitude | Geographic latitude | Coordinates |
| longitude | Geographic longitude | Coordinates |
Our 6-Stage Processing Pipeline
Raw GeoNames data is processed through a 6-stage pipeline before being stored in our database:
Stage 1 is generator-based TSV parsing: it reads the IN.txt file line by line with minimal memory usage and handles encoding edge cases and malformed rows.
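The streaming read described in this stage can be sketched as follows. This is a minimal illustration in Python, not the pipeline's actual code. The column positions are an assumption based on the published GeoNames postal-code dump layout, and `RawRow`, `iter_rows`, and `parse_file` are hypothetical names.

```python
from typing import Dict, Iterable, Iterator, NamedTuple

# Column positions assumed from the GeoNames postal-code dump layout
# (admin code columns sit between the admin name columns).
COLUMNS = {
    "country_code": 0,
    "postal_code": 1,
    "place_name": 2,
    "admin_name1": 3,
    "admin_name2": 5,
    "admin_name3": 7,
    "latitude": 9,
    "longitude": 10,
}
MIN_COLUMNS = max(COLUMNS.values()) + 1


class RawRow(NamedTuple):
    line_no: int
    fields: Dict[str, str]


def iter_rows(lines: Iterable[str]) -> Iterator[RawRow]:
    """Yield one record per well-formed line and skip malformed ones."""
    for line_no, line in enumerate(lines, start=1):
        parts = line.rstrip("\r\n").split("\t")
        if len(parts) < MIN_COLUMNS:
            continue  # malformed row: too few columns
        yield RawRow(line_no, {name: parts[i] for name, i in COLUMNS.items()})


def parse_file(path: str) -> Iterator[RawRow]:
    """Stream rows from disk; only one line is held in memory at a time."""
    # errors="replace" keeps the stream going if a stray non-UTF-8 byte appears.
    with open(path, encoding="utf-8", errors="replace") as handle:
        yield from iter_rows(handle)
```

Because both functions are generators, later stages can consume rows one at a time with a plain `for row in parse_file("IN.txt")` loop, without loading the whole file.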
Stage 2 normalises Unicode characters, expands abbreviations, trims whitespace, and standardises coordinate formats.
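A rough Python sketch of this kind of cleaning is shown below. The abbreviation table is hypothetical (the real list is not given here), and rounding coordinates to six decimal places is an assumption about what "standardised" means.

```python
import re
import unicodedata

# Hypothetical abbreviation table, for illustration only.
ABBREVIATIONS = {"Rd": "Road", "Ngr": "Nagar", "Stn": "Station"}


def normalise_name(raw: str) -> str:
    """Compose Unicode to NFC, collapse whitespace, expand known abbreviations."""
    text = unicodedata.normalize("NFC", raw)
    text = re.sub(r"\s+", " ", text).strip()
    # Look each word up without its trailing dot; unknown words pass through.
    words = [ABBREVIATIONS.get(word.rstrip("."), word) for word in text.split(" ")]
    return " ".join(words)


def normalise_coordinate(raw: str, places: int = 6) -> float:
    """Parse a coordinate string, accepting a decimal comma, and round it."""
    return round(float(raw.strip().replace(",", ".")), places)
```

With this table, an input such as `"  M.G.   Rd. "` comes out as `"M.G. Road"`, since only whole words that match the table are expanded.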
Stage 3 checks the PIN code format (6-digit, valid India range), verifies that coordinates are within India's geographic bounds, and flags invalid rows.
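The checks in this stage might look like the Python sketch below. Two details are assumptions rather than documented rules: the pattern treats a valid PIN as six digits with a non-zero first digit, and the bounding box is a rough approximation of India's extent. `validate_row` and the problem labels are hypothetical names.

```python
import re
from typing import List

# Assumed format rule: six digits, first digit 1-9.
PIN_RE = re.compile(r"[1-9][0-9]{5}")

# Approximate bounding box (degrees), an assumption for illustration.
LAT_RANGE = (6.0, 37.5)
LON_RANGE = (68.0, 97.5)


def validate_row(pin: str, lat: float, lon: float) -> List[str]:
    """Return a list of problem labels; an empty list means the row passed."""
    problems = []
    if not PIN_RE.fullmatch(pin):
        problems.append("bad_pin_format")
    if not LAT_RANGE[0] <= lat <= LAT_RANGE[1]:
        problems.append("latitude_out_of_bounds")
    if not LON_RANGE[0] <= lon <= LON_RANGE[1]:
        problems.append("longitude_out_of_bounds")
    return problems
```

Returning labels instead of raising lets the caller flag a row and keep going, which matches the "flags invalid rows" behaviour described above.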
Stage 4 generates URL-safe slugs for all location names using PHP's Intl transliteration and custom rules for Indian place names.
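To illustrate the general shape of slug generation, here is a small Python sketch. It is a stand-in, not the method described above: it uses Python's `unicodedata` decomposition instead of PHP's Intl transliteration, and it has no custom rules for Indian place names. As a result it only strips diacritics from Latin-script names; text in other scripts (for example Devanagari) is not romanised and would produce an empty slug.

```python
import re
import unicodedata


def slugify(name: str) -> str:
    """Lowercase, strip diacritics, and join alphanumeric runs with hyphens."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop combining marks so that, e.g., "a" + macron becomes plain "a".
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    slug = re.sub(r"[^a-z0-9]+", "-", without_marks.lower())
    return slug.strip("-")
```

Any run of characters outside `a-z` and `0-9` collapses into a single hyphen, so spaces, brackets, and punctuation never leak into a URL.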
Stage 5 creates or retrieves State, District, City, and Area location records with correct parent-child relationships. It avoids duplicate insertions by using get-or-create logic.
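Get-or-create over a parent-child hierarchy can be sketched with Python's built-in `sqlite3` module. The `locations` table, its columns, and the function names are hypothetical; the real schema is not shown in this document.

```python
import sqlite3
from typing import Optional

# Hypothetical single-table hierarchy: each row points at its parent.
SCHEMA = """
CREATE TABLE locations (
    id        INTEGER PRIMARY KEY,
    parent_id INTEGER REFERENCES locations(id),
    level     TEXT NOT NULL,
    name      TEXT NOT NULL
);
"""


def get_or_create(db: sqlite3.Connection, level: str, name: str,
                  parent_id: Optional[int] = None) -> int:
    """Return the id of the matching row, inserting it first if missing."""
    # "IS ?" also matches NULL, so top-level states (no parent) are found too.
    row = db.execute(
        "SELECT id FROM locations WHERE level = ? AND name = ? AND parent_id IS ?",
        (level, name, parent_id),
    ).fetchone()
    if row:
        return row[0]
    cursor = db.execute(
        "INSERT INTO locations (parent_id, level, name) VALUES (?, ?, ?)",
        (parent_id, level, name),
    )
    return cursor.lastrowid


def ensure_hierarchy(db: sqlite3.Connection, state: str, district: str,
                     city: str, area: str) -> int:
    """Walk State > District > City > Area and return the area's id."""
    parent = None
    for level, name in (("state", state), ("district", district),
                        ("city", city), ("area", area)):
        parent = get_or_create(db, level, name, parent)
    return parent
```

Matching on the parent as well as the name means two districts can each contain an area with the same name without being merged into one record.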
Stage 6 creates Area ↔ PIN Code mappings in the area_pincode_map table. One PIN code can serve multiple areas, and one area can have multiple PIN codes.
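The many-to-many relationship can be sketched as a join table, again using Python's `sqlite3`. Only the table name `area_pincode_map` comes from the description above; the column names, the composite primary key, and the helper functions are assumptions.

```python
import sqlite3
from typing import List

# Column names and the composite key are assumed for illustration.
SCHEMA = """
CREATE TABLE area_pincode_map (
    area_id    INTEGER NOT NULL,
    pincode_id INTEGER NOT NULL,
    PRIMARY KEY (area_id, pincode_id)
);
"""


def link(db: sqlite3.Connection, area_id: int, pincode_id: int) -> None:
    """Record that an area is served by a PIN code; repeat calls are no-ops."""
    db.execute(
        "INSERT OR IGNORE INTO area_pincode_map (area_id, pincode_id) VALUES (?, ?)",
        (area_id, pincode_id),
    )


def areas_for_pincode(db: sqlite3.Connection, pincode_id: int) -> List[int]:
    rows = db.execute(
        "SELECT area_id FROM area_pincode_map WHERE pincode_id = ? ORDER BY area_id",
        (pincode_id,),
    )
    return [row[0] for row in rows]


def pincodes_for_area(db: sqlite3.Connection, area_id: int) -> List[int]:
    rows = db.execute(
        "SELECT pincode_id FROM area_pincode_map WHERE area_id = ? ORDER BY pincode_id",
        (area_id,),
    )
    return [row[0] for row in rows]
```

The composite primary key lets the same area appear under several PIN codes and the same PIN code under several areas, while `INSERT OR IGNORE` keeps a re-run of the import from duplicating a pair.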
Attribution
Data is provided by GeoNames under the Creative Commons Attribution 4.0 International License.
GeoNames data may not be 100% accurate for every location. If you find an error, please report it to GeoNames directly or contact us through the admin panel.