Size: 200GB compressed (split into separate 300MB archives), 1.4TB uncompressed
Format: JSON
Lines: 400 million
Some of the fields per entry:
full_name, first_name, last_name, gender, birth_date (few entries have this data), linkedin_url, username, id, last_updated, inferred salary, inferred years of experience, emails [array] (~100M of the 400M lines have emails, ~140M emails in total), skills [array], phone_numbers [array], interests [array], other socials (fb, github, twitter), languages [array], location (locality, region, country, continent, coordinates)
Fields with detailed subdata:
jobs [array], education [array], experience [array]
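
The size and coverage figures above imply a few derived numbers (approximate archive count, compression ratio, average entry size, email coverage). A minimal sketch of that arithmetic follows. It assumes decimal units (1GB = 1000MB, 1TB = 1000GB) and treats the quoted figures as exact, which they are not; the variable names are illustrative only and do not come from the data.

```python
# Figures quoted in the listing; all are rounded estimates.
# Assumption: decimal units (1 GB = 1000 MB, 1 TB = 1000 GB).
compressed_gb = 200
archive_mb = 300
uncompressed_tb = 1.4
total_lines = 400_000_000
lines_with_emails = 100_000_000
total_emails = 140_000_000

# Number of 300MB archives needed to hold 200GB (roughly 667).
archive_count = compressed_gb * 1000 / archive_mb

# Uncompressed size divided by compressed size (about 7x).
compression_ratio = uncompressed_tb * 1000 / compressed_gb

# Average uncompressed size of one line (about 3500 bytes).
bytes_per_line = uncompressed_tb * 1e12 / total_lines

# Share of lines with at least one email (about 25%).
email_coverage = lines_with_emails / total_lines

# Average emails per line among lines that have any (about 1.4).
emails_per_covered_line = total_emails / lines_with_emails

print(f"archives:            ~{archive_count:.0f}")
print(f"compression ratio:   ~{compression_ratio:.1f}x")
print(f"avg bytes per line:  ~{bytes_per_line:.0f}")
print(f"email coverage:      ~{email_coverage:.0%}")
print(f"emails per line with email: ~{emails_per_covered_line:.1f}")
```

With binary units (1GB = 1024MB) the archive count and per-line size shift by a few percent, so these are order-of-magnitude estimates only.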