You may have seen the terms “data scraping” or “web scraping” in news stories lately and asked yourself, “What is data scraping?” Look no further: the pros in OIT have the answer.
What information can be scraped?
Websites can hold a lot of important data, much of it provided by users themselves. Think for a moment: what information have you made publicly available on the web? Perhaps in your Facebook or LinkedIn profile you’ve shared your name, degrees, relationships, location and work history. How many of your account security questions can be answered with this information? For example: what was your first job, or what is your maternal grandfather’s name?
How does it work?
Malicious actors use web scraping tools to extract data from websites automatically. The tools read a page, pull out the pieces of interest and turn them into feeds of information that are easy to parse and analyze. Content can be scraped from multiple websites (a phone number here, an email address there) and then combined to build a complete profile of a user.
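To make the idea concrete, here is a minimal sketch in Python of how fragments from two pages could be pulled out and merged into one profile. Everything in it is an assumption made for illustration: the two page snippets, the person, the contact details and the helper names (TextExtractor, scrape, build_profile) are all made up, and the pages are plain strings rather than real websites. Real scraping tools are far more elaborate; this only shows the principle.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text found between HTML tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def page_text(html):
    """Return the text content of an HTML string with the tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Simplified patterns chosen for this example; real data is messier.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\d{3}-\d{3}-\d{4}")


def scrape(html):
    """Extract email addresses and phone numbers from one page."""
    text = page_text(html)
    return {"emails": set(EMAIL.findall(text)), "phones": set(PHONE.findall(text))}


def build_profile(pages):
    """Merge what was found on each page into a single profile."""
    profile = {"emails": set(), "phones": set()}
    for html in pages:
        for field, values in scrape(html).items():
            profile[field] |= values
    return profile


# Made-up pages standing in for two different public websites.
SITE_A = "<html><body><h1>Jane Doe</h1><p>Contact: jane.doe@example.com</p></body></html>"
SITE_B = "<div class='card'><span>Jane Doe</span> <span>Phone: 555-010-0199</span></div>"

profile = build_profile([SITE_A, SITE_B])
print(profile)
```

Neither page alone reveals both the email address and the phone number, but the merged profile contains both. That is why small details scattered across several sites can add up to more than you intended to share.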
Note that data scraping is not always used for malicious activities. Marketing companies, content creators and designers often use data scraping tools to research their customer base, find leads or personalize advertisements. Although these uses are not malicious, consider asking yourself: do I want companies to know this much about me?
Here is a key tip for limiting how much of your personal information can be scraped from the web: limit what you put out there! The less of your personal information that exists online, the less there is to be scraped and used, whether by marketing companies or by malicious actors.
Recently, scraped data from 500 million LinkedIn users was posted online. To be clear, LinkedIn did not experience a data breach. Rather, actors scraped publicly available content from the site, compiled it and then posted it online. The data included account usernames, full names, email addresses, phone numbers, workplace information, genders and links to other social media accounts.
Review your online profiles and consider what you want malicious actors and marketers to know about you. Remember that information posted online can reach audiences beyond your “friends,” so be cautious about what you share.