Ethical Data Collection Concerns and Solutions
Consent
The first ethical data collection concern is consent. This is especially relevant because the targets are teenagers, most of whom will be under eighteen. It could be argued that they are not mature enough to understand the consequences of their actions. Additionally, Instagram's current data collection rests on what can be classified as an “act of consent” rather than on explicit, informed consent. This is unethical because, without explicit and informed consent, users do not fully understand how their data is used, which allows the individual to be treated as a means to an end.
The solution to this ethical concern is that users who choose to opt into the cybersecurity feature on Instagram will be given clear, additional information about what other voluntary information they can provide, how it will be used, what benefits exist for the user, and how they can opt out at any time. Essentially, the project, a feature on Instagram, will obtain informed consent from the user before they opt into this feature. Furthermore, the wording of the consent request will be checked by the model makers and presented to focus groups to ensure that teenagers can fully understand and consent to this data collection and feature usage.
Sensitive Data
The second ethical data collection concern is the collection of sensitive data. Demographic predictors and texts that correspond to a particular user can be used to create an even more thorough profile than what Instagram already collects. This is especially true because the engineered features that reveal popularity, influence, and frequency of harmful messages can also be correlated with factors such as mental health. This is an ethical concern because Instagram also sells its data to advertisers, who could potentially use these predictors for malicious purposes. For example, a mental health start-up focused on connecting cash-only providers with patients could prey on vulnerable teenagers.
The solution to this ethical concern is multi-pronged. It is worth noting first that the collection of demographic factors and comments provides invaluable information about how likely one is to be cyberbullied, since certain groups get bullied more and comments are the most prevalent form of cyberbullying. Therefore, including these factors helps identify potential victims and introduce real-life interventions. Because this feature is created to keep teenagers safe, the data collected should not be sold to advertisers or revealed to any external parties that don’t share the purpose of protecting teens from cyberbullying. While this may face pushback from Instagram sales teams, the potential legal ramifications of selling sensitive data from minors ought to give Instagram an incentive to keep this data in house. In addition, we aim to protect this data through encryption and anonymization. We could also use data masking to hide sensitive information such as particular comments and user IDs while still allowing the data to be worked with. Finally, audit trails would record any changes made after the base training data collection, and these records can be referred to later if problems arise.
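As a rough illustration of the masking and audit-trail ideas, the sketch below shows one way a single record could be handled before it reaches the training data. It is only a minimal example under stated assumptions: the field names (user_id, comment, label), the SECRET_KEY value, and the helper functions are hypothetical and do not describe Instagram's systems or our final pipeline. Note also that a keyed hash pseudonymizes a user ID rather than fully anonymizing it, so the key itself would need to be protected.

```python
import hashlib
import hmac
from datetime import datetime, timezone

# Hypothetical secret; in practice it would come from a key-management service.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize_user_id(user_id: str, key: bytes = SECRET_KEY) -> str:
    """Replace a user ID with a keyed hash so records stay linkable but unreadable."""
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_comment(text: str) -> str:
    """Hide the comment's content while keeping its length usable as a feature."""
    return "*" * len(text)


def mask_record(record: dict, audit_log: list) -> dict:
    """Return a masked copy of the record and note the change in the audit log."""
    masked = {
        **record,
        "user_id": pseudonymize_user_id(record["user_id"]),
        "comment": mask_comment(record["comment"]),
    }
    # Audit entry: what was done, to which fields, and when (UTC).
    audit_log.append({
        "action": "mask_record",
        "fields": ["user_id", "comment"],
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    return masked


# Assumed example record; the values are made up for illustration.
audit_log = []
raw = {"user_id": "teen_123", "comment": "example text", "label": 1}
masked = mask_record(raw, audit_log)
```

Because the same user ID always maps to the same hash under a given key, the masked data can still be grouped by user for modeling, while the original ID and comment text are not exposed to the people working with it.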