The HHS Secretary's Advisory Committee on Human Research Protections (SACHRP) issued draft guidance in March 2013 on Internet Research. That guidance provides a very detailed discussion of the ethical issues involved in internet research. The SACHRP guidance is available at http://www.hhs.gov/ohrp/sachrp/mtgings/2013%20March%20Mtg/internet_research.pdf
The SBS IRB considers information and data on websites that are freely accessible to anyone with internet access (without requiring user registration, a log-in password, etc.) to be analogous to information that is observable in physical public spaces, so long as de-identification, aggregation, and other usual anonymizing practices are part of the research design.
Many studies recruit through services that maintain contact with a large pool of volunteer or paid potential research subjects. Those subjects are invited to participate in a specific web-mediated study or survey, and they are asked to consent to that study when they participate. We therefore treat "recruitment" and "consent" as distinct phases. We allow this kind of recruitment so long as we have reviewed the web service's standards for keeping identifying and other contact information for potential research subjects separate from any demographic information they supply, in otherwise anonymized fashion, as variables of the research protocol.
When third-party web-based databases are downloaded and used for research, we are concerned with whether specific individual research subjects can be identified from the way the data are presented. If the researcher links data from two or more databases, we are particularly concerned with the maintenance of anonymity, because data cells become more and more specific as variables are combined and each cell may come to describe only a few individuals.
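As an illustration only, and not a required procedure, the concern about increasingly specific data cells can be checked with a simple count of how many records share each combination of variables. The sketch below is written in Python. The function name `small_cells`, the threshold `k`, and the field names `region`, `age_band`, and `occupation` are all hypothetical choices made for this example.

```python
from collections import Counter


def small_cells(records, fields, k=5):
    """Return each combination of `fields` values shared by fewer than k records.

    `k` is a hypothetical threshold chosen for illustration, not an IRB standard.
    """
    counts = Counter(tuple(rec.get(f) for f in fields) for rec in records)
    return {combo: n for combo, n in counts.items() if n < k}


# Hypothetical records produced by linking two databases.
linked = [
    {"region": "north", "age_band": "30-39", "occupation": "nurse"},
    {"region": "north", "age_band": "30-39", "occupation": "teacher"},
    {"region": "south", "age_band": "60-69", "occupation": "pilot"},
    {"region": "south", "age_band": "60-69", "occupation": "pilot"},
]

# Two coarse variables: every cell holds at least two records.
coarse = small_cells(linked, ["region", "age_band"], k=2)

# Adding a third variable from the second database isolates single records.
fine = small_cells(linked, ["region", "age_band", "occupation"], k=2)

print(coarse)
print(fine)
```

In this made-up data set, no cell is flagged when only two variables are used, but adding the third variable leaves two records alone in their cells. A count of this kind is only a rough screen; it does not by itself establish that a data set is anonymous.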
Of course, just as in non-internet studies, researchers may often wish to have access to actual identifiers for research subjects who have properly consented to participate, and to archive these identifiers for various periods of time. We understand researchers' needs for identifiers, but will ask that appropriate protections for data confidentiality be in place.
Recently, researchers found a data security vulnerability in Amazon Mechanical Turk (mTurk) that can allow mTurk worker IDs to be connected to personally identifying information that mTurk workers post on their Amazon profile pages. For a thorough discussion of this topic, see the journal article titled "Mechanical Turk is Not Anonymous," available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2228728. The IRB will expect mTurk consent scripts to include appropriate language that alerts mTurk participants to this vulnerability and explains how participant confidentiality will be protected. For example, a consent form might say, "Please be aware that any work performed on Amazon mTurk can potentially be linked to information about you on your Amazon public profile page, depending on the settings you have for your Amazon profile. We will not be accessing any personally identifying information about you that you may have put on your Amazon public profile page. We will store your mTurk worker ID separately from the other information you provide to us."
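One way a researcher might keep worker IDs apart from the other information is sketched below in Python. This is a minimal illustration under stated assumptions, not a procedure prescribed by the IRB or by Amazon. The function name `split_identifiers`, the field names `worker_id` and `study_code`, and the sample worker IDs are hypothetical.

```python
import secrets


def split_identifiers(submissions, id_field="worker_id"):
    """Split submissions into a linking table and de-identified responses.

    The linking table (worker ID -> random study code) is meant to be stored
    separately from the responses, with restricted access.
    """
    linking_table = {}
    responses = []
    for sub in submissions:
        worker_id = sub[id_field]
        # Reuse the same random code if a worker submits more than once.
        if worker_id not in linking_table:
            linking_table[worker_id] = secrets.token_hex(8)
        # Copy every field except the worker ID into the response record.
        record = {key: value for key, value in sub.items() if key != id_field}
        record["study_code"] = linking_table[worker_id]
        responses.append(record)
    return linking_table, responses


# Hypothetical submissions; the worker IDs are placeholders, not real IDs.
submissions = [
    {"worker_id": "WORKER_A", "q1": 4, "q2": "yes"},
    {"worker_id": "WORKER_B", "q1": 2, "q2": "no"},
    {"worker_id": "WORKER_A", "q1": 5, "q2": "yes"},
]

linking_table, responses = split_identifiers(submissions)
```

After this step, the `responses` list carries only the random study code, so analysis files need not contain worker IDs at all. The linking table can then be kept in a separate, access-restricted location and destroyed when it is no longer needed.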