What is Scraping? Explaining the Legal Challenges of this Convenient Data Collection Method that's Gaining Attention
As data analysis and AI technology advance, data collection is attracting growing attention, and collection through scraping in particular is gaining prominence. Scraping is convenient because it can be used easily even when a company has not accumulated enough data of its own. Depending on how it is used, however, it can cause trouble for others or even be illegal. It is therefore crucial to understand the legal issues involved before using it.
In this article, we will explain the legal issues related to scraping for businesses considering its use.
What is Scraping?
“Scraping” is a computing term that comes from the verb “to scrape,” in the sense of scraping together or gathering. It refers to the technique of extracting and collecting data or information from specific websites or programs.
It is also sometimes referred to as web scraping, web crawling, or web spidering.
In recent years, with the increasing value of data and information, more and more companies are using scraping to extract, acquire, and collect data and information.
A typical workflow is as follows. First, the necessary information is extracted and collected through scraping.
Next, the collected data is analyzed and organized into a database suited to the purpose of the scraping.
Finally, the database is provided to customers or used in the company’s own business.
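The three steps above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library, not a description of any particular company's pipeline. The HTML snippet, the `product`, `name`, and `price` class names, and the `products` table are all hypothetical, and the page is an inline string so that no real website is accessed.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would download this instead.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Widget</span><span class="price">1200</span></li>
  <li class="product"><span class="name">Gadget</span><span class="price">800</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Step 1: extract (name, price) pairs from the markup."""

    def __init__(self):
        super().__init__()
        self.rows = []       # completed (name, price) tuples
        self._field = None   # which <span> we are currently inside, if any
        self._current = {}   # fields collected for the current <li>

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("name", "price"):
            self._field = cls

    def handle_data(self, data):
        if self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current:
            self.rows.append((self._current["name"], int(self._current["price"])))
            self._current = {}


def build_database(rows):
    """Step 2: load the extracted rows into a database (in memory here)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, price INTEGER)")
    conn.executemany("INSERT INTO products VALUES (?, ?)", rows)
    return conn


parser = ProductParser()
parser.feed(SAMPLE_HTML)
conn = build_database(parser.rows)

# Step 3: use the database, for example to compute an average price.
average_price = conn.execute("SELECT AVG(price) FROM products").fetchone()[0]
print(parser.rows, average_price)
```

Whether a given site may be scraped at all is a separate question from how the extraction is written, and depends on the legal points discussed in the rest of this article.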
Cases Where Scraping Becomes a Legal Issue
Scraping does not always lead to legal issues, but it can become problematic in certain cases.
In the following, we will introduce cases where legal issues may arise.
Cases Violating Terms of Use that Prohibit Scraping
When using a specific website, if you have agreed to the website’s terms of use, you are required to use the website in accordance with these terms.
If the terms of use include a clause prohibiting scraping, those who have agreed to them naturally must not scrape the site.
If you scrape in violation of the terms of use, this may constitute a breach of contract or a tort, and the website operator may seek damages or an injunction against the scraping.
Cases Violating Copyright Law
Data and content on a website may be copyrighted, in which case they are protected by copyright law.
Therefore, when scraping, you need to be careful not to violate copyright law.
What is Copyright?
Copyright is the right to protect works.
A work is something that creatively expresses thoughts or feelings and belongs to the realm of literature, academia, art, or music (Article 2, Paragraph 1, Item 1 of the Copyright Law).
Cases Where the Data or Content Being Scraped Is Not Copyrighted
Data and content on a website are protected by copyright law when they qualify as works, but mere data that does not qualify as a work is not protected.
Therefore, when using scraping, it is necessary to check what kind of data you are collecting and consider whether it is copyrighted.
Cases Where the Data or Content Being Scraped Is Copyrighted
If the data or content being scraped is copyrighted, it is protected by copyright law.
If the scraping process involves copying such data or content, doing so without the rights holder’s consent may infringe the right of reproduction (Article 21 of the Copyright Law).
However, the use does not infringe copyright if it falls under Article 30-4 of the Copyright Law (use not intended for the enjoyment of the thoughts or feelings expressed in a work), which was introduced by the 2018 amendment to the Copyright Law.
Nor does it infringe copyright if it falls under Article 47-5 of the Copyright Law (minor use associated with information processing by electronic computers and the provision of its results).
Cases Where Scraping Places a Heavy Load on the Server
Scraping can generate a large volume of requests to a website, which can bring the server down and make the website impossible to view.
In that case, the company operating the website may be unable to conduct its business, and the person who performed the scraping may be charged with obstruction of business by deception (Article 233 of the Penal Code) or obstruction of business by damaging electronic computers (Article 234-2 of the Penal Code).
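One common way to reduce the risk of overloading a server is to cap the request rate on the client side. The following is a minimal sketch of that idea, assuming a fixed minimum interval between requests. The `Throttle` class, the `fetch_all` helper, and the injectable `clock` and `sleep` parameters are illustrative choices, not something prescribed by the provisions above, and a suitable interval depends on the site in question.

```python
import time


class Throttle:
    """Enforce a minimum interval (in seconds) between successive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock   # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None     # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_all(urls, fetch, throttle):
    """Call fetch(url) for each URL, pausing between calls as needed."""
    results = []
    for url in urls:
        throttle.wait()
        results.append(fetch(url))
    return results
```

In use, `fetch` would be whatever function actually downloads a page, and a call such as `fetch_all(urls, fetch, Throttle(1.0))` would issue at most one request per second. Throttling lowers the load on the server, but it does not by itself settle whether the access is lawful.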
Cases Violating the Personal Information Protection Law
There are cases where personal information is obtained through scraping.
When obtaining personal information, you must make the purpose of use clear to the individual. However, notifying each individual of the purpose of use separately is generally considered unrealistic.
Therefore, if you expect to obtain personal information through scraping, you need to publish a privacy policy or personal information protection policy that states the purpose of use.
Note that personal information requiring special care in handling, such as race, creed, social status, medical history, or criminal history (sensitive personal information), cannot be obtained merely by publishing such a policy. The consent of the individual is required, so caution is needed.
Personal information obtained through scraping may also be compiled into a database and provided to third parties.
In principle, however, providing it to third parties requires the individual’s prior consent (Article 27 of the Personal Information Protection Law), so caution is needed here as well.
Actual Cases Where Scraping Became an Issue
One case in which scraping became an actual issue is the Okazaki City Central Library incident, which occurred around March 2010.
In this case, the Okazaki City Central Library’s book search system became difficult to access. The cause was later found to be scraping, and the man who had performed it was arrested on suspicion of obstruction of business by deception.
The arrested man was a user of the library who was dissatisfied with the usability of its book search system, and he had been accessing the system to extract data from it.
He was detained for 20 days, but prosecution was ultimately suspended because he was not found to have had a strong intention to obstruct the library’s operations.
Although this case ended with a suspension of prosecution, a relatively lenient outcome, scraping could lead to severe punishment depending on how it is carried out, so caution is necessary.
Summary
In this article, we have explained the legal issues related to scraping for business operators considering its use.
Whether scraping gives rise to legal issues depends on how it is used, so scraping data without adequate prior research may lead to legal problems.
Judging whether a particular use of scraping will cause legal issues requires specialized knowledge. We therefore recommend that business operators considering scraping consult a lawyer with that expertise.
Introduction to Our Firm’s Measures
Monolith Law Office is a law firm with extensive expertise in both IT, particularly the internet, and law. In recent years, the need for caution when using web scraping has attracted attention, and the need for legal checks is growing. Our firm analyzes the legal risks of businesses that have already launched or are about to launch in light of the applicable laws and regulations, and aims to bring them into compliance as far as possible without halting the business.