The Korea Artificial Intelligence and Software Industry Association (KOSA) has raised concerns that policies governing the handling of training data, a critical resource in the AI era, do not adequately reflect the needs of industry.
KOSA’s Subcommittee for Hyper-Giant AI made this point in its recently published “Data Strategy Report for AI Industrial Transition.” KOSA noted, “Different industries require different types of data and processing standards. However, there is still a lack of effective institutional support that reflects these needs.” The report further found that companies are left to interpret for themselves the legal and ethical standards they must follow when collecting and using training data. This leads to difficulties in data collection, deterioration of data quality, and delays in commercialization.
The report describes real-world challenges faced by AI companies. Company A, which works with medical imaging data, struggled to access data for collaborative use because it was held privately within individual institutions. The company also noted that data export procedures differ from one institution to another, and called for standardization.
Company B was developing an AI model for industrial safety diagnostics using CCTV footage and saw its object recognition accuracy drop because the publicly available training data had been de-identified. The company called for a system that conditionally permits the use of original data within private networks.
The report pointed out that the guidelines for processing AI training data are built on abstract principles, which leaves room for ambiguity when they are applied in practice. As a result, companies tend to take a conservative approach. The boundary between pseudonymized and anonymized information is also unclear, so different organizations frequently reach different judgments on the same data.
The report's suggested policy improvements include: establishing detailed guidelines based on the data characteristics and usage purposes of individual industries such as healthcare, finance, manufacturing, and distribution; creating an integrated, one-stop system that consolidates the guidelines now scattered across the Personal Information Protection Commission, the Ministry of Health and Welfare, the Financial Services Commission, and other agencies; providing specific case studies and judgment criteria that practitioners can easily understand and apply; and expanding regulatory sandboxes for the use of AI training data.
Joon-hee Cho, Chairman of the Korea Artificial Intelligence and Software Industry Association, stressed the need for a game-changing strategy, driven by a sense of crisis that the AI transition is impossible without a data strategy.