KIBIT was developed in 2012 as an AI engine to support document review by experts and legal professionals, and it can operate with very little training data. In eDiscovery for international matters, it is critical to accurately identify relevant documents within a limited period of time from large data volumes that can reach several terabytes per custodian. “KIBIT Automator”, an AI review tool that includes KIBIT, is currently used in Japan and abroad as a legal technology solution that contributes to the efficiency of evidence discovery.
Unlike English, written Japanese does not separate words with spaces, and it contains particles and other words that do not have meaning on their own. As a result, Japanese language processing by AI requires two technologies: one that decomposes sentences into morphemes*1, such as individual words (morphological analysis), and a second that analyzes the morphemes obtained from the decomposed sentence. In the second step, it has been a challenge to evaluate how strongly a single-character morpheme, such as "は" or "に," indicates whether or not a sentence that contains it is relevant to the matter.
FRONTEO's R&D team has now improved the Illumination Forest algorithm within KIBIT so that it automatically discards morphemes consisting of a single character through machine learning. The result is an improvement in recall*2 compared with previous performance, and a reduction of up to 7% in the number of documents that require manual review to find 80% of the documents that are relevant to the matter (see figure, using FRONTEO's test data).
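As a rough illustration of the idea of discarding single-character morphemes, the sketch below filters an already-segmented sentence with a fixed length rule. It is not FRONTEO's method: the announcement states that Illumination Forest learns what to discard through machine learning, whereas this example hard-codes the rule. The function name `drop_single_char_morphemes` and the sample sentence are hypothetical, and the morphological analysis step is assumed to have been done beforehand.

```python
def drop_single_char_morphemes(morphemes):
    """Return only the morphemes that are longer than one character.

    Assumption: a fixed length rule stands in for the learned
    behaviour described in the announcement.
    """
    return [m for m in morphemes if len(m) > 1]


# Hypothetical output of a morphological analyzer for
# "契約書は本社に送付した" ("The contract was sent to headquarters").
tokens = ["契約書", "は", "本社", "に", "送付", "し", "た"]

filtered = drop_single_char_morphemes(tokens)
print(filtered)  # ['契約書', '本社', '送付']
```

A fixed rule like this also drops any single-character word that does carry meaning, which is one reason a learned approach may be preferable to a hard-coded filter.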
This technology can also be applied to other languages that are not written with spaces between words, such as Korean and Chinese. One of KIBIT's strengths is its ability to handle difficult Asian languages such as Korean and Chinese, and the results of this research are expected to further improve the accuracy of a wide range of products incorporating KIBIT.
FRONTEO will continue to promote the advancement of its unique AI solutions that contain core strengths in natural language processing, and will strive to develop and improve AI algorithms as a digital forensics and eDiscovery service provider to the legal community.
*1 Morpheme (morphological element): The smallest linguistic unit that has meaning.
*2 Recall: The percentage of all relevant data that is correctly predicted to be relevant.
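The recall definition above, and the related measure of how many documents must be reviewed to find 80% of the relevant ones, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `recall` and `docs_to_review` are hypothetical, the document IDs and ranked labels are invented for the example, and none of the numbers reflect FRONTEO's test data.

```python
def recall(predicted_relevant, actual_relevant):
    """Share of all truly relevant documents that were predicted relevant."""
    predicted = set(predicted_relevant)
    actual = set(actual_relevant)
    if not actual:
        raise ValueError("recall is undefined when there are no relevant documents")
    return len(predicted & actual) / len(actual)


def docs_to_review(ranked_labels, target_recall=0.8):
    """Number of documents to read, in ranked order, to reach target_recall.

    ranked_labels holds 1 (relevant) or 0 (not relevant) for each document,
    ordered from the highest to the lowest score given by the review tool.
    """
    total = sum(ranked_labels)
    if total == 0:
        raise ValueError("no relevant documents in the ranking")
    found = 0
    for depth, is_relevant in enumerate(ranked_labels, start=1):
        found += is_relevant
        if found / total >= target_recall:
            return depth
    return len(ranked_labels)


# Invented example: documents 2 and 3 are found, 4 and 5 are missed.
print(recall([1, 2, 3], [2, 3, 4, 5]))  # 0.5

# Invented ranking with 5 relevant documents among 10.
print(docs_to_review([1, 1, 0, 1, 0, 1, 0, 0, 1, 0]))  # 6
```

A ranking that places relevant documents nearer the top lowers the value returned by `docs_to_review`, which is the kind of reduction in manual review that the announcement reports.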
About FRONTEO. Everything we do comes from our service-oriented culture that puts the client at its center. We develop and constantly improve leading-edge technology, and market custom, seamless services that create value for our clients, employees, consumers and shareholders. Our name is no coincidence. “FRONTEO” helps us focus while looking forward, empowering us to apply our AI technology to the mission-critical legal market. We combine our machine learning with unparalleled attention to our clients. At FRONTEO, you are the center of our universe. You define your needs, and we are your helpmates and facilitators. Together, now and in the future, we succeed.