Didi Global has been granted a patent for a method of training an automatic agent using reinforcement learning. The process involves obtaining a secret task for a simulated user, generating agent actions based on the simulated user’s instructions, and adjusting policies based on rewards for successful task completion. GlobalData’s report on Didi Global gives a 360-degree view of the company, including its patenting strategy.

According to GlobalData’s company profile on Didi Global, V2I (vehicle-to-infrastructure) communication was a key innovation area identified from its patents. Didi Global's grant share as of June 2024 was 39%. Grant share is the ratio of the number of grants to the total number of patents.

Reinforcement learning for training automatic agents with secret tasks

Source: United States Patent and Trademark Office (USPTO). Credit: Didi Global Inc

The patent US12026544B2 outlines a computer-implemented method and system for training an automatic agent through reinforcement learning (RL). The process begins by obtaining a secret task for a simulated user; the task remains unknown to the automatic agent. The simulated user then issues instructions according to a user-side RL policy, and the automatic agent generates actions in response to these instructions. Rewards are determined for both the simulated user and the agent based on whether the secret task is completed successfully. Both the user-side and agent-side RL policies are then adjusted according to the rewards received, so that the two sides learn together.
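The training loop described above can be illustrated with a toy sketch. This is not the patented implementation: it assumes a tiny tabular setting in which the secret task is one of a few destinations, an instruction is a single token, both policies are softmax tables, and both sides share a 0/1 reward updated with a plain REINFORCE step. The function names (`train`, `reinforce_update`) and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw one index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def reinforce_update(logits, action, reward, lr):
    """In-place REINFORCE step: logits += lr * reward * grad log pi(action)."""
    probs = softmax(logits)
    for i in range(len(logits)):
        grad = (1.0 if i == action else 0.0) - probs[i]
        logits[i] += lr * reward * grad


def train(num_tasks=2, num_tokens=2, episodes=3000, lr=0.5, seed=0):
    """Jointly train a simulated user and an agent (toy assumption, not the patent's model)."""
    rng = random.Random(seed)
    # user_logits[task][token]: user-side policy, conditioned on the secret task.
    user_logits = [[0.0] * num_tokens for _ in range(num_tasks)]
    # agent_logits[token][guess]: agent-side policy, which sees only the instruction.
    agent_logits = [[0.0] * num_tasks for _ in range(num_tokens)]
    successes = 0
    for _ in range(episodes):
        secret = rng.randrange(num_tasks)                   # secret task, hidden from the agent
        token = sample(softmax(user_logits[secret]), rng)   # simulated user's instruction
        guess = sample(softmax(agent_logits[token]), rng)   # agent's action
        reward = 1.0 if guess == secret else 0.0            # shared reward on task completion
        reinforce_update(user_logits[secret], token, reward, lr)
        reinforce_update(agent_logits[token], guess, reward, lr)
        successes += int(reward)
    return successes / episodes, user_logits, agent_logits


if __name__ == "__main__":
    rate, _, _ = train()
    print(f"average success rate during training: {rate:.2f}")
```

The point of the sketch is the information flow: the agent never reads `secret` directly, so it can only earn reward by learning to interpret the simulated user's instructions, while the simulated user is rewarded for issuing instructions the agent can act on.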

Further details in the claims specify that the secret task may involve navigating to a target destination, with the agent predicting this destination based on user input tokens. The agent's actions can include generating response templates and making API calls to provide relevant information, such as geographical coordinates. The system also incorporates penalties for incorrect instructions or excessive communication, and it rewards both the agent and the simulated user based on the similarity of their interactions to previously collected human dialogues. Additionally, the automatic agent can be deployed to interact with human users, generating real actions based on human instructions, thereby broadening its application beyond simulated environments.
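The reward components mentioned above (task success, penalties for incorrect instructions and excessive communication, and a bonus for resembling human dialogues) could be combined along the following lines. This is a hedged sketch only: the weights, the `free_turns` allowance, and the use of word-level Jaccard overlap as the similarity measure are all assumptions made for illustration, and the article does not say how the patent actually computes any of these terms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights; the article gives no actual values."""
    success: float = 1.0
    wrong_instruction: float = 0.5
    per_extra_turn: float = 0.1
    human_similarity: float = 0.2


def jaccard(a, b):
    """Word-level Jaccard overlap, used here as a stand-in similarity measure."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def human_likeness(utterance, human_corpus):
    """Best overlap between an utterance and any collected human utterance."""
    return max((jaccard(utterance, h) for h in human_corpus), default=0.0)


def episode_reward(task_completed, wrong_instructions, turns, free_turns,
                   utterances, human_corpus, w=RewardWeights()):
    """Combine the success reward, penalties, and human-similarity bonus."""
    reward = w.success if task_completed else 0.0
    reward -= w.wrong_instruction * wrong_instructions          # incorrect instructions
    reward -= w.per_extra_turn * max(0, turns - free_turns)     # excessive communication
    if utterances:
        mean_sim = sum(human_likeness(u, human_corpus) for u in utterances) / len(utterances)
        reward += w.human_similarity * mean_sim                 # resemblance to human dialogue
    return reward


if __name__ == "__main__":
    corpus = ["take me to the airport", "I need a ride downtown"]
    print(episode_reward(True, 0, 4, 3, ["take me to the station"], corpus))
```

Keeping each term separate makes it easy to see which behaviour is being encouraged or discouraged, and to adjust one weight without touching the others.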

GlobalData

GlobalData, the leading provider of industry intelligence, provided the underlying data, research, and analysis used to produce this article.

GlobalData Patent Analytics tracks bibliographic data, legal events data, point in time patent ownerships, and backward and forward citations from global patenting offices. Textual analysis and official patent classifications are used to group patents into key thematic areas and link them to specific companies across the world’s largest industries.