When Hongzhi Gao was young, he lived with his family in Gansu, a province in north-central China next to the Tengger Desert. Thinking back to his childhood, he recalls the constant, steady wind of dust outside his home, and that for most months of the year, no more than a minute would pass after he stepped outside before sand filled every open space and seeped into his pockets, his boots, and his mouth. The monotony of the desert stayed in his head for years, and in college he turned that memory into an idea: build a machine that can bring plant life to the desert landscape.
Efforts to stop desertification, the process by which fertile land turns into desert, have focused primarily on expensive manual solutions. Hongzhi designed a robot that uses deep learning to automate the tree planting process, from identifying optimal locations to planting seedlings and watering them. Despite having no prior experience with AI, Hongzhi, then an undergraduate student, used Baidu’s PaddlePaddle deep learning platform to combine different modules into a robot with better object-sensing capabilities than similar machines already on the market. It took Hongzhi and his friends less than a year to get the final product up and running.
Hongzhi’s desert robot serves as a telling example of the increasing accessibility of artificial intelligence.
Today, more than four million developers are using Baidu’s open source artificial intelligence technology to create solutions that can improve the lives of people in their communities, and many of them have little or no technical experience in the field. “In the next decade, AI will be the source of changes that will take place in all the fabrics of our society, transforming the way that industries and companies operate. Technology will broaden the human experience by taking us deeper into the digital world,” said Baidu CEO Robin Li at Baidu Create 2021, an artificial intelligence developer conference.
As AI enters a new chapter in its evolution, Haifeng Wang, Baidu’s CTO, has identified two key trends that underpin the industry’s way forward. First, AI will continue to mature and grow in technical complexity. Second, the cost of implementation and the barrier to entry will fall at the same time, benefiting both companies building AI-powered solutions at scale and software developers exploring the world of AI.
Fusion of knowledge and data with deep learning
Integrating knowledge and data with deep learning has significantly improved the efficiency and accuracy of AI models. Since 2011, Baidu’s artificial intelligence infrastructure has been acquiring and integrating new information into a large-scale knowledge graph. Currently, this knowledge graph has more than 550 billion facts, covering all aspects of everyday life, as well as industry-specific topics, including manufacturing, pharmaceuticals, law, financial services, technology, and media and entertainment.
This knowledge graph, together with massive amounts of data, forms the building blocks of Baidu’s newly released PCL-BAIDU Wenxin pre-trained language model (ERNIE 3.0 Titan version). The model outperforms other language models that lack knowledge graphs on 60 natural language processing (NLP) tasks, including reading comprehension, text classification, and semantic similarity.
Learning across modalities
Cross-modal learning is a new area of AI research that seeks to improve the cognitive understanding of machines and better mimic the adaptive behavior of humans. Examples of research efforts in this area include automatic text-to-image synthesis, where a model is trained to generate images from text descriptions alone, as well as algorithms built to understand visual content and express that understanding in words. The challenge with these tasks is for machines to create semantic connections between different types of data (e.g., images and text) and to understand the interdependencies between them.
The next step for AI is to merge AI technologies like computer vision, speech recognition, and natural language processing to create a multimodal system.
On this front, Baidu has launched a variant of its NLP models that unites language and visual semantic understanding. Examples of real-world applications for this type of model include digital avatars that can perceive their surroundings as human beings do and handle customer service for businesses, and algorithms that can “draw” works of art and compose poems based on their understanding of the generated artworks.
There are even more creative and impactful potential outcomes for this technology. The PaddlePaddle platform can build semantic connections between vision and language, which prompted a group of master’s students in China to create a dictionary that helps preserve endangered languages in regions like Yunnan and Guangxi by making them easier to translate into Simplified Chinese.
Integration of AI into software and hardware, and in industry-specific use cases
As artificial intelligence systems are applied to solve increasingly complex and industry-specific problems, greater emphasis is placed on optimizing the software (deep learning framework) and hardware (artificial intelligence chip) as a whole, rather than optimizing each one individually, taking into account factors such as computing power, power consumption, and latency.
In addition, a great deal of innovation is taking place in the platform layer of Baidu’s artificial intelligence infrastructure, where third-party developers are using deep learning capabilities to create new applications tailored to specific use cases. The PaddlePaddle platform has a number of APIs to support artificial intelligence applications in emerging fields, such as quantum computing, life sciences, computational fluid mechanics, and molecular dynamics.
AI also has practical uses. For example, in Shouguang, a small city in Shandong province, AI is used to optimize the fruit and vegetable industry. It only takes two people and one app to manage dozens of vegetable sheds.
And this is remarkable, says Wang: “Despite the increased complexity of artificial intelligence technology, the open source deep learning platform brings together the processor and applications like an operating system, lowering the barriers to entry for companies and individuals looking to incorporate artificial intelligence into their businesses.”
Reduced barrier to entry for developers and end users
On the technology front, the pre-training of large models like PCL-BAIDU Wenxin (ERNIE 3.0 Titan version) has resolved many common bottlenecks faced by traditional models. For example, these general-purpose models have helped lay the foundation for executing different types of downstream NLP tasks, such as text classification and question answering, in one consolidated place, whereas in the past, each type of task had to be solved by a separate model.
PaddlePaddle also has a number of developer-friendly tools, such as model compression technologies, for adapting general-purpose models to more specific use cases. The platform provides an officially supported library of more than 400 industrial-grade models, from large to small, that are only a fraction of the size of general-purpose models but can achieve comparable performance, reducing model development and implementation costs.
Today, Baidu’s open source deep learning technology supports a community of more than four million AI developers who have collectively created 476,000 models, contributing to the AI-driven transformation of 157,000 companies and institutions. The examples listed above are the result of innovations occurring at all layers of Baidu’s AI infrastructure, integrating technologies such as speech recognition, computer vision, AR/VR, knowledge graphs, and large-model pre-training that bring machines one step closer to perceiving the world as humans do.
In its current state, AI has reached a level of maturity that allows it to perform amazing tasks. For example, the recent launch of the metaverse XiRang would not have been possible without the PaddlePaddle platform, which was used to create digital avatars so that participants around the world could connect from their devices. Furthermore, future advancements in areas such as quantum computing could significantly improve the performance of metaverses. This shows how the different Baidu offerings are intertwined and interdependent.
In a few years, AI will be close to the core of our human experience. It will be to our society what steam power, electricity, and the Internet were to previous generations. As AI becomes more complex, developers like Hongzhi will work more in the capacity of artists and designers, with the creative freedom to explore use cases previously considered only theoretically possible. The sky is the limit.
This content was produced by Baidu. It was not written by the editorial staff of MIT Technology Review.