Researchers at MIT have introduced a new robot training method called Heterogeneous Pretrained Transformers (HPT), which significantly reduces training time and cost while improving robots' adaptability to new tasks and environments. The approach integrates diverse data from many sources into a single system, creating a shared language that generative AI models can interpret.
Lead researcher Lirui Wang highlights that the main challenge in robotics is not just the lack of training data but the vast variety of domains, modalities, and robot hardware. The HPT method unifies multiple data types, including camera images, language instructions, and depth maps, utilizing a transformer model akin to those used in advanced language models.
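The core idea of unifying modalities can be illustrated with a minimal sketch: each input type passes through its own small "stem" that projects it into a shared token width, and the resulting tokens are concatenated into one sequence a common transformer trunk could process. The dimensions, stem design, and function names below are illustrative assumptions, not the actual HPT architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 64  # shared token width (illustrative choice, not from the paper)

def make_stem(in_dim, d=D):
    """Modality-specific linear 'stem': projects raw features to shared-width tokens."""
    W = rng.standard_normal((in_dim, d)) / np.sqrt(in_dim)
    return lambda x: x @ W

# Hypothetical per-modality stems (feature sizes chosen for illustration only).
image_stem   = make_stem(512)  # e.g. flattened camera-image features
proprio_stem = make_stem(14)   # e.g. joint positions and velocities
text_stem    = make_stem(128)  # e.g. embedded language instruction

def tokenize(image_feats, proprio, text_emb):
    """Map heterogeneous inputs into one token sequence a shared trunk can read."""
    tokens = [
        image_stem(image_feats),   # shape (n_img, D)
        proprio_stem(proprio),     # shape (n_prop, D)
        text_stem(text_emb),       # shape (n_txt, D)
    ]
    return np.concatenate(tokens, axis=0)

seq = tokenize(
    rng.standard_normal((4, 512)),  # 4 image-patch tokens
    rng.standard_normal((1, 14)),   # 1 proprioception token
    rng.standard_normal((2, 128)),  # 2 instruction tokens
)
print(seq.shape)  # one unified (7, 64) sequence for the transformer trunk
```

Once every modality lives in the same token space, a single transformer can be pretrained across datasets that differ in sensors and robot hardware, which is the mismatch Wang describes.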
In practical evaluations, HPT outperformed traditional training methods by over 20% in both simulated and real-world scenarios, even with tasks that differed significantly from its training data. The team compiled an extensive dataset for pretraining, consisting of 52 datasets and over 200,000 robot trajectories, enabling robots to learn from diverse experiences.
A key innovation of HPT is its treatment of proprioception—awareness of a robot’s position and movement—on par with visual data, allowing for more refined dexterous movements. Looking ahead, the researchers aim to enhance HPT’s ability to process unlabeled data, with the ultimate goal of developing a universal robot brain for seamless integration across various robotic platforms.